RT Journal Article
SR Electronic
T1 Modern machine learning far outperforms GLMs at predicting spikes
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 111450
DO 10.1101/111450
A1 Ari S. Benjamin
A1 Hugo L. Fernandes
A1 Tucker Tomlinson
A1 Pavan Ramkumar
A1 Chris VerSteeg
A1 Lee Miller
A1 Konrad Paul Kording
YR 2017
UL http://biorxiv.org/content/early/2017/02/24/111450.abstract
AB Neuroscience has long focused on finding encoding models that effectively ask “what predicts neural spiking?” and generalized linear models (GLMs) are a typical approach. Modern machine learning techniques have the potential to perform better. Here we directly compared GLMs to three leading methods: feedforward neural networks, gradient boosted trees, and stacked ensembles that combine the predictions of several methods. We predicted spike counts in macaque motor (M1) and somatosensory (S1) cortices from reaching kinematics, and in rat hippocampal cells from open field location and orientation. In general, the modern methods produced far better spike predictions and were less sensitive to the preprocessing of features. XGBoost and the ensemble were the best-performing methods and worked well even on neural data with very low spike rates. This overall performance suggests that tuning curves built with GLMs are at times inaccurate and can be easily improved upon. Our publicly shared code uses standard packages and can be quickly applied to other datasets. Encoding models built with machine learning techniques more accurately predict spikes and can offer meaningful benchmarks for simpler models.