RT Journal Article
SR Electronic
T1 Modern machine learning far outperforms GLMs at predicting spikes
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 111450
DO 10.1101/111450
A1 Ari S. Benjamin
A1 Hugo L. Fernandes
A1 Tucker Tomlinson
A1 Pavan Ramkumar
A1 Chris VerSteeg
A1 Lee Miller
A1 Konrad Paul Kording
YR 2017
UL http://biorxiv.org/content/early/2017/02/24/111450.abstract
AB Neuroscience has long focused on finding encoding models that effectively ask “what predicts neural spiking?” and generalized linear models (GLMs) are a typical approach. Modern machine learning techniques have the potential to perform better. Here we directly compared GLMs to three leading methods: feedforward neural networks, gradient boosted trees, and stacked ensembles that combine the predictions of several methods. We predicted spike counts in macaque motor (M1) and somatosensory (S1) cortices from reaching kinematics, and in rat hippocampal cells from open field location and orientation. In general, the modern methods produced far better spike predictions and were less sensitive to the preprocessing of features. XGBoost and the ensemble were the best-performing methods and worked well even on neural data with very low spike rates. This overall performance suggests that tuning curves built with GLMs are at times inaccurate and can be easily improved upon. Our publicly shared code uses standard packages and can be quickly applied to other datasets. Encoding models built with machine learning techniques more accurately predict spikes and can offer meaningful benchmarks for simpler models.