TY  - JOUR
T1  - Modern machine learning far outperforms GLMs at predicting spikes
JF  - bioRxiv
DO  - 10.1101/111450
SP  - 111450
AU  - Ari S. Benjamin
AU  - Hugo L. Fernandes
AU  - Tucker Tomlinson
AU  - Pavan Ramkumar
AU  - Chris VerSteeg
AU  - Lee Miller
AU  - Konrad Paul Kording
Y1  - 2017/01/01
UR  - http://biorxiv.org/content/early/2017/02/24/111450.abstract
N2  - Neuroscience has long focused on finding encoding models that effectively ask “what predicts neural spiking?” and generalized linear models (GLMs) are a typical approach. Modern machine learning techniques have the potential to perform better. Here we directly compared GLMs to three leading methods: feedforward neural networks, gradient boosted trees, and stacked ensembles that combine the predictions of several methods. We predicted spike counts in macaque motor (M1) and somatosensory (S1) cortices from reaching kinematics, and in rat hippocampal cells from open field location and orientation. In general, the modern methods produced far better spike predictions and were less sensitive to the preprocessing of features. XGBoost and the ensemble were the best-performing methods and worked well even on neural data with very low spike rates. This overall performance suggests that tuning curves built with GLMs are at times inaccurate and can be easily improved upon. Our publicly shared code uses standard packages and can be quickly applied to other datasets. Encoding models built with machine learning techniques more accurately predict spikes and can offer meaningful benchmarks for simpler models.
ER  - 