PT - JOURNAL ARTICLE
AU - Richard Dinga
AU - Lianne Schmaal
AU - Brenda W.J.H. Penninx
AU - Dick J. Veltman
AU - Andre F. Marquand
TI - Controlling for effects of confounding variables on machine learning predictions
AID - 10.1101/2020.08.17.255034
DP - 2020 Jan 01
TA - bioRxiv
PG - 2020.08.17.255034
4099 - http://biorxiv.org/content/early/2020/08/18/2020.08.17.255034.short
4100 - http://biorxiv.org/content/early/2020/08/18/2020.08.17.255034.full
AB - Machine learning predictive models are being used in neuroimaging to predict information about the task or stimuli or to identify potentially clinically useful biomarkers. However, the predictions can be driven by confounding variables unrelated to the signal of interest, such as scanner effect or head motion, limiting the clinical usefulness and interpretation of machine learning models. The most common method to control for confounding effects is regressing out the confounding variables separately from each input variable before machine learning modeling. However, we show that this method is insufficient because machine learning models can learn information from the data that cannot be regressed out. Instead of regressing out confounding effects from each input variable, we propose controlling for confounds post-hoc on the level of machine learning predictions. This allows partitioning of the predictive performance into the performance that can be explained by confounds and performance independent of confounds. This approach is flexible and allows for parametric and non-parametric confound adjustment. We show in real and simulated data that this method correctly controls for confounding effects even when traditional input variable adjustment produces false-positive findings. Competing Interest Statement: The authors have declared no competing interest.
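
The abstract describes controlling for confounds on the level of predictions and splitting performance into a part explained by confounds and a part independent of them. The following is a minimal sketch of one way such a partition could look, not the authors' implementation. It assumes a continuous outcome, a single confound, a linear (parametric) adjustment, and R^2 as the performance measure; the function names `r_squared` and `partition_performance` and the simulated data are hypothetical and chosen only for illustration.

```python
import numpy as np


def r_squared(y, *predictors):
    """In-sample R^2 of an ordinary least-squares fit of y on the predictors plus an intercept."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1.0 - resid.var() / y.var()


def partition_performance(y, pred, confound):
    """Split the R^2 of model predictions into confound-related and confound-independent parts.

    Assumption: the confound effect is adjusted linearly; a non-parametric
    adjustment would replace the least-squares fits used here.
    """
    r2_conf = r_squared(y, confound)        # outcome explained by the confound alone
    r2_pred = r_squared(y, pred)            # outcome explained by the predictions alone
    r2_both = r_squared(y, confound, pred)  # outcome explained by both together
    independent = r2_both - r2_conf         # what predictions add beyond the confound
    return {
        "confound_only": r2_conf,
        "prediction_only": r2_pred,
        "independent_of_confound": independent,
        "shared_with_confound": r2_pred - independent,
    }


# Simulated data (illustrative only): the outcome depends on a confound and on a signal.
rng = np.random.default_rng(0)
n = 2000
confound = rng.normal(size=n)
signal = rng.normal(size=n)
y = confound + signal + rng.normal(size=n)

# Hypothetical predictions from two models:
pred_a = confound + 0.3 * rng.normal(size=n)  # tracks only the confound
pred_b = signal + 0.3 * rng.normal(size=n)    # tracks only the signal of interest

res_a = partition_performance(y, pred_a, confound)
res_b = partition_performance(y, pred_b, confound)
print("confound-driven model:", res_a)
print("signal-driven model:  ", res_b)
```

In this sketch both sets of predictions correlate with the outcome on their own, but only the signal-driven predictions should retain a sizeable `independent_of_confound` value once the confound is accounted for. In practice the predictions would be out-of-sample outputs of a fitted model rather than simulated vectors.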