PT - JOURNAL ARTICLE
AU - Tim Pyrkov
AU - Konstantin Slipensky
AU - Mikhail Barg
AU - Alexey Kondrashin
AU - Boris Zhurov
AU - Alexander Zenin
AU - Mikhail Pyatnitskiy
AU - Leonid Menshikov
AU - Sergei Markov
AU - Peter O. Fedichev
TI - Extracting biological age from biomedical data via deep learning: too much of a good thing?
AID - 10.1101/219162
DP - 2017 Jan 01
TA - bioRxiv
PG - 219162
4099 - http://biorxiv.org/content/early/2017/11/16/219162.short
4100 - http://biorxiv.org/content/early/2017/11/16/219162.full
AB - Aging-related physiological changes are systemic and, at least in humans, are linearly associated with age. Therefore, linear combinations of physiological measures trained to estimate chronological age have recently emerged as a practical way to quantify aging in the form of biological age. Aging acceleration, defined as the difference between the predicted and chronological age, was found to be elevated in patients with major diseases and is predictive of mortality. In this work, we compare three increasingly accurate biological age models: metrics derived from unsupervised Principal Components Analysis (PCA), alongside two supervised biological age models: a multivariate linear regression and a state-of-the-art deep convolutional neural network (CNN). All predictions were made using one-week-long locomotor activity records from the 2003-2006 National Health and Nutrition Examination Survey (NHANES) dataset. We found that application of the supervised approaches improves the accuracy of the chronological age estimation at the expense of losing the association between the aging acceleration predicted by the model and all-cause mortality. Instead, we turned to the NHANES death register and introduced a novel way to train parametric proportional hazards models in a form suitable for out-of-the-box implementation with any modern machine learning software.
Finally, we characterized a proof-of-concept example, a separate deep CNN trained to predict mortality risks that outperformed any of the biological age or simple linear proportional hazards models. Our findings demonstrate the emerging potential of combined wearable sensors and deep learning technologies for applications involving continuous health risk monitoring and real-time feedback to patients and care providers.
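The definition of aging acceleration given in the abstract (predicted age minus chronological age, with predicted age coming from a linear combination of physiological measures) can be illustrated with a minimal sketch. This is not the authors' pipeline: the data below are synthetic, the two features are hypothetical stand-ins for physiological measures, and the function name `aging_acceleration` is an assumption made for illustration. Ordinary least squares stands in for the multivariate linear regression model.

```python
import numpy as np


def aging_acceleration(features, age):
    """Fit age ~ features by ordinary least squares.

    Returns (predicted_age, acceleration), where acceleration is
    predicted age minus chronological age, as defined in the abstract.
    """
    # Prepend a column of ones so the model has an intercept term.
    X = np.column_stack([np.ones(len(age)), features])
    coef, *_ = np.linalg.lstsq(X, age, rcond=None)
    predicted = X @ coef
    return predicted, predicted - age


# Synthetic cohort (assumption): 500 subjects aged 20 to 80.
rng = np.random.default_rng(0)
age = rng.uniform(20, 80, size=500)

# Two hypothetical measures that drift linearly with age, plus noise.
features = np.column_stack([
    -0.5 * age + rng.normal(0, 5, size=500),
    0.2 * age + rng.normal(0, 3, size=500),
])

predicted_age, acceleration = aging_acceleration(features, age)
```

Because the fit includes an intercept, the aging acceleration averages to zero over the training cohort, so a positive value for an individual means the model places that person above their chronological age relative to the cohort.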