PT - JOURNAL ARTICLE
AU - Fabrizio Kuruc
AU - Harald Binder
AU - Moritz Hess
TI - Stratified neural networks in a time-to-event setting
AID - 10.1101/2021.02.01.429169
DP - 2021 Jan 01
TA - bioRxiv
PG - 2021.02.01.429169
4099 - http://biorxiv.org/content/early/2021/02/02/2021.02.01.429169.short
4100 - http://biorxiv.org/content/early/2021/02/02/2021.02.01.429169.full
AB - Deep neural networks are now frequently employed to predict survival conditional on omics-type biomarkers, e.g. by employing the partial likelihood of the Cox proportional hazards model as the loss function. Because clinical studies generally comprise a limited number of observations, combining different datasets has been proposed to improve learning of network parameters. However, if baseline hazards differ between the studies, the assumptions of the Cox proportional hazards model are violated. Based on high-dimensional transcriptome profiles from different tumor entities, we demonstrate how using a stratified partial likelihood as the loss function accounts for the differing baseline hazards in a deep learning framework. Additionally, we compare the partial likelihood with the ranking loss, which is frequently employed as a loss function in machine learning approaches due to its seeming simplicity. Using RNA-seq data from The Cancer Genome Atlas (TCGA), we show that the use of stratified loss functions leads to overall better discriminatory power and lower prediction error than their non-stratified counterparts. We investigate which genes are identified as having the greatest marginal impact on survival prediction under the different loss functions. We find that while similar genes are identified, known prognostic genes in particular receive higher importance from the stratified loss functions.
Taken together, pooling data from different sources for improved parameter learning of deep neural networks benefits greatly from stratified loss functions that account for potentially varying baseline hazards. For easy application, we provide PyTorch code for the stratified loss functions and an explanatory Jupyter notebook in a GitHub repository.
Competing Interest Statement: The authors have declared no competing interest.
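The stratified partial likelihood the abstract describes can be sketched in PyTorch along the following lines. This is a minimal illustration, not the code from the authors' repository: the function name and signature are hypothetical, ties in event times are simply ignored (no Breslow/Efron correction), and the stratum labels are assumed to encode the study or tumor entity. Within each stratum, the risk set of a subject contains only subjects from that same stratum, which is what accommodates differing baseline hazards.

```python
import torch

def stratified_cox_loss(log_hazard, time, event, stratum):
    """Negative stratified Cox partial likelihood (ties ignored).

    log_hazard: (n,) predicted log relative hazards (network output)
    time:       (n,) observed event or censoring times
    event:      (n,) 1.0 if the event was observed, 0.0 if censored
    stratum:    (n,) integer label of the study / tumor entity (assumed encoding)
    """
    total = log_hazard.new_zeros(())
    n_events = 0
    for s in torch.unique(stratum):
        mask = stratum == s
        lh, t, e = log_hazard[mask], time[mask], event[mask]
        # Sort by descending time so each within-stratum risk set
        # is a cumulative prefix of the sorted sequence.
        order = torch.argsort(t, descending=True)
        lh, e = lh[order], e[order]
        # Log-sum-exp of hazards over each subject's risk set.
        log_risk = torch.logcumsumexp(lh, dim=0)
        # Only subjects with an observed event contribute a term.
        total = total - torch.sum((lh - log_risk) * e)
        n_events += int(e.sum().item())
    return total / max(n_events, 1)
```

Note that adding a constant to all predicted log hazards leaves the loss unchanged, reflecting that the partial likelihood identifies only relative, not absolute, hazards; the non-stratified variant corresponds to passing a single stratum label for all subjects.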