RT Journal Article
SR Electronic
T1 A robust and interpretable, end-to-end deep learning model for cytometry data
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 2020.02.05.934521
DO 10.1101/2020.02.05.934521
A1 Zicheng Hu
A1 Alice Tang
A1 Jaiveer Singh
A1 Sanchita Bhattacharya
A1 Atul J. Butte
YR 2020
UL http://biorxiv.org/content/early/2020/02/05/2020.02.05.934521.abstract
AB Cytometry technologies are essential tools for immunology research, providing high-throughput measurements of immune cells at the single-cell level. Traditional approaches to interpreting and using cytometry measurements rely on manual or automated gating to identify cell subsets from the cytometry data. These approaches provide highly intuitive results but may lead to significant information loss, because additional details in measured or correlated cell signals might be missed. In this study, we propose and test a deep convolutional neural network for analyzing cytometry data in an end-to-end fashion, allowing a direct association between raw cytometry data and the clinical outcome of interest. Using nine large CyTOF studies from the open-access ImmPort database, we demonstrate that the deep convolutional neural network model can accurately diagnose latent cytomegalovirus (CMV) infection in healthy individuals, even when using highly heterogeneous data from different studies. In addition, we developed a permutation-based method for interpreting the deep convolutional neural network model and identified a CD27-CD94+ CD8+ T cell population significantly associated with latent CMV infection. Finally, we provide a tutorial for creating, training, and interpreting the tailored deep learning model for cytometry data using Keras and TensorFlow (github.com/hzc363/DeepLearningCyTOF).