PT - JOURNAL ARTICLE
AU - Disi Ji
AU - Eric Nalisnick
AU - Yu Qian
AU - Richard H. Scheuermann
AU - Padhraic Smyth
TI - Bayesian Trees for Automated Cytometry Data Analysis
AID - 10.1101/414904
DP - 2018 Jan 01
TA - bioRxiv
PG - 414904
4099 - http://biorxiv.org/content/early/2018/09/19/414904.short
4100 - http://biorxiv.org/content/early/2018/09/19/414904.full
AB - Cytometry is an important single-cell analysis technology in furthering our understanding of cellular biological processes and in supporting clinical diagnoses across a variety of hematological and immunological conditions. Current data analysis workflows for cytometry data rely on a manual process called gating to classify cells into canonical types. This dependence on human annotation significantly limits the rate, reproducibility, and scope of cytometry’s use in both biological research and clinical practice. We develop a novel Bayesian approach for automated gating that classifies cells into different types by combining cell-level marker measurements with an informative prior. The Bayesian approach allows for the incorporation of biologically meaningful prior information that captures the domain expertise of human experts. The inference algorithm results in a hierarchically structured classification of individual cells in a manner that mimics the tree-structured recursive process of manual gating, making the results readily interpretable. The approach can be extended in a natural fashion to handle data from multiple samples by the incorporation of random effects in the Bayesian model. The proposed approach is evaluated using mass cytometry data, on the problems of unsupervised cell classification and supervised clinical diagnosis, illustrating the benefits of both incorporating prior knowledge and sharing information across multiple samples.