PT - JOURNAL ARTICLE
AU - Rory M. Donovan-Maiye
AU - Jackson M. Brown
AU - Caleb K. Chan
AU - Liya Ding
AU - Calysta Yan
AU - Nathalie Gaudreault
AU - Julie A. Theriot
AU - Mary M. Maleckar
AU - Theo A. Knijnenburg
AU - Gregory R. Johnson
TI - A deep generative model of 3D single-cell organization
AID - 10.1101/2021.06.09.447725
DP - 2021 Jan 01
TA - bioRxiv
PG - 2021.06.09.447725
4099 - http://biorxiv.org/content/early/2021/06/10/2021.06.09.447725.short
4100 - http://biorxiv.org/content/early/2021/06/10/2021.06.09.447725.full
AB - We introduce a framework for end-to-end integrative modeling of 3D single-cell multi-channel fluorescent image data of diverse subcellular structures. We employ stacked conditional β-variational autoencoders to first learn a latent representation of cell morphology, and then learn a latent representation of subcellular structure localization conditioned on the learned cell morphology. Our model is flexible and can be trained on images of arbitrary subcellular structures at varying degrees of sparsity and reconstruction fidelity. We train our full model on 3D cell image data and explore design trade-offs in the 2D setting. Once trained, our model can be used to impute structures in cells where they were not imaged and to quantify the variation in the location of all subcellular structures by generating plausible instantiations of each structure in arbitrary cell geometries. We apply our trained model to a small drug perturbation screen to demonstrate its applicability to new data. We show how the latent representations of drugged cells differ from those of unperturbed cells, as expected from the on-target effects of the drugs.
     Author summary: It is impossible to acquire all the information we want about every cell we are interested in within a single experiment. Even limiting ourselves to imaging, we can only image a small set of subcellular structures in each cell. If we are interested in integrating those images into a holistic picture of cellular organization directly from data, there are a number of approaches one might take. Here, we leverage the fact that of the three channels we image in each cell, two stay the same across the data set; these two channels assess the cell's shape and nuclear morphology. Given these two reference channels, we learn a model of cell and nuclear morphology, and then use this as a reference frame in which to learn a representation of the localization of each subcellular structure as measured by the third channel. We use β-variational autoencoders to learn representations of both the reference channels and each subcellular structure (conditioned on the reference channels of the cell in which it was imaged). Since these models are both probabilistic and generative, we can use them to understand the variation in the data from which they were trained, to generate instantiations of new cell morphologies, and to generate imputations of structures in real cell images, creating an integrated model of subcellular organization.
CI - The authors have declared no competing interest.