Abstract
Large cohorts of human induced pluripotent stem cells (iPSCs) from healthy donors are a potentially powerful tool for investigating the relationship between genetic variants and cellular phenotypes. Here we integrate high-content imaging, gene expression and DNA sequence datasets from over 100 human iPSC lines to explore the genetic basis of inter-individual variability in cell behaviour. By applying a dimensionality reduction approach, Probabilistic Estimation of Expression Residuals (PEER), we extracted factors that captured the effects of intrinsic (genetic) and extrinsic (environmental) conditions. We identified genes whose expression correlated with intrinsic and extrinsic PEER factors, and mapped outlier cell behaviour to the expression of genes containing rare deleterious single-nucleotide variants (SNVs). Our study thus establishes a strategy for determining the genetic basis of inter-individual variability in cell behaviour.