RT Journal Article
SR Electronic
T1 Airpart: Interpretable statistical models for analyzing allelic imbalance in single-cell datasets
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 2021.10.15.464546
DO 10.1101/2021.10.15.464546
A1 Wancen Mu
A1 Hirak Sarkar
A1 Avi Srivastava
A1 Kwangbom Choi
A1 Rob Patro
A1 Michael I. Love
YR 2021
UL http://biorxiv.org/content/early/2021/10/16/2021.10.15.464546.abstract
AB Motivation: Allelic expression analysis aids in detection of cis-regulatory mechanisms of genetic variation which produce allelic imbalance (AI) in heterozygotes. Measuring AI in bulk data lacking time or spatial resolution has the limitation that cell-type-specific (CTS), spatial-, or time-dependent AI signals may be dampened or not detected. Results: We introduce a statistical method airpart for identifying differential CTS AI from single-cell RNA-sequencing (scRNA-seq) data, or other spatially- or time-resolved datasets. airpart outputs discrete partitions of data, pointing to groups of genes and cells under common mechanisms of cis-genetic regulation. In order to account for low counts in single-cell data, our method uses a Generalized Fused Lasso with Binomial likelihood for partitioning groups of cells by AI signal, and a hierarchical Bayesian model for AI statistical inference. In simulation, airpart accurately detected partitions of cell types by their AI and had lower RMSE of allelic ratio estimates than existing methods. In real data, airpart identified differential AI patterns across cell states and could be used to define trends of AI signal over spatial or time axes. Availability: The airpart package is available as an R/Bioconductor package at https://bioconductor.org/packages/airpart. Competing Interest Statement: R.P. is a co-founder of Ocean Genomics Inc.