TY  - JOUR
T1  - Feature Selection and Dimension Reduction for Single Cell RNA-Seq based on a Multinomial Model
JF  - bioRxiv
DO  - 10.1101/574574
SP  - 574574
AU  - F. William Townes
AU  - Stephanie C. Hicks
AU  - Martin J. Aryee
AU  - Rafael A. Irizarry
Y1  - 2019/01/01
UR  - http://biorxiv.org/content/early/2019/03/11/574574.abstract
N2  - Single cell RNA-Seq (scRNA-Seq) profiles gene expression of individual cells. Recent scRNA-Seq datasets have incorporated unique molecular identifiers (UMIs). Using negative controls, we show UMI counts follow multinomial sampling with no zero-inflation. Current normalization procedures such as log of counts per million and feature selection by highly variable genes produce false variability in dimension reduction. We propose simple multinomial methods, including generalized principal component analysis (GLM-PCA) for non-normal distributions, and feature selection using deviance. These methods outperform current practice in a downstream clustering assessment using ground-truth datasets.
ER  - 