RT Journal Article
SR Electronic
T1 Feature Selection and Dimension Reduction for Single Cell RNA-Seq based on a Multinomial Model
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 574574
DO 10.1101/574574
A1 F. William Townes
A1 Stephanie C. Hicks
A1 Martin J. Aryee
A1 Rafael A. Irizarry
YR 2019
UL http://biorxiv.org/content/early/2019/03/11/574574.abstract
AB Single cell RNA-Seq (scRNA-Seq) profiles gene expression of individual cells. Recent scRNA-Seq datasets have incorporated unique molecular identifiers (UMIs). Using negative controls, we show UMI counts follow multinomial sampling with no zero-inflation. Current normalization procedures such as log of counts per million and feature selection by highly variable genes produce false variability in dimension reduction. We propose simple multinomial methods, including generalized principal component analysis (GLM-PCA) for non-normal distributions, and feature selection using deviance. These methods outperform current practice in a downstream clustering assessment using ground-truth datasets.