PT - JOURNAL ARTICLE
AU - F. William Townes
AU - Stephanie C. Hicks
AU - Martin J. Aryee
AU - Rafael A. Irizarry
TI - Feature Selection and Dimension Reduction for Single Cell RNA-Seq based on a Multinomial Model
AID - 10.1101/574574
DP - 2019 Jan 01
TA - bioRxiv
PG - 574574
4099 - http://biorxiv.org/content/early/2019/03/11/574574.short
4100 - http://biorxiv.org/content/early/2019/03/11/574574.full
AB - Single cell RNA-Seq (scRNA-Seq) profiles gene expression of individual cells. Recent scRNA-Seq datasets have incorporated unique molecular identifiers (UMIs). Using negative controls, we show UMI counts follow multinomial sampling with no zero-inflation. Current normalization procedures such as log of counts per million and feature selection by highly variable genes produce false variability in dimension reduction. We propose simple multinomial methods, including generalized principal component analysis (GLM-PCA) for non-normal distributions, and feature selection using deviance. These methods outperform current practice in a downstream clustering assessment using ground-truth datasets.