PT - JOURNAL ARTICLE
AU - Christopher S. McGinnis
AU - Lyndsay M. Murrow
AU - Zev J. Gartner
TI - DoubletFinder: Doublet detection in single-cell RNA sequencing data using artificial nearest neighbors
AID - 10.1101/352484
DP - 2018 Jan 01
TA - bioRxiv
PG - 352484
4099 - http://biorxiv.org/content/early/2018/06/26/352484.short
4100 - http://biorxiv.org/content/early/2018/06/26/352484.full
AB - Single-cell RNA sequencing (scRNA-seq) using droplet microfluidics occasionally produces transcriptome data representing more than one cell. These technical artifacts are caused by cell doublets formed during cell capture and occur at a frequency proportional to the total number of sequenced cells. The presence of doublets can lead to spurious biological conclusions, which justifies the practice of sequencing fewer cells to limit doublet formation rates. Here, we present a computational doublet detection tool – DoubletFinder – that identifies doublets based solely on gene expression features. DoubletFinder infers the putative gene expression profile of real doublets by generating artificial doublets from existing scRNA-seq data. Neighborhood detection in gene expression space then identifies sequenced cells with increased probability of being doublets based on their proximity to artificial doublets. DoubletFinder robustly identifies doublets across scRNA-seq datasets with variable numbers of cells and sequencing depth, and predicts false-negative and false-positive doublets defined using conventional barcoding approaches. We anticipate that DoubletFinder will aid in scRNA-seq data analysis and will increase the throughput and accuracy of scRNA-seq experiments.
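
The two steps the abstract describes (generating artificial doublets from existing cells, then scoring each sequenced cell by its proximity to those artificial doublets) can be illustrated with a minimal Python sketch. This is not the DoubletFinder implementation. The function name `artificial_neighbor_fraction`, its parameters, the choice to average two randomly selected expression profiles, and the use of plain Euclidean distance on the input vectors are all assumptions made for illustration; the abstract specifies none of them.

```python
import random


def artificial_neighbor_fraction(cells, n_artificial, k, seed=0):
    """Hypothetical helper: for each real cell, return the fraction of its
    k nearest neighbors that are artificial doublets.

    cells: list of equal-length numeric expression vectors, one per cell.
    """
    rng = random.Random(seed)
    n_cells = len(cells)

    # Step 1 (assumed scheme): build each artificial doublet by averaging
    # the profiles of two distinct, randomly chosen real cells.
    doublets = []
    for _ in range(n_artificial):
        i, j = rng.sample(range(n_cells), 2)
        doublets.append([(x + y) / 2.0 for x, y in zip(cells[i], cells[j])])

    # Real cells occupy indices [0, n_cells); artificial doublets follow.
    merged = [[float(x) for x in cell] for cell in cells] + doublets

    # Step 2: brute-force nearest-neighbor search for every real cell.
    fractions = []
    for i in range(n_cells):
        dists = []
        for j, other in enumerate(merged):
            if j == i:
                continue  # a cell is not its own neighbor
            d2 = sum((x - y) ** 2 for x, y in zip(merged[i], other))
            dists.append((d2, j))
        dists.sort()
        nearest = dists[:k]
        n_art = sum(1 for _, j in nearest if j >= n_cells)
        fractions.append(n_art / len(nearest))
    return fractions
```

Under these assumptions, a real cell lying between two expression clusters tends to score higher than a cell inside either cluster, because artificial doublets built from one cell of each cluster land in the same region. The brute-force search is quadratic in the number of cells and is meant only to make the idea concrete.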