Abstract
Preparing sequencing libraries that include unique molecular identifiers (UMIs), such as many single-cell RNA-seq (scRNA-seq) libraries, requires amplifying a diverse template. During amplification, spurious chimeric molecules can form between molecules originating in different cells. While several computational and experimental strategies have been suggested to mitigate the impact of chimeric molecules, suitable approaches for scRNA-seq experiments do not exist. We demonstrate that chimeras become increasingly problematic as samples are sequenced more deeply, and we propose both supervised and unsupervised computational solutions. These solutions are validated in a deeply sequenced species mixing experiment and, orthogonally, using replicate PCR amplifications of the same scRNA-seq library. Our code is publicly available at https://github.com/asncd/schimera.
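To make the idea of computational chimera filtering concrete, here is a minimal sketch of one plausible unsupervised heuristic. It is an illustrative assumption, not the method confirmed by this abstract or the API of the linked repository. It assumes that a genuine molecule accounts for most of the reads sharing a cell barcode and UMI, while a chimera contributes only a small fraction. The function name `filter_low_fraction`, the record layout, and the `min_fraction` threshold are all hypothetical.

```python
from collections import defaultdict


def filter_low_fraction(records, min_fraction=0.2):
    """Drop likely chimeras by read fraction (hypothetical heuristic).

    records: iterable of (cell_barcode, umi, gene, n_reads) tuples.
    A record is kept if its reads make up at least `min_fraction` of all
    reads observed for the same (cell_barcode, umi) pair.
    """
    records = list(records)  # allow two passes over any iterable

    # First pass: total reads per (cell barcode, UMI) pair.
    totals = defaultdict(int)
    for cb, umi, _gene, n_reads in records:
        totals[(cb, umi)] += n_reads

    # Second pass: keep records that dominate their barcode-UMI pair.
    kept = []
    for cb, umi, gene, n_reads in records:
        if n_reads / totals[(cb, umi)] >= min_fraction:
            kept.append((cb, umi, gene, n_reads))
    return kept


# Toy data: GeneB shares a barcode-UMI pair with GeneA but has few reads.
example = [
    ("AAAC", "TTG", "GeneA", 18),
    ("AAAC", "TTG", "GeneB", 2),
    ("CCGT", "GGA", "GeneB", 5),
]
filtered = filter_low_fraction(example)
```

In this toy input, the two-read GeneB record shares its barcode and UMI with an eighteen-read GeneA record, so it falls below the assumed 20% threshold and is removed. A real threshold would need to be calibrated, for example against a species mixing experiment.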
Competing Interest Statement
The authors have declared no competing interest.
Footnotes
atray{at}coralgenomics.com
inclusion of species mixing experiment