PT - JOURNAL ARTICLE
AU - Rizzetto, Simone
AU - Eltahla, Auda A.
AU - Lin, Peijie
AU - Bull, Rowena
AU - Lloyd, Andrew R.
AU - Ho, Joshua W. K.
AU - Venturi, Vanessa
AU - Luciani, Fabio
TI - Impact of sequencing depth and read length on single cell RNA sequencing data: lessons from T cells
AID - 10.1101/134130
DP - 2017 Jan 01
TA - bioRxiv
PG - 134130
4099 - http://biorxiv.org/content/early/2017/05/04/134130.short
4100 - http://biorxiv.org/content/early/2017/05/04/134130.full
AB - Single cell RNA sequencing (scRNA-seq) has shown great potential in measuring the gene expression profiles of heterogeneous cell populations. In immunology, scRNA-seq has allowed the characterisation of transcript sequence diversity of functionally relevant sub-populations of T cells, and notably the identification of the full-length T cell receptor (TCRαβ), which defines the specificity against cognate antigens. Several factors, such as RNA library capture, cell quality, and sequencing output, have been suggested to affect the quality of scRNA-seq data, but these factors have not been systematically examined. We studied the effect of read length and sequencing depth on the quality of gene expression profiles, cell type identification, and TCRαβ reconstruction, utilising 1,305 publicly available scRNA-seq datasets and simulation-based analyses. Short read lengths (<50 bp) yielded a higher number of unique genes identified, but the resulting gene expression profiles featured higher technical variability than profiles from longer reads. TCRαβ were detected in 1,027 cells (79%), with a success rate between 81% and 100% for datasets with at least 250,000 paired-end (PE) reads of length >50 bp. Sufficient read length and sequencing depth can control technical noise to enable accurate identification of TCRαβ and gene expression profiles from scRNA-seq data of T cells.