PT - JOURNAL ARTICLE
AU - Sayols, Sergi
AU - Scherzinger, Denise
AU - Klein, Holger
TI - dupRadar: a Bioconductor package for the assessment of PCR artifacts in RNA-Seq data
AID - 10.1101/046243
DP - 2016 Jan 01
TA - bioRxiv
PG - 046243
4099 - http://biorxiv.org/content/early/2016/03/29/046243.short
4100 - http://biorxiv.org/content/early/2016/03/29/046243.full
AB - Background: PCR clonal artefacts originating from NGS library preparation can affect both genomic and RNA-Seq applications when protocols are pushed to their limits. In RNA-Seq, however, artefactual reads are not easy to tell apart from the normal read duplication caused by natural over-sequencing of highly expressed genes. Especially when working with little input material or single cells, assessing the fraction of duplicate reads is an important quality control step for NGS data sets. Existing tools calculate only global duplication rates and do not take gene expression levels into account, which leaves them of limited use for RNA-Seq data.
      Results: Here we present the tool dupRadar, which provides an easy means to distinguish artefactual from natural duplicate reads in RNA-Seq data. dupRadar assesses the fraction of duplicate reads per gene as a function of the gene's expression level. In addition to the Bioconductor package dupRadar, we provide shell scripts for easy integration into processing pipelines.
      Conclusions: The Bioconductor package dupRadar offers straightforward methods to assess RNA-Seq data sets for quality issues with PCR duplicates. It is aimed at simple integration into standard analysis pipelines as a default QC metric that is especially useful for low-input and single-cell RNA-Seq data sets.
      Abbreviations: RPK, reads per kilobase; PCR, polymerase chain reaction; UMI, unique molecular identifiers; QC, quality control; ChIP, chromatin immunoprecipitation; bp, base pair; NGS, next-generation sequencing.