Abstract
RNA sequencing (RNA-seq) is a powerful approach for measuring gene expression levels in cells and tissues, but it relies on high-quality RNA. We demonstrate here that statistical adjustment using existing quality measures largely fails to remove the effects of RNA degradation when RNA quality is associated with the outcome of interest. Using RNA-seq data from a molecular degradation experiment of human brain tissue, we introduce the quality surrogate variable analysis (qSVA) framework for estimating and removing the confounding effect of RNA quality in differential expression analysis. We show that this approach yields greatly improved replication rates (>3x) across two large independent postmortem human brain studies of schizophrenia. Finally, we explore public datasets to demonstrate potential RNA quality confounding when comparing expression levels across different brain regions and across diagnostic groups beyond schizophrenia. Our approach can therefore improve the interpretation of differential expression analysis of transcriptomic data from the human brain.