Abstract
In this study, we analyse RNA-Seq data from panels of human lymphoblastoid cell lines (LCLs) to identify covariation in the mRNA levels of large numbers of genes. Such large-scale covariation may have a biological origin or may be due to technical variation in the analysis (generally referred to as batch effects). We show that batch effects cannot explain this covariation by demonstrating its reproducibility across different human populations and across different methods of analysis. This view is also supported by enrichment of individual transcription factors (TFs) and combinations of TFs binding to cognate promoter regions, enrichment of genes shown to be sensitive to the knockdown of individual TFs, enrichment of functional pathways, and finally enrichment of protein-protein interactions among the proteins encoded by groups of covarying genes. The properties of the groups of covarying genes are therefore most readily explained by the influence of cumulative variations in the effectors of gene expression that act in trans on cognate genes. We suggest that covariation has functional outcomes by showing that covariation of 83 genes involved in the spliceosome pathway accounts for 8–16% of the variation in the alternative splicing patterns of genes expressed in human LCLs.