Abstract
Gene co-expression networks can capture biological relationships between genes and are important tools for predicting gene function and understanding disease mechanisms. We show that artifacts such as batch effects in gene expression data confound commonly used network reconstruction algorithms. We then demonstrate, both theoretically and empirically, that principal component correction of gene expression measurements prior to network inference can reduce false discoveries. Using expression data from the GTEx project across multiple tissues and hundreds of individuals, we show that this approach improves precision and recall of the reconstructed networks.
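As a rough illustration of principal component correction before network inference, the following minimal sketch removes the top principal components from a samples-by-genes matrix and compares pairwise gene correlations before and after. It is an assumption-laden toy, not the procedure used in this work: the function names `remove_top_pcs` and `mean_abs_offdiag_corr`, the simulated two-batch data, and the choice of removing a single component are all hypothetical, and plain Pearson correlation stands in for whichever network reconstruction algorithm is actually applied.

```python
import numpy as np


def remove_top_pcs(expr, n_pcs):
    """Return a samples x genes matrix with the top n_pcs principal components removed."""
    # Center each gene so the SVD corresponds to PCA of the expression matrix.
    centered = expr - expr.mean(axis=0, keepdims=True)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    # Zero the leading singular values, then rebuild the matrix without them.
    s_kept = s.copy()
    s_kept[:n_pcs] = 0.0
    return (u * s_kept) @ vt


def mean_abs_offdiag_corr(expr):
    """Mean absolute Pearson correlation over all distinct gene pairs."""
    corr = np.corrcoef(expr, rowvar=False)
    off_diagonal = ~np.eye(corr.shape[0], dtype=bool)
    return np.abs(corr[off_diagonal]).mean()


# Simulated data (hypothetical): independent genes plus a shared batch shift.
rng = np.random.default_rng(0)
n_samples, n_genes = 200, 50
noise = rng.normal(size=(n_samples, n_genes))
batch = rng.integers(0, 2, size=n_samples).astype(float)
loadings = rng.normal(loc=2.0, scale=0.5, size=n_genes)
expr = noise + np.outer(batch, loadings)

# The genes are independent by construction, so any strong correlation
# in the uncorrected data is an artifact of the batch effect.
before = mean_abs_offdiag_corr(expr)
corrected = remove_top_pcs(expr, n_pcs=1)
after = mean_abs_offdiag_corr(corrected)

print(f"mean |correlation| before correction: {before:.3f}")
print(f"mean |correlation| after correction:  {after:.3f}")
```

In this toy setting the simulated genes share no true co-expression, so a drop in the mean absolute correlation after correction reflects spurious edges being removed. On real data the number of components to remove would need to be chosen with care, since leading components may also carry genuine biological signal.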
Copyright
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.