Abstract
Genome-wide association studies (GWAS) have implicated specific alleles and genes as risk factors for numerous complex traits. However, translating GWAS results into biologically and therapeutically meaningful discoveries remains extremely challenging. Most GWAS signals fall in noncoding regions of the genome, suggesting that differences in gene regulation are the major driver of trait variability. To better integrate GWAS results with gene regulatory polymorphisms, we previously developed PrediXcan (an approach also known as “transcriptome-wide association studies” or TWAS), which uses GWAS data to map SNPs to predicted gene expression. In this study, we developed RatXcan, a framework that extends this methodology to outbred heterogeneous stock (HS) rats. RatXcan accounts for the close familial relationships among HS rats by including a random effect whose covariance encodes the genetic relatedness between animals. Because a relatedness random effect is equivalent to the infinitesimal polygenic model, RatXcan also corrects for polygenicity-driven inflation. To develop RatXcan, we trained transcript predictors for 8,934 genes using reference genotype and expression data from five rat brain regions. We found that the cis genetic architecture of gene expression in both rats and humans was sparse and similar across brain tissues. We tested the association between predicted expression in rats and two example traits (body length and BMI) using phenotype and genotype data from 5,401 densely genotyped HS rats, and found significant enrichment between the genes associated with these traits in rats and those associated with them in humans. Thus, RatXcan represents a valuable tool for identifying the relationship between gene expression and phenotypes across species and paves the way to explore shared biological mechanisms of complex traits.
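To illustrate the idea of testing predicted expression against a trait while modeling relatedness as a random effect, the following is a minimal sketch and not the RatXcan implementation. It assumes the model y = t*beta + u + e with u ~ N(0, h2*K) and e ~ N(0, (1 - h2)*I), where K is a genetic relatedness matrix and h2 is treated as already known; the function name `mixed_model_assoc` and all parameter names are hypothetical.

```python
import numpy as np

def mixed_model_assoc(y, t, K, h2):
    """Generalized least squares test of predicted expression t against trait y.

    Assumed model (illustrative only): y = t*beta + u + e,
    u ~ N(0, h2*K), e ~ N(0, (1 - h2)*I), with h2 supplied by the caller.
    Returns the effect estimate, its standard error, and the z-like statistic.
    """
    n = len(y)
    # Phenotypic covariance implied by the relatedness random effect.
    V = h2 * K + (1.0 - h2) * np.eye(n)
    # Cholesky factor V = L L^T; multiplying by L^{-1} decorrelates samples.
    L = np.linalg.cholesky(V)
    X = np.column_stack([np.ones(n), t])  # intercept + predicted expression
    yw = np.linalg.solve(L, y)
    Xw = np.linalg.solve(L, X)
    # Ordinary least squares on the whitened data equals GLS on the original.
    beta, *_ = np.linalg.lstsq(Xw, yw, rcond=None)
    resid = yw - Xw @ beta
    sigma2 = resid @ resid / (n - X.shape[1])
    cov = sigma2 * np.linalg.inv(Xw.T @ Xw)
    se = np.sqrt(cov[1, 1])
    return beta[1], se, beta[1] / se
```

When K is the identity matrix (unrelated individuals), the whitening step has no effect and the estimate reduces to ordinary linear regression. In practice h2 would have to be estimated from the data rather than supplied, which this sketch leaves out.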
Author Summary
Understanding how genetic variation affects phenotypic variation is critical to leveraging the wealth of genetic studies to make biologically and therapeutically useful discoveries. Most of the genetic loci associated with complex diseases are regulatory in nature, meaning that they do not alter protein coding but rather subtly affect gene expression. Transcriptome-wide association studies were developed to exploit this. However, these methods apply only to human data, where large samples of unrelated individuals are available. In animal models, relatedness is much higher, which inflates false-positive rates. We propose a computationally efficient method to address this problem and use it to find shared biology between humans and rats. Taken together, this work paves the way to further explore shared biological mechanisms of complex traits across species.
Competing Interest Statement
The authors have declared no competing interest.
Footnotes
We have edited the discussion section and added additional figures to show that the calibration is robust across a large range of values.