ABSTRACT
Genetic signal detection in genome-wide association studies (GWAS) is enhanced by pooling small signals from multiple single nucleotide polymorphisms (SNPs), e.g. across genes and pathways. Because genes are believed to influence traits via gene expression, it is of interest to combine information from expression quantitative trait loci (eQTLs) in a gene or in genes belonging to the same pathway. Such methods, widely referred to as transcriptome-wide association studies (TWAS), already exist for gene-level analysis. Because most of the confounding effect of linkage disequilibrium (LD) can be eliminated from TWAS gene statistics, pathway TWAS methods would be very useful in uncovering the true molecular bases of psychiatric disorders. However, such methods are not yet available for arbitrarily large pathways/gene sets, possibly because of the quadratic (in the number of SNPs) computational burden of computing LD across large regions. To overcome this obstacle, we propose JEPEGMIX2-P, a novel TWAS pathway method that i) has a linear computational burden, ii) uses a large and diverse reference panel (33K subjects), iii) is competitive (adjusts for background enrichment in gene TWAS statistics) and iv) is applicable as-is to ethnically mixed cohorts. To underline its potential for increasing the power to uncover genetic signals over state-of-the-art and commonly used non-transcriptomic methods, e.g. MAGMA, we applied JEPEGMIX2-P to summary statistics from most of the large meta-analyses of the Psychiatric Genomics Consortium (PGC). While our work is only a first step toward clinical translation of findings for psychiatric disorders, the PGC anorexia results suggest a possible avenue for treatment.
Competing Interest Statement
The authors have declared no competing interest.
Footnotes
This manuscript builds on the theoretical foundations laid by our previous Bioinformatics papers (Chatzinakos, et al., JEPEGMIX2: improved gene-level joint analysis of eQTLs in cosmopolitan cohorts. Bioinformatics, 2018. 34(2): p. 286-288; Lee, D., et al., JEPEGMIX: gene-level joint analysis of functional SNPs in cosmopolitan cohorts. Bioinformatics, 2016. 32(2): p. 295-297; Lee, D., et al., JEPEG: a summary statistics-based tool for gene-level joint testing of functional variants. Bioinformatics, 2015. 31(8): p. 1176-82). However, i) while very successful, JEPEGMIX2 relies on gene-level analysis (first phase) and ii) due to the O(m^2) computational burden of computing linkage disequilibrium (LD) between numerous (m) SNPs across large regions (even chromosome arms), transcriptomic methods are not yet applicable to arbitrarily large pathways/gene sets. To address these limitations, we propose JEPEGMIX2-P, an extension of JEPEGMIX2 that 1) automatically estimates the ethnic composition of the cohort for a) imputing unmeasured eQTLs and b) accurately estimating LD for gene statistics, 2) uses the estimated LD and GWAS summary statistics to rapidly test for association between the trait and the expression of genes, even in the largest pathways, and 3) uses, for more accurate imputation and LD estimation, a novel, larger reference panel consisting of 33,000 subjects, including 11,000 Han Chinese. We believe that the proposed method will provide a fast and effective discovery tool for applied human genetics researchers, regardless of their primary phenotype of interest.
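To illustrate why LD computation dominates the cost of summary-statistics gene tests, the sketch below combines per-SNP GWAS Z-scores into a single gene-level Z-score using a generic weighted-sum form common to many TWAS approaches. It is not the JEPEGMIX2-P implementation: the function names, the eQTL weights `w` and the LD matrix `ld` are hypothetical placeholders, and the inputs are assumed to be already aligned to the same SNPs and alleles.

```python
import math


def weighted_gene_z(z, w, ld):
    """Generic weighted gene-level Z-score (illustrative assumption only).

    z  : GWAS Z-scores for the m eQTL SNPs of one gene
    w  : hypothetical eQTL weights for the same SNPs
    ld : m x m LD correlation matrix among those SNPs
    """
    m = len(z)
    if len(w) != m or len(ld) != m or any(len(row) != m for row in ld):
        raise ValueError("z, w and ld must have matching dimensions")

    # Weighted sum of the SNP-level Z-scores.
    numerator = sum(wi * zi for wi, zi in zip(w, z))

    # Variance of the weighted sum under the null: w' * LD * w.
    # This double loop touches all m * m LD entries, which is the
    # quadratic burden referred to in the text.
    variance = sum(w[i] * ld[i][j] * w[j] for i in range(m) for j in range(m))
    if variance <= 0.0:
        raise ValueError("weighted LD variance must be positive")

    return numerator / math.sqrt(variance)


def two_sided_p(z_score):
    """Two-sided p-value of a standard normal Z-score."""
    return math.erfc(abs(z_score) / math.sqrt(2.0))
```

With two uncorrelated SNPs the signals add in full, whereas perfectly correlated SNPs contribute no extra evidence; ignoring LD would therefore overstate significance.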