Proteome-Wide Association Studies for Blood Lipids and Comparison with Transcriptome-Wide Association Studies

Blood lipid traits are treatable and heritable risk factors for heart disease, a leading cause of mortality worldwide. Although genome-wide association studies (GWAS) have discovered hundreds of variants associated with lipids in humans, most of the causal mechanisms of lipids remain unknown. To better understand the biological processes underlying lipid metabolism, we investigated the associations of plasma protein levels with total cholesterol (TC), triglycerides (TG), high-density lipoprotein cholesterol (HDL), and low-density lipoprotein cholesterol (LDL) in blood. We trained protein prediction models based on samples in the Multi-Ethnic Study of Atherosclerosis (MESA) and applied them to conduct proteome-wide association studies (PWAS) for lipids using the Global Lipids Genetics Consortium (GLGC) data. Of the 749 proteins tested, 42 were significantly associated with at least one lipid trait. Furthermore, we performed transcriptome-wide association studies (TWAS) for lipids using 9,714 gene expression prediction models trained on samples from peripheral blood mononuclear cells (PBMCs) in MESA and 49 tissues in the Genotype-Tissue Expression (GTEx) project. We found that although PWAS and TWAS can show different directions of associations in an individual gene, 40 out of 49 tissues showed a positive correlation between PWAS and TWAS signed p-values across all the genes, which suggests a high-level consistency between proteome-lipid associations and transcriptome-lipid associations.


S1 Additional results for all lipids
Table S1: Data characteristics of the MESA dataset. For continuous variables, the mean and standard deviation (in parentheses) are displayed.

S2 Additional results for low-density lipoprotein (LDL)
Figure S2: Comparison of APOE's protein and gene expression predictive model weights with the LDL GWAS z-scores of the SNPs. The reference and alternative alleles for GWAS and the predictive models have been aligned and reordered so that all the SNPs have positive GWAS effects. The z-scores are used to compute the weighted average of the model weights (dashed lines), which have the same signs as and are proportional to the predicted effects of protein and gene expression on the GWAS outcome.
Figure S3: GWAS for LDL and prediction models for FCGR2B's protein and gene expression levels. The reference and alternative alleles for GWAS and the predictive models have been aligned and reordered so that all the SNPs have positive GWAS effects. In the center and bottom panels, the size of the circles indicates the SNP's GWAS z-score. The z-scores are used to compute the weighted average of the model weights (dashed line), which has the same sign as and is proportional to the predicted effect of protein or gene expression on the GWAS outcome.
Figure S4: Comparison of FCGR2B's protein and gene expression predictive model weights with the LDL GWAS z-scores of the SNPs. The reference and alternative alleles for GWAS and the predictive models have been aligned and reordered so that all the SNPs have positive GWAS effects. The z-scores are used to compute the weighted average of the model weights (dashed lines), which have the same signs as and are proportional to the predicted effects of protein and gene expression on the GWAS outcome.
Figure S5: GWAS for LDL and prediction models for LILRB2's protein and gene expression levels. The reference and alternative alleles for GWAS and the predictive models have been aligned and reordered so that all the SNPs have positive GWAS effects. In the center and bottom panels, the size of the circles indicates the SNP's GWAS z-score. The z-scores are used to compute the weighted average of the model weights (dashed line), which has the same sign as and is proportional to the predicted effect of protein or gene expression on the GWAS outcome.
Figure S6: Comparison of LILRB2's protein and gene expression predictive model weights with the LDL GWAS z-scores of the SNPs. The reference and alternative alleles for GWAS and the predictive models have been aligned and reordered so that all the SNPs have positive GWAS effects. The z-scores are used to compute the weighted average of the model weights (dashed lines), which have the same signs as and are proportional to the predicted effects of protein and gene expression on the GWAS outcome.
Figure S7: GWAS for LDL and prediction models for MICB's protein and gene expression levels. The reference and alternative alleles for GWAS and the predictive models have been aligned and reordered so that all the SNPs have positive GWAS effects. In the center and bottom panels, the size of the circles indicates the SNP's GWAS z-score. The z-scores are used to compute the weighted average of the model weights (dashed line), which has the same sign as and is proportional to the predicted effect of protein or gene expression on the GWAS outcome.
Figure S8: Comparison of MICB's protein and gene expression predictive model weights with the LDL GWAS z-scores of the SNPs. The reference and alternative alleles for GWAS and the predictive models have been aligned and reordered so that all the SNPs have positive GWAS effects. The z-scores are used to compute the weighted average of the model weights (dashed lines), which have the same signs as and are proportional to the predicted effects of protein and gene expression on the GWAS outcome.
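
The allele alignment and the z-score-weighted average of model weights described in these captions can be illustrated with a minimal sketch. It assumes that "aligned and reordered" means flipping the effect allele of every SNP with a negative GWAS z-score (and negating its model weight accordingly), and that the dashed line is the average of the aligned weights with the aligned z-scores as weights. The function names and the toy numbers are hypothetical and are not taken from the study; SNP variances and linkage disequilibrium are ignored here.

```python
import numpy as np

def align_to_positive_gwas(z, w):
    """Flip alleles so that every GWAS z-score is non-negative.

    Flipping the effect allele of a SNP negates both its GWAS z-score
    and its prediction-model weight, so both get the same sign change.
    """
    z = np.asarray(z, dtype=float)
    w = np.asarray(w, dtype=float)
    sign = np.where(z < 0, -1.0, 1.0)
    return z * sign, w * sign

def z_weighted_mean_weight(z, w):
    """Average of the aligned model weights, weighted by aligned z-scores."""
    z_pos, w_aligned = align_to_positive_gwas(z, w)
    return float(np.sum(z_pos * w_aligned) / np.sum(z_pos))

# Toy example (made-up values): three SNPs shared by GWAS and both models.
gwas_z = [4.0, -2.0, 1.0]
protein_weights = [0.5, 0.1, -0.2]
expression_weights = [-0.3, 0.2, 0.1]

print(z_weighted_mean_weight(gwas_z, protein_weights))     # positive
print(z_weighted_mean_weight(gwas_z, expression_weights))  # negative
```

In this toy case the two averages have opposite signs, which is the kind of pattern the figures use to show a gene whose predicted protein and expression effects on the lipid trait point in different directions.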

S3 Additional results for total cholesterol (TC)
Figure S9: GWAS for TC and prediction models for APOE's protein and gene expression levels. The reference and alternative alleles for GWAS and the predictive models have been aligned and reordered so that all the SNPs have positive GWAS effects. In the center and bottom panels, the size of the circles indicates the SNP's GWAS z-score. The z-scores are used to compute the weighted average of the model weights (dashed line), which has the same sign as and is proportional to the predicted effect of protein or gene expression on the GWAS outcome.
Figure S10: Comparison of APOE's protein and gene expression predictive model weights with the TC GWAS z-scores of the SNPs. The reference and alternative alleles for GWAS and the predictive models have been aligned and reordered so that all the SNPs have positive GWAS effects. The z-scores are used to compute the weighted average of the model weights (dashed lines), which have the same signs as and are proportional to the predicted effects of protein and gene expression on the GWAS outcome.
Figure S11: GWAS for TC and prediction models for FCGR2B's protein and gene expression levels. The reference and alternative alleles for GWAS and the predictive models have been aligned and reordered so that all the SNPs have positive GWAS effects. In the center and bottom panels, the size of the circles indicates the SNP's GWAS z-score. The z-scores are used to compute the weighted average of the model weights (dashed line), which has the same sign as and is proportional to the predicted effect of protein or gene expression on the GWAS outcome.
Figure S12: Comparison of FCGR2B's protein and gene expression predictive model weights with the TC GWAS z-scores of the SNPs. The reference and alternative alleles for GWAS and the predictive models have been aligned and reordered so that all the SNPs have positive GWAS effects. The z-scores are used to compute the weighted average of the model weights (dashed lines), which have the same signs as and are proportional to the predicted effects of protein and gene expression on the GWAS outcome.
Figure S13: GWAS for TC and prediction models for LILRB2's protein and gene expression levels. The reference and alternative alleles for GWAS and the predictive models have been aligned and reordered so that all the SNPs have positive GWAS effects. In the center and bottom panels, the size of the circles indicates the SNP's GWAS z-score. The z-scores are used to compute the weighted average of the model weights (dashed line), which has the same sign as and is proportional to the predicted effect of protein or gene expression on the GWAS outcome.
Figure S14: Comparison of LILRB2's protein and gene expression predictive model weights with the TC GWAS z-scores of the SNPs. The reference and alternative alleles for GWAS and the predictive models have been aligned and reordered so that all the SNPs have positive GWAS effects. The z-scores are used to compute the weighted average of the model weights (dashed lines), which have the same signs as and are proportional to the predicted effects of protein and gene expression on the GWAS outcome.
Figure S15: GWAS for TC and prediction models for MICB's protein and gene expression levels. The reference and alternative alleles for GWAS and the predictive models have been aligned and reordered so that all the SNPs have positive GWAS effects. In the center and bottom panels, the size of the circles indicates the SNP's GWAS z-score. The z-scores are used to compute the weighted average of the model weights (dashed line), which has the same sign as and is proportional to the predicted effect of protein or gene expression on the GWAS outcome.

S4 Additional results for triglycerides (TG)
Figure S18: GWAS for TG and prediction models for APOE's protein and gene expression levels. The reference and alternative alleles for GWAS and the predictive models have been aligned and reordered so that all the SNPs have positive GWAS effects. In the center and bottom panels, the size of the circles indicates the SNP's GWAS z-score. The z-scores are used to compute the weighted average of the model weights (dashed line), which has the same sign as and is proportional to the predicted effect of protein or gene expression on the GWAS outcome.
Figure S19: Comparison of APOE's protein and gene expression predictive model weights with the TG GWAS z-scores of the SNPs. The reference and alternative alleles for GWAS and the predictive models have been aligned and reordered so that all the SNPs have positive GWAS effects. The z-scores are used to compute the weighted average of the model weights (dashed lines), which have the same signs as and are proportional to the predicted effects of protein and gene expression on the GWAS outcome.
Figure S20: GWAS for TG and prediction models for FCGR2B's protein and gene expression levels. The reference and alternative alleles for GWAS and the predictive models have been aligned and reordered so that all the SNPs have positive GWAS effects. In the center and bottom panels, the size of the circles indicates the SNP's GWAS z-score. The z-scores are used to compute the weighted average of the model weights (dashed line), which has the same sign as and is proportional to the predicted effect of protein or gene expression on the GWAS outcome.
Figure S21: Comparison of FCGR2B's protein and gene expression predictive model weights with the TG GWAS z-scores of the SNPs. The reference and alternative alleles for GWAS and the predictive models have been aligned and reordered so that all the SNPs have positive GWAS effects. The z-scores are used to compute the weighted average of the model weights (dashed lines), which have the same signs as and are proportional to the predicted effects of protein and gene expression on the GWAS outcome.
Figure S22: GWAS for TG and prediction models for LILRB2's protein and gene expression levels. The reference and alternative alleles for GWAS and the predictive models have been aligned and reordered so that all the SNPs have positive GWAS effects. In the center and bottom panels, the size of the circles indicates the SNP's GWAS z-score. The z-scores are used to compute the weighted average of the model weights (dashed line), which has the same sign as and is proportional to the predicted effect of protein or gene expression on the GWAS outcome.
Figure S23: Comparison of LILRB2's protein and gene expression predictive model weights with the TG GWAS z-scores of the SNPs. The reference and alternative alleles for GWAS and the predictive models have been aligned and reordered so that all the SNPs have positive GWAS effects. The z-scores are used to compute the weighted average of the model weights (dashed lines), which have the same signs as and are proportional to the predicted effects of protein and gene expression on the GWAS outcome.
Figure S24: GWAS for TG and prediction models for MICB's protein and gene expression levels. The reference and alternative alleles for GWAS and the predictive models have been aligned and reordered so that all the SNPs have positive GWAS effects. In the center and bottom panels, the size of the circles indicates the SNP's GWAS z-score. The z-scores are used to compute the weighted average of the model weights (dashed line), which has the same sign as and is proportional to the predicted effect of protein or gene expression on the GWAS outcome.
Figure S25: Comparison of MICB's protein and gene expression predictive model weights with the TG GWAS z-scores of the SNPs. The reference and alternative alleles for GWAS and the predictive models have been aligned and reordered so that all the SNPs have positive GWAS effects. The z-scores are used to compute the weighted average of the model weights (dashed lines), which have the same signs as and are proportional to the predicted effects of protein and gene expression on the GWAS outcome.

S5 Additional results for high-density lipoprotein (HDL)
Figure S27: GWAS for HDL and prediction models for APOE's protein and gene expression levels. The reference and alternative alleles for GWAS and the predictive models have been aligned and reordered so that all the SNPs have positive GWAS effects. In the center and bottom panels, the size of the circles indicates the SNP's GWAS z-score. The z-scores are used to compute the weighted average of the model weights (dashed line), which has the same sign as and is proportional to the predicted effect of protein or gene expression on the GWAS outcome.
Figure S28: Comparison of APOE's protein and gene expression predictive model weights with the HDL GWAS z-scores of the SNPs. The reference and alternative alleles for GWAS and the predictive models have been aligned and reordered so that all the SNPs have positive GWAS effects. The z-scores are used to compute the weighted average of the model weights (dashed lines), which have the same signs as and are proportional to the predicted effects of protein and gene expression on the GWAS outcome.
Figure S29: GWAS for HDL and prediction models for FCGR2B's protein and gene expression levels. The reference and alternative alleles for GWAS and the predictive models have been aligned and reordered so that all the SNPs have positive GWAS effects. In the center and bottom panels, the size of the circles indicates the SNP's GWAS z-score. The z-scores are used to compute the weighted average of the model weights (dashed line), which has the same sign as and is proportional to the predicted effect of protein or gene expression on the GWAS outcome.
Figure S30: Comparison of FCGR2B's protein and gene expression predictive model weights with the HDL GWAS z-scores of the SNPs. The reference and alternative alleles for GWAS and the predictive models have been aligned and reordered so that all the SNPs have positive GWAS effects. The z-scores are used to compute the weighted average of the model weights (dashed lines), which have the same signs as and are proportional to the predicted effects of protein and gene expression on the GWAS outcome.
Figure S31: GWAS for HDL and prediction models for LILRB2's protein and gene expression levels. The reference and alternative alleles for GWAS and the predictive models have been aligned and reordered so that all the SNPs have positive GWAS effects. In the center and bottom panels, the size of the circles indicates the SNP's GWAS z-score. The z-scores are used to compute the weighted average of the model weights (dashed line), which has the same sign as and is proportional to the predicted effect of protein or gene expression on the GWAS outcome.
Figure S32: Comparison of LILRB2's protein and gene expression predictive model weights with the HDL GWAS z-scores of the SNPs. The reference and alternative alleles for GWAS and the predictive models have been aligned and reordered so that all the SNPs have positive GWAS effects. The z-scores are used to compute the weighted average of the model weights (dashed lines), which have the same signs as and are proportional to the predicted effects of protein and gene expression on the GWAS outcome.
Figure S33: GWAS for HDL and prediction models for MICB's protein and gene expression levels. The reference and alternative alleles for GWAS and the predictive models have been aligned and reordered so that all the SNPs have positive GWAS effects. In the center and bottom panels, the size of the circles indicates the SNP's GWAS z-score. The z-scores are used to compute the weighted average of the model weights (dashed line), which has the same sign as and is proportional to the predicted effect of protein or gene expression on the GWAS outcome.

Figure S16: Comparison of MICB's protein and gene expression predictive model weights with the TC GWAS z-scores of the SNPs. The reference and alternative alleles for GWAS and the predictive models have been aligned and reordered so that all the SNPs have positive GWAS effects. The z-scores are used to compute the weighted average of the model weights (dashed lines), which have the same signs as and are proportional to the predicted effects of protein and gene expression on the GWAS outcome.

Figure S17: Comparison of MESA PBMC PWAS, MESA PBMC TWAS, and GTEx tissue-specific TWAS results for TC. Panel (a): signed log p-value and significance of association. Missing values are shown in white. Significance of association is determined by the false discovery rate (FDR) threshold of 0.05. Panel (b): correlation between signed log p-values of MESA PBMC PWAS and signed log p-values of each GTEx tissue-specific TWAS (i.e., the correlation between the bottom row and every other row of the grid in Panel (a)).

Figure S26: Comparison of MESA PBMC PWAS, MESA PBMC TWAS, and GTEx tissue-specific TWAS results for TG. Panel (a): signed log p-value and significance of association. Missing values are shown in white. Significance of association is determined by the false discovery rate (FDR) threshold of 0.05. Panel (b): correlation between signed log p-values of MESA PBMC PWAS and signed log p-values of each GTEx tissue-specific TWAS (i.e., the correlation between the bottom row and every other row of the grid in Panel (a)).

Figure S34: Comparison of MICB's protein and gene expression predictive model weights with the HDL GWAS z-scores of the SNPs. The reference and alternative alleles for GWAS and the predictive models have been aligned and reordered so that all the SNPs have positive GWAS effects. The z-scores are used to compute the weighted average of the model weights (dashed lines), which have the same signs as and are proportional to the predicted effects of protein and gene expression on the GWAS outcome.

Figure S35: Comparison of MESA PBMC PWAS, MESA PBMC TWAS, and GTEx tissue-specific TWAS results for HDL. Panel (a): signed log p-value and significance of association. Missing values are shown in white. Significance of association is determined by the false discovery rate (FDR) threshold of 0.05. Panel (b): correlation between signed log p-values of MESA PBMC PWAS and signed log p-values of each GTEx tissue-specific TWAS (i.e., the correlation between the bottom row and every other row of the grid in Panel (a)).
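
The quantities shown in the PWAS/TWAS comparison figures can be sketched as follows. This is an illustration under stated assumptions rather than the study's code: the signed log p-value is taken to be -log10(p) multiplied by the sign of the association z-score, the FDR is controlled with the Benjamini-Hochberg procedure, and the correlation is a Pearson correlation computed over genes that are non-missing in both rows. The captions do not specify any of these three choices, and all function names are hypothetical.

```python
import numpy as np

def signed_log_p(z, p):
    """-log10(p) carrying the sign of the association z-score (assumed definition)."""
    return np.sign(np.asarray(z, dtype=float)) * -np.log10(np.asarray(p, dtype=float))

def benjamini_hochberg(p):
    """Benjamini-Hochberg adjusted p-values (q-values), in the input order."""
    p = np.asarray(p, dtype=float)
    n = p.size
    order = np.argsort(p)
    ranked = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, starting from the largest p-value.
    q_sorted = np.minimum.accumulate(ranked[::-1])[::-1]
    q = np.empty(n)
    q[order] = np.clip(q_sorted, 0.0, 1.0)
    return q

def correlation_ignoring_missing(x, y):
    """Pearson correlation over entries that are finite in both vectors."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    keep = np.isfinite(x) & np.isfinite(y)
    return float(np.corrcoef(x[keep], y[keep])[0, 1])

# Toy example (made-up values): four genes tested in a PWAS.
pwas_z = np.array([2.6, -2.1, 2.2, 0.7])
pwas_p = np.array([0.01, 0.04, 0.03, 0.5])
significant = benjamini_hochberg(pwas_p) < 0.05  # FDR threshold of 0.05
print(signed_log_p(pwas_z, pwas_p), significant)
```

Genes without a prediction model in a given tissue would be encoded as NaN, which corresponds to the white cells in Panel (a) and is why the correlation helper drops non-finite pairs.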