Abstract
Background Thoracic aortic aneurysm (TAA) and abdominal aortic aneurysm (AAA) are known to have a strong genetic component.
Methods and Results In a genome-wide association study (GWAS) using the UK Biobank, we analyzed the genomes of 1,363 individuals with AAA compared to 27,260 age, ancestry, and sex-matched controls (1:20 case:control study design). A similar analysis was repeated for 435 individuals with TAA compared to 8,700 controls. Polymorphism with minor allele frequency (MAF) >0.5% were evaluated.
We identified novel loci near LINC01021, ATOH8 and JAK2 genes that achieved genome-wide significance for AAA (p-value <5×10−8), in addition to three known loci. For TAA, three novel loci in CTNNA3, FRMD6 and MBP achieved genome-wide significance. There was no overlap in the genes associated with AAAs and TAAs. Additionally, we identified a linkage group of high-frequency variants (MAFs ∼10%) encompassing FBN1, the causal gene for Marfan syndrome, which was associated with TAA. In Finngen PheWeb, this FBN1 haplotype was associated with aortic dissection. Finally, we found that baseline bradycardia was associated with TAA, but not AAA.
Conclusions Our GWAS found that AAA and TAA were associated with distinct sets of genes, suggesting distinct underlying genetic architecture. We also found association between baseline bradycardia and TAA. These findings, including JAK2 association, offer plausible mechanistic and therapeutic insights. We also found a common FBN1 linkage group that is associated with TAA and aortic dissection in patients who do not have Marfan syndrome. These FBN1 variants suggest shared pathophysiology between Marfan disease and sporadic TAA.
Condensed Abstract In genome-wide association study (GWAS) of thoracic aortic aneurysm (TAA) and abdominal aortic aneurysm (AAA) using UK Biobank database, we found 3 novel loci associated with TAA, and 3 novel loci associated AAA. We also found significant association between baseline bradycardia and TAA. These findings, including JAK2 association, offer plausible mechanistic and therapeutic insights. Additionally, we identified a common FBN1 linkage group associated with TAA in patients who do not have Marfan syndrome. In the FinnGen cohort, this haplotype is associated with aortic dissection. These results suggest a shared pathophysiology between Marfan disease and sporadic TAA.
Study Limitations As with any GWAS study, the discovery of novel loci associated with aortopathies does not prove functional causality, and the findings described herein needs to be validated by analysis of other databases, ideally in a patient population of more diverse genetic origins than the UK Biobank. The use of the ICD10 codes to classify disease carriers and noncarriers in a population cohort may not be the most accurate assessment of prevalence of aortopathies. The association between baseline bradycardia and TAA does not take into account the concurrent use of medications that may impact heart rate.
Highlights
Identification of 3 novel AAA-associated loci near LINC01021, ATOH8 and JAK2 genes.
Identification of 3 novel TAA-associated loci near CTNNA3, FRMD6 and MBP genes.
Identification of a linkage group of common FBN1 variants associated with non-syndromic TAA in the UK Biobank and with aortic dissection in the FinnGen cohort, strengthening the evidence for a shared pathophysiology between Marfan disease and nonsyndromic aortopathy.
Association between baseline bradycardia and TAA but not AAA.
Introduction
Aortic aneurysms (AA) carry a significant burden of morbidity and mortality. In 2018, thoracic aortic aneurysms (TAA) and abdominal aortic aneurysms (AAA) together were responsible for 9,923 deaths in the United States, typically from complications such as aortic dissection and rupture(1). AA primarily affect elderly males in the sixth and seventh decades of life with risk factors of tobacco use, hypertension and atherosclerosis(1). However, AAs found in patients younger than 65 years are more often attributed to genetic predisposition. For AAA, individuals with first-degree family members with the disease are at two-fold risk of developing AAA as compared to patients with no family history(2). Genetic studies of TAA and AAA have revealed genetic heterogeneity and polygenic inheritance patterns with variable disease penetrance(2–4).
For thoracic aortic aneurysms and dissection (TAAD), rare monogenetic syndromic disorders such as Marfan, Ehlers-Danlos and Loeys-Dietz syndromes are known to dramatically increase risk in younger individuals(5). This risk results from mutations in FBN1, COL3A1, and genes that encode TGF-β signaling proteins respectively(5). Beyond these, up to one-fifth of patients with TAAD have a familial predisposition toward aneurysmal disease(5). For instance, mutations in FBN1 (fibrillin-1), the causal gene for Marfan syndrome, may increase the risk of thoracic aortic aneurysms or dissections even in individuals who do not have Marfan syndrome(4, 6, 7). Additionally, rare mutations in ACTA2, encoding smooth muscle protein alpha (α)-2 actin, may account for up to 14% of familial forms of TAAs(8). Furthermore, genetic studies of TAAD show it is closely associated with bicuspid aortic valve, another condition with strong heritability(5, 9).
Since AA has a strong genetic component in certain individuals, an enhanced understanding of these factors may ultimately aid the early detection of this silent disease before it progresses into life-threatening aortic dissections and ruptures. There is evidence to suggest that a strong personal or family history of aneurysms or dissection in individuals under the age of 50 should be an indication for genetic testing to diagnose inherited aortopathy(10). In appropriately selected patients with suspected familial aneurysmal disease, the yield of genetic testing could be as high as 36%(10). Yet, much of the underlying genetic risk factors remain unknown.
In this paper, we address the discovery of novel genetic loci that may portend increased risk for the development of thoracic and abdominal aortic aneurysms based on a genome-wide association study (GWAS) using data from the UK Biobank. The UK Biobank allows for powerful association studies with tremendous potential for expanding knowledge on the genetic basis of aortic aneurysms, as well as uncovering novel genetic variants associated with clinical diseases.
Methods
Ethical Approval
The present study, which involved deidentified data obtained from the UK Biobank Resource under Application Number 49852, received the proper ethical oversight, including the determination by the University of Maryland, Baltimore Institutional Review Board that the study is not human research (IRB #: HF-00088022).
Study population
The UK Biobank, which was used for the GWAS presented here, is a large, ongoing prospective cohort study that recruited 502,682 UK participants between 2006-2010, ranging in age from 40-69 years at the time of recruitment. Extensive health-related records were collected from these participants, including clinical and genetic data, with over 820,000 genotyped single nucleotide polymorphisms (SNPs) and up to 90 million imputed variants available for most individuals. We carried out a genome-wide association study using the UK Biobank to interrogate the genome for statistically significant associations between SNPs and clinical manifestations of abdominal and thoracic aortic aneurysms at the population level.
Genome-wide association study (GWAS)
Using data from the UK Biobank Resource on 487,310 subjects with imputed genotypes, we performed quality control by removing those with genetic relatedness exclusions (1,532), sex chromosome aneuploidy (651), mismatch between self-reported sex and genetically determined sex (372), recommended genomic analysis exclusions (480), and outliers for heterozygosity or missing rate (968). For “cases” we selected subjects with the following ICD10 (international classification of diseases) diagnostic codes, classified as either “main” or “secondary”: “abdominal aortic aneurysm, without mention of rupture” (1,363 patients, ICD10 code I71.4) and “abdominal aortic aneurysm, ruptured” (131 patients, ICD10 code I71.3). The selected set was purged of relatedness by removing one of each related pair in an iterative fashion until no related subjects remained. A pool of possible control subjects was generated by removing the cases and removing subjects with ICD10 codes listed as “excluded from controls” in Supplementary Table 1. Subjects with these ICD10 codes were removed from the population of controls to avoid introducing confounding factors, specifically the TAA and aortic valvular disorders, in the analysis. The pool was further reduced by removing related subjects. From the resulting reduced pool of possible controls, 20 control subjects were selected for each case subject, matched by sex, age and ancestry with sex as a required match (n=27,260 controls).
Incremental tolerances were used for age and ancestry with tolerances being expanded with each iteration until the desired number of matching controls were found for each case subject. The age tolerance ranged from 0 (exact match) up to 7 years. Ancestry matching was performed using principal components (PCs) supplied by the UK Biobank. The mathematical distance in a graph created by plotting PC1 x PC2 was used to test ancestry similarity. The ancestry “distance” tolerated ranged from 2 PC units up to a maximum of 80 PC units, where PC1 ranged from 0 to +400 and PC2 ranged from -300 to +100 units. Using these tolerances, 20 matching controls were found for every case. The analysis was repeated with patients who carried either a main or secondary diagnosis of “thoracic aortic aneurysm, without mention of rupture” (435 patients, ICD10 code I71.2) or “thoracic aortic aneurysm, ruptured” (22 patients, ICD10 code I71.1). Case to control ratio was again set at 1:20 (n=8,700 controls), and patients with the diagnoses listed in Supplementary Table 2 were excluded from the control population. Subjects with these ICD10 codes were removed from the population of controls to avoid introducing confounding factors, specifically the AAA and aortic valvular disorders, in the analysis.
The association analysis was performed with PLINK2 using logistic regression(11). The thoracic aortic aneurysm (TAA) and abdominal aortic aneurysm (AAA) phenotypes were run against 40 million imputed variants supplied by the UK Biobank with imputation quality scores greater than 0.70. The analysis included covariates of sex, age, and principal components 1 through 5 to adjust for ancestry. Pre-calculated PC data was supplied by the UK Biobank. Our preliminary analysis showed that only the first 5 PCs had significance.
Identification of significant SNPs for AAA and TAA Phenotypes
The SNPs identified in the analysis were filtered to include only those with minor allele frequency (MAF) of at least 0.5% and p-value <1 × 10−6, which is suggestive of genome-wide significance. The potential functional significance of associated variants was assessed by Eigen PC scores, presence of promoter or enhancer elements, and presence of DNase hypersensitivity sites in the affected regions of the genome. The Eigen PC score provides an aggregate score of functional importance for variants of interest, and is useful for prioritizing likely causal variants in a given genomic region. More positive Eigen PC scores are suggestive of more functional (disruptive) variants.
Results
Abdominal Aortic Aneurysm
Of the 1,363 individuals documented to have abdominal aortic aneurysms, 131 (9.61%) had rupture of the aneurysm (Table 1). The affected patients ranged in age at diagnosis from 42.42 to 79.04 years (mean = 68.08); 86.72% were male and 13.28% female. Genetic ancestry was predominantly British (93.54%); however, patients were also represented from Irish (2.64%), Indian (0.22%), Caribbean (0.51%), and African (0.15%) backgrounds. Based on GWAS of these 1,363 individuals, we identified four independent loci near the LINC01021, ADAMTS8, ATOH8 and JAK2 genes (7 SNPs in total) that achieved a genome-wide significance level of p-value <5 × 10−8 for variants with MAF ≥0.5% (Figure 1A). Of these, LINC01021, ADAMTS8 and JAK2 represent novel loci associated with AAA, and each of the four loci harbors SNPs that possess features suggestive of functional importance, with biological plausibility as disease susceptibility loci (Table 2). In addition, we found several distinct variants that do not reach the definitive threshold for genome-wide significance but nevertheless possess a strong basis for biologic plausibility and are within the suggestive threshold for genome-wide significance (p-value <1e−6; Table 2). Full GWAS results for AAA are included in Supplementary Table 5 for variants with MAF ≥0.5% and p-value <1 × 10−6.
Among the SNPs with genome-wide significance, we identified a linkage group of 4 variants in close proximity to the ADAM metallopeptidase with thrombospondin type 1 motif 8 gene (ADAMTS8), which encodes an inflammation-regulated enzyme expressed in macrophage-rich areas of atherosclerotic plaques(12). Prior studies have described upregulation of ADAMTS8 in the macrophages of patients with abdominal aortic aneurysms(13). The significant linkage group of ADAMTS8 variants that we identified includes rs7936928 (intronic), rs4936099 (intronic), rs11222084 (intergenic), and rs3740888 (intronic); these variants have p-values 7.51 × 10−9-1.59 × 10−8, MAF 36.7%-40.7%, and odds ratios 0.785-0.790 (Table 2; Figure 1B). Of note, the ADAMTS8 locus was recently identified among 14 novel AAA-risk loci identified from a study of the Million Veteran Program(14).
Other notable variants identified in this genome-wide analysis include the intronic variant rs193181528 (Table 2; Figure 1C; p-value 3.26 × 10−8, MAF 0.58%, OR 2.776), located within the gene encoding JAK2 tyrosine kinase. Mutations in JAK2 are suspected to play a potential role in the progression of AAAs(15, 16). Among human aortic tissues collected from patients undergoing AAA surgery, JAK2 expression levels were higher in patients with AAA as compared to controls(15). Treatment with JAK2/STAT3 pathway inhibitors attenuated experimental AAA progression by reducing the expression of pro-inflammatory cytokines and matrix metalloproteinases as well as inflammatory cell infiltration(15, 16).
The analysis also identified variant rs113626898 (Table 2; Figure 1D; p-value 9.06 ×10− 9, MAF 0.59%, OR 2.714) in the gene encoding atonal bHLH transcription factor 8 (ATOH8), which plays a role in myogenesis and contributes to endothelial cell differentiation, proliferation and migration(17). Dysregulation of this gene could plausibly contribute to aneurysmal formation given its important role in the cell cycles of myocytes and endothelial cells.
Variant rs116390453 (Table 2; Figure 1E; p-value 4.26 ×10−9, MAF 0.77%, OR 2.505) is within a long intergenic non-coding RNA (LINC01021), also known as p53 upregulated regulator of p53 levels (PURPL). To date, PURPL has not been linked to aneurysmal formation.
In addition, we identified several distinct variants that do not reach the standard threshold for genome-wide significance for association with AAA, but are nevertheless within the suggestive threshold for genome-wide significance (p-value <1 × 10−6) and possess a strong basis for biologic plausibility (Table 2). Variant rs1537373 (Figure 1F; p-value 6.68 × 10−7, MAF 49.9%, OR 0.821) in CDKN2B-AS1, encoding the long non-coding RNA known as cyclin dependent kinase inhibitor CDKN2B antisense RNA1, is located within the CDKN2B-CDKN2A gene cluster at chromosome 9p21, a major genetic susceptibility locus for coronary artery disease, atherosclerosis, and myocardial infarction(18). This locus has also been previously associated with intracranial aneurysm and AAA formation(14, 19–22). Thus, CDKN2B-AS1 variant rs1537373 may increase the risk of AAA formation indirectly through the development of atherosclerosis, a major clinical risk factor for AAA.
Finally, our analysis identified a linkage group of six SNPs within the CELSR2 gene, encoding the cadherin EGF LAG seven-pass G-type receptor 2 (Table 2; Figure 1G; p-values 2.04 ×10−7-6.81 ×10−7, MAF 21.3-22%, OR 0.767-0.777). While CELSR2 SNPs do not meet traditional P-value cutoff for genome-wide significance, our findings corroborate prior GWAS associations of CELSR2 with AAA(14, 23, 24), and identify new common SNPs that are associated with AAA (specifically rs3832016 and rs660240). Of these variants, rs12740374, rs660240 and rs7528419 have a particularly high Eigen PC score of 4.4, 3.8 and 3.2, respectively, suggesting a functional role (Table 2).
Thoracic Aortic Aneurysm
Of the 435 individuals with thoracic aortic aneurysms, 22 (5.06%) had rupture of the aneurysm (Table 1). The affected patients ranged in age at diagnosis from 36.47 to 78.65 years (mean = 65.28); 68.74% were male and 31.26% female, with genetic ancestry of British (87.59%), Irish (2.07%), Indian (1.61%), Caribbean (1.38%), and African (0.23%) origins. Based on GWAS, we identified three SNPs that achieved a genome-wide significance level (p-value <5 × 10−8) together with a MAF ≥0.5% (Figure 2A; Table 3). Full GWAS results for TAA are included in Supplementary Table 6 for variants with MAF ≥ 0.5%.
Among the SNPs with significant p-values, variant rs149014140 in CTNNA3 gene is of particular interest (Figure 2B; p-value 1.82 × 10−8, MAF 0.78%, OR 4.268) since CTNNA3 encodes a vinculin/alpha-catenin family protein known to play a role in cell-to-cell adhesion of muscle cells(25). Another significant variant is rs148927240 (Figure 2C; p-value 2.19 × 10−8, MAF 0.71%, OR 4.23), an intergenic variant located between the long non-coding RNA FERM domain containing 6 (FRMD6), involved in cell contact inhibition and cell cycle regulation(26), and the gene encoding one of the gamma subunits of a guanine nucleotide-binding protein (GNG2). Finally, variant rs78851735 (Figure 2D; p-value 3.79 × 10−8, MAF 0.98%, OR 3.446) is an intronic variant within the gene encoding myelin basic protein (MBP), which is the major protein in myelin sheaths of the nervous system(27). The biological relevance of FRMD6, GNG2 and MBP with respect to the development of aortic aneurysms is unclear at this time.
In addition, we identified we identified a linkage group of high-frequency variants (MAFs 9.56-9.97%, odds ratios 1.615-1.644) that do not reach the standard threshold for genome-wide significance for association with TAA, but nevertheless have a strong basis for biologic plausibility since they fall in a large region of linkage disequilibrium encompassing FBN1 (Table 5). FBN1 encodes the fibrillin-1 protein and is implicated in the pathogenesis of Marfan syndrome as well as other disorders of the connective tissues including ectopia lentis syndrome, Weill-Marchesani syndrome, Shprintzen-Goldberg syndrome and neonatal progeroid syndrome(28). Fibrillin-1 is important in maintaining the integrity of connective tissues throughout the body, as it serves as a structural component of calcium-binding myofibrils(28). Intronic variant rs1561207 (Figure 3A; p-value 1.57 ×10−6, MAF 9.89%, OR 1.615) is of special interest given its high Eigen PC score. This variant is in near-perfect linkage with the seven other FBN1 variants, all of which meet the suggestive statistical threshold for genome-wide significance (p-value <1 × 10−6) (Table 4).
Interestingly, each of these FBN1 variants demonstrated a pronounced dose-dependence: homozygotes had significantly higher prevalence of thoracic aortic aneurysm than heterozygotes (Figure 3A). Comorbid conditions of TAA patients with these FBN1 variants (Supplementary Table 3), as well as gender and age at diagnosis (Supplementary Table 4) were similar to the overall population of TAA patients. In FinnGen PheWeb, comprised of genetic and clinical information on 178,899 Finnish cohort, this haplotype was associated with aortic dissection (p =2.3 × 10−5; Supplementary Figure 1). Interestingly, GWAS of thoracic aorta images in the UK Biobank revealed similar association of FBN1 with dilated aorta(29). Thus, our study strengthens the emerging functional association between FBN1 and nonsyndromic aortopathy(30).
Relationship between pulse rate and prevalence of aortic aneurysms
When we analyzed AA prevalence by baseline pulse rate for each of the statistically significant SNPs identified in this study, an unexpected correlation emerged between heart rate and prevalence of aortic aneurysms. For TAA, but not AAA, there was a general trend toward increased prevalence in individuals with bradycardia (defined as heart rate ≤54 beats per minute), regardless of genotype (Figure 3C). Interestingly, this trend seems more marked for homozygous carriers of FBN1 variants (Figure 3B). In contrast, a general trend toward slightly increased AAA prevalence is seen with tachycardia, although the effect is smaller overall (Supplementary Figure 2).
Discussion
Genome wide association analysis of abdominal and thoracic aortic aneurysmal disease in the UK Biobank revealed novel loci near LINC01021, ATOH8 and JAK2 genes associated with AAA, and novel loci near CTNNA3, FRMD6 and MBP genes associated with TAA. Out of the 24 loci previously established for AAA, three were replicated by our analysis, ADAMTS8, CELSR2 and CDKN2B-AS1(14, 31). Based on the data compiled here with the thresholds for p-value and minor allele frequency as set forth in the methods section, there was no significant overlap in the SNPs associated with AAAs and those associated with TAAs. This suggests a distinct underlying genetic architecture, and a distinct pathophysiology, of these two aortopathies.
A potentially clinically relevant finding is the identification of a linkage group of SNPs encompassing the FBN1 gene, which is associated with TAA in the UK Biobank and with aortic dissection in FinnGen cohorts. This haplotype demonstrated a pronounced dose-dependence, with homozygous carriers associated with ∼2.2-fold higher prevalence of thoracic aortic aneurysm than heterozygotes. Given the relatively high minor allele frequencies for these SNPs (9.56-9.97%), as well as the well-defined role of FBN1 in the pathogenesis of connective tissues disorders including Marfan syndrome, we hypothesize that mutations within this linkage group may account for a non-trivial portion of nonsyndromic thoracic aortic aneurysms and dissections, particularly those within the context of positive family history. Therefore, these variants could merit inclusion in genetic screening panels for familial thoracic aneurysmal disease. Taken together with the findings of Pirruccello, et al., our finding suggests some degree of shared pathophysiology between aortic disease in Marfan syndrome and sporadic thoracic aortic aneurysm(29).
An unexpected finding of this study is the apparent association of TAA prevalence and baseline bradycardia. It is unknown whether bradycardia is a consequence of beta-adrenergic blocker usage in those diagnosed with TAA. Nevertheless, the fact that this association is seen only with TAA, but not AAA, suggests some biological basis. For example, widening of pulse pressure accompanying bradycardia may disproportionately increase wall tension in the proximal thoracic aorta but not the distal abdominal aorta. The possibility that bradycardia somehow contributes to TAA formation and therefore should be avoided in susceptible individuals, particularly with FBN variants, may warrant further investigation.
The approach used in this paper has several limitations. As with any GWAS study, the discovery of novel loci associated with aortopathies does not prove functional causality, and the findings described herein needs to be validated by analysis of other databases, ideally in a patient population of more diverse genetic origins than the UK Biobank. Nonetheless, some of these loci, such JAK2, suggest plausible mechanistic and therapeutic insights that merit further investigation. We also note that our study identified fewer associations than recent GWAS studies on the Million Veterans Program (MVP) cohort and on aortic images in the UK Biobank(14, 29). The fewer associations we identified compared to the MVP study may due to the fact that we examined 1,363 patients with AAA whereas the MVP study examined 7,642 patients(14). Additionally, the greater number of associations found by Pirruccello et al. could reflect the fact that they used imaging to identify individuals with subclinical aortopathy not captured by ICD10 codes(29). Finally, we also note that, in addition to the variants and loci discussed here, there are many more that didn’t make the genome-wide significance cutoff of p < 5 × 10−8 or MAF cutoff ≥ 0.5%. Thus, much of the genetic underpinnings of abdominal and thoracic aortic aneurysm formation remain to be discovered.
Competing interests
The authors declare no competing interests.
Acknowledgements
This research was conducted using the UK Biobank Resource under Application Number 49852. This work was supported by NIGMS R01GM118557 and NHLBI R01HL1351291 to CCH, and NHLBI 1U01HL137181 to JP. The funders had no role in the design and conduct of the study; collection, management, analysis and interpretation of the data; preparation, review or approval of the manuscript; or decision to submit the manuscript for publication.
Abbreviations
- AA
- (aortic aneurysm)
- AAA
- (abdominal aortic aneurysm)
- GWAS
- (genome-wide association study)
- ICD
- (international classification of diseases)
- LD
- (linkage disequilibrium)
- MAF
- (minor allele frequency)
- MVP
- (Million Veteran Program)
- PC
- (principal component)
- SNP
- (single nucleotide polymorphism)
- TAA
- (thoracic aortic aneurysm)
- TAAD
- (thoracic aortic aneurysms and dissection)
- UK
- (United Kingdom)