The putative causal effect of type 2 diabetes in risk of cataract: a Mendelian randomization study in East Asian

Background The epidemiological association between type 2 diabetes and cataract is well established. However, it remains unclear whether the two diseases share a genetic basis and, if so, whether this reflects a causal relationship. Methods We used East Asian population-based genome-wide association study (GWAS) summary statistics of type 2 diabetes (Ncase=36,614, Ncontrol=155,150) and cataract (Ncase=24,622, Ncontrol=187,831) to comprehensively investigate the shared genetics between the two diseases. We performed (1) linkage disequilibrium score regression (LDSC) and heritability estimation from summary statistics (ρ-HESS) to estimate the genetic correlation and local genetic correlation between type 2 diabetes and cataract; (2) multiple Mendelian randomization (MR) analyses to infer the putative causality between type 2 diabetes and cataract; and (3) summary-data-based Mendelian randomization (SMR) to identify candidate risk genes underlying the causality. Results We observed a strong genetic correlation (rg=0.58; p-value=5.60×10−6) between type 2 diabetes and cataract. Both ρ-HESS and multiple MR methods consistently showed a putative causal effect of type 2 diabetes on cataract, with estimated liability-scale MR odds ratios (ORs) of around 1.10 (95% confidence interval [CI] ranging from 1.06 to 1.17). In contrast, no evidence supported a causal effect of cataract on type 2 diabetes. SMR analysis identified two novel genes, MIR4453HG (βSMR=−0.34, p-value=6.41×10−8) and KCNK17 (βSMR=−0.07, p-value=2.49×10−10), whose expression levels were likely involved in the putative causal effect of type 2 diabetes on cataract. Conclusions Our results provide robust evidence supporting a causal effect of type 2 diabetes on the risk of cataract in East Asians, and suggest new directions for the prevention and early-stage diagnosis of cataract in patients with type 2 diabetes.
Key Messages
• We used genome-wide association studies of type 2 diabetes and cataract in a large Japanese population-based cohort and found a strong genetic overlap between the two diseases.
• We applied multiple Mendelian randomization models, which consistently indicated a putative causal effect of type 2 diabetes on the development of cataract.
• We identified two candidate genes, MIR4453HG and KCNK17, whose expression levels are likely relevant to the causality between type 2 diabetes and cataract.
• Our study provides a theoretical foundation at the genetic level for improving early diagnosis, prevention and treatment of cataract in type 2 diabetes patients in clinical practice.

Introduction
Type 2 diabetes is one of the most prevalent chronic diseases in East Asians 1, and cataract is a major cause of vision impairment among patients with type 2 diabetes 2. Previous studies 2,3 have revealed a strong phenotypic association between type 2 diabetes and cataract in East Asians. For instance, Foster et al. conducted a cross-sectional study of 1,206 Singapore Chinese and found that patients with diabetes had a higher risk of cortical cataract 3. Another Asian population-based study recruited 10,033 participants and identified diabetes as a significant risk factor for an elevated incidence of cataract surgery 2.
The phenotypic association between type 2 diabetes and cataract could be partially explained by their shared genetics 4,5. As evidence of this shared genetics, Lee et al. 4 analyzed a Hong Kong Chinese cohort and found that cataract was common in patients with type 2 diabetes who carried microsatellite polymorphisms around aldose reductase-related genes. Lin et al. 5 identified multiple candidate genes that had significantly different expression levels in type 2 diabetes patients with a higher Lens Opacities Classification System (LOCS) score (i.e., a system used to grade age-related cataract 6), compared with patients with a zero or minor LOCS score. However, the magnitude of the genetic association between type 2 diabetes and cataract remains unclear, as does the question of whether their genetic association reflects a causal relationship.
Traditional methods estimate shared genetics by comparing the concordance between monozygotic and dizygotic twins 7, and establish causal conclusions using randomized controlled trials (RCTs) 8, a widely accepted gold standard for causal inference. However, these methods are sometimes limited or impracticable because of their own methodological weaknesses, such as a laborious data collection process or an unethical study design. With the development of genome-wide association studies (GWAS) during the past decades, alternative statistical methods have been proposed to estimate the shared genetics between focal traits directly from GWAS summary data 9. For instance, Bulik-Sullivan et al. 10 developed linkage disequilibrium score regression (LDSC) to estimate the contribution of polygenic genetic effects to a focal trait (i.e., single-trait heritability) and the magnitude of the genetic overlap between two traits (i.e., cross-trait genetic correlation). Shi et al. 11 extended LDSC and proposed heritability estimation from summary statistics (ρ-HESS), a method to quantify the local single-trait heritability and cross-trait genetic correlation within approximately LD-independent genomic regions. For pairs of traits with a significant genetic correlation, Mendelian randomization (MR) 12 methods can infer the potential genetic causal relationship between the traits using single nucleotide polymorphisms (SNPs) as instruments. To further investigate putative functional genes underlying the susceptibility to a trait, Zhu et al. 13 proposed summary data-based Mendelian randomization (SMR), an approach that identifies genes whose expression levels are associated with a target trait by integrating GWAS summary data with expression quantitative trait loci (eQTL) summary data.
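For orientation, the regression behind cross-trait LDSC can be written as below. This is a sketch of the commonly cited form of the model, not a statement of the exact settings used in this study. The symbols are introduced here for illustration and do not appear in the text above.

```latex
\begin{align}
\mathbb{E}\left[z_{1j}\,z_{2j}\right]
  &= \frac{\sqrt{N_1 N_2}\,\rho_g}{M}\,\ell_j
   + \frac{\rho\,N_s}{\sqrt{N_1 N_2}}, \\
r_g &= \frac{\rho_g}{\sqrt{h_1^2\,h_2^2}}
\end{align}
```

Here z_{1j} and z_{2j} are the GWAS z-scores of SNP j for the two traits, N_1 and N_2 are the sample sizes, M is the number of SNPs, ℓ_j is the LD score of SNP j, ρ_g is the genetic covariance, N_s is the number of overlapping samples, ρ is the phenotypic correlation among those samples, and h_1^2 and h_2^2 are the single-trait SNP heritabilities. The slope of the regression on ℓ_j identifies ρ_g, while the intercept absorbs any sample overlap.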
In this study, we leveraged large East Asian population-based GWAS summary statistics of type 2 diabetes and cataract from the BioBank Japan Project (BBJ) 14 to comprehensively investigate the shared genetics between the two diseases. We applied LDSC, ρ-HESS, and seven MR or MR-equivalent approaches to estimate the genetic correlation, local genetic correlation, and potential genetic causality between type 2 diabetes and cataract, respectively.
We also applied SMR to the single-trait GWAS (i.e., type 2 diabetes, cataract) and to the cross-trait GWAS meta-analysis of type 2 diabetes and cataract to explore candidate genes involved in the causality between the two diseases. A brief overview of our study is given in Fig. 1.

GWAS Data Source
We downloaded the GWAS summary statistics of type 2 diabetes 15 and cataract 16

ρ-HESS of local genetic correlation
To explore whether type 2 diabetes had significant genetic overlap with cataract in specific independent genomic regions, we performed ρ-HESS 11 to estimate the local genetic correlations between type 2 diabetes and cataract, using the hg19-based 1000 Genomes East Asian reference. A total of 1,439 approximately LD-independent genomic regions (excluding the MHC region) 23 were used in our analysis. Regions were excluded if the estimated local single-trait heritability was negative, which reflects insufficient statistical power.
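The exclusion rule above can be illustrated with a minimal sketch. It assumes the usual definition of a local genetic correlation as the local genetic covariance divided by the square root of the product of the two local heritabilities. The function name, region identifiers, and numbers are hypothetical, and treating a heritability of exactly zero as excluded is an added assumption to avoid division by zero.

```python
from math import sqrt
from typing import Optional


def local_rg(local_cov: float, h2_t2d: float, h2_cat: float) -> Optional[float]:
    """Return the local genetic correlation of one region, or None if excluded."""
    # A non-positive local heritability estimate makes the ratio undefined,
    # so the region is dropped (the text above mentions negative estimates;
    # zero is excluded here as an extra assumption).
    if h2_t2d <= 0 or h2_cat <= 0:
        return None
    return local_cov / sqrt(h2_t2d * h2_cat)


# Hypothetical per-region estimates (not taken from the study).
regions = [
    {"id": "region_a", "cov": 2.0e-5, "h2_t2d": 1.0e-4, "h2_cat": 4.0e-5},
    {"id": "region_b", "cov": 1.0e-5, "h2_t2d": -2.0e-5, "h2_cat": 3.0e-5},
]

kept = {}
for region in regions:
    rg = local_rg(region["cov"], region["h2_t2d"], region["h2_cat"])
    if rg is not None:
        kept[region["id"]] = rg

print(kept)
```

In this toy input, region_b is dropped because its type 2 diabetes heritability estimate is negative.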
The estimated local genetic correlations were divided into four regional types: (1) regions harboring significant type 2 diabetes-specific SNPs (i.e., 'type 2 diabetes-specific'); (2) regions harboring significant cataract-specific SNPs (i.e., 'cataract-specific'); (3) regions harboring shared SNPs significantly associated with both type 2 diabetes and cataract (i.e., 'intersection'); and (4) other regions (i.e., 'neither'). Three GWAS p-value thresholds, 5×10−8, 1×10−5, and 1×10−3, were used to define the significant SNPs. For each regional type containing more than 10 regions, we calculated the mean and standard error of the local genetic correlations. A potential causal effect of type 2 diabetes on cataract is suggested if the average local genetic correlation is significantly different from zero in type 2 diabetes-specific regions but not in cataract-specific regions. The opposite pattern suggests a potential causal effect of cataract on type 2 diabetes. In addition, a non-zero average local genetic correlation in intersection regions may indicate pleiotropic effects.
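The grouping step described above can be sketched as follows. This is an illustration under simplifying assumptions, not the authors' pipeline: each region is summarized by its smallest GWAS p-value for each trait, and a region significant for both traits is labeled 'intersection' even if the two signals come from different SNPs. All names and toy values are hypothetical.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev


def region_type(min_p_t2d, min_p_cat, threshold):
    """Classify a region from its smallest per-trait GWAS p-values."""
    t2d_hit = min_p_t2d < threshold
    cat_hit = min_p_cat < threshold
    if t2d_hit and cat_hit:
        return "intersection"  # simplification: region-level, not SNP-level
    if t2d_hit:
        return "type 2 diabetes-specific"
    if cat_hit:
        return "cataract-specific"
    return "neither"


def summarise(regions, threshold, min_regions=10):
    """Mean and standard error of local rg for types with > min_regions regions."""
    groups = defaultdict(list)
    for min_p_t2d, min_p_cat, rg in regions:
        groups[region_type(min_p_t2d, min_p_cat, threshold)].append(rg)
    summary = {}
    for name, values in groups.items():
        if len(values) > min_regions:
            summary[name] = (mean(values), stdev(values) / sqrt(len(values)))
    return summary


# Toy input: (min p for type 2 diabetes, min p for cataract, local rg).
toy = [(1e-9, 0.2, 0.1 if i % 2 else 0.3) for i in range(12)]
toy += [(0.5, 1e-9, 0.0) for _ in range(3)]  # only 3 regions: not summarized

result = summarise(toy, threshold=5e-8)
for name, (avg, se) in result.items():
    print(f"{name}: mean rg = {avg:.3f}, SE = {se:.3f}")
```

Dividing each mean by its standard error gives a simple z-statistic for whether the average local genetic correlation of a regional type differs from zero.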

MR analyses for genetic causality inference
The causal relationship between type 2 diabetes and cataract was evaluated by six MR approaches (i.e., inverse variance weighted [IVW] model 24 , MR-Egger model 25  We performed these models using R packages "cause" (version: 1.0.0), "LCV", "gsmr"
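As one illustration of the models named above, the fixed-effect IVW estimator can be sketched in a few lines. The study itself reports using R packages, so this Python version is only a didactic sketch. The function name and the toy effect sizes are hypothetical, and the conversion to an odds ratio assumes the outcome effects are on the log-odds scale.

```python
from math import exp, sqrt


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse variance weighted MR estimate and its standard error."""
    numerator = sum(
        bx * by / se ** 2
        for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome)
    )
    denominator = sum(
        bx ** 2 / se ** 2 for bx, se in zip(beta_exposure, se_outcome)
    )
    return numerator / denominator, sqrt(1.0 / denominator)


# Hypothetical per-SNP effects on the exposure and on the outcome.
bx = [0.10, 0.20, 0.15]
by = [0.010, 0.020, 0.015]
se_y = [0.005, 0.004, 0.006]

beta, se = ivw_estimate(bx, by, se_y)
odds_ratio = exp(beta)
lower, upper = exp(beta - 1.96 * se), exp(beta + 1.96 * se)
print(f"beta = {beta:.3f}, OR = {odds_ratio:.3f} (95% CI {lower:.3f} to {upper:.3f})")
```

The estimator is a weighted regression of the SNP-outcome effects on the SNP-exposure effects through the origin. MR-Egger relaxes this by also fitting an intercept.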

Strong genetic association between type 2 diabetes and cataract
As shown in Table 1

ρ-HESS analyses of local genetic correlations
We conducted ρ-HESS to estimate the local heritability of type 2 diabetes and cataract (detailed results in Table S1 and Fig. 2A). We also estimated the local genetic covariance and correlation between type 2 diabetes and cataract in 824 regions (detailed in Table S1) after excluding the regions with negative local heritability. As shown in Table 2, we identified six genomic regions on different chromosomes at a nominal significance level (p-value<0.05), with estimated local rg in the range [0.48, 1).
We further investigated the distribution of local genetic correlations in the four regional types. As shown in Fig. 2B, regions harboring type 2 diabetes-specific SNPs had an average local rg significantly higher than zero. In contrast, regions harboring cataract-specific SNPs showed a non-significant average local rg close to zero. The distribution of local rg revealed by ρ-HESS therefore suggested a putative causal effect of type 2 diabetes on cataract. In addition, the average local rg in the 'intersection' regions harboring shared significant SNPs with GWAS p-value<1×10−3 was estimated at 0.22 (Fig. S1).

Two candidate genes likely involved in the causality of type 2 diabetes on cataract
We performed a cross-trait meta-analysis of type 2 diabetes and cataract using METAL and identified 9 independent 'novel' SNPs that were associated with the cross-trait phenotype of type 2 diabetes and cataract but not with type 2 diabetes or cataract in the original GWAS (Table S3).
Next, we applied SMR to the single-trait GWAS and the cross-trait meta-analysis GWAS of type 2 diabetes and cataract, and identified two candidate risk genes (Table 3), MIR4453HG (βSMR=−0.34; SMR p-value=6.41×10−8; HEIDI p-value=0.08 from 13 SNPs) and KCNK17 (βSMR=−0.07; SMR p-value=2.49×10−10; HEIDI p-value=0.08 from 17 SNPs), whose expression levels were negatively associated (i.e., a lower gene expression level increases the disease risk) with the susceptibility to co-morbid type 2 diabetes and cataract but not with the single traits. These genes likely play crucial roles in the causal effect of type 2 diabetes on cataract.
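To make βSMR and its p-value concrete, the sketch below computes the SMR effect estimate and the approximate one-degree-of-freedom chi-square test statistic for a single top cis-eQTL SNP, following the commonly described form of the SMR test. The HEIDI heterogeneity test is not implemented. The function name and input values are hypothetical and unrelated to the results in Table 3.

```python
from math import erfc, sqrt


def smr_test(b_gwas, se_gwas, b_eqtl, se_eqtl):
    """SMR effect of gene expression on the trait, test statistic and p-value."""
    z_gwas = b_gwas / se_gwas  # SNP effect on the trait, as a z-score
    z_eqtl = b_eqtl / se_eqtl  # SNP effect on gene expression, as a z-score
    b_smr = b_gwas / b_eqtl    # ratio estimate of expression -> trait effect
    t_smr = (z_gwas ** 2 * z_eqtl ** 2) / (z_gwas ** 2 + z_eqtl ** 2)
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = erfc(sqrt(t_smr / 2.0))
    return b_smr, t_smr, p_value


# Hypothetical summary statistics for one top cis-eQTL SNP.
b_smr, t_smr, p_value = smr_test(b_gwas=0.05, se_gwas=0.01, b_eqtl=-0.5, se_eqtl=0.05)
print(f"b_SMR = {b_smr:.2f}, T_SMR = {t_smr:.1f}, p = {p_value:.2e}")
```

A negative b_SMR, as in this toy input, corresponds to the situation described above in which lower expression is associated with higher disease risk.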

Discussion
To our knowledge, this is the first study to quantify the genetic correlation and explore the potential causality between type 2 diabetes and cataract specifically using East Asian population-based GWAS summary statistics. Our results extend current knowledge of the shared genetic architecture between type 2 diabetes and cataract.
Previously, researchers preferred to define the co-occurrence of cataract and diabetes as a single outcome (i.e., diabetic cataract) and explored its genetics directly. For example, Lin et al. 5 performed a GWAS using 758 Chinese cases with type 2 diabetic cataract and 649 healthy controls and identified 15 independent genome-wide significant SNPs, which are associated with blood sugar regulation and cataract development. Another study 37 recruited 2,501 Scottish cases and 3,032 controls and found a significant role of rs2283290 in triggering diabetic cataract. Instead of using a single GWAS dataset with a small number of diabetic cataract patients, we leveraged large population-based GWAS summary statistics of type 2 diabetes and cataract, an approach that is more powerful and provides robust evidence supporting the shared genetics between type 2 diabetes and cataract 3,4,38.
Using ρ-HESS, we identified six genomic regions with a significant local genetic correlation between type 2 diabetes and cataract. Assuming that these regions contribute to the causal effect of type 2 diabetes on cataract, any SNPs or genes located within them and associated with type 2 diabetes and/or cataract risk are of great interest for understanding the underlying mechanisms. Therefore, we collected information from a total of 254 SNPs in ClinVar 39 and genes in Malacards (supported by trustworthy sources or Cochrane-based reviews 40) for further analyses (Table S5). SNPs or genes located in the significant genomic regions estimated by ρ-HESS were found to be associated with cataract risk. Additionally, this result revealed that a large proportion of the shared genetics between type 2 diabetes and cataract came from the 'type 2 diabetes-specific' regions.
These findings provided further evidence that the strong genetic correlation between type 2 diabetes and cataract is due to the type 2 diabetes-specific variants.
The seven MR and MR-equivalent methods provided consistent results supporting a causal effect of type 2 diabetes on cataract. Our findings raise an important clinical concern for the prevention and early diagnosis of cataract in patients with type 2 diabetes. We provide a theoretical basis at the genetic level suggesting that assessing the development and severity of type 2 diabetes is likely to yield new targets for the early diagnosis of cataract, although further studies are required to pinpoint the aetiology underlying type 2 diabetes and cataract.
We also tried to replicate our findings in Europeans using publicly available European population-based GWAS summary statistics of type 2 diabetes 43 (Ncase=62,892, Ncontrol=596,424) and cataract (Ncase=5,045, Ncontrol=356,096; UKB field ID: 20002; accessed from URL: http://www.nealelab.is/uk-biobank). However, LDSC analysis indicated a non-significant genetic correlation between the two diseases using either the European or the East Asian reference (see Table S6). This result suggests that the shared genetic variance between type 2 diabetes and cataract in East Asians may be strongly heterogeneous compared with that in Europeans. Future investigations are required to better understand this difference.
To identify any blood-based biomarkers that may contribute to the causal effect of type 2 diabetes, we performed the multi-trait-based conditional & joint analysis (mtCOJO) 44  However, this result was possibly caused by the high genetic correlation between HbA1c and type 2 diabetes (rg=0.57 and 0.84 with and without constrained intercept), which may greatly decrease the heritability of type 2 diabetes and thus reduce the genetic correlation and putative causal relationship between type 2 diabetes and cataract. Future investigations with larger sample sizes should examine this finding.
We identified two candidate functional genes, MIR4453HG and KCNK17, that are likely relevant to the genetic causality between type 2 diabetes and cataract. Interestingly, both genes showed a significant association with single-trait type 2 diabetes due to linkage (i.e., they did not pass the HEIDI-outlier test), and then showed a more significant association with cross-trait type 2 diabetes and cataract due to causality or pleiotropy, further suggesting that cataract is likely an outcome triggered by the genetic variants associated with type 2 diabetes. MIR4453HG is a lncRNA gene located near risk genes that have been reported to be associated with blood protein levels (gene ARFIP1 48) and lipoprotein cholesterol levels (gene TRIM2 49). Both traits are highly relevant to the risk of type 2 diabetes 50,51 and cataract 52,53. KCNK17 encodes a protein in the potassium channel family 54. Mutations in KCNK17 may cause abnormal opening of potassium channels and are associated with cardiovascular diseases (e.g., ischemic stroke and cerebral hemorrhage) 54, which are known to be involved in the susceptibility to both type 2 diabetes 55 and cataract 56. These results provide novel insights into the genetic mechanisms underlying the causality between type 2 diabetes and cataract. Further wet-lab experiments are required to confirm the roles of these two genes in increasing cataract risk in type 2 diabetes patients.
Our study has several limitations. First, the heritability of cataract was small, with an estimate below 2%, which may bias the estimate of the genetic correlation between type 2 diabetes and cataract. Nevertheless, this effect should be negligible, as the heritability of cataract is significantly different from zero. Secondly, the number of instrumental SNPs using cataract 21 as exposure is less than 10. Instead, we selected 'proxy' instrumental SNPs with p-value<1×10−5, which may violate the assumptions of some MR methods (e.g., GSMR). However, the MR effects from these methods are highly consistent with those from CAUSE, suggesting that the use of 'proxy' instrumental SNPs is feasible. Thirdly, due to the limitations of our statistical models, we did not investigate the genetic contribution of the MHC region to the susceptibility to co-morbid type 2 diabetes and cataract, which possibly led us to underestimate the shared genetics between the two diseases.
In summary, we provide robust evidence for a strong genetic association between type 2 diabetes and cataract, and for a putative causal effect of type 2 diabetes on cataract, particularly in East Asians. Lower expression of two novel candidate genes, MIR4453HG and KCNK17, was identified as possibly involved in the causality between type 2 diabetes and cataract. Our results provide a theoretical foundation at the genetic level for improving early diagnosis, prevention and treatment of cataract in type 2 diabetes patients in clinical practice.

Data availability statement
Summary statistics are publicly available at http://jenger.riken.jp/en/.

Supplementary Data
Supplementary data are available at IJE online.

[Figure caption fragment: panel B shows the average local genetic correlation between type 2 diabetes and cataract in four regional types (i.e., 'type 2 diabetes-specific', 'cataract-specific', 'intersection', and 'neither') harboring risk SNPs with GWAS p-value <1×10−3.]