Recessive genetic effects on type 2 diabetes-related metabolites in a consanguineous population

Autozygosity, meaning inheritance of an ancestral allele in the homozygous state is known to lead bi-allelic mutations that manifest their effects through the autosomal recessive inheritance pattern. Autosomal recessive mutations are known to be the underlying cause of several Mendelian metabolic diseases, especially among the offspring of related individuals. In line with this, inbreeding coefficient of an individual as a measure of cryptic autozygosity among the general population is known to lead adverse metabolic outcomes including Type 2 diabetes (T2DM); a multifactorial metabolic disease for which the recessive genetic causes remain unknown. In order to unravel such effects for multiple metabolic facades of the disease, we investigated the relationship between the excess of homozygosity and the metabolic signature of T2DM. We included a set of 53 metabolic phenotypes, including 47 metabolites, T2DM and five T2DM risk factors, measured in a Dutch genetic isolate of 2,580 people. For 20 of these markers, we identified 29 regions of homozygous (ROHs) associated with the nominal significance of P-value < 1.0 × 10−3. By performing association according to the recessive genetic model within these selected regions, we identified and replicated two intronic variants: rs6759814 located in KCNH7 associated with valine and rs1573707 located in PTPRT associated with IDL-free cholesterol and IDL-phospholipids. Additionally, we identified a rare intronic SNV in TBR1 for which the homozygous individuals were enriched for obesity. Interestingly, all three genes are mainly neuronally expressed and pointed out the involvement of glutamergic synaptic transmission pathways in the regulation of metabolic pathways. Taken together our study underline the additional benefits of model supervised analysis, but also seconds the involvement of the central nervous system in T2DM and obesity pathogenesis.


1
Consanguineous marriages between close relatives as a result of assertive mating is known to cause 2 severe metabolic diseases in the off-spring (VERNON 2015). In addition to that, moderate inbreeding due to isolation 3 in populations has been shown to cause unfavorable outcomes among with cardio-metabolic and neuropsychiatric 4 parameters (VERWEIJ et al. 2014;HOWRIGAN et al. 2016). The previous report shows evidence that inbreeding 5 associates with an increase in blood pressure, glucose and decrease in high-density lipoprotein cholesterol (HDL-6 C), intelligence quotient (IQ) and height (MCQUILLAN et al. 2012). In the decade, technological advances in 7 metabolomics allow researchers to capture the biochemical status in an organism. As one of the most common 8 metabolic disorders, studying the metabolomics of type 2 diabetes (T2DM) is particularly promising as the 9 deregulation of biochemical processes is involved in the pathophysiology of T2DM. In line with this, many 10 circulating metabolites have been found associated with T2DM: including phospholipids, branch-chain amino-acids

12
In our previous report, we tested the evidence of recessive SNP effects. We revealed that six out of the 13 eight quantitative metabolite genetic loci showed a recessive rather than an additive effect on the metabolite 14 (DEMIRKAN et al. 2015). This raises an important question of whether such recessive variants are relevant for 15 metabolic diseases such as T2DM and its markers. In order to answer this question and find genetic loci that act 16 under recessive inheritance, we first defined runs of homozygosity (ROHs) that relate to T2DM and its circulating 17 profile in blood in the genetically isolated ERF population and then looked for the causal recessive variants within 18 the loci using coding variants. We focused on a set of 47 metabolites which we selected based on their correlation were asked to bring all their current medications for registration during the interview. Venous blood samples were 37 collected after at least eight hours of fasting. Baseline type 2 diabetes was defined according to the fasting plasma 38 glucose ≥ 7.0mmol/L and/or anti-diabetic treatment, yielding 212 cases and 2,564 controls, totaling up to 2,776.

39
The follow-up data collection of the ERF study took place in May 2016 (9 to 14 years after baseline visit). During 40 the follow up a total of 1,935 participants' records were scanned for the incidence of type 2 diabetes in general       individual. The best guess information used for SNPs with imputation quality (R 2 ) > 0.95. The algorithm used to define the region-wise homozygosity was developed by the ROHGEN consortium as explained before(CEBALLOS et 85 al. 2018). In brief, the genome was divided into 3 Mb windows (n=992) and for each window, a plink sliding window 86 routine was used to identify the proportion of >1.5 Mb homozygosity within the window. For each window, a 87 maximum of one heterozygous SNP and five missing measurements were allowed. All SNP ids were mapped to the 88 human genome build 19 (hg19) coordinates. The regression analyses were performed using mixed models, 89 adjusting for genetic relatedness using a genomic kinship matrix; age, sex, and first 10 principal components were 90 used as covariates in the model.  Study participants from the ERF study whose exomes were not sequenced (N = 1,527) were genotyped 109 on the Illumina Infinium HumanExome BeadChip, version 1.1, which contains over 240,000 exonic variants selected 110 from multiple sources together spanning 12,000 samples from multiple ethnicities. Calling was performed with 111 GenomeStudio. We removed subjects with a call rate < 95%, IBS > 0.99 and heterozygote ratio > 0.60. Ethnic outliers identified using a principal component analysis with 1000 Genomes data and individuals with sex 113 discrepancies were also removed. The SNVs that were monomorphic in our sample or had a call rate < 95% were 114 removed. After quality control, we retrieved about 70,000 polymorphic SNVs in 1,515 subjects.

155
(β = 0.43, P-value=5.43 × 10 -6 ) and included candidate genes PSMD6 and ADAMTS9 (Table 1) Table 2a and Table 2b with sequence and chip-based genotyping sets respectively. We used three 165 steps of variant filtering. First, we adopted a liberal variant list and included all the SNVs that are captured within 166 the sequence data, also including the intergenic and intronic variants that are captured around the exons. For each 167 region, we set up a region-wide Bonferroni threshold based on the number of SNVs tested and based on that we identified the significantly associated SNVs. By this way five SNVs from exonic sequences were detected inside 169 genes TTC7A (rs57182920, intronic, with XL-HDL-Cholesterol), FRMD4B (rs73095903, intronic, with M-VLDL-TGs), 170 CSMD3 (rs72685825, intronic, with S-HDL-ApoA1), PREX1 (rs3746820, synonymous, with IDL-phospholipids), 171 LAMA5 (rs35653162, intronic, with glucose) in addition to one SNV detected inside a non-transcribed pseudogene 172 EXOC5P1 (rs6551721, exonic, with Phosphatidylcholine diacyl C 40:6) as shown in Table 2 and Supplementary 173 Table 1. By using the exome chip derived genetic data, we detected three SNVs; one located inside KCNH7 174 (rs6759814, intronic, with valine), and two from intergenic regions (rs6469084, associated with S-HDL-ApoA1 and 175 rs6101991 associated with IDL-phospholipids) detected passing the pre-defined region-specific association 176 thresholds ( Table 3 and Supplementary Table 2). Second, we limited the analysis to synonymous and non-177 synonymous SNVs stop codons, UTR and splice variants only. By this way, additional two synonymous SNVs located 178 in OXR1 (rs1681904, with S-HDL-ApoA1) and FRMD4B (rs62254461, with M-VLDL-triglycerides) were found using 179 the exonic sequences ( Table 2). Using the same approach one missense coding SNV in the exome chip was found 180 associated within KRT15 gene (rs1050784, with T2DM) ( Table 3). Third, we filtered the dataset such that we 181 focused on only to those missense and premature stop codons. However, no SNVs were found associated using 182 such strict filtering.

192
In order to investigate the effect of these variants in the outbred population, Rotterdam Study we set out 193 a replication. Out of the 18 genetic associations coming either from exome sequencing, exome chip or imputation 194 sets, we were able to test 12 in the Rotterdam study as 6 of the phenotypes were not available for replication.

211
We report an in-depth association mapping effort for a total of 53 selected metabolic phenotypes by using 212 three sets of independently generated genotype data. First, we point out increased homozygosity in 29 genomic 213 loci influencing 20 different outcomes, suggesting 51 outcome-genomic locus pairs to be investigated by more in-214 depth analysis. By using the recessive genetic model for association testing, we detected and replicated 2 intronic 215 genetic variants inside genes KCNH7 and PTPRT. Additionally, within the ROHs we report 18 rare variants in a group 216 of individuals enriched by cases of obesity and T2DM. Among these 18 rare variants, 17 were specific to ERF 217 population, whereas one (rs116175783, located in TBR1) was also found in the Rotterdam study, and remarkably 218 six out of eight homozygous individuals are overweight.

219
Evolutionary selection is less effective in eliminating recessive deleterious alleles since it needs two copies 220 to reduce the fitness of the organism, those tend to become more frequent than expected. The most favorable     glutamergic synaptic transmission (P-value = 3.1 x 10 -7 ) associated with behavioral fear response (P-value = 2.7 x 251 10 -7 ). Remarkably, feeding behavior (P-value = 8.1 x 10 -3 ) is one of the top GO terms for this gene. Based on GWAS 252 using additive genetic model TBR1 was found associated with gene-based also associates with sodium in blood (P-253 value = 3.8 x 10 -10 ), educational attainment (P-value = 1.8 x 10 -8 ), diagnosed High blood pressure (2.7 x 10 -8 ) but 254 also with BMI (6.1 x 10 -6 ) to some degree (http://atlas.ctglab.nl/PheWAS, accession date: 16.04.2019). In our study, 255 the region harboring TBR1 was initially selected because the homozygosity was associated with a decrease in XL-

257
Overall by combining the power of recessive inheritance in a genetically isolated population with a wide 258 range of metabolic pathways, we pointed out several genetic loci that could be of interest for further research. By 259 performing fine mapping within these genetic loci we found and replicated two genetic variants in i.e. KCNH7,

260
PTPRT involved in glutamergic synaptic pathways by using recessive genetic association models. In addition, we                      Remarks: based on pathway databases and text mining. * Leading trait associated with the indicated loci, in case that there is associations with more than one trait. to the number of total/exonic SNVs; Minimum P-value total/exonic: P-value for the association of the top most significant SNV in the region for all SNVs/ for the exonic SNVs; RsID total/exonic of the of the top most significant SNV in the region for all SNVs/ for the exonic SNVs; Gene total/exonic annotated to the top SNV/exonic SNV. of SNVs that are found to be homozygous at least among 5 individuals and have functional annotation indicating synonymous, nonsynonymous, stop codon, UTR and splice variant; Bonferroni P-value total/exonic: Regional P-value threshold calculated according to the number of total/exonic SNVs; Minimum P-value total/exonic: P-value for the association of the top most significant SNV in the region for all SNVs/ for the exonic SNVs; RsID total/exonic of the of the top most significant SNV in the region for all SNVs/ for the exonic SNVs; Gene total/exonic annotated to the top SNV/exonic SNV.