Genetic Analyses of Epigenetic Predictors that Estimate Aging , 1 Metabolic Traits , and Lifespan 2

DNA methylation (DNAm) clocks are accurate molecular biomarkers of aging. However, the clock mechanisms remain unclear. Here, we used a pan-mammalian microarray to assay DNAm in liver from 339 predominantly female mice belonging to the BXD family. We computed epigenetic clocks and maximum lifespan predictor (predicted-maxLS), and examined associations with DNAm entropy, diet, weight, metabolic traits, and genetic variation. The epigenetic age acceleration (EAA) derived from the clocks, and predicted-maxLS were correlated with lifespan of the BXD strains. Quantitative trait locus (QTL) analyses uncovered significant QTLs on chromosome (Chr) 11 that encompasses the Erbb2/Her2 oncogenic region, and on Chr19 that contains a cytochrome P450 cluster. Both loci harbor candidate genes associated with EAA in humans (STXBP4, NKX2-3, CUTC). Transcriptome and proteome analyses revealed enrichment in oxidation-reduction, metabolic, and mitotic genes. Our results highlight loci that are concordant in human and mouse, and demonstrate intimate links between metabolism, body weight, and epigenetic aging.


36
Epigenetic clocks are widely used molecular biomarkers of aging. 1 These biological clocks are 37 based on the methylation status across an ensemble of "clock CpGs" that are collectively used 38 to derive a DNA methylation (DNAm) based estimate of age (DNAmAge). This estimate tracks 39 closely, but not perfectly, with an individual's chronological age. How much the DNAmAge 40 deviates from the known chronological age is a measure of the rate of biological aging. Denoted 41 as epigenetic age acceleration (EAA), a more accelerated measure (positive EAA) suggests an 42 older biological age. While DNAmAge predicts age, its age-adjusted counterpart, EAA, is 43 associated with health, fitness, exposure to stressors, body mass index (BMI), and even life 44 expectancy. 2-6 These DNAm clocks were initially reported for humans. 7,8 Since then, the 45 biomarker has been extended to model organisms, [9][10][11] and different variants of human clocks 46 have also been developed. Some clocks are tissue specific, others are pan-tissue, and others 47 perform well at predicting health and life expectancy. 5,8,12-14 48 A new microarray platform was recently developed to profile CpGs that have high conservation 49 across mammalian clades. This pan-mammalian DNAm array (HorvathMammalMethylChip40) 50 provides a common platform to measure DNAm, and has been used to build universal 51 epigenetic clocks that can estimate age across a variety of tissues and mammalian species. 15,16 52 Another remarkable development with this array is the novel lifespan predictor that can 53 estimate the maximum lifespan of over 190 mammals at high accuracy. 17 54 Here, we test the performance and applicability of these novel DNAm clocks and lifespan 55 predictor in a genetically diverse cohort of mice belonging to the BXD family. 18,19 The BXDs are a 56 well-established mouse genetic reference panel that were first created as a family of 57 recombinant inbred (RI) strains by crossing two inbred progenitors: C57BL/6J (B6) and DBA/2J 58 (D2). The family has been expanded to ~150 fully sequenced progeny strains. 20,21 The BXD 59 population incorporates significant genetic variation, and likewise, show considerable 60 phenotypic variability in their metabolic profiles, response to diet, aging rates, and natural life 61 expectancy. [18][19][20][22][23][24] The availability of genetic data, and accompanying deep multi-omics and 62 phenomics, make the BXDs a powerful experimental population to evaluate the DNAm 63 biomarkers, and to dissect the genetic modulators of epigenetic aging. Previously, we explored 64 the aging methylome in a small number of BXD cases and found that high fat diet (HFD) and 65 higher body weight were associated with higher age-dependent changes in methylation. 25 In 66 the present work, our goals were to (1) test the accuracy of the DNAm measures in predicting 67 age and lifespan of the BXDs, (2) examine associations with diet and metabolic characteristics, 68 and (3) apply quantitative trait locus (QTL) mapping and gene expression analyses to uncover 69 loci and genes associated with these DNAm biomarkers. 70 Our results are consistent with a faster clock for cases on HFD, and with higher body weight. 71 Both the DNAmAge and lifespan predictors were correlated with the genotype-dependent life 72 expectancy of female BXDs. We report QTLs on chromosomes (Chrs) 11 and 19. A strong 73 candidate gene in the chromosome (Chr) 11 interval (referred to as Eaaq11) is Stxbp4, a gene 74 that has been consistently associated with EAA by human genome-wide association studies 75 (GWAS). [26][27][28] The Chr19 QTL (Eaaq19) also harbors strong contenders including Cyp26a1, Myof,76 Cutc, and Nkx2-3, and the conserved genes in humans have been associated with longevity and 77 DNAm clocks and chronological age prediction 90 The first epigenetic clock we tested was the universal relative age clock described in Lu et al. 16 91 We refer to the age estimates as univ.DNAmAge, and this had a correlation of r = 0.92 with the 92 chronological age of mice ( Table 1; Fig 1a). 93 We also built three different mouse clock, and each was developed as a pair depending on 94 whether the training set used all tissues (pan-tissue) or only liver (liver specific). in response to aging related interventions such as caloric restriction and growth hormone 99 knockout. 9,11,25 These clocks were trained in a large mouse dataset that excluded the BXDs 100 (previously reported in 16,17 ), and are therefore unbiased to BXD characteristics. The specific 101 CpGs and coefficients for clock computation are in Data S2. All the mouse clocks performed 102 well in age estimation but the interventional clocks had higher deviation from chronological age 103 ( Table 1). The age-adjusted EAA derived from these clocks showed wide individual variation (Fig  104  1b). 105 We also used the universal maximum lifespan predictor 17 to estimate the potential maximum 106 lifespan (predicted-maxLS) of mice (Data 107 S1). Predicted-maxLS was uncorrelated 108 with chronological age ( The methylome-wide entropy provides a 121 measure of randomness and information 122 loss, and this increased with chronological 123 age (Fig 1c). 7 As direct correlates of 124 chronological age, all the DNAmAge 125 estimates had significant positive 126 correlations with entropy ( Table 1). 127 We hypothesized that higher entropy 128 levels will be associated with (a) higher 129 EAA, and (b) lower predicted-maxLS. 130 Indeed, the univ.EAA had a positive 131 correlation with entropy that was 132 significant regardless of diet (Fig 1d). 133 However, the EAA from the mouse clocks 134 showed only weak correlations with 135 entropy (Data S3). Entropy had a modest 136 negative correlation with predicted-maxLS 137 primarily in the CD group (Fig 1e). Taken (Table 2). Entropy was also significantly higher in the HFD group. The maxLS did not 147 differentiate between diets (Table 2). 148 Body weight. Body weight was first measured when mice were at an average age of 4.5 ± 2.7 149 months. We refer to this initial weight as baseline body weight (BW0). For mice on HFD, this 150 was usually before introduction to the diet, with the exception of 48 cases that were first 151 weighed 1 or 3 days after HFD (Data S1). In the CD group, only the EAA from the pan-tissue 152 standard clock and liver developmental clock showed significant positive correlations with BW0 153 ( Table 2). In the HFD group, the positive correlation with BW0 was more robust and consistent 154 across all the clocks, and this may have been due to the inclusion of the 48 cases that had been 155 on HFD for 1 or 3 days. Taking only these 48 cases, we found that higher weight even after 1 156 day of HFD had an age-accelerating effect (Data S3). This was particularly strong for the 157 interventional clocks (r = 0.45, p = 0.001 for int.EAA, pan-tissue; r = 0.58, p < 0.0001 for int.EAA, 158 liver), and for the universal clock (Fig 2a). Second weight was measured 7.4 ± 5.2 weeks after 159 BW0 (mean age 6.3 ± 2.8 months). We refer to this as BW1 and we estimated the weight 160 change as deltaBW = BW1 -BW0. DeltaBW was a positive correlate of EAA on both diets, albeit 161 more pronounce in the HFD group (Fig 2b; Data S3). The final body weight (BWF) was measured 162 is body weight at about 4.5 months of age (n = 339; 210 CD and 129 HFD); BWF is final weight at tissue collection (1 HFD case missing data; n = 338; 210 CD and 128 HFD) b Pearson correlation between strain means for n = 29 BXD genotypes kept on CD and HFD at the time of tissue harvest, and EAA from all the mouse clocks were significant correlates of 163 BWF on both diets (Table 2). 164 Somewhat unexpected, entropy had an inverse correlation with body weight. This effect was 165 primarily in the CD mice (Table 2). We found no association between predicted-maxLS and the 166 body weight traits ( Table 2). 167 Sex effect. Four BXD genotypes (B6D2F1, D2B6F1, BXD102, B6) had cases from both males and 168 females. We used these to test for sex effects. All the mouse clocks showed significant age 169 acceleration in male mice, and this effect was particularly strong for the pan-tissue int.EAA ( Fig  170  2c; Data S3). The predicted-maxLS was significantly lower in males (Fig 2d). Entropy on the 171 other hand, was significantly higher in females (Fig 2e). Correlates and modifiers of the epigenetic readouts (a) For 48 mice, initial body weight (BW0) was measured 1 or 3 days after high fat diet (HFD), and these showed significant correlation between BW0 and epigenetic age acceleration (EAA shown for the universal clock). (b) Weight was first measured at mean age of 4.5 ± 2.7 months, and then at 6.3 ± 2.8 months (BW1). Weight gains during this interval (deltaBW = BW1 -BW0) is a direct correlate of EAA derived from the interventional clock in both normal chow (control diet or CD; n = 210) and HFD mice (n = 128). (c) For the BXD genotypes with samples from both males and females, males have higher age acceleration and this sex effect is highly significant for the pan-tissue interventional clock (int.EAA). Bars represent mean ± standard error; 40 females (26 CD, 14 HFD) and 18 males (10 CD, 8 HFD). Males had (d) lower predicted maximum-lifespan, and (e) lower average methylome entropy. (f-i) The residual plots display the direction of association between metabolic traits and DNAm readouts (n = 276 cases with metabolic data). After adjusting for chronological age, diet and body weight, serum cholesterol has inverse associations with (f) predicted maximum lifespan (p = 0.002), and (g) methylome entropy (p = 9.1E-06). Serum glucose level has inverse associations with (h) epigenetic age acceleration derived from the universal clock (p = 0.005), and (i) methylome entropy (p = 0.003). examined whether these metabolic traits are associated with three of the DNAm readouts: the 177 univ.EAA, predicted-maxLS, and DNAm entropy. We applied linear regression with age, diet and 178 final body weight as covariates, and this showed significant effects of cholesterol on predicted-179 maxLS (p = 0.002), and entropy (p = 9E-06) ( Table S1). To visualize how cholesterol levels 180 associate with these, we plotted the residual values after the respective predictor and outcome 181 variables were adjusted for age, diet, and BWF. The residual plot shows an inverse association 182 between cholesterol and predicted-maxLS (Fig 2f). For entropy, similar to how it related with 183 weight, higher cholesterol predicted lower entropy. Cholesterol had no significant association 184 with univ.EAA (Table S1). 185 Glucose had an unexpected inverse association with the univ.EAA that indicates lower age 186 acceleration with higher glucose (p = 0.005) (Fig 2h; Table S1). Higher glucose was also 187 associated with lower entropy (p = 0.003) (Fig 2i). Glucose was not associated with predicted-188 maxLS (Table S1). 189 Association with strain longevity 190 We next obtained longevity data from a parallel cohort of female BXD mice that were allowed 191 to age on CD or HFD. 19 We evaluated whether the DNAm readouts were informative of strain-192 level lifespan. Since the strain lifespan was determined from female BXDs, we restricted this to 193 only the female cases. For strains with natural death data from n ≥ 5, we computed the 194 minimum (minLS), 25 th quartile (25Q-LS), mean, median lifespan, 75 th quartile (75Q-LS), and 195 maximum lifespan (maxLS) (Data S1). Specifically, we postulated (a) an accelerated clock for 196 strains with shorter lifespan (i.e., inverse correlation), (b) a direct correlation between 197 predicted-maxLS and observed lifespan, and (c) higher entropy with shorter lifespan. 198 Overall, the EAA measures showed the expected inverse correlation trend with the lifespan 199 summaries, and this was highly significant for the universal clock (Table S2; Fig 3a,b). For the 200 mouse clocks, this effect was significant for the liver int.EAA (Table S2). When separated by 201 diet, these correlations became weaker, but the negative trend remained consistent. 202 The DNAm entropy had an inverse correlation trend with strain lifespan (Table S2). This was 203 nominally significant only for the strain maxLS when CD and HFD groups were combined (r = -204 0.13, p = 0.02) but became non-significant when separated by diet. 205 The predicted-maxLS showed a positive correlation trend with the lifespan summaries, and this 206 was significant for the observed strain maxLS (Fig 3d). When separated by diet, the predicted-207 maxLS remained a significant correlate of strain maxLS only in the CD group. 208

209
The EAA traits had modest to high heritability, and averaged at 0.50 for mouse clocks ( Table 2). 210 The predicted-maxLS had heritability of 0.66 on CD, and 0.70 on HFD. Another way to gauge 211 level of genetic correlation is to compare between members of strains maintained on different 212 diets. All the EAA traits, and the predicted-maxLS had high strain-level correlations between 213 diets, indicating an effect of background genotype that is robust to dietary differences (Table  214 2). 215 To uncover genetic loci, we applied QTL mapping using mixed linear modeling that corrects for 216 the BXD kinship structure. 31 First, we performed the QTL mapping for the EAA derived from the 217 different mouse and universal clocks, with adjustment for diet and body weight. EAA from the 218 two interventional clocks had the strongest QTLs (Data S4). The pan-tissue int.EAA had a 219 significant QTL on Chr11 (90-99 Mb) with the highest linkage at ~93 Mb (p = 3.5E-06; 220 equivalent to a LOD score of 4.7) (Fig 4a). Taking a genotype marker at the peak interval (BXD 221 variant ID DA0014408.4 at Chr11, 92.750 Mb) 20 , we segregated the BXDs homozygous for either 222 the D2 (DD) or the B6 (BB) alleles. Strains with DD genotype at this locus had significantly higher 223 int.EAA (Fig 4a inset). 224 The liver int.EAA had the peak QTL on Chr19 (35-45 Mb) with the most significant linkage at 225 markers between 38-42 Mb (p = 9E-07; LOD score of 5.2) (Fig 4b). We selected a marker at the 226 peak interval (rs48062674 at Chr19, 38.650 Mb), and the BB genotype had significantly higher 227 int.EAA compared to DD (Fig 4b inset). The QTL map for the univ.EAA did not reach genome-228 wide significance (Fig 4c). However, there were nominally significant peaks at the Chr19 (p = 229 0.0004), and Chr11 (p = 0.004) intervals. 230   The liver-specific int.EAA has a peak QTL on Chr19 (~38 Mb). Trait means by genotype at this locus are shown in inset. (c) The linkage statistics are weaker for the EAA derived from the universal clock (univ.EAA). However, there are consistent nominally significant peaks on the Chr11 and Chr19 loci. (d) The DNA methylation based predicted maximum lifespan also maps to Chr19, but the peak markers are at ~47.5 Mb.
chronological age, and body weight). No locus reached genome-wide significance (DataS4). 232 There were modest QTLs on Chrs11 and 19. However, the Chr11 region is slightly distal to the 233 markers linked to the EAA traits (minimum p = 0.009 at Chr11, ~103.7 Mb). The Chr19 locus 234 somewhat overlapped the QTL for EAA, but the peak marker (minimum p = 0.0009) is slightly 235 distal at ~48 Mb (Data S4). 236 The predicted-maxLS had a significant QTL on Chr19 (Fig 4d; Data S4) with the peak markers 237 between 44-48 Mb (p = 2E-07; LOD score of 5.9). This overlaps the EAA QTL, but the peak 238 markers are distal (rs30567369 at 47.510 Mb). At this locus, mice with the BB genotype had 239 significantly higher predicted-maxLS (Fig 4d inset). 240 To identify regulatory loci that are consistent across the different EAA measures, we applied a 241 multi-trait analysis and derived the linkage meta-p-value for the mouse and universal EAA 242 traits. 32 The peaks on Chrs 11 and 19 attained the highest consensus p-values (Fig S1a;  We performed marker-specific linkage analyses for each of the mouse and universal clocks 251 using a regression model that adjusted for diet. With the exception of the liver int.EAA, all the 252 EAA traits had nominal to highly significant associations with the representative Eaaq11 marker 253 (DA0014408.4), and the DD genotype had higher age acceleration (Table 3). Mean plots by 254 genotype and diet shows that this effect was primarily in the CD mice (Fig S1b). The effect of 255 this locus appeared to be higher for the pan-tissue clocks compared to the corresponding liver-256 specific clocks. This marker in Eaaq11 was not associated with either entropy or predicted-257 maxLS. 258 For proximal Eaaq19, the representative marker (rs48062674) was associated with all the EAA 259 traits and the BB mice had higher age acceleration on both diets (Fig S1c). This marker was not 260 associated with entropy, and had only a weak effect on predicted-maxLS (Table 3). When we 261 performed the same analysis with the marker on distal Eaaq19 (rs30567369), the association 262 with EAA became weaker, and the association with predicted-maxLS became much stronger 263 (Table 3). This suggests that the proximal part of Eaaq19 is related to EAA while the distal part 264 is related to predicted-maxLS. 265 We also tested if these peak markers were associated with the recorded lifespan phenotype 266 and we found no significant association with the observed lifespan of the BXDs. Since gain in body weight with age was an accelerator of the clocks, we examined whether the 272 selected markers in Eaaq11 and Eaaq19 were also related to body weight change. We retrieved 273 Eaaq11 in the CD group, the DD genotype 280 in the CD group also had slightly higher 281 mean weight at older adulthood (12 and 282 18 months ; Fig 5a). However, this marker 283 had no significant association with body 284 weight when tested using a mixed effects 285 model (p = 0.07; Table 3). In proximal 286 Eaaq19, it was the BB genotype that 287 exhibited consistently accelerated clock 288 on both diets, and the BB genotype also 289 had higher average body weight by 6 290 months of age (Fig 5b), and this locus had 291 a significant influence on the body weight 292 trajectory (p = 7.6E-07; Table 3). The 293 nearby marker on distal Eaaq19 also 294 showed a similar pattern of association 295 with body weight (Table 3). 296 Candidate genes for epigenetic age acceleration 297 There are several positional candidate genes in Eaaq11 and Eaaq19. To narrow the list, we 298 applied two selection criteria: genes that (1) contain missense and/or stop variants, and/or (2) 299 contain non-coding variants and regulated by a cis-acting expression QTL (eQTL). For the eQTL 300 analysis, we utilized an existing liver transcriptome data from the same aging cohort. 18 We 301 identified 24 positional candidates in Eaaq11 that includes Stxbp4, Erbb2 (Her-2 oncogenic 302 gene), and Grb7 (growth factor receptor binding) (Data S5). Eaaq19 has 81 such candidates that 303 includes a cluster of cytochrome P450 genes, and Chuk (inhibitor of NF-kB) in the proximal 304 region, and Pcgf6 (epigenetic regulator) and Elovl3 (lipid metabolic gene) in the distal region 305 (Data S5). 306 For further prioritization, we converted the mouse QTL regions to the corresponding syntenic 307 regions in the human genome, and retrieved GWAS annotations for these intervals. 33 We 308 specifically searched for the traits: epigenetic aging, longevity, age of 309 menarche/menopause/puberty, Alzheimer's disease, and age-related cognitive decline and 310 dementia. This highlighted 5 genes in Eaaq11, and 3 genes in Eaaq19 (Table S4). We also 311 identified a GWAS study that found associations between variants near Myof-Cyp26a1 and 312 human longevity, 30 and a meta-GWAS that found gene-level associations between Nkx2-3 and 313 Cutc, and epigenetic aging (Table S4). 28 314 Liver RNA-seq data was available for 153 of the BXD cases that had DNAm data (94 CD, and 59 316 HFD). 18 We used this set to perform transcriptome-wide correlation analysis for the univ.EAA. 317

Gene expression correlates of EAA and predicted max-LS
To gain insights into gene functions, we selected the top 2000 transcriptome correlates (|r| ≥ 318 0.37, p ≤ 2.8E-06; Data S6) for functional enrichment analysis. These top correlates represented 319 transcript variants from 1052 unique genes and included a few positional candidates (e.g., Ikzf3, 320 Kif11, Cep55, Cyp2c29, Cyp2c37). Only 62 transcripts from 36 unique genes were negatively 321 correlated with univ.EAA, and this set was significantly enriched (Bonferroni correct p < 0.05) in 322 oxidation-reduction, and metabolic pathways (Data S7; Fig 6a). These functional categories 323 included the cytochrome genes, Cyp2c29 and Cyp2c37, located in Eaaq19. This set was also 324 highly liver specific. The positive correlates were enriched in a variety of gene functions, and 325 was not a liver-specific gene set (Data S7). Taking the top 10 GO categories, we can broadly 326 discern two functional domains: immune and inflammatory response, and mitosis and cell cycle 327 (Fig 6a). To verify that these associations are robust to the effect of diet, we repeated the 328 correlation and enrichment analysis in the CD group only (n = 94). Again, taking the top 2000 329 correlates (|r| ≤ 0.30; p ≤ 0.003), we found the same enrichment profiles for the positive and 330 negative correlates (Data S7). 331 Next, we performed the correlational analysis using liver proteomic data that was available for 332 164 of the BXDs with DNAm data. The proteome data quantifies over 32000 protein variants 333 from 3940 unique genes. 18 We took the top 2000 protein correlates of univ.EAA (|r| ≥ 0.27, p ≤ 334 6.0E-04) (Data S8). This represented protein levels from 563 unique genes. 1139 protein 335 variants (215 genes) had negative correlations, and similar to the mRNA correlates, there was 336 enrichment in oxidation-reduction and metabolic processes. This set was also enriched in liver 337 The top 10 negative proteome correlates of univ.EAA is also enriched in oxidation-reduction and related metabolic pathways, and these are populated by several cytochrome P450 genes. genes, and included pathways related to lipid and steroid metabolism, epoxygenase p450 338 pathway, and xenobiotics (Fig 6b; Data S9). These categories were populated by the 339 cytochrome genes including candidates in Eaaq19 (e.g., Cyp2c29, Cyp2c37). The positive 340 proteome correlates showed a different functional profile than the transcriptomic set. These 341 were enriched in genes related to transport (includes apolipoprotein such as APOE), cell 342 adhesion, protein translation, protein folding, and metabolic pathways related to glycolysis and 343 gluconeogenesis (Data S9). 344 We performed a similar transcriptome and proteome analysis for the predicted-maxLS. For 345 mRNA, both the negative and positive correlates were enriched in metabolic pathways 346 including glucose and lipid metabolism (Data S10, S11). Similarly, the positive and negative 347 protein correlates of predicted-maxLS converged on oxidation-reduction processes (included 348 cytochrome genes located in proximal Eaaq19) and metabolic pathways (Data S12, S13). 349

350
The goals of this study were to examine the aging methylome at conserved CpGs, test the 351 performance of the DNAm based biomarkers, and identify potential modifiers and genetic 352 drivers. HFD had a strong age-accelerating effect that concurs with the association between 353 EAA and obesity in humans. 25,34,35 Age-acceleration due to diet manifested within the first 1 to 3 354 days of transitioning from normal lab chow to HFD. Even among the CD mice, higher weight 355 gain at a younger age was associated with an accelerated clock. 356 Somewhat surprising was how entropy related to the metabolic traits. Epigenetic entropy 357 increases with age, and is likely an indicator of the increase in stochastic noise over time. 7,36 In 358 biological systems, entropy is kept at bay by the uptake of energy, and investment in 359 maintenance and repair. 37 As HFD increased entropy (possibly due to higher cellular 360 heterogeneity and adiposity of liver tissue), we expected entropy to be higher with higher body 361 weight. But instead, entropy had an inverse correlation with weight, an effect that was 362 primarily in the CD mice. Higher levels of serum glucose and total cholesterol were also 363 associated with lower entropy. The reason for this is unclear, and we can only speculate that 364 the enhanced energy consumption in mice that had higher metabolic substrates may have kept 365 the methylome in a more ordered state. Despite this, mice with higher entropy also tended to 366 have higher EAA. Entropy had a modest negative correlations with both the the DNAm based 367 predicted-maxLS, and the known strain-level maxLS. The predicted-maxLS on the other hand, 368 showed no direct association with diet or body weight, but higher total cholesterol and higher 369 EAA were associated with shorter predicted-maxLS. 370 For the BXDs, life expectancy is highly dependent on the background genotype, and mean 371 lifespan varies from under 16 months for strains such as BXD8 and BXD13, to over 28 months in 372 strains such as BXD91 and BXD175. 19,22,24 While the EAA from the universal was inversely 373 correlated with strain lifespan data, the predicted-maxLS had a direct correlation with observed 374 strain maxLS. We note that the predicted-maxLS overestimated the strain max-LS by 0.7 to 3 375 years (median error of +1.6 years). Nonetheless, the correlation between individual-level 376 predicted-, and strain-level observed maxLS is remarkable considering that both the universal 377 clock and max-LS predictor are pan-mammalian, and species-and tissue-agnostic. 16,17 Our 378 results suggest that these universal epigenetic predictors of biological aging and lifespan are 379 informative of the subtle and normative lifespan variation in a family of inbred mice. The 380 analysis between the epigenetic readouts and lifespan was also an indirect comparison. Unlike 381 the comparison with body weight and metabolic traits, which were traits measured from the 382 same individual, the lifespan data are strain characteristics computed from a parallel cohort of 383 mice that were allowed to survive till natural mortality. Nonetheless, this indirect comparison 384 demonstrates that these multi-species epigenetic predictors capture genotype-dependent 385 effects on aging and lifespan within a species. 386 We tested different versions of the mouse DNAmAge clocks, and these appeared to capture 387 slightly different aspects of epigenetic aging. For instance, the interventional clocks were 388 sensitive to diet and early weight change, but not related to BW0 in the CD mice. Instead, BW0 389 had a significant accelerating effect on the liver developmental clock (dev.EAA). 390 Our goal was to take these different clocks and identify regulatory loci that were the most 391 stable and robust to the slight algorithmic differences in building the clocks. A notable 392 candidate in Eaaq11 is Syntaxin binding protein 4 (Stxbp4, aka, Synip), located at 90.5 Mb. 393 Stxbp4 is a high-priority candidate due to the concordant evidence from human genetic studies. 394 The conserved gene in humans is a replicated GWAS hit for the intrinsic rate of epigenetic 395 aging. 26-28 In the BXDs, Stxbp4 contains several non-coding variants, and a missense mutation 396 (rs3668623), and the expression of Stxbp4 in liver is modulated by a cis-eQTL. Stxbp4 plays a 397 key role in insulin signaling, 38 and has oncogenic activity and implicated in different cancers. 39,40 398 Furthermore, GWAS have also associated STXBP4 with age of menarche. 41,42 Eaaq11 399 corresponds to the 17q12-21 region in humans, and the location of additional oncogenic genes, 400 e.g., ERBB2/HER2, GRB7, and BRCA1. 43 The mouse Brca1 gene is a little distal to the peak QTL 401 region and is not considered a candidate here, although it does segregate for two missense 402 variants in the BXDs. Erbb2 and Grb7 are in the QTL region, and Erbb2 contains a missense 403 variant (rs29390172), and Grb7 is modulated by a cis-eQTL. Nr1d1 is another candidate in 404 Eaaq11, and the co-activation of Erbb1, Grb7, and Nr1d1 has been linked to breast and other 405 cancers. 44,45 406 Eaaq19 was consistently associated with EAA from all the clocks we evaluated, and also with 407 body weight gains, irrespective of diet. The predicted-maxLS also maps to this region, and 408 DNAm entropy may also have a weak association with markers at this interval. The EAA traits 409 have peak markers in the proximal part of Eaaq19 (around the cytochrome cluster), and the 410 predicted-maxLS peaks in the distal portion (over candidates like Elovl3, Pcgf3). Two candidates 411 in Eaaq19 have been implicated in epigenetic aging in humans based on gene-level meta-412 GWAS: NK homeobox 3 (Nkx2-3, a developmental gene), and CutC copper transporter (Cutc). 28 413 Eaaq19 is also the location of the Cyp26a1-Myof genes, and the human syntenic region is 414 associated with longevity, metabolic traits, and lipid profiles. 30,46,47 Another noteworthy 415 candidate in Eaaq19 is Chuk, a regulator of mTORC2, that has been associated with age at 416 menopause. 41,48 Clearly, Eaaq19 presents a complex and intriguing QTL related to the different 417 DNAm readouts, and potentially metabolic traits. Both Eaaq19 and Eaaq11 exemplify the major 418 challenge that follows when a genetic mapping approach leads to gene-and variant-dense 419 regions. 49,50 Both loci have several biologically relevant genes, and identifying the causal gene 420 (or genes) will require a more fine-scaled functional genomic dissection. 421 The gene expression analyses highlighted metabolic pathways related to lipids, glucose, and 422 proteins for both the univ.EAA and predicted-maxLS. Other enriched pathways were mitosis 423 and cell division, and immune processes, but this was specific to the positive transcriptomic 424 correlates. The more compelling evidence is for the cytochrome P450 genes, which are both 425 positional candidates, as well as expression correlates at the transcriptomic and proteomic 426 levels. These genes have high expression in liver, and have major downstream impact on 427 metabolism. 51-53 One caveat is that these CYP genes are part of a gene cluster in Eaaq19 that 428 includes transcripts with cis-eQTLs (e.g. , Cyp2c66, Cyp2c39, Cyp2c68), and the tight clustering of 429 the genes, and proximity of trait QTL and eQTLs may result in tight co-expression due to linkage 430 disequilibrium. 54 Nonetheless, the cytochrome genes in Eaaq19 are strong candidate 431 modulators of EAA that calls for further investigation. 432 Aside from Eaaq11 and Eaaq19, loci with evidence of consensus QTLs were also detected on 433 Chrs 1 and 3. We do not delve into these in the present work, but the Chr3 interval is near 434 genes associated with human epigenetic aging (Ift80, Trim59, Kpna4). 26 in the mouse genome, and data from these were normalized using the SeSame method. 55 469 Unsupervised hierarchical clustering was performed to identify outliers and failed arrays, and 470 those were excluded. We also performed strain verification as an additional quality check. 471 While majority of the probes were free of DNA sequence variants, we found 45 probes that 472 overlapped variants in the BXD family. We leveraged these as proxies for genotypes, and 473 performed a principal component analysis. tissue clocks, all the samples were used, and for the liver specific clocks, the training was 497 limited to liver samples. The alpha was set at 0.5, and the optimal lamda parameter was 498 determined by 10-fold cross-validation (function cv.glmnet). This selected subsets of 'clock 499 CpGs' and coefficients (see Table S2 for clock CpG IDs, intercept, and coefficients). DNAmAge 500 was then calculated as: 501 where b0 is the intercept, and b1 to bi are the coefficients, and CpG1 to CpGi denote the beta-503 values for the respective clock CpGs, and f -1 () denotes the inverse function of f(). 504

505
Statistical analyses between the epigenetic predictors and continuous variables (body weight, 506 strain lifespan) were based on Pearson correlations, and t-test was used to evaluate the effect 507 of categorical predictors (sex, diet). 508 Two metabolic traits were downloaded from the bioinformatics platform GeneNetwork 2 (GN2) 509 59 : (1) fasted serum glucose, and (2) fasted serum total cholesterol (more information on how 510 to retrieve these data directly from GN2 are provided in Data S14). Association with metabolic 511 traits was examined using multivariable linear regression (the R equations are provided in Table  512 S1). For visualization, residuals for both the predictor and outcome variables were extracted 513 after regressing on age, diet, and BWF using the R code: residuals(lm( ~ age + diet + BWF)). 514 Longevity data (defined as age at natural death) was also downloaded from GN2 (Data S13). 19 515 Males were excluded and strain-by-diet lifespan summary statistics were derived. Only strain-516 by-diet groups with 5 or more observations were included in the correlational analyses with the 517 epigenetic predictors. 518

519
The broad sense heritability within diet was estimated as the fraction of variability that was 520 explained by background genotype. 20,60,61 For this, we applied a simple anova: aov(EAA ~ 521 strain), and heritability was computed as: H 2 = SSqstrain/(SSqstrain + SSqresidual), where SSqstrain is 522 the strain sum of squares, and SSqresidual is the residual sum of squares. 523 All QTL mapping was done on the GN2 platform, and these traits can be accessed from this 524 website 59 (trait accession IDs provided in Data S14). In the GN2 home page, the present set of 525 BXD mice belongs to the Group: BXD NIA Longevity Study, and GN2 provides a direct interface 526 to the genotype data. All QTL mapping was done for genotypes with minor allele frequency ≥ 527 0.05 using the genome-wide efficient mixed model association (GEMMA) algorithm, 31 which 528 corrects for the BXD kinship matrix. For the EAA traits, diet, weight at 6 months, and final 529 weight were fitted as cofactor. Chronological age had not correlation with EAA and this was not 530 included as a cofactor (including age does not change the results). Genome-wide linkage 531 statistics were downloaded for the full set of markers that were available from GN2 (3720 532 markers as of early 2021). For the combined p-values, QTL mapping was done separately using 533 GEMMA for each EAA derived from all the unbiased mouse and universal clocks. Fisher's p-534 value combination was then applied to get the meta-p-value. 32 We used this method to simply 535 highlight loci that had consistent linkage across the different EAA measures. QTL mapping for 536 entropy, major covariates-age, diet, BW1, and BWF-were included as co-factors. QTL 537 mapping for predicted-maxLS was done without co-factors (including variables such as age,  538 weight, and diet did not change the results). 539 For marker specific linkage, we selected SNPs located at the peak QTL regions (DA0014408,  540 rs48062674, rs30567369), and grouped the BXDs by their genotypes (F1 hybrids and other  541 heterozygotes were excluded from this), and marker specific linkage was tested using ANOVA. 542 rs48062674 and rs30567369 are reference variants that is already catalogued in dbSNP, 62 and is 543 used as a marker in the QTL mapping. DA0014408.4 is an updated variant at a recombinant 544 region in the Chr11 interval and within the peak QTL interval. 20 Genotypes at these markers for 545 individual BXD samples are in Data S1. 546 For marker specific QTL analysis for EAA, we performed linear regression using the data in Data 547 S1. Heterozygotes at the respective markers were excluded, and we applied the following 548 regression model for each of the mouse and universal EAA traits separately: lm(EAA ~ genotype 549 + diet). To test the effect on body weight change, body weight data measured at approximately 550 4 (baseline), 6, 12, 18, and 24 months were downloaded from GN2 (Data S14). Detailed 551 description of these weight data are in Roy et al. 19 We then applied a mixed effects regression 552 model using the lme4 R package 63 : lmer(weight ~ age + diet + genotype + (1|ID)), where ID is 553 the identifier for individual mouse. 554 The full microarray data will be released via NCBI's Gene Expression Omnibus upon official 583

Bioinformatic tools for candidate gene selection
publication. Genome annotations of the CpGs can be found on Github 584 https://github.com/shorvath/MammalianMethylationConsortium. Individual level BXD data are 585 available on www.genenetwork.org on FAIR+ compliant format; data identifiers, and way to 586 retrieve data are described in Data S14. 587 Acknowledgement. We thank the entire UTHSC BXD Aging Colony team, particularly 588 Casey J Chapman, Melinda S McCarty, Jesse Ingles, and everyone else who contributed to the 589 tissue harvest. We thank Evan G Williams for making the gene expression data readily available, 590 and to David Ashbrook for providing the BXD genotypes. We thank the GeneNetwork team, 591 especially Zach Sloan and Arthur Centeno, who have been extremely prompt and effective at 592 assisting with the GeneNetwork interface. 593 Author contributions. KM contributed to the data, conceived portion of the study, and 594 performed statistical analysis and drafted the article. ATL, CZL, AH contributed to the data 595 analysis and in computing the epigenetic clocks and predictor. JVS contributed to the lab work. 596 RWW conceived of the BXD Aging Colony, and provided access to the BXD biospecimen and 597 data. SH developed the array platform, and built the epigenetic clocks and predictor. All authors 598 contributed to, and approved the manuscript. 599 Research funding. This study was funded by the NIH NIA grants R21AG055841 and 600 R01AG043930 601