A morphometric double dissociation: cortical thickness is more related to aging; surface area is more related to cognition

The thickness and surface area of cortex are genetically distinct aspects of brain structure, and may be affected differently by age. However, their potential to differentially predict age and cognitive abilities has been largely overlooked, likely because they are typically aggregated into the commonly used measure of volume. In a large sample of healthy adults (N=647, aged 18-88), we investigated the brain-age and brain-cognition relationships of thickness, surface area, and volume, plus five additional morphological shape metrics. Cortical thickness was the metric most strongly associated with age cross-sectionally, and it also exhibited the steepest longitudinal change over time (subsample N=261, aged 25-84). In contrast, surface area was the best single predictor of age-residualized cognitive abilities (fluid intelligence), and changes in surface area were most strongly associated with cognitive change over time. These findings were replicated in an independent dataset (N=1345, aged 18-93). Our results suggest that cortical thickness and surface area make complementary contributions to the age-brain-cognition triangle, and highlight the importance of considering these volumetric components separately.


Introduction
As the human brain ages, it undergoes a pronounced structural transformation. Even in the absence of neuropathology, overall brain volume shrinks - from age six onwards into old age [...] causes (e.g., in ageing) and consequences (e.g., for cognition) have rarely been discussed, especially in adult samples. Moreover, additional detailed morphometric shape measures (such as curvature or sulcal depth) may provide further insight into brain development across the adult lifespan and its relationship with cognitive performance.

In this paper, we explore multiple morphometric measures in two large adult-lifespan cohorts. We show, firstly, that the most pronounced structural changes in the aging brain are the [...]. Secondly, we find that incorporating multiple shape measures into a single model outperforms any individual metric's ability to capture age-related and fluid cognitive differences. This paper's main contribution, however, lies in providing cross-sectional and longitudinal evidence of a double dissociation in two independent, large-sample cohorts. Specifically, cortical thickness was more strongly associated with age than cortical surface area, while surface area was more strongly associated with cognition (as indexed by fluid intelligence). This pattern was most apparent longitudinally, but we also observed it cross-sectionally after adjusting for age. This double dissociation points to possibly distinct underlying biological processes (discussed below), and supports recent calls to investigate thickness and surface area separately (Winkler et al., 2018), as brain volume (a product of cortical thickness and surface area) likely conflates and therefore masks these differentiable effects.

Cross-sectional results

We first calculated whole brain as well as regional correlations between each metric and age, cognitive abilities (as indexed by fluid intelligence) and age-residualized cognitive abilities. Residualized cognitive scores allow one to separate out concurrent age-related decline in cognitive ability, thus providing an age-independent measure of cognition. Thickinthehead, which is a measure of cortical thickness from the Mindboggle software, showed the strongest whole-brain-age correlations (r = -.83). This was followed by curvature (r = +.77), fractal dimensionality (a measure of cortical complexity; r = -.65) and FreeSurfer's standard cortical thickness (r = -.60), as shown in Table 1 and plotted in Figure 1. Compared to the other metrics, surface area exhibited the weakest age relationship (r = -.36). This order was reversed for age-residualized cognition. Here, surface area was the strongest predictor (r = +0.21), while the two thickness metrics and curvature did not show significant brain-cognition correlations after adjusting for age. The two volume measures (FreeSurfer's cortical volume, plus SPM's cortical + subcortical volume) predicted both age and age-residualized fluid intelligence reasonably well (r ~ -.55 and 0.20, respectively), as would be expected since they are proportional to the product of cortical thickness and surface area. Fractal dimensionality was also a good predictor of both age and age-residualized cognition (r_age = -0.65 [...]

Next, we estimated a series of path models to assess the relationship between brain structure and age, fluid intelligence and age-residualized fluid intelligence when both surface area and cortical thickness are included in the same model. Path analysis is an extension of multiple linear regression, allowing researchers to assess the relationships between the predictor variables rather than having several independent variables predict one dependent variable (Streiner, 2005). Age and fluid intelligence were best captured by surface area and cortical thickness, while age-residualized fluid intelligence was associated only with surface area (see Figure 3). We validated this frequentist modelling approach with Bayesian model selection (supplementary Figures 4-5). Overall, the whole-brain, cross-sectional analyses suggest that cortical thickness and surface area are differentially associated with age and age-residualized cognitive abilities, respectively.
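To make the notion of an age-residualized score concrete, the following is a minimal sketch in Python (not the authors' R code). It assumes a simple linear adjustment for age, which the text does not specify, and uses made-up toy values; the function names `residualize` and `pearson_r` are illustrative only.

```python
import math

def residualize(y, x):
    """Return residuals of y after an ordinary least-squares fit on x (with intercept)."""
    n = len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

def pearson_r(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    var_a = sum((ai - mean_a) ** 2 for ai in a)
    var_b = sum((bi - mean_b) ** 2 for bi in b)
    return cov / math.sqrt(var_a * var_b)

# Hypothetical toy data (NOT taken from Cam-CAN or LCBC).
age = [20, 30, 40, 50, 60, 70, 80]
fluid = [38, 36, 35, 30, 29, 24, 22]            # fluid intelligence scores
surface_area = [1.90, 1.82, 1.85, 1.74, 1.76, 1.65, 1.66]  # arbitrary units

# Age-residualized cognition: the part of the score not linearly explained by age.
fluid_resid = residualize(fluid, age)

print("r(area, raw fluid)          =", round(pearson_r(surface_area, fluid), 3))
print("r(area, residualized fluid) =", round(pearson_r(surface_area, fluid_resid), 3))
```

By construction the residualized scores are uncorrelated with age, so any remaining brain-cognition correlation cannot be attributed to a shared linear age trend.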

Our regional investigations further support the morphological dichotomy found in the whole brain analyses. As shown in Figure 4, for cortical thickness, all 32 brain regions (the 64 DKT regions averaged across the hemispheres) were significantly correlated with age (all correlations were FDR corrected at alpha = 0.05), while no region predicted age-residualized fluid intelligence (r < 0.07, pFDR > 0.05; see supplementary tables 5-7). In contrast, for surface area, all regions were significantly associated with age-residualized cognitive abilities (r > 0.11, pFDR < 0.05). While regional surface area also correlated with age, the correlations were substantially weaker than the brain-age correlations for cortical thickness.

Finally, in addition to the "area and thickness only" path models, we ran three "full models", each of which included all eight brain structure metrics, to assess the metrics' combined associations with age and cognition. The total variance explained by these models was 76, 46 and 7 percent for age, fluid intelligence and age-residualized fluid intelligence, respectively, almost double the variance explained by thickness and area alone (see supplementary Figure 3). Moreover, the fact that multiple morphometric measures provided partially complementary information about the outcome highlights the potential usefulness of assessing various morphological shape measures when investigating the ageing brain and cognitive abilities. This was further supported by regional brain-age and brain-cognition correlations (supplementary Figure 8): for instance, while volume-age effects were most pronounced in the frontal regions, depth-age effects were strongest in the temporal lobes. It is plausible that the focus on frontal brain regions in the brain and cognitive aging literature (Greenwood, 2000; Jung & Haier, 2007) is informed in part by the field's traditional focus on brain volume, and that other aspects of brain structure could point to underappreciated regional effects.

Figure 4: Significant regional age- and age-residualized fluid intelligence correlations. Correlations are FDR corrected at alpha = 0.05. For cortical thickness, all 32 brain regions are significantly associated with age, while none are associated with age-residualized cognitive abilities. For surface area, all regions are correlated with age-residualized cognition. While regional surface area also correlated with age, the correlations were substantially weaker than the brain-age correlations for cortical thickness.

Longitudinal results

Although cross-sectional analyses offer an interesting insight into age-related cognitive and morphometric differences, longitudinal data are needed to truly assess how brain and cognition change (Oschwald et al., 2020). Doing so, we found that the change-change relationship between surface area and cognition was significantly stronger than the change-change relationship between volume and cognition as well as that between thickness and cognition.

After establishing metric and scalar invariance (described in supplementary section 7), we used Latent Change Score Models (LCSM) to examine morphometric and cognitive change over time. The cognitive LCSM revealed significant change in cognition over time, as well as significant variability in the rate of change (Table 2, variances). The effect size of change in fluid intelligence was -0.04 (Cohen's d, computed by dividing the mean change by the SD at time 1). The three brain-structure LCSMs also showed evidence of change over time ([...]

Next, to investigate the relationship between cognitive change and morphometric change, we fit three second order latent change score models (2LCSM), one for each brain structure metric. We used full information maximum likelihood (FIML, Enders & Mansolf, 2018) with robust standard errors to account for missing data. Results are shown in Table 3.

All three models fit the data well: CFI area = 0.972; CFI volume = 0.975; CFI thickness = 0.978 (further model fit indices can be found in section 7 of the supplementary materials). After fitting the models, we extracted and correlated the cognitive rates of change with the brain structural rates of change. Change in surface area showed the largest effect (r = 0.23, p < .001), followed (non-significantly) by volume (r = -0.11, p = 0.068) and cortical thickness (r = -0.022, p = 0.71). Steiger's Z-tests (Steiger, 1980), available in the R package psych, directly compare differences in correlation strengths while accounting for the full correlation pattern among variables. These tests revealed that change in area was significantly more strongly associated with change in cognition than was thickness or volume change (see Table 4).

These results suggest that people whose surface area decreased more quickly also showed steeper rates of cognitive decline, an effect not found for thickness or volume.

Note that the models shown above include observed (not latent) variables to ensure maximum comparability between the LCBC and Cam-CAN models (in LCBC, it was not possible to derive latent cognitive scores because only WASI sum scores were available). However, latent variable Cam-CAN models (which we had run initially, before the replication study) show the same pattern, with changes in surface area most strongly associated with changes in cognition.

To examine whether our cross-sectional and longitudinal findings generalize to other cohorts, we next (after finalizing the analyses in Cam-CAN) examined the same associations in an independent sample, the LCBC data. Because of their widespread use and accessibility, we included the three FreeSurfer-derived metrics (thickness, area, volume) in our replication analyses.

Figure 6: The relationship between age, brain structure and cognition in LCBC.

Cross-sectionally, as shown in Figure 2 (E-H), thickness showed the strongest whole brain-age correlation (R = -0.78, p < 0.001), followed by volume (R = -0.64, p < 0.001), then surface area (R = -0.34, p < 0.001). For age-residualized fluid intelligence, thickness had the weakest correlation (R = 0.077, p = 0.009), followed by surface area (R = 0.13, p = 0.001) and volume (R = 0.15, p < 0.001; see also supplemental Table 3). As was the case in Cam-CAN, the frequentist path models and Bayesian model selection revealed that the best models to predict age and fluid intelligence comprised both surface area and thickness, while age-residualized fluid intelligence was best captured by surface area alone (Figure 7).

Longitudinally, we found evidence of significant change over time for the three brain metrics (Table 5, intercepts), and significant variability over time for the brain metrics and cognition ([...]

As shown in Table 3, the three 2LCSMs fit the data well: CFI area = 0.987; CFI volume = 0.921; CFI thickness = 0.994 (further model fit indices can be found in the supplementary materials). Change in all structural brain metrics was significantly associated with change in cognition, with surface area showing the largest effect (r = 0.35, p < .001), followed by thickness (r = 0.22, p < .001), then volume (r = 0.15, p = 0.001). Steiger's Z-test revealed that the change-change relationship between area and cognition was significantly stronger than that between volume and cognition and that between thickness and cognition (see Table 4).

The LCBC longitudinal results replicated those found in Cam-CAN, further supporting the finding that changes in surface area predict changes in cognition and that this relationship is stronger than that between change in thickness and change in cognition. We therefore successfully replicated Cam-CAN's cross-sectional and longitudinal findings.

Across two independent cohorts, we found evidence of a morphometric double dissociation: cortical thickness was more strongly associated with age than cortical surface area, both cross-sectionally and longitudinally, whereas surface area was more strongly associated with cognition (fluid intelligence), certainly longitudinally, and also cross-sectionally after removing age-related variance. Note that we are not claiming that cortical thickness plays no role in cognition: it shows a longitudinal association with cognitive change in one of the two datasets (albeit significantly smaller than that of surface area), and its cross-sectional association with fluid intelligence was significant. The lack of cross-sectional association with age-residualized fluid intelligence could be due to collider bias, whereby cortical thickness is causally related to both age and cognition, such that any thickness-cognition effect disappears when age is removed. Our results do suggest, however, that surface area and thickness, which tend to be investigated together through the aggregate measure of volume, may have dissociable causes (e.g., in ageing) and consequences (e.g., for cognition).

Furthermore, our finding that cortical thickness is less strongly associated with cognitive abilities than other measures of brain structure is also supported by animal research, showing that rates of dendritic atrophy in rats did not differ between aged cognitively unimpaired and aged [...]

The shape of the ageing brain

A second contribution this paper makes is to characterize structural age-related differences and [...] included the breadth of morphometry assessed here. Our approach, therefore, allowed us to directly compare the magnitude of cortical age-related differences and changes across a range of metrics.

The biggest age-related change (cross-sectionally and longitudinally) was that of cortical thickness, followed (cross-sectionally) by curvature. This suggests that the most striking structural transformation the human brain undergoes with age, at least of those detectable with MRI, is that the cortex thins while also becoming more 'curved'. The width and depth of cortical sulci might influence the complexity metric, such that more atrophied brains might exhibit an increase in gyral complexity but not a decrease in surface area (Narr et al., 2004; Lemaitre et al., 2012).

We also show that combining shape measures outperforms any individual metric's ability to capture age-related and cognitive differences: together, the eight morphometric metrics assessed here explained almost double the variance captured by thickness and surface area alone. Thus, the fact that multiple morphometric measures provided partially complementary information about the outcome highlights the potential usefulness of assessing various morphological shape measures when investigating the ageing brain and cognitive abilities.

Methodological strengths and limitations
In addition to the large sample size and the assessment of multiple shape metrics, the integration of cross-sectional and longitudinal data is of note. Recent reviews and commentaries have pointed to the limitations of cross-sectional analyses when investigating brain-cognition relationships in the ageing brain (see Oschwald et al., 2020, for a discussion). While we agree that collecting longitudinal data is almost always preferable, we acknowledge that it is not always attainable. Our approach of integrating cross-sectional and longitudinal data, where the latter largely confirmed the findings of the former, offers some validation of cross-sectional approaches.

Another key strength of this paper is that we successfully replicated our cross-sectional and longitudinal findings in an independent cohort. In doing so, we not only validated the apparent existence of the morphological double dissociation, but also showed that it does not depend on specific features of the Cam-CAN data. Indeed, replicating our results despite important differences between the two datasets increases the robustness of our findings considerably. For instance, the cognitive tests differed (Cattell in Cam-CAN, WASI Matrix in LCBC), suggesting that surface area captures the broader construct of fluid intelligence (rather than test-specific features). Moreover, while the morphological metrics assessed in our initial Cam-CAN study offered an intriguing description of the ageing brain, obtaining them required five separate processing [...]

The breadth of structural brain metrics reviewed in this paper also comes with some important limitations. First, we were not able to investigate the changes of several of the metrics which we had assessed in our cross-sectional analyses. This is because the pipelines used to calculate these additional metrics (e.g. Mindboggle) are not yet optimised for longitudinal data. Curvature in particular, which showed a very strong age effect cross-sectionally, would have been interesting to explore longitudinally. Likewise, fractal dimensionality, which measures cortical complexity and correlated strongly with age and cognition in our cross-sectional analyses, might be a promising candidate for future longitudinal investigations.

In this paper, we found cross-sectional and longitudinal evidence for a brain-cognition double dissociation: two morphological metrics, surface area and cortical thickness, which tend to be investigated together through grey matter volume, are differentially associated with age and fluid intelligence. While thickness is strongly associated with age, it has weak associations with change in fluid intelligence, a pattern that is reversed for surface area, which captures cognitive change and difference well, and age relatively poorly. We therefore recommend that rather than using grey matter volume as the default measure, researchers should choose structural brain metrics depending on the question under investigation. Doing so will allow us to advance our understanding of the functional significance of these dissociable aspects of [...] score to be considered cognitively healthier than a younger individual with a higher score.

A subset of participants (N=261) was scanned twice, with an average interval between the first and the second scan of 1.33 years (sd = 0.66). Additionally, a (partially separate) subset of participants (N=233) completed the Cattell test twice with an average interval between the two cognitive tests of 6.0 years (sd = 0.67). Two waves of both brain and cognitive data were available for 115 participants.

Imaging data acquisition and pre-processing
T1- and T2-weighted 1 mm isotropic magnetic resonance imaging scans were available for 647 participants (Taylor et al., 2017). To ensure the quality of the image segmentations, we adapted a recently developed supervised learning tool (Klapwijk et al., 2019), which led us to exclude six participants due to low-quality segmentations. Our quality control process is described further in the supplementary materials. In order to investigate (cross-sectional) brain morphology in as much detail as possible, we examined a total of eight brain metrics: in addition to three FreeSurfer-derived measures of cortical volume, thickness and surface area (derived from a standard FreeSurfer recon-all pipeline), we examined grey-matter volume derived from SPM 12 [...]

Cross-sectional analyses
All analyses were carried out using R (R Core Team, 2013), and the code used for this paper is available on the Open Science Framework (https://osf.io/n6b4j/).

First, we calculated whole brain as well as regional correlations between each metric and age, fluid intelligence and age-residualized fluid intelligence. Regional correlations were FDR corrected at alpha = 0.05. Next, we estimated a series of path models to assess which combination of whole brain metrics best predicted age, fluid intelligence and age-residualized fluid intelligence. We then examined the robustness of our frequentist modelling approach with a Bayesian modelling framework (see supplementary materials).
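The FDR correction of regional p-values mentioned above can be illustrated with a short Python sketch. It assumes the Benjamini-Hochberg step-up procedure (the text says only "FDR corrected"), and the p-values are hypothetical, not results from this study.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected
    under the Benjamini-Hochberg step-up procedure at level alpha."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank
    # Reject every hypothesis whose rank is at most max_k.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_k:
            rejected[idx] = True
    return rejected

# Hypothetical regional p-values (one per brain region), for illustration only.
regional_p = [0.001, 0.008, 0.039, 0.041, 0.6]
significant = benjamini_hochberg(regional_p, alpha=0.05)
print(significant)
```

The step-up rule means a p-value can survive correction even if it exceeds its own rank threshold, as long as a larger p-value further down the sorted list meets its threshold.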

Longitudinal analyses
To assess neural and fluid intelligence change between time point 1 and time point 2, we fit a series of longitudinal structural equation models for each longitudinal FreeSurfer metric (whole brain volume, thickness and surface area) and fluid intelligence. Before assessing cognitive change, we also tested for longitudinal measurement invariance (Widaman et al., 2010). Additionally, as the second Cattell test was completed online by approximately half of the participants, versus pencil and paper by the other half, we investigated whether these two groups differed in their measurement properties by assessing metric invariance (constraining factor loadings) and scalar invariance (constraining intercepts).

To understand whether cognitive change was correlated with morphometric change, and if so, [...] performed to assess whether the change-change relationships differed significantly between the different metrics (Steiger, 1980). Given the properties of the data, obtaining latent cognitive scores was not possible in the replication sample (see below), so we also ran the models with observed variables only within Cam-CAN to ensure maximal comparability between the two sets of analyses. We ran models on participants with at least one cognitive score (N=362) using full information maximum likelihood (FIML, which assumes data are missing-at-random, Enders & Mansolf, 2018, and enables robust standard errors to account for missingness).
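As an illustration of how two change-change correlations that share one variable (cognitive change) can be compared, here is a sketch of Steiger's (1980) Z for dependent, overlapping correlations. It is written in Python rather than R, is not necessarily the exact computation performed by the psych package, and all input values are hypothetical.

```python
import math
from statistics import NormalDist

def steiger_z(r12, r13, r23, n):
    """Compare two dependent correlations that share variable 1.

    r12: correlation of variable 1 with variable 2 (e.g. cognition change, area change)
    r13: correlation of variable 1 with variable 3 (e.g. cognition change, thickness change)
    r23: correlation between variables 2 and 3
    n:   sample size
    Returns (z, two-sided p) using Steiger's (1980) pooled-r covariance term.
    """
    z12 = math.atanh(r12)  # Fisher r-to-z transform
    z13 = math.atanh(r13)
    r_bar = (r12 + r13) / 2
    # Covariance between the two Fisher-transformed correlations.
    cov = (r23 * (1 - 2 * r_bar ** 2)
           - 0.5 * r_bar ** 2 * (1 - 2 * r_bar ** 2 - r23 ** 2)) / (1 - r_bar ** 2) ** 2
    z = (z12 - z13) * math.sqrt(n - 3) / math.sqrt(2 - 2 * cov)
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

# Hypothetical inputs, chosen for illustration only.
z_stat, p_value = steiger_z(r12=0.5, r13=0.1, r23=0.3, n=200)
print(round(z_stat, 2), p_value)
```

The correlation between the two brain metrics (`r23`) matters: the more strongly they covary, the smaller the difference between `r12` and `r13` needs to be to reach significance.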

To assess the robustness of our results, we investigated whether our core findings replicated in a second, independent dataset. To this end, we analysed data from the Centre for Lifespan [...], 2016)). At least two waves of cognitive and/or neural data were available for 389 participants. Where participants had more than two waves, we selected their first and last time point, maximizing the interval between waves as well as the data similarity between samples. This allowed us to include the largest possible number of participants in our longitudinal analyses while maintaining two-wave models comparable to those described in Cam-CAN. The mean interval between the two waves so defined was 5.18 years (min = 0.73, max = 10.0, sd = 2.59 years).

Our analysis pipeline mirrored that described above: cross-sectionally, whole brain correlations were followed by frequentist path models and Bayesian model selection analyses.
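The wave-selection rule described above (keep each participant's first and last time point) can be sketched as follows. The record layout, participant IDs and ages are hypothetical and serve only to illustrate the rule; this is not the authors' code.

```python
from collections import defaultdict

def first_and_last_wave(records):
    """Keep the earliest and latest visit for each participant with >= 2 visits.

    records: iterable of (participant_id, age_at_visit) tuples (hypothetical layout).
    Returns {participant_id: (age_at_first_visit, age_at_last_visit)}.
    """
    visits = defaultdict(list)
    for participant_id, age_at_visit in records:
        visits[participant_id].append(age_at_visit)
    return {
        pid: (min(ages), max(ages))
        for pid, ages in visits.items()
        if len(ages) >= 2  # single-wave participants cannot contribute a change score
    }

# Made-up example: participant "a" has three waves, "b" one, "c" two.
toy_records = [("a", 30.0), ("a", 32.5), ("a", 36.0), ("b", 50.0), ("c", 61.0), ("c", 63.0)]
pairs = first_and_last_wave(toy_records)
intervals = {pid: last - first for pid, (first, last) in pairs.items()}
mean_interval = sum(intervals.values()) / len(intervals)
print(pairs, mean_interval)
```

Choosing the first and last visit maximizes each participant's interval while keeping the model a two-wave design.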

Imaging data acquisition and pre-processing
We based our quality control process on the supervised learning tool 'Qoala-T' developed by Klapwijk et al., which was originally developed for child and adolescent samples (see manual, 2019a, and manuscript, 2019b). First, we manually rated the quality of 12% of our FreeSurfer-preprocessed Cam-CAN scans, thereby surpassing the proportion of 10% recommended by the Qoala-T authors. These scans later served as input for Qoala-T, so that the algorithm would learn to distinguish between scan qualities suitable or unsuitable for further analyses. Second, following the manual ratings, we used Qoala-T's publicly available quality control tool to assess the quality of all T1 Cam-CAN images. This resulted in six participants being excluded from the sample (aged 32-71, median = 59). We have uploaded a detailed rating procedure to this project's OSF page, as we hope that it will help other researchers implement versions of this semi-automatic quality control procedure for large adult lifespan samples.

Frequentist modelling approach

Whole Brain Correlations
We examined whether the different metrics of brain structure provided unique and complementary information about age and cognitive ability. To do so, we ran frequentist path models and a Bayesian model selection framework in which cortical thickness and surface area predicted either age, fluid intelligence or age-adjusted fluid intelligence (ignoring volume since this is the product of thickness and surface area). These revealed that the best model of age and fluid intelligence required both surface area and thickness (Figure 1 A-B). In contrast, individual differences in (age-residualized) fluid intelligence were best captured by surface area alone [...]

[Table caption fragment] Compares the best model (top row) to the next five best-fitting models.

Regional results
In Cam-CAN, after looking at whole brain correlations between the eight metrics and age, fluid intelligence and age-residualized fluid intelligence, we investigated regional correlations.
Regions were assigned 62 labels following the Desikan-Killiany-Tourville (DKT) protocol in the Mindboggle pipeline (Klein et al., 2018). We then averaged across both hemispheres. Results are shown in Tables 4-6 and plotted in Figures 4-6. Note that data for the entorhinal, banks of the superior temporal sulcus and temporal pole regions were only available for Thickinthehead and Volume.
Our regional investigations further supported the morphological dichotomy found in the whole brain analyses. For cortical thickness, all 32 brain regions (averaged across the hemispheres) were significantly correlated with age (all correlations were FDR corrected at alpha = 0.05), while not a single region predicted age-residualized fluid intelligence (Figure 3 and Tables 4-6 in supplementary materials). In contrast, for surface area, all regions were significantly associated with age-residualized fluid intelligence. While regional surface area also correlated with age, the correlations were substantially weaker than the brain-age correlations for cortical thickness.
More regional results are shown in Tables 4-6 and Figures 4-6 in the supplementary materials.

Intro
Here we use R's pwr package to run power analyses on the brain-age and brain-cognition relationships for volume, thickness and surface area. These are based on estimated correlation coefficients from well-powered findings in the literature.

Age
First, let's run power analyses based on whole brain-age effect sizes (correlation coefficients) found in the literature. We use this well-powered study as a source of reference (see Table 1 for whole brain-age correlation coefficients): https://www.sciencedirect.com/science/article/pii/S0197458010003210

As a reminder, Cam-CAN has a sample size of N = 647; LCBC has N = 1345.
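For readers without R, the logic of a correlation power analysis can be sketched in Python using the Fisher z approximation. This is an approximation and may differ slightly from what `pwr` reports; the effect sizes below are placeholders, not the coefficients from the referenced study, while the sample sizes are those stated above.

```python
import math
from statistics import NormalDist

def correlation_power(r, n, alpha=0.05):
    """Approximate two-sided power to detect a true correlation r with n participants.

    Uses the Fisher z approximation: atanh(sample r) is roughly normal with
    standard error 1 / sqrt(n - 3).
    """
    normal = NormalDist()
    z_crit = normal.inv_cdf(1 - alpha / 2)             # critical value under H0
    shift = math.atanh(abs(r)) * math.sqrt(n - 3)      # expected test statistic under H1
    # Probability of landing in either rejection region.
    return normal.cdf(shift - z_crit) + normal.cdf(-shift - z_crit)

# Sample sizes from the text; effect sizes are hypothetical placeholders.
for cohort, n in [("Cam-CAN", 647), ("LCBC", 1345)]:
    for r in (0.1, 0.3):
        print(cohort, "r =", r, "power =", round(correlation_power(r, n), 3))
```

With samples of this size, power is driven almost entirely by the assumed effect size, which is why the choice of reference study matters.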

Fluid intelligence
For volume and thickness, we use correlation coefficients from this study (see Figure 3).

Age-residualized fluid intelligence
Because very few studies have age-residualized cognitive abilities, no reliable, well-powered correlation coefficients were available in the literature. We therefore did not run a priori power analyses for these correlations.