Individualized characterization of volumetric development in the preterm brain

Objective: Preterm birth carries a significant risk for atypical development. While studies comparing group means have identified a number of early brain correlates of prematurity, they may ‘average out’ effects significant in a single individual. To understand better the cerebral consequences of prematurity, we created normative ‘growth curves’ characterizing neonatal brain development and explored the effect of preterm birth and related clinical risks in individual infants.
Methods: We used Gaussian process regression to map typical volumetric development in 275 healthy term-born infants, modelling for age at scan and sex. We compared magnetic resonance images of 89 preterm infants (born 28.7–34 weeks gestational age) scanned at term-equivalent age to these normative charts and related deviations from typical volumetric development to both perinatal clinical variables and neurocognitive scores at 18 months. We then tested if this approach can be generalized to an independent dataset of 253 preterm infants (born 28–31.6 weeks gestational age) also scanned at term-equivalent age but using different acquisition parameters and scanner, who were followed up at 20 months.
Results: In both preterm cohorts, cerebral atypicalities were widespread and often multiple, but varied highly between individual infants. Deviations from normative brain volumetric development were associated with perinatal factors including respiratory support, nutrition and postnatal growth, as well as with later neurocognitive outcome.
Conclusion: Group-level understanding of the preterm brain might disguise a large degree of individual differences. We provide a method and a normative dataset for clinicians and researchers to profile the individual brain. This will allow a more precise characterization of the cerebral consequences of prematurity and improve the predictive power of neuroimaging.


Introduction
Preterm birth (before 37 weeks gestational age, GA) affects approximately 10% of pregnancies worldwide 1 . While recent advances in neonatal medicine have greatly improved survival rates, preterm born children are at a significant risk of atypical brain development and lifelong cognitive difficulties 2 including a higher incidence of neurodevelopmental and psychiatric disorders 3,4 .
Although early brain correlates of preterm birth have been identified at a group level in preterm infants 5 , this vulnerable population is highly heterogenous, with individuals following diverse clinical and neurocognitive trajectories 6,7 . This variability poses a significant challenge for the statistical techniques adopted by studies that explore mean or median effects to describe the average abnormality observed within a group 8,9 . Despite providing valuable insights about how the vulnerable brain develops, these approaches struggle to capture atypical development fully due to their limited capacity to characterize heterogeneity and their inability to infer at the level of the individual.
To understand brain development at an individual level, offer accurate prognosis of later outcome, and study the effects of clinical risks and interventions, it is necessary to provide a personalized assessment of cerebral maturation 10 . Indeed, an unwarranted assumption that preterm birth has a homogenous effect on brain development might account for the relatively poor predictive power of neonatal MRI for later outcome, especially in the absence of major focal lesions 11,12 . Comparing individuals against robust normative data avoids the requirement to define quasi-homogenous groups in a search for effects common to the group, and offers a powerful alternative to investigate brain maturation with high sensitivity to pathology at an individual infant level 10,13,14 .
In this study, we used Gaussian process regression (GPR) to create 'growth curves' of normative volumetric development using a large sample of healthy term-born infants scanned cross-sectionally within the first month of life. Analogous to the widely employed paediatric height and weight growth charts, this technique allows the local imaging features of individuals to be referred to typical variation in term-born infants while simultaneously accounting for confounds and variables such as age and sex 9 . Having established normative values for brain growth in a large group of healthy term-born infants, we quantified deviations from typical development in individual preterm infants. We investigated the heterogeneity of these deviations, and the association between individual deviations, perinatal clinical variables and later neurocognitive abilities. To test generalizability, we repeated these assessments in infants from a second large independent dataset acquired on a different MR scanner using different imaging parameters.

Participants
This study utilised data from two neonatal cohorts. Informed written consent was given by parents prior to imaging. Pulse oximetry, temperature and heart rate were monitored throughout, and ear protection was used for each infant (President Putty, Coltene Whaledent, Mahwah, New Jersey, USA; MiniMuffs, Natus Medical, San Carlos, California, USA). All images were inspected by experienced neonatal neuroradiologists.

Developing Human Connectome Project
Infants were participants in the developing Human Connectome Project (dHCP), approved by the National Research Ethics Committee (REC: 14/Lo/1169) and were scanned during natural unsedated sleep at the Evelina London Children's Hospital between 2015 and 2019. Full details regarding the preparation of infants prior to scanning have been previously described 15 . The initial dataset consisted of 323 term-born and 100 preterm infants scanned at term-equivalent age (TEA: 37-45 weeks postmenstrual age, PMA). Exclusion criteria for the term-born infants included admission to neonatal intensive care unit or significant intracranial abnormality detected on neonatal MRI scan (including acute infarction and parenchymal haemorrhage), but not punctate white matter lesions (PWMLs), small subependymal cysts/haemorrhages in the caudothalamic notch, mildly prominent ventricles or widening of the extra-axial CSF (within normative variation). Seventeen term-born infants were excluded due to incidental findings. There were no exclusion criteria for the preterm infants, except for major congenital malformations.

Evaluation of Preterm Imaging Project
We used the data from a second preterm cohort consisting of 511 infants from the Evaluation of Preterm Imaging (EPrime, REC:09/H0707/98). Details regarding infant characteristics and preparation prior to scanning have been previously described 12 . In brief, infants born before 33 weeks GA underwent MRI between 37 and 45 weeks PMA at the neonatal intensive care unit in Hammersmith Hospital between 2010 and 2013. Infants with major congenital malformations were excluded.

MRI acquisition and preprocessing
MRI data for the dHCP were collected on a Philips Achieva 3T (Philips Medical Systems, Best, The Netherlands) using a dedicated 32-channel neonatal head coil 15 . T2-weighted scans were acquired with TR/TE of 12s/156ms, SENSE=2.11/2.58 (axial/sagittal) with in-plane resolution of 0.8×0.8mm, slice thickness of 1.6mm and overlap of 0.8mm. Images were motion corrected and super-resolution reconstructed resulting in 0.5mm isotropic resolution (full details in Makropoulos et al. 16 ). MRI data collected for EPrime were acquired on a Philips 3T system using an eight-channel phased array head coil. T2-weighted turbo spin echo was acquired with TR/TE of 8670/160ms, in-plane resolution 0.86×0.86mm, slice thickness of 2mm with 1mm overlap.
Both datasets were preprocessed using the dHCP structural pipeline 16 . In brief, motion-corrected, reconstructed T2-weighted images were corrected for bias-field inhomogeneities, brain extracted and segmented into 9 tissue classes using the Draw-EM algorithm 17 . Tissue labels included cerebrospinal fluid (CSF), white matter (WM), cortical grey matter (cGM), deep grey matter (dGM), ventricles (including the cavum), cerebellum, brainstem, hippocampus and amygdala. dGM was further parcellated into left/right caudate, lentiform and thalamus. Total tissue volume (TTV) incorporated all brain GM and WM volumes; total brain volume (TBV) included TTV and ventricles; and intracranial volume (ICV) included TBV and CSF (Table 1). Given the high correlation between TTV and TBV (ρ=0.98), we reported only TTV.
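The global volume definitions above, and the relative volumes derived from them in the modelling section, can be illustrated with a minimal sketch. The label names and the numbers are hypothetical and are not study data; treating every non-fluid label (including cerebellum and brainstem) as part of TTV is an assumption made here for illustration.

```python
# Hypothetical label names; every label that is not fluid is assumed to count towards TTV.
FLUID_LABELS = {"csf", "ventricles"}


def global_volumes(volumes):
    """Return TTV, TBV and ICV (same units as the input) for one infant."""
    ttv = sum(v for label, v in volumes.items() if label not in FLUID_LABELS)
    tbv = ttv + volumes["ventricles"]  # TBV = TTV + ventricles
    icv = tbv + volumes["csf"]         # ICV = TBV + CSF
    return {"TTV": ttv, "TBV": tbv, "ICV": icv}


def relative_volumes(volumes):
    """Tissue as a proportion of TTV, ventricles of TBV, CSF of ICV."""
    totals = global_volumes(volumes)
    relative = {label: v / totals["TTV"]
                for label, v in volumes.items() if label not in FLUID_LABELS}
    relative["ventricles"] = volumes["ventricles"] / totals["TBV"]
    relative["csf"] = volumes["csf"] / totals["ICV"]
    return relative


# Made-up volumes in cm^3 for a single infant (illustration only).
example = {"csf": 80.0, "wm": 160.0, "cgm": 140.0, "dgm": 25.0, "ventricles": 5.0,
           "cerebellum": 25.0, "brainstem": 6.0, "hippocampus": 2.0, "amygdala": 1.0}
totals = global_volumes(example)
proportions = relative_volumes(example)
print(totals)
print(round(100 * proportions["cgm"], 1), "% of TTV is cGM")
```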
The quality of the preprocessing was visually evaluated to ensure no images severely affected by head motion or with poor segmentation were included. This was achieved using a scoring system detailed in Makropoulos et al. 16 . In brief, we excluded images of poor quality due to severe (score 1) or significant motion (score 2) and included images with negligible motion (score 3) and no visible effects of motion (score 4) (Suppl. Fig. 1). The quality of the segmentation was examined in the same fashion. Images were excluded due to unsuccessful (score 1) or poor (regional errors - score 2; Suppl.

Table 1.
Total tissue volume (TTV): all brain GM + WM tissue
Total brain volume (TBV): all brain GM + WM tissue + ventricles
Intracranial volume (ICV): all brain GM + WM tissue + ventricles + CSF

Modelling volumetric development using Gaussian Process Regression
Accuracy was tested under 5-fold cross-validation stratified to cover the whole PMA range (37 to 45 weeks). The association between brain volumes and the model predictors was estimated with a combination of radial basis function, linear and white noise covariance kernels (sum kernel). Model hyperparameters were optimized using log marginal likelihood. Prediction performance was evaluated using the mean absolute error and the correlation (Spearman ρ) between the observed and the predicted values from the 5-fold cross-validation.
We characterized normative development in absolute (cm 3 ) and relative (%) volumes. To calculate relative volumes, we estimated the proportion of each tissue volume from TTV, the ventricles from TBV and CSF from ICV (Table 1). To investigate the effects of preterm birth, we chose to look at relative instead of absolute volumes to (i) ensure results are not driven by extreme individual differences in non-brain intracranial volume, often seen in preterm infants, (ii) partially alleviate differences in data acquisition. Furthermore, to quantify the effect of the difference in imaging spatial resolution between the dHCP and EPrime datasets, we reran the GPR models with dHCP data downsampled to 1mm isotropic resolution prior to tissue segmentation and examined the difference in (i) model means and (ii) the number of EPrime infants who deviated significantly from the predicted model mean.
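The modelling set-up described in this section (a sum of radial basis function, linear and white noise kernels, hyperparameters fitted by maximising the log marginal likelihood, and 5-fold cross-validation stratified across the PMA range) might be sketched as follows. This is an illustration only: the text does not name the software used, so scikit-learn is swapped in here as an assumption, the data are synthetic, and the one-week PMA bins used for stratification and the standardisation of predictors are choices made for this sketch.

```python
import numpy as np
from scipy.stats import spearmanr
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, DotProduct, WhiteKernel
from sklearn.model_selection import StratifiedKFold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT study data): age at scan (weeks PMA), sex, one volume.
n = 200
pma = rng.uniform(37.0, 45.0, n)
sex = rng.integers(0, 2, n).astype(float)
volume = 250.0 + 20.0 * (pma - 37.0) + 10.0 * sex + rng.normal(0.0, 15.0, n)
X = np.column_stack([pma, sex])


def make_model():
    # Sum kernel: radial basis function + linear (dot product) + white noise.
    kernel = RBF(length_scale=1.0) + DotProduct(sigma_0=1.0) + WhiteKernel(noise_level=1.0)
    # Fitting optimises the kernel hyperparameters via the log marginal likelihood.
    gpr = GaussianProcessRegressor(kernel=kernel, normalize_y=True, random_state=0)
    return make_pipeline(StandardScaler(), gpr)


# Stratify folds by one-week PMA bins so each fold spans 37 to 45 weeks (assumed scheme).
pma_bin = np.digitize(pma, np.arange(38.0, 45.0, 1.0))
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

predicted = np.empty(n)
for train_idx, test_idx in cv.split(X, pma_bin):
    model = make_model().fit(X[train_idx], volume[train_idx])
    predicted[test_idx] = model.predict(X[test_idx])

# Out-of-sample performance: mean absolute error and Spearman correlation.
mae = float(np.mean(np.abs(volume - predicted)))
rho, _ = spearmanr(volume, predicted)
print(f"MAE = {mae:.1f} cm^3, Spearman rho = {rho:.2f}")
```

Scaling the predictors inside the pipeline keeps the cross-validation folds free of information from the held-out infants.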

Association between perinatal clinical risks and deviations from normative development
Individual Z-scores were computed for every region and infant. To quantify extreme deviations, prior to analyses, we chose a threshold of |Z|>2.6 (corresponding to p<0.005) following the convention adopted in previous GPR analyses modelling adult brain development 8,18 . We examined the proportion of infants with volumes lying more than 2.6 standard deviations (sd) above or below the model mean (indicating the top and bottom 0.5 percent of the typical group values described hereafter as extreme positive or negative deviations).
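As a rough illustration of the thresholding described above, the sketch below computes Z-scores from a model's predicted mean and standard deviation and reports the proportion of extreme positive and negative deviations. The function names and numbers are hypothetical, and dividing by the model's predictive sd is an assumption about how the Z-score is formed.

```python
import numpy as np
from scipy.stats import norm


def z_scores(observed, predicted_mean, predicted_sd):
    """Deviation of each observation from the model mean, in units of sd."""
    observed = np.asarray(observed, dtype=float)
    return (observed - np.asarray(predicted_mean, dtype=float)) / np.asarray(predicted_sd, dtype=float)


def extreme_proportions(z, threshold=2.6):
    """Proportion of infants above +threshold and below -threshold."""
    z = np.asarray(z, dtype=float)
    return {"positive": float(np.mean(z > threshold)),
            "negative": float(np.mean(z < -threshold))}


# Made-up relative volumes (%) for four infants against a model mean of 35% and sd of 2%.
observed = [30.0, 41.0, 25.5, 36.0]
z = z_scores(observed, predicted_mean=35.0, predicted_sd=2.0)
props = extreme_proportions(z)

# Tail probability beyond 2.6 sd on one side of a standard normal distribution.
one_sided_tail = float(norm.sf(2.6))
print(z, props, round(one_sided_tail, 4))
```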
We tested the association between these deviations and recognized perinatal clinical risks 19 . These included GA at birth, postnatal growth, birth weight Z-score, days receiving mechanical ventilation, days receiving continuous positive airway pressure (CPAP) and days receiving total parenteral nutrition (TPN, available only for EPrime). Postnatal growth was estimated as the difference between birth and scan weight Z-scores, calculated using the population data from the uk90 growth charts implemented in the sitar R package 20 .

Association between deviations from normative development and later neurocognition
Bayley III Scales of Infant Development (BSID-III) 21 assessment was carried out by trained developmental paediatricians/psychologists at 18 months for the dHCP and at 20 months for EPrime.
We used the composite scores for motor, cognitive and language development (mean(sd)=100 (15)).
Socio-economic status was defined by the Index of Multiple Deprivation (IMD), based on parents' postal address, which accounts for 38 factors including income, employment, education, health and crime, with a higher score indicating greater deprivation.

Statistical testing
Associations were examined using Spearman rho (ρ) or Mann-Whitney U test combined with Cliff's delta (d, ranging from -1 to 1). Statistical testing was done under Bonferroni-Holm multiple comparison corrections. Data analyses and visualization were performed in R 3.6.1 (www.r-project.org) and Python 3.7 (www.python.org).
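The tests named above can be sketched in Python as follows. The data are synthetic and the helper names (`cliffs_delta`, `holm_adjust`) are hypothetical; Cliff's delta is computed directly from pairwise comparisons and the Holm step-down adjustment is written out by hand, so this is one possible implementation rather than the analysis code used in the study.

```python
import numpy as np
from scipy.stats import mannwhitneyu, spearmanr


def cliffs_delta(x, y):
    """Cliff's delta: P(x > y) - P(x < y) over all pairs, ranging from -1 to 1."""
    x = np.asarray(x, dtype=float)[:, None]
    y = np.asarray(y, dtype=float)[None, :]
    return float((np.sum(x > y) - np.sum(x < y)) / (x.size * y.size))


def holm_adjust(p_values):
    """Bonferroni-Holm adjusted p-values, returned in the input order."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    scaled = (m - np.arange(m)) * p[order]  # smallest p is multiplied by m
    adjusted_sorted = np.minimum(1.0, np.maximum.accumulate(scaled))
    adjusted = np.empty(m)
    adjusted[order] = adjusted_sorted
    return adjusted


rng = np.random.default_rng(1)

# Synthetic group comparison (e.g. a Z-score in infants with vs without a risk factor).
group_a = rng.normal(0.0, 1.0, 40)
group_b = rng.normal(0.8, 1.0, 40)
_, p_group = mannwhitneyu(group_a, group_b, alternative="two-sided")
delta = cliffs_delta(group_a, group_b)

# Synthetic continuous association (e.g. a clinical variable against a Z-score).
risk = rng.normal(size=40)
deviation = 0.5 * risk + rng.normal(size=40)
rho, p_rho = spearmanr(risk, deviation)

adjusted = holm_adjust([p_group, p_rho])
print(f"Cliff's d = {delta:.2f}, rho = {rho:.2f}, Holm-adjusted p = {adjusted}")
```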

Data availability
The data collected for the dHCP will be publicly available at http://developingconnectome.org/.

Results
The perinatal characteristics, demographics and neurocognitive outcome of the sample are presented in Table 2. On average EPrime infants were born earlier (p<0.05, d=0.28) and had lower birth weight (p<0.05, d=0.19) compared to dHCP preterm infants. They also had poorer motor (p<0.05, d=0.22) but not language (p=0.1) nor cognitive (p=0.12) skills at follow-up. There were no differences in days on CPAP between the two preterm cohorts (p=0.57), but on average dHCP preterm infants required mechanical ventilation for longer (p<0.05, d=0.16). The two preterm cohorts did not differ in PWMLs incidence (p=0.09) or proportion of infants with intrauterine growth restriction (IUGR, p=0.15).

Characterizing typical brain volumetric development during the neonatal period
First, we characterized the development of absolute brain volumes during the neonatal period. The data showed an increase in all volumes except ventricles, where no change was detected (Fig. 1A, 3A). The increase was greatest in the cGM (10.4% per week, pw) and the cerebellum (9.9% pw) compared to ICV (6.1% pw), TTV (6% pw) and CSF (7% pw). Correlation between observed and predicted values was highest in cGM (ρ=0.72) and cerebellum (ρ=0.77), and lowest in CSF (ρ=0.33).
The greatest changes in relative volumes were observed in cGM and WM (Fig. 1B, Suppl. Fig. 3B). cGM represented 36% of TTV at 37 weeks PMA, and increased to 44% at 44 weeks PMA, while the relative WM volume decreased from 48% to 38% of TTV. The relative cerebellar volume increased from 6% to 7%. The model showed an increase in the relative volume of the lentiform, a subtle decrease in caudate and no change in thalamus. We observed a slight increase in the CSF proportion of ICV and a steady decrease in the proportion that ventricles contributed to TBV. Mean absolute error ranged between 0.43 (cGM and WM) and 0.78 (CSF) in units of sd (Suppl. Table 1).

Image resolution and volumetric development
Overall, the majority of observations in both dHCP and EPrime preterm infants fit within 2.6 sd of the term-born model mean with good agreement between the two studies (Fig. 2A). Differences were most profound in fluid-filled structures, likely attributable to partial voluming of high T2 signal CSF.
In agreement, when compared to the models built using the original dHCP resolution of 0.5mm, the matched 1mm resolution GPR models showed a mean shift (increase) only for the CSF and ventricular volumes (Fig. 2B; Suppl. Fig. 4 and 5). As a result, when using the lower resolution growth charts the proportion of extreme positive deviations in the EPrime decreased from 53% to 29% for CSF and from 44% to 32% for ventricles (Fig. 2C). Changes in the proportion of extreme deviations associated with image resolution for the rest of the brain structures were more subtle.
Unless stated otherwise, data are presented for the 0.5mm resolution models.

Infants who showed deviations in thalamic volume also had PWMLs
In the dHCP preterm sample, all eight infants with extreme negative thalamic deviations had PWMLs, and seven of the eight had multiple lesions. Four of these seven infants had lesions involving the corticospinal tract (Fig. 3). Seven of the eight infants were on CPAP, but none of them for a long period of time (five infants <4 days; one infant 11 days; one infant 18 days) and all seven did not require ventilation. In six of the eight cases, extreme deviations in thalamic volumes were isolated findings with no other deviations in brain volumes. In one infant this was accompanied by overall brain alterations including reduced TTV and cerebellar volume and increased CSF. This infant had bilateral cystic lesions in the thalamus and was on ventilation for 6 days and CPAP for 11 days. In one infant, reduced thalamic volumes were accompanied by increased CSF. None of these infants had a birthweight of less than 1kg.
In the EPrime cohort, 17 infants had bilateral reduced thalamic volume and 10 had unilateral extreme deviations (with the structure in the other hemisphere close to but not reaching Z<-2.6). Of these infants, 78% had PWMLs, compared to 16% incidence in the rest of the sample. Furthermore, across the whole cohort, infants with PWMLs had significantly reduced left (Cliff's d=0.56) and right (Cliff's d=0.53) thalamic volumes, compared to infants without (both p<0.05). In the EPrime study, infants with reduced thalamic volumes often had CSF or ventricular volumes significantly bigger than the normative values for their age. In seven infants, this was associated with periventricular leukomalacia and in a further two, with haemorrhagic parenchymal infarction.

Infants with extreme deviations in cerebellar volume have poor outcome
Significant reductions in cerebellar volumes (Z<-2.6) were accompanied by increased ventricular volume in all infants and by CSF widening in the majority of infants (dHCP: 2/3; EPrime: 4/5). In two infants, we observed significant cerebellar injury resulting in deviations lying more than 10 sd below the predicted mean for their age and sex. The first infant had bilateral cerebellar haemorrhage with marked parenchymal damage, resulting in atrophic brainstem and substantial cerebellar tissue loss (infant 1, Fig 4). The second infant had imaging features suggestive of a likely underlying genetic condition (infant 2, Fig 4). Both infants were born extremely preterm with very low birth weight (infant 1: GA=24+3, 800 grams; infant 2: GA=27+3, 550 grams). Where follow-up data were available

Figure 4. Extreme deviations in cerebellar development in two preterm infants with marked cerebellar injury.
Extreme reduction in cerebellum volume during the perinatal period is associated with widespread alterations across the neonatal brain. In both cases, reduced cerebellar volume was accompanied by widening of the CSF, dilation of the ventricles and reduction in TTV. Both of these infants had poor neurocognitive follow-up.

Atypical ventricular development in preterm infants -frequent but highly heterogeneous
Widening of the fluid-filled brain structures was the most frequently observed deviation from normative development in both studies. In the dHCP 29% and 17% of the preterm infants showed extreme deviations in ventricular and CSF volumes, respectively. This number was higher in the EPrime where increased ventricles and CSF were seen in 44% and 53% of the sample with the original 0.5mm dHCP resolution and in 29% and 32% when the downsampled 1mm resolution was used. Figure 5 shows the most extreme cases where infants' ventricles were 10 sd above the model mean. These extreme deviations in ventricular volume were associated with overt focal brain injuries including haemorrhagic parenchymal infarction in infants 1, 2, 4 and 6, and periventricular leukomalacia in infants 5 and 7. In all these infants we also observed extreme negative deviations in TTV or thalamus and increased CSF. These infants had poor neurocognitive performance later in life (Fig 5). The figure also depicts the marked variability in ventricular development observed in preterm cohorts.

Figure 5. [...] Fig 2A). The figure also depicts the T2-weighted images for infants with ventricular volume lying 10 sd above the mean, separate for females (top) and males (bottom), together with their neurocognitive scores (M - motor, C - cognitive, L - language). Ventricular development in EPrime preterm infants is highly heterogeneous both in shape and size as illustrated in (B) showing ventricular volumes of various Z-scores.

Association between perinatal clinical risks and deviations from normative development
Next, we sought to replicate previous findings describing the relationship between preterm birth, related clinical risks and brain volumetric development.

Figure 6. Association between GA at birth and deviations from normative brain growth.
In the dHCP preterm sample, decreased GA at birth was related to reduced TTV and increased CSF. In EPrime, decreased GA at birth was associated with reduced TTV and increased ventricular volume. Individual preterm observations are plotted against the normative model mean for female (purple) and male (blue) term infants. The plots also show ±1, ±2 and ±3 sd from the normative model means together with dashed (---) lines indicating |Z| > 2.6, the threshold used to define extreme deviations. Ventricular data are shown only for infants with volume ±10 sd from the model mean.

Discussion
We previously demonstrated that GPR could be used to detect subtle white matter injury with high sensitivity 10 , and to characterise the heterogeneous cerebral consequences of prematurity on the developing brain microstructure 7 . However, the present application of GPR to volumetric data offers more straightforward clinical translation. GPR provides growth curves describing normative volumetric growth and can detect and quantify atypical development 22-24 . The GPR approach generalized to a cohort of infants with brain images collected on a different MR scanner with different acquisition parameters. This information could be integrated into automatic tools that complement radiological decisions regarding infant development, and in the future it might aid personalized intervention choices 24 . GPR allows the inclusion of further covariates within the model framework, for example genetic or demographic data, and our methods and normative dataset will be freely available for clinicians and researchers to develop further personalized approaches to understanding pathogenesis, trialling interventions and defining neurocognitive prognosis for vulnerable preterm infants.
We quantified rapid postnatal brain growth consistent with previous imaging and post-mortem studies which have described dynamic changes in the size, organisation and complexity of the human brain during the perinatal period 17,25-31 . Abrupt preterm extrauterine exposure represents a significant stressor leading to widespread deviations from the normative trajectory of brain growth with a wide range of neurodevelopmental consequences 32-36 . However, these alterations are not a result of loss of the intrauterine environment alone, but are a product of the cumulative effects of clinical and genetic factors creating individualized circumstances for every infant. GPR applied to a large normative dataset offers a powerful approach to study how prematurity shapes the brain at an individual infant level, and offers the means to capture important differences in single infants that may be missed by analysis of the means or medians of quasi-homogenous groups, which 'averages out' personal effects.
By quantifying this inter-individual variability, our analysis clarified the relationship between reduced global brain growth and preterm birth. Many but not all studies show group-level differences in total brain volume between preterm and term-born infants 33 . We report a subset of infants in both preterm cohorts that deviated significantly from normative brain tissue volumes. These infants were born very early, very small and had prolonged need for supplemental oxygen. Consistent with this, lower GA at birth, lower birth weight Z-score, and longer requirement for respiratory support and TPN were related to reduced TTV and enlarged CSF/ventricles in both preterm cohorts. Not all extremely preterm infants had TTV deviations significantly below the model mean, which could explain the discrepancies found between previous group analyses. A personalized approach is now possible to address the important question of which protective factors, or lack of adverse perinatal risks, lead to typical global brain growth in these at-risk infants.
Brain growth is neither uniform nor uniformly affected by preterm birth. The thalamus is particularly vulnerable following preterm birth 37,38 . In agreement with group mean studies reporting reduced thalamic size at TEA 38-40 , we showed a subset of preterm infants with thalamic volumes significantly below the model mean (Z<-2.6). These infants had a high load of PWMLs, further supporting previous suggestions of a close link between thalamic development and white matter abnormalities, including a previous group analysis of the EPrime dataset 38,41,42 . The cerebellum is one of the most rapidly growing structures during the perinatal period, but altered cerebellar development in preterm infants and its relation to supratentorial brain injury is complex and poorly characterized 43-46 . We observed significant deviations in cerebellar volume in the absence of structural supratentorial injury in a small proportion of infants in both preterm cohorts. Yet, infants furthest from the normative mean had substantial cerebellar injury. These infants were extremely preterm and low-birth-weight, consistent with reports of highest incidence of cerebellar haemorrhages/infarcts in this risk subgroup 47 . The cerebellum plays a crucial role in motor and cognitive development 46 , and cerebellar injury during the last trimester, especially in these extreme cases, had an adverse effect on later neurocognitive abilities.
Compared to the dHCP preterm cohort, the EPrime study comprised extremely preterm infants (GA<33 weeks), who were overall sicker during clinical care, had poorer motor outcomes, and were imaged using different acquisition parameters. These factors in combination likely underlie some of the differences in associations between extreme deviations and later neurocognitive scores observed between the two datasets. The lower spatial resolution in EPrime, in particular, contributed to the mean shift (increase) in CSF and ventricular volumes observed in EPrime. With all this in mind, it was reassuring that deviations in brain development and their association with perinatal risks found in the dHCP broadly replicated in EPrime. We chose to use absolute and relative brain volumetric measures that are easy to calculate in research studies or routine clinical examinations. This offers a direct clinical application, though given the regional heterochrony of early life brain development 48 , future work should focus on more finely-parcellated regions and take advantage of more sophisticated MRI-derived features, including cortical thickness and surface area.
Some associations were only observed in the larger EPrime cohort, which comprised earlier preterm born infants compared to the dHCP preterm sample. Specifically, a bigger birth weight Z-score was associated with a reduced proportion of brainstem and bilateral thalamic volumes, and increased postnatal growth was associated with a reduced TTV proportion of WM and left caudate volumes.
Given that the proportion these structures represent from TTV decreased with age, these findings are intuitive, suggesting that for preterm infants, being born bigger and showing good postnatal growth is related to more mature or robust volumetric development of these subcortical structures and WM by TEA.
The argument that essentially every brain is different is not novel, and the expectation that the effects of preterm birth are homogeneous and exactly alike in every infant is equally untenable 7,10 .
Personalized methodologies have been successfully applied in other fields (e.g. neuropsychology 14 ) and may hold significant promise for the preterm infant. A target for future personalized analysis could be the marked variability in ventricular size. This is likely to be important given that ventricular enlargement has been broadly but imprecisely linked to poorer outcome in group-wise studies 49,50 .
Although a group-mean difference may be detected using the conventional case-control approach, the significant heterogeneity would not be captured and effects of possible clinical significance to individual infants would be averaged out 6 . Effects that might appear visually subtle on their own might have prognostic significance when combined with other deviations from normative brain growth, for example reduced thalamic volume, and further analytic power may be gained by including covariates in the GPR model.
By focusing on the individual rather than the average atypicality within a group, our approach offers more precise understanding of the cerebral consequences of preterm birth and in future might improve the predictive power of neuroimaging.