Heterogeneity of Incipient Atrophy Patterns in Parkinson’s Disease

Parkinson’s Disease (PD) is the second most common neurodegenerative disorder after Alzheimer’s disease and is characterized by cell death in the amygdala and in substructures of the basal ganglia such as the substantia nigra. Since neuronal loss in PD leads to measurable atrophy patterns in the brain, there is clinical value in understanding where exactly the pathology emerges in each patient and how incipient atrophy relates to the future spread of disease. A recent seed-inference algorithm combining an established network-diffusion model with an L1-penalized optimization routine led to new insights regarding the non-stereotypical origins of Alzheimer’s pathologies across individual subjects. Here, we apply the same technique to PD patients, demonstrating that the high variability in their atrophy patterns also translates into heterogeneous seed locations. Our individualized seeds are significantly more predictive of future atrophy than a single seed placed at the substantia nigra or the amygdala. We also found a clear distinction in seeding patterns between two PD subgroups: one characterized by predominant involvement of brainstem and ventral nuclei, and the other by more widespread frontal and striatal cortices. This might be indicative of two distinct etiological mechanisms operative in PD. Ultimately, our methods demonstrate that the early stages of the disease may exhibit incipient atrophy patterns that are more complex and variable than generally appreciated.

then be "inverted" to enable seed region inference. Here we briefly survey the neuropathological basis of this model and the seed-inference algorithm.

Toxic misfolded AS undergoes template-driven aggregation in a "prion-like" fashion (Wood et al., 1999; Yonetani et al., 2009; Sacino et al., 2013) intracellularly, followed by cell-to-cell transmission via trans-neuronal pathways along axonal projections to remote areas (Luk et al., 2012; Masuda-Suzukake et al., 2013; Rey et al., 2016). In one study, a single intrastriatal injection of synthetic AS fibrils in wild-type mice led to the trans-neuronal transmission of pathologic AS and Parkinson's-like Lewy pathology in anatomically interconnected regions (Luk et al., 2012). Lewy pathology spreads along local and long-range fiber projections, thereby suggesting a process of "network spread". This process is common to neurodegenerative diseases, including Alzheimer's, frontotemporal dementia, and others (Braak and Braak, 1991; Pandya et al., 2017).

Recently, an in vivo quantitative approach using a predictive network diffusion model (NDM) has been applied to AD and other dementias by specifically modeling the trans-neuronal spread using connectivity (Raj et al., 2012, 2015). The diagnostic and clinical role of the NDM and other models of neurodegenerative pathologies through trans-neuronal diffusion is detailed in (Iturria-Medina, 2013; Iturria-Medina et al., 2014; Carbonell et al., 2018). Likewise, a network-based analysis has been applied to a large group of PD patients, showing atrophy distribution by modeling the trans-neuronal spread of AS outward from the SN (Zeighami et al., 2015). The study was statistically validated using neuropathological and neuroimaging data of 232 PD patients, which demonstrated that the spatial and temporal atrophy patterns in PD radiate out of the disease epicenter in the SN. The NDM was validated on the same data series in our laboratory, where we showed that the SN-seeded NDM was able to faithfully recapitulate PD cross-sectional atrophy patterns, both at the group level and at the individual level (the latter at lower predictive power and higher variability), providing a new computationally-defined staging scheme that mirrors Braak's six-stage Lewy pathology staging scheme (Braak et al., 2003).

In this paper we report that the NDM can be successfully "inverted" in order to infer the most likely pattern of incipient pathology seeding from available regional atrophy data in PD patients. For this purpose, we

represented by a vector (N=112). This t-statistic was converted to the natural range [0,1] using the logistic transform, following (Raj et al., 2015). These atrophy measures were then used to test the modeling analyses.
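The conversion of a regional statistic to the unit range mentioned above can be illustrated with a minimal sketch. It assumes a plain logistic function; the `scale` and `center` parameters and the toy t-statistics are hypothetical and are not taken from the study.

```python
import numpy as np

def logistic_transform(stat, scale=1.0, center=0.0):
    """Map an unbounded regional statistic into the open interval (0, 1).

    `scale` and `center` are hypothetical tuning parameters, not values
    reported in the paper.
    """
    stat = np.asarray(stat, dtype=float)
    return 1.0 / (1.0 + np.exp(-(stat - center) / scale))

# Toy t-statistics for five regions (illustrative values only).
t_stats = np.array([-2.0, 0.0, 0.5, 3.0, 6.0])
atrophy = logistic_transform(t_stats)
```

The transform is monotonic, so the ranking of regions by atrophy is preserved while the values become comparable across subjects.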

For this study, we used a brain parcellation with 112 regions whose details have been reported previously (Zeighami et al., 2015). Thirty-four cerebellar regions were removed, leaving 78 cerebral regions. In brief, supratentorial regions, i.e. cortical and basal ganglia related regions, were included from the Hammers atlas (Hammers et al., 2003). Furthermore, three midbrain structures (subthalamic nucleus, substantia nigra, and red nucleus) were manually segmented using the high-resolution MRI template (T1-weighted ICBM152 template, resolution = 0.5 mm³), the BigBrain (Amunts et al., 2013), and the brainstem anatomical atlas of Duvernoy (1995). A subcortical atlas based on ultrahigh-field MRI (Keuken et al., 2014) was used to confirm the accuracy of the segmentations.

Diffusion-weighted MRI (DW-MRI) data from 72 young healthy subjects were used for structural connectivity. We used the anatomical connection density (ACD) as the measure of connectivity for this paper. ACD is defined as the fraction of the connected superficial nodes with respect to the total number of superficial nodes of both areas, as proposed in (Iturria-Medina et al., 2007). ACD corrects for varying brain region sizes, as it is obtained by dividing the raw connection strength value by the sum of region-pair surface areas. We refer to this network using the connectivity matrix $C = \{c_{i,j}\}$, whose elements $c_{i,j}$ represent the connection strength of white matter fiber pathways between the $i$-th and $j$-th gray matter regions. Connections are assumed to be bidirectional due to limitations of the DTI tractography data.

Predictive PD Network Diffusion Model
Raj et al. (2012) demonstrated that the spread of proteinopathic agents over time is well captured by a dynamical system defined over a network-graph rendering of the brain, with the nodes representing gray matter structures and inter-regional connections defined as above.

atrophy configuration, we can use the NDM to predict the atrophy pattern at all future time points. We have previously validated the accuracy of the NDM predictions in multiple contexts (Raj et al., 2012, 2015; Pandya et al., 2017; Raj and Powell, 2018).

where the indexes denote the $i$-th brain region and $j$-th patient respectively; $\mu$ and $\sigma$ denote the mean and standard deviation calculated with respect to the healthy controls. The scores were then normalized by a weighted logistic transform to keep values within the (0,1) range; these normalized vectors will be referred to in what follows as empirically-observed atrophy vectors $b$, against which we run our inference algorithm (Raj et al., 2012; Torok et al., 2018).
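As a rough illustration of the forward model introduced at the start of this section, the sketch below propagates a seed pattern over a toy network. It assumes the commonly used NDM form dx/dt = -βLx with a degree-normalized graph Laplacian L; the function names, the diffusivity `beta`, and the 4-region chain network are hypothetical and serve only as an example.

```python
import numpy as np

def graph_laplacian(C):
    """Degree-normalized Laplacian L = I - D^{-1/2} C D^{-1/2}.

    Assumes C is symmetric with no isolated (zero-degree) regions.
    """
    C = np.asarray(C, dtype=float)
    deg = C.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return np.eye(len(C)) - d_inv_sqrt[:, None] * C * d_inv_sqrt[None, :]

def ndm_forward(C, x0, t, beta=1.0):
    """Evaluate x(t) = exp(-beta * L * t) x0 via eigendecomposition of L."""
    L = graph_laplacian(C)
    evals, evecs = np.linalg.eigh(L)           # valid because L is symmetric
    decay = np.exp(-beta * evals * t)          # per-eigenmode decay factors
    return evecs @ (decay * (evecs.T @ x0))

# Toy 4-region chain network (hypothetical connection weights).
C = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
x0 = np.array([1.0, 0.0, 0.0, 0.0])            # all pathology starts in region 0

x_early = ndm_forward(C, x0, t=0.0)             # equals the seed itself
x_late = ndm_forward(C, x0, t=2.0)              # pathology has spread along the chain
```

Using the eigendecomposition avoids recomputing a matrix exponential for every time point, which is convenient when the model is evaluated over a grid of times.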

The forward NDM can be used to infer the most likely pattern of disease seeding $x_{seed}$ from a given vector $b$. This inverse seed-inference process utilizes a constrained optimization algorithm with an $\ell_1$-penalized cost function to promote sparsity while maximizing the Pearson correlation R between the NDM-predicted and empirically-observed atrophy vectors.

For every subject in our study, we infer an individualized seed vector $x_{seed} \in \mathbb{R}^{78}$ that minimizes the cost function above. Each entry of the seed vector is associated with a specific gray matter region given by the brain atlas, and represents the likelihood that proteinopathic agents are present at that region when t = 0.

respectively. Finally, we will use brackets and subscripts of the form $\langle \text{argument} \rangle_{subgroup}$ to indicate the average of vectors across subjects from a given subgroup (i.e. HC and PD).

average of the inferred seed vectors $\langle x_{seed} \rangle_{PD}$ across all PD patients. We note that $\langle x_{seed} \rangle_{PD}$ differs from the seed vector inferred directly from $\langle b \rangle_{PD}$, since the averaging and seed-inference operations do not commute. In all panels, the load of proteinopathic agents in a brain region is proportional to the diameter of the sphere placed there, although actual values were scaled to improve visualization. Table 2 compares the top-10 brain regions associated with the largest entries of $\langle b \rangle_{PD}$ with those of $\langle x_{seed} \rangle_{PD}$. There is significant overlap between the top-10 regions for both vectors, especially regarding the Putamen, Pallidum, and Red Nucleus.

Torok et al., 2018 presented a novel method to determine a consistent seeding pattern for each patient.

The goal is to select the "elbow" of the curve, i.e., a value that provides a sensible tradeoff between the mismatch of model/data, given by the $-R(x(t, x_{seed}), b)$ term in the cost function, and the sparsity of the seed, given by the $\|x_{seed}\|_1$ term. Figure 2 shows how the mismatch between model/data decreases as we (i) increase the value of the tradeoff parameter, (ii) increase the number of seeds, or (iii) increase the $\ell_1$ norm of the seed.
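To make the two competing terms concrete, one plausible way to write such a cost function is sketched below. This is an illustration under stated assumptions, not the exact objective of the study: the weight $w$, its placement on the sparsity term, and the non-negativity constraint are all assumptions, and the tradeoff parameter used in the paper may be defined differently (for example, as a weight on the correlation term).

```latex
\hat{x}_{\mathrm{seed}}
  \;=\; \arg\min_{x_0 \,\ge\, 0}\;
  \Big[\, -R\big(x(t, x_0),\, b\big) \;+\; w\,\lVert x_0 \rVert_1 \Big],
\qquad w \ge 0 \ \text{(assumed weighting)}
```

Here $x(t, x_0)$ is the NDM-predicted atrophy obtained from a candidate seed $x_0$, $b$ is the empirically-observed atrophy vector, and $R$ is the Pearson correlation. The first term rewards agreement between model and data, while the second term penalizes non-sparse seeds.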
We argue that a tradeoff parameter value of 0.25 can be considered the "elbow" for the black curves; it provides an intermediate R of 0.63 (higher than the minimum of 0.52 and lower than the maximum of 0.74), an intermediate average number of seeds of 6 (higher than 1 and lower than 19), and an intermediate $\ell_1$-norm of 1 (higher than 0.2 and lower than 1.5).

The inferred seed vectors varied significantly from patient to patient. In this subsection, we demonstrate that the inferred seeds are more predictive of the atrophy pattern than a common seed located at a single region, i.e., a seed pattern defined by a vector with only one non-zero entry placed either in the Amygdala or in the Substantia Nigra. While certain brainstem nuclei might be better candidate seed locations of PD pathology (e.g. medulla oblongata), these regions are not accessible on MRI, and therefore we have focused on seed locations corresponding to Braak stage III and higher.
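The comparison against single-region seeds can be sketched as follows. The example assumes that predictive quality is scored by the largest Pearson correlation over a grid of model times (an "R-max" style score) and that the forward model takes the form x(t) = exp(-βLt) x0 with a normalized Laplacian; the helper names, the 5-region toy network, and the time grid are hypothetical.

```python
import numpy as np

def ndm_curve(C, x0, times, beta=1.0):
    """Return x(t) = exp(-beta * L * t) x0 for each t (L: normalized Laplacian)."""
    C = np.asarray(C, dtype=float)
    deg = C.sum(axis=1)
    L = np.eye(len(C)) - C / np.sqrt(np.outer(deg, deg))
    evals, evecs = np.linalg.eigh(L)
    coeffs = evecs.T @ np.asarray(x0, dtype=float)
    return np.array([evecs @ (np.exp(-beta * evals * t) * coeffs) for t in times])

def r_max(C, x0, b, times):
    """Largest Pearson correlation between predicted x(t) and observed atrophy b."""
    return max(np.corrcoef(x, b)[0, 1] for x in ndm_curve(C, x0, times))

def single_seed(n_regions, index):
    """Seed vector with a single non-zero entry at `index`."""
    x0 = np.zeros(n_regions)
    x0[index] = 1.0
    return x0

# Toy 5-region chain with unequal weights (hypothetical values).
C = np.array([[0, 2, 0, 0, 0],
              [2, 0, 1, 0, 0],
              [0, 1, 0, 1, 0],
              [0, 0, 1, 0, 3],
              [0, 0, 0, 3, 0]], dtype=float)
times = np.linspace(0.1, 5.0, 50)

# Synthetic "observed" atrophy generated from a seed in region 0.
b = ndm_curve(C, single_seed(5, 0), [1.0])[0]

r_true = r_max(C, single_seed(5, 0), b, times)   # seed at the true origin
r_other = r_max(C, single_seed(5, 4), b, times)  # seed at the wrong region
```

In this synthetic setting the seed placed at the true origin reaches a higher R-max than the seed placed elsewhere, which is the kind of contrast the histogram comparison in the text relies on.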

The algorithm is as follows:

respectively. The R-max values associated with (i) are typically higher than the ones associated with (ii) and (iii), demonstrating that inferred individualized seed patterns lead to significantly more predictive patterns of the patients' atrophy vectors than a common single-seed vector. This result provides strong evidence against a stereotyped/standard single seed location. We also find (Fig. 3C) that a single seed in the Substantia Nigra is more likely than a single seed in the Amygdala. Of course, if the seed inference algorithm were giving trivial outcomes (e.g. inferred seed pattern = observed atrophy pattern), then we would erroneously obtain similar results to the above. To guard against that possibility, in Fig. 3D we present an R-max histogram comparing the two vectors $\{x_{seed}, b\}$, showing that the seeds are not obvious correlates of the observed atrophy patterns. This is consistent with the complex dynamics of disease spread in PD and suggests that our seed inference is implicating a different set of regions than would be trivially predictable from the most atrophied regions.

In this section we demonstrate that despite the heterogeneity of the incipient atrophy patterns across subjects, the inferred seed vectors $x_{seed}$ can be categorized in two subgroups (S1 and S2). Analogously, the empirically-observed atrophy vectors $b$ can also be categorized in two subgroups (A1 and A2).

description of the atrophy and seed patterns for each subgroup.

Figure 5 shows, on the top of each panel, a glassbrain view of the inferred seed vectors $\langle x_{seed} \rangle_{S1}$ and $\langle x_{seed} \rangle_{S2}$ averaged between subjects in the S1 (cyan) and in the S2 (black) subgroups respectively. In the middle of each panel, we show a glassbrain view of AFS1 (cyan) and AFS2 (black). They represent the average predicted atrophy pattern (via NDM) of the seeds classified in the S1 and S2 subgroups, i.e., $\langle x(t, x_{seed} \in S1) \rangle$ and $\langle x(t, x_{seed} \in S2) \rangle$ respectively. Finally, in the bottom of each panel, we show a glassbrain view of $\langle b \rangle_{A1}$ and $\langle b \rangle_{A2}$, the empirically-observed atrophy vectors averaged between subjects in the A1 (red) and A2 (blue) subgroups respectively.

vectors averaged across subgroups S1 and S2, AFS1 and AFS2 are average projected atrophy for subgroups S1 and S2 derived from the forward NDM, and A1 and A2 are observed atrophy from subgroups A1 and A2. The distinction between the two subgroups is evident in the seed patterns (S1, S2), as opposed to the projected atrophy (AFS1, AFS2) and observed atrophy (A1, A2).

Table 3 compares the top-10 brain regions associated with the largest entries of $\langle x_{seed} \rangle_{S1}$ with those of $\langle x_{seed} \rangle_{S2}$, and Table 4 shows the top-10 brain regions associated with those of $\langle b \rangle_{A1}$ and $\langle b \rangle_{A2}$. Finally, Table 5 shows the regions associated with the largest entries of $\langle x(t, x_{seed} \in S1) \rangle$ and $\langle x(t, x_{seed} \in S2) \rangle$ respectively. These results are based on the hierarchical clustering analysis explained previously, at the two-cluster level for both seed and atrophy data.
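The two-cluster grouping of seed vectors can be illustrated with a minimal sketch using SciPy's agglomerative clustering. The linkage method (`average`), the distance metric (`correlation`), and the synthetic seed vectors are assumptions made for illustration; the text only specifies an agglomerative hierarchical cluster tree cut at two clusters.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
n_regions = 78

# Two synthetic seed archetypes (hypothetical): sparse loads on different regions.
proto_a = np.zeros(n_regions)
proto_a[:5] = 1.0
proto_b = np.zeros(n_regions)
proto_b[40:45] = 1.0

# 10 subjects resembling archetype A and 12 resembling archetype B, plus small noise.
seeds = np.vstack([
    proto_a + 0.05 * rng.random((10, n_regions)),
    proto_b + 0.05 * rng.random((12, n_regions)),
])

# Build the agglomerative tree and cut it at two clusters.
Z = linkage(seeds, method="average", metric="correlation")
labels = fcluster(Z, t=2, criterion="maxclust")

# Subgroup-averaged seed vectors, analogous to the averaged S1 and S2 patterns.
group_means = [seeds[labels == k].mean(axis=0) for k in (1, 2)]
```

The same call pattern could be applied to observed atrophy vectors to obtain groupings analogous to A1 and A2, with the caveat that the choice of linkage and metric can change the resulting partition.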

Caption: The individual patient-dependent seed vectors $x_{seed}$ can be categorized in two subgroups (S1 and S2) via the agglomerative hierarchical cluster tree method. The left column shows the brain regions associated with the largest entries of $\langle x_{seed} \rangle_{S1}$ (in cyan) and the right column shows the brain regions associated with the largest entries of $\langle x_{seed} \rangle_{S2}$ (in black).

predict the future atrophy patterns $x(t, x_{seed})$. AFS1 (cyan) and AFS2 (black) represent the average predicted atrophy pattern of the seeds classified in the S1 and S2 subgroups, i.e., $\langle x(t, x_{seed} \in S1) \rangle$ and $\langle x(t, x_{seed} \in S2) \rangle$ respectively.

and progression of PD patients. It is becoming increasingly evident that many patients cannot be fitted into the canonical progression scheme (Burke et al., 2008; Rietdijk et al., 2017). Since antemortem imaging cannot directly reveal the incipient patterns of pathology, this study aims to develop a principled, model-based method of "looking back" into the disease trajectory based on present imaging data. Many clinical uses of this approach are enumerated below, but the key outcome is to enable assessment of a patient's incipient seeding pattern, and to use this pattern to cluster patients into clinically relevant subtypes. Olfactory impairment, dementia, depression and other neuropsychiatric symptoms appear in preclinical stages of PD, preceding motor manifestations by long periods (Shiba et al., 2000; Schuurman et al., 2002; Leentjens et al., 2003; Ross et al., 2006, 2008). It is therefore plausible that the ability to infer the seeding patterns of a patient can reveal likely sources of symptomatic heterogeneity.

In this paper we leverage a recently developed seed-inference algorithm for Alzheimer's pathology (Torok et al., 2018) and tailor it to the parkinsonian context. Torok et al., 2018 combined a network-diffusion model that successfully recapitulates patterns of regional brain atrophy (Raj et al., 2012) with an L1-penalized optimization routine to infer the likely origins of pathology across individual subjects from the Alzheimer's Disease Neuroimaging Initiative (ADNI) public database. Their results showed that the high degree of variability between patients at baseline translates to even more heterogeneous seed patterns that are significantly more predictive of future atrophy than a single seed placed in the hippocampus. Given that successful graph-theoretic models are also available for Parkinson's Disease (Zhou et al., 2012; Zeighami et al., 2015; Yau et al., 2018), it is sensible to investigate whether similar methods are capable of revealing seeding heterogeneity in PD. In this study we found that the observed heterogeneity in PD atrophy is best explained by heterogeneous seeding patterns. Furthermore, these seeding patterns cluster into two main subgroups which are not obvious from their observed atrophy. We also confirmed that inferred seeding patterns are more predictive of future atrophy patterns than a single seed placed at

Wichmann, 2007). The pallidum is the next region with more atrophy.

Group average inferred seeding patterns implicate common sites of early synuclein pathology and Braak staging

Figure 1 (bottom) shows the average of the inferred seed vectors $\langle x_{seed} \rangle_{PD}$ from their atrophy profiles.

The main regions implicated as the most likely early seeding sites in Table 1 (SN, RN, and striatal areas like putamen and accumbens) are all regions where early PD pathology is observed (Braak et al., 2003; Huot and Parent, 2007; Hanganu et al., 2014; Lewis et al., 2016). They also highlight a prominent role for the red nucleus in the rostral midbrain as a potential source of the disease. It exhibits a higher seed value than regions commonly associated with PD such as the Putamen, Pallidum and Substantia Nigra. This is consistent with the notion that PD does not necessarily begin in the SN (Del Tredici et al., 2002; Braak et al., 2003; Lang and Obeso, 2004; Ahlskog, 2005), but could also be due to the difficulty of disambiguating

As shown in Tables 2 and 3, the inferred group average seed regions are not the same as those with the highest levels of atrophy. This confirms that early sites of PD progression are not always the areas that experience the most atrophy. It also confirms that the proposed algorithm is not trivially capturing observed atrophy, but is in fact imposing additional criteria, involving seeding sparsity and the condition that the ongoing progression occur on the anatomic network.

The group average inferred seeding (Figure 1, Table 1) gives prominence to striatal structures such as putamen and accumbens. Although synuclein deposits in Lewy bodies are not quite as common in the striatum as in the SNpc and the brainstem, these regions are some of the most highly affected in PD and related Lewy pathologies (Huot and Parent, 2007; Hanganu et al., 2014). The fact that our inverse NDM approach was successfully able to recapitulate the key early role of these regions at the group level both

in the neostriatum (Braak et al., 2003), but in a follow up they reported the presence of neostriatal lesion in stage VI of PD (Braak et al., 2006).

4.0 (Jellinger and Attems, 2006). Numerous Lewy inclusions were reported in the putamen and neostriatum in DLB patients (Saito et al., 2003). A strong correlation was found between PD stages and neostriatal inclusions by Mori et al. (2008), who further reported that AS initially accumulates in the neostriatum at stage III. It is also possible that oligomeric or soluble synuclein might be more abundant than

We find that a common incipient state of neurodegeneration cannot explain the intersubject variability observed empirically. As shown in Figure 3, a single-seed vector with only one non-zero entry placed either in the Amygdala or in the Substantia Nigra leads to predicted atrophy patterns that are poorly correlated with the observed data. Thus, our methods demonstrate that the early stages of the disease may exhibit incipient atrophy patterns that are more complex and variable than generally appreciated. This indicates a high level of etiologic heterogeneity in individual subjects, which makes it unlikely for any single brain region to be the source of PD pathology ramification in all subjects.

Our most important finding is that the estimated incipient atrophy patterns exhibit a high degree of variability across subjects, perhaps more than generally appreciated (Del Tredici and Braak, 2016). The hierarchical clustering analysis revealed an interesting subgroup structure in the seeding patterns that was not obvious from the group seeding pattern of

instead, is linked to processing vision (especially related to letters), logical conditions, and encoding visual memories. Our analysis shows that its role in early stages of PD might be more important than generally appreciated.

The hierarchical clustering tree was also applied to the observed atrophy patterns, but the difference between the two major subgroups (A1 and A2) was not as distinct as the difference between the subgroups for seeds. We conjecture that the patients in S1 (cyan) should have worse motor symptoms due to the seeding in Substantia Nigra, Pallidum, Subthalamic Nucleus and Thalamus while patients in S2 would have difficulty in attention due to seeding involvement in lingual gyrus and frontal cortex.

In summary, we report a clear distinction in seeding patterns between the two subgroups S1 and S2: one characterized by predominant involvement of brainstem and ventral nuclei, and the other by more widespread frontal and striatal cortices. See Tables 3-5. This might be indicative of two distinct etiological mechanisms operative in PD. To our knowledge, this is the first time such a clear brain-related sub-structure has been shown in the PD cohort. Importantly, this distinction is not apparent in the individual subjects' atrophy patterns, which do not give a clear separation between these subgroups (labeled AFS1 and AFS2 in Figure 6), nor from a separate clustering directly from atrophy (labeled A1 and A2 in Figure 6).

The regions with the highest seed values in Table 2, largely located within the midbrain, show strong agreement with accepted structures that are affected earlier rather than later in AS pathology in PD (Braak et al., 2003). While our seeds do not contain the more caudal structures implicated in the earliest stages of the Braak scheme, there is disagreement about the extent to which detectable α-synucleinopathy in those regions is necessary or sufficient for pre-determining rostral manifestation of PD (Burke et al., 2008). In any case, since significant atrophy changes are known to occur in the amygdala (Harding et al., 2002) and in substructures of the basal ganglia such as the substantia nigra (Davie, 2008), we can determine whether our individual inferred seeds are more predictive of future atrophy patterns than a single seed placed at these locations. We demonstrate that the NDM predicts patient atrophy patterns using individual seeds better than a consensus seed placed either at the SN or the amygdala (Figure 3), similar to the results in Torok et al., 2018 for AD; in that study, it was also demonstrated that using a consensus seed containing 5 or 10 regions also resulted in poor predictions, indicating that this effect is independent of seed sparsity.

In contrast to that study, however, we did detect two distinct subpopulations of patients with our seeds, one of which indicated strong midbrain involvement and the other more cortical area involvement (Figure 5, Table 3). The latter subpopulation is surprising in light of the generally accepted notion that the cortex harbors AS pathology only late in disease, but we note that the resultant predictions of atrophy under the NDM from the seeds of these two divergent populations are remarkably similar (R = 0.55 for AFS1 vs AFS2 in contrast to R = -0.12 for S1 vs S2). The complicated relationship between atrophy, AS pathology, and other clinical markers of PD, which limits the interpretation of these two seed subpopulations, is

While there seems to be less overlap between the extreme quartiles in the bottom plots, there are no distinct clusters in any of them, which shows that the high variability of atrophy patterns across subjects also translates into heterogeneous seed patterns. Figure 7 shows analogous plots for data matrices restricted to PD patients alone (excluding controls). As a consequence, it remains a challenge to relate discrepancies in regional brain volumes to cognitive dysfunctions, in particular to all the different domains affected during PD such as attention and concentration, executive functions, memory, language, visuo-constructional skills, conceptual thinking, calculations, and orientation.