Deep sequencing of proteotoxicity modifier genes uncovers a Presenilin-2/beta-amyloid-actin genetic risk module shared among alpha-synucleinopathies

Whether neurodegenerative diseases linked to misfolding of the same protein share genetic risk drivers or whether different protein-aggregation pathologies in neurodegeneration are mechanistically related remains uncertain. Conventional genetic analyses are underpowered to address these questions. Through careful selection of patients based on protein aggregation phenotype (rather than clinical diagnosis) we can increase statistical power to detect associated variants in a targeted set of genes that modify proteotoxicities. Genetic modifiers of alpha-synuclein (ɑS) and beta-amyloid (Aβ) cytotoxicity in yeast are enriched in risk factors for Parkinson’s disease (PD) and Alzheimer’s disease (AD), respectively. Here, along with known AD/PD risk genes, we deeply sequenced exomes of 430 ɑS/Aβ modifier genes in patients across alpha-synucleinopathies (PD, Lewy body dementia and multiple system atrophy). Beyond known PD genes GBA1 and LRRK2, rare variants AD genes (CD33, CR1 and PSEN2) and Aβ toxicity modifiers involved in RhoA/actin cytoskeleton regulation (ARGHEF1, ARHGEF28, MICAL3, PASK, PKN2, PSEN2) were shared risk factors across synucleinopathies. Actin pathology occurred in iPSC synucleinopathy models and RhoA downregulation exacerbated ɑS pathology. Even in sporadic PD, the expression of these genes was altered across CNS cell types. Genome-wide CRISPR screens revealed the essentiality of PSEN2 in both human cortical and dopaminergic neurons, and PSEN2 mutation carriers exhibited diffuse brainstem and cortical synucleinopathy independent of AD pathology. PSEN2 contributes to a common-risk signal in PD GWAS and regulates ɑS expression in neurons. Our results identify convergent mechanisms across synucleinopathies, some shared with AD.


INTRODUCTION
Proteinopathies are age-related diseases in which specific proteins aggregate and form inclusions in distinct cell types.Different neurodegenerative diseases can be associated with aggregation of the same protein.For example, ɑ-synuclein (ɑS)-rich inclusions, including Lewy bodies (LBs) and glial cytoplasmic inclusions (GCIs), are the hallmark pathology of synucleinopathies, including Parkinson's disease (PD), dementia with Lewy bodies (DLB) and multiple system atrophy (MSA) 1 .Equally, patients not infrequently have mixed pathologies 2,3 .For example, more than 50% of Alzheimer's disease cases have the classic Lewy bodies 4,5 , and the hallmark β-amyloid (Aβ) and tau pathologies of AD are common in DLB and Parkinson's disease dementia (PDD), but also found in a significant proportion of PD cases (>30%) 3,[6][7][8] .These findings raise two fundamentally important questions we sought to shed light on in the current investigation: first, to what extent do synucleinopathies share a common underlying biological basis, and second, are there share risk factors among different proteinopathies?
Rare variants in the ɑ-synuclein-encoding gene SNCA suggest a shared basis across synucleinopathies.For example, in familial synucleinopathies, individual rare variants lead to a spectrum of possible disease outcomes, from PD to PDD to DLB, within the same family 9 .Two Mendelian synucleinopathiescaused by either the SNCA G51>D point mutation or by triplication at the SNCA locus-are associated with diffuse ɑS inclusions characteristic of PD and MSA 10,11 .Additional genetic evidence is emerging to support this.For example, patients harboring mutations in the glucocerebrosidase-encoding gene GBA1 are at elevated risk of clinical PD, PDD, or DLB 12,13 , and recent publications suggest an association of GBA1 variants also with MSA.At a cellular level, distinct synucleinopathies could result from the differential vulnerability of distinct brain regions and circuits to ɑS pathology 14 .At a biophysical level, they have been associated with distinct pathologic amyloid conformers, or "strains" [15][16][17][18][19] .It is thus plausible that different clinical presentations, even in patients with identical primary mutations (for example, in GBA1 or SNCA), could stem from other environmental factors or genetic background modifiers that lead to either distinct ɑS conformational states or glio-neuronal vulnerability patterns (or both).
The presence of variable pathologies in most patients with neurodegenerative diseases may reflect the coexistence of different disease processes in the same patient or potentially a shared mechanistic link among different proteinopathies.Moreover, a potential shared link may depend on the existence of overt proteinaceous aggregates.Genetic, transcriptomic, and proteomic interrogations of tractable cellular and organismal models to answer this question have yielded conflicting results 20 .However, evidence for the overlapping genetic basis of neurodegenerative and neuropsychiatric diseases, including synucleinopathies is emerging from multiple recent human genetic studies [21][22][23] .For example, coding variants at the tau-encoding MAPT locus lead to frontotemporal dementia and parkinsonism (a tauopathy) 24 .Still, single nucleotide polymorphisms (SNPs) in linkage with MAPT (and specifically the tau H1 haplotype) are among the best-validated risk factors for PD with growing evidence in AD also 25 .The PD association holds in pathologically confirmed synucleinopathy cases and does not seemingly hinge on overt tau aggregation 26 .In other examples, the APOE, TMEM175 and BIN1 risk factors suggest crossover risk between AD and DLB 13,21,27 , and the PD-associated LRRK2 variants can either associate with a synucleinopathy or a pure tauopathy with the accumulation of AD-type tau 28 .Pathologic forms of ɑS, Aβ, and tau synergistically interact with one another, enhancing aggregation and neurotoxicity and suggesting the presence of a shared mechanistic link among proteinopathies 29,30 .Biophysical data suggesting that aggregation-prone proteins can cross-fibrillize -for example, Aβ with ɑS 31 and ɑS with tau 32 -indicates a possible mechanism to explain the coexistence of proteinopathies.How closely these in vitro studies mimic the physiologic interactions in vivo remains unclear, and our understanding of the shared genetic basis of different synucleinopathies and across proteinopathies remains far from complete.
The field has employed two distinct but complementary genetic approaches to understand the complex mechanisms driving neurodegenerative diseases: human genetic analysis and genetic analysis of tractable model systems.The most comprehensive modifier screens to date in model organisms have been performed in yeast in which the expression of ɑS and Aβ in yeast leads to robust cytotoxicity 33 .Genes encompassing ~85% of the yeast proteome have been individually co-expressed with each of these toxicity proteins to identify genetic modifiers of ɑS 34 and Aβ 35 toxicity, respectively.The system offers a convenient way to isolate the cytotoxicity of ɑS from Aβ because these proteins do not natively exist in this cell.Strikingly, despite this absence, the system is disease-relevant because the Aβ screen 34,36,37 statistically enriched for genes implicated in AD (CD2AP, PICALM, INPP5D, RIN3) and the ɑS screen for PD/parkinsonism-implicated genes 35,36 (ATP13A2, CHCHD2, SYNJ1, VPS35 35,37 ).Interestingly, in these acute cytotoxicity models, there was little meaningful overlap between ɑS and Aβ screen hits.Notable exception was a modification of both ɑS and Aβ proteotoxicity by paralogs, ROM1 and ROM2 38 guanidine exchange factors of the yeast Rho GTPase Rho1p involved in actin cytoskeleton regulation 36 .
In human genetics, the standard method has been genome-wide association studies (GWAS).These are best powered for common variants (with MAF >1%).However, such variants typically have small effect sizes and are located in non-coding genomic regions, thus making it challenging to map them back to specific genes.To identify rare variants, very large cohorts comprising of whole-genome or whole-exome sequence data are required for late-onset complex diseases.Recent investigations have attempted to narrow the search space for rare variants to circumvent this.For example, variants have been restricted to particular classes of variants 39 , confined to restricted ontologies emerging from common-variant GWAS 40 , and anchored to specific cellular processes 41 or organelles.One such study 42 targeted exome sequencing of lysosomal genes demonstrated the importance of this organelle to PD, a finding subsequently verified more definitively in larger meta-GWAS analysis 43 .
To examine whether the ɑS and Aβ proteotoxicity modifiers and gene networks can yield important insights into shared risk across synucleinopathies and between PD and AD, we performed targeted exome screening of human orthologs of these modifier genes as well as additional known AD, PD, and neurodegeneration-related genes.We targeted a relatively small but deeply characterized cohort of 496 patients with different synucleinopathies (PD, DLB, and MSA).We performed carefully controlled joint calling with ~2,500 aged neurotypical controls from the Medical Genome Reference Bank (MGRB) 44 .We selected a wide range of orthologs of the yeast genes to capture broad expression across tissues and CNS cells because we hypothesized that distinct cell types would be involved in different synucleinopathies.Our goal was to reduce the sequencing sampling space to a set of genes known a priori to be enriched with known PD and AD Mendelian genetic risk factors to enable us to find meaningful signals even in a modestly sized cohort.Critically, we matched proteinopathy screening in yeast to a molecular, rather than clinical, phenotype in patients.Put another way, we matched synucleinopathy in humans, regardless of clinical diagnosis, to screens specifically against proteinopathies in yeast.
Reassuringly, our cross-species screen identified significant enrichment of rare nonsynonymous variants in known PD genes GBA1 and LRRK2.However, three AD genes (CD33, CR1, and PSEN2) were also implicated, building on emerging genetic evidence for shared risk across AD and PD.Our statistically significant genes crossed over synucleinpathy boundaries and implicated shared genetic risk among PD, DLB and MSA.Most strikingly, when we validated our findings in orthogonal (UK biobank and AMP-PD) cohorts, our statistical signal was stronger for Aβ toxicity modifiers than ɑS.These modifiers centered on genes that, together with PSEN2, have been strongly implicated in the regulation of actin cytoskeleton (ARGHEF1, ARHGEF28, PKN2, PRKCD, PASK), genes that collectively we refer to as the PSEN2/Aβ-actin gene module.Three of these genes were positive regulators or effectors of RhoA, a central actin cytoskeleton signaling factor.Patients with mutations in these genes have diffuse synucleinopathy with brainstem and cortical pathology and in human neurons, simply downregulating RhoA can deplete stabilized Factin and enhance ɑS pathology.Among this gene module, PSEN2 itself emerged as the clearest risk factor, essential for cortical and dopaminergic (DA) neuron survival, downregulated in cortical and dopaminergic neurons with contributions to aggregated PD GWAS risk and, finally, tied to regulation of ɑS expression itself.

RESULTS
To investigate potentially shared genetic risk and penetrance factors for the synucleinopathies of interest, we combined functional genomics with human genetics.We leveraged the largely nonoverlapping modifiers from genome-wide over-expression screens of ɑS 35 and Aβ 34 proteotoxicity in yeast cells to probe the proteotoxicities of interest, rather than be limited to clinical diagnosis.Since yeast cells lack clear homologs of Aβ and ɑS, we sequenced a wide range of human orthologs for these genes, including the known Mendelian genetic risk factors for AD and PD among them 36 .This set was further enhanced by including known AD, PD, and ataxia risk genes.We performed high-depth sequencing (~100x) of a total of 430 proteotoxicity modifier genes in a human genetic screen of 496 patients with synucleinopathies (PD, DLB and MSA) (Figure 1A).We jointly-called these patients with 2,516 aged (>70 years old) unrelated individuals with no history of cancer, cardiovascular disease, or dementia from the Medical Genome Reference Bank (MGRB) 44 by extracting the same target regions from their whole genomes (Figure 1A, Methods).Samples from HapMap with northern and western European ancestry (CEU) samples were also included for quality control purposes.We hypothesized that reducing the sequencing sampling space to a set of genes known a priori to be enriched with known PD and AD Mendelian genetic risk factors would enable us to find meaningful signals even in a modestly sized cohort.

Increased rare variant burden in both PD and AD genes and in both Aβ and ɑS toxicity modifiers across synucleinopathies
To assess the contribution of the Aβ and ɑS genetic networks to the risk of disease in synucleinopathies and identify shared genetic drivers across different synucleinopathies, we performed the one-sided Fisher exact test on each biallelic variant with a minor allele frequency (MAF) cut-off of 0.01 both at single-variant and gene levels on the European cases vs controls (Ncase = 496 and Ncontrol = 2,516).Singleton variants i.e., variants that appeared exactly once in both PD cases and MGRB controls were excluded because of the significant deviation observed in their distribution between cases and controls (see Methods).

Genes with significant rare variant burden cross clinicopathologic disease boundaries
Using 151 of our cases that had additional neuropathologic data, we asked whether the significant variants we identified, crossed clinical disease boundaries.In pathologically confirmed cases, these significant variants indeed crossed different synucleinopathies (Figure 1D).Additionally, among the 11 patients with top variants in either AD genes (PSEN2) or in the Aβ-modifier genes (TEAD2, ARHGEF1, ARHGEF28), only three had concomitant AD pathology (Figure 1D, Supplementary Figure 1; see also Figure 4).

Rare variant trend test validates significant both ɑS and Aβ gene modules in UK Biobank and AMP-PD
We validated our variant-and gene-level significant results in two independent PD datasets: UK Biobank (UKBB) ~500K whole exomes and Accelerating Medicines Partnership for PD (AMP-PD) whole genomes.Large-scale independent studies of these datasets 45,46 found no significant genes at single gene level except GBA1 and LRRK2 (Supplementary Table 1).Thus, we performed the pathway-based rare variant trend test (RVTT) 51,52 with the genes identified by variant and gene-level tests in our original cohort.RVTT selects qualifying rare variants using a variable minor allele frequency (MAF) threshold approach and uses the Cochran-Armitage test statistic to measure a group's increased burden of variants.RVTT provides permutation-based pvalues as an indicator of the strength of association.
In the UKBB dataset, we compared 2,273 unrelated European PD cases (ICD10 code: G20) of age 40 or older with 6,711 controls randomly sampled from 167,188 unrelated European neurotypical (no ICD10 code between G01-G99) individuals of age 60 or older (Methods).From the AMP-PD dataset, we compared 1,598 unrelated European sporadic PD cases with 1,095 agematched neurotypical controls.HBS samples were excluded from the analysis due to potential overlap with our original sequencing cohort (Methods).
We applied RVTT on each dataset individually and meta-analyzed the p-values with the Cauchy combination test (CCT) 53 .We observed a significant enrichment of rare missense variants (MAF < 0.01) in the significant genes identified at both variant-and gene-levels in the combined analysis (Figure 1E).To assess the contribution of significant genes, other than LRRK2 and GBA1, to disease risk, we also ran RVTT on two subsets of significant genes (SKAT-O): (i) AD genes and Aβ-modifiers and (ii) PD genes and ɑS-only modifiers.Both were significant at CCT p-value cutoff of 0.1.Surprisingly, the set of known AD genes and Aβ-modifiers showed a stronger association to PD risk compared to the known PD genes and ɑS-only modifiers (Figure 1E).Detailed results from both cohorts are available in Supplementary Table 3.No significant trend was observed in synonymous variants between cases and controls in the gene sets under question.This provides evidence for the association of our significant AD-related and Aβ-modifying genes with the risk of PD.Notably, six of these genes, namely ARHGEF1, ARHGEF28, PKN2, PASK, PRKCD and PSEN2, have been tied to the regulation of actin cytoskeleton (Figure 2).

Common regulatory changes near significant Aβ/AD genes
Recently, there is growing evidence that, even for rare-variants, impact on human disease ultimately leads to altered gene expression 54,55 .However, with existing sample sizes and statistical tools, we are not powered to detect the regulatory roles of rare variants directly.Common regulatory variations such as expression quantitative trait loci (eQTLs) and splice quantitative trait loci (sQTLs) can provide a way to connect genes enriched in rare variants with expression changes in disease-relevant tissues and cell types.Hence, we asked whether the genes recovered in our analysis were enriched for such common regulatory variations.We focused on known AD genes and Aβ toxicity modifiers because this was where we saw the most convincing signal in our UKBB/AMP-PD analysis (Figure 1E).We interrogated the GTEx Brain and PsychENCODE datasets and identified a significant enrichment (p=5.18x10 - , Supplementary Figure 3A), indicating a propensity of tissue-relevant genetic regulation control of these significant genes from our rare-variant analysis.Furthermore, we observed many of these genes have significant cell-type-specific cis-eQTLs in different brain cell types (Supplementary Figure 3B, Suppl.Table 4) in two publicly available datasets of single-cell eQTLs from prefrontal cortex 56 and mid-brain 57 .In the mid-brain dataset, filtering for PD GWAS variants overlapping with single cell eQTLs near our known AD genes and Aβ toxicity modifiers, we detected significant enrichment with loci at different PD GWAS p-value thresholds < 1x10 -5 (Supplementary Figure 3C) in dopaminergic neurons (DAs), suggesting a relationship between PD risk modifying regulatory variants acting in the DA neuron context through the AD and Aβ toxicity-modifier genes.Notably, this enrichment was primarily driven by PSEN2 which also had multiple significant cis-eQTLs in excitatory glutamatergic neurons (Supplementary Table 4).

Shared synucleinopathy risk factors comprise a PSEN2/Aβ-Actin regulator module that includes RhoA regulators and effectors
Our initial screen and RVTT validation, together with eQTL analysis, pointed to AD and Aβ modifier genes as being the strongest contributor to synucleinopathy risk among our targeted genes.This was an altogether unexpected finding.When we perused the list more closely, we were struck by one common thread trying most of these genes together, namely a link to regulation of the actin cytoskeleton (see Figure 2).Dysregulation of the cytoskeleton has been linked repeatedly to neurodegeneration and specifically to synucleinopathy in in vitro, cellular, and in vivo models.Most intriguing to us, three of the genes we identified encoded proteins Arhgef1/28 and Pkn2 that were regulators and effectors of RhoA signaling, respectively.Psen2 was selected to analyze along with these genes because it is a known AD risk factor, but because it also has been shown to directly interact with the filamin class of actin-binding proteins.We thus established a human iPSC-based synucleinopathy model to further investigate these relationships.
A tractable synucleinopathy hiPSC model is established with transgenic SNCA expression iPSC modeling for neurodegenerative proteinopathies suffers from poor reproducibility and tractability, and low levels of endogenous αS.Recently, we 58 and others 59,60 have begun to establish a suite of tractable models to study different aspects of neurodegenerative disease biology.We extended this development by engineering the WTC11 iPSC line that was selected for engineering by the Allen Institute for Cell Science (https://www.allencell.org/cell-catalog.html).It is known that simply inheriting extra copy number variants (CNVs) of wild-type αS through genemultiplication of the SNCA locus can lead to PD, PDD or DLB.These patients even have MSAlike glial cytoplasmic pathologies 61 .To create a better surrogate for this pathobiology, we established Cortical induced synucleinopathy (CiS) models (Figure 3A).Cortical glutamatergic neurons are straightforward to generate through one-step trans-differentiation with Ngn2 and exhibit αS pathology across synucleinopathies, thus providing a good substrate for rapid modeling of synucleinopathy "in the dish".
WTC11 hiPSCs with doxycycline-inducible-NGN2 knocked-in at the AAVS1 safe-harbor locus were transduced with a lentiviral SNCA-IRES-mCherry transgenic construct or with mCherry alone serving as a control.hiPSCs were flow cytometry-sorted utilizing the mCherry fluorescence and replated at low density for single-cell clonal selection.iPSC clones were expanded, karyotyped, and examined for αS expression.Two iPSC clones in SNCA-overexpression group, two clones in control group were selected for further investigation and named SNCA-high, SNCAintermediate, SNCA-endogenous1, SNCA-endogenous2, accordingly.
We compared αS expression levels in DIV28 neurons with human brain samples and with 4-copy SNCA neurons from a patient harboring SNCA triplication and a 2-copy isogenic mutationcorrected control.SNCA-intermediate neurons were comparable in expression levels to human cortex samples and 2-copy SNCA neurons.Notably, however, normalization here is to Gapdh.We previously showed that this underestimates true neuronal αS content 58 by several-fold in the human brain (where there are co-resident glia).SNCA-high neurons have approximately 2-fold higher protein levels than 4-copy SNCA neurons, thus likely a closer surrogate for true neuronal expression levels in human brain (Figure 3B).qPCR in DIV28 neurons confirmed SNCA expression at the mRNA level in the CiS models (Figure 3C).
To investigate cytotoxicity phenotype in CiS neurons, we performed longitudinal imaging (Incucyte) with a cytotox assay.Significant cell death was observed in SNCA-high neurons starting from DIV14 (Figure 3D).At DIV28, significantly increased Caspase+ cells were observed in SNCA overexpression neurons (Supplementary Figure 3A).We next investigated whether altered αS levels and resultant toxicity at DIV28 in the CiS models was accompanied by altered expression of the PSEN2/Aβ-actin gene module described in Figure 2. qPCR expression-analysis of key members of this module in our CiS models at DIV28 revealed that altering αS levels did significantly alter expression of these genes (Figure 3E).These data suggested that alteration of the actin cytoskeleton may be accompanying toxicity in these models.
We previously showed that αS closely interacts with Spectrin 36,37 and can induce profound and consequential alterations in the actin cytoskeleton in Drosophila 62 .This earlier work demonstrated altered stabilization of actin into F-actin.Phalloidin is a dye that can recognize actin stabilization.We thus performed phalloidin immunostaining in CiS neurons.We observed no phalloidin intensity difference among DIV14 CiS neurons, but by DIV28 we observed that phalloidin intensity trended significantly lower in SNCA-high neurons and phalloidin aggregates were noted to form (Figure 3F-G).Thus, stabilized actin is reduced overall as αS becomes toxic.
Since our significant top-hit genes were either positive regulators or positive effectors of RhoA signaling (Figure 2A), we investigated the consequences of RhoA reduction in neurons as a proxy for the damaging/missense mutations identified in our study.We previously established a lenti-shRNA construct targeting RhoA and confirmed knockdown in both HEK cells and neurons 58 .We knocked down RhoA in DIV0 CiS neurons.By DIV 14, RhoA downregulation significantly decreased phalloidin in all CiS neurons (Figure 3H, Supplementary Figure 3B), reminiscent of the SNCA phenotype in DIV28 SNCA-high neurons.A standard marker of pathologic αS is phosphorylation at Ser129 (pS129), the accumulation of which occurs across all synucleinopathies.Even at day 14, where there is no overt toxicity in the DIV28 SNCA-high neurons, these neurons had elevated levels of this marker compared to SNCA-intermediate or SNCA-endogenous lines.After RhoA downregulation, αS-pS129 selectively increased in the SNCA-high neurons but not SNCA-intermediate or SNCA-endogenous lines (Figure 3I, Supplementary Figure 3C).Thus, in cells primed for αS-induced cell death, RhoA downregulation further exacerbates synucleinopathy.

Diffused synucleinopathy in PSEN2/ARHGEF1/ARHGEF28 mutation carriers regardless of the extent of concomitant AD pathology
The advantage of our cohort is that a subset had matched postmortem brain and highly characterized neuropathology.Since RhoA inhibition induced pathologic αS induction, we next probed more deeply into the localization of this type of pathology in carriers of mutations in PSEN2/Aβ-actin genes for whom we had postmortem brain samples.Among these six brains, we noted that all but one had diffuse brainstem and cortical synucleinopathy.One (ARHGEF1 mutation carrier) had brainstem-predominant disease but we note he was an outlier among these cases with early age of onset (46yo) (Figure 4A, Figure 1D and Supplementary Figure 1).For example, the clinical phenomenology may have been attributable to some secondary pathology, especially concomitant AD.This was particularly pertinent to investigate for PSEN2 mutationcarriers because this is a known AD gene.
Neuropathologic criteria for AD are set with three "scores" corresponding to Amyloid/Thal, Braak/tau and CERAD/neuritic plaque quantitation.Based on these scores, cases are designated as having no, low, intermediate, or high level of Alzheimer's disease neuropathologic change (ADNC).When we examined the brains matched to our genes of interest, we found that PSEN2, ARHGEF1 and ARHGEF28 variant carriers could exhibited diffuse αS pathology completely independent of AD co-pathology (Figure 4A; Figure 1D and Supplementary Figure 1).As a case in point, we present the brain histopathology of a patient who harbored mutations in both PSEN2 and ARHGEF1 (Figure 4B).These findings are far from definitive, given the small numbers, but are consistent with this module of genes being important across different neuronal subtypes, including cortical and DA neurons, and contributing to diffuse brainstem and cortical synucleinopathy.

Dysregulated expression of PSEN2/Aβ-actin gene module in cortical and DA neurons
Our neuropathologic data suggested there could be diffuse dysregulation of PSEN2/Aβ-actin gene module across different cell types in the brain.Moreover, there was enrichment of DA neuron-specific eQTLs near PSEN2 (Supplementary Table 4) and a suggestive link with PD GWAS (Supplementary Figure 3C).This raised the possibility that even in sporadic PD/LBD cases there could be dysregulation of this gene module.To address this, we first confirmed that there are indeed cell-type specific cis-eQTLs near our PSEN2/Aβ-actin module member-genes in different neuronal subtypes (Figure 5A).
Next, we performed transcriptome-wide single-cell differential expression analysis using the MAST framework on the major CNS cell types using a publicly available single nuclei RNA-seq case-control dataset 63 of ~450k cells from ten PD and DLB patients and eight healthy controls (Methods).For comparison, we also looked at the expression changes of 26 genes that harbor either familial or common variants known to confer increased risk of PD 64 as suggested by the original study 63 .Like known PD genes, our SKAT-O significant genes showed a diverse profile of transcriptome-wide significant (BH-corrected p-value < 0.1) expression changes across different cell types in PD brains compared to controls (Supplementary Figure 5A).Among our top hits, the known PD gene GBA1 was significantly upregulated in dopaminergic (DA) and glutamatergic (non-DA excitatory) neurons.In contrast, the known AD gene CR1 was downregulated considerably in microglia.
With specific reference to our PSEN2/Aβ-actin module (Figure 5B, right), PSEN2, MICAL3, and PRKCD, showed transcriptome-wide significant expression changes in DA neurons and two were significant in excitatory neurons in PD brains compared to controls (Fig 4A).Only one gene in this group exhibited transcriptome-wide reduction in both excitatory glutamatergic and DA neurons, and that was PSEN2.Most strikingly, PSEN2 was significantly downregulated in eight out of the ten DA neuron subtypes, including CALB1_GEM and CALB1_TRHR subtypes that are relatively protected in PD (Figure 5C, right).This pattern was highly reminiscent of other PD genes (Figure 5C, left).Transcriptional changes of all SKAT-O significant genes are available in Supplementary Figure 5B.

Genome-scale CRISPR screens reveal the essentiality of PSEN2 in both cortical glutamatergic and DA neurons
The dysregulation of genes among our top hits and the PSEN2/Aβ-actin gene module in both excitatory and DA neurons (Figure 5) raised the possibility that this set of genes is important for the viability and function of these neurons.In the case of PSEN2, there was strong downregulation across multiple cell types (Figures 5B and 5C).We thus addressed what the consequence of this downregulation might be.Recently, the emergence of CRISPR/Cas9 technology has enabled genome-scale forward genetic screens, even in neurons and glia 65,66 .We utilized this technology to ask in an unbiased fashion: on which genes does survival of cortical and dopaminergic neurons depend?
We thus performed essentiality screens in human ESC-derived neurons.Briefly, gRNAs representing 19,993 genes were transduced into H9 hESC lines that had been knocked in for a doxycycline-inducible Cas9 at the AAVS1 safe-harbor locus.At an early timepoint (DIV20 for cortical neurons and in the neural progenitor stage for DA neurons) Cas9 was induced, and a gRNA representation was obtained.Cells grown in parallel were aged to DIV65 (cortical neurons) or DIV42 (DA neurons) and gRNA representation again assessed to look for dropout.If gRNAs representing specific genes consistently dropped out, that gene was considered essential for the neuronal subtype (Figure 6A).Top-hit genes from our targeted exome screen were in fact enriched for dropouts in both cortical (hypergeometric test; p=3.9x10 -4 ) and DA neurons (hypergeometric test; p=0.039).Notably, PSEN2 proved essential for survival in both neuronal populations.Thus, its downregulation in the context of synucleinopathy (as in Figures 5B and 5C) is likely to impact viability in these subclasses of neurons.

PSEN2 is involved in SNCA gene-expression
Our neuropathologic studies suggested synucleinopathy was occurring in PSEN2 carriers independent of overt AD pathology (Figure 4).Because PSEN2 was downregulated in synucleinopathy brains (Figures 5B and 5C), we explored whether there was any consequence of this on SNCA expression.We downregulated PSEN2 in SNCA-endogenous neurons with shRNA.SNCA expression was significantly increased (Figure 6B), suggesting PSEN2 may play an important role in regulating SNCA expression.Thus, one mechanism through which PSEN2 mutation or depletion may relate to synucleinopathy is through altered regulation of SNCA expression.This may be particularly detrimental in cortical and DA neurons, in which PSEN2 is both downregulated (Figures 5B and 5C) and essential for survival (Figure 6A).

DISCUSSION
Clinical labels such as "Parkinson's disease" or "Lewy body dementia" belie the highly heterogeneous nature of neurodegenerative diseases and the presence of abundant mixed pathologies in the brains of patients.Adopting a more molecular view of neurodegenerative disease diagnosis is likely an important first step to more targeted disease-modifying therapies for patients.In this study, rather than define patients in terms of clinical diagnoses, we anchored our genetic analysis on the presence or absence of a specific proteinopathy, in this case studying approximately 500 patients with alpha-synucleinopathies.We were motivated by recent studies suggesting that common genetic risk factors like GBA1 exist among synucleinopathies 12,13 , from PD to DLB to MSA, and are also shared among diseases defined by different proteinaceous neuropathologies [21][22][23] .For example, APOE, BIN1 or TMEM175 may be shared risk factors among AD, PD and DLB.It thus seemed reasonable and important to conduct a genetic study directed at the presence of a proteinopathy as opposed to a single clinical diagnosis.
Human genetic investigation of neurodegeneration is limited by statistical power.For example, in recent years, adding many thousands of genomes to PD analyses has added marginal novel insights into their genetic basis.Moreover, most of the studies have investigated common genetic variation, and the recovered risk factors are predicted to have low effect.Polygenic risk analyses suggest most heritability for these diseases yet remains undefined despite decades of such investigations.Part of this missing heritability could arise from rare coding variants in the genome.But the statistical challenge for rare-variant studies is even starker, with current estimates suggesting millions of genomes will be required to adequately powered studies 67 .In this investigation, we thus focused on enhancing statistical power of our rare-variant analysis by decreasing the genomic search space on biological grounds.Simply put, the idea was to yolk unbiased forward genetic screens in model organisms with targeted human genetic screens.Saturated forward genetic screens for yeast Aβ and ɑS proteotoxicity modifiers had recovered known AD and PD risk factors at much higher rates than expected by chance.We thus hypothesized that these Aβ and ɑS cytotoxicity-modifier genes may be enriched in genetic factors of relevance to human disease.Acutely aware of the evolutionary distance between yeast and human, let alone between a unicellular organism and the human brain, we screened at high (100x) depth the exomes of patients for a wide array of homologs of each yeast hit.The idea was to survey the maximum diversity across patients, diseases, and the CNS cellular landscape.
Among the resultant significant hits at gene-and variant-level, we recovered LRRK2 and GBA1 as the strongest hits.This finding serves as a validating positive control because of the prior strong association between these genes and synucleinopathy risk.Second, all our hits were shared between PD and DLB diagnoses and, despite only 24 of our 496 sequenced cases having MSA, three of our top hits at the variant level (GBA1, PTPRS and TEAD2) occurred in pathologically confirmed MSA cases.Thus, our study uncovered evidence of shared risk across synucleinopathies.Moreover, disease-associated LRRK2 and GBA1 variants colocalized with other rare variants that reached screen-wide significance of their own accord across our cohort, raising the possibility of cumulative effects.There were sufficient GBA1 N409S mutation carriers among both cases and controls to recover several novel genes (PTPRB, MICAL3, FREM2, ATP2A3, SH3RF3, PRKCD) in which variants were selectively enriched in cases, suggesting these genes may be candidates for altering phenotypic penetrance of the GBA1 mutation.Significant hits at both variant-and gene-level originated in both Aβ and ɑS proteotoxicity networks and surprisingly included three known AD genes CD33, CR1 and PSEN2.Even more surprisingly, top-hits from the AD/Aβ network validated in both AMP-PD and UK biobank more convincingly than for the PD/ɑS network and showed suggestive (albeit not definitive) eQTL enrichment signals among PD GWAS hits (at 1x10 -7 significance).These data tie our genes not only to rare causal variants but to more common sporadic forms of PD also.More convincingly, our top-hit genes exhibited strong changes in expression across multiple cell types in single-cell sequencing analyses of sporadic PD/synucleinopathy brains, just as is seen with known PD genes.
Where do the Aβ and ɑS proteinopathies converge?An interesting early clue came in our prior yeast studies.While there were negligible common genetic factors among an initial genome-wide modifier screens in yeast, a subsequent pooled screen revealed a convergence at the level of ROM1/2, the guanidine exchange factors (GEFs) for the Rho GTPase Rho1 36 .In humans, these GEF proteins include Arhgef1 and Arhgef28.The most canonical function of these proteins is in the regulation of actin cytoskeletal dynamics, suggesting this could be a common convergent point.Indeed, actin cytoskeletal regulation emerged as a shared function of many of our top-hit genes (ARHGEF1, ARHGEF28, PKN2, MICAL3, PRKCD).Consistent with this, considerable emerging evidence point to abnormal stabilization of F-actin accompanying Aβtau-and ɑSinduced proteotoxicity and leading to altered mitochondrial dysfunction 62 .Recently, we showed a cellular aspect of synucleinopathy that could render patients particularly vulnerable to deficits in this pathway.We found that RhoA is sequestered into highly toxic classes of ɑS inclusions and that postmitotic neurons are exquisitely sensitive to RhoA reduction 58 .A similar sequestration has been described in neurofibrillary tangle pathology.One of other hits ARGHEF28 (also known as RGNEF) has been implicated in ALS with sequestration into TDP-43-positive inclusions 68,69 .In this study, RhoA depletion in our novel iPSC-based "CiS" model led to F-actin destabilization in neurons and accumulation of pathologic pS129 ɑS.Collectively, these studies suggest that genetic liabilities in the RhoA signaling pathway would leave neurons particularly vulnerable to the alpha-synucleinopathy, especially those that form RhoA-positive ɑS inclusions.
The connection of PSEN2 to synucleinopathy is particularly intriguing, given the known prominent synucleinopathy that is known to accompany PSEN1/2-related AD 4,70 .Importantly, the defining feature in patients in our study was synucleinopathy and not concomitant AD pathology, suggesting that the gene can be tied to a pure synucleinopathy itself.Moreover, the pathology in these patients was diffuse, affecting cortical and DA neurons.More than any other gene we recovered in this study, PSEN2 showed the most robust transcriptional expression changes across different neuronal populations in a fashion highly reminiscent of known PD genes, including across DA subtypes (Figures 5B and 5C).This downregulation is presumably highly consequential because our genome-scale CRISPR screens pinpointed PSEN2 as the one gene in our study that was essential for survival of both cortical glutamatergic and DA neurons.Finally, we showed that downregulation of PSEN2 leads to a surprisingly robust elevation of SNCA expression in neurons.Whether or not this upregulation directly leads to neuronal toxicity (and contributes to the dropout in our CRISPR/Cas9 screens) awaits further exploration.Another open question is whether, in the specific context of cortical and DA neurons, Presenilin-2 directly regulates the actin cytoskeleton.Presenilin-2 is known to bind the filamin class of actin-binding proteins 71 .Intriguingly, ɑS itself binds filamins and many other actin-binding proteins and regulators 37,62,72 and is in a complex with RhoA 58 .These are tantalizing connections for future studies to explore.
Beyond the actin cytoskeleton, altered intracellular signaling pathways were strongly represented among top hits, including a kinase within the MAPK family (MAP2K3), three tyrosine phosphatases strongly tied to MAPK signaling (PTPN18, PTPRB, PTPRR 73 , PTPRS) and a key transcriptional effector of Hippo signaling (TEAD2).Hippo signaling itself regulates MAPK signaling, and these pathways all impact cellular senescence.Importantly, PTPRH was tied to PD in a prior exome screen 74 of an independent population but we did not sequence that gene here.Cellular senescence is triggered by ɑS pathology in astrocytes and microglia 75 and by the LRRK2-G2019S mutation in neural cells 76 .This is an area of intense investigative focus, given the important role of aging as a risk factor across proteinopathies.Another pathway we very recently tied directly to synucleinopathy through extensive genetic and biochemical analysis is mRNA decapping and gene regulation 52 .While P-body genes did not emerge in the overexpression screen used as the source of ɑS genetic modifiers in this study (but rather in later screens), we did in fact sequence a single P-body gene, the nuclease-encoding XRN1, because it had emerged as a modifier of Aβ toxicity in an overexpression screen.XRN1 was a top hit at the gene level.The emergence here not only reinforces the tie-in to synucleinopathy in our prior study, but also implicates altered mRNA stability as a potentially common shared pathway across Aβ and ɑS proteinopathies worthy of future exploration.
One key aim of our study was to investigate where in the CNS our genetic hits are impacting cellular vulnerability to neurodegeneration.Our expectation was that interactions within a unicellular context (i.e., yeast) would play out in a multitude of CNS cellular contexts.Moreover, we envisaged that these interactions could become accessible through high-depth sequencing of multiple orthologs of a single gene.In this regard, it was first reassuring that, across approximately 1500 genes in our database of Aβand ɑS-interacting proteins, expression of our top-hit genes was enriched in the CNS, suggesting their relevance to CNS disease.Intriguingly, when the essentiality CRISPR screen was performed in hiPSC-derived glutamatergic neurons, rather than DA neurons, a different set of essential genes emerged.It Is plausible that our top-hit genes alter vulnerability in specific cell types and may confer differential vulnerability in synucleinopathies.For example, ARFGAP3 is a top hit in our targeted exome screen and a gene essential for survival of cortical but not DA neurons.It is tempting to speculate that mutations in a gene like ARFGAP3 could "flip" a patient from a PD (DA>glutamatergic) to DLB (glutamatergic>DA) phenotype.More generally, our ability to uncover altered expression of genes from a study looking for rare proteincoding variants points to crossover among rare and common genetic risk factors for neurodegenerative diseases.This has already been noted for key PD-associated genes like GBA1, LRRK2 and SNCA itself.
It will be likely decades before we have the millions of genomes required for truly systematic analysis of late-onset complex diseases.Until that time at least, and perhaps even after that time, carefully crafted biological tools will be essential to interpret and expand the reach of statistical genetics analyses.Here, our analyses suggest convergent mechanisms are shared within the synucleinopathy group, but also across proteinopathies and clinical diagnoses.This targeted rarevariant approach may effectively find true signals under the GWAS peaks for common variants.Ultimately, in this approach it is the convergence of multiple orthogonal lines of evidence that makes our gene associations convincing and are required to mitigate against false positives from biologically driven (and hence biased) studies like this.We anticipate that, among the many approaches now available, cross-species approaches will continue to yield important insights for complex diseases in the post-genomics era.

Targeted exome sequencing and joint calling
We performed targeted exome sequencing of 506 familial and sporadic PD, LBD, and MSA cases as well as 98 HapMap CEU samples that were sequenced as part of the 1000 Genomes Project for quality control (Coriell Institute for Medical Research, Camden, NJ, USA).Exome enrichment and sequencing library preparation were done following a hybrid selection protocol 77 using Agilent SureSelect baits targeting the coding region of 430 genes (~1 Mb).Samples were sequenced at the Broad Institute of MIT and Harvard (Cambridge, MA, USA) on the Illumina HiSeq platform with ~100x depth.Raw reads were aligned to the human reference genome (GRCh37/hg19) using BWA (v0.5.9).We used 2,570 individuals over the age of 70 who have no history of cancer and dementia from the Medical Genome Reference Bank (MGRB) 44 as controls.Target regions were extracted from the whole genomes of these individuals with 100 bp padding around each interval, and BAM files were generated by aligning them to the same reference genome.
Individual-level variant calling was performed using the GATK HaplotypeCaller (v3.7), and GenomicsDBImport was used for joint calling genotypes of cases and controls.Variant Quality Score Recalibration (VQSR) was not performed due to the insufficient coverage of the genome by our target genes.Instead, we ensured the quality of our targeted sequencing by comparing the SNV calls from our internal HapMap CEU samples with those from the 1000 Genomes Project and observed a concordance greater than 99.9% for all chromosomes.

Variant quality control
Variants were filtered by applying GATK's VariantFiltration utility with the following hard filters: QD > 2.0, SOR < 3.0, MQ > 40.0, FS < 60, MQRankSum > -12.5, ReadPosRankSum > -6.0 and indels: QD > 2.0, SOR < 10.0, FS < 200.0,ReadPosRankSum > -10.0) with Q > 50 and GQ > 20.Only variants that passed all GATK filters and that overlapped our target regions were retained (Supplementary Figure 6).Variants were further filtered to include only autosomal biallelic sites with a maximum of 5% missing genotypes, HWE p-value > 1x10 -6 in controls (internal HapMap and MGRB) and covered at >= 10x depth in at least 95% of samples.Sites with a depth greater than twice the mean depth were excluded, resulting in a total of 15,655 variants (15,237 SNPs and 418 indels).MGRB controls had nearly twice as many singletons (cohort allele count, AC = 1) as cases but roughly the same percentage (~60%) as observed in gnomAD individuals with non-Finnish European (NFE).

Sample quality control
Relatedness statistics were calculated using VCFtools (0.1.14)to determine the unadjusted Ajk statistic 80 .Expected value of Ajk is between zero and one, zero indicating unrelated individuals and one indicating duplicate samples.Based on known kinships, pairs of samples with Ajk > 0.3 were termed as "related."We detected four related pairs; only one sample from each pair was retained.Additionally, we removed two cases due to excessive heterozygosity and one due to excessive genotype missingness.Principal component analysis (PCA) was performed to determine the ancestry of our cases and controls using 1397 HapMap samples from the 1000 Genomes project as reference.Three cases and 53 MGRB controls were removed as population outliers with potentially non-European ancestry.The remaining 496 cases clustered closely with the CEU (Utah residents with Northern and Western European Ancestry) and TSI (Toscani in Italia) HapMap populations and with the MGRB control samples (Supplementary Figure 7).One MGRB control was removed due to an excessive missingness rate (> 25%) which left us with 2516 MGRB controls.No additional samples were removed based on the distribution of Ts/Tv, and Het/Hom (Supplementary Figure 8).
To reduce the potential for technical artifacts, we checked whether ultra-rare variants, such as singletons (AC=1), doubletons (AC=2), and tripletons (AC=3) in cases and controls follow an exact binomial distribution (Supplementary Table 5).As singletons showed significant distributional differences between the two groups, we excluded them from our rare variant analysis.

Rare variant association analysis
Single variant association testing was performed on rare variants (MAF < 0.01) using a one-sided Fisher exact test.The linkage between the most significant variants was assessed using LDSC, and no linkage was found.
To perform gene-level rare variant collapsing analysis, we used the CMC approach to collapse the variants.Then, we applied the SKAT-O test, a combined collapsing and variance component test, which is statistically efficient regardless of the direction and effect of the tested variants 81 .Qualifying variants were selected using an MAF cutoff of 1% and grouped using two masks: (i) nonsynonymous (missense, nonsense, and splice), and (ii) synonymous.The synonymous mask was used as an internal control.
Due to the discrete nature of rare variants and the non-uniformity of the unadjusted p-values, we performed a modified FDR adjustment 82 for multiple hypotheses testing correction.The test performs permutation-based resampling to empirically estimate the null distribution and then uses a rank-based approach like Benjamini-Hochberg adjustment to calculate the FDR-adjusted pvalue.

Controlling for inflation
Since cases and MGRB controls were sequenced at different depths (median coverage of cases ~55x and MGRB ~37x), there was potential for inflation in the variant-level analysis.To check for such bias, we downsampled our cases to 1 million reads per sample (median coverage~42X).A one-sided Fisher's exact test of the downsampled call set detected all significant variants that were identified using the original joint call set.The genomic inflation factors in the jointly genotyped call set, the combined separately genotyped call set and the downsampled callset were computed after the variant-level association analysis.Before excluding ClinVar pathogenic mutations, the λ values were 1.32, 1.33, 1.34 for the joint callset, the combined callset, and the downsampled callset, respectively.After excluding the 15 ClinVar pathogenic mutations, the corresponding λ values decreased to 1.12, 1.15, 1.15 respectively (Supplementary Figure 9).The decrease in genomic inflation indicates that it was primarily driven by the enrichment of known pathogenic mutations rather than the difference in coverage.

Independent validation cohorts: UK Biobank and AMP-PD
We investigated two independent Parkinson's disease (PD) case-control cohorts from the UK Biobank (UKBB), and AMP-PD (Accelerating Medicines Partnership: Parkinson's Disease) to validate our findings from rare variant association analysis.We used ~500K whole exome sequencing (WES) data from UKBB.Unrelated individuals of European ancestry with age over 40 years who had a ICD10 diagnosis code, G20 (data field: 41270) were selected as PD cases.We randomly sampled 6,711 controls from 167,188 unrelated European individuals of 60 years or older with no neurological diagnosis (no ICD10 code between G01-G99) in the UKBB.For rare variant analysis, only high-quality autosomal biallelic variants passing GATK best practices filters with average quality (AQ) >= 50 were retained.Data was analyzed using the DNAnexus platform.
For the AMP-PD whole genome sequencing (WGS) dataset, we first excluded any individual belonging to the genetic registry, genetic cohort, subjects without evidence of dopamine deficit (SWEDD), prodromal categories, or the AMP-LBD cohort.We also excluded the samples from the Harvard Biomarker Study (HBS) due to the possibility of overlaps with our original case cohort.High-quality autosomal biallelic variants passing GATK best practices filters with maximum 10% missingness were retained for analysis.Unrelated European individuals were included in the analysis.Additional sample outliers were removed based on Ts/Tv, Het/Hom ratios, and perhaploid SNV counts.Outliers were defined as samples which are +/-3 standard deviations away from the mean.After performing variant-and sample-level quality filtering, the resulting cohort included 1,598 sporadic PD cases with no known LRRK2 and GBA1 mutations, and 1,095 neurotypical controls.
Variants from both the cohorts were called using the GRCh38 assembly and were annotated with the gnomAD minor allele frequencies and in-silico predictions of deleteriousness of the missense variants by PolyPhen2 and SIFT from the dbNSFP (v4.3a) database using the using VEP (v109).Variants termed as synonymous, missense, splice donor, splice acceptor, splice region, stopgained, stop-lost, start-lost, frameshift, in-frame insertion, and in-frame deletion, were included in the analysis.Three masks were used to group variants: (i) Damaging: missense variants predicted to be either ''P'' or ''D'' by PolyPhen2 or ''deleterious'' by SIFT, as well as, splice donor, splice acceptor, splice region, stop-gained, stop-lost, start-lost, frameshift, in-frame insertion, and in-frame deletion; (ii) Missense, and (iii) Synonymous.

Rare variant trend test (RVTT)
Since single-variant and gene-level analysis of both UKBB and AMP-PD were unable to find any genome/exome-wide significant gene other than LRRK2 and GBA1 in previous studies 45,46 as well as our in-house analysis, we utilized RVTT 51 , a gene set based approach to assess trend in rare variant burden.Due to relatively small sample sizes of PD cohorts and the burden of multiple hypotheses correction, the statistical power of variant-and gene-level burden tests remains low.RVTT increases power by looking at gene sets and reduces false positives by looking at the frequency of qualifying variants in the gene set of interest instead of presence or absence of variants.Under the null hypothesis, the test assumes that there is no linear trend in the binomial proportions of cases and controls in terms of rare variant occurrences in the gene set.The alternative hypothesis indicates the presence of a linear trend.RVTT uses the Cochran-Armitage statistic to quantify this trend.Qualifying rare variants are selected using a variable threshold approach 83 .Since the test statistic doesn't follow a chi-square distribution, p-values are drawn from an empirical distribution estimated from 10,000 random permutations of the case-control labels.
We tested four gene sets: (i) significant genes (variant-level; FDR adj-p < 0.1), (ii) significant genes (SKAT-O; FDR adj-p < 0.1), (iii) AD genes and Aβ modifiers among significant genes, and (iv) PD genes and a-syn modifiers among significant genes (Figure 1E).RVTT was applied to both UKBB and AMP-PD.The test selected qualifying rare variants by varying the MAF cutoff up to 0.01.The validity of significant results was confirmed by demonstrating that there is an accumulation of damaging and missense variants in patients versus controls, but no statistically significant difference in proportions of synonymous variants within the same gene sets in cases vs. controls.
The RVTT p-values per variant category from both datasets were combined using the Cauchy combination test (CCT) 53 .CCT defines the test statistic as a weighted sum of transformed p-values that follows a standard Cauchy distribution.Unlike Fisher's combined test, CCT is robust against arbitrary correlation structures among the p-values being tested and less biased towards extremely small p-values from individual studies.Therefore, it is better suited to genomic analysis where complete independence of underlying variants and genes cannot be assumed.

Single-cell transcriptomic analysis in post-mortem brain data
We analyzed publicly available single nuclei RNA-seq data from post-mortem brains of 10 PD/DLB patients and 8 healthy controls 63 .We used the cell type and subtype definitions from the original study and followed their quality control recommendations.Differential expression analysis was performed in all major cell types of the central nervous system (CNS) as well as in the ten subtypes of dopaminergic (DA) neurons using MAST (model-based analysis of single-cell transcriptomes) 84 .
We used MAST's discrete hurdle model which is a mixed effects model that assumes the snRNAseq data follows a mixture of a binomial and normal distribution while accounting for pre-defined covariates.Sex, age, race, percentage of reads aligned to mitochondrial genes, and number of UMIs (log-scale) were used as fixed-effect covariates.Disease status was also included as fixedeffect covariate and was the main variable of interest.Sample ids were used as a random-effect covariate, to account for dependencies between cells originating from the same individual.The beta value from Wald test was used as an estimate of the effect of disease on expression of each gene where the sign of beta indicated the direction of effect.Differentially expressed genes were selected as being transcriptome-wide significant with a cutoff of BH-adjusted p-value < 0.1.

Enrichment analysis of expression quantitative trait loci (eQTLs)
We searched the bulk data from brain GTEx (v8) and PsychENCODE for the presence of known expression QTLs (eQTL) and splice QTLs (sQTLs) within +/-1 Mb of our significant genes (full set as well as the AD genes and Aβ modifiers subset) in different brain regions.We assessed the enrichment of these gene sets with significant rare variant burden over all genes having e/sQTLs in both brain GTEx and PsychENCODE and within anchored Hi-C loops.Confidence intervals were empirically estimated through random sampling of N (N=number of genes of interest), permuting 1,000 times and permutation p-values were reported.
We interrogated two publicly available datasets 56,57 through scQTLbase 85 to identify significant cell-type specific cis-eQTLs in near our significant genes in major CNS cell types.To assess the relationship between expression and PD risk, we integrated cell-type specific eQTLs from iPSCderived cell types differentiating towards a mid-brain fate 57 with summary statistics from a largescale genome-wide association study (GWAS) of PD 43 .Since current single cell eQTL datasets for CNS cell-types have small sample size, many eQTLs remain undiscovered 86 .Also, at the genome-wide significance level (p-value < 1x10 -8 ), we are underpowered to detect any enrichment.Therefore, we used different relaxed p-value cutoffs (1x10 -7 , 1x10 -6 , and 1x10 -5 ) for selecting PD-associated variants with suggestive evidence.We then performed enrichment analyses using cis-eQTLs in iPSC-derived neurons near our significant genes as well as the AD genes and Aβ modifiers subset over a background of cell-type specific cis-eQTLs with suggestive signal in PD GWAS.Confidence intervals were empirically estimated through random sampling of N (N=number of genes of interest), permuting 1,000 times and permutation p-values were reported.

Cortical induced synucleinopathy (CiS) model Generation
SNCA-IRES-mCherry or IRES-mCherry was cloned into lentiviral vector under the control of EF-1a promoter.Plasmids were sequenced verified and submitted for lentivirus production.WTC11 hiPSCs with doxycycline-inducible NGN2 in AAVS1 safe-harbor locus 65 were transduced either lentivirus.hiPSCs were flow cytometry-sorted utilizing the mCherry fluorescence and replated at low density for single-cell clonal selection.iPSC clones were expanded, karyotyped.

Whole-cell protein extraction
For cell lysis, frozen cell pellets were resuspended in RIPA buffer (Thermo Scientific; Cat.No. 89900) supplemented with Complete EDTA-free protease inhibitor cocktail (Sigma-Aldrich; Cat.No. 11873580001) and PhosSTOP phosphatase inhibitor cocktail (Sigma-Aldrich; Cat.No. 4906845001).The cell pellets were kept on ice for 30 minutes and vortexed every 10 minutes to enable complete suspension of pellets to the RIPA buffer.After incubation, the cells were centrifuged at 21000g at tabletop centrifuge for 25 minutes at 4°C.The supernatant was then collected and transferred to a new Eppendorf tube.Protein concentration was measured with Pierce™ BCA Protein Assay Kit according to manufacturer's guidelines (Thermo Fisher; Cat.No. 23225).Protein samples were mixed with 4X NuPAGE™ LDS Sample Buffer (Thermo Fisher; Cat.No. NP0007) supplemented with 40 mM TCEP Bond Breaker (Thermo Fisher; Cat.No. 77720) and boiled for 10 minutes at 65°C.

Western blotting
For SDS-PAGE electrophoresis, NuPAGE™ 4-12% Bis-Tris protein gels were used (Thermo Fisher; Cat.No. NP0321BOX) with MOPS running buffer (Thermo Fisher; Cat.No. NP0001).The gels were transferred to PVDF membranes (Invitrogen; Cat.No. IB24002) with iBlot™ 2 Gel Transfer Device with the preset P0 setting.After the transfer, the membranes were fixed with 0.4% Paraformaldehyde in PBS for 15 minutes at room temperature with slight rocking (This step is especially critical for detection of α-Synuclein protein).The membranes were washed three times with PBS-T (PBS with 0.1% Tween-20) and blocked with blocking buffer (5% BSA in PBS-T) for 30 minutes at room temperature.Primary antibodies are diluted in blocking buffer and incubated overnight in cold room.After three times PBS-T washing, the secondary HRP conjugated antibodies are incubated with modified blocking buffer (1% BSA in PBS-T) for 45 minutes at room temperature with slight rocking.All the exposures were recorded digitally by iBright™ CL1000 Imaging System.

RNA isolation, qPCR analysis
Total cellular RNA was isolated with Purelink RNA mini kit (Fisher Scientific; Cat.No.12183018A) according to manufacturer's instructions.RNA was used to synthesized cDNA with the Superscript IV Vilo master mix with ezDNase enzyme (Fisher Scientific; Cat.No. 11766050).Quantitative SYBR green PCR assay was performed using Powerup SYBR green master mix (Fisher Scientific; Cat.No. A25777) following previously published protocol 87 .The fold change in gene expression was determined by the ΔΔCt method.GAPDH was used as a housekeeping gene.

Immunofluorescence and microscopy
For immunofluorescent studies, cells were fixed in 4% PFA, washed with PBS + 0.1% Triton, blocked with 5% BSA in PBS + 0.1% Triton, and incubated in primary antibodies overnight at 4 °C.The following primary antibody concentrations were used: cleaved caspase, Cell Signaling Cat.No. 9661 (1:500); TUJ1, Biolegend Cat.No. 801213 (1:500); Phalloidin, Cytoskeleton Cat.No. PHDG1-A; pS129, Abcam Cat.No. ab184674 (1:1000).Appropriate fluorescent secondary antibodies (Alexa Fluor, Molecular Probes) were used at 1:1,000.Zeiss LSM-800 confocal microscope with 40X objective lens was used in capturing the fluorescence intensity of caspase, phalloidin and pS129 staining.The same confocal settings were used in scanning all the genotypes and treatment conditions.The confocal setting was optimized to avoid any crosstalk between the fluorophores.Image J was used in analyzing fluorescence intensity and cell/aggregates count.For phalloidin, 3 images were acquired, and two experimental replicates were used.For caspase-positive cell count, the whole well was analyzed.For pS129, fluorescence intensity in the cell body was measured and normalized to the intensity of DAPI.For each condition, 14 -18 neurons were quantified from two wells.

Genome-wide CRISPR/Cas9 screen in midbrain cortical excitatory neurons
The WGS was performed in WA-09 (H9) embryonic stem cells with doxycycline inducible Cas9 (iCas) knocked into the AAVS1 self-harbor locus as previously described (PMID: 24931489).To perform the screen H9-iCas pluripotent stem cells were infected with the Brunello whole genome human CRISPR knockout library 88 at an MOI of 0.3-0.5 and at 1000x representation.Successfully transduced cells were selected for by adding 0.4ug/ml puromycin to the E8 medium.PSCs were differentiated to cortical neurons as previously described 89 .A DIV20 a no Doxycycline sample was harvested as the representation control for the screen.Cas9 was induced by the addition of Doxycycline (2 μg/ml) to the culture medium from DIV20-22.Neurons were cultured until DIV65 then harvested for library preparation and sequencing.Analysis was performed using the MAGECK-MLE pipeline as previously described 90 .Genes were considered essential if they had a negative beta score with a P-value of <0.05 and FDR of 10%.

Genome-wide CRISPR/Cas9 screen in midbrain Dopaminergic neurons
The WGS was performed in WA-09 (H9) embryonic stem cells with doxycycline inducible Cas9 (iCas) knocked into the AAVS1 self-harbor locus as previously described 91 .Stem cells were transduced with the Gattinara human CRISPR pooled knockout library 92 at an MOI of 0.3-0.5 and 1000X representation.Transduced stem cells were selected by puromycin and differentiated toward dopamine neurons until reaching the neural progenitor stage as described 93 .Cas9 expression was induced by doxycycline addition, cells were collected for our initial timepoint, and the remaining cells were differentiated.All remaining cells were collected for our final timepoint as neuron cell death began.Sample were processed for library preparation and sequenced.Sequencing reads were aligned to the screened library and analyzed using MAGeCK-MLE 90 , and hits were classified as having Wald-FDR < 0.05 and Beta < -0.58.
MAST analysis of snRNA-seq data, and MAGeCK-MLE analysis were performed using R packages stats, SKAT, MAST, and MAGeCKFlute respectively.

Acknowledgement
Deep targeted exome sequencing of the original cohort was supported through Fidelity funds (FBRI: https://fprimecapital.com/fbri).The authors gratefully acknowledge funding by the Aligning Science Across Parkinson's Initiative (ASAP) award ASAP-000472, Brigham Research Institute Director's Transformative Award, Michael J. Fox.Foundation award 023460 and the National Institute of Health (NIH) grant R01NS109209.SN gratefully acknowledges support from the NIH grant R35-GM127131 (PI: Sunyaev) and the Australian Parkinson's Mission.XW gratefully acknowledges support from the NIH grant T32AG000222 (PI: Yankner).We thank Dr. Christopher Cassa and Prof. Peter Kharchenko for sharing resources for the UKBB analysis and the snRNAseq analysis in post-mortem brains respectively.(B) Genes with significant rare variant burden in our analysis, including the known AD gene PSEN2, that have been mechanistically connected to regulation of the actin cytoskeleton.(A) Independent essentiality screens performed in cortical excitatory and midbrain DA neurons converge on PSEN2/Ab-actin genes.In cortical and DA neurons, genome-wide screens were performed to identify genes that lead to cell death with 19,993 genes plus 1,000 control gRNAs.Screen performed in triplicate in parallel and were collected at days 0, 20, and 40 (DA neurons)/65 (cortical neurons).Samples from day 20 were compared to day 40/65 to identify barcodes with reduced representation during neuronal maturation, suggesting those genes were essential for neuronal survival.MAGeCK-MLE analysis identified 1,132 and 693 genes to be essential for cortical and DA neurons respectively.Our SKAT-O significant genes showed significant overlaps with the essential genes in hypergeometric enrichment test with PSEN2 being the convergent signal.

Figure 2 .
Figure 2. A gene module comprising RhoA signaling/actin cytoskeleton regulators previously tied to AD and Aβ toxicity are among the more significant genes identified in this study.(A)Schematic diagram of Ras homology family member A (RhoA) signaling.RhoA is a master regulator of actin cytoskeleton stabilization and other processes.Signaling is activated by guanidine exchange factor Arhgef proteins.A key effector of actin cytoskeleton changes is the Pkn2 kinase.(B) Genes with significant rare variant burden in our analysis, including the known AD gene PSEN2, that have been mechanistically connected to regulation of the actin cytoskeleton.

Figure
Figure 4. Clinical and neuropathology diagnosis of patients carrying ARHGEF1/28, PSEN2 variants.(A)The distinction between PD with dementia (PDD) and DLB is based on clinical criteria.Neuropathologic diagnosis for Lewy Body Disease (LBD) is either present/absent, and the type of LBD (brainstem predominant, limbic, amygdala only, or diffuse/neocortical). Neuropathologic criteria for Alzheimer's disease (AD) outlines three "scores" corresponding to Amyloid/Thal, Braak/tau and CERAD/neuritic plaque.Based on these scores, cases are designated as having no, low, intermediate, or high level of AD neuropathologic change (ADNC).No/low ADNC is generally associated with cognitively normal individuals.Pathologic aging: beta-amyloid deposits being found without any tau.Lower panel: Histopathology from PSEN2 variant carrier, which is an example of LBD and low level ADNC.(B) Histopathology from case #4, a carrier of PSEN2 and ARHGEF28 variants.Low ADNC was observed in this case (scalebar: 50 um).

Figure 5 .
Figure 5. PSEN2/Ab-actin gene module exhibit significant dysregulation in neuronal subpopulations in postmortem PD/DLB brain.(A)Heatmap showing the presence of cell-type-specific cis-eQTLs near the members of the PSEN2/Ab-actin gene module in different neuronal cell types.PSEN2 had a significant enrichment of DA-specific cis-eQTLs (hypergeometic p-value: 2.49x10 -65 ) and PKN2 had an enrichment of cis-eQTLs in excitatory neurons (hypergeometric p-value: 1.97x10-77  ).See Supplementary Table4for additional details.(B) Heatmap showing PD-associated transcriptional changes of the identified PSEN2/Ab-actin gene module in post-mortem PD/DLB brains vs controls in different neuronal cell types.Celltype specific differential expression analysis was performed in dopaminergic (DA), non-DA excitatory and non-DA inhibitory neurons of the midbrain using MAST's discrete hurdle model.Transcriptome-wide significance is reported using a BH-corrected p-value cutoff of 0.1 (denoted by *s; *: adj-p < 0.1, **: adj-p < 0.01, ***: adj-p < 0.001, ****: adj-p < 0.0001).Transcriptional changes of 26 known PD genes were shown on the left for comparison.See Methods for details.(C) Heatmap showing PD-associated transcriptional changes of the identified PSEN2/Ab-actin gene module in post-mortem PD/DLB brains vs controls in different subtypes of DA neurons in the midbrain.Cell subtype specific differential expression analysis was performed using MAST's discrete hurdle model.Transcriptome-wide significance is reported using a BH-corrected pvalue cutoff of 0.1 (denoted by *s; *: adj-p < 0.1, **: adj-p < 0.01, ***: adj-p < 0.001, ****: adj-p < 0.0001).Transcriptional changes of 26 known PD genes were shown on the left for comparison.

Figure 6 .
Figure 6.PSEN2 is an essential gene in cortical and dopaminergic neurons, and is involved in SNCA expression.(A)Independent essentiality screens performed in cortical excitatory and midbrain DA neurons converge on PSEN2/Ab-actin genes.In cortical and DA neurons, genome-wide screens were performed to identify genes that lead to cell death with 19,993 genes plus 1,000 control gRNAs.Screen performed in triplicate in parallel and were collected at days 0, 20, and 40 (DA neurons)/65 (cortical neurons).Samples from day 20 were compared to day 40/65 to identify barcodes with reduced representation during neuronal maturation, suggesting those genes were essential for neuronal survival.MAGeCK-MLE analysis identified 1,132 and 693 genes to be essential for cortical and DA neurons respectively.Our SKAT-O significant genes showed significant overlaps with the essential genes in hypergeometric enrichment test with PSEN2 being the convergent signal.(B) Relative mRNA expression level in SNCA-endogenous CiS neurons 7days after control or PSEN2 shRNA transduction, revealing PSEN2 knockdown significantly increased SNCA expression.Values represent mean ± s.e.m. (**p < 0.01; ****p < 0.0001; t-test).

Table 1 . Significant rare nonsynonymous variants (MAF < 0.01) in synucleinopathy cases versus controls.
Fisher exact test was performed to test the association of variants with disease risk.Significant variants were selected with a cutoff of FDR-adjusted p-value < 0.1.

Table 2 . Significant genes with rare nonsynonymous variant burden in synucleinopathy cases versus controls
. Missense, nonsense, and splice variants with MAF < 0.01 were collapsed and gene-level SKAT-O test was performed, and significant genes were selected with a cutoff of FDR-adjusted p-value < 0.1.

4. Clinical and neuropathology diagnosis of patients carrying ARHGEF1/28, PSEN2 variants
.(A) The distinction between PD with dementia (PDD) and DLB is based on clinical criteria.Neuropathologic diagnosis for Lewy Body Disease (LBD) is either present/absent, and the type of LBD (brainstem predominant, limbic, amygdala only, or diffuse/neocortical). Neuropathologic criteria for Alzheimer's disease (AD) outlines three "scores" corresponding to Amyloid/Thal, Braak/tau and CERAD/neuritic plaque.Based on these scores, cases are designated as having no, low, intermediate, or high level of AD neuropathologic change (ADNC).No/low ADNC is generally associated with cognitively normal individuals.Pathologic aging: beta-amyloid deposits being found without any tau.Lower panel: Histopathology from PSEN2 variant carrier, which is an example of LBD and low level ADNC.(B) Histopathology from case #4, a carrier of PSEN2 and ARHGEF28 variants.Low ADNC was observed in this case (scalebar: 50 um).