The chromatin modulating NSL complex regulates genes and pathways genetically linked to Parkinson’s disease

Genetic variants conferring risk for Parkinson’s disease have been highlighted through genome-wide association studies, yet exploration of their specific disease mechanisms is lacking. Two Parkinson’s disease candidate genes, KAT8 and KANSL1, identified through genome-wide studies and a PINK1-mitophagy screen, encode part of the histone acetylating non-specific lethal complex. This complex localises to the nucleus, where it has a role in transcriptional activation, and to mitochondria, where it has been suggested to have a role in mitochondrial transcription. In this study, we sought to identify whether the non-specific lethal complex has potential regulatory relationships with other genes associated with Parkinson’s disease in human brain. Correlation in the expression of non-specific lethal genes and Parkinson’s disease-associated genes was investigated in primary gene co-expression networks utilising publicly available transcriptomic data from multiple brain regions (provided by the Genotype-Tissue Expression Consortium and UK Brain Expression Consortium), whilst secondary networks were used to examine cell-type specificity. Reverse engineering of gene regulatory networks generated regulons of the complex, which were tested for heritability using stratified linkage disequilibrium score regression and then validated in vitro using the QuantiGene multiplex assay. Significant clustering of non-specific lethal genes was revealed alongside Parkinson’s disease-associated genes in frontal cortex primary co-expression modules. Both primary and secondary co-expression modules containing these genes were enriched for mainly neuronal cell types. Regulons of the complex contained Parkinson’s disease-associated genes and were enriched for biological pathways genetically linked to disease. When examined in a neuroblastoma cell line, 41% of prioritised gene targets showed significant changes in mRNA expression following KANSL1 or KAT8 perturbation. 
In conclusion, genes encoding the non-specific lethal complex are highly correlated with and regulate genes associated with Parkinson’s disease. Overall, these findings reveal a potentially wider role for this protein complex in regulating genes and pathways implicated in Parkinson’s disease.


Introduction
An in-depth understanding of the genetic and pathophysiological mechanisms underlying neurodegenerative diseases is necessary to develop effective disease-modifying treatments. In the case of Parkinson's disease, although 90-95% of cases are sporadic, historically much of the research into its genetic basis has focused on family-based linkage studies. Indeed, the identification of at least 23 genes with highly penetrant effects on Parkinson's disease risk has succeeded in elucidating multiple biological pathways involved in its pathology. In particular, mitochondrial dysfunction and impaired protein degradation pathways are common themes.
More recently, genome-wide association studies (GWASs) have identified 90 independent risk signals linked to Parkinson's disease. Several of these had already appeared in familial studies, thereby highlighting important commonalities in the processes driving both types of the disease. 1 However, a broader understanding of the molecular relationships between Parkinson's disease loci is still lacking. Genes causally linked to the disease and involved in transcriptional regulation have the potential to provide such insights and shed light on key disease-relevant gene networks.
One such transcriptional regulator with strong links to Parkinson's disease is KAT8. This gene was first linked to the disease through the identification of a risk signal on chromosome 16 (rs14235) with subsequent expression quantitative trait loci (eQTL) analysis suggesting the risk allele results in lower KAT8 mRNA levels. 2,3 Further GWAS analyses have again highlighted KAT8 as a candidate gene, with recent colocalization and transcriptome-wide analyses strengthening the evidence for KAT8's contribution to Parkinson's disease. 1,4 Importantly, KAT8 functions within two multiprotein complexes that regulate its activity and specificity, namely the male specific lethal (MSL) and non-specific lethal (NSL) complexes. 5 Although the KAT8-encoded acetyl-transferase is thought to be the main catalytic driver in both complexes, differences in lysine specificity and in the genomic regions which are targeted can likely be attributed to subunits aside from KAT8 itself. 6 This makes the other components of the MSL and NSL complexes of potential interest, with the latter particularly important in Parkinson's disease as it contains KAT8 Regulatory NSL Complex Subunit 1 (KANSL1), another protein encoded by a Parkinson's disease candidate gene. 3,7 KANSL1 is contained within the 970kb inversion polymorphism on chromosome 17q21, located within a linkage disequilibrium (LD) block of approximately 2Mb which gives rise to H1/H2 haplotype variation. 8 The H1 haplotype has well-established links to neurodegenerative disease, specifically progressive supranuclear palsy, Alzheimer's and Parkinson's disease. [9][10][11] The precise mechanism underlying the link to Parkinson's disease is disputed, with this risk frequently attributed to the adjacent tau-encoding MAPT as well as, more recently, a putative enhancer RNA expressed from within KANSL1. 12,13 Moreover, the first GWAS of short tandem repeats in Parkinson's disease found the strongest signal within KANSL1. 14
Furthermore, both KAT8 and KANSL1 have been linked to mitophagy, the process by which defective mitochondria are identified and degraded and a key pathway implicated in Parkinson's disease. Accumulation of the mitophagy marker phospho-ubiquitin (pUb, serine 65) has been detected in post-mortem diseased brains, whilst deficient mitophagy has been found in both sporadic and Mendelian patient-derived induced pluripotent stem cell (iPSC) models and even suggested to play a direct role in α-synuclein accumulation. [15][16][17] Proteins involved in mitophagy, in particular PTEN-induced putative kinase 1 (PINK1) and parkin, are associated with early-onset autosomal recessive forms of the disease through mutations in their encoding genes, PINK1 and PRKN. 18,19 A biological screening assay of Parkinson's disease GWAS candidate genes which measured PINK1-mediated mitophagy in neuroblastoma cells demonstrated significantly reduced pUb accumulation, parkin recruitment and phosphorylation, as well as lysosomal localisation of mitochondria following knockdown (KD) of both KAT8 and KANSL1, thus demonstrating an important role of the NSL complex in mitochondrial quality control and Parkinson's disease. 4,20 However, there is some uncertainty regarding the precise molecular processes linking the NSL complex to PINK1-mediated mitophagy. There is evidence that components of the NSL complex can localise to mitochondria, though the most established function of the complex is in the nucleus, where it is involved in chromatin regulation. 21 Thus, KAT8 and KANSL1 could operate to regulate the risk of Parkinson's disease in multiple sub-cellular compartments.
In this study, we focused on the role of the NSL complex within the nucleus and tested our hypothesis that this complex operates as a master regulator of Parkinson's disease risk. This idea is supported by existing evidence implicating KAT8-dependent lysine acetylation, primarily of histone 4, in the regulation of a range of cellular processes, including DNA damage repair and autophagy. 6,[22][23][24][25][26][27] To pursue this idea, we performed a series of in silico analyses which successfully predicted gene regulatory relationships between the chromatin modulating NSL complex and genes associated with Parkinson's disease. These findings suggest a role for the NSL complex in modulating multiple pathological pathways and provide a useful framework for investigating potential gene regulatory mechanisms underlying disease risk associated with loci highlighted through GWASs.

Gene selection
We collated three lists of genes, namely the NSL genes, genes causally associated with Mendelian forms of Parkinson's disease and genes nominated through GWAS (Tab. 1). The nine genes encoding the NSL complex are widely published. 5

QuantiGene multiplex assay
We used the QuantiGene multiplex system to simultaneously measure the expression of multiple genes in siRNA-treated SHSY5Y cell samples. 52,53 Individual reagents and probe sets were purchased from ThermoFisher. Probes were directed against: i) five housekeeping genes used for normalisation, ii) two NSL complex genes used to quantify the KD, iii) all genes both causally linked to Mendelian Parkinson's disease and complex parkinsonism, and contained within NSL regulons in the GTEx dataset, and iv) all genes nominated through GWAS which are expressed in brain and predicted to be regulated by at least three NSL complex genes (Tab. 2). Data analysis was performed by first subtracting the background, then normalising the signals obtained for the genes of interest to the geometric mean of the five housekeeping gene signals. Technical duplicates were included for each experimental repeat and outliers were identified and excluded using the ROUT method in GraphPad Prism (version 9, RRID:SCR_002798). 57 Remaining duplicates were averaged and normalised to the SCR-treated sample mean.
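The normalisation steps described above can be sketched in a few lines of Python. This is a minimal illustration rather than the analysis pipeline used in the study: the function names, the gene labels (HK1, HK2, TARGET) and all signal values are hypothetical, only two housekeeping genes are shown instead of five, and the ROUT outlier exclusion performed in GraphPad Prism is not reproduced.

```python
from math import prod
from statistics import mean


def geometric_mean(values):
    """Geometric mean of strictly positive numbers."""
    values = list(values)
    return prod(values) ** (1.0 / len(values))


def normalise_well(signals, background, housekeeping, targets):
    """Subtract background from one well, then scale each target gene
    to the geometric mean of the housekeeping gene signals."""
    corrected = {gene: signals[gene] - background[gene] for gene in signals}
    reference = geometric_mean(corrected[gene] for gene in housekeeping)
    return {gene: corrected[gene] / reference for gene in targets}


def relative_expression(treated_wells, scr_wells, background, housekeeping, targets):
    """Average normalised technical duplicates, then express the treated
    sample relative to the SCR-treated control mean."""
    def averaged(wells):
        normalised = [normalise_well(w, background, housekeeping, targets) for w in wells]
        return {gene: mean(n[gene] for n in normalised) for gene in targets}

    treated = averaged(treated_wells)
    control = averaged(scr_wells)
    return {gene: treated[gene] / control[gene] for gene in targets}


# Hypothetical raw signals for one knockdown sample and its SCR control.
background = {"HK1": 10, "HK2": 10, "TARGET": 10}
scr_wells = [{"HK1": 110, "HK2": 410, "TARGET": 210}] * 2
kd_wells = [{"HK1": 110, "HK2": 410, "TARGET": 110}] * 2

result = relative_expression(kd_wells, scr_wells, background, ["HK1", "HK2"], ["TARGET"])
print(result)
```

In this toy example the target signal in the knockdown wells is half that of the control once both are scaled to the same housekeeping reference, so the relative expression is 0.5.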

Data availability
Raw data used to generate specificity matrices from substantia nigra and medial temporal gyrus  Fig. 1a). Differences in cell type-specific expression were also examined within two brain regions, namely the substantia nigra and medial temporal gyrus, utilising EWCE analysis and based on single-nuclei transcriptomic data. 31,32 Consistent with expectation, the NSL complex genes displayed no overt differences in cell type-specific gene expression when considered collectively (Supplementary Fig. 1b).

Components of the NSL complex cluster together in gene co-expression modules derived from human frontal cortex data
Given that there was no clear specificity of expression of NSL genes in CNS tissues or, more importantly, cell types, gene co-expression network (GCN) analysis was used to investigate the possibility of regional differences in co-expression that could explain selective neuronal vulnerability in Parkinson's disease. This approach was based on the observation that genes with highly correlated expression tend to share biological relationships, and so GCN analysis can reveal otherwise hidden patterns in expression that reflect molecular and cellular processes. 39
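The principle behind co-expression analysis, that genes with highly correlated expression across samples are linked, can be illustrated with a short Python sketch. It is deliberately simplified and is not the network construction method used in this study: it applies a hard threshold to pairwise Pearson correlations, whereas the GCNs analysed here group genes into modules. The gene names, expression values and the 0.8 threshold are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def coexpression_edges(expression, threshold=0.8):
    """Return gene pairs whose absolute correlation meets the threshold."""
    genes = sorted(expression)
    edges = {}
    for i, gene_1 in enumerate(genes):
        for gene_2 in genes[i + 1:]:
            r = pearson(expression[gene_1], expression[gene_2])
            if abs(r) >= threshold:
                edges[(gene_1, gene_2)] = r
    return edges


# Hypothetical expression profiles across five samples.
expression = {
    "GENE_A": [1, 2, 3, 4, 5],
    "GENE_B": [2, 4, 6, 8, 10],   # rises in step with GENE_A
    "GENE_C": [5, 1, 4, 2, 3],    # only weakly related to the others
}

edges = coexpression_edges(expression)
print(edges)
```

Only the GENE_A and GENE_B pair passes the threshold here, since the correlation of GENE_C with the other two profiles is -0.3.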

Co-expression analysis supports a role for the NSL complex in the regulation of both chromatin and mitochondrial function in human frontal cortex
Next, we focused on co-expression modules containing NSL complex genes to better understand their function in human brain. This was achieved by examining GO term enrichment within all five modules of interest in the GTEx frontal cortex GCN, but with the primary focus being the 'red' module. In total, these five modules were significantly enriched for 2899 GO terms (FDR < 0.05). Following term reduction, we noted enrichments of terms representing both the well-characterised nuclear-based and lesser-known mitochondrial-based role of the NSL complex. Consistent with expectation, nucleus-related terms, such as transcription coactivator activity (FDR range = 1.85 × 10⁻⁵–0.0227), were identified as enriched within the 'red' module, whilst the 'darkred' module was enriched for a range of mitochondria-related terms, including cytochrome-c oxidase activity (FDR range = 2.03 × 10⁻⁸–0.0128) and mitochondrial inner membrane (FDR range = 1.33 × 10⁻⁵¹–2.46 × 10⁻³) (Fig. 1b). The UKBEC GCN was also significantly enriched for several nuclear terms in the 'black' module, but lacked any mitochondrial terms (Supplementary Fig. 3b). Together these results indicated that both the chromatin- and mitochondria-related functions of the NSL complex were captured in these GCNs, indicating that these functions are likely to be active within human frontal cortex.

Components of the NSL complex cluster together with Parkinson's disease-associated genes in co-expression modules in human brain
Next, we explored the possibility that NSL genes are functionally related to genes associated with Parkinson's disease. Parkinson's-related genes were divided into those causally linked to

NSL complex activity in Parkinson's disease may be most important in neuronal cell types
Although the genes encoding the NSL complex are expressed ubiquitously across different cell types, their gene regulatory relationships may nonetheless be cell type-specific, and only in specific cell types might there be a relationship with Parkinson's disease genes. We used the GMSCA tool to test this possibility, and generated secondary GCNs, from which the contribution of four major cell types (neurons, astrocytes, microglia and oligodendrocytes) was removed in turn prior to network construction. 40 As might be expected, this correction altered the clustering of NSL genes and their co-expression patterns with Parkinson's disease-associated genes across secondary GCN modules (Fig. 2a,c). The sporadic Parkinson's disease gene list was significantly enriched alongside KANSL1, as in the primary GCN, within the oligodendrocyte-corrected 'green' module alone (FDR-corrected p-values = 4.81 × 10⁻³ and 5.87 × 10⁻³ respectively), whilst each of the neuron-, microglia- and astrocyte-corrected networks disrupted the relationship between NSL and disease genes (Fig. 2a,c). This suggested that the relationship between these genes is likely to be active in neurons, microglia and/or astrocytes.
To identify which of the three cell types was most important for NSL-Parkinson's disease co-expression, we examined the enrichment of cell type markers across all primary and secondary GCN modules. The 'darkred' module containing KAT8 and MCRS1 was enriched for markers of dopaminergic neuronal signalling (p-value = 3.23 × 10⁻⁵) in the GTEx primary GCN (Fig. 2b). MCRS1 remained predominantly in modules enriched for different neuronal markers following the correction of microglial, astrocytic and oligodendrocytic signatures (p-value range = 3.92 × 10⁻¹⁶–7.42 × 10⁻³, module membership range = 0.8452–0.8529) (Fig. 2b). The primary GCN 'grey60' module containing PHF20 lacked any cell type enrichment, in contrast to the secondary GCN 'salmon' and 'greenyellow' modules, which were both enriched for multiple neuronal cell types following the correction of microglial and oligodendrocytic signatures respectively (p-value range = 3.99 × 10⁻¹¹–5.11 × 10⁻³, module memberships = 0.8544 and 0.8539) (Fig. 2b). Although KANSL1-containing modules had no cell type enrichments, these results suggest the gene regulatory links between the NSL complex and genes associated with Parkinson's disease may be most important in neuronal cell types.

In silico analysis predicts the regulation of Parkinson's disease-associated genes by members of the NSL complex
The genetic interactions modelled in GCNs are typically undirected in that causality is unassigned. 38 However, it is already known that the NSL complex is highly important in the regulation of gene expression, suggesting that at least a proportion of the genes co-expressed with the NSL complex are regulated by it. 5 We formally tested this possibility in silico with the tool Algorithm for the Reconstruction of Accurate Cellular Networks with adaptive partitioning (ARACNe-AP), which uses expression data to reverse engineer gene regulatory networks. 43,63 By applying ARACNe-AP, we predicted the genes most likely to be regulated by the NSL complex amongst those contained within the 'red' and 'darkred' GTEx GCN modules (Supplementary Table 1). 43 The resulting target gene lists, termed regulons, produced by this analysis ranged from 491 genes predicted to be regulated by KANSL1, to 1788 predicted to be regulated by WDR5 in the GTEx dataset. As expected for genes encoding a protein complex which regulates gene expression, regulons showed significant overlaps with each other. The regulons of the four NSL genes contained within the 'red' GTEx GCN module all significantly overlapped (FDR range = 3.53 × 10⁻⁶⁷–3.76 × 10⁻⁴), whilst the regulon of KAT8 significantly overlapped only with that of WDR5 (FDR = 2.98 × 10⁻¹¹) (Fig. 3a).
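One common way to ask whether two gene lists share more members than expected by chance is an upper-tail hypergeometric test, sketched below in Python. The text above does not state which overlap test was applied, so the choice of test here is an assumption made for illustration only, as are the function name, the gene labels and the background size of ten genes. Multiple-testing (FDR) correction is not shown.

```python
from math import comb


def overlap_p_value(regulon_a, regulon_b, universe_size):
    """Probability of an overlap at least as large as the one observed
    when regulon_b is drawn at random from a gene universe that
    contains every gene in regulon_a (upper-tail hypergeometric test)."""
    set_a, set_b = set(regulon_a), set(regulon_b)
    size_a, size_b = len(set_a), len(set_b)
    observed = len(set_a & set_b)
    total_draws = comb(universe_size, size_b)
    tail = sum(
        comb(size_a, k) * comb(universe_size - size_a, size_b - k)
        for k in range(observed, min(size_a, size_b) + 1)
    )
    return tail / total_draws


# Hypothetical regulons drawn from a background of ten genes.
regulon_1 = {"G1", "G2", "G3"}
regulon_2 = {"G2", "G3", "G4"}

p_value = overlap_p_value(regulon_1, regulon_2, universe_size=10)
print(round(p_value, 4))
```

With real regulons containing hundreds to thousands of genes, the same calculation would be repeated for every pair of NSL genes and the resulting p-values corrected for multiple testing before interpretation.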
To reduce the impact of noise and focus on genes most representative of NSL complex activity, genes appearing in three or more regulons were collated and termed the NSL regulon (n = 1101). We then assessed this gene set for its role in Parkinson's disease causation. Firstly, we noted that two Mendelian disease-associated genes (ATXN2 and PLA2G6) and 12 sporadic

In vitro analysis confirms the regulation of Parkinson's disease-associated genes by the NSL complex
Given the success of our in silico analyses, we wanted to validate in vitro some of the regulatory relationships identified, in particular focusing on genes causally associated with Parkinson's disease and contained within the NSL regulon (  6a,b). 4 BIN3, CTSB, DGKQ, NCKIPSD and PGS1 were also significantly reduced following KANSL1 KD alone (p-value range < 1 × 10⁻⁴–0.0295), with reductions in DGKQ and NCKIPSD following KAT8 KD also reaching significance (p-value = 4.55 × 10⁻³ and 4.25 × 10⁻³).
Interestingly, WDR45 expression followed an inverse pattern, with KAT8 KD resulting in a significant increase in expression (p-value = 3.31 × 10⁻³) (Fig. 4). Thus, seven of the 17 genes (41.2%) predicted to be regulated by the NSL complex using in silico analyses were indeed found to show significant changes in expression when KAT8 or KANSL1 expression was suppressed (Fig. 5).

Discussion
This project utilised publicly available transcriptomic data from human brain tissue to characterise the expression patterns of genes encoding the NSL complex and their relationships to genes genetically linked to Parkinson's disease. First, NSL genes were found to cluster together with Parkinson's-associated genes in GCN modules annotated for both chromatin and mitochondria-related functions. Second, these co-expression relationships appeared to be most associated with neuronal cell types. Third, a number of Parkinson's-associated genes predicted to be directly regulated by multiple components of the NSL complex were subsequently validated in a relevant cell model.

Figure legend (GO term enrichment panel; individual axis term labels omitted). Each term has been uniformly reduced to a parent term in order to group together similar terms. Colour corresponds to the p-value of the most significantly enriched child term within the parent term and the x-axis denotes the mean ratio of genes intersecting with the term to total genes within each term. (C) REACTOME and KEGG term enrichments for genes appearing in three or more NSL regulons, filtered for those genetically linked to Parkinson's disease. The x-axis denotes the ratio of genes intersecting with the term to total genes within each term. Genes highlighted were tested using QuantiGene multiplex assay. Colours correspond to classification of gene (blue, genes linked to Parkinson's disease with NSL regulation detected; grey, genes linked to Parkinson's disease without NSL regulation detected; lightest yellow, KAT8 regulation; darkest yellow, KANSL1 regulation). Non-specific lethal (NSL).