Abstract
Alzheimer’s disease (AD), the most prevalent form of dementia, is a progressive and devastating neurodegenerative condition for which there are no effective treatments. Understanding the molecular pathology of AD during disease progression may identify new ways to reduce neuronal damage. Here, we present a longitudinal study tracking dynamic proteomic alterations in the brains of an inducible Drosophila melanogaster model of AD containing the Arctic mutant Aβ42 gene. We identified 3093 proteins from diseased flies and age-matched healthy controls using label-free quantitative ion-mobility data independent analysis mass spectrometry. Of these, 228 proteins were significantly altered by Aβ42 accumulation independently of age, are enriched for AD-associated processes and have distinct hub and bottleneck properties in the brain protein interaction network. We also demonstrate widespread ageing-independent brain proteome dysregulation in response to Aβ42, which affects the expression of proteins that are important for brain function and may explain the neuronal damage observed in AD.
Introduction
Alzheimer’s disease (AD) is a progressive and devastating neurodegenerative disease that is the most prevalent form of dementia [1]. Symptoms initially present as episodic memory loss and subsequently develop into widespread cognitive impairment. Two brain lesions are pathological hallmarks of the disease: plaques and neurofibrillary tangles. Plaques are extracellular aggregates of amyloid beta (Aβ) [2], whereas, neurofibrillary tangles are intraneuronal aggregates of hyperphosphorylated tau [3,4]. In addition to these hallmarks, the AD brain experiences many other changes, including metabolic and oxidative dysregulation [5,6], DNA damage [7], cell cycle re-entry [8], axon loss [9] and, eventually, neuronal death [6,10].
Despite a substantial research effort, no cure for AD has been found. Effective treatments are desperately needed to cope with the projected increase in the number of new cases as a result of longer life expectancy and an ageing population. Sporadic onset is the most common form of AD (SAD), for which age is the major risk factor, whereas, familial AD (FAD)—a less common (<1%), but more aggressive, form of the disease—has an early onset of pathology before the age of 65 [11]. Familial AD is caused by fully penetrant mutations in the Aβ precursor protein (APP) and two subunits—presenilin 1 and presenilin 2—of the γ-secretase complex that processes APP in the amyloidogenic pathway to produce Aβ. Whilst the exact disease mechanisms of AD are not yet fully understood, this has provided support for Aβ accumulation as a key player in its cause and progression [1]. Aβ42—a 42 amino acid variant of the peptide—is neurotoxic [12], necessary for plaque deposition [13] and sufficient for tangle formation [14]. The Arctic mutation in Aβ42 (Glu22Gly) [15] causes a particularly aggressive form of familial AD that is associated with an increased rate and volume of plaque deposition [16]. Genetic analyses of SAD, however, suggest a complex molecular pathology, in which alterations in neuro-inflammation, cholesterol metabolism and synaptic recycling pathways may also be required for Aβ42 to initiate the toxic cascade of events leading to tau pathology and neuronal damage in dementia.
Comparisons of proteomic analyses of post-mortem human brains have further revealed an increase in metabolic processes and a reduction in synaptic function in AD [17]. Oxidised proteins also accumulate at early stages in the AD brain, probably as a result of mitochondrial ROS production [18], and redox proteomic approaches suggest that enzymes involved in glucose metabolism are oxidised in mild cognitive impairment and AD [19,20]. Moreover, phospho-proteomic approaches have revealed alterations in phosphorylation of glycolytic and metabolic enzymes, as well as in kinases that regulate phosphorylation of chaperones such as HSP27 and crystallin alpha B [21]. Of note, however, there is little proteomic overlap between studies using post-mortem human brain tissue, which may reflect the low sample numbers available for such studies, differences in comorbidities between patients and confounding post-mortem procedures [17]. Although valuable, post-mortem studies also reflect the end-stage of disease and, therefore, do not facilitate measurement of dynamic alterations in proteins as AD progresses.
Animal models of AD, generated through transgenic over-expression of human APP or tau, provide an opportunity to track proteomic alterations at pre- and post-pathological stages, thus facilitating insight into the molecular mechanisms underlying disease development and revealing new targets for drugs to reduce AD progression. Analyses of transgenic mouse models of AD have revealed some overlapping alterations in metabolic enzymes, kinases and chaperones with human AD brain [17]. Only one study, however, has tracked alterations in protein carbonylation over time, showing increases in oxidation of metabolic enzymes (alpha-enolase, ATP synthase α-chain and pyruvate dehydrogenase E1) and regulatory molecules (14-3-3 and Pin1) in correlation with disease progression [22].
Drosophila models of AD have been generated and shown to develop progressive neurodegenerative phenotypes, such as reduced climbing ability and shortened lifespan, when human Aβ42 peptide is expressed exclusively in adult fly neurons [23]. Using this inducible system, and taking advantage of the short lifespan of the fly, we have performed a longitudinal study of the brain proteome to capture the effects of Aβ42 toxicity in the brain from the point of amyloid induction and across life. We identified 3093 proteins using label-free quantitative ion-mobility data independent analysis mass spectrometry (IM-DIA-MS) [24], 1854 of which were common to healthy and AD flies. In this set, we identified 228 proteins that are significantly altered in AD. The proteome of AD flies was clearly segregated from that of healthy controls at all ages, suggesting that biochemical alterations induced by Aβ42 do not simply reflect accelerated ageing. Proteins altered in response to Aβ42 were enriched for AD processes and have statistically significant network properties in the brain protein interaction network. We also show that these proteins are likely to be bottlenecks for signalling in the network, suggesting that they comprise important proteins for normal brain function. Our data indicate that ageing-independent brain proteome dysregulation in AD alters essential brain processes, resulting in the premature death of AD flies. Our data will be an invaluable resource to understand the dynamic properties of Aβ42 proteo-toxicity during AD progression, with future functional studies identifying potential therapeutic candidates to treat AD at pre- and post-symptomatic stages.
Results
Proteome analysis of healthy and AD brains
Using an inducible transgenic fly line expressing human Arctic mutant Aβ42 (TgAD) [23] (Fig 1A), we confirmed a previously observed [23] reduction in lifespan following Aβ42 induction prior to proteomic analyses (Fig 1B).
To understand how the brain is affected as Aβ42 toxicity progresses, fly brains were dissected from healthy and AD flies at 5, 19, 31 and 46 days, and 54 and 80 days for healthy controls, and the proteome was analysed by label-free quantitative IM-DIA-MS (Fig 1C, Supplementary Data 1). 1854 proteins were identified in both healthy and AD flies from a total of 3093 proteins (Fig 1D), which is comparable with recent fly proteomics studies [25,26].
For the 1854 proteins identified in both healthy and AD flies, we assessed the reliability of our data. Protein abundances were highly correlated between technical and biological repeats (Fig S1). We used principal component analysis of the protein abundances to identify sources of variance (Fig 1E). Healthy and AD samples are clearly separated in the first principal component, due to the effects of Aβ42 in AD flies. In the second principal component, samples are separated by increasing age, due to age-dependent changes in the proteome. These results show that whilst ageing does contribute to changes in the brain proteome (8.7% of the total variance), much larger changes are seen in AD (70.6%). Furthermore, this suggests that Aβ42 toxicity does not simply reflect ‘accelerated ageing’, but instead operates via pathways distinct from the ageing process. We confirmed this result using hierarchical biclustering of protein abundances in Aβ42 versus healthy flies at 5 days (Fig 1F). The heat map reveals that, in healthy flies, most proteins do not vary significantly in abundance. Conversely, many proteins are differentially abundant in AD flies, compared with healthy flies.
Brain proteome dysregulation in AD
With the knowledge that Aβ42 expression affects the abundances of proteins in the brain, we then further identified proteins that were significantly altered in AD. To do this, we used five methods commonly used to analyse time course RNA-Seq data [27] and classified proteins as significantly altered if at least two methods detected them [28]. We identified 228 significantly altered proteins from 740 proteins that were detected by one or more methods (Fig 2A). A comparison of popular RNA-Seq analysis tools [29] showed that edgeR [30] has a high false positive rate and variable performance on different data sets, whereas, DESeq2 [31] and limma [32] have low false positive rates and perform more consistently. We saw a similar trend in our data set. limma and DESeq2 detected the lowest number of proteins, with 21 proteins in common (Fig S2A). edgeR detected more proteins, of which 38 were also detected by DESeq2 and 16 by limma. EDGE [33] and maSigPro [34] detected vastly more proteins, 464 of which were only detected by one method. Principal component analysis shows that edgeR, DESeq2 and limma detect similar proteins, whereas, EDGE and maSigPro detect very different proteins (Fig S2B).
Although these methods should be able to differentiate proteins that are altered in TgAD flies from those that change during normal ageing, we confirmed this by analysing healthy flies separately. In total, 61 proteins were identified as significantly altered (Fig S3), of which 30 were identified as significantly altered in normal ageing and AD (Fig 2B), while 31 proteins were only significantly altered in normal ageing. These proteins are not enriched for any pathways or functions. Based on our data, we see that the vast majority of proteins that are significantly altered in AD are not altered in normal ageing and that AD causes significant dysregulation of the brain proteome. This further suggests that AD and ageing affect the brain via distinct pathways.
Reduced insulin/IGF signalling is known to promote longevity in many organisms. A recent mass spectrometry proteomics study of fly brains that have lower insulin/IGF signalling identified a large number of significantly altered proteins [26], although very few of these overlap with the proteins that we found to be significantly altered in AD. At the 0.05 significance level, 29 proteins were in common, representing 15% of our 228 significantly altered proteins, but just 7% of their total number of significantly altered proteins. Within these 29 proteins were three subunits of the cytochrome c oxidase complex, myosin and acyl CoA synthetase, which is involved in fatty acid metabolism. The small overlap of significantly altered proteins between these two studies is not surprising, however, and highlights the diverse molecular, cellular and physiological effects that ageing, AD and other age-associated diseases can have.
To understand how the abundances of the significantly altered proteins change in AD, we clustered their profiles using a Gaussian mixture model (Fig 2C). The proteins clustered best into four sets (Fig S4). In comparison to healthy flies, cluster 1 contains proteins that have consistently higher abundance in AD. Conversely, cluster 2 contains proteins that have lower abundance in AD. The abundances of proteins from clusters 1 and 2 are affected from the onset of disease at day 5, and remain at similar levels as the disease progresses. Dysregulation of these proteins may initiate AD pathogenesis, or be involved in early stage progression. Proteins in cluster 3 follow a similar trend in healthy and AD flies and increase in abundance with age. However, cluster 4 proteins decrease in abundance as the disease progresses, whilst remaining steady in healthy flies. These proteins may be interesting therapeutic targets because there is a window in which to intervene between disease onset and amyloid accumulation and the point at which their abundance begins to decrease.
We performed a statistical GO enrichment analysis on each cluster, but found no enrichment of terms. Furthermore, we also saw no enrichment when we analysed all 228 proteins together.
Proteins significantly altered in AD have distinct network properties
Next, we analysed the 228 significantly altered proteins in the context of the brain protein interaction network to determine whether their network properties are significantly different to the other brain proteins. Using a subgraph of the STRING [35] network induced on the 3093 proteins identified by IM-DIA-MS, we calculated four graph theoretic network properties (Fig 3A) of the 183 significantly altered proteins contained in this network: degree, the number of edges that a node has; shortest path, the smallest node set that connects any two nodes; largest connected component, the largest node set for which all nodes have at least one edge to any of the other nodes; and betweenness centrality, the proportion of all the shortest paths in the network that a particular node lies on.
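As an illustration of three of the properties defined above, the following is a minimal sketch on a hypothetical toy network. The edge list and node names are invented for the example and are not taken from the STRING network. The shortest path is reported here as a number of edges, found by breadth-first search.

```python
from collections import deque

# Hypothetical toy interaction network (undirected); names are illustrative only.
EDGES = [("A", "B"), ("B", "C"), ("C", "D"), ("B", "D"), ("E", "F")]

def build_adjacency(edges):
    """Return a dict mapping each node to the set of its neighbours."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def degree(adj, node):
    """Number of edges that a node has."""
    return len(adj[node])

def shortest_path_length(adj, source, target):
    """Breadth-first search; returns the number of edges, or None if unreachable."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for nb in adj[node]:
            if nb == target:
                return dist + 1
            if nb not in seen:
                seen.add(nb)
                queue.append((nb, dist + 1))
    return None

def largest_connected_component(adj, nodes=None):
    """Largest set of mutually reachable nodes, optionally restricted to `nodes`."""
    allowed = set(adj) if nodes is None else set(nodes) & set(adj)
    best, seen = set(), set()
    for start in allowed:
        if start in seen:
            continue
        comp, queue = {start}, deque([start])
        while queue:
            node = queue.popleft()
            for nb in adj[node]:
                if nb in allowed and nb not in comp:
                    comp.add(nb)
                    queue.append(nb)
        seen |= comp
        if len(comp) > len(best):
            best = comp
    return best

adj = build_adjacency(EDGES)
```

Passing a subset of nodes to `largest_connected_component` restricts the search to edges between those nodes, which is one way to ask how many proteins of interest are connected to each other.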
We performed hypothesis tests and found that these proteins have statistically significant network properties. Firstly, the significantly altered proteins make more interactions than expected (mean degree p < 0.05; Fig 3B). Therefore, these proteins may further imbalance the proteome by disrupting the expression or activity of proteins they interact with. Secondly, not only are these proteins close to each other (mean shortest path p < 0.05; Fig 3C), but also 129 of them form a connected component (size of largest connected component p < 0.01; Fig 3D). These two pieces of evidence suggest that AD disrupts proteins at the core of the proteome. Lastly, these proteins lie along shortest paths between many pairs of nodes (mean betweenness centrality p < 0.01; Fig 3E) and may control how signals are transmitted in cells. Proteins with high betweenness centrality are also more likely to be essential genes for viability [36]. Taken together, these findings strongly suggest that the proteins significantly altered in AD are important in the protein interaction network, and that dysregulation of these proteins may have significant consequences for the brain proteome and therefore function.
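The text here does not state which hypothesis test was used. One common option for this kind of question is an empirical resampling test, in which the observed mean of a property for the proteins of interest is compared with the means of randomly drawn protein sets of the same size. The sketch below assumes that approach. The function name, the number of draws and the toy values are all hypothetical.

```python
import random

def empirical_p_value(values, subset, n_draws=10000, seed=0):
    """One-sided empirical p-value for the mean of `values` over `subset`.

    values: dict mapping every node in the network to a numeric property
            (for example its degree).
    subset: the nodes of interest.
    Returns the fraction of random same-size node sets whose mean is at
    least as large as the observed mean, with a +1 correction so that the
    estimate is never exactly zero.
    """
    rng = random.Random(seed)
    nodes = list(values)
    k = len(subset)
    observed = sum(values[n] for n in subset) / k
    hits = 0
    for _ in range(n_draws):
        sample = rng.sample(nodes, k)
        if sum(values[n] for n in sample) / k >= observed:
            hits += 1
    return (hits + 1) / (n_draws + 1)

# Hypothetical toy data: 100 nodes whose property equals their index.
toy_values = {f"p{i}": i for i in range(100)}
high_subset = [f"p{i}" for i in range(90, 100)]
low_subset = [f"p{i}" for i in range(10)]
```

For a property where smaller values are the interesting direction, such as the mean shortest path, the comparison would be reversed.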
Predicting the severity of AD-associated protein alterations using network properties
We predicted how severely particular AD-associated protein alterations may affect the brain using two network properties: the tendency of a node to be a hub or a bottleneck. In networks, nodes with high degree are hubs for communication, whereas nodes with high betweenness centrality are bottlenecks that regulate how signals propagate through the network. Protein expression tends to be highly correlated to that of its neighbours in the protein interaction network. One exception to this rule, however, is bottleneck proteins, whose expression tends to be poorly correlated with that of their neighbours [36]. This suggests that the proteome is finely balanced and that the expression of bottleneck proteins is tightly regulated to maintain homeostasis. We analysed the hub and bottleneck properties of the significantly altered proteins and identified four hub-bottlenecks and five nonhub-bottlenecks that are involved in AD (Fig 4A) and analysed how their abundances change during normal ageing and over the course of the disease (Fig 4B).
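The hub and bottleneck labels can be illustrated with a short sketch. It computes betweenness centrality with Brandes' algorithm for an unweighted, undirected graph and then labels each node using two cutoffs. The cutoffs, the toy network of two cliques joined through a bridge node, and the function names are assumptions made for the example. The thresholds used in the study are not given in this section.

```python
from collections import deque

def build_adjacency(edges):
    """Return a dict mapping each node to the set of its neighbours."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def betweenness_centrality(adj):
    """Brandes' algorithm; returns unnormalised counts of shortest paths per node."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        stack = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}   # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair of endpoints was visited twice in an undirected graph.
    return {v: b / 2 for v, b in bc.items()}

def classify(adj, degree_cutoff, betweenness_cutoff):
    """Label each node as hub/nonhub and bottleneck/nonbottleneck (hypothetical cutoffs)."""
    bc = betweenness_centrality(adj)
    labels = {}
    for node, neighbours in adj.items():
        hub = "hub" if len(neighbours) >= degree_cutoff else "nonhub"
        neck = "bottleneck" if bc[node] >= betweenness_cutoff else "nonbottleneck"
        labels[node] = f"{hub}-{neck}"
    return labels

# Hypothetical toy network: two 4-node cliques joined through bridge node "x".
EDGES = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"), ("c", "d"),
         ("e", "f"), ("e", "g"), ("e", "h"), ("f", "g"), ("f", "h"), ("g", "h"),
         ("d", "x"), ("x", "e")]
toy_adj = build_adjacency(EDGES)
toy_bc = betweenness_centrality(toy_adj)
toy_labels = classify(toy_adj, degree_cutoff=4, betweenness_cutoff=15)
```

In this toy network the bridge node has only two edges but lies on every shortest path between the two cliques, so it is a nonhub-bottleneck, whereas the two nodes it connects are hub-bottlenecks.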
Nonhub-bottlenecks: Acs1, CG6543, Got2, CoII and Acp65Aa
Three of the nonhub-bottlenecks are metabolic enzymes. Acs1 and CG6543 are involved in the production of acetyl-CoA from fatty acids. Acyl-CoA synthetase long chain (Acs1) catalyses the ligation of CoA to acyl chains and CG6543 hydrates double bonds in unsaturated fatty acids. AD is known to affect many enzymes involved in acetyl-CoA metabolism, causing an acetyl-CoA deficit in the brain and loss of cholinergic neurons [6]. Whilst CG6543 abundance increases in healthy flies during normal ageing—suggesting that aged flies require higher activity—its level was decreased in AD, which may have severe consequences. On the other hand, Acs1 is increased in AD. During development, Acs1 participates in neuronal development by directing the growth of axons.
Aspartate aminotransferase (Got2) produces the neurotransmitter L-glutamate from aspartate and is involved in assembly of synapses. After brain injury, aspartate aminotransferase levels become elevated [37], which may explain why Got2 is upregulated in AD.
In the mitochondrial electron transport chain, cytochrome c oxidase (COX)—also known as complex IV—uses the energy from reducing molecular oxygen to water to generate a proton gradient across the inner mitochondrial membrane. CoII—a COX subunit—is downregulated in AD flies. The link between COX and AD is unclear, although Aβ is known to inhibit COX activity [38]. For example, in AD patients, COX activity—but not abundance—is reduced, resulting in increased levels of ROS [39]. However, in COX-deficient mouse models of AD, plaque deposition and oxidative damage are reduced [40]. Taken together, these results suggest that whilst COX is clearly involved in AD, more work is required to decipher its role and how our results fit into this emerging picture.
The cuticle protein Acp65Aa was also upregulated in AD, but levels fell sharply between 5 and 19 days. However, it is surprising that we identified Acp65Aa in our samples, as it is not expected to be expressed in the brain. One explanation may involve chitin, which has been detected in AD brains and has been suggested to facilitate Aβ nucleation [41]. Amyloid aggregation has previously been shown to plateau around 15 days post-induction [42], which is around the same time that Acp65Aa drops in AD flies. Our results suggest that Aβ42 causes an increase in Acp65Aa expression early in the disease, but further experiments are needed to confirm this and whether AD flies have defective wings [43].
Hub-bottlenecks: Hsp70A, Gp93, Top2 and Act75B
Meanwhile, the four hub-bottlenecks indicate that the AD brain is stressed. Hsp70A, a heat shock protein that responds to hypoxia, is massively upregulated in AD, even after 5 days. Hypoxia has been shown to promote Aβ accumulation and tau hyperphosphorylation in the brain [44]. Additionally, we found Gp93, a stress response protein that binds unfolded proteins, to be twice as high in AD. DNA topoisomerase 2 (Top2), an essential enzyme for DNA double-strand break repair, is decreased in AD. Double-strand breaks occur naturally in the brain as a consequence of neuronal activity, an effect that is aggravated by Aβ [7]. As a consequence of deficient DNA repair machinery, deleterious genetic lesions will accumulate in the brain and exacerbate neuronal loss.
Finally, we found that actin is increased in AD, in agreement with two recent studies on mouse brains [45,46]. Recently, Kommaddi and colleagues found that Aβ causes depolymerisation of F-actin filaments in a mouse AD model before onset of AD pathology [46]. The authors showed that although the concentration of monomeric G-actin increases, the total concentration of actin remains unchanged. It has long been known that G-, but not F-, actin is susceptible to cleavage by trypsin [47], permitting its detection and quantification by IM-DIA-MS. Hence, the apparent increase of actin in AD flies may be due to F-actin depolymerisation, which increases the pool of trypsin-digestible G-actin, and is consistent with the findings of Kommaddi et al. To confirm whether total actin levels remain the same in the brains of AD flies, additional experiments would have to be carried out in the future: tryptic digestion in the presence of MgADP, which makes F-actin susceptible to cleavage [48], and transcriptomic analysis of actin mRNA. Furthermore, actin polymerisation is ATP-dependent, so increased levels of G-actin may indicate reduced intracellular ATP. In addition, ATP is important for correct protein folding and therefore reduced levels may lead to increased protein aggregation in AD.
Due to the importance of these hub and bottleneck proteins in the protein interaction network, we predict that AD-associated alterations in their abundance will likely have a significant effect on the cellular dynamics of the brain. We predict that rescuing these perturbations with drugs, or other therapeutics, would return these proteins to their normal abundance and therefore alleviate the effects and symptoms of AD. For example, the abundances of Acsl1, Got2 and Gp93 increase as the disease progresses, so reducing their abundance should be neuroprotective. Conversely, increasing the expression of CG6543, CoII or Top2, whose abundances are reduced in AD, should also be neuroprotective. Increasing or decreasing Acp65Aa, Act57B or Hsp70A could be neuroprotective, depending on the time of intervention, as toxicity may be due either to their elevated abundance in AD or to the fall in their abundance as the disease progresses.
Dysregulated genes are associated with known AD and ageing network modules
Finally, we clustered the protein interaction network into modules and performed a GO enrichment analysis on modules that contained any of the 228 significantly altered proteins. We saw no GO term enrichment when we tested these proteins clustered according to their abundance profiles (Fig 2C), presumably because the proteins affected in AD are diverse and involved in many different biological processes. However, by testing network modules for functional enrichment, we exploited the principle that interacting proteins are functionally associated. Using a subgraph of the STRING network containing the significantly altered proteins and their directly-interacting neighbours, we used MCODE [49] to find modules of densely interconnected nodes. We chose to include neighbouring proteins to compensate for proteins that may not have been detected in the MS experiments due to the stochastic nature of observing peptides and the wide dynamic range of biological samples [50]. The resulting subgraph contained 4842 proteins, including 183 of the 228 significantly altered proteins, as well as 477 proteins that were only identified in healthy or AD flies and 3125 proteins that were not identified in our IM-DIA-MS experiments. 12 modules were present in the network (Fig 5A, Supplementary Data 2). The proportion of these modules that were composed of significantly altered proteins ranged from 0–8%. All but one of the modules were enriched for processes implicated in AD and ageing (Fig 5, Supplementary Data 3), including respiration and oxidative phosphorylation; transcription and translation; proteolysis; DNA replication and repair; and cell cycle regulation. These modules contained two proteins that were recently found to be significantly altered in the brain of AD mice [45] and are both upregulated four-fold in AD: adenylate kinase, an adenine nucleotide phosphotransferase, and Arm, involved in creating long-term memories.
In humans, the greatest genetic risk factor for AD is the ε4 allele of ApoE—an apolipoprotein involved in cholesterol transport and repairing brain injuries [51]. A recent study showed that ApoE is only upregulated in regions of the mouse brain that have increased levels of Aβ [45], indicating a direct link between the two proteins. Although flies lack a homolog of ApoE, they do possess a homolog of the related apolipoprotein ApoB (Apolpp) [52], which contributes to AD in mice [53,54] and is correlated with AD in humans [55,56]. Interestingly, whilst it was not identified by IM-DIA-MS, ApoB interacts with 12 significantly altered proteins in the STRING network, so is included in the subgraph induced on the significantly altered proteins and their neighbours. ApoB was found in the second highest scoring module that contains proteins involved in translation and glucose transport (Fig 5).
We analysed the 31 proteins significantly altered in normal ageing, but not AD. Of the 29 proteins that were contained in the STRING network, 24 interact directly with at least one of the AD significantly altered proteins, suggesting an interplay between ageing and AD at the pathway level. Using a subgraph of the STRING network induced on these proteins and their 1603 neighbours, we identified eight network modules that were enriched for ageing processes [57], including respiration; unfolded protein and oxidative damage stress responses; cell cycle regulation; DNA damage repair; and apoptosis.
Discussion
Despite the substantial research effort spent on finding drugs against AD, so far, effective treatments—let alone a cure—remain elusive. Recently, however, there is renewed optimism following the discovery that plaque deposition can be prevented by a therapeutic antibody [58]. This work establishes ion-mobility-enabled, label-free quantitative proteomics as an effective method to track dynamic proteomic alterations, such as the widespread Aβ42-induced proteome dysregulation we observed in our Drosophila AD model.
Our analysis identifies many similarities between the processes that are affected by AD in both fly and human, demonstrating the relevance of our fly experimental system in future AD research, such as drug efficacy assays. Whilst there are slight differences in AD pathology between worm, fly and mouse model organisms, numerous studies have demonstrated high levels of conservation between these models, particularly with regard to age-related diseases. We believe that the ease of maintaining animal stocks, obtaining single-tissue brain samples and quantifying the proteome without the need for exogenous labels makes our experimental system an excellent choice with which to study AD. Furthermore, Drosophila are a powerful and tractable model in which to test drug targets against a wide range of genetic backgrounds and mutants.
In conclusion, we performed a longitudinal study of the Drosophila brain proteome in AD and tracked the dynamic molecular Aβ42-induced alterations that occurred during progression of the disease by label-free quantitative IM-DIA-MS. We identified important proteins that are significantly altered in AD and enriched for a complex set of processes. By analysing these proteins in the context of protein interaction networks, we were able to untangle these processes and produce a more coherent picture of the disease. For example, we predicted that changes in the abundances of hub and bottleneck proteins will likely cause widespread dysregulation of the brain proteome. For correct neuronal function, homeostasis of the brain proteome must be maintained. As such, drugs that reduce the abundance of Acsl1, Got2 or Gp93 may protect the brain against AD, as the abundance of these proteins increases as the disease progresses.
Our work demonstrates that by analysing these proteins, the associated processes can be untangled. In the future, our data set will be an invaluable resource to elucidate the mechanisms of Aβ42-induced pathology and can provide important insights into human AD.
Materials and methods
Fly stocks
The TgAD fly line used in this study [23] contains the human transgene encoding the Arctic mutant Aβ42 peptide [59]. Expression of Aβ42 was controlled by GeneSwitch [60]—a mifepristone-inducible GAL4/UAS expression system—under the pan-neuronal elav promoter.
Flies were grown in 200 ml bottles on a 12 h/12 h light/dark cycle at constant temperature (25 °C) and humidity. Growth media contained 15 g/l agar, 50 g/l sugar, 100 g/l autolysed yeast, 100 g/l nipagin and 3 ml/l propionic acid. Flies were grown for two days after eclosion before females were transferred to vials at a density of 25 flies per vial for the lifespan analysis and 10 flies per vial for the IM-DIA-MS analysis. Expression of Aβ42 was induced in AD flies by spiking the growth media with mifepristone to a final concentration of 200 µM. Flies were transferred to fresh media three times per week, at which point the number of surviving flies was recorded. For each of the three biological repeats, 10 healthy and 10 AD flies were collected at 5, 19, 31 and 46 days, as well as 54 and 80 days for healthy flies. Following anaesthetisation with CO2, brains were dissected in ice-cold 10 mM phosphate buffered saline, snap frozen and stored at −80°C.
Extraction of brain proteins
Brain proteins were extracted by homogenisation on ice into 50 µl of 50 mM ammonium bicarbonate, 10 mM DTT and 0.25% RapiGest detergent. Proteins were solubilised and disulfide bonds were reduced by heating at 80°C for 20 minutes. Free cysteine thiols were alkylated by adding 20 mM IAA and incubating at room temperature for 20 minutes in darkness. Protein concentration was determined and samples were diluted to a final concentration of 0.1% RapiGest using 50 mM ammonium bicarbonate. Proteins were digested with trypsin overnight at 37°C at a 50:1 protein:trypsin ratio. Additional trypsin was added at a 100:1 ratio the following morning and incubated for a further hour. Detergent was removed by incubating at 60°C for 1 hour in 0.1% formic acid. Insoluble debris was removed by centrifugation at 14,000 × g for 30 minutes. Supernatant was collected, lyophilised and stored at −80°C. Prior to lyophilisation peptide concentration was estimated by nanodrop (Thermo Fisher Scientific, Waltham, MA).
Label-free quantitative IM-DIA-MS
Peptides were separated by UPLC by loading 300 ng of protein onto an analytical reversed phase column. IM-DIA-MS analysis was performed using a Synapt G2-Si HDMS mass spectrometer (Waters Corporation, Manchester, UK). The time-of-flight analyzer of the mass spectrometer was externally calibrated with a NaCsI mixture from m/z 50 to 1990. Spectra were acquired over a range of 50–2000 m/z. Each biological repeat was analysed at least twice to account for technical variation.
Liquid chromatography MS data were peak detected and aligned by Progenesis QI for proteomics (Waters Corporation). The principles of the embedded search algorithm for DIA data have been described previously [61]. Proteins were identified by searching against the Drosophila melanogaster proteome in UniProt, appended with common contaminants and reversed sequence entries to estimate protein identification false discovery rate (FDR) values, using previously specified search criteria [62]. Peptide intensities were normalised to control for variation in protein loading and relative quantification. Abundances were estimated by Hi3-based quantitation [63].
Data analysis
Proteins that were identified in both healthy and AD flies were considered for further analysis. Missing data were replaced by the minimum abundance measured for any protein in the same repeat [50]. The data were quantile normalised [64], so that different conditions and time points could be compared reliably. Quantile normalisation transforms the abundances so that each repeat has the same distribution.
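The two preprocessing steps described above can be sketched as follows. This is a minimal illustration that assumes each repeat is a list of abundances with missing values encoded as `None`. The function names and the toy numbers are hypothetical, and ties are broken by list position, which a production implementation would usually handle by averaging.

```python
def impute_minimum(column):
    """Replace missing values (None) with the minimum abundance observed in the same repeat."""
    observed = [x for x in column if x is not None]
    floor = min(observed)
    return [floor if x is None else x for x in column]

def quantile_normalise(columns):
    """Give every repeat the same distribution.

    columns: list of equal-length lists, one list of abundances per repeat.
    The reference distribution is the mean of the k-th smallest value
    across repeats; each value is replaced by the reference value at its rank.
    """
    n = len(columns[0])
    sorted_cols = [sorted(c) for c in columns]
    reference = [sum(sc[i] for sc in sorted_cols) / len(columns) for i in range(n)]
    result = []
    for col in columns:
        order = sorted(range(n), key=lambda i: col[i])  # indices from smallest to largest
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        result.append(out)
    return result

# Hypothetical toy abundances for three proteins in two repeats.
repeat_a = [5.0, 2.0, 3.0]
repeat_b = [4.0, 1.0, 9.0]
normalised = quantile_normalise([repeat_a, repeat_b])
```

After normalisation, sorting any repeat gives the same list of values, which is what allows conditions and time points to be compared on a common scale.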
For PCA, the data were log10-transformed and each protein was standardised to zero mean and unit variance. Hierarchical biclustering was performed using the Euclidean distance metric with the complete linkage method. Prior to clustering, proteins were normalised to their abundance in healthy flies at 5 days.
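A minimal sketch of the transformations named above is given below: log10 transformation followed by standardisation of one protein's abundances, and the complete linkage distance between two clusters under the Euclidean metric. The use of the population variance, the function names and the toy values are assumptions made for the example.

```python
import math

def standardise_log10(abundances):
    """Log10-transform one protein's abundances, then scale to zero mean and unit variance."""
    logs = [math.log10(x) for x in abundances]
    mean = sum(logs) / len(logs)
    # Population variance is assumed here; a sample variance would divide by n - 1.
    variance = sum((v - mean) ** 2 for v in logs) / len(logs)
    sd = math.sqrt(variance)
    return [(v - mean) / sd for v in logs]

def euclidean(a, b):
    """Euclidean distance between two equal-length profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def complete_linkage(cluster_1, cluster_2):
    """Complete linkage: the largest pairwise distance between members of two clusters."""
    return max(euclidean(a, b) for a in cluster_1 for b in cluster_2)

# Hypothetical toy abundances for one protein across three samples.
z_scores = standardise_log10([10.0, 100.0, 1000.0])
```

Agglomerative clustering with complete linkage repeatedly merges the two clusters for which this distance is smallest.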
Proteins that were identified by IM-DIA-MS in either healthy or AD flies were assessed for overrepresentation of GO terms using GOrilla [65], which uses ranked lists of target and background genes. Proteins were ranked in descending order by their mean abundance. The type I error rate was controlled by correcting for multiple testing using the Benjamini-Hochberg method at an FDR of 5%.
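For readers unfamiliar with the Benjamini-Hochberg step-up procedure, a minimal sketch follows. It is a generic implementation for illustration, not the code used by GOrilla or the other tools, and the p-values are made up.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return a list of booleans: True where the null hypothesis is rejected."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * fdr.
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * fdr:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for i in order[:cutoff_rank]:
        rejected[i] = True
    return rejected

significant = benjamini_hochberg([0.001, 0.008, 0.039, 0.041, 0.60])
```

Because the procedure is step-up, a p-value that misses its own threshold is still rejected if a larger p-value further down the ranking meets its threshold.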
Identification of significantly altered proteins
Significantly altered proteins were identified using five methods that are frequently used to identify differentially expressed genes in time course RNA-Seq data. DESeq2 [31], EDGE [33], edgeR [30], limma [32] and maSigPro [34] are all available in R through Bioconductor.
Dispersions were estimated from the biological and technical repeats. Unless otherwise stated, default parameters were used for all methods under the null hypothesis that a protein does not change in abundance between healthy and AD conditions in normal ageing. The type I error rate was controlled by correcting for multiple testing using the Benjamini-Hochberg method at an FDR of 5%. A protein was classified as significantly altered if two or more methods identified it.
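The consensus rule (two or more of the five methods) amounts to a simple vote. The sketch below assumes each method's output has already been reduced to a set of significant protein identifiers; the identifiers `P1` to `P4` are placeholders, not real results.

```python
from collections import Counter

def consensus_hits(hits_by_method, min_methods=2):
    """Return proteins called significant by at least `min_methods` methods."""
    counts = Counter(
        protein for hits in hits_by_method.values() for protein in set(hits)
    )
    return {protein for protein, c in counts.items() if c >= min_methods}

# Hypothetical calls from each method after FDR correction.
hits_by_method = {
    "DESeq2": {"P1", "P2"},
    "EDGE": {"P2"},
    "edgeR": {"P1", "P3"},
    "limma": set(),
    "maSigPro": {"P4"},
}
altered = consensus_hits(hits_by_method)
```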
DESeq2 models proteins with the negative binomial distribution and performs likelihood ratio tests. A time course experiment was selected in EDGE using the likelihood ratio test and a normal null distribution. edgeR uses the negative binomial distribution and performs quasi-likelihood tests. limma fits linear models to the proteins and performs empirical Bayes F-tests. maSigPro fits generalised linear models to the proteins and performs log-likelihood ratio tests.
Significantly altered proteins were clustered using a Gaussian mixture model. Protein abundances were log10-transformed and z scores were calculated. Gaussian mixture models were fitted with 1–228 clusters. The best model was chosen using the Bayesian information criterion (BIC), which penalises complex models: BIC = k ln(n) − 2 ln(L), where ln(L) is the log-likelihood of the model, n is the number of significantly altered proteins and k is the number of clusters. The model with the lowest BIC was chosen.
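Model selection by BIC can be sketched as follows, assuming the standard form BIC = k ln(n) − 2 ln(L) with k taken as the number of clusters, as defined above (more general formulations use the number of free parameters). The log-likelihood values are invented for illustration and do not come from the study.

```python
from math import log

def bic(log_likelihood, n, k):
    """Bayesian information criterion; lower is better."""
    return k * log(n) - 2.0 * log_likelihood

def best_k(log_likelihoods, n):
    """Pick the number of clusters whose fitted model has the lowest BIC.

    `log_likelihoods` maps a number of clusters k to the model's ln(L).
    """
    return min(log_likelihoods, key=lambda k: bic(log_likelihoods[k], n, k))

# Hypothetical log-likelihoods for models with 1, 2 and 3 clusters.
fits = {1: -400.0, 2: -380.0, 3: -378.0}
chosen = best_k(fits, n=228)
```

In this toy case the third cluster improves the likelihood only slightly, so the penalty term outweighs the gain and the two-cluster model is preferred.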
Networks
All network analysis was performed using the Drosophila melanogaster STRING network (version 10) [35]. Low confidence interactions with a ‘combined score’ < 500 were removed in all network analyses.
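The confidence filter can be sketched as below, assuming the network is available as a list of (protein, protein, combined score) triples. The node names and scores are placeholders, and the function name is hypothetical.

```python
from collections import defaultdict

def build_network(scored_edges, min_score=500):
    """Drop edges with a combined score below `min_score`; return adjacency sets."""
    adjacency = defaultdict(set)
    for a, b, score in scored_edges:
        if score >= min_score:
            adjacency[a].add(b)
            adjacency[b].add(a)
    return dict(adjacency)

# Toy edge list: the B-C interaction falls below the threshold and is removed.
edges = [("A", "B", 900), ("B", "C", 499), ("C", "D", 500)]
network = build_network(edges)
```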
Network properties of the significantly altered proteins were analysed in the brain protein interaction network. A subgraph of the STRING network was induced on the 3093 proteins identified by IM-DIA-MS in healthy or AD flies and the largest connected component was selected (2428 nodes and 44,561 edges). The subgraph contained 183 of the 228 significantly altered proteins. For these proteins, four network properties were calculated as test statistics: mean node degree; mean unweighted shortest path length between a node and the remaining 182 nodes; the size of the largest connected component in the subgraph induced on these nodes; and mean betweenness centrality. Hypothesis testing was performed under the null hypothesis that the significantly altered proteins do not differ from the other nodes in the subgraph. Assuming the null hypothesis is true, null distributions of each test statistic were simulated by randomly sampling 183 nodes from the network 10,000 times. Using the null distributions, non-parametric one-sided p-values were calculated as the probability of observing a test statistic at least as extreme as that of the significantly altered proteins.
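The sampling-based test can be illustrated for one statistic, mean node degree, on a toy network. This is a sketch under stated assumptions: the graph, node names and seed are invented, only the upper tail is tested, and the other three statistics would follow the same pattern.

```python
import random

def largest_component(adjacency):
    """Return the node set of the largest connected component."""
    seen, best = set(), set()
    for start in adjacency:
        if start in seen:
            continue
        component, stack = {start}, [start]
        while stack:
            node = stack.pop()
            for neighbour in adjacency[node]:
                if neighbour not in component:
                    component.add(neighbour)
                    stack.append(neighbour)
        seen |= component
        if len(component) > len(best):
            best = component
    return best

def mean_degree(adjacency, nodes):
    return sum(len(adjacency[n]) for n in nodes) / len(nodes)

def empirical_p_value(adjacency, hits, n_samples=10_000, seed=0):
    """One-sided p-value: fraction of random node sets with mean degree >= observed."""
    rng = random.Random(seed)
    population = sorted(adjacency)
    observed = mean_degree(adjacency, hits)
    extreme = sum(
        mean_degree(adjacency, rng.sample(population, len(hits))) >= observed
        for _ in range(n_samples)
    )
    return extreme / n_samples

# Toy network: a hub "H" with nine neighbours, plus a separate pair x-y.
leaves = list("abcdefghi")
adjacency = {"H": set(leaves), "x": {"y"}, "y": {"x"}}
adjacency.update({leaf: {"H"} for leaf in leaves})

component = largest_component(adjacency)
subgraph = {n: adjacency[n] & component for n in component}
p_value = empirical_p_value(subgraph, hits=["H", "a"])
```

Only random pairs that include the hub match the observed mean degree, so the estimated p-value should be close to 0.2 for this toy graph.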
A subgraph of the STRING network was induced on the proteins significantly altered in AD and their neighbours and the largest connected component was selected (4842 nodes and 182,474 edges). The subgraph contained 198 of the 228 significantly altered proteins and was assessed for enrichment of GO terms. Densely connected subgraphs were identified using MCODE [49]. Modules were selected with an MCODE score > 10. As STRING is a functional interaction network, clusters of nodes may correspond to proteins from the same complex, pathway or functional family. Clusters were assessed for overrepresentation of GO-Slim terms in the Biological Process ontology using Panther (version 13.1) [66] with a custom background of the 3093 proteins identified by IM-DIA-MS in healthy or AD flies. Fisher’s exact tests were performed and the type I error rate was controlled by correcting for multiple testing using the Benjamini-Hochberg method at an FDR of 5%.
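A one-sided Fisher's exact test for overrepresentation is equivalent to a hypergeometric tail probability, which can be sketched as below. This is a generic illustration rather than Panther's implementation. Only the background size of 3093 comes from the text; the cluster size, term size and overlap are hypothetical.

```python
from math import comb

def enrichment_p_value(hits_in_term, cluster_size, term_size, background_size):
    """P(X >= hits_in_term) when drawing `cluster_size` proteins at random
    from a background in which `term_size` proteins carry the GO term."""
    upper = min(cluster_size, term_size)
    total = comb(background_size, cluster_size)
    tail = sum(
        comb(term_size, x) * comb(background_size - term_size, cluster_size - x)
        for x in range(hits_in_term, upper + 1)
    )
    return tail / total

# Hypothetical cluster of 20 proteins, 8 of which carry a term annotated
# to 100 of the 3093 background proteins.
p = enrichment_p_value(8, 20, 100, 3093)
```

The resulting p-values, one per GO-Slim term, would then be corrected with the Benjamini-Hochberg procedure.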
Funding
H.S. is supported by an ISMB Wellcome PhD studentship [203780/Z/16/A]; A.C. is supported by a BBSRC CASE PhD studentship and Waters Corporation; J.L. is supported by BBSRC [BB/L002817/1]. The work was supported by a Wellcome instrumentation grant [104913/Z/14/Z].
Competing interests
None
Acknowledgements
We thank Dr Damian Crowther (University of Cambridge) for donation of UAS-Aβ42 fly stocks and Dr Hervé Tricoire (CNRS, France) for donation of elavGS fly stocks. Fig 1C: fly graphic by Daan Kauwenberg and brain graphic by Julia Amadeo, both from the Noun Project.
Footnotes
* Joint first authors