Summary Paragraph
TDP-43 is an essential DNA/RNA processing protein that undergoes both functional and pathogenic aggregation. Functional TDP-43 aggregates are reversible, forming transient species such as nuclear bodies, stress granules, and myo-granules1–3. In contrast, pathogenic TDP-43 aggregates are irreversible, forming stable intracellular amyloid-like inclusions4,5. These inclusions are the primary pathology of amyotrophic lateral sclerosis (ALS) and frontotemporal lobar degeneration with TDP-43 inclusions (FTLD-TDP)6. Disease-associated, hereditary mutations in TDP-43 are known to accelerate the deposition of irreversible aggregates in the cytoplasm7. Reversible TDP-43 aggregation has been shown to precede the formation of irreversible amyloid fibrils, similar to the behavior of the proteins hnRNPA1 and FUS8–10. Still unknown, however, are the structural features of TDP-43 fibrils that confer both reversibility and irreversibility, and how hereditary mutations can impose irreversible aggregation. Here, we determined the structures of amyloid fibrils formed by two segments previously reported to be the pathogenic cores of TDP-43 aggregation7,11,12; these are termed SegA (residues 311-360) and SegB A315E (residues 286-331 containing the ALS hereditary mutation A315E). SegA forms three polymorphs, all with dagger-shaped folds. SegB forms R-shaped folds. All four polymorphs have folds confined to two dimensions and are stabilized by hydrophobic cores and peripheral hydrogen bonds. Energetic analysis suggests that the dagger-shaped polymorphs are examples of the irreversible fibril structures of TDP-43, whereas the SegB polymorph may participate in both reversible and irreversible fibril structures. Our structure suggests how the A315E mutation may convert this polymorph to the irreversible type and lead to mutation-enhanced pathology.
Main text
Amyloid-forming proteins seem to violate the central tenet of protein science: that amino acid sequence determines structure and function13. In contrast to globular and membrane proteins, each of which folds into a single functional structure, a given amyloid-forming sequence can fold into several distinctly different polymorphic structures14,15. Here we find that TDP-43, known to form both reversible functional aggregates and irreversible pathogenic aggregates, exhibits polymorphic behavior. This complex protein thus offers an intriguing opportunity for structural exploration.
Irreversible, hyperphosphorylated aggregates of C-terminal segments of TDP-43 are found in the autopsied brains of ALS and FTLD-TDP patients6,16,17. These aggregates have also been found in Alzheimer’s, Parkinson’s, CTE, Huntington’s disease and inclusion body myopathies (IBMs), among others18–21. Because TDP-43 functions in several essential steps of RNA metabolism22,23, it is widely considered that TDP-43 aggregation is toxic through a loss-of-function mechanism24–26. Structural studies of amyloid fibrils of β-amyloid27,28, tau29,30, α-synuclein14, and β2-microglobulin31 have revealed polymorphs and insights into pathogenesis. Here, we use cryo-EM to determine the overall folds of TDP-43 amyloid cores, expanding structural information beyond the local interactions previously revealed by crystallography32,33.
We obtained TDP-43 fibrils by incubating SUMO-tagged TDP-43 segments in the test tube after cleavage of the SUMO tag (Supplementary Figure 1a). We first aimed to produce fibrils formed by full-length TDP-43, by a pathogenic C-terminal fragment (CTF, residues 208-414), a truncation product that is enriched in disease brain34, or by the Low Complexity Domain (LCD, residues 274-414), which is considered to be necessary for TDP-43 aggregation32,35,36. However, despite our efforts at optimization, we could observe only highly clumped fibril-like structures and/or disordered aggregates that are not suitable for cryo-EM structure determination (see Supplementary Figure 1b, top panel). We suspect that this may be due to the ability of longer constructs of TDP-43 to participate in multi-valent interactions, possibly through LARKS32,37 or other adhesive segments outside the LCD33. These multi-valent interactions have been shown to assemble networks of protein chains and could conceivably explain why longer segments of TDP-43 form amorphous aggregates or fibril clumps not amenable to cryo-EM structure determination. This observation is in line with TDP-43’s role in phase separation and stress granule formation, which requires the presence of multi-valent interactions37.
To overcome the hurdle of the disordered assembly of longer segments of TDP-43, we implemented a “divide and conquer” approach whereby we selected known aggregation cores for structure determination. We chose SegA (residues 311-360) and SegB (residues 286-331), guided by the following evidence. SegA was previously identified as an aggregation core of TDP-43 due to the observation that its deletion decreases TDP-43 aggregation in vitro and in cells, whereas addition of SegA to the aggregation-resistant C. elegans TDP-43 homolog induces aggregation11. Fibrils of SegB are toxic to primary neurons, and an ALS hereditary mutation, A315T, together with phosphorylation of the threonine, which is speculated to occur in hyper-phosphorylated aggregates of TDP-43 in disease, increases SegB’s cytotoxicity7. With this in mind, we selected SegB A315E, another hereditary mutation with effects similar to those of A315T7,38–40 and a mimic of phosphorylated A315T, in order to visualize the structure of a second possible TDP-43 aggregation core and to gain insight into the molecular mechanism of mutation-enhanced TDP-43 pathology. The importance of SegA and SegB in full-length TDP-43 aggregation is also supported by other studies, which found that amyloid fibrils containing either a core region of SegA (residues 314-353) or a region similar to SegB (residues 274-313) can template aggregation of full-length TDP-43 in SH-SY5Y human neuronal cells41. Likewise, in the same cell line, deletion of either of these two regions (residues 314-353 or 274-313) from full-length TDP-43 inhibits aggregation. As we expected, fibrils formed by SegA and SegB A315E were much more homogeneous and less bundled than those formed by longer segments of TDP-43, including SegAB (residues 286-360), which contains both aggregation cores SegA and SegB (Supplementary Figure 1b). This observation supports the idea that eliminating competing multi-valent interactions helps to produce homogeneous fibrils of isolated amyloid cores.
Using these homogeneous preparations, we determined three polymorphic fibril structures of SegA (termed SegA-sym, SegA-asym, and SegA-slow) and one of SegB A315E (Figure 1, Supplementary Figures 2-4 and Supplementary Table 1; see Methods for technical details).
All four fibrils are formed from gradually twisting β-sheets that run the entire length of the fibrils (Figure 1b-e). A thin slab or “layer” perpendicular to each fibril axis (Figure 1, middle) shows that individual SegA and SegB chains are each confined within an essentially two-dimensional layer, in contrast to globular and membrane proteins, whose folds occupy three dimensions. Identical layers stack on one another, creating β-sheets that are parallel and in-register. Each layer contains two or more protein chains, giving rise to a corresponding number of protofilaments in the fibril.
SegA polymorphs all share a dagger-shaped fold comprising residues 312-346, with a sharp 160° kink at Gln327 that forms the dagger tip (Figure 2a-d, Supplementary Figure 5a-c; for a detailed comparison, see Supplementary Notes 1 and 2). The three SegA polymorphs differ mainly in the number of protofilaments and in symmetry. SegA-sym fibrils contain two protofilaments related by a pseudo-21 axis, whereas SegA-asym contains two protofilaments with somewhat different conformations. SegA-slow fibrils contain four protofilaments; two protofilaments are related by a central 2-fold axis and contain 50 ordered residues. These are flanked by two other protofilaments containing only 10 ordered residues. We note that the SegA-sym and SegA-asym structures are compatible with full-length TDP-43 because their N- and C-termini are free, whereas the SegA-slow structure is an artifact of the truncation, since the N- and C-termini are sequestered in the center of the fibrils (Figure 1d and Supplementary Figure 5b). Therefore, in the following analysis, we mostly focus on SegA-sym and SegA-asym. However, we note that SegA-slow provides valuable information, including validation of the dagger-shaped fold owing to its higher resolution and direct atomic evidence of secondary nucleation (Supplementary Note 3).
SegB A315E forms fibrils of a single morphology, in contrast to polymorphic SegA. These fibrils are characterized by an R-shaped fold spanning residues 288-319 (Figure 2e&f and Supplementary Note 4). Overall, the R-shaped fold is more highly kinked than the dagger-shaped fold (Figure 1b-e), probably due to the abundance of glycine residues. Each fibril is wound from four protofilaments (Figure 2e). The two inner protofilaments are related through a tight, pseudo-21 symmetric interface (Figure 2e&g, Supplementary Note 5). These protofilaments are flanked by two outer protofilaments through an asymmetric interface involving outer residues 289-304 and inner residues 289 and 306-318 (Figure 2e&g, Supplementary Note 5). Small conformational differences between inner and outer R-folds (Figure 2f) recall the positional polymorphism observed in a different, shorter TDP-43 segment33.
In all four R-folds we observe a salt-bridge between Arg293 and Glu315 enabled by the pathogenic A315E mutation. The Arg293-Glu315 salt-bridge is not formed by residues from the same layer along the fibril axis. Rather, each Arg293 interacts with the Glu315 from one or two layers above (Figure 2h), which may affect the kinetics of fibril growth and nucleation. This salt-bridge suggests a mechanistic explanation for the inclination of A315E toward TDP-43 pathology. Model building suggests that wild-type SegB can form the same R-shaped fold as A315E, with Ala315 participating in a hydrophobic interaction with Ala297 and Phe313. This hypothesis is supported by the similar stability and morphology of wild-type SegB and SegB A315E fibrils in our thermostability assays (Supplementary Figure 1c) and by the cross-seeding ability of SegB and SegB A315E (Supplementary Figure 1e).
It is noteworthy that although TDP-43 can form either the dagger-shaped polymorph of SegA or the R-shaped polymorph of SegB, it is unlikely that a given molecule of TDP-43 can form both simultaneously. This exclusivity is indicated by a superposition of the two folds in the overlapping segment Asn312-Asn319 (Supplementary Figure 5h). The superposition reveals that steric clashes between the main chains of SegA and SegB would occur if both folds were formed by a single TDP-43 molecule that contains both SegA and SegB sequences.
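The residue bookkeeping behind this argument can be checked with a few lines of arithmetic. The sketch below is only an illustration, not part of the structural analysis: it takes the residue ranges quoted in the text for the two segments (311-360 and 286-331) and for the ordered folds (312-346 and 288-319) and computes where they overlap. The dictionary and function names are hypothetical.

```python
# Residue ranges (inclusive) as quoted in the text.
# The names SEGMENTS, ORDERED_FOLDS and overlap are illustrative only.
SEGMENTS = {"SegA": (311, 360), "SegB": (286, 331)}
ORDERED_FOLDS = {"dagger": (312, 346), "R": (288, 319)}


def overlap(a, b):
    """Return the inclusive overlap of two inclusive residue ranges, or None."""
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return (start, end) if start <= end else None


# Sequence shared by the two constructs.
segment_overlap = overlap(SEGMENTS["SegA"], SEGMENTS["SegB"])
# Residues that are ordered in both folds, i.e. where a single chain
# would have to satisfy both conformations at once.
fold_overlap = overlap(ORDERED_FOLDS["dagger"], ORDERED_FOLDS["R"])

print("construct overlap:", segment_overlap)
print("ordered-fold overlap:", fold_overlap)
```

Under these ranges, the ordered regions of the two folds share residues 312-319, which matches the Asn312-Asn319 segment used for the superposition.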
The extensive hydrophobic interactions and hydrogen bonds observed in both the dagger- and R-shaped folds suggest that both SegA and SegB A315E fibrils are irreversible. To examine this hypothesis, we calculated modified atomic solvation energies (see Methods) to quantify the stabilities of the fibrils. The calculated stabilities of both the dagger- and R-shaped fold fibrils (represented by energy per layer and per residue) are comparable to those of other pathogenic amyloid fibrils, such as the Alzheimer’s and Pick’s disease fibrils (Figure 3). In contrast, the stability of FUS fibrils, thought to be reversible42, is distinctly lower. We performed thermostability assays to validate our energetic calculations. When heated to 75 °C for 30 minutes, fibrils formed by SegA, SegB, and SegB A315E are all stable, whereas fibrils formed by mCherry-FUS-LCD (composed of residues 1-214; identical to the sequence determined in the FUS ssNMR structure except with a His-tag replacing mCherry42) disappeared after heating to 60 °C (Supplementary Figure 1c). These results are consistent with our energetic calculations and support the idea that the dagger- and R-shaped fold fibrils are irreversible.
Several lines of evidence suggest that the dagger- and R-shaped folds can be adopted by longer TDP-43 segments such as the pathogenic TDP-CTF fragment. (1) Structural conservation. The core region (residues 320-334) of the dagger-shaped fold is structurally conserved in all four dagger-shaped folds despite differences in local environment (Supplementary Note 1). (2) Model building. The outward-facing disposition of the termini in these folds allows longer TDP-43 structures to be modeled from these cores without steric interference. (3) Cross-seeding. Fibrils formed by TDP-CTF can seed SegA and SegB A315E monomer (Supplementary Figure 1f), and SegA and SegB A315E fibrils can seed TDP-LCD monomer (Figure 4b). Because of its rapid aggregation, TDP-CTF monomer could not be used to measure seeding, and we instead used TDP-LCD monomer, as the LCD is required for TDP-43 aggregation32,35,36. (4) Mutagenesis experiments with TDP-CTF, reported here and previously32. We found that five individual point mutations to tryptophan, which sterically conflict with the tightly packed dagger-shaped (A324W, L330W, Q331W, M337W) and R-shaped (Q303W) folds, delay aggregation of TDP-CTF. Similarly, the mutations A324E and M337E, which are disruptive to the dagger-shaped fold, are found to inhibit aggregation of full-length TDP-43 in cells11. As negative controls, we found that A326W and G304W, located in a solvent-exposed environment of the dagger-shaped fold and a loose cavity of the R-shaped fold, respectively, can tolerate the bulky tryptophan and did not delay aggregation of TDP-CTF (Supplementary Figure 1d, Supplementary Figure 6a&b, Supplementary Figure 7 and Supplementary Note 5)32. Furthermore, a key stabilizing feature of SegB A315E, the Arg293-Glu315 salt bridge, is similarly important for TDP-CTF aggregation. We designed two double mutations, R293E/A315E and R293E/A315R. Based on our structure of SegB A315E, R293E/A315E would disrupt the R-shaped fold by electrostatic repulsion, whereas R293E/A315R would favor the R-shaped fold by restoring the salt-bridge. In accordance with our model, R293E/A315E reduces TDP-CTF aggregation whereas R293E/A315R enhances TDP-CTF aggregation (Supplementary Figure 1d). These results suggest that both the dagger- and the R-shaped folds are accessible to TDP-CTF molecules, so that disrupting either one blocks one pathway to fibril formation and hence delays aggregation of TDP-CTF. The observation that none of these mutations targeting the dagger- or R-shaped fold fully eliminates aggregation supports our hypothesis that TDP-CTF can form fibrils using one of multiple aggregation cores, including the dagger- or R-shaped fold, or possibly others not discovered here. In short, conservation of the dagger fold, modeling, seeding experiments, and mutational analysis all suggest that TDP-CTF can form the folds we observe.
We investigated whether the dagger- and R-shaped folds are relevant to TDP-43 fibrils in human disease, motivated by the irreversibility of fibrils formed by these folds and the likely accessibility of both folds to TDP-CTF. The scarcity of TDP-43 fibrils in autopsied brains of ALS and FTLD patients hinders structure determination of patient-derived TDP-43 fibrils, so we used seeding experiments to examine the possibility that our structures represent folds adopted by TDP-43 in disease. We seeded both SegA and SegB A315E monomer with sarkosyl-insoluble material from brain extracts of two patients with FTLD-TDP, which was previously reported to induce FTLD-TDP pathology in a mouse model43. We observe that the sarkosyl-insoluble material was able to seed SegA monomer (Supplementary Figure 1g). This result indicates that TDP-43 in these FTLD-TDP brain extracts contains a well-folded SegA region that can serve as a template for seeded aggregation, thereby providing plausibility for the existence of the dagger-shaped fold in these extracts. In contrast, we found that the same brain extracts cannot seed SegB A315E monomer (Supplementary Figure 1g), suggesting that the R-shaped fold may not exist in these extracts. This lack of seeding may mean that the R-shaped fold is specific for patients bearing the TDP-43 A315E mutation. Alternatively, the R-shaped fold may not exist in these particular FTLD-TDP patients but may exist in other TDP-43 diseases or disease subtypes such as ALS, especially given that the A315E mutation was discovered in the context of ALS12.
We thus propose a one-to-one correspondence between polymorph and disease subtype. More specifically, we speculate that the differences between SegA-sym and SegA-asym are so subtle that they may be associated with the same diseases, whereas the differences between the dagger- and R-shaped folds are significant enough to perhaps be associated with different diseases. This hypothesis is supported by two observations: (1) insoluble TDP-43 extracts from different sub-types of FTLD have distinct morphologies and biochemical properties44, and (2) fibrils of two peptides whose sequences closely match the dagger-shaped fold and the R-shaped fold (Supplementary Figure 6c) seed cell-expressed TDP-43 into distinct aggregates41.
Our finding of polymorphism of TDP-43 fibrils adds TDP-43 to the cohort of other polymorph-forming amyloid proteins14,28–30, and raises the question of why polymorphism is common in amyloid fibrils but not in globular proteins. We suggest that the polymorphism observed in irreversible amyloid fibrils arises from a lack of evolutionary pressure to fold into a particular structure that performs an adaptive function. In contrast, each globular protein and functional aggregate has evolved to fold into a single structure with lower free energy than any other structure that its sequence can form. That is, pathogenic amyloid fibrils lack survival advantage and so can adopt multiple conformations that represent different local energy minima in the protein folding landscape. However, a particular polymorph may give rise to a particular disease45, perhaps nucleated by a mutation such as TDP-43 A315E or a particular cellular environment as suggested for alpha-synuclein46. Polymorph-specific diseases are consistent with results on tau, where multiple patients with the same tauopathy are all found to have the same fibril polymorphs and patients with different tauopathies have different fibril polymorphs29,30,47.
Our structure of SegB A315E, the first reported cryo-EM fibril structure containing a hereditary mutation, offers a molecular explanation in terms of seeding for cell-to-cell spreading of pathology through the brain40. As explained above, our SegB A315E structure indicates that A315E facilitates TDP-43 aggregation through electrostatic attraction with Arg293, and suggests that A315T (if the threonine is phosphorylated, as speculated) can function in a similar way. As summarized in Figure 4a, wild-type TDP-43 monomer can form either the dagger-shaped or R-shaped fold (or possibly other folds), and both folds can act as seeds to recruit additional monomers into the fibril. In A315E, the inter-layer interaction motif of Arg293-Glu315 provides free positive and negative charges on both ends of any R-shaped fibril that consists of two or more layers (Figure 2h), and these free charges can attract additional monomers through long-range electrostatic interactions. Thus, the seeding potency of the R-shaped fold with the A315E mutation may exceed that of other folds. Our seeding experiments support this model by showing that SegB A315E fibrils are more effective in seeding TDP-LCD A315E monomer than SegA fibrils (Figure 4b & Supplementary Note 7).
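The free-charge argument can be made concrete with a toy bookkeeping model. The sketch below is an illustration under simplifying assumptions, not a description of the actual structure: it treats a single protofilament as a stack of layers and assumes that every Arg293 pairs with the Glu315 exactly `offset` layers above it (the text reports one or two layers). The function name and the `offset` parameter are hypothetical.

```python
def unpaired_charges(n_layers, offset=1):
    """Toy model of one protofilament with inter-layer Arg293-Glu315 pairing.

    Assumption: Arg293 in layer i pairs with Glu315 in layer i + offset.
    Returns (layers with a free Arg293, layers with a free Glu315).
    """
    paired_arg, paired_glu = set(), set()
    for layer in range(n_layers):
        partner = layer + offset
        if partner < n_layers:  # a partner layer exists above this one
            paired_arg.add(layer)
            paired_glu.add(partner)
    free_arg = [i for i in range(n_layers) if i not in paired_arg]
    free_glu = [i for i in range(n_layers) if i not in paired_glu]
    return free_arg, free_glu


# Five stacked layers, pairing one layer up: the top layer keeps a free
# positive charge and the bottom layer keeps a free negative charge.
free_arg, free_glu = unpaired_charges(5, offset=1)
print("free Arg293 (+) in layers:", free_arg)
print("free Glu315 (-) in layers:", free_glu)
```

In this simplified picture the interior layers are fully paired, while the two fibril ends each carry an uncompensated charge of opposite sign, which is the feature proposed to attract incoming monomers.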
Hereditary mutations have also been proposed to impart pathology by additional mechanisms (Supplementary Note 8), including shifting TDP-43 from a reversible to an irreversible assembly48,49. Although we have shown that both dagger- and R-shaped folds form irreversible fibrils, and others have shown that the C-terminal domain can form helical assemblies in liquid droplets49, we cannot rule out the possibility that under certain conditions the dagger- and R-shaped folds may also participate in the reversible aggregation reported for TDP-43. Energetic analysis suggests this may be possible for the R-shaped fold. The outer two chains in the SegB A315E structure differ noticeably from the inner two chains (Supplementary Note 3), having a lower stabilization energy (0.79 kcal/mol per residue) than the inner chains (0.94 kcal/mol per residue), close to the stabilization energy of the FUS structure (0.66 kcal/mol per residue), which undergoes reversible aggregation. Conceivably, in the native protein, in the absence of the constraint of the Arg293-Glu315 salt bridge and accompanied by certain conformational changes, the R-shaped fold may participate in a reversible fibril (Supplementary Note 9). That is, we speculate that the A315E mutation could impose a switch from a reversible to an irreversible fibril.
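For readers who want to see the comparison side by side, the short sketch below tabulates the three per-residue stabilization energies quoted above and computes the gaps between them. It uses only the values given in the text; the dictionary keys and the helper name are illustrative, and no attempt is made to reproduce the solvation energy calculation itself.

```python
# Per-residue stabilization energies (kcal/mol) as quoted in the text.
ENERGIES = {
    "SegB A315E inner chain": 0.94,
    "SegB A315E outer chain": 0.79,
    "FUS": 0.66,
}


def gap(a, b):
    """Absolute difference between two quoted energies, rounded to 2 decimals."""
    return round(abs(ENERGIES[a] - ENERGIES[b]), 2)


outer_vs_fus = gap("SegB A315E outer chain", "FUS")
outer_vs_inner = gap("SegB A315E outer chain", "SegB A315E inner chain")
inner_vs_fus = gap("SegB A315E inner chain", "FUS")

print("outer chain vs FUS:        ", outer_vs_fus, "kcal/mol per residue")
print("outer chain vs inner chain:", outer_vs_inner, "kcal/mol per residue")
print("inner chain vs FUS:        ", inner_vs_fus, "kcal/mol per residue")
```

On these numbers the outer chain sits between FUS and the inner chain, slightly nearer to FUS, which is the basis for suggesting that the outer chains may be the more readily reversible part of the R-shaped assembly.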
Our four near-atomic resolution structures of TDP-43 amyloid fibrils establish that: (1) TDP-43 is capable of forming multiple fibrillar structural polymorphs; (2) two sequence segments of TDP-43 can form distinct stable amyloid cores, one with dagger-shaped folds and the other with R-shaped folds; (3) the R293-E315 salt-bridge in the R-shaped fold provides a plausible explanation of enhanced TDP-43 pathology by the ALS-related hereditary mutation A315E (and possibly A315Tp); and (4) energetic analysis highlights the structural features of amyloid fibrils that may lead to both reversible and irreversible aggregation.
Methods
Methods and materials used in this study are available in supplementary information.
Author contributions
Q.C. and D.R.B. designed experiments and performed data analysis. Q.C. expressed and purified constructs, and performed biochemical experiments. Q.C. and D.R.B. prepared cryo-EM samples, and performed cryo-EM data collection and processing. P.G. assisted in cryo-EM data collection and processing. M.R.S. performed solvation energy calculation. All authors analyzed the results and wrote the manuscript. D.S.E. supervised and guided the project.
Competing interests
D.S.E. is an advisor and equity shareholder in ADRx, Inc.
Materials & Correspondence
Requests for materials reported in this study should be addressed to David S. Eisenberg.
Data and materials availability
All structural data have been deposited into the Worldwide Protein Data Bank (wwPDB) and the Electron Microscopy Data Bank (EMDB) with the following accession codes: SegA-sym (PDB 6N37, EMD-9339), SegA-asym (PDB 6N3B, EMD-9350), SegA-slow (PDB 6N3A, EMD-9349), SegB A315E (PDB 6N3C, EMD-0334). All other data, including the custom software used for solvation energy calculation, are available from the authors upon reasonable request.
Acknowledgments
We thank H. Zhou for use of Electron Imaging Center for Nanomachines (EICN) resources. We acknowledge the use of instruments at the EICN supported by UCLA and by instrumentation grants from NIH (1S10RR23057 and 1U24GM116792) and NSF (DBI-1338135 and DMR-1548924). The authors acknowledge NIH AG 054022 and DOE DE-FC02-02ER63421 for support. D.R.B. was supported by the National Science Foundation Graduate Research Fellowship Program.