Summary
In eukaryotes, the essential process of cellular respiration takes place in the cristae of mitochondria. The protein Mic60 is known to stabilize crista junctions; however, how the C-terminal Mitofilin domain of Mic60 mediates cristae-supported respiration remains elusive. Here, we used ancestral sequence reconstruction to generate Mitofilin ancestors up to and including the last opisthokont common ancestor (LOCA). We found that yeast-lineage derived Mitofilin ancestors as far back as the LOCA rescue respiration. By comparing Mitofilin ancestors with different respiratory phenotypes, we identify four residues that explain the difference between respiration-competent yeast-derived and non-functional animal-derived common Mitofilin ancestors. Our results imply that Mitofilin-supported respiration in yeast stems from a conserved mechanism, and provide a foundation for investigating the divergence of candidate crista junction interactions present during the emergence of eukaryotes.
Introduction
Mitochondria are double membrane-defined eukaryotic organelles with essential metabolic and bioenergetic functions1. Cellular respiration, the process where adenosine triphosphate (ATP) is generated through oxidative phosphorylation, takes place in cristae, the characteristic invaginations of the inner mitochondrial membrane2–5. Within the inner mitochondrial membrane, a neck-like membrane structure called the crista junction connects cristae to the inner boundary membrane and is an important site for transit of substrates essential for respiration and other physiological functions6–8.
Crista junction organization and stability are mediated by the mitochondrial contact site and crista organizing system (MICOS), a heterooligomeric inner mitochondrial membrane protein complex9–11. MICOS consists of two subcomplexes that assemble independently and have distinct roles in crista junction formation and stability12–14. In yeast, subcomplex I comprises Mic60 and Mic19 (and Mic25 in animals) and subcomplex II consists of Mic10, Mic26, Mic27, and Mic12 (Mic13 in animals), which connects the two subcomplexes13,15.
Mic60 is the core shape-determining protein component of subcomplex I. It is conserved across eukaryotes, and all Mic60 proteins share a similar domain architecture with an N-terminal transmembrane helix followed by a large C-terminal region facing the intermembrane space16. Mic60 localizes at crista junctions, and together with Mic19, closely tethers the crista junction membrane to the outer mitochondrial membrane via interactions with the sorting and assembly machinery for β-barrel proteins (i.e., SAM) and other membrane proteins11,17–20. Moreover, Mic60 is also an ancient MICOS component, with homologues identified in alphaproteobacteria, the extant cousins of a hypothesized endosymbiosed mitochondrial precursor21. Alphaproteobacterial Mic60 is found in intracytoplasmic membranes, which are functionally and morphologically analogous to mitochondrial cristae16,22. Bacterial and eukaryotic Mic60 share several functional and biophysical similarities, such as the ability to deform membranes16,23.
Downregulation of Mic60 causes the loss of crista junctions11,17,24 and respiratory growth defects25–28, and is a hallmark of several neurodegenerative disorders such as Parkinson’s disease, Alzheimer’s disease, and various myopathies29–31. It has been established that Mic60’s most conserved region, its C-terminal Mitofilin domain, is important for respiration26–28 and interacts with Mic19, Mic10, and Mic13 (refs. 32–34). Yet, how Mic60’s Mitofilin sequence encodes its crista junction-supported functions, in particular cellular respiration, remains unclear.
We used ancestral sequence reconstruction and respiratory growth assays to identify residues responsible for Mitofilin-supported yeast respiration35. In contrast to a previous report using a plasmid-based system, we observe that substitution of the yeast Mitofilin domain with human or alphaproteobacterial homologues at the endogenous locus does not rescue respiration23. To explore when Mic60’s Mitofilin-supported respiration diverged, we reconstructed ancestral Mitofilin sequences and resurrected them for in vivo characterization. We found that common Mitofilin ancestors as far back as a putative last opisthokont common ancestor (LOCA) can rescue mitochondrial respiration in yeast. Comparing reconstructed ancestors with different rescue phenotypes, we identify four residues in the center of the Mitofilin fold that are involved in mitochondrial respiration. This finding, obtained from ancestral sequence reconstructions spanning an estimated one billion years of evolution36, paves the way for molecular studies investigating the timing and sequence determinants of endosymbiotic interactions that drove eukaryogenesis.
Results
Re-defined domain boundaries for Mic60’s Mitofilin domain
Despite extensive study, the definitions of Mic60’s C-terminal Mitofilin domain boundary vary. While it has been established that the Mitofilin domain extends to the C terminus, multiple N-terminal boundaries have been proposed10,26,34,37,38. We superimposed AlphaFold2 predictions of Mic60 from H. sapiens, S. cerevisiae and the alphaproteobacterium C. sphaeroides and observed high structural conservation for a C-terminal region ranging from residues 318-540 (S. cerevisiae numbering; Figure 1A-C, Figure S1 for prediction confidence). This region extends beyond all previously assigned domain boundaries for the Mitofilin domain by an additional predicted 4-helix bundle in the N-terminal direction and includes the two previously proposed lipid binding sites (LBS 1 and LBS 2)10,34,37 (Figure 1A). This newly defined Mitofilin domain is predicted to comprise two 4-helical bundles and an additional α helix α5, which contains most residues of the second previously assigned lipid binding site (LBS 2) and protrudes from the outer 4-helix bundle (Figure 1B, C). Residues of the other lipid binding site (LBS 1) are within the fourth α helix (α4) of the outer 4-helix bundle. Helix α8 spans both helical bundles. Based on these structural alignments, we redefined the Mitofilin domain boundaries, and used this revised Mitofilin definition (residues 318-540) in all subsequent experiments.
Alphaproteobacterial and human Mitofilin do not rescue cellular respiration in yeast
To test for Mic60 function, we applied an established respiratory growth assay in S. cerevisiae (Figure 2A). In this assay, oxidative phosphorylation-dependent respiratory growth was monitored using glycerol and ethanol (YPEG media; see Methods) as non-fermentable carbon sources. Only S. cerevisiae strains capable of cellular respiration can grow on YPEG. To control for mitochondria-independent growth deficits, we grew parallel cultures of all strains in dextrose (YPD media), a fermentable carbon source (Figure S2, Figure S3 for mitochondrial genome sequencing). We used CRISPR-Cas9 to generate Mic60 mutations at the endogenous locus in the genome of the S288C strain of S. cerevisiae. Consistent with previous work using derivatives of BY4741 S. cerevisiae26,28,37, we observe that knocking out Mic60 in S. cerevisiae (mic60Δ) leads to poor growth compared to wildtype under non-fermentable conditions (Figure 2A). Similarly, poor respiratory growth has been previously reported for C-terminally truncated versions of Mic60 (Mic60 residues 1-140, 1-472, and 1-491 all show a reduced growth phenotype in S. cerevisiae)26,28,37. As expected, our truncation based on a more generous Mitofilin domain definition (Mic60 residues 1-317; hereafter referred to as mitofilinΔ) also showed a reduced growth phenotype similar to mic60Δ (Figure 2A).
Given that the structural conservation of the Mitofilin domain extends beyond all major eukaryotic groups into prokaryotic alphaproteobacteria38, we next asked whether Mitofilin-supported cellular respiration is similarly well conserved. We generated S288C strains where the S. cerevisiae Mitofilin sequence (residues 318-540) was swapped with that of Mitofilin homologues from different species at the endogenous locus. In contrast to previous reports39, we found that a chimeric yeast strain encoding the alphaproteobacterial C. sphaeroides Mitofilin domain (residues 270-433; mitofilinΔ::Mitofilin(C. sphaeroides)) shows a similar growth deficit to the mic60Δ and mitofilinΔ strains on non-fermentable media (Figure 2A). Similarly, the yeast strain encoding the human Mitofilin chimera (residues 562-758; mitofilinΔ::Mitofilin(H. sapiens)) displays a respiratory growth deficit like mic60Δ (Figure 2A). Western blots of isolated mitochondrial fractions confirm the mitochondrial localization of mitofilinΔ and the chimeric H. sapiens Mitofilin and C. sphaeroides Mitofilin yeast strains (Figure S4). It is worth noting that a previous complementation study using plasmid-based expression observed a partial rescue of S. cerevisiae Mic60 function using the full-length C. sphaeroides Mic60 sequence39. In our work, we used CRISPR-Cas9 to generate chimeras of Mic60 at the endogenous locus of the S. cerevisiae genome.
K. phaffi Mitofilin rescues respiratory growth in S. cerevisiae
We noticed in alignments of eukaryotic Mic60 sequences that the S. cerevisiae Mitofilin domain contains a loop region (residues 364-406) between α helices α2 and α3, which is unique to the Saccharomyceta clade (Figure 2B). To test whether this loop is important for yeast Mitofilin respiratory function, we generated a S. cerevisiae strain that expresses a chimeric S. cerevisiae Mic60 with the Mitofilin domain of K. phaffi (residues 318-509; mitofilinΔ::Mitofilin(K. phaffi)), which lacks the unique loop region. We found that the Mic60 chimera with the K. phaffi Mitofilin domain localizes to mitochondria and that the corresponding strain grows like wildtype S. cerevisiae under non-fermentable conditions (Figure 2A, Figure S4). This indicates that despite sharing only 40% sequence identity with S. cerevisiae Mitofilin (Figure 3B), the K. phaffi Mitofilin supports wildtype-like cellular respiration, and that the Saccharomyceta-specific loop region is not required for Mitofilin-supported respiratory growth in S. cerevisiae.
Ancestral sequence reconstruction of the last common opisthokont Mitofilin ancestors
Our finding that K. phaffi Mitofilin rescues respiratory growth in S. cerevisiae suggests the Mitofilin domain’s role in cellular respiration may be an ancient function, well-conserved in yeast. To test this hypothesis and to determine an evolutionary time point when yeast Mitofilin respiratory function diverged in opisthokonts, we used ancestral sequence reconstruction35 to infer the Mitofilin sequences of opisthokont ancestors, then tested the sufficiency for these ancestral Mic60 Mitofilin chimeras to support yeast respiratory growth. We began by collecting Mic60 sequences from all eukaryotic clades, optimizing for phylogenetic coverage of Ascomycota fungi, and truncated the sequences to only contain the C-terminal Mitofilin domain as defined above. Following multiple sequence alignments, we applied a stringent amino acid substitution model to generate Mitofilin-based molecular phylogenies (Figure 3A, Figure S5). Briefly, substitution models were sorted by Bayesian information criteria, and evaluated for their ability to achieve largely congruent evolutionary relationships between fungi (Figure S7). We found that the Q.yeast model40,41 best modeled this tree, with +I and +G4 decorators chosen to approximate the evolutionary rate42. A yeast sequence-focused tree containing 121 sequences was then used to reconstruct ancestors along the fungal lineage from S. cerevisiae to the last common ancestors of Saccharomycotina (yLSCA), Ascomycota (yLACA), Fungi (yLFCA) and Opisthokonta (yLOCA) (Figure 3A-D). Reconstruction confidence is measured by the mean posterior probability for a reconstructed ancestor. We reconstructed 174-179 residues for yLSCA, yLACA, yLFCA and yLOCA Mitofilin with a mean posterior probability ranging from 72% for yLOCA to 82% for yLACA and yLSCA (Figure 3C, Figures S7, S8).
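The confidence measure described above, the mean posterior probability of a reconstructed ancestor, can be illustrated with a minimal sketch. It assumes that each alignment site of one ancestral node is available as a mapping from amino acid to posterior probability; the function name `reconstruct` and the three-site `toy_node` are hypothetical and are not taken from the study's data or code.

```python
def reconstruct(site_posteriors):
    """Return (most probable sequence, mean posterior probability) for one node.

    site_posteriors: list of dicts, one per site, mapping residue -> posterior.
    """
    if not site_posteriors:
        raise ValueError("need at least one site")
    residues = []
    total = 0.0
    for probs in site_posteriors:
        # Pick the residue with the highest posterior at this site.
        state, p = max(probs.items(), key=lambda kv: kv[1])
        residues.append(state)
        total += p
    return "".join(residues), total / len(site_posteriors)


# Hypothetical three-site node, for illustration only (not study data).
toy_node = [
    {"L": 0.90, "I": 0.10},
    {"G": 0.60, "S": 0.40},
    {"R": 0.75, "K": 0.25},
]
sequence, mean_pp = reconstruct(toy_node)
print(sequence, round(mean_pp, 2))
```

Averaging the per-site maxima gives a single number per ancestor, which is how a value such as 72% for yLOCA should be read: it summarizes the whole domain and says nothing about how confident any individual site is.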
We next reconstructed a tree with good phylogenetic coverage for animals to investigate the conservation of yeast respiratory function further down the lineage to the human Mitofilin domain (Figure 3A, Figures S6, S7). This phylogenetic tree focused on human lineage-derived reconstructed Mitofilin domains for the last vertebrate (hLVCA), last Bilateria non-arthropod (hLBnaCA), and last opisthokont common ancestor (hLOCA) (see Methods for details). The mean posterior probability for hLVCA, hLBnaCA, and hLOCA ranged from 75-88% (Figure 3C, Figure S8). The yLOCA and hLOCA Mitofilin sequences share 64% sequence identity (Figure 3B).
Yeast-lineage derived Mitofilin LOCA rescues respiratory growth
To evaluate the functional conservation of the ancestor Mitofilin sequences, we generated yeast strains encoding Mitofilin chimeras of Mic60, where we exchanged the wildtype Mitofilin domain with the reconstructed ancestral Mitofilin domains. All the reconstructed yeast-derived ancestors localize to the mitochondria, as shown by immunoblots of mitochondrial fractions (Figure S4). Notably, all yeast ancestors (yLSCA, yLACA, yLFCA and yLOCA) grow like wildtype in respiratory media (Figure 3D, Figure S9). To better quantify yeast respiratory growth over time, we cultured yeast strains in liquid medium (50 ml YPEG) under non-fermentable conditions and assayed cell density by spectrophotometrically measuring the OD600 at defined time points. We observe two growth regimes: while growth rates do not significantly differ within the first 24 hours of growth, during mid-log phase, all yeast ancestors and K. phaffi Mitofilin grow like wildtype, whereas mic60Δ and mitofilinΔ show a significantly reduced growth rate, which is consistent with colony number and thickness in the solid growth assays (Figure 3D, Figure 2 for K. phaffi). Our findings suggest that Mitofilin’s respiratory growth function in yeast is conserved all the way to yLOCA, covering over one billion years of evolution43.
Whole genome sequencing confirms the correct genomic insertion of the human-lineage derived ancestor sequences in the yeast strain; however, we do not observe stable expression for the human-derived hLBnaCA and hLOCA ancestors in immunoblots (Figure S4). Accordingly, the human-lineage derived ancestors show deficient respiratory growth like mic60Δ (Figure 3D, Figure S9). While the hLVCA Mitofilin ancestor expresses and localizes to mitochondria (Figure S4), it does not rescue respiratory growth (Figure S9).
Structural comparison of ancestors
Ancestral sequence reconstruction infers the amino acid sequences of putative ancestors, which we compared to identify residue signatures that correlate with function. We first generated structure predictions of Mitofilin ancestors using our reconstructed sequences as input for AlphaFold2. As expected based on sequence identity, all predicted structures adopt the same overall fold (Figure 3E). To identify potential amino acids responsible for functional differences, we analyzed multiple sequence alignments of extant and resurrected Mitofilin sequences for similarities in functional constructs in yeast (wildtype, K. phaffi and all yeast-lineage derived ancestral sequences), and differences compared with constructs that exhibited a mic60Δ-like growth phenotype (H. sapiens and all human-lineage derived ancestors) or constructs that did not stably express (hLOCA, hLBnaCA). We refer to functional constructs as all extant and reconstructed sequences that show wildtype-like respiratory growth and refer to non-functional constructs as sequences that exhibit a mic60Δ-like respiratory growth curve or sequences that do not show stable expression in yeast (Figure 3D, Figures S4, S9).
We identified four residues in the Mitofilin domain, which are either identical or retain similar chemical properties (charge, hydrophobicity, size) in all functional constructs, but differ in the non-functional constructs (Figure 4A, B, Figure S10). Intriguingly, when mapped onto the predicted structure, these four residues are located in a line that runs across the center of the Mitofilin domain, perpendicular to the helical bundle axis (Figure 4B). Two of these residues, Glycine 421 (S. cerevisiae nomenclature) and Leucine 425, are positioned on the loop between helices α3 and α4, which connects the predicted inner and outer 4-helix bundles of the Mitofilin domain (Figure 4B). Glycine 421 changes from a small (Glycine) or hydrophilic (Serine/Asparagine) residue in all functional sequences into a charged residue (Glutamate) in all human-derived ancestors. Notably, Glycine 421 corresponds to a Threonine (Threonine 622) in H. sapiens. Leucine 425 is a hydrophobic residue (Leucine/Alanine) in all yeast constructs, but a polar amino acid (Serine/Tyrosine) in all non-functional sequences. Aspartate 497 on α7 on the outer 4-helix bundle is a negatively charged or small polar residue (Aspartate/Glutamate/Serine) in functional constructs. In non-functional constructs, Aspartate 497 is a larger polar or a hydrophobic amino acid (Asparagine/Glutamine/Leucine). The fourth residue that differs between functional and non-functional constructs, Arginine 522, is located on helix α8 and exists as a positively charged residue in functional sequences (Arginine/Lysine), and a hydrophobic residue (Alanine/Leucine/Methionine) in non-functional constructs.
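The comparison described here, looking for alignment positions that separate functional from non-functional constructs, can be sketched as a simple column scan. This is a simplified illustration, not the analysis performed in the study: it flags columns where the two groups share no residue at all, whereas the text above also weighs chemical similarity. The function name and the toy aligned sequences are hypothetical.

```python
def discriminating_columns(functional, nonfunctional):
    """Return 0-based alignment columns whose residue sets do not overlap
    between the functional and the non-functional group."""
    if not functional or not nonfunctional:
        raise ValueError("both groups need at least one sequence")
    lengths = {len(s) for s in functional + nonfunctional}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    hits = []
    for i in range(lengths.pop()):
        in_functional = {s[i] for s in functional}
        in_nonfunctional = {s[i] for s in nonfunctional}
        # Skip gapped columns: a gap is not evidence for a residue difference.
        if "-" in in_functional or "-" in in_nonfunctional:
            continue
        if in_functional.isdisjoint(in_nonfunctional):
            hits.append(i)
    return hits


# Hypothetical toy alignment, for illustration only (not study sequences).
functional = ["AGLDRT", "ASLEKV", "AGADRT"]
nonfunctional = ["AESNAT", "AEYQLV"]
columns = discriminating_columns(functional, nonfunctional)
print(columns)
```

In the toy input, the first column is invariant and the last varies identically in both groups, so neither is reported. A scan like this only nominates candidates; whether a flagged position matters for function still has to be tested experimentally.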
Discussion
Respiration is an essential function supported by mitochondrial cristae. While the essentiality of Mic60 for stable crista junctions is established26,28,37, understanding of conserved sequence and structural determinants for cristae-supported respiration has been lacking. Here we used ancestral sequence reconstruction to infer putative Mic60 Mitofilin sequences along the fungal lineage from S. cerevisiae to the last opisthokont common ancestor (LOCA). We show that Mic60 Mitofilin ancestors as far back as the last opisthokont common ancestor (yLOCA) rescue yeast mitochondrial respiration. These results indicate that Mitofilin-supported respiration is based on a well-conserved mechanism, spanning approximately one billion years of evolutionary history. This observation is striking, given the high rate of Mitofilin evolution in fungi, as indicated by the number of residue substitutions in the yeast phylogenetic tree compared to that of animals. An implication of this finding is that respiration-supporting interactions between the Mitofilin domain and partner proteins may also be similarly well-conserved.
Previously, Tarasenko et al. reported a “slight” rescue of yeast respiratory growth by plasmid-expressed and mitochondrially targeted wildtype C-terminally FLAG-tagged full-length alphaproteobacterial Mic6039. In contrast, when introducing the Mitofilin domain of the same alphaproteobacterial species as a yeast chimera at the endogenous locus with CRISPR-Cas9, we observe a definitive mic60Δ-like respiratory growth deficit. Since Mic60 function is coupled to its expression levels29,31, we opted for genome editing by CRISPR-Cas9 at the endogenous locus, as opposed to plasmid-based expression to avoid plasmid-specific protein expression effects. Furthermore, despite confirmation of correct insertion by whole-genome sequencing, two yeast Mic60 chimeras with human-lineage derived Mitofilin ancestors do not show protein expression (Figure S4). Accordingly, these Mic60 chimeras also display a mic60Δ-like respiratory growth phenotype (Figure 3D, Figure S9). Since the expressed and non-expressed yeast Mic60 chimeras all contain the N-terminal mitochondrial targeting sequence and only differ in the C-terminal Mitofilin domain (Figure 1A), these observations suggest that additional, yet to be defined, mechanisms of proteostasis are at play.
Being a probabilistic approach, ancestral sequence reconstruction includes an inherent degree of uncertainty, even with the most robust reconstructions. By correlating respiratory growth profiles with the corresponding amino acid sequences, we were able to identify just four residues that explain the difference in Mitofilin function for yeast- and animal-derived ancestors. This is an advance over previous structure/function dissections of the Mitofilin domain that mainly used various truncation constructs26,28,37. These residues differ in chemical properties between functional and non-functional constructs. Interestingly, these residues are in close proximity to one another, within a central axis of the Mitofilin domain, perpendicular to the two distinctive 4-helix bundles. Given Mic60’s functions, candidate roles for these residues may include domain stability, self-assembly, or protein-protein interactions. While it has been observed that Mic60’s Mitofilin domain interacts with several proteins, knowledge of binding interfaces is sparse32–34. By identifying single candidate residues, this work lays the foundation for future work investigating mechanistic details of Mitofilin-supported respiration.
Limitations of the study
This study identifies conserved sequence elements in Mic60’s Mitofilin domain that are important for yeast respiration. This work does not explore whether these residues complement other organelle functions, nor does it investigate the role of these residues in animal cells. This work focused on the C-terminal Mitofilin domain of Mic60 due to its extreme conservation across eukaryotes, which is requisite for the ancestral sequence reconstruction approach. The roles of other portions of Mic60 and of other components of MICOS were not investigated. These interesting questions provide promising avenues for future investigation.
Author Contributions
F.M.C.B., T.A.B., T.H.N., D.S. and L.H.C. conceived of the study and designed the experiments. F.M.C.B., T.A.B., T.H.N. and D.S. generated the yeast strains. F.M.C.B. and T.H.N. performed growth assays and mitochondrial localization experiments. T.A.B., T.H.N., L.B.C. and C.J.B.d.C. performed ancestral sequence reconstruction. F.M.C.B., T.A.B., T.H.N., C.J.B.d.C., and L.H.C. analyzed the data. F.M.C.B. and T.A.B. created figures and tables, and F.M.C.B. and L.H.C. wrote the manuscript with contributions from T.A.B. and T.H.N. All authors approved the final version of the manuscript.
Declaration of Interests
The authors declare no competing interests.
STAR★Methods
Key resources table
Resource availability
Lead contact
Further information and requests for resources and reagents should be directed to and will be fulfilled by the lead contact, Luke H. Chao (chao@molbio.mgh.harvard.edu).
Materials availability
Plasmids and yeast strains generated in this study are available from the lead contact upon request.
Data and code availability
Whole genome sequencing data and original western blot images will be made publicly available upon publication.
All original code has been deposited at GitHub and is publicly available as of the date of publication. DOIs are listed in the key resources table.
Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.
Method details
Yeast Strains and Growth
All S. cerevisiae strains used in this work are derivatives of the S288c DBY12045 strain46 (kindly gifted by L. Stirling Churchman). A complete list of all yeast strains generated and used is included as Table S2.
For spot assays, yeast strains were first grown in YPD (1% yeast extract, 2% peptone, 2% glucose) or YPEG (1% yeast extract, 2% peptone, 2% ethanol, 3% glycerol) media at 30°C to an OD600 of 0.2 to 0.3. Cells were pelleted by centrifugation at 6,000 x g for 2 min and resuspended in water to a final OD600 of 0.1. Each sample was 5-fold serially diluted in water to obtain cells with an OD600 value ranging from 0.1 to 6.4 x 10^-6, then transferred to YPEG or YPD agar plates using a multichannel pipette. Plates were incubated at 30°C, 25°C, or 18°C and imaged at 24-hour intervals for 7 days.
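The dilution series implied by these numbers can be checked with a few lines of arithmetic. The sketch below assumes seven spots per strain; that count is inferred from the stated endpoints (0.1 divided by 5 six times gives 6.4 x 10^-6) and is not stated explicitly in the text. The function name is hypothetical.

```python
def serial_dilution(start_od, fold, n_spots):
    """OD600 of each spot in a serial dilution, starting with the undiluted sample."""
    if fold <= 1 or n_spots < 1:
        raise ValueError("fold must exceed 1 and n_spots must be positive")
    return [start_od / fold ** i for i in range(n_spots)]


# Assumption: seven spots, inferred from the 0.1 and 6.4 x 10^-6 endpoints.
series = serial_dilution(start_od=0.1, fold=5, n_spots=7)
for od in series:
    print(f"{od:.2e}")
```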
For liquid growth assays, yeast strains were first grown in YPD at 30°C to an OD600 of 11. Following centrifugation at 6,000 x g for 2 minutes, cell pellets were resuspended in 50 ml YPEG to a final OD600 of 0.4 and grown at 30°C for 72 hours. 3 technical and 3 biological replicates (N=9) were grown for each strain. Measurements were taken at defined time points and analyzed using the matplotlib57 and pandas58 libraries in Python.
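One common way to summarize OD600 time courses like these is the specific growth rate, taken as the slope of ln(OD600) against time within a chosen window. The sketch below shows that calculation only as an illustration; it is not the authors' analysis script, it uses the Python standard library rather than pandas and matplotlib, and the readings are synthetic values constructed to double every 8 hours, not measurements from this study.

```python
import math


def growth_rate(hours, od600):
    """Least-squares slope of ln(OD600) versus time, in units of 1/hour."""
    if len(hours) != len(od600) or len(hours) < 2:
        raise ValueError("need at least two matched time points")
    log_od = [math.log(v) for v in od600]
    n = len(hours)
    mean_t = sum(hours) / n
    mean_y = sum(log_od) / n
    sxx = sum((t - mean_t) ** 2 for t in hours)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(hours, log_od))
    return sxy / sxx


def doubling_time(rate):
    """Doubling time in hours for a specific growth rate in 1/hour."""
    return math.log(2) / rate


# Synthetic readings (NOT study data): a culture that doubles every 8 hours.
hours = [24, 32, 40, 48]
od600 = [0.4 * 2 ** ((t - 24) / 8) for t in hours]
rate = growth_rate(hours, od600)
print(f"rate = {rate:.4f} per hour, doubling time = {doubling_time(rate):.1f} h")
```

Fitting on the log scale within one window keeps early and mid-log growth separate, which is relevant when strains only diverge after the first 24 hours.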
CRISPR/Cas9 Endogenous Genome Editing
Mic60 knockout (mic60Δ), Mitofilin knockout (mitofilinΔ) and chimeric yeast Mic60 strains expressing Mitofilin domains of other species or reconstructed ancestral Mitofilin domains were generated by CRISPR-Cas9 using the method by Levi and Arava59. Synthesized oligonucleotides (Integrated DNA Technologies) encoding the desired sgRNA sequence were cloned into the Cas9-encoding bRA66 plasmid (Addgene, 100952). Yeast strains were co-transformed with 500 ng of modified bRA66 plasmid (encoding Cas9 and sgRNA) and 5 µg of synthesized donor DNA (GenScript) using a lithium acetate/PEG-3350/heat-shock protocol described by Shaw et al.60. All plasmids, sgRNAs and donor DNAs are listed in Table S3. Gene modifications were verified by whole genome sequencing.
Whole Genome Sequencing Analysis
Genomic DNA from each yeast strain was extracted using the MasterPure Yeast DNA Purification Kit (Biosearch Technologies) and sequenced with HiSeq 50 nt paired-end reads (Illumina). Reads were aligned against the S. cerevisiae S288C genome (GenBank GCF_000146045.2) using bwa56 and samtools/bcftools55. For chimeric constructs, custom reference genomes were produced by manually editing the S. cerevisiae S288C genome in FASTA format. Aligned reads were visualized using Integrated Genome Browser49. Chromosomal coverage was assessed with custom Python scripts, available in an open-source repository (https://github.com/tribell4310/wgs_pipe).
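As an illustration of what a chromosomal coverage check can look like, the sketch below averages read depth per chromosome from the three-column, tab-separated output of `samtools depth` (chromosome, position, depth). It is not the code in the repository cited above; the function name and the toy lines are hypothetical. Note that `samtools depth` omits zero-coverage positions unless run with `-a`, so the mean here is taken over reported positions only.

```python
from collections import defaultdict


def mean_depth_per_chromosome(lines):
    """Mean depth per chromosome from `samtools depth`-style lines.

    Each line is expected to be: chromosome <TAB> position <TAB> depth.
    """
    totals = defaultdict(int)
    counts = defaultdict(int)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        chrom, _position, depth = line.split("\t")
        totals[chrom] += int(depth)
        counts[chrom] += 1
    return {chrom: totals[chrom] / counts[chrom] for chrom in totals}


# Hypothetical toy input, for illustration only (not study data).
toy_lines = ["chrI\t1\t10", "chrI\t2\t20", "chrII\t1\t30"]
coverage = mean_depth_per_chromosome(toy_lines)
print(coverage)
```

With a real file, the lines could be streamed from an open file handle instead of a list, since the function only iterates over them once.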
Immunoblot Analysis of Protein Localization
Expression and mitochondrial localization of Mic60 for all yeast strains were evaluated through antibody detection by immunoblotting of isolated mitochondria. Yeast strains were prepared and grown as described above for growth assays in liquid YPEG for 28 hours. After centrifugation at 1,500 x g for 5 minutes, mitochondria were isolated using the Yeast Mitochondria Isolation Kit (Abcam). Spheroplasts were homogenized by passing them 12 times through a 25G needle. For immunoblot analysis, the following antibodies were used: anti-Mic60 (custom-made, Cusabio), anti-MTCO1 as a mitochondrial marker (ab110270, Abcam) and anti-PGK1 as a cytosolic marker (ab113687, Abcam).
Ancestral Sequence Reconstruction
Ancestral reconstruction was carried out following the method of Prinston et al.61. Briefly, Mic60 sequences were curated from the InterPro database62 (IPR019133) as well as through searches of NCBI GenBank63 and UniProt64 using BLAST65 and HMMER66 algorithms. Searches returned >5,000 sequences, which were aligned with MAFFT50. Sequences with large insertions or deletions and sequences with greater than 96% pairwise sequence identity were eliminated at this stage. This set of sequences (Table S4) was then reduced to 69 (yeast-focused tree) or 62 (animal-focused tree) sequences that effectively represented the full phylogenetic diversity of the evolutionary space using custom Python scripts, available in an open-source repository (https://github.com/tribell4310/phylogenetics). Seven amoebozoan and other non-opisthokont eukaryotic sequences were then added to the alignments for outgroup rooting, and the sequences were re-aligned using PRANK52. The best-fit substitution model (Q.yeast40,41 with allowed invariable sites (+I) and a discrete Gamma model of site heterogeneity67 with four rate categories (+G4) for the yeast-focused tree; LG68 with a four-class free-rate model42,69 of site heterogeneity (+R4) for the animal-focused tree) was determined using ModelFinder70 according to the Bayesian information criterion. Phylogeny construction and ancestral reconstruction were performed in IQ-TREE53 using default settings. Branch supports were inferred using SH-like approximate likelihood ratio tests71 in IQ-TREE. Tanglegram comparison of phylogenies with the Open Tree of Life72 was performed using the rotl package73, as well as custom R scripts made available in the above-referenced open-source repository. All alignment positions were reconstructed without regard to relatively unpopulated gap regions.
Gaps were manually removed from the reconstructed sequences by comparing the reconstructions to their nearest-neighbor extant sequences using a parsimony approach similar in principle to Fitch’s algorithm74.
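The redundancy filter mentioned above (removing sequences with greater than 96% pairwise identity) can be sketched as a greedy pass over an alignment. This is an illustration under stated assumptions, not the authors' script: identity is computed here over columns where neither sequence has a gap, the first sequence of any near-identical pair is the one kept, and the function names and toy sequences are hypothetical.

```python
def pairwise_identity(a, b):
    """Fraction of identical residues over columns where neither sequence is gapped."""
    if len(a) != len(b):
        raise ValueError("sequences must come from the same alignment")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue
        compared += 1
        matches += x == y
    return matches / compared if compared else 0.0


def filter_redundant(aligned, threshold=0.96):
    """Greedily keep sequences whose identity to every kept sequence is <= threshold."""
    kept = {}
    for name, seq in aligned.items():
        if all(pairwise_identity(seq, other) <= threshold for other in kept.values()):
            kept[name] = seq
    return kept


# Hypothetical toy alignment, for illustration only (not study sequences).
toy_alignment = {
    "seq_a": "MKTLLVAGGA",
    "seq_b": "MKTLLVAGGA",  # exact duplicate of seq_a
    "seq_c": "MKTLIVAGGA",  # one substitution: 90% identity to seq_a
    "seq_d": "MK-LLVAGGA",  # identical to seq_a outside the gap
}
kept = filter_redundant(toy_alignment)
print(list(kept))
```

A greedy pass depends on input order, so sorting the input (for example, placing preferred reference species first) determines which member of a redundant pair survives.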
Structure Prediction and Analysis
Model predictions for all proteins shown were generated by AlphaFold2 (ref. 47) using ColabFold v. 1.5.5 (ref. 48). All sequences used for protein predictions are listed in Table S1. Visual representations of protein structures were created and analyzed with UCSF ChimeraX (Resource for Biocomputing, Visualization, and Informatics, University of California, San Francisco)54 and PyMOL (The PyMOL Molecular Graphics System, Version 2.5 Schrödinger, LLC). Protein folds were compared by superposition using the matchmaker75 command in ChimeraX. Sequence conservation was mapped onto the structures using the ConSurf webserver44,45.
Supplemental Information
See Supplemental Information file. Tables S1-S4.
Acknowledgments
We thank L. Stirling Churchman for gifting us the S288c DBY12045 yeast strain; Ulandt Kim and the NextGen Sequencing Core at MGH for their outstanding support; Andrew J. Roger and Sergio A. Muñoz-Gómez for helpful discussions and advice. We acknowledge support from the Swiss National Science Foundation grant P180777 (F.M.C.B.), the Keystone Future of Science Fund (F.M.C.B), the Helen Hay Whitney Foundation (T.A.B.), Natural Sciences and Engineering Research Council of Canada grant RGPIN-2016-04801 (C.J.B.d.C.), Canada Foundation for Innovation grant 34475 (C.J.B.d.C.), Canadian Institutes of Health Research grant 377068 (C.J.B.d.C.), New Frontiers in Research Fund-Exploration grant NFRFE-2018-00064 (C.J.B.d.C.), the Moore–Simons Project on the Origin of the Eukaryotic Cell grant 9736 (L.H.C., C.J.B.d.C.) and National Institutes of Health grant R35GM142553 (L.H.C.).