Abstract
Interspecific hybrids can exhibit phenotypes that surpass those of their parents, known as heterosis, which is often of interest for industrial applications and evolutionary genetics research. However, constructing hybrids between distantly related species, such as intergeneric yeasts, presents technical challenges. In this study, we established a method to transfer individual chromosomes from Saccharomyces cerevisiae (Sc) into Kluyveromyces marxianus (Km), an emerging model for bioproduction. The Sc chromosome of interest was circularized, genetically modified to carry Km centromeres and replication origins, and transformed into Km via protoplast transformation. With this method, we generated monochromosomal hybrids with eight Km chromosomes and Sc chromosome I or III. The Sc chromosomes exhibited normal replication, segregation, and active transcription in the hybrids. The hybrids displayed heterosis in flocculation and salt tolerance due to the overexpression of FLO9 and SPS22, respectively. Transcriptomic analysis revealed that both cis- and trans-regulatory changes contributed to the divergence of gene expression between the two species, with the cis and trans effects often acting in a compensatory manner. Our strategy has potential applications in optimizing cell factories, constructing synthetic genomes, and advancing evolutionary research.
Teaser
A yeast hybrid containing an alien chromosome transferred from a distantly related species exhibits improved phenotypes.
Introduction
The transfer of genetic materials between species is one of the major driving forces for speciation and environmental adaptation 1,2. Chromosomes, which carry packs of genes, can introduce abundant genetic variation and contribute to evolution of adaptive traits when transferred between species. For example, horizontal transfer of plastid chromosomes between tobacco plants (Nicotiana) aided chloroplast capture 3. Horizontal transfer of chromosomes between Fusarium species, a group of filamentous fungi, led to new pathogenic lineages 4. Furthermore, hybridization is by nature a process of interspecific chromosome transfers. With the combined genetic materials, hybrids often exhibit enhanced phenotypes, known as heterosis or hybrid vigor. For example, hybridization between the top-fermenting yeast Saccharomyces cerevisiae and the cold-tolerant S. eubayanus gave rise to one of the most important lager-brewing species, S. pastorianus 5. Similarly, triticale, a hybrid of rye (Secale cereale) and wheat (Triticum aestivum L.), exhibited improved yield in marginal environments 6. Therefore, introduction of genetic materials from a distantly related species may hold great potential for phenotypic improvement in the fields of crop breeding and industrial applications.
Understanding the phenotypic and molecular consequences of interspecific hybridization may also address many important questions in genome evolution. For example, what are the molecular bases for heterosis 7? From the perspective of gene regulation, can distantly related genomes regulate each other 8? Is there an upper limit for evolutionary divergence such that two genomes with a divergence level above this threshold cannot regulate each other anymore? Is evolution of gene expression predominantly driven by cis- or trans-regulatory changes 9? Many previous studies have addressed these questions with natural or synthetic hybrids 10–17. One important conclusion is that sequence divergence in cis-regulatory elements such as promoters and enhancers, as well as compensatory changes, becomes increasingly important for the evolution of gene expression as evolutionary distance increases 16. However, many of these questions remain open for distantly related species where natural hybrids are not available.
In this study, we explore the phenotypic and molecular consequences of hybridization between Kluyveromyces marxianus and S. cerevisiae, by synthetically constructing monochromosomal hybrids between the two distantly related species. K. marxianus is a yeast species that belongs to the Saccharomycetaceae family but is only distantly related to S. cerevisiae (Fig. 1A). It has been proposed as an emerging model for the bioproduction of heterologous proteins, bioethanol and bulk chemicals 18,19. It has a number of traits that differ from those of S. cerevisiae, including a high growth rate, thermotolerance, and the ability to assimilate a wide variety of sugars 20–22. On the other hand, S. cerevisiae is more tolerant to ethanol than K. marxianus 19. S. cerevisiae is also widely used in industrial fermentation and has arguably the best studied genome among eukaryotes 23. Mixing the genetic materials of the two species is expected to provide novel phenotypes for industrial development, as well as insight into their evolution.
(A) Phylogenetic tree of the Saccharomycetaceae family based on 71 selected species. Blue ellipses show species capable of interspecific hybridization, while red ellipse marks the event of whole-genome duplication. Sc and Km are highlighted in orange. The tree was constructed with iqtree2 35 with concatenated coding sequences of 2,408 orthologous groups 31. (B) The evolutionary distance between Km and Sc, compared to species in other taxa that are capable of interspecific hybridization (solid lines) or tolerant to chromosomal transfer (dashed lines). Data of amino acid substitutions per site were generated by Orthofinder2 36. Sp, Saccharomyces paradoxus. Su, S. uvarum. (C, D) Transformation efficiency (C) and stability (D) of plasmids containing different combinations of ARS and CEN alleles in the two species (blue for Sc, red for Km). Stability refers to the percentage of cells containing the plasmid after being grown in non-selective YPD medium for 24 h. Lines in the boxplots and whiskers show the median, maximum and minimum values. (E) Mating assay between Km and Sc. Haploid Km or Sc cells containing a kanR (red cross) or hygR (blue cross) plasmid were mixed and spotted onto a medium containing hygromycin (Hyg), kanamycin (Kan), or both. Mating efficiency was quantified by counting double-resistant colonies (see Methods), represented by the mean of three replicate experiments.
Stable allodiploid hybrids between the two species have not been made available, although partial integration of S. cerevisiae genome fragments into K. marxianus has been achieved by protoplast fusion 24,25. In order to introduce S. cerevisiae genetic materials in a systematic, controllable manner, we developed a genetic-engineering strategy to artificially transfer chromosomes between K. marxianus and S. cerevisiae. Artificial chromosome transfer was first established in 1977, when murine chromosomes were transferred into human cells via microcells 26. To date, artificial transfer of functional chromosomes has primarily been achieved between closely related species (dashed lines, Fig. 1B), including transfers of human chromosome 21 into mouse and chicken cells 27–29, and between the bacterial species Mycoplasma mycoides and M. capricolum 30. S. cerevisiae (Sc) and K. marxianus (Km) diverged 114 million years ago 31, far exceeding the evolutionary distance between the species that have been shown capable of natural or artificial chromosome transfer (Fig. 1B). Previous efforts of protoplast fusion between Sc and Km showed that Sc genomes are often unstable after the fusion 25. One possible explanation is the incompatibility of replicating elements such as the centromere (CEN) and autonomously replicating sequence (ARS), as previously shown in hybrid incompatibility studies 32,33. In this study, we engineered the Sc chromosome to resolve potential incompatibilities, including circularization of the chromosome and insertion of ARS and CEN from the host species 34. The engineered Sc chromosome was transformed into Km, creating a monochromosomal hybrid. Through phenotypic and gene expression analysis, we showed that functional chromosome transfer between distantly related species can produce beneficial phenotypes associated with overexpression of genes from the Sc chromosome. The transferred chromosomes triggered widespread transcriptional responses in the transferred and host genomes, suggesting cross-species regulation. Finally, we found that divergence of gene expression between the two species often arises from compensating cis- and trans-regulatory changes, providing new evidence for regulatory evolution across a long evolutionary timescale. Our strategy has potential applications in optimizing microbial cell factories and advancing evolutionary studies.
Results
Compatibility of ARS and CEN between Sc and Km
It is critical to ensure proper replication and segregation of the heterologous chromosome upon chromosome transfer. Therefore, we started by investigating the compatibility of ARS and CEN between Sc and Km. We examined transformation success and stability of plasmids carrying different combinations of KmARS1, KmCEN5, ScARSH4 and ScCEN6. We found that KmARS1 and KmCEN5 were essential for successful transformation (Fig. 1C, left) and plasmid maintenance in Km (Fig. 1D, left), respectively. Neither could be replaced by their Sc counterparts, ScARSH4 and ScCEN6, suggesting incompatibility. Plasmids carrying KmCEN5+KmARS1 produced transformants in Sc (Fig. 1C, right), but with poor stability (Fig. 1D, right), suggesting that KmARS1 but not KmCEN5 can function in Sc. Similar results were obtained using combinations of KmARS18, KmCEN3, ScARS1 and ScCEN4 (Fig. S1). The observed incompatibility could explain the rapid loss of Sc chromosomes in previous protoplast fusion between Km and Sc 25. Finally, the plasmid containing both ARSs and CENs from Sc and Km, known as the double CEN/ARS plasmid, could replicate and segregate stably in both Km and Sc (Fig. 1C, left). Therefore, the issue of Sc chromosomal instability in Km may be resolved by engineering KmCEN and KmARS into Sc chromosomes.
Using double CEN/ARS plasmids carrying different antibiotic markers, we examined the mating rate between Km and Sc. We transformed haploid Km and Sc cells with plasmids carrying the kanR or hygR marker, mixed them in different combinations (see Methods) and selected for zygotes on double antibiotic medium (Fig. 1E). The results showed that the mating efficiency between Km MATa and MATα cells, as well as between Sc MATa and MATα cells, was around 16%. However, mixing 1.6×10⁷ Km cells with 1.6×10⁶ Sc cells, regardless of their mating types, failed to yield any zygotes, based on both double selection (Fig. 1E) and microscopic observation (Fig. S2). These results demonstrated a prezygotic reproductive barrier between the two species, which calls for a synthetic method for generating Km-Sc hybrids.
Engineering and transferring Sc chromosomes into Km
We selected the smallest Sc chromosome, chrI (chr1, 252 kb, 117 genes), and the third smallest chromosome, chrIII (chr3, 341 kb, 184 genes), for proof-of-principle experiments of Sc-Km chromosomal transfer. The experimental pipeline is shown in Fig. 2A. First, we circularized the chromosomes of interest to avoid potential incompatibility of telomeres. Sc telomeres (TELs) are composed of TG1−3 repeat sequences, while Km TELs consist of a long repeat motif (25 nt) 37. We removed the TELs from chr1 and chr3, respectively, and joined the ends with KmURA3 using CRISPR/Cas9 38. To avoid homologous recombination at the telomeres, the telomere-associated long repetitive sequence on chr1 was deleted during circularization 38. Next, to ensure stable maintenance of Sc chromosomes in Km, we inserted KmCEN5/ARS1 adjacent to the native CENs on chr1 and chr3. Given that chr1 and chr3 contain 5 and 12 ARSs, respectively 39, we placed another copy of KmARS1 close to a native ScARS, about 100 kb away from the KmCEN5/ARS1 in the circular chromosomes (see Methods for details). The engineered circular chr1 and chr3 were named R1 and R3, respectively. Sc cells with R1 (“Sc-R1”) or R3 (“Sc-R3”) showed no fitness defect under normal culture conditions or in the presence of microtubule depolymerizing agents and DNA damage agents (Fig. S3), indicating that the introduction of KmARS1 and KmCEN5 did not affect chromosome replication in Sc. The circularization was confirmed by pulsed-field gel electrophoresis (PFGE), which showed an absence of linear chr1 or chr3 in the gel (Fig. 2B). Following a protocol developed by Noskov et al. 40, we extracted and column-purified R1 and R3, with the undesired linear chromosomes removed by exonuclease. Finally, the purified R1 and R3 were separately transformed into Km using protoplast transformation 41, yielding four R1 transformants (Kluyveromyces-Saccharomyces-R1, “KS-R1” hereafter) and one R3 transformant (“KS-R3”). The low number of transformants indicates a need for further optimization of the transformation protocol, especially for transferring larger chromosomes.
(A) Experimental pipeline for transferring Sc chromosomes into Km. We used CRISPR technology to remove TELs (black points) from Sc chr1 or chr3 and connected the ends with the KmURA3 marker (yellow triangle). We then inserted KmCEN5/ARS1 (red points and green rectangles) and a drug resistance marker (orange triangles, kanR in chr1 and hygR in chr3) near the endogenous CEN sequences (blue point). Another copy of KmARS1 and a drug resistance marker (hygR in chr1 and kanR in chr3) were placed close to a native ScARS. The circular chromosome was then extracted and transferred into Km via protoplast transformation, resulting in a monochromosomal hybrid containing eight Km chromosomes (red) and one Sc chromosome (blue). (B) PFGE of chromosomal extracts from Sc-R1, Sc-R3 and the parental Sc strain. (C, D) PCR of markers in the circular chromosomes in KS-R1 (C, represented by one transformant) and KS-R3 (D). Purple box labels the missing marker, #7, in KS-R3. See Table S6 for marker positions. (E) PFGE of R1 and R3 (arrows) in the hybrids, linearized with NotI and AscI respectively. Ctrl: non-digested KS-R1 chromosomal extracts. λ: Lambda PFG Ladder. (F) Illustration of the 29 kb deletion in R3 of KS-R3. Coordinates are based on a manually curated R1 reference sequence (Table S4). (G) Growth curves of KS-R1, KS-R3, and Km-V in YPD. The values represent mean ± SD (n=3). (H) Stability of circular chromosomes in hybrids. The values represent the mean (n=3). (I) Spot assay of KS-R1 and KS-R3 in the presence of benomyl (BML), thiabendazole (TBZ), hydroxyurea (HU), methyl methane sulfonate (MMS), camptothecin (CPT) or cycloheximide (CHX). Unless otherwise indicated, the plates were incubated at 30 °C for 1 day.
We examined KS-R1 and KS-R3 for the integrity of the transferred chromosomes. All four KS-R1 transformants retained six Sc-specific markers located on R1 (Fig. 2C, see Table S6 for marker positions), indicating successful transfer of R1. The transfer was further confirmed by restriction digest (Fig. 2E) and whole genome sequencing, which found no mutations in the transferred R1. In PCR analysis of KS-R3, marker 7 was absent (Fig. 2D, see Table S6 for marker positions). PFGE analysis showed that the size of linearized R3 was smaller than the expected 340 kb (Fig. 2E). Genome sequencing revealed a 29 kb deletion between two direct repeats of putative Ty elements in R3 (Figs. 2F & S4). The deletion was potentially due to homologous recombination between Ty elements, the rate of which has been shown to significantly increase in circular plasmids during transformation 42. The deleted region contained 13 genes (MAK32, PET18, MAK31, HTL1, HSP30, YCR022C, YCR023C, SLM5, YCR024C-B, PMP1, YCR025C, NPP1, RHB1). We introduced the deleted region, in six individual segments via plasmids, into Km cells (Fig. S4). None of the segments affected growth (Fig. S4), excluding the possibility that the loss of the 29 kb region was an adaptive response to overcome incompatibility.
We next evaluated the Km genome integrity after transformation. The PFGE patterns of Km chromosomes in KS-R1 and KS-R3 were consistent with the Km parent (Fig. S5). Genome sequencing revealed 33 mutations in 11 ORFs, as well as 82 SNPs and indels in intergenic sequences in Km chromosomes in KS-R1 (Table S3). The Km chromosomes in KS-R3 contained 4 mutations in 4 ORFs, and 47 SNPs and indels in intergenic sequences (Table S3). These mutations might have been induced during the protoplast transformation. Overall, the transfer of R1 or R3 into Km did not result in fusions, translocations, or other large-scale rearrangements in the Km chromosomes, thus creating a monochromosomal hybrid containing eight Km chromosomes and one Sc chromosome.
Subsequently, we examined the hybrids for any fitness defects. In rich medium (YPD), the growth curves of KS-R1 and KS-R3 were indistinguishable from Km-V, the parental Km strain with a void vector (Fig. 2G). Under non-selective conditions, R1 and R3 were highly stable in the hybrids. The loss rate per generation for R1 and R3 in YPD was 0.11% and 0.84%, respectively (Fig. 2H), a level comparable to that of circular YACs in Km 43. Furthermore, KS-R1 and KS-R3 exhibited robust growth in the presence of the microtubule depolymerizing agents benomyl (BML) and thiabendazole (TBZ), DNA replication inhibitor hydroxyurea (HU), the DNA-damaging agents methyl methane sulfonate (MMS) and camptothecin (CPT) 44, or protein synthesis inhibitor cycloheximide (CHX) (Fig. 2I). Taken together, the introduction of R1 and R3 did not affect the growth, chromosome segregation, DNA replication, or protein synthesis of the hybrids in rich medium. The Km cells were tolerant to the transplantation of Sc chromosomes, reflecting cellular plasticity.
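The per-generation loss rates quoted above can be related to the fraction of cells that still carry the chromosome after non-selective growth. The following is a minimal sketch, assuming a constant loss rate r per generation so that the retained fraction after n generations is (1 − r)^n; this model and the function names are illustrative assumptions, since the exact calculation is given in the Methods rather than in this section.

```python
def loss_rate_per_generation(retained_fraction: float, generations: float) -> float:
    """Estimate a constant per-generation loss rate r from the fraction of
    cells retaining the chromosome after `generations` of non-selective growth.

    Assumed model (not taken from the paper): retained = (1 - r) ** generations.
    """
    if not 0.0 < retained_fraction <= 1.0:
        raise ValueError("retained_fraction must be in (0, 1]")
    if generations <= 0:
        raise ValueError("generations must be positive")
    return 1.0 - retained_fraction ** (1.0 / generations)


def retained_after(loss_rate: float, generations: float) -> float:
    """Inverse relation: expected retained fraction under the same model."""
    if not 0.0 <= loss_rate < 1.0:
        raise ValueError("loss_rate must be in [0, 1)")
    return (1.0 - loss_rate) ** generations


if __name__ == "__main__":
    # Loss rates reported in the text for R1 and R3 (0.11% and 0.84%).
    for name, rate in (("R1", 0.0011), ("R3", 0.0084)):
        # 10 generations is an arbitrary illustrative value.
        print(name, retained_after(rate, 10))
```

Under this model, a small per-generation loss rate compounds slowly, which is consistent with describing R1 and R3 as highly stable over a short period of non-selective growth.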
Hybrids exhibit heterosis due to interactions between Km-encoded trans-factors and Sc cis-regulatory sequences
We next investigated whether the transferred chromosomes conferred novel phenotypes. We exposed the hybrids and their parents to twenty environmental conditions and semi-quantified their relative growth on solid media, using YPD growth as a reference (Figs. 3A & S6). The monochromosomal hybrids KS-R1 and KS-R3 exhibited comparable growth to Km-V under 19 and 15 conditions, respectively (Fig. 3A), suggesting that the small number of genes in R1 and R3 did not trigger widespread metabolic reprogramming in Km. Under a few conditions, R1 and R3 caused growth defects, suggesting potential incompatibility. For instance, KS-R1 exhibited increased sensitivity to 42 °C and KS-R3 exhibited increased sensitivity to tunicamycin (TM) (Fig. 3A-B). The defective growth phenotypes were not shared by the two hybrids, indicating that the defects were caused by dominant effects of genes on R1 or R3, rather than a general effect caused by chromosomal transfers. Additionally, we ruled out the possibility that the mutations in the Km genome caused the phenotypes, because the strains no longer exhibited the phenotypes after losing the Sc chromosomes in non-selective medium (Fig. S7).
(A) Growth phenotypes of hybrids and their parental strains under various conditions. Relative growth was semi-quantified through greyscale scanning of images from spot assays and normalized to YPD growth. Original images are provided in Fig. S6. The heatmap shows Z-scores of log2-transformed data. Grey denotes conditions where the strain could not grow. Glu, glucose; 1/4 N, one-quarter ammonium sulfate; 1/2 AA, half amino acids; 1/2 N, half ammonium sulfate; TM, tunicamycin; 5-FU, 5-fluorouracil. (B) Growth defects under tunicamycin or 42 °C treatment associated with the transferred Sc chromosomes. (C) Improved NaCl tolerance associated with the promoter sequence of ScSPS22. ScSPS22, KmSPS22 and PKm-ScSPS22 alleles were expressed with a centromeric plasmid (indicated by the subscript CEN) or a 2µ plasmid (ScSPS22 2µ) in either Km (upper panel) or Sc-R3 (lower panel). Serial dilutions were performed at 1:5 for all spot assays in this study. (D) Relative mRNA levels of ScSPS22 and KmSPS22 measured by qPCR. mRNA levels in the hybrid and Km cells were normalized to the average level of three housekeeping genes, KmSWC4, KmTRK1, and KmMPE1. mRNA levels of ScSPS22 in Sc were normalized to the average level of ScSWC4, ScTRK1, and ScMPE1. The values represent the mean ± SD (n=3). (E, F) Enhanced flocculation in KS-R1 associated with ScFLO9. ScFLO9 was expressed with either a centromeric plasmid (ScFLO9 CEN) or a 2µ plasmid (ScFLO9 2µ) in Km or Sc-R1. The strains were cultured overnight in SD-Ura or SD-Leu (see Methods) to an OD600 of 20. The cultures were vortexed vigorously and left still, before being imaged (E) and examined for the OD600 of the supernatant (F) at designated timepoints. The values in (F) represent the mean ± SD (n=3). (G) Relative expression levels of ScFLO9 and KmFLO5 measured by qPCR and normalized as described above. The values represent the mean ± SD (n=3). *, p<0.05; ***, p<0.001. NS, not significant. P-values were based on t-tests.
Interestingly, the chromosomal transfer resulted in heterosis. KS-R3 exhibited enhanced tolerance to high concentration of sodium chloride (NaCl) compared to its parents (Fig. 3C). The high-salt condition is a common stress in industrial processes. Therefore, increased tolerance to NaCl is a beneficial phenotype for industrial production 45. We next investigated the molecular mechanisms by which R3 increased NaCl tolerance in the hybrid. Among the genes carried by R3, ScSPS22 is involved in β-glucan synthesis and related to cell wall function 46. In a previously published genome-wide screen, overexpressing ScSPS22 reduced NaCl tolerance 47. We thus tested if ScSPS22 contributed to the enhanced NaCl tolerance of the hybrid. ScSPS22, along with its 1000 bp upstream sequence (promoter), was introduced into Km via a centromeric plasmid. We found that, in contrast to its role in Sc, ScSPS22 increased NaCl tolerance of Km to the same level as KS-R3 (Fig. 3C, upper panel). Expressing the Km ortholog of ScSPS22, KmSPS22 (KLMA_10044), did not have an obvious effect (Fig. 3C, upper panel). This result indicates that ScSPS22 is the causal gene for the increased NaCl tolerance in KS-R3. Additionally, overexpressing ScSPS22 in Sc with centromeric or 2-micron plasmids reduced NaCl tolerance (Fig. 3C, lower panel), consistent with the previous report.
To understand the contribution of cis-regulatory and coding sequences (CDS) to the phenotypic effect of ScSPS22, we constructed a chimeric allele consisting of the promoter of KmSPS22 and the CDS of ScSPS22, namely PKm-ScSPS22. There was no substantial difference between PKm-ScSPS22 and KmSPS22 when expressed with a centromeric plasmid (Fig. 3C, upper panel), suggesting a lack of functional divergence in the CDS. In contrast, PKm-ScSPS22 conferred a much lower level of NaCl tolerance than ScSPS22 driven by its endogenous promoter, indicating that the Sc promoter was causal for the increase in NaCl tolerance.
The different phenotypic effects of Sc and Km promoters suggest potential divergence in gene expression. Therefore, we analyzed the transcription level of ScSPS22 and KmSPS22 in the hybrid and parents by qPCR (Fig. 3D). The relative mRNA level of ScSPS22 in KS-R3 was 53 and 80 times higher than that in Sc-R3 with and without NaCl, respectively (Fig. 3D, right panel), suggesting (1) a significant difference in trans-acting regulators for ScSPS22 between Sc and KS-R3, and (2) that the up-regulation of ScSPS22 in KS-R3 is constitutive. The relative mRNA level of KmSPS22 did not significantly differ between KS-R3 and Km-V (Fig. 3D, left panel), indicating little difference in the trans-acting factors for KmSPS22 between Km and KS-R3.
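As a rough illustration of how a relative mRNA level normalized to several housekeeping genes can be obtained from qPCR data, the sketch below uses the common 2^(−ΔCt) approach with the mean Ct of the reference genes. This is an assumption for illustration only: the section does not state the exact formula, the Ct values are invented, and perfect amplification efficiency is assumed.

```python
from statistics import mean
from typing import Sequence


def relative_level(ct_target: float, ct_references: Sequence[float]) -> float:
    """Relative mRNA level of a target gene versus reference genes.

    Assumption: 2 ** -(Ct_target - mean(Ct_refs)), i.e. 100% PCR efficiency
    and averaging of the reference genes on the Ct scale.
    """
    if not ct_references:
        raise ValueError("at least one reference gene Ct is required")
    delta_ct = ct_target - mean(ct_references)
    return 2.0 ** (-delta_ct)


def fold_difference(level_a: float, level_b: float) -> float:
    """Fold difference of strain A over strain B for the same target gene."""
    if level_b <= 0:
        raise ValueError("level_b must be positive")
    return level_a / level_b


if __name__ == "__main__":
    # Hypothetical Ct values, not data from the study.
    hybrid = relative_level(18.0, [19.0, 20.0, 21.0])
    parent = relative_level(24.0, [19.0, 20.0, 21.0])
    print(fold_difference(hybrid, parent))
```

Because each strain is normalized to its own housekeeping genes (Km genes for Km and the hybrid, Sc genes for Sc), the resulting fold differences compare relative rather than absolute transcript abundance.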
When transformed into Km on a centromeric plasmid, the ScSPS22 allele was expressed at the same level as ScSPS22 in KS-R3 (p > 0.05 for both YPD and NaCl conditions, Student’s t-test, Fig. 3D), suggesting that the promoter sequence of ScSPS22 was causal for its up-regulation in KS-R3. The chimeric allele PKm-ScSPS22 drove much lower expression than ScSPS22 in Km, consistent with cis-regulatory divergence. In the Sc-R3 background, expressing ScSPS22 with centromeric or high-copy 2μ plasmids increased its relative mRNA level, but failed to match the level of ScSPS22 in KS-R3, suggesting that the upregulation of ScSPS22 in KS-R3 resulted from induction or de-repression by Km-specific trans-acting factors, rather than increased gene copy number. Taken together, the phenotypic and expression analyses showed that the heterosis in NaCl resistance was caused by specific interactions between Km-encoded trans-acting regulators and the Sc promoter of SPS22.
Although KS-R1 did not show any significant heterosis in the twenty conditions shown in Fig. 3A, we found that it exhibited a flocculation phenotype surpassing that of the parental Km-V and Sc-R1 (Fig. 3E&F). The flocculation phenotype disappeared after the loss of R1, indicating that R1 is responsible for the enhanced flocculation (Fig. S8). Flocculation facilitates cell sedimentation, reducing separation costs in high-density fermentation 48. The FLO family genes control flocculation 49, among which ScFLO9 is present on R1. Introducing ScFLO9 with its 1000 bp upstream sequence into Km via a centromeric plasmid enhanced flocculation to the same level as KS-R1 (Fig. 3E&F). This result suggests that ScFLO9 is responsible for the increased flocculation phenotype in KS-R1. Overexpressing ScFLO9 with a centromeric or 2μ plasmid in Sc slightly enhanced flocculation (Fig. 3E&F), consistent with its known role in Sc but suggesting its effect is background dependent.
We similarly analyzed the expression level of FLO9 to understand the molecular mechanisms of the heterosis phenotype. qPCR showed that the relative mRNA level of ScFLO9 in KS-R1 was 64 times higher than in Sc-R1 (Fig. 3G), again suggesting a significant change in the trans-regulatory environment. The high expression in KS-R1 was recapitulated by ScFLO9 expression from a centromeric plasmid in Km (light pink bars in Fig. 3G, p > 0.05, Student’s t-test). The expression level of KmFLO5 (KLMA_10835), the ScFLO9 ortholog, was not increased in KS-R1, consistent with a lack of change in the trans environment for Km genes (Fig. 3G, left panel). Overexpressing ScFLO9 in Sc-R1 with centromeric and 2μ plasmids led to an increase in the ScFLO9 mRNA level (purple bars in Fig. 3G), with the higher level of ScFLO9 2μ correlating with faster flocculation (Fig. 3F). Taken together, the enhanced flocculation of KS-R1 was associated with ScFLO9 being exposed to an alien trans-regulatory environment encoded by the Km genome. In both cases of SPS22 and FLO9, the genetic combinations between Km-encoded trans-factors and Sc-encoded cis-factors would never be found in nature due to reproductive isolation, demonstrating the power of cross-species synthetic biology.
Pervasive transcriptional responses triggered by chromosomal transfer
The experiments with ScSPS22 and ScFLO9 revealed significant changes in Sc gene expression after being transferred to Km. To determine if this is a general pattern, we analyzed transcriptomes of the hybrids (KS) and parental strains in YPD and under two stress conditions, 1 μg/mL tunicamycin (TM) and 1 M NaCl. All 89 genes on R1 and 151 genes on R3 (excluding those in the deleted 29 kb region and the artificially inserted marker genes kanR, hygR and KmURA3) were actively transcribed in the hybrids under the three conditions (read counts >5) (Table S4). However, their expression in hybrids correlated poorly with that in Sc, with an average coefficient of determination (R²) of 0.31 for R1 genes between Sc and KS expression, and an average R² of 0.36 for R3 genes (Fig. 4A). This correlation was substantially lower than the correlation between the expression of human genes on Hsa21 transplanted into mouse cells and their counterparts in human cells (R²=0.90) (21), which aligns with the remote evolutionary distance between Km and Sc (Fig. 1B). Compared to Sc, on average, 38.1% and 16.8% of the genes in R1 and R3, respectively, were up-regulated in KS, while 16.7% and 29.8% of the genes in R1 and R3, respectively, were down-regulated in KS (FDR-adjusted p < 0.05 by Wald test and absolute log2FoldChange >1; Fig. 4B). The median log2-fold change for up- and down-regulated genes was 2.06 and -1.67, respectively. Consistent with the case studies of SPS22 and FLO9, the high proportion and fold change of differentially expressed Sc genes reflect dramatic differences in the trans-regulatory environment between Sc and the hybrids.
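The thresholds used above to call differentially expressed genes, and the R² values used to compare expression between strains, can be sketched as follows. This is a minimal illustration, assuming that per-gene log2 fold changes and FDR-adjusted p-values have already been produced by an upstream differential-expression tool; the function names are hypothetical, and the adjusted R² formula is the standard one for a single-predictor linear regression.

```python
from typing import Optional, Sequence


def classify_gene(log2_fold_change: float, padj: Optional[float],
                  lfc_cutoff: float = 1.0, alpha: float = 0.05) -> str:
    """Label a gene 'up', 'down' or 'n.s.' using the cutoffs stated in the text
    (FDR-adjusted p < 0.05 and |log2FoldChange| > 1)."""
    if padj is None or padj >= alpha:
        return "n.s."
    if log2_fold_change > lfc_cutoff:
        return "up"
    if log2_fold_change < -lfc_cutoff:
        return "down"
    return "n.s."


def r_squared(x: Sequence[float], y: Sequence[float]) -> float:
    """R² of a simple linear regression of y on x (squared Pearson r)."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("x and y must have equal length of at least 3")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


def adjusted_r_squared(x: Sequence[float], y: Sequence[float]) -> float:
    """Adjusted R² for one predictor: 1 - (1 - R²) * (n - 1) / (n - 2)."""
    n = len(x)
    return 1.0 - (1.0 - r_squared(x, y)) * (n - 1) / (n - 2)


if __name__ == "__main__":
    # Median fold changes reported in the text, with hypothetical p-values.
    print(classify_gene(2.06, 0.01), classify_gene(-1.67, 0.01))
```

Here x and y would be log2 average normalized read counts of the same genes in Sc and in the hybrid, so a low R² indicates that expression rank and level in Sc predict expression in the hybrid poorly.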
(A) Correlation of gene expression between hybrids (KS) and Sc strains. The expression level was represented by log2 (average normalized read counts). Each point represents the average of three replicates. Adjusted R² values were derived from linear regression. N = 84 genes for R1, and 151 genes for R3 (see Methods for data filters). DEG, differentially expressed genes. n.s., not significant. (B) Number of significantly up-regulated and down-regulated R1/R3 genes in hybrids. The number inside the column represents the percentage of up- or down-regulated genes. (C) Expression differences [log2(KS/Sc)] along R1 and R3 chromosomes in the hybrids in YPD. The gray shading indicates the deleted segment in KS-R3. Dashed lines indicate a log2FoldChange of 1 or -1. VBA3 and GEX1 are in red and CHA1 is in orange. For stress conditions, see Fig. S9. (D) The relative mRNA levels of HMLα1 and HMLα2 in KS-R3 and Sc-R3 determined by qPCR. The mRNA levels were normalized to the average of three housekeeping genes as described in Fig. 3D. The values represent the mean ± SD (n=3). (E) Number of DEGs between KS strains and Km-V out of 4,857 Km genes. Y, YPD; T, TM; N, NaCl. In this figure, all DEGs are defined as an FDR-adjusted p-value < 0.05 by Wald test and abs(log2FoldChange) > 1.
The Sc genes showing differential expression between Sc and hybrids were in general uniformly distributed across the transferred chromosomes (Figs. 4C & S9), but we found a region around HML that showed de-repression across multiple genes in KS-R3, providing an interesting example of chromatin-level regulatory divergence. The HML locus is flanked by GEX1 and VBA3 on one side and CHA1 on the other side (Fig. 4C). Notably, GEX1 and VBA3 were substantially upregulated in KS-R3 under all three conditions compared with those in Sc-R3, while the level of CHA1 did not change significantly (Figs. 4C & S9). GEX1 and VBA3, but not CHA1, are bound by the SIR complex that spreads from the HML locus in Sc 50. This suggests de-repression of the HML locus in KS-R3. The α1 and α2 genes within HML were silenced in the MATa strain Sc-R3, with their RNAseq reads falling below the read-count threshold (see Methods). qPCR showed that the relative mRNA levels of α1 and α2 were upregulated 42-fold and 39-fold, respectively, in KS-R3 compared with Sc-R3 (Fig. 4D). The composition of silencers differs substantially between Kluyveromyces and Saccharomyces 51. The silencer of HML in KS-R3 might not be able to recruit essential proteins such as Rap1 and ORC to the locus 52, contributing to the de-repression of α1 and α2, and of the flanking GEX1 and VBA3.
While the expression of Sc genes underwent significant changes in the hybrids, R1 and R3 also profoundly affected the expression of Km genes. Out of 4,857 Km genes, 1,248 and 1,121 genes in KS-R1 and KS-R3, respectively, exhibited significant expression differences compared to Km-V in YPD (Fig. 4E, Table S5). R1 encodes a transcription factor, Oaf1, that regulates at least 97 Sc target genes 53. Among the Km orthologs of the known Sc target genes (see Methods for ortholog assignment), 34 (42.5%) showed differential expression in KS-R1. R3 encodes four transcription factors regulating at least 305 Sc target genes 53, with 71 (35.9%) of their Km orthologs differentially expressed in KS-R3 (Table S5). This suggests that transcription factors encoded by R1 and R3 might directly alter gene expression in Km, but a large fraction of the observed transcriptional changes might be explained by indirect or uncharacterized regulation. Under TM and NaCl stresses, the number of differentially expressed genes was substantially lower than in YPD (Fig. 4E, Table S5). These results indicate that the heterologous chromosomes induced pervasive changes in expression of the endogenous genome, potentially via novel Sc-Km regulatory interactions. Interestingly, regulation from the Km genome became dominant under stress conditions, removing much of the effects associated with foreign chromosomes found in YPD.
Contributions of cis and trans effects to the expression divergence between Km and Sc
The monochromosomal hybrids provide a unique opportunity to examine the contribution of cis- and trans-regulatory divergence to the evolution of gene expression between the two remotely related species. In the hybrid, alleles from the two species are exposed to the same trans environment, so their expression differences stem from differences in cis-acting regulatory elements, such as promoters, that are physically located on the same DNA molecule as the regulated genes. The expression differences between the parental species are the sum of cis- and trans-acting effects, the latter being mediated by diffusible products (such as transcription factors) that are not linked with the affected genes (Fig. 5A) 9. Studies on Saccharomyces and Drosophila hybrids indicate that expression divergence between species was mainly due to cis effects, while trans effects play a major role in transcriptional responses to environmental changes 11,54. However, whether this tendency can explain the expression divergence between more distantly related species remains unresolved. Furthermore, in the cases where both cis and trans effects showed a significant contribution, it is often found that they act in a compensatory manner, consistent with stabilizing selection on gene expression.
(A) Schematic representation of cis and trans effects in expression divergence between Km and Sc. The expression differences between Sc and Km reflect a combination of cis and trans effects (cis+trans). The expression differences between alleles in the hybrid reflect cis effects. Trans effects are calculated by subtracting cis effects from cis+trans effects. (B) Clustered heatmap of genes with only significant cis effects (cis only), only significant trans effects (trans only), or a combination of significant cis and trans effects (cis & trans) in YPD. (C) Proportion of genes showing cis only, trans only, cis & trans, compensatory (cis and trans effects act in the opposite directions), and reinforcing (cis and trans effects act in the same directions) effects in three conditions. Numbers indicate the proportion of cis only, trans only, and cis & trans category. (D) Genome-wide Pearson correlations between cis and trans effects across conditions. YPD (Y), TM (T), NaCl (N). (E) The number of genes with significant cis and trans effects among those showing the divergence of transcriptional response. (F) Proportion of genes exhibiting conserved and diverged regulation in YPD. For stress conditions, see Fig. S10. (G) The magnitude of cis and trans effects between essential and dispensable genes, and between genes with or without paralogs in Sc. The magnitude was represented by the average of the absolute values of cis or trans effects under three conditions. Bars and whiskers represent mean ± SD. (H) The magnitude of expression differences in Sc-specific genes and orthologous genes. The ratio between the expression level of a gene in the hybrid and that in Sc was calculated. The magnitude was represented by the average of the absolute values of the log-transformed ratios across three conditions. Values represent mean ± SD (n=3). P values in (G) and (H) were from Student’s t-test. *, p<0.05; ***, p<0.001.
In order to characterize the effects of cis and trans regulatory divergence, we identified Km orthologs for 70/89 genes on R1 and 112/151 genes on R3 (see Methods), with an average amino-acid sequence identity of 52.8% (Table S7). By comparing expression levels between alleles in the hybrids and parents, we identified genes with significant differential expression driven by cis-only, trans-only or cis-and-trans effects (see Methods; Fig. 5B&C). The genes were most commonly under both cis and trans effects (35-55%, Fig. 5C), which deviates from the previous notion that cis effects become increasingly dominant as species diverge 16. However, we note that the pronounced trans effect could be associated with the fact that only one chromosome was transferred into the hybrids (see Discussion). Interestingly, when both significant, the cis and trans effects often act in the opposite direction, i.e. their effects compensate each other (green bars, Fig. 5C). Only a marginal proportion of genes showed reinforcing effects, where cis and trans effects act in the same direction (blue bars, Fig. 5C). This level of prevalence of compensatory effects suggests that stabilizing selection on gene expression might operate on a much longer evolutionary distance than previously appreciated.
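The categories used in Fig. 5C can be sketched as a small decision rule. This is an illustrative sketch only: the function name and its inputs (log2 effect sizes plus flags saying whether each effect passed the significance test) are assumptions introduced here, not part of the published pipeline.

```python
def classify_regulation(cis, trans, cis_sig, trans_sig):
    """Assign a gene to a regulatory category (hypothetical helper).

    cis, trans: log2 effect sizes for the gene.
    cis_sig, trans_sig: True if the corresponding effect was significant.
    Returns (category, direction); direction is None unless both are significant.
    """
    if cis_sig and trans_sig:
        # Opposite signs mean the two effects offset each other.
        direction = "compensatory" if cis * trans < 0 else "reinforcing"
        return "cis & trans", direction
    if cis_sig:
        return "cis only", None
    if trans_sig:
        return "trans only", None
    # No significant effect of either kind: expression is conserved.
    return "conserved", None
```

For example, a gene whose Sc allele is more highly expressed in the hybrid (positive cis effect) while the trans effect is negative would be labelled "cis & trans" and "compensatory".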
Across environmental conditions, the correlations within cis effects (average correlation of 0.89) and trans effects (average correlation of 0.85) were high, indicating that the mechanisms causing gene expression divergence under different conditions are similar (Fig. 5D). Among the genes showing significant transcriptional responses to TM or NaCl treatment, genes with significant trans effects outnumbered those with cis effects (Fig. 5E). This highlights the major role of trans effects in driving divergent expression responses under environmental stress, aligning with results from intragenus hybrids 11.
There were 44 (YPD), 60 (TM) and 57 (NaCl) genes that showed a conserved level of gene expression between Km and Sc, without any significant cis or trans effects. What factors could potentially drive their conservation? We considered (1) essentiality of genes and (2) the effect of gene duplication for this question. We found that the proportion of genes with conserved expression was slightly higher in essential genes compared to dispensable genes in all three conditions (Figs. 5F and S10, Table S7), although this trend was not statistically significant. After separating from Km, Sc underwent whole-genome duplication (WGD), leading to paralogs in Sc genome (Fig. 1A, Table S7). We found that the genes with paralogs were less likely to exhibit conserved expression than those without paralogs (p = 0.03 and 0.018 for YPD and TM conditions; p = 0.23 for the NaCl condition, Fisher’s exact test; Figs. 5F & S10), consistent with the notion that gene duplication promotes functional divergence 55. Among genes exhibiting regulatory divergence, the magnitude of cis effects in essential genes was significantly lower than in dispensable genes (p < 0.001, Student’s t-test, Fig. 5G). The cis effects in genes without paralogs were significantly smaller than in genes with paralogs (p < 0.05, Student’s t-test, Fig. 5G). In contrast, there was no significant difference in the magnitude of trans effects between essential and dispensable genes, nor between genes with or without paralogs (Fig. 5G). These findings suggest that the regulatory divergence in non-essential genes and paralogous genes could be partially driven by cis- but not trans-regulatory evolution 56.
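The comparison of genes with and without paralogs relies on Fisher's exact test on a 2×2 table (paralog status against conserved or diverged expression). A minimal standard-library sketch of the two-sided test is shown below. The function name is an assumption, the table passed to it is up to the reader, and this is not the software used in the study.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Probability of observing x in the top-left cell given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # The small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-7))
```

Here rows could be "with paralog" and "without paralog", and columns "conserved" and "diverged"; the counts themselves would come from Table S7.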
Finally, we note that the monochromosomal hybrids’ trans environment predominantly resembled Km, given that the transferred R1 and R3 chromosomes contained only a small number of genes. As shown above, the pervasive expression changes in Sc genes upon chromosomal transfer reflected the divergence in trans-acting factors between Km and Sc. We expected that the change in trans environment would cause a more dramatic effect on Sc-specific genes that evolved after the split of the Sc and Km lineages. We identified 59 Sc-specific genes in R1 and R3, which did not have any identifiable orthologs in Km (Table S7). Indeed, the magnitude of expression differences between hybrids and Sc was significantly greater for Sc-specific genes than for genes with orthologs (p < 0.05, Student’s t-test; Fig. 5H). This potentially suggests that regulation of Sc-specific genes required co-evolution of trans-acting factors in a lineage-specific manner.
Discussion
Interspecific hybridization often leads to novel phenotypes that are of interest for industrial purposes and evolutionary studies. For diverged species that cannot mate naturally, artificial chromosomal transfer provides an opportunity to explore the molecular and phenotypic consequences of combining genetic materials evolved independently during millions of years of evolution. In this study, we, for the first time, successfully introduced chromosomes of S. cerevisiae to K. marxianus, two species both with great industrial potential but genetically as diverged as human and lancelet (Fig. 1B). We show that the transferred Sc chromosomes are stably maintained in Km. The monochromosomal hybrids exhibited heterosis, demonstrating the technology’s potential in future phenotypic screens. The improved salt resistance and flocculation phenotypes were associated with Sc promoters being activated by Km-encoded trans-acting factors in the artificial hybrids, to a greater level than the native regulation in Km or Sc. Finally, we examined the molecular consequences of monochromosomal hybridization with transcriptomic analysis, revealing broad transcriptional responses upon chromosomal transfer, as well as prevalent compensatory evolution of cis- and trans-regulatory changes during the divergence of the two species.
One of the technological advances in our study is the method to enable stable maintenance of Sc chromosomes in Km. There are several technical factors of consideration. First, we removed Sc TELs and circularized the chromosomes of interest. The circularization made it possible to easily remove unwanted linear chromosomes by exonuclease treatment 40, which may increase transformation efficiency. In the case of CEN engineering, it was important to ensure that the heterologous CEN are not partially functional, which has been shown to disrupt chromosome segregation and lead to dicentric breakage, thereby causing genome rearrangements and instability 57. We found that Sc and Km CENs were non-functional in the other species. Therefore, we may use double CENs for maintaining the heterologous chromosomes. Next, we considered the density of ARSs. The requirement of ARS density for YACs varied in previous reports, from one ARS per 30 kb 58, 51 kb 59, to 1 Mb 60. In our study, KmARS1 was positioned at a density of one ARS per 110 kb in R1 and 150 kb in R3, which proved to be sufficient for replication of Sc chromosomes in the hybrid. Finally, our strategy allows for possibilities of transferring multiple chromosomes into Km. This can be done either by marker recycling, i.e., popping out URA3 in the monochromosomal hybrid (see the sequence design File S1) and transforming another chromosome subsequently, or by mating with another monochromosomal hybrid.
The synthetic Km-Sc monochromosomal hybrids showed interesting phenotypic and molecular characteristics, demonstrating that chromosome transfer is functional, even across a remote evolutionary distance such as between yeast genera. Previous interspecific chromosomal transfers in yeast have been successful in using S. cerevisiae as a container to maintain heterologous genetic materials from bacteria 30,60–62, but little interaction between the host and the heterologous chromosomes has been characterized, or desired at all. In our study, we found that the Sc chromosomes were “alive” in the Km background. All genes on the transferred R1 and R3 chromosomes were actively transcribed, despite a high level of sequence divergence. Genes on R1 and R3 showed significant expression differences between the KS and Sc backgrounds, as did Km genes between the KS and Km backgrounds. The transcriptional responses indicate active interspecific regulation, which can give rise to heterosis phenotypes, as shown with SPS22 and FLO9. It is of future interest to investigate what proportion of the regulatory interactions between the two species’ genomes are associated with beneficial phenotypes, or indicative of molecular incompatibility.
There are a few advantages in constructing functional monochromosomal hybrids for synthetic biology purposes. First, it partly circumvents the problem of genome incompatibility, which is a major barrier for hybridization between remotely related species. We found that, although Km and Sc cannot form hybrids, the monochromosomal hybrids with R1 or R3 were viable. This suggests that R1 and R3 did not carry incompatibilities that are detrimental to the hybrids’ survival, although they may cause fitness defects under certain environmental conditions (Fig. 3A). These observations imply that R1 and R3 might carry environment-dependent incompatibilities, which should be noted in future studies. Second, the hybrids exhibited heterosis in a few phenotypes, demonstrating industrial potential. They also provide ideal materials for directed evolution for desirable traits, as demonstrated by other synthetic hybrids 63. Third, the limited number of genes on transferred chromosomes made it easier to identify the molecular basis of phenotypes of interest. We found that the flocculation phenotype in KS-R1 was due to overexpression of ScFLO9, consistent with its known role in Sc 64. This suggests that the mechanism by which overexpressing ScFLO9 boosts flocculation is conserved in both Sc and Km 64. The heterosis in NaCl tolerance stems from overexpression of ScSPS22 in KS-R3, which might be caused by loss of repressors 65. It remains to be understood why overexpression of ScSPS22 had opposite effects in KS and Sc. One explanation lies in the different composition of glucan in the cell wall between Km and Sc 66. As a gene responsible for synthesizing β-1,3-glucan in the cell wall, ScSPS22 might affect the two species differently under osmotic stresses.
The monochromosomal hybrids allowed for exploring evolutionary patterns of gene regulation between distantly related species. We found that the intergeneric expression differences were primarily driven by a combination of cis and trans effects (Fig. 5B, C), contrasting with previous findings with intrageneric species where expression divergence was mainly attributed to changes in cis effects 11,16. However, we note that the monochromosomal hybrids differ from allodiploid hybrids in that their trans environment closely resembles one of the parental species (Km in this case). Therefore, the Sc and Km alleles experienced imbalanced changes in trans environment and the contribution of trans effects might be overestimated. In fact, the monochromosomal hybrids resemble introgression lines where only parts of the genomes are hybridized 67,68. With this deviation in mind, our conclusion is consistent with analysis from 230 transgenic experiments on insects and nematodes, which suggests that cis-trans coevolution is more likely to accumulate over greater evolutionary timespans than cis or trans effects alone 69. Another interesting finding is the prevalence of compensatory effects of cis and trans changes (Fig. 5C) 11,16. It has been proposed that stabilizing selection acts to maintain gene expression, so the cis- and trans-acting changes might accumulate opposite effects. However, few studies have demonstrated that stabilizing selection on gene expression operates on such a long evolutionary timescale as that separating Km and Sc. Finally, our study sheds light on evolutionary patterns that are difficult to study with intragenus hybrids, including the regulatory dynamics of paralogs and species-specific genes (Fig. 5G, H).
In summary, our study established a strategy for engineering the structural elements and ARS of a functional chromosome before transferring it between intergeneric yeasts. This strategy might be extended to other microbes as well as to plants. The resultant hybrids, containing rationally designed synthetic genomes, provide valuable resources for cell factories, synthetic biology, and evolutionary genomics.
Materials and Methods
Strains and media
Yeast strains used in this study are listed in Table S1. Sc strain W303-1A (MATa leu2-3,112 trp1-1 can1-100 ura3-1 ade2-1 his3-11,15) was subjected to chromosome engineering. The engineered Sc chromosome was transferred into a ura3Δ Km strain, FIM-1ΔU 70. The following media were used in this study: YPD 71, supplemented synthetic medium lacking uracil or leucine (SD-Ura or SD-Leu) (0.17% yeast nitrogen base without amino acids and ammonium sulfate, 2% glucose, 1 g/L sodium glutamate, 2 g/L DO Supplement-Ura (630416, Takara) or DO Supplement-Leu (630414, Takara), 2% agar for plates), SD (same recipe as SD-Ura, with the addition of 20 mg/L uracil), and ME (2% malt extract, 3% agar). G418 (A600958, Sango) and hygromycin (H8080, Solarbio) were added to YPD medium at final concentrations of 200 mg/L and 300 mg/L, respectively, to prepare YPDG and YPDH media. Both antibiotics were supplemented into YPD and SD-Ura to prepare YPDGH and SD-Ura+GH, respectively. Unless indicated, cells were grown at 30 ℃.
Plasmids
All plasmids used in this study are listed in Table S2. All primers are listed in Table S6. A previously published vector LHZ626 which contains CEN and ARS elements from both Sc and Km (double CEN/ARS plasmid), as well as KmURA3, served as a control for circular chromosomes 41. To construct plasmids for diploid selection, a fragment containing ScARSH4 and ScCEN6 was amplified from LHZ626 and inserted into the HpaI and SpeI sites of a Km-centromeric plasmid, LHZ882 72, yielding LHZ1493. KANMX6 (kanR) was amplified from pFA6a-KANMX6 and used to replace HPHMX4 (hygR) in LHZ1493 to produce LHZ1494. To create plasmids with different combinations of ARS and CEN from Km and Sc, a BamHI site was introduced between KmARS1 and KmCEN5 of LHZ882 to generate LHZ1495. ScCEN6 was inserted between the BamHI and SalI sites of LHZ1495 to generate LHZ1496. ScARSH4 was inserted between the BamHI and HindIII sites of LHZ1495 to generate LHZ1497. ScARSH4 and ScCEN6 were inserted between the HindIII and SalI sites of LHZ1495 to obtain LHZ1498.
In order to remove the telomeres and join the chromosomal ends, we first constructed CRISPR plasmids expressing single guide RNAs (gRNAs) targeting the telomeres at each end of chr1 and chr3. Primers containing a 20 bp gRNA sequence were annealed in pairs and inserted into the SapI sites of pRS425-Cas9-2xSapI, resulting in plasmids LHZ1499A, LHZ1499B, LHZ1500A and LHZ1500B. Next, the gRNA cassette from LHZ1499A was inserted into the NotI site of LHZ1499B to produce LHZ1499. Similarly, the gRNA cassette from LHZ1500A was inserted into the NotI site of LHZ1500B to produce LHZ1500. The double-gRNA plasmids, LHZ1499 and LHZ1500, were subsequently transformed into yeast (see below). In order to insert KmARS1 and KmCEN5 into the circularized chromosomes, we cloned gDNAs into the SapI sites of pRS425-Cas9-2xSapI to construct LHZ1501∼1504, following the same cloning procedure as described above. The homologous repair templates (“donor DNA”, see below) used in CRISPR/Cas9 genome editing were assembled into pMD-18T with In-Fusion Snap Assembly (Clontech), according to the designs in File S1.
For flocculation analysis, the ScFLO9 cassette, including 1000 bp upstream sequence, the ORF of ScFLO9, and 200 bp downstream sequence, was amplified from the genome of W303-1A. This cassette was inserted into the NotI site of LHZ626 to produce LHZ1505, and into the SmaI and SpeI sites of pRS315 and pRS425 to produce LHZ1509 and LHZ1510, respectively.
For NaCl tolerance analysis, the ScSPS22 cassette, including 1000 bp upstream sequence, the ORF of ScSPS22, and 200 bp downstream sequence, was amplified from the genome of W303-1A. This cassette was inserted into the NotI site of LHZ626 to produce LHZ1506, and into the SmaI and SpeI sites of pRS315 and pRS425 to produce LHZ1511 and LHZ1512, respectively. The 1000 bp upstream sequence of ScSPS22 in LHZ1506 was replaced by the 1000 bp upstream sequence of KmSPS22 to produce LHZ1507. The KmSPS22 cassette, including 1000 bp upstream sequence, the ORF of KmSPS22, and 200 bp downstream sequence, was amplified from the genome of FIM-1ΔU and inserted into the NotI site of LHZ626 to produce LHZ1508. The full sequences of LHZ626, LHZ1495, and pRS425-Cas9-2xSapI are listed in Table S2.
Transformation efficiency and stability of the plasmid containing different combinations of ARS and CEN from Km and Sc
FIM-1ΔU and W303-1A were grown in YPD liquid medium overnight. Cells from 1 mL of culture were pelleted and transformed with LHZ626 or one of LHZ1495∼LHZ1498, using the lithium acetate (LiAc) method 73,74. The cells were diluted and spread onto YPDH plates. Colony-forming units (CFUs) were counted after 3 days. The transformation efficiency (CFUs/μg DNA) was calculated taking the dilution factor into account. To measure the stability of the plasmid, transformants were grown in YPD liquid medium for 24 h. The cells were then diluted and spread onto YPD and YPDH plates. Stability was determined by dividing the CFU count on YPDH plates by that on YPD plates. The experiments were replicated three times.
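The two quantities defined above reduce to simple ratios, sketched here for clarity. The function and parameter names are illustrative assumptions, and the stability calculation assumes that the same dilution and plating volume were used for the YPD and YPDH plates.

```python
def transformation_efficiency(cfu_counted, dilution_factor, dna_ug):
    """CFUs per microgram of DNA, corrected for the dilution before plating."""
    return cfu_counted * dilution_factor / dna_ug

def plasmid_stability(cfu_selective, cfu_nonselective):
    """Fraction of cells retaining the plasmid (YPDH count / YPD count).

    Assumes both plates received the same dilution and volume of culture.
    """
    return cfu_selective / cfu_nonselective
```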
Mating assay
The quantitative mating assay was performed as previously described 75 with modifications. Strains transformed with LHZ1493 (hygR) were used as experimental cells, and strains transformed with LHZ1494 (kanR) served as tester cells. Experimental and tester cells were cultured overnight in YPDH and YPDG liquid media, respectively. Cells were then washed and resuspended in H2O to an OD600 of 1. A total of 100 μL of experimental cells was mixed with 500 μL of tester cells. The mixture resulted in a 1:5 ratio of experimental to tester cells when both cell types were Km or Sc. The mixture resulted in a 1:10 ratio of Sc experimental cells to Km tester cells, as CFU per OD600 of Km cells was twice that of Sc cells. The mixed cells were pelleted, resuspended in 20 μL of ddH2O, and spotted onto ME plates. After incubation at 30 ℃ for 24 h, cells were washed off the ME plates with ddH2O and then pelleted and resuspended in 1 mL of ddH2O. To visualize the mating result, 3 μL of cells was spotted on YPDH, YPDG, and YPDGH plates (Fig. 1E). These plates were incubated at 30 ℃ for 24 h. To quantify mating efficiency, the cells were diluted and plated onto YPD and YPDGH plates. The mating efficiency was calculated as the number of colonies on YPDGH plates divided by one-sixth of the number on YPD plates, reflecting the 1:5 ratio of experimental to tester cells. The experiments were replicated three times.
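The mating-efficiency calculation can be written out as follows. This is a sketch under stated assumptions: the function name and the `mixture_parts` parameter are introduced here for illustration, and the colony counts are assumed to come from plates with the same dilution and plating volume.

```python
def mating_efficiency(cfu_ypdgh, cfu_ypd, mixture_parts=6):
    """Diploid colonies per experimental cell plated.

    cfu_ypdgh: colonies on YPDGH (resistant to both antibiotics, i.e. mated).
    cfu_ypd:   colonies on YPD (all viable cells).
    mixture_parts: total parts in the mix; a 1:5 ratio of experimental to
                   tester cells gives 6 parts, so experimental cells make up
                   one-sixth of the YPD count.
    """
    experimental_cells = cfu_ypd / mixture_parts
    return cfu_ypdgh / experimental_cells
```

With 600 colonies on YPD and 10 on YPDGH, the experimental cells account for 100 colonies and the efficiency is 10/100.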
Engineering of Sc chromosome I and III
Each step of engineering Sc chromosome I (chr1) or III (chr3) involved transforming a CRISPR plasmid (Table S2) and a donor, which is a PCR product for homologous recombination repair. The composition and full sequences of the donors are listed in File S1.
Chr1 and chr3 were circularized as previously described 76. First, LHZ1499 or LHZ1500 was transformed into W303-1A to induce double strand breaks in both telomeres in chr1 or chr3. A DNA fragment “chr1-cir” or “chr3-cir” containing KmURA3 with homologous arms to the chromosomal ends was simultaneously transformed to connect the two chromosomal ends with KmURA3. The constructs above were designed to remove the left telomere (0-1111 bp) and the right telomere along with the repetitive sequence (208917-252221 bp) of chr1, and the left telomere (0-831 bp) and the right telomere (334198-341087 bp) of chr3. Successful circularization was selected by SD-Ura-Leu medium. The transformants with circularized chromosomes were named Sc-ring1 and Sc-ring3.
In order to insert KmARS1 and KmCEN5 into the circularized chromosomes, we transformed LHZ1501 and a donor “chr1-ARS1” into Sc-ring1 to insert KmARS1 and hygR at 66090 bp of chr1, resulting in a strain named Sc-ring1L. LHZ1502 and a donor “chr1-ARS1/CEN5” were transformed into Sc-ring1L to insert KmARS1/KmCEN5 and kanR at 157346 bp of chr1, resulting in a strain named Sc-R1. Similarly, LHZ1503 and a donor “chr3-ARS1/CEN5” were transformed into Sc-ring3 to insert KmARS1/KmCEN5 and hygR at 108718 bp of chr3, resulting in a strain named Sc-ring3L. LHZ1504 and a donor “chr3-ARS1” were transformed into Sc-ring3L to insert KmARS1 and kanR at 243183 bp of chr3, resulting in a strain named Sc-R3. All transformations described above followed the LiAc method 73,74.
Extraction and protoplast transformation of R1 and R3
R1 and R3 were extracted from Sc-R1 and Sc-R3 as previously described 40, which was developed for purifying yeast artificial chromosomes up to 600 kb in size. A total of 10 μL R1 or R3 was transformed into Km by protoplast transformation 42. The transformants were selected on SD-Ura+GH plates.
Pulsed-Field Gel Electrophoresis (PFGE)
Plugs containing the genome of Sc, Km, and hybrid cells were prepared by using the CHEF Yeast Genomic DNA Plug Kit (170-3593, Biorad). To separate circular R1 from linear chromosomes in KS-R1, chromosomes were separated on the CHEF Mapper XA System in a 1% pulsed field certified agarose gel (162-0137, Biorad) in 0.5×TBE (diluted from 10×TBE, T1051, Solarbio) at 14 ℃. The running time was 24 h at 6.0 V/cm, with a 60∼120 sec switch time ramp at an included angle of 120°. The plug containing the entrapped R1 was cut out and linearized with NotI as described in the protocol of the CHEF Yeast Genomic DNA Plug Kit. Similarly, R3 in KS-R3 was separated from linear chromosomes and then linearized with AscI. Chromosomes of W303-1A, Sc-R1, Sc-R3, linearized R1 and R3 were separated in a 1% pulsed field certified agarose gel in 0.5×TBE at 14 ℃. The running time was 16 h at 6.7 V/cm, with a 10∼40 sec switch time ramp at an included angle of 120°. Lambda PFG Ladder (N0341S, NEB) was used as a size marker for PFGE.
Growth curves of KS-R1 and KS-R3 and the stabilities of transferred chromosomes
As a control, FIM-1ΔU was transformed with LHZ626 to produce Km-V. KS-R1, KS-R3, and Km-V were cultured in YPDGH liquid medium overnight. The overnight culture (referred to as day 0 culture) was diluted into YPD liquid medium to an initial OD600 of 0.2 and grown at 30 ℃. To monitor growth curves, the OD600 of the culture was recorded every 2 hours for the first 14 hours. To assess the stability of the transferred chromosomes, the culture was diluted into fresh YPD medium to start at an OD600 of 0.2 every 24 hours, a period corresponding to approximately 7 generations. Cultures from day 0 and after 5 days of growth were diluted and plated onto YPD and YPDGH plates on the day the cultures were collected. Stability was determined by dividing the CFU count on YPDGH plates by that on YPD plates. The loss rate of transferred chromosomes per generation was calculated as previously described 72. The experiments were replicated three times.
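The text does not restate the loss-rate formula from reference 72, so the sketch below uses a common convention as an explicit assumption: a constant loss probability r per generation, so that the retained fraction after n generations is f_n = f_0 (1 − r)^n. It may differ from the cited method, and the function names are illustrative.

```python
def retained_fraction(cfu_selective, cfu_nonselective):
    """Fraction of cells carrying the chromosome (YPDGH count / YPD count)."""
    return cfu_selective / cfu_nonselective

def loss_rate_per_generation(fraction_start, fraction_end, generations):
    """Per-generation loss rate r, ASSUMING f_end = f_start * (1 - r) ** n.

    This constant-loss model is an assumption of this sketch, not
    necessarily the formula of the cited reference.
    """
    return 1 - (fraction_end / fraction_start) ** (1 / generations)

# Five daily passages of about 7 generations each, as described in the text.
GENERATIONS = 5 * 7
```

Under this model, a culture that keeps about 70% of its day-0 retained fraction after 35 generations corresponds to a loss rate of roughly 1% per generation.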
Genome Sequencing
Km-V, Sc-R1, Sc-R3, KS-R1, and KS-R3 were cultured overnight in SD-Ura+GH liquid medium. The cultures were diluted to an OD600 of 0.2 and cultured until the OD600 reached 0.6. The cells were collected and washed once with ddH2O. Genomic DNA was extracted using the E.Z.N.A. Fungal DNA Kit (D3390, OMEGA Bio-Tek) and sequenced on the Illumina NovaSeq 6000 platform using the 150-bp paired-end sequencing strategy (BIOZERON, Shanghai, China).
To construct the reference sequences for R1 and R3, sequences of chr1 and chr3 were extracted from the W303-1A genome (GenBank assembly: GCA_002163515.1). We then manually modified the sequences to reflect our genome edits, including the insertions of Km elements and deletions of telomere sequences, giving rise to reference sequences R1 and R3 (provided in Table S4). Annotations for R1 and R3 were transferred from S288c annotations with SnapGene. The whole-genome sequencing data were processed by BIOZERON biotechnology. To identify de novo variants during circular chromosome transfer, the raw sequencing data of Sc and KS strains were aligned to reference sequences for R1 or R3 using BWA 77 with default settings. Picard tools were used to remove PCR duplicates. The average sequencing depth of KS-R1, Sc-R1, KS-R3, and Sc-R3 was 298, 1,183, 1,241 and 1,134 for the respective R1 or R3 chromosome. SNPs and indels were called following the best practices of GATK HaplotypeCaller 78. Then, all heterozygous variants were removed from the VCF files of KS-R1 and KS-R3. All variants coexisting in both KS-R1 and Sc-R1 or KS-R3 and Sc-R3 were filtered out from the KS-R1 and KS-R3 VCF files as well. The remaining variants were considered to have occurred during the chromosome transfer. In the case of R1, no de novo mutation was found.
To identify variants in the genome of Km, raw sequencing data of KS-R1, KS-R3, and Km-V were aligned to the reference genome of FIM1 (GenBank assembly: GCA_001854445.2) 79 combined with either R1 or R3 sequence, using BWA with default parameters. Picard tools were used to remove PCR duplicates. The average sequencing depth for KS-R1, KS-R3 and Km-V was 1,036, 1,102 and 615, respectively. GATK HaplotypeCaller was used to identify the variants in these strains, following its best practices. VCFtools 80 was used to remove variants shared between Km-V and KS-R1 or KS-R3, as well as heterozygous calls. The remaining unique, homozygous variants in KS-R1 and KS-R3 were listed in Table S3.
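The filtering logic applied to the variant calls (discard heterozygous calls, then discard anything also present in the control strain) can be illustrated with plain Python. This is a conceptual sketch only: the study used GATK, Picard and VCFtools, and the dictionary representation and function name below are assumptions made for illustration.

```python
def unique_homozygous(sample_calls, control_calls):
    """Keep homozygous variants in the sample that are absent from the control.

    Each call set maps (chrom, pos, ref, alt) -> genotype string, e.g. "1/1".
    """
    kept = {}
    for key, genotype in sample_calls.items():
        alleles = set(genotype.replace("|", "/").split("/"))
        # Homozygous alternate: a single allele that is neither
        # reference ("0") nor missing (".").
        is_homozygous_alt = len(alleles) == 1 and not alleles & {"0", "."}
        if is_homozygous_alt and key not in control_calls:
            kept[key] = genotype
    return kept
```

For the Km genome, the sample would be KS-R1 or KS-R3 and the control Km-V; for the transferred chromosomes, the control would be the corresponding Sc-R1 or Sc-R3 strain.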
RNAseq and qPCR
Three biological replicates of Km-V, Sc-R1, Sc-R3, KS-R1, and KS-R3 were cultured overnight in SD-Ura+GH liquid medium. The cultures were diluted into YPD to achieve an OD600 of 0.2. Once the OD600 of the cultures reached 0.6, cells were collected directly for the YPD group. For stress treatment, the culture was supplemented with tunicamycin (T8480, Solarbio) at a final concentration of 1 μg/mL, or with NaCl at a final concentration of 1 M, and cells were collected after 1 hour. Total RNA was extracted using the ZR Fungal/Bacterial RNA MiniPrep kit (R2014, ZymoResearch). Samples were reverse-transcribed using the TruSeq RNA Sample Preparation Kit (Illumina, California, USA) and sequenced on an Illumina HiSeq X Ten (BIOZERON, Shanghai, China).
Clean reads were aligned and processed by BIOZERON biotechnology. Briefly, the reads were aligned to the corresponding reference genomes with HISAT2 (2.0.5) 81. The reference sequence for Km-V was the FIM1 genome. The reference for Sc-R1 and Sc-R3 was the S288c genome (GenBank assembly: GCA_000146045.2), with the chr1 or chr3 sequences replaced by that of R1 or R3 (see above), respectively. The reference for KS-R1 and KS-R3 combined the FIM1 genome and R1 or R3, respectively. The number of reads covering each gene was counted with featureCounts (1.6.3) in the Subread package 82, with the parameters “-p -C -B -P -O -T 16 -Q 20”. Of note, the reads for Km genes were counted based on FIM1 annotations, but the genes were renamed to be consistent with the nomenclature of a previously published K. marxianus DMKU3-1042 genome (GCA_001417885.1). The relationship between FIM1 and DMKU3-1042 labels had been previously published 79.
Differential expression was analyzed with DESeq2 83. To examine changes in expression levels before and after chromosomal transfers (Sc vs. KS; Km vs. KS), the data were fitted to a model of count ∼ condition, where condition represented a combination of the strain factor (Sc/KS/Km) and the treatment factor (YPD/TM/NaCl). Adjusted p-values for pairwise comparisons between conditions (e.g. Sc-YPD vs. KS-YPD) were derived from FDR-corrected Wald significance tests, with an FDR cutoff of 0.05. Log2 fold changes were shrunken with the ashr method 84. In this analysis, Sc and Km alleles were treated independently. For Sc, only genes in R1 and R3 were included. Synthetic elements, including the KmURA3, HphMX4 and KanR cassettes, were excluded from the analysis. Genes with an average read count lower than 5 in any condition were removed from the analysis. The removed Sc genes included 6 genes not expressed in Sc (PAU8, YAL067W-A, YAR010C, YAR035C-A, YCL067C, and YCL066W), the 13 genes absent in KS-R3 (see Results), and one gene not expressed in either Sc or KS (YCR040W), leaving 235 out of 255 genes in the dataset. For Km, 93 genes were filtered out due to low read counts, of which 21 genes had a read count of 0 across all conditions, possibly due to annotation problems, and others showed condition-specific expression. There was a total of 4,857 Km genes after the filters. The median of read counts across conditions was 2,083 for Sc and 1,770 for Km, prior to normalization.
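The low-count filter described above (remove a gene if its average read count is below 5 in any condition) can be sketched as follows. The data layout, with replicate counts grouped by condition, and the function name are assumptions for illustration; the actual filtering was done within the authors' DESeq2 workflow.

```python
from statistics import mean

def passes_count_filter(counts_by_condition, min_mean=5):
    """Return True if the gene's mean count is at least min_mean in every condition.

    counts_by_condition: dict mapping a condition name to a list of
    replicate read counts for one gene (hypothetical layout).
    """
    return all(mean(replicates) >= min_mean
               for replicates in counts_by_condition.values())
```

A gene with mean counts of 10, 6 and 25 across three conditions is kept, whereas a gene with a mean of 1 in a single condition is removed even if it is well expressed elsewhere.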
For qPCR, RNA samples were reverse-transcribed using a PrimeScript RT Reagent Kit (RR037A, Takara, China). The qPCR was performed using ChamQ Universal SYBR qPCR Master Mix (Q711-02, Vazyme). The mRNA level was normalized to the average of three housekeeping genes: MPE1, TRK1, and SWC4. Primers used in qPCR are listed in Table S6.
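One common way to normalize a target to several reference genes is the ΔCt approach, sketched below. The text does not state the exact formula used, so this is an assumption: the reference is taken as the arithmetic mean Ct of the three housekeeping genes and amplification efficiency is assumed to be 2. The Ct values and the function name are invented for illustration.

```python
from statistics import mean


def relative_mrna_level(target_ct, reference_cts):
    """Relative level as 2^-(Ct_target - mean Ct_reference).

    Assumes a perfect amplification efficiency of 2 (an assumption, not from the text).
    """
    delta_ct = target_ct - mean(reference_cts)
    return 2.0 ** (-delta_ct)


# Hypothetical Ct values for the three housekeeping genes named in the text.
reference = {"MPE1": 24.0, "TRK1": 26.0, "SWC4": 25.0}
level = relative_mrna_level(23.0, list(reference.values()))
print(level)
```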
Cis and trans effects
The Km orthologs of Sc genes were identified with OrthoDB (v11) 85. Where a gene had multiple orthologs, we selected the one with the highest score in the ortholog search of the Sequence Similarity DataBase (SSDB) in the Kyoto Encyclopedia of Genes and Genomes (KEGG) 86. The orthologous and Sc-specific genes, essential genes 87, and paralogous genes on R1 and R3 88 are listed in Table S7.
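The tie-breaking rule for genes with several candidate orthologs amounts to taking the highest-scoring candidate. A minimal sketch, with invented gene identifiers and scores and a hypothetical function name, might look like this:

```python
def pick_best_ortholog(candidates):
    """Return the candidate ID with the highest similarity score, or None if empty.

    candidates: list of (km_gene_id, score) pairs.
    """
    if not candidates:
        return None
    return max(candidates, key=lambda pair: pair[1])[0]


# Hypothetical candidates for one Sc gene; identifiers and scores are made up.
candidates = [("KmGeneA", 812), ("KmGeneB", 1430), ("KmGeneC", 977)]
best = pick_best_ortholog(candidates)
print(best)
```

Genes with no candidate return None, which corresponds to the Sc-specific category.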
Cis and trans effects were calculated as previously described 17. Briefly, read counts were normalized and analyzed using DESeq2. We designated Km alleles in Km-V as kmk, Sc alleles in Sc-R1 or Sc-R3 as scs, Km alleles in hybrids as hyk, and Sc alleles in hybrids as hys. The variables gb, F, and allele were used, where gb represents genetic background (Km for kmk, Sc for scs, hybrid-Km for hyk, hybrid-Sc for hys), F represents generation (F0 for kmk and scs, F1 for hyk and hys), and allele indicates the allele (km for kmk and hyk, sc for scs and hys).
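The variable coding described above can be written out as a small lookup table. The sketch below builds the sample annotation for the four allele classes; the sample names (for example "kmk_rep1") and the helper design_rows are hypothetical, while the levels of gb, F, and allele follow the definitions in the text.

```python
# Levels of gb, F, and allele for each allele class, as defined in the text.
ALLELE_CLASSES = {
    "kmk": {"gb": "Km", "F": "F0", "allele": "km"},
    "scs": {"gb": "Sc", "F": "F0", "allele": "sc"},
    "hyk": {"gb": "hybrid-Km", "F": "F1", "allele": "km"},
    "hys": {"gb": "hybrid-Sc", "F": "F1", "allele": "sc"},
}


def design_rows(sample_names):
    """Build one annotation row per sample from names like 'kmk_rep1' (hypothetical format)."""
    rows = []
    for name in sample_names:
        allele_class = name.split("_")[0]
        rows.append({"sample": name, **ALLELE_CLASSES[allele_class]})
    return rows


samples = ["kmk_rep1", "scs_rep1", "hyk_rep1", "hys_rep1"]
for row in design_rows(samples):
    print(row)
```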
The model “count ∼ gb” was used to test the significance of cis + trans and cis effects, i.e., contrasts between kmk and scs for cis + trans effects and between hyk and hys for cis effects. P-values were obtained from Wald tests followed by FDR correction. For testing trans effects, another model, “count ∼ F + allele + F:allele”, was used. Significant trans effects were determined by a likelihood ratio test (test = “LRT”, reduced = “F + allele”) on the interaction term “F:allele”. Genes with significant cis or trans effects were defined by a log2-fold change greater than 1 and an FDR-adjusted p-value less than 0.05. YPD, TM, and NaCl data were analyzed separately.
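The logic behind these contrasts can be illustrated with simple point estimates. The sketch below is not the DESeq2 model fit itself. It assumes that effects are expressed as Sc relative to Km, that the trans effect is the parental difference minus the allelic difference within the hybrid (the quantity captured by the F:allele interaction), and that the fold-change cutoff applies to the absolute value. The input values and function names are invented.

```python
import math


def cis_trans_effects(kmk, scs, hyk, hys):
    """Point estimates of log2 effects from mean normalized counts (Sc relative to Km)."""
    total = math.log2(scs / kmk)  # parental difference: cis + trans
    cis = math.log2(hys / hyk)    # allelic difference within the hybrid: cis only
    trans = total - cis           # remainder, corresponding to the interaction term
    return {"cis+trans": total, "cis": cis, "trans": trans}


def is_significant(log2_fold_change, padj, lfc_cutoff=1.0, fdr_cutoff=0.05):
    """Apply the stated cutoffs, assuming the fold-change cutoff is on the absolute value."""
    return abs(log2_fold_change) > lfc_cutoff and padj < fdr_cutoff


# Hypothetical mean normalized counts for one ortholog pair.
effects = cis_trans_effects(kmk=100.0, scs=400.0, hyk=100.0, hys=200.0)
print(effects)
```

In this toy case, half of the fourfold parental difference persists between alleles in the shared hybrid background (cis) and the other half disappears (trans).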
The transcriptional response to stress was calculated as the ratio between DESeq2-normalized read counts before and after TM/NaCl treatment. A two-tailed Student’s t-test was used to test the significance of stress responses, with a p-value cutoff of 0.05. The contributions of cis and trans effects to the divergence of transcriptional response under stress were calculated accordingly. The summarized experiment matrix and values of cis and trans effects are shown in Table S7.
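A minimal sketch of the response calculation is given below. It assumes that the response is the treated-to-untreated ratio of mean normalized counts on a log2 scale, and that its divergence is partitioned the same way as expression divergence (parental difference as cis + trans, hybrid allelic difference as cis). Both are assumptions made for illustration, as are the counts and function names.

```python
import math
from statistics import mean


def log2_response(untreated, treated):
    """log2 ratio of mean normalized counts after vs. before treatment."""
    return math.log2(mean(treated) / mean(untreated))


def response_divergence(responses):
    """Split the Sc-vs-Km difference in stress response into cis and trans parts."""
    total = responses["scs"] - responses["kmk"]
    cis = responses["hys"] - responses["hyk"]
    return {"cis+trans": total, "cis": cis, "trans": total - cis}


# Hypothetical normalized counts (three replicates each) in YPD and under NaCl.
responses = {
    "kmk": log2_response([100, 100, 100], [200, 200, 200]),
    "scs": log2_response([50, 50, 50], [400, 400, 400]),
    "hyk": log2_response([80, 80, 80], [160, 160, 160]),
    "hys": log2_response([40, 40, 40], [160, 160, 160]),
}
divergence = response_divergence(responses)
print(responses)
print(divergence)
```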
Spot assay and quantification of growth phenotypes under various conditions
Km-V, Sc-R1, Sc-R3, KS-R1, and KS-R3 were cultured overnight in liquid SD-Ura+GH medium. The cultures were diluted to an OD600 of 1.0 and subjected to five serial 5-fold dilutions. These dilutions were spotted on plates using a 48-pin replicator. To evaluate growth on different carbon sources, we used YP in combination with 2% ethanol, 3% glycerol, or different concentrations of glucose (0.02%, 1%, 3%, and 5%). To evaluate growth on various nitrogen sources, we replaced 1 g/L sodium glutamate in the SD medium with 1 g/L threonine, 1 g/L serine, or 1.25 g/L ammonium sulfate (1/4 N). In the 1/2 AA and 1/2 N condition, the amino acids supplemented in the SD medium were halved, and 2.5 g/L ammonium sulfate was added. In the plates for chemical treatment, YPD medium was supplemented with 0.08% H2O2, 60 mM acetic acid (AcOH), 0.4 μg/mL TM, 20 mM DTT, 15 μg/mL 5-FU, 1 M NaCl, 1 M sorbitol, 20 μg/mL benomyl (BML), 10 μg/mL thiabendazole (TBZ), 0.05 M hydroxyurea (HU), 0.01% methyl methane sulfonate (MMS), 5 μg/mL camptothecin (CPT), or 0.05 μg/mL cycloheximide (CHX). The plates were incubated at 30 °C or at other specified temperatures for 1-3 days before pictures were taken. To quantify growth phenotypes, spot quantification was performed using the ‘gitter’ package 89 in R, with the dilution factor taken into account. The quantified spot value of a strain under a specific condition was divided by the value of the same strain grown in YPD to calculate the relative growth value under that condition. Each spot value represents the average of three replicates.
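The arithmetic of the dilution series and of the relative growth value can be sketched as follows. The spot intensities are invented, the function names are hypothetical, and the way the dilution factor enters the gitter-based quantification is not reproduced here because the text does not specify it.

```python
from statistics import mean


def dilution_series(start_od=1.0, fold=5, steps=5):
    """OD600 of each successive dilution, starting from a culture at start_od."""
    return [start_od / fold ** i for i in range(1, steps + 1)]


def relative_growth(condition_spots, ypd_spots):
    """Mean spot value under a condition divided by the same strain's mean value on YPD."""
    return mean(condition_spots) / mean(ypd_spots)


series = dilution_series()
print(series)  # five 5-fold dilutions of an OD600 1.0 culture

# Hypothetical spot values from three replicates of one strain.
growth_on_nacl = relative_growth([30, 35, 25], [60, 60, 60])
print(growth_on_nacl)
```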
To perform spot assays of cells with plasmids expressing SPS22, Sc-R3 was transformed with LHZ1511 or LHZ1512, and Fim-1ΔU was transformed with LHZ1506, LHZ1507, or LHZ1508. Transformants were cultured overnight in SD-Leu medium. The culture was diluted and spotted onto plates with or without 1 M NaCl, as described above.
Flocculation analysis
Km-V, KS-R1, and Sc-R1 were cultured overnight in SD-Ura medium. Sc-R1 was transformed with LHZ1509 or LHZ1510. Fim-1ΔU was transformed with LHZ1505. Transformants were cultured overnight in SD-Leu medium. Cells were harvested and washed with ddH2O and 250 mM EDTA. After two subsequent washes with ddH2O to ensure complete removal of EDTA, cells equivalent to an OD600 of 40 were pelleted and resuspended in 2 mL of ddH2O in a 15 mL tube. A total of 100 μL of 1 M Tris-HCl (pH 7.5) was added to the cell suspension, followed by 1 min of agitation. The tube was then left undisturbed, and pictures were taken every minute for a total of 5 minutes. Supernatant samples were collected at various time points from the surface for OD600 measurement to quantify the flocculation progress.
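One way to express flocculation progress from the supernatant readings is the fraction of cells that have left suspension relative to the first time point. The text does not give the formula used, so the sketch below is an assumption, and the OD600 values and function name are invented.

```python
def flocculation_percent(od_by_minute):
    """Percent of cells settled at each time point, relative to the OD600 at time 0.

    od_by_minute: dict mapping minutes after agitation to supernatant OD600.
    """
    initial_od = od_by_minute[0]
    return {minute: 100.0 * (1.0 - od / initial_od)
            for minute, od in sorted(od_by_minute.items())}


# Hypothetical supernatant OD600 readings over the 5-minute time course.
readings = {0: 2.0, 1: 1.5, 5: 0.5}
progress = flocculation_percent(readings)
print(progress)
```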
Funding
Science and Technology Research Program of Shanghai grant 24ZR1406500 (YY)
National Key Research and Development Program of China grant 2021YFA0910603 (YY)
National Key Research and Development Program of China grant 2022YFC2106201 (JZ)
National Key Research and Development Program of China grant 2021YFA0910601 (HL)
Science and Technology Research Program of Shanghai grant 2023ZX01 (HL)
Open Fund of State Key Laboratory of Genetic Engineering grant SKLGE-2318 (YY)
The Fundamental Research Funds for the Central Universities (XCL)
Young Scientists Fund of the National Natural Science Foundation of China grant 32400495 (XCL)
Author contributions
Conceptualization: HL, YY
Methodology: HL, YY, YL
Investigation: YL, KS, JZ, HC, XCL
Writing—original draft: YL, YY
Writing—review & editing: YL, YY, YW, XH, XCL
Competing interests
Authors declare that they have no competing interests.
Data and materials availability
DNA-seq and RNA-seq data are available from NCBI (https://dataview.ncbi.nlm.nih.gov/object/PRJNA1102743?reviewer=98okerqm686mrj3ddq79r4l18u).
Supplementary Materials
Fig. S1. Transformation efficiency and stability of plasmids containing different combinations of ARS and CEN from Km and Sc.
Fig. S2. Zygotes formed between Km cells and between Sc cells.
Fig. S3. Growth of Sc-R1 and Sc-R3.
Fig. S4. Identification of a 29 kb deletion in R3.
Fig. S5. PFGE of Km chromosomes.
Fig. S6. Growth of hybrids and their parental strains under various conditions.
Fig. S7. Growth of KS-R1 and KS-R3 after loss of Sc chromosomes.
Fig. S8. Flocculation of Km and KS-R1ΔR1.
Fig. S9. Expression differences along R1 and R3 chromosomes in TM and NaCl conditions.
Fig. S10. Proportion of genes exhibiting conserved and diverged regulation in stress conditions.
File S1. Donors used in the engineering of Sc chr1 and chr3.
Table S1. Strains used in this study.
Table S2. Plasmids used in this study, and sequences of LHZ626, LHZ1495 and pRS425-Cas9-2xSapI.
Table S3. SNPs and INDELs on Km chromosomes of R1&R3.
Table S4. Read counts of R1 and R3 genes, with reference sequences for R1 and R3.
Table S5. Differentially expressed Km genes in KS-R1 and KS-R3.
Table S6. Primers used in this study.
Table S7. Homologous, Sc-specific, essential and paralogous genes on R1 and R3, and their cis and trans effects under 3 conditions.
Acknowledgments
References