Abstract
Synthetic genomics provides a distinct approach to systematically explore the process of genome evolution in a dynamic manner. SCRaMbLE is an evolutionary system intrinsic to the synthetic yeast genome that can rapidly drive structural variations. Here, we detected over 260,000 rearrangement events after SCRaMbLEing a synthetic strain harboring 6 synthetic yeast chromosomes. Remarkably, the rearrangement events exhibited a specific landscape of rearrangement frequency. We further revealed that the landscape was shaped by the coordinated effects of chromatin accessibility and spatial contact probability. The rearrangements tend to occur in 3D spatially proximal and chromatin-accessible regions. The enormous number of rearrangements generated by SCRaMbLE provides a driving force to potentiate directed genome evolution, and investigation of this population genomic resource offers mechanistic insights into dynamic genome evolution.
Main Text
Studying the processes and mechanisms of genome evolution is critical to understanding genetic diversity and species diversity at the genomic level (1–4). S. cerevisiae, a powerful model organism for eukaryotic genome evolution (5–7), has been subjected to comparative genomic studies, which provided mechanistic insights into genome evolution (8–12). However, these studies relied on relatively static genomic sequences and could have missed many details of dynamic processes. Synthetic genomes, built by de novo design and synthesis of genomic sequences and incorporating a variety of strategies such as genome minimization (13), genetic codon recoding (14, 15) and introduction of synthetic parts (16), have greatly facilitated the study of genome evolution. SCRaMbLE, which relies on symmetrical loxP (loxPsym) recombination sites positioned downstream of the 3’ untranslated regions (3’UTRs) of all nonessential genes as part of the Synthetic Yeast Genome Project (Sc2.0) (16, 17), has also recently been used to study genome evolution. Induced Cre recombinase activity quickly triggers rearrangements between loxPsym sites and generates structural variations, including deletions, inversions, duplications, and translocations (18–25), providing missing snapshots of dynamic genome evolution events.
Here we consolidated six synthetic chromosomes (synII (26), synIII (27), synV (28), synVI (29), synIXR (17), and synX (30)) in a single haploid strain based on an orthogonal site-specific recombination system, which enabled elimination of the entire counterpart wild-type chromosomes. We comprehensively analyzed a SCRaMbLEd pool obtained using the poly-synthetic strain and detected over 260,000 rearrangement events via a loxPsym junction analysis method. By analyzing these rearrangement events, we uncovered a stable rearrangement landscape of the synthetic chromosomes that could be correlated with local chromatin structures and the three-dimensional genome architecture via assay for transposase-accessible chromatin using sequencing (ATAC-seq) and genome-wide chromosome conformation capture (Hi-C), respectively. Our results reveal that the rearrangements tend to occur in 3D spatially proximal and chromatin-accessible regions, which provides insights into the effect of hierarchical chromatin organization on the dynamic evolution of the genome.
Results
Consolidation and SCRaMbLE of six synthetic chromosomes
We used a stepwise method to construct a single yeast haploid strain that harbors six synthetic chromosomes (synII, synIII, synV, synVI, synIXR, synX) (17, 26–30). We crossed two haploid strains of opposite mating types harboring synV&X (yYW169) and synII, III, VI&IXR (yZY192), respectively, resulting in a diploid strain (yYW268) (Fig. S1A). Next, we employed Vika/vox, a site-specific recombination system orthogonal to Cre/loxP (31, 32), to excise the centromere from a chromosome and thereby eliminate the whole chromosome. The homologous wild type chromosomes (wtII, wtIX, wtV, wtX, wtVI, and wtIII) were eliminated successively (Fig. S1B) (33). Following sporulation and tetrad dissection, the haploid strain (yYW394), in which the six aforementioned wild type chromosomes are replaced by synthetic chromosomes comprising ~2.61 Mb (~22.0% of the yeast genome) (Table S1), was constructed (Fig. S1A) and confirmed by pulsed-field gel electrophoresis (Fig. S1C, Table S1) and genome sequencing (Fig. S2B). The yYW394 strain grew robustly at 30°C but exhibited growth defects at 37°C (Fig. S2A). Adaptive laboratory evolution of yYW394 was performed to recover growth fitness, generating a new strain (yZSJ025) (Fig. S1A).
A Cre recombinase expression plasmid, pYW085 (pRS413-pCLB2-Cre-EBD), was transformed into yZSJ025, which carries 894 loxPsym sites in the 3’UTRs of nonessential genes (Fig. S3). SCRaMbLE was then induced by β-estradiol. The SCRaMbLEd cells were recultured in fresh YPD liquid medium without β-estradiol (Fig. 1A) and subjected to deep sequencing (~600,000×). All reads (150 bp each) were screened for the presence of loxPsym. Reads whose loxPsym-flanking sequences differed from the reference were identified and are hereafter called rearrangement reads; these were then used to identify and classify rearrangement events (25) (Fig. S4). Identical reads were considered to result from a single rearrangement event. In total, 263,520 rearrangement events were detected, including 124,499 (47.24%) intra-chromosomal (Fig. 1B) and 139,021 (52.76%) inter-chromosomal (Fig. 1C) events. We further analyzed the intra-chromosomal rearrangement events and found 62,106 (23.57%) inversions, 22,526 (8.55%) deletions and 39,867 (15.13%) complex rearrangement events (Fig. 1D left panel) (Fig. S4). The number of identical reads represents the frequency of the corresponding rearrangement event. When the percentages of the different rearrangement types were calculated by read number (Fig. 1D right panel), the percentage of inter-chromosomal events dropped to 9.87%, indicating that these are relatively low-frequency events compared with intra-chromosomal events. We next investigated whether rearrangements occur preferentially on particular chromosomes. The total number of rearrangement reads for each chromosome was plotted against the number of loxPsym sites in that chromosome. The linear relationship suggests that no such preference is evident (r = 0.96, p = 0.002) (Fig. 1E).
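The distinction between counting by event and counting by read can be illustrated with a minimal Python sketch. This is not the junction analysis pipeline used in the study; the function names, the junction labels, and the toy reads are hypothetical and serve only to show how identical reads collapse into one event while their number is kept as that event's frequency. The last lines recompute the event percentages from the counts reported in the text.

```python
from collections import Counter


def summarize(reads):
    """Collapse rearrangement reads into events.

    reads: iterable of (junction_id, category) tuples, one per read.
    Identical tuples are treated as one event; the number of reads
    supporting a junction is taken as that event's frequency.
    """
    reads_per_junction = Counter(reads)  # (junction, category) -> read count
    events = Counter(cat for _, cat in reads_per_junction)  # unique events per category
    read_totals = Counter()
    for (_, cat), n in reads_per_junction.items():
        read_totals[cat] += n  # reads per category
    return events, read_totals


def percentages(counts):
    """Convert a mapping of counts into percentages rounded to 2 decimals."""
    total = sum(counts.values())
    return {k: round(100 * v / total, 2) for k, v in counts.items()}


# Hypothetical toy data: two intra-chromosomal junctions and one
# inter-chromosomal junction, with very different read support.
toy_reads = (
    [("synV:12|synV:13", "intra")] * 8
    + [("synX:3|synX:40", "intra")] * 1
    + [("synII:5|synX:7", "inter")] * 1
)
events, read_totals = summarize(toy_reads)
by_event = percentages(events)      # {'intra': 66.67, 'inter': 33.33}
by_read = percentages(read_totals)  # {'intra': 90.0, 'inter': 10.0}

# Event counts as reported in the text above.
reported = {"inversion": 62106, "deletion": 22526, "complex": 39867, "inter": 139021}
reported_pct = percentages(reported)
print(sum(reported.values()), reported_pct)
```

In the toy data the inter-chromosomal junction accounts for a third of the events but only a tenth of the reads, which is the same kind of shift described for Fig. 1D.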
A specific rearrangement pattern of the synthetic yeast chromosomes
In theory, Cre/loxPsym reactions allow SCRaMbLE to generate random rearrangements between any two loxPsym sites on the synthetic chromosomes. However, the results in Fig. 1B and 1C show that rearrangement frequencies differ markedly between chromosomal loci. To estimate the local rearrangement frequencies along the synthetic chromosomes in our SCRaMbLEd pool, we mapped the number of rearrangement reads to each loxPsym site, generating a landscape of the rearrangement frequency (Fig. 2A). Rearrangement reads were identified at all 877 loxPsym sites. Clearly, loxPsym sites differed from each other in rearrangement frequency, which ranged from 1,027 to 44,353 reads per site (Fig. 2A). The loxPsym sites with the highest and lowest rearrangement frequencies (90 each) were defined as hotspots and coldspots, respectively, and selected for further analyses (Fig. 2A, Table S4). The rearrangement frequencies of these spots are statistically different from the average frequency of the loxPsym sites (Fig. S5). The rearrangement frequency landscape is highly reproducible between independent SCRaMbLE experiments performed as biological replicates (r = 0.99, p = 1.7×10⁻²³²) (Fig. 2B, Fig. S6A-F).
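As a rough illustration of how such a landscape can be summarized, the following Python sketch ranks sites by read count to pick hotspots and coldspots and computes a Pearson correlation between two replicates. The site names, read counts, and function names are hypothetical, and the study's own statistical procedure may differ; with the real data the cutoff would be n = 90 rather than the n = 2 used here.

```python
import math


def select_spots(freq, n):
    """Return (hotspots, coldspots): the n sites with the highest
    and the n sites with the lowest rearrangement read counts."""
    ranked = sorted(freq, key=freq.get, reverse=True)
    return ranked[:n], ranked[-n:]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical read counts per loxPsym site in two replicates.
rep1 = {"site1": 1200, "site2": 40000, "site3": 9000,
        "site4": 15000, "site5": 3000, "site6": 22000}
rep2 = {"site1": 1000, "site2": 42000, "site3": 9500,
        "site4": 14000, "site5": 3500, "site6": 21000}

hot, cold = select_spots(rep1, 2)
sites = sorted(rep1)
r = pearson_r([rep1[s] for s in sites], [rep2[s] for s in sites])
print(hot, cold, round(r, 3))
```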
We also compared rearrangement patterns of the same synthetic chromosome in different strains containing varied numbers of incorporated synthetic chromosomes. Similar intra-chromosomal rearrangement patterns were observed for synV in the three synV-containing strains tested (yXZX846, yYW169 and yZSJ025) (Fig. 2C, Fig. S7A-C), as well as for synX in the two synX-containing strains tested (yYW169 and yZSJ025) (Fig. S7D, E). Overall, our results show that the patterns of SCRaMbLE are specific to each synthetic yeast chromosome and that synthetic chromosomes exhibit relatively stable rearrangement hotspots and coldspots.
Rearrangement frequency correlates with chromatin accessibility
We next investigated the mechanisms underlying the rearrangement patterns described above. SCRaMbLE requires physical interaction between Cre recombinase and loxPsym sites, suggesting that chromatin accessibility is important in determining rearrangement frequency in SCRaMbLEd cells. To test this hypothesis, genome-wide chromatin accessibility of the yZSJ025 strain was measured by ATAC-seq (34). The ATAC-seq signals within a window spanning 400 bp upstream and downstream of each loxPsym site were collected and processed. The ATAC-seq signals of the rearrangement hotspots and coldspots were individually normalized by the average signal of the 877 loxPsym sites. The results show that the average ATAC-seq signals of the hotspots are significantly stronger, and those of the coldspots significantly weaker, than the average (Fig. 3A). The rearrangement frequencies and ATAC-seq signals of a typical loxPsym coldspot and hotspot located in a region of synX are displayed in Fig. 3B. The coldspot at the 3’UTR of SET4 has a weak ATAC-seq signal, while the hotspot at the 3’UTR of PRY3 has a strong ATAC-seq signal. In addition, nucleosome occupancy in the yZSJ025 strain is low at the hotspots and high at the coldspots (Fig. S8). Both the ATAC-seq and the nucleosome occupancy data are consistent with our hypothesis that chromatin accessibility is critical to rearrangement frequency.
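The windowing and normalization described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the processing used in the study: each loxPsym site is reduced to a single coordinate, the accessibility track is a plain per-base list, and the signal values, site names, and function names are hypothetical.

```python
def window_mean(signal, pos, flank=400):
    """Mean per-base signal in the window [pos - flank, pos + flank],
    clipped to the ends of the track."""
    lo = max(0, pos - flank)
    hi = min(len(signal), pos + flank + 1)
    window = signal[lo:hi]
    return sum(window) / len(window)


def normalized_site_signals(signal, site_positions, flank=400):
    """Window signal of each site divided by the average window
    signal over all sites (values above 1 are more accessible than average)."""
    raw = {s: window_mean(signal, p, flank) for s, p in site_positions.items()}
    overall = sum(raw.values()) / len(raw)
    return {s: v / overall for s, v in raw.items()}


# Hypothetical per-base accessibility track with three plateaus.
signal = [1.0] * 1000 + [5.0] * 1000 + [3.0] * 1000
site_positions = {"cold_site": 500, "hot_site": 1500, "mid_site": 2500}

norm = normalized_site_signals(signal, site_positions)
print(norm)
```

In this toy track the site in the high-signal plateau ends up above 1 and the site in the low-signal plateau below 1, mirroring the hotspot and coldspot comparison in Fig. 3A.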
We then mapped the positions of the hotspots and coldspots to the wild type genome based on neighboring sequences. We extracted and processed the ATAC-seq data for wild type yeast as we did for the synthetic chromosomes (34) (Fig. S9). The results indicate that ATAC-seq signals peak at the positions of the hotspots and remain weak at the coldspots in the six corresponding wild type chromosomes, suggesting that the sequence modifications do not perturb chromatin accessibility.
Rearrangement frequency correlates with 3D chromatin organization
Next, we explored whether, besides chromatin accessibility, the rearrangement frequency is correlated with the spatial proximity of loxPsym sites. Our SCRaMbLE system provides a unique platform to statistically evaluate the role of spatial proximity in chromosomal rearrangements. Genome-wide chromosome conformation capture (Hi-C) was thus carried out for yZSJ025, generating a contact map that displays the frequencies of spatial contacts between any two genomic loci (Fig. S10). We then extracted the contact frequencies for the synthetic chromosomes alone (Fig. 3C). Rearrangement frequency was plotted in a similar heatmap format to facilitate direct comparison (Fig. 3D). The diagonal regions represent loci that are close in primary sequence within each chromosome. As expected, these intra-chromosomal regions are the “hottest” in the Hi-C map (Fig. 3C), and they also exhibit the highest rearrangement frequency (Fig. 3D). Statistical analysis of all rearrangement reads shows that, for all synthetic chromosomes, rearrangements occur most frequently between adjacent loxPsym sites (Fig. S11).
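The tendency of rearrangements to join neighboring sites can be quantified, for example, by tallying reads against the separation between the two loxPsym sites of each junction. The Python sketch below assumes a hypothetical event format (chromosome, site index, chromosome, site index, read count) and invented numbers; it is meant only to show the kind of tally behind such a comparison, not the analysis behind Fig. S11.

```python
from collections import defaultdict


def reads_by_site_distance(events):
    """Sum read counts of intra-chromosomal events by the number of
    loxPsym sites separating the two joined sites (1 = adjacent).

    events: iterable of (chrom_a, idx_a, chrom_b, idx_b, reads).
    Inter-chromosomal events are skipped because a site distance
    is not defined for them.
    """
    totals = defaultdict(int)
    for chrom_a, idx_a, chrom_b, idx_b, reads in events:
        if chrom_a != chrom_b:
            continue
        totals[abs(idx_a - idx_b)] += reads
    return dict(totals)


# Hypothetical events: site indices count loxPsym sites along a chromosome.
toy_events = [
    ("synV", 10, "synV", 11, 500),  # adjacent sites
    ("synV", 20, "synV", 21, 300),  # adjacent sites
    ("synV", 10, "synV", 15, 40),   # five sites apart
    ("synX", 3, "synX", 60, 5),     # distant sites
    ("synII", 7, "synX", 8, 2),     # inter-chromosomal, ignored here
]

by_distance = reads_by_site_distance(toy_events)
adjacent_share = by_distance[1] / sum(by_distance.values())
print(by_distance, round(adjacent_share, 3))
```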
Notably, the inter-chromosomal contact probability in the regions of centromeres and telomeres is markedly higher than that in other regions (Fig. 3C), consistent with centromeres being clustered around the spindle pole body and telomeres being clustered at the nuclear envelope for both wild type (35) and synthetic chromosomes (36) (Fig. 3E). Compared with other regions, the pericentromeric regions exhibit higher inter-chromosomal rearrangement frequency (Fig. 1C, 3D, 3F). Similar results were found for the peritelomeric regions of the synthetic chromosomes (Fig. 1C, 3D, 3G). Taken together, our results suggest that rearrangement events are generally more likely to occur between genomic loci in spatial proximity.
As both chromatin accessibility and spatial proximity affect rearrangement frequency, we sought to distinguish and weigh these two factors for individual events. Typical intra-chromosomal rearrangement events were selected and are illustrated schematically in Fig. 4A-C, based on actual experimental data. These examples show that: (i) the difference in frequency between two rearrangement events at loci with similar chromatin accessibility was determined by spatial proximity (Fig. 4A); (ii) the difference in frequency between two rearrangement events at loci with similar spatial proximity was determined by chromatin accessibility (Fig. 4B); and (iii) two rearrangement events with similar frequencies differed in chromatin accessibility, which was compensated for by a difference in spatial proximity (Fig. 4C).
Discussion
In this study, we detected over 260,000 rearrangement events in the SCRaMbLEd pool, revealing the tremendous plasticity of the yeast genome. Previous mechanistic studies of SCRaMbLE mainly focused on the correlation of chromosomal rearrangement with nucleotide sequence (19, 21, 23, 25, 37). Here, from an epigenetic perspective, we revealed that the rearrangement frequency landscape is molded by chromatin accessibility and spatial proximity. The biochemical essence of SCRaMbLE is recombination between pairs of different loxPsym sites in the synthetic chromosomes, catalyzed by Cre recombinase (38). We speculate that local chromatin accessibility affects rearrangement by affecting the ability of Cre to access loxPsym sites, while 3D chromatin organization affects the contact probability of two loxPsym sites. Regarding the effect of spatial contact probability on rearrangement, previous studies also showed that intra-chromosomal structural variations in cancer are related to 3D genomic organization (39–41). Our SCRaMbLE system may provide a platform to enable the experimental characterization and mechanistic study of the role of genome rearrangement in disease.
The use of synthetic chromosomes and SCRaMbLE allowed us to reveal statistically that the inter-chromosomal rearrangement hotspots are strikingly clustered in the peritelomeric and pericentromeric regions. Our results are consistent with previous reports that increased inter-chromosomal reshuffling occurred in the peritelomeric regions of domesticated and wild yeast isolates during genome evolution (8, 11). The peritelomeric regions are functionally enriched for secondary metabolism and stress responses that contribute to environmental adaptation (8, 11, 42–44); high frequencies of rearrangement in these regions could thus be an important driving force of evolution. As for the pericentromeric regions, inter-chromosomal rearrangements would result in the swapping of chromosomal arms between two chromosomes. Recent genomic studies demonstrated that such rearrangements can cause reproductive isolation and promote incipient speciation (45–47). We speculate that chromatin structures play important roles in genome evolution through their effects on rearrangements in the peritelomeric and pericentromeric regions.
The flexible and controllable synthetic yeast genome provides a unique model to systematically interrogate and explore the dynamic process of eukaryotic genome evolution. Our investigation of this large number of rearrangements indicates the tremendous plasticity of the yeast genome and the importance of hierarchical chromatin organization in determining regional rates of chromosomal variation during genome evolution. These findings suggest that the variability and complexity of synthetic genome design can be further increased; meanwhile, the influence of chromatin organization needs to be considered in the design and engineering of the genomes of higher organisms.
Funding
This work was funded by the National Natural Science Foundation of China (21621004, 31861143017, 31971351), National Key R&D Program of China, Synthetic Biology Research (2019YFA0903800), New Drug Creation Manufacturing Program (No. 2019ZX09J19105) and Young Elite Scientist Sponsorship Program by CAST (YESS) (2018QNRC001).
Author contributions
Conceptualization: YJY, YW; experiments: SZ, YZ (construction of strain yZY192), ZZ; analysis: YJY, YW, SZ, ZZ, LJ, LL; writing: YJY, YW, SZ, YZ, JT.
Competing interests
The authors declare that they have no competing interests.
Data and materials availability
Genome sequencing data have been submitted to the NCBI Sequence Read Archive (SRA) under accession number PRJNA705059. The ATAC-seq and Hi-C sequencing data have been submitted to the NCBI Gene Expression Omnibus (GEO) under accession number GSE168182. The ATAC-seq data for wild type S. cerevisiae were obtained from GEO under accession number GSE66386.
Supplementary Materials
Materials and Methods
Supplementary Text
Figs. S1 to S11
Tables S1 to S6
References (48–51)
Acknowledgments
We thank Jef D. Boeke from New York University for fruitful discussions and advice. We are grateful to Jef D. Boeke and Leslie A. Mitchell from New York University, Huanming Yang, Yue Shen, Yun Wang and Tai Chen from BGI-Shenzhen, Yizhi Cai from University of Edinburgh, Srinivasan Chandrasegaran, Joel Bader, Narayana Annaluru, Héloïse Muller and Jessica S. Dymond from Johns Hopkins University for sharing the synthetic yeast strains and technical support. This work is part of the Synthetic Yeast Genome Project, http://syntheticyeast.org/sc2-0/.