Abstract
The functional consequences of structural variants (SVs) in mammalian genomes are challenging to study. This is due to several factors, including: 1) their numerical paucity relative to other forms of standing genetic variation such as single nucleotide variants (SNVs) and short insertions or deletions (indels); 2) the fact that a single SV can involve and potentially impact the function of more than one gene and/or cis regulatory element; and 3) the relative immaturity of methods to generate SVs, either randomly or in a targeted fashion, in in vitro or in vivo model systems. Towards addressing some of these challenges, we developed Genome-Shuffle-seq, a straightforward method that enables the multiplex generation of several major forms of SVs (deletions, inversions, translocations) throughout a mammalian genome. Genome-Shuffle-seq is based on the random integration of “shuffle cassettes” into the genome, wherein each shuffle cassette contains components that facilitate its site-specific recombination (SSR) with other integrated shuffle cassettes (via Cre-loxP), its mapping to a specific genomic location (via T7-mediated in vitro transcription or IVT), and its identification in single-cell RNA-seq (scRNA-seq) data (via T7-mediated in situ transcription or IST). In this proof-of-concept study, we apply Genome-Shuffle-seq to induce and map thousands of genomic SVs in mouse embryonic stem cells (mESCs) in a single experiment. Induced SVs are rapidly depleted from the cellular population over time, possibly due to some combination of Cre-mediated toxicity and negative selection on the rearrangements themselves. We demonstrate that we can efficiently genotype which SVs are present in association with each of many single-cell transcriptomes in scRNA-seq data. Finally, preliminary evidence suggests our method may be a powerful means of generating extrachromosomal circular DNAs (ecDNAs).
Looking forward, we anticipate that Genome-Shuffle-seq may be broadly useful for the systematic exploration of the functional consequences of SVs on gene expression, the chromatin landscape, and 3D nuclear architecture. We further anticipate potential uses for in vitro modeling of ecDNAs, as well as in paving the path to a minimal mammalian genome.
Competing Interest Statement
J.S. is a scientific advisory board member, consultant and/or co-founder of Prime Medicine, Cajal Neuroscience, Guardant Health, Maze Therapeutics, Camp4 Therapeutics, Phase Genomics, Adaptive Biotechnologies, Scale Biosciences, Sixth Street Capital and Pacific Biosciences. The other authors declare no competing interests.