Abstract
Major genomic deletions in independent eukaryotic lineages have led to repeated ancestral loss of biosynthesis pathways for nine of the twenty canonical amino acids1. While the evolutionary forces driving these polyphyletic deletion events are not well understood, the consequence is that extant metazoans are unable to produce nine essential amino acids (EAAs). Previous studies have highlighted that EAA biosynthesis tends to be more energetically costly2,3, raising the possibility that these pathways were lost from organisms with access to abundant EAAs in the environment4,5. It is unclear whether present-day metazoans can reaccept these pathways to resurrect biosynthetic capabilities that were lost long ago or whether evolution has rendered EAA pathways incompatible with metazoan metabolism. Here, we report progress on a large-scale synthetic genomics effort to reestablish EAA biosynthetic functionality in a mammalian cell. We designed codon-optimized biosynthesis pathways based on genes mined from Escherichia coli. These pathways were de novo synthesized in 3 kilobase chunks, assembled in yeasto and genomically integrated into a Chinese Hamster Ovary (CHO) cell line. One synthetic pathway produced valine at a sufficient level for cell viability and proliferation, and thus represents a successful example of metazoan EAA biosynthesis restoration. This prototrophic CHO line grows in valine-free medium, and metabolomics using labeled precursors verified de novo biosynthesis of valine. RNA-seq profiling of the valine prototrophic CHO line showed that the synthetic pathway minimally disrupted the cellular transcriptome. Furthermore, valine prototrophic cells exhibited transcriptional signatures associated with rescue from nutritional starvation. This work demonstrates that mammalian metabolism is amenable to restoration of ancient core pathways, thus paving a path for genome-scale efforts to synthetically restore metabolic functions to the metazoan lineage.
Whole genome sequencing across the tree of life has revealed the surprising observation that nine amino acid (AA) biosynthesis pathways are missing from the metazoan lineage1. Furthermore, these losses appear to have occurred multiple times during eukaryotic evolution, including in some microbial lineages (Fig 1A)1,4. Branching from core metabolism, the nine EAA biosynthesis pathways missing from metazoans involve over forty genes (Fig 1B, Table ED1-ED3), widely found in bacteria, fungi and plants4. While the absence of essential metabolic pathways is observed in certain bacteria6, which possess short generation times and high genomic flexibility to adapt to rapidly changing environments, the forces driving the loss of multiple EAA biosynthetic pathways in multicellular eukaryotes remain a great mystery. Their partial reacquisition through horizontal gene transfer in certain rare insect lineages with extremely simple nutrient sources, such as sap or blood, enables them to host genome-reduced intracellular bacteria that provide other metabolites missing from these limited diets, and provides the “exception that proves the rule”7. Recent efforts in genome-scale synthesis8-10 and genome-writing11 have highlighted our increasing capacity to construct synthetic genomes with novel properties, thus providing a route to not only examine these interesting evolutionary questions but also yield new capacities of bioindustrial utility12-14.
We sought to explore the possibility of generating prototrophic mammalian cells capable of complete biosynthesis of EAAs using a synthetic genomics approach (Fig 1C). The Chinese Hamster Ovary (CHO) K1 cell line was chosen as a model system due to its fast generation time, amenability to genetic manipulations, availability of a whole genome sequence, and established industrial relevance for producing biologics15. EAA biosynthesis genes from the best characterized model organisms were considered during pathway design while optimizing for the fewest number of enzymes needed for a given EAA pathway. To avoid using multiple promoters, we introduced ribosome-skipping 2A sequences16 between biosynthetic genes to allow for protein translation of separate enzymes from a single transcriptional unit. The EAA pathway and an additional EGFP reporter were placed in a vector that could be integrated as a single copy into the CHO genome at a designated landing pad using the FLP-In system17. The entire pathway was synthesized de novo by commercial gene synthesis in 3 kilobase fragments and assembled in Saccharomyces cerevisiae via homologous recombination of 80 basepair overlaps. Subsequent antibiotic selection of cells transfected with the vector resulted in a stable cell line containing the integrated EAA pathway. Finally, we performed a variety of phenotypic, metabolomic, and transcriptomic characterizations on the modified cell line to verify activity of the EAA biosynthesis pathway.
We first confirmed that the CHO cell line was auxotrophic for each of the 9 EAAs. As expected, CHO-K1 did not grow in “dropout” F-12K medium lacking each of the 9 EAAs and supplemented with dialyzed FBS (Fig S1). We noted that in this cell line, canonically non-essential amino acids tyrosine and proline also exhibited EAA-like properties in dropout media. Insufficient concentrations of phenylalanine in F-12K media or low expression of endogenous phenylalanine-4-hydroxylase that converts phenylalanine to tyrosine could underlie tyrosine limitation. Proline auxotrophy in CHO-K1 results from epigenetic silencing of the gene encoding Δ1-pyrroline-5-carboxylate synthetase (P5CS) in the proline pathway18. We therefore used proline as a test case for our synthetic genomics pipeline. We tested the P5CS-equivalent proline biosynthesis enzyme found in E. coli, encoded by two separate genes, proA and proB (Fig ED2A). A vector (pPro) carrying codon-optimized proA and proB separated by a P2A sequence was synthesized and integrated into CHO-K1 (Fig ED2B). CHO cells with the stably integrated pPro proline pathway showed robust growth in proline-free medium (Fig ED2C-D), thus validating a pipeline for designing and generating specific AA prototrophic cells.
To demonstrate restoration of EAA pathways lost from the metazoan lineage more than 650-850 million years ago19, we built a 6-gene construct (pMTIV) to test the simultaneous rescue of methionine, threonine, isoleucine and valine auxotrophies. These EAAs were chosen because their biosynthesis pathways were missing the fewest number of genes: methionine and threonine production require two genes while valine and isoleucine require four genes total (Fig ED3). To biosynthesize methionine, we chose the E. coli metC gene, which converts cystathionine to homocysteine, a missing step in CHO-K1 cells in a potential serine to methionine biosynthetic pathway. Threonine production was tested using E. coli glycine hydroxymethyltransferase ltaE, which converts glycine and acetaldehyde into threonine. For branched chain amino acids (BCAAs) valine and isoleucine, three additional biosynthetic enzymes and one regulatory subunit are needed in theory to convert pyruvate and 2-oxobutanoate into valine and isoleucine, respectively. In the case of valine, pyruvate is converted to 2-acetolactate, then to 2,3-dihydroxy-isovalerate, then to 2-oxoisovalerate and finally to valine. For isoleucine, 2-oxobutanoate is converted to 2-aceto-2-hydroxybutanoate, then to 2,3-dihydroxy-3-methylpentanoate, then to 3-methyl-2-oxopentanoate, and finally to isoleucine. The final steps in both BCAAs can be performed by native CHO catabolic enzymes Bcat1 and Bcat218. In E. coli, the first three steps in the pathway are embodied in four genes that encode an acetolactate synthase split into catalytic and regulatory subunits (ilvB/N), a ketol-acid reductoisomerase (ilvC), and a dihydroxy-acid dehydratase (ilvD) (Fig 2A)20. The final pMTIV construct comprises metC, ltaE, ilvB, ilvN, ilvC and ilvD, organized as a single open reading frame (ORF) with a 2A sequence variant in between each protein coding region (Fig 2B), driven by a strong SFFV viral promoter.
To test the biosynthetic capacity of pMTIV, we first introduced the construct into CHO cells. Flp-In integration was used to stably insert either pMTIV, or a control vector (pCtrl) into the CHO genome. Successful generation of each cell line was confirmed by PCR amplification of junction regions formed during vector integration (Fig ED4A-B). RNA-seq of cells containing the pMTIV construct confirmed transcription of the entire ORF (Fig 2C). Western blotting of pMTIV cells using antibodies against the P2A peptide yielded bands at the expected masses of P2A-tagged proteins, confirming the production of separate distinct enzymes (Fig ED4C).
In methionine-free, threonine-free, or isoleucine-free medium, cells containing the pMTIV construct did not show viability over seven days, similar to cells containing the pCtrl control vector (Fig ED5). In striking contrast, however, cells containing the integrated pMTIV showed relatively healthy cell morphology and viability in valine-free medium (Fig 2D), whereas cells containing pCtrl exhibited substantial loss of viability over six days. In complete medium, cells carrying the integrated pMTIV construct showed no growth defects compared to control cells (Fig 2E). In valine-free medium, pMTIV cells showed a 32% increase in cell number over 6 days compared to an 88% decrease in cell number in pCtrl cells (Fig 2F). When cultured in valine-free medium over multiple passages with medium changes every two days, pMTIV cell proliferation was substantially reduced by the 3rd passage. We hypothesized that frequent passaging might over-dilute the medium and prevent sufficient accumulation of biosynthesized valine necessary for continued proliferation. We thus deployed a “conditioned-medium” regimen whereby 50% of the medium was freshly prepared valine-free medium and 50% was “conditioned” valine-free medium in which pMTIV cells had previously been cultured over 2 days (see Methods). Using this regimen, we were able to culture pMTIV cells for 9 passages without addition of exogenous valine, during which time they exhibited an average doubling time of 8.5 days. However, the doubling time varied across the 49 days of experimentation, with cells exhibiting a mean doubling time of 5.3 days in the first 24 days and 21.0 days in the last 25 days. The increase in doubling time seen in later passages may be the result of detrimental effects from culturing cells longer-term in partially recycled and dialyzed FBS or may result from variation in the cell number to medium volume ratio, which trended downwards in later passaging events as cell growth slowed.
Despite the slowed growth seen in later passages, cells exhibited healthy morphology and continued to proliferate at day 49, suggesting that the cells could have been passaged even further. To verify that the putative valine rescue effect was due to the valine biosynthesis genes present in pMTIV specifically, we constructed and tested a second EAA pathway vector pIV that only contained the four genes ilvNBCD. The pIV construct similarly supported cell growth in valine-free medium, and exhibited similar growth dynamics to the pMTIV construct in complete medium (Fig ED5, Fig ED6).
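The relationship between a fold change in cell number and a doubling time can be made explicit with a short calculation. The sketch below assumes simple exponential growth, N(t) = N0 × 2^(t/Td); the text does not state how the reported doubling times were derived, so the function name and the growth model are illustrative assumptions rather than the authors' method.

```python
import math

def doubling_time(days, fold_change):
    """Doubling time (days) under an assumed exponential growth model.

    fold_change is final cell number divided by starting cell number.
    """
    if fold_change <= 1:
        raise ValueError("no net growth; doubling time is undefined")
    return days * math.log(2) / math.log(fold_change)

# A 32% increase in cell number over 6 days corresponds to a fold change of 1.32.
td_short_term = doubling_time(6, 1.32)
print(f"{td_short_term:.1f} days")
```

Under this assumption, the 32% increase over 6 days in unconditioned valine-free medium corresponds to a doubling time of roughly 15 days, which gives a sense of scale for the 8.5-day average observed under the conditioned-medium regimen.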
To confirm endogenous biosynthesis of valine, we cultured pCtrl and pMTIV cells in RPMI medium containing 13C6-glucose in the place of its 12C equivalent together with 13C3-pyruvate spiked in at 2 mM over 3 passages (Fig ED7A). High-resolution MS1 of pMTIV cell lysates revealed a peak at 123.1032 m/z, the expected m/z for 13C5-valine (Fig 3A). This detected peak was subject to MS2 alongside a 12C-valine control peak and a 13C5/15N-valine peak, which was spiked into all samples to serve as an internal standard. The resulting fragmentation patterns for each peak (Fig 3B) matched the theoretical expectations for each isotopic version of valine (Fig ED7B). An extracted ion chromatogram further revealed a peak in the pMTIV valine-free medium metabolite extraction, which corresponded to a peak in the spiked-in positive control 13C5/15N-valine, whereas no equivalent peak was seen among metabolites extracted from pCtrl cells (Fig ED7C). Taken together, these results demonstrate that pMTIV cells are biosynthesizing valine from core metabolites glucose and pyruvate, thereby representing successful metazoan biosynthesis of valine. Over the course of 3 passages in heavy valine-free medium, the non-essential amino acid alanine, which is absent from RPMI medium and synthesized from pyruvate, was found to be 86.1% 13C-labeled in pMTIV cell lysates. Assuming similar turnover rates for alanine and valine within the CHO proteome, we expected to see similar percentages of 13C-labeled valine. However, just 32.2% of valine in pMTIV cell lysates was 13C-labeled (Fig ED7D-E). For pMTIV cells cultured in heavy complete medium, just 6.4% of valine in cell lysates was 13C-labeled. Together with the observed slow proliferation of pMTIV cells in valine-free medium, our data suggest that valine complementation is sufficient but perhaps sub-optimal for cell growth.
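The expected m/z values for the valine isotopologues discussed above can be checked from standard monoisotopic masses. The sketch below assumes the singly protonated ion [M+H]+ detected in positive mode, which is consistent with the reported 123.1032 m/z peak but is not stated explicitly in the text; the function and constant names are illustrative.

```python
# Standard monoisotopic masses (unified atomic mass units).
MASS = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
SHIFT_13C = 1.0033548  # mass of 13C minus mass of 12C
SHIFT_15N = 0.9970349  # mass of 15N minus mass of 14N
PROTON = 1.00727647    # mass of a proton (assumed [M+H]+ adduct)

def valine_mz(n_13c=0, n_15n=0):
    """m/z of singly protonated valine (C5H11NO2) with n heavy atoms substituted."""
    neutral = 5 * MASS["C"] + 11 * MASS["H"] + MASS["N"] + 2 * MASS["O"]
    return neutral + n_13c * SHIFT_13C + n_15n * SHIFT_15N + PROTON

def ppm_error(observed, theoretical):
    """Mass error of an observed peak relative to theory, in parts per million."""
    return (observed - theoretical) / theoretical * 1e6

for label, mz in [("12C-valine", valine_mz()),
                  ("13C5-valine", valine_mz(n_13c=5)),
                  ("13C5/15N-valine", valine_mz(n_13c=5, n_15n=1))]:
    print(f"{label}: {mz:.4f}")

print(f"{ppm_error(123.1032, valine_mz(n_13c=5)):.1f} ppm")
```

With these values, fully 13C-labeled valine is expected near 123.1030 m/z, so the reported peak at 123.1032 m/z lies within about 2 ppm of theory, and the 13C5/15N internal standard falls roughly one mass unit higher, near 124.1001 m/z.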
We performed RNA-seq to profile the transcriptional responses of cells containing pMTIV or pCtrl in complete (harvested at 0 h) and valine-free medium (harvested at 4 and 48 h, respectively) (Fig 3C, Fig ED8A). The transcriptional impact of pathway integration was modest (Fig 3D). Only 51 transcripts were differentially expressed between pCtrl and pMTIV cells grown in complete medium, and the fold changes between conditions were small (Fig 3E, Fig ED8B). While some gene ontology (GO) functional categories were enriched (Fig ED8C), they did not suggest dramatic cellular stress. Rather, these transcriptional changes may reflect cellular response to BCAA dysregulation due to alterations in valine concentrations21, or they may result from cryptic effects of bacterial genes placed in a heterologous mammalian cellular context. In contrast, comparison of 48 h valine-starved pCtrl and pMTIV cells yielded ∼7,500 differentially expressed genes. Transcriptomes of pMTIV cells in valine-free medium more closely resembled cells grown on complete medium than did pCtrl cells in valine-free medium (Fig 3D, Fig ED9A). Differentially expressed genes between pCtrl and pMTIV cells showed enrichment for hundreds of GO categories, including clear signatures of cellular stress such as autophagy, changes to endoplasmic reticulum trafficking, and ribosome regulation (Fig ED9B). Most of the genes differentially regulated between pCtrl cells in complete medium and those same cells starved of valine for 48 hours were also differentially expressed when comparing pCtrl and pMTIV cells in valine-free medium (Fig 3E), supporting the hypothesis that most of the observed transcriptional changes represent broad but partial rescue of the cellular response to starvation.
In this work, we demonstrated the successful restoration of an EAA biosynthetic pathway in a metazoan cell. Our results indicate that contemporary metazoan biochemistry can support complete biosynthesis of valine, despite millions of years of evolution from its initial loss from the ancestral lineage. Interestingly, independent evidence for BCAA biosynthesis has also been obtained for sap-feeding whitefly bacteriocytes that host bacterial endosymbionts; metabolite sharing between these cells is predicted to lead to biosynthesis of BCAAs that are limiting in their restricted diet. The malleability of mammalian metabolism to accept heterologous core pathways opens up the possibility of animals with designer metabolisms and enhanced capacities to thrive under environmental stress and nutritional starvation22. Yet, our failure to functionalize designed methionine, threonine and isoleucine pathways highlights outstanding challenges and future directions. Other pathway components or alternative selections may be needed for different EAAs23. A general lack of predictability and a dearth of well-characterized and controllable genetic “parts” with high dynamic range continue to hamper efforts in genome-scale mammalian engineering24-26. Studies to reincorporate EAAs into the core mammalian metabolism could provide greater understanding of nutrient starvation in different physiological contexts including the tumor microenvironment27, help answer deep evolutionary questions regarding the formation of the metazoan lineage28, and lead to new model systems or even therapeutics to address metabolic syndrome, Maple Syrup Urine Disease29 and Phenylketonuria30, all of which involve amino acid biosynthetic dysfunction31,32. Emerging synthetic genomic efforts to build a prototrophic mammal may require reactivation of many more genes (Table ED1-ED3), iterations of the design, build, test (DBT) cycle, and a larger coordinated research effort to ultimately bring such a project to fruition.
METHODS
Pathway completeness analysis
For pathway completeness analysis, the EC numbers of each enzyme in each amino acid biosynthesis pathway (excluding pathways annotated as only occurring in prokaryotes) were collected from the MetaCyc database (Table ED4). Variant biosynthetic routes to the same amino acid were considered as separate pathways, generating distinct EC number lists. The resulting per-pathway EC number lists were checked against the KEGG, Entrez Gene, Entrez Nucleotide, and Uniprot databases using their respective web APIs for each listed organism. If the combination of all databases contained at least one complete EC numbers list, corresponding to an end-to-end complete biosynthetic pathway, the organism was considered “complete” for that essential amino acid.
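The completeness rule described above can be sketched as a set comparison. This is a minimal illustration, assuming the EC numbers for each pathway variant and the EC numbers retrieved per database have already been collected; the function name, data layout, and example inputs are hypothetical and do not reproduce the authors' scripts or any database API.

```python
def is_complete(pathway_variants, ec_hits_by_database):
    """Return True if any variant route is fully covered by the pooled database hits.

    pathway_variants: list of sets, each holding the EC numbers of one
        biosynthetic route to the amino acid.
    ec_hits_by_database: dict mapping a database name to the set of EC
        numbers annotated for the organism in that database.
    """
    # Pool annotations across all databases, as in the described analysis.
    pooled = set().union(*ec_hits_by_database.values())
    # The organism is "complete" if at least one route is a subset of the pool.
    return any(variant <= pooled for variant in pathway_variants)

# Illustrative input: one valine route (acetolactate synthase, ketol-acid
# reductoisomerase, dihydroxy-acid dehydratase, branched-chain aminotransferase).
valine_routes = [{"2.2.1.6", "1.1.1.86", "4.2.1.9", "2.6.1.42"}]

# Hypothetical annotations for two organisms.
bacterium = {"KEGG": {"2.2.1.6", "1.1.1.86"}, "UniProt": {"4.2.1.9", "2.6.1.42"}}
animal = {"KEGG": {"2.6.1.42"}, "UniProt": {"2.6.1.42"}}

print(is_complete(valine_routes, bacterium))  # True
print(is_complete(valine_routes, animal))     # False
```

Pooling before testing means a pathway counts as complete even when no single database annotates every step, which matches the "combination of all databases" criterion in the text.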
Cell lines and media
CHO Flp-In™ cells (ThermoFisher, R75807) were used in all experiments. For growth assays involving amino acid dropout formulations, medium was prepared from an amino acid-free Ham’s F-12 (Kaighn’s) powder base (US Biological, N8545), and custom combinations of amino acids were added back in as needed to match the standard amino acid concentrations for Ham’s F-12 (Kaighn’s) medium or as specified. Custom amino acid dropout medium was adjusted to a pH of 7.3, sterile filtered, and supplemented with 10% dialyzed Fetal Bovine Serum and Penicillin-Streptomycin (100U/mL) prior to use. For metabolomics experiments, medium was prepared from an amino acid-free and glucose-free RPMI 1640 powder base (US Biological, R9010-01), and custom combinations of amino acids and isotopically heavy glucose and sodium pyruvate were added in to match the standard amino acid concentrations for RPMI 1640 or as specified. pH was adjusted to 7.3, sterile filtered and supplemented with 10% dialyzed Fetal Bovine Serum and Penicillin-Streptomycin (100U/mL) prior to use.
Cell counting and quantification
For amino acid dropout curves, cells were seeded at 1×10^4 into 6-well plates into F12-K media with lowered amino acid concentrations relative to typical F12-K media and then allowed to grow for five days. Media was then aspirated off and replaced with PBS with Hoechst 33342 live nuclear stain for automated imaging and counting using a DAPI filter set on an Eclipse Ti2 automated inverted microscope. To count, an automated microscopy routine was used to image 5 random locations within each well at 10x magnification, and the cells present in the imaged frames were then counted using automatic cell segregation and counting software. Given differences in cell response to starvation, segregation and counting parameters were tuned for each experiment, but kept constant between starvation conditions and between cells with and without the pathway. For synthetic prototroph pathway tests, raw cell counts were performed using the Countess II Automated Cell Counter (ThermoFisher, A27977) in accordance with the manufacturer’s protocol, or using the Scepter 2.0 Handheld Automated Cell Counter (Millipore Sigma, C85360) in accordance with the manufacturer’s protocol. Where indicated, relative cell quantification was measured using PrestoBlue™ Cell Viability Reagent (ThermoFisher, A13261) in accordance with the manufacturer’s protocol.
Culturing synthetic prototrophic cells without exogenous supply of valine
For long-term culture of synthetic prototrophic cells, cells were cultured in 50% conditioned valine-free F12-K medium. Conditioned medium was generated by seeding 1×10^6 pMTIV cells into 10 mL complete F12-K medium on 10 cm plates and replacing the medium with 10 mL freshly prepared valine-free F12-K medium the next day following a PBS wash step. Cells conditioned the medium for 2 days, at which point the medium was collected, centrifuged at 300 × g for 3 mins to remove potential cell debris, and pooled in 150 mL vats to reduce batch-to-batch variation. This 100% conditioned medium was subsequently mixed in a 1:1 ratio with freshly prepared, unconditioned valine-free medium to generate so-called 50% conditioned valine-free medium, which was used throughout the long-term culturing process of synthetic prototrophic cells without exogenous supply of valine. For long-term culturing, 1×10^5 pMTIV cells were seeded in triplicate populations into complete medium in 6-well plates (day -1), which was replaced with 50% conditioned valine-free medium the next day (day 0) following a PBS wash step. Cells were counted at each passaging event and split at a 1:4, 1:2 or 3:4 proportion such that approximately 2×10^6 cells were seeded at each passaging event as best possible.
DNA assembly, recovery and amplification
Integrated constructs were synthesized de novo in 3 kb DNA segments, with each segment overlapping neighboring segments by 80 bp. Assembly was conducted in yeasto by co-transformation of segments into Saccharomyces cerevisiae. After 2 days of selection at 30°C on Sc-URA, individual colonies were picked and cultured overnight. 1.5 mL of each resulting yeast culture was resuspended in 250 μL of P1 resuspension buffer (Qiagen, 19051) containing RNase. Glass beads were added to each resuspension and the mixture was vortexed for 10 mins to mechanically shear the cells. Next, cells were subject to alkaline lysis by adding 250 μL of P2 lysis buffer (Qiagen, 19052) for 5 mins and then neutralized by addition of Qiagen N3 neutralization buffer (Qiagen, 19053). Subsequently, cell debris was spun down and plasmid DNA was collected using the Zymo Zyppy plasmid preparation kit (Zymo Research, D4036) according to the manufacturer’s instructions. Plasmid DNA was eluted in 30 μL of Zyppy Elution buffer, of which 10 μL was transformed into 100 μL of E. coli for plasmid amplification.
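The segmentation scheme described above, 3 kb segments sharing 80 bp with each neighbor, can be illustrated with a simple tiling routine. This is a sketch under stated assumptions and not the design tool used for the constructs: the function name is hypothetical, and it ignores practical constraints such as synthesis-unfriendly sequence features and the overlaps between terminal segments and the vector backbone.

```python
def split_with_overlaps(seq, segment_len=3000, overlap=80):
    """Tile seq into segments of at most segment_len bases.

    Consecutive segments share exactly `overlap` bases, which provide the
    homology arms for recombination-based assembly.
    """
    if not 0 <= overlap < segment_len:
        raise ValueError("overlap must be non-negative and shorter than a segment")
    step = segment_len - overlap
    segments = []
    start = 0
    while True:
        segments.append(seq[start:start + segment_len])
        if start + segment_len >= len(seq):
            break  # the current segment reaches the end of the sequence
        start += step
    return segments

# Toy 10 kb sequence with a repeating but non-uniform pattern (illustrative only).
toy = "".join("ACGT"[(i * i + i // 7) % 4] for i in range(10_000))
parts = split_with_overlaps(toy)
print(len(parts), [len(p) for p in parts])
```

For a 10 kb input this yields four segments, the last one shorter than 3 kb, and every junction carries the same 80 bp on both sides.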
Protein extraction and western blot
Cells were lysed in SKL Triton lysis buffer (50 mM Hepes pH7.5, 150 mM NaCl, 1 mM EDTA, 1 mM EGTA, 10% glycerol, 1% Triton X-100, 25 mM NaF, 10 μM ZnCl2) supplemented with protease inhibitor (Sigma 11873580001). NuPAGE™ LDS sample buffer (ThermoFisher, NP0007) supplemented with 1.43 M β-mercaptoethanol was added to samples prior to heating at 70°C for 10 mins. Gel electrophoresis was performed using 4-12% Bis-Tris gels (ThermoFisher, NP0326BOX) and run in NuPAGE™ MOPS running buffer (ThermoFisher, NP0001). Proteins were then transferred onto a PVDF membrane (Millipore Sigma, IPFL00010) using the Biorad Trans-Blot Turbo system in accordance with the manufacturer’s instructions. The transfer membrane was blocked in Odyssey blocking buffer (LI-COR, 927-40000) for 1 h at room temperature prior to incubation in primary antibody (Novus Biologicals, NBP2-59627 [1:1000 dilution]; Cell Signaling Technology, 2148 [1:1000 dilution]) solubilized in a 1:1 mixture of Odyssey blocking buffer and TBS-T buffer (50 mM Tris Base, 154 mM NaCl, 0.1% Tween20) overnight at 4°C. Secondary antibodies (LI-COR, 926-32210 [1:20,000 dilution]; LI-COR, 926-68071 [1:20,000 dilution]) were also solubilized in Odyssey blocking buffer / TBS-T buffer. The membrane was incubated in the secondary antibody solution for 1.5 h at room temperature.
Metabolomics
Cells were cultured in IH medium over 3 passages prior to cell harvest. Cell pellets were generated by trypsinization, followed by low speed centrifugation, and the pellet was frozen at -80°C until further processing. A metabolite extraction was carried out on each sample with an extraction ratio of 1e6 cells per mL (80% methanol containing internal standards, 500 nM), according to a previously described method33. The LC column was a Millipore™ ZIC-pHILIC (2.1 × 150 mm, 5 μm) coupled to a Dionex Ultimate 3000™ system and the column oven temperature was set to 25°C for the gradient elution. A flow rate of 100 μL/min was used with the following buffers: A) 10 mM ammonium carbonate in water, pH 9.0, and B) neat acetonitrile. The gradient profile was as follows: 80-20%B (0-30 min), 20-80%B (30-31 min), 80-80%B (31-42 min). Injection volume was set to 1 μL for all analyses (42 min total run time per injection). MS analyses were carried out by coupling the LC system to a Thermo Q Exactive HF™ mass spectrometer operating in heated electrospray ionization mode (HESI). Method duration was 30 min with a polarity switching data-dependent Top 3 method for both positive and negative modes, and targeted MS2 scans for the monoisotopic, U-13C, and U-13C/U-15N valine m/z values. Spray voltage for both positive and negative modes was 3.5 kV and capillary temperature was set to 320°C with a sheath gas rate of 35, aux gas of 10, and max spray current of 100 μA. The full MS scan for both polarities utilized 120,000 resolution with an AGC target of 3e6 and a maximum IT of 100 ms, and the scan range was from 67-1000 m/z. Tandem MS spectra for both positive and negative mode used a resolution of 15,000, AGC target of 1e5, maximum IT of 50 ms, isolation window of 0.4 m/z, isolation offset of 0.1 m/z, fixed first mass of 50 m/z, and 3-way multiplexed normalized collision energies (nCE) of 10, 35, 80. The minimum AGC target was 1e4 with an intensity threshold of 2e5.
All data were acquired in profile mode. All valine data were processed using Thermo XCalibur Qualbrowser for manual inspection and annotation of the resulting spectra and peak heights referring to authentic valine standards and labeled internal standards as described.
RNA Seq
RNA was extracted from cells using the Qiagen RNeasy Kit (Qiagen, 74104) according to the manufacturer’s protocol. QIAshredder homogenizer columns were used to disrupt the cell lysates (Qiagen, 79654). mRNA was purified using the NEBNext poly(A) mRNA Magnetic Isolation module (New England Biolabs, E7490) in accordance with the manufacturer’s protocol. Libraries were prepared using the NEBNext Ultra RNA Library Prep Kit for Illumina (New England Biolabs, E7770), and sequenced on a NextSeq 550 single-end 75 cycles high output with v2.5 chemistry. Reads were adapter and quality trimmed with fastp using default parameters and pseudoaligned to the GCF_003668045.1_CriGri-PICR Chinese hamster genome assembly using kallisto. Differential gene enrichment analysis was performed in R with DESeq2, and GO enrichment was performed and visualized with clusterProfiler against the org.Mm.eg.db database, with further visualization using the pathview, GOSemSim, and eulerr packages.
Funding
Defense Advanced Research Projects Agency HR0011-17-2-0041 (HHW, JDB)
National Institutes of Health / National Human Genome Research Institute RM1 HG009491 (JDB)
National Science Foundation MCB-1453219 (HHW)
Burroughs Wellcome Fund PATH1016691 (HHW)
Irma T. Hirschl Trust (HHW)
Dean’s Fellowship from the Graduate School of Arts and Sciences of Columbia University (RMM)
Author contributions
RMM, JT, JDB and HHW developed the initial concept. JT, RMM, AK, SP, HB, SG, LL, MJS, XG, DRJ, and AM performed experiments and analyzed the results. The overall project was supervised by HHW and JDB. The manuscript was drafted by JT, RMM, JDB, and HHW with input from all authors.
Competing interests
Jef D. Boeke is a Founder and Director of CDI Labs, Inc., a Founder of Neochromosome, Inc, a Founder and SAB member of ReOpen Diagnostics, and serves or served on the Scientific Advisory Board of the following: Sangamo, Inc., Modern Meadow, Inc., Sample6, Inc., Tessera Therapeutics, Inc. and the Wyss Institute.
Data and materials availability
Sequencing data generated for this study is deposited in the NCBI SRA at accession number PRJNA742028 (pending).
EXTENDED DATA
Acknowledgements
We would like to thank the members of the Boeke and Wang labs for comments and discussion on the work and manuscript. RMM additionally thanks Xiaoyu Weng for personal support.