ABSTRACT
Massively parallel genetic screens have been used to map sequence-to-function relationships for a variety of genetic elements. However, because these approaches only interrogate short sequences, it remains challenging to perform high-throughput (HT) assays on constructs containing combinations of multiple sequence elements arranged across multi-kb length scales. Overcoming this barrier could accelerate synthetic biology; by screening diverse gene circuit designs, “composition-to-function” mappings could be created that reveal genetic part composability rules and enable rapid identification of behavior-optimized variants. Here, we introduce CLASSIC, a genetic screening platform that combines long- and short-read next-generation sequencing (NGS) modalities to quantitatively assess pools of constructs of arbitrary length containing diverse part compositions. We show that CLASSIC can measure expression profiles of >10^5 drug-inducible gene circuit designs (from 6-9 kb) in a single experiment in human cells. Using statistical inference and machine learning (ML) approaches, we demonstrate that data obtained with CLASSIC enable accurate predictive modeling of an entire circuit design landscape, offering critical insight into underlying design principles. Our work shows that by expanding the throughput and understanding gained with each design-build-test-learn (DBTL) cycle, CLASSIC dramatically augments the pace and scale of synthetic biology and establishes an experimental basis for data-driven design of complex genetic systems.
Competing Interest Statement
The authors have declared no competing interest.
Footnotes
Updated the model used in Figure 4, panels A through E, and the conclusions derived from these data, which are used to present summary Figure 4F; updated the corresponding in-line text to reflect the new results; updated the Methods section; updated the references.