A common genetic architecture enables the lossy compression of large CRISPR libraries

Boyang Zhao; Yiyun Rao; Luke Gilbert; Justin Pritchard

doi:10.1101/2020.12.18.423506

Abstract

There are thousands of ubiquitously expressed mammalian genes, yet a genetic knockout can be lethal to one cell and harmless to another. This context specificity confounds our understanding of genetics and cell biology. Two large collections of pooled CRISPR screens offer an exciting opportunity to explore cell specificity. One explanation, synthetic lethality, occurs when a single “private” mutation creates a unique genetic dependency. However, by fitting thousands of machine learning models across millions of omic and CRISPR features, we discovered a “public” genetic architecture that is common across cell lines and explains more context specificity than synthetic lethality. This common architecture is built on CRISPR loss-of-function phenotypes that are surprisingly predictive of other loss-of-function phenotypes. Using these insights and inspired by the in silico lossy compression of images, we use machine learning to identify small “lossy compression” sets of in vitro CRISPR constructs whose reduced measurements yield genome-scale loss-of-function predictions.
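
The "lossy compression" idea can be illustrated with a toy sketch: measure only a few genes, then predict the rest from them. The sketch is not the authors' pipeline (their code is in the repository listed under Footnotes). It assumes synthetic gene-effect scores driven by a few hidden factors, a greedy correlation-based selection rule, and one-predictor linear models. All function names, gene names, and parameters are hypothetical.

```python
import random

def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return 0.0 if sxx == 0 or syy == 0 else sxy / (sxx * syy) ** 0.5

def fit_line(x, y):
    """Ordinary least squares for y ~ slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx if sxx else 0.0
    return slope, my - slope * mx

def make_toy_screens(n_lines=60, n_hidden=3, n_per_hidden=10, seed=0):
    """Synthetic data (assumption): {gene: [score per cell line]}.
    Genes named gene_<h>_<j> share hidden factor h plus independent noise."""
    rng = random.Random(seed)
    data = {}
    for h in range(n_hidden):
        factor = [rng.gauss(0, 1) for _ in range(n_lines)]
        for j in range(n_per_hidden):
            w = rng.uniform(0.5, 1.5)
            data[f"gene_{h}_{j}"] = [w * f + rng.gauss(0, 0.3) for f in factor]
    return data

def select_predictors(train, k):
    """Greedily pick k genes so that every gene has a well-correlated pick."""
    genes = list(train)
    r2 = {g: {h: pearson(train[g], train[h]) ** 2 for h in genes} for g in genes}
    chosen, best = [], {h: 0.0 for h in genes}
    for _ in range(k):
        cand = max((g for g in genes if g not in chosen),
                   key=lambda g: sum(max(best[h], r2[g][h]) for h in genes))
        chosen.append(cand)
        best = {h: max(best[h], r2[cand][h]) for h in genes}
    return chosen, r2

def fit_models(train, chosen, r2):
    """For each unmeasured gene, fit a line on its best-correlated chosen gene."""
    models = {}
    for target in train:
        if target in chosen:
            continue
        p = max(chosen, key=lambda g: r2[g][target])
        slope, intercept = fit_line(train[p], train[target])
        models[target] = (p, slope, intercept)
    return models

def run_demo(k=3, n_train=40):
    data = make_toy_screens()
    train = {g: v[:n_train] for g, v in data.items()}  # cell lines used for fitting
    test = {g: v[n_train:] for g, v in data.items()}   # held-out cell lines
    chosen, r2 = select_predictors(train, k)
    models = fit_models(train, chosen, r2)
    scores = []
    for target, (p, slope, intercept) in models.items():
        pred = [slope * x + intercept for x in test[p]]
        scores.append(pearson(pred, test[target]))
    return chosen, sum(scores) / len(scores)

chosen, mean_r = run_demo()
print("measured genes:", chosen)
print("mean held-out correlation for predicted genes:", round(mean_r, 3))
```

In this toy setting, three measured genes stand in for thirty. Real screens would need richer models and validation on independent cell lines.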

Competing Interest Statement

The authors have declared no competing interest.

Footnotes

https://github.com/pritchardlabatpsu/cnp_dev

The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.