Abstract
CRISPR-based high-throughput screens are a powerful method for assigning function to large sets of genes in an unbiased manner, but current genome-wide libraries yield a substantial number of false positives and false negatives. We use a retrieval-tree-based approach to accurately characterize the off-target space of these libraries and show that they contain a notable fraction of highly promiscuous gRNAs. Promiscuous gRNAs are depleted from screens in a gene-independent manner, add noise to the data generated by these libraries, and ultimately reduce the accuracy of hit identification. This extensive off-targeting also contributes to the low overlap between data generated by independent libraries. To minimize these problems, we developed the CRISPR Specificity Correction (CSC), a computational approach that separates on-target from off-target effects on gRNA depletion. We demonstrate that CSC reduces the occurrence of false positives, improves hit reproducibility between different libraries, and uncovers both known and novel genetic dependencies in melanoma cells.
Footnotes
This version of the manuscript has been formatted for easier reading.