PT - JOURNAL ARTICLE
AU - Yanying Yu
AU - Sandra Gawlitt
AU - Lisa Barros de Andrade e Sousa
AU - Erinc Merdivan
AU - Marie Piraud
AU - Chase Beisel
AU - Lars Barquist
TI - Improved prediction of bacterial CRISPRi guide efficiency through data integration and automated machine learning
AID - 10.1101/2022.05.27.493707
DP - 2022 Jan 01
TA - bioRxiv
PG - 2022.05.27.493707
4099 - http://biorxiv.org/content/early/2022/05/28/2022.05.27.493707.short
4100 - http://biorxiv.org/content/early/2022/05/28/2022.05.27.493707.full
AB - CRISPR interference (CRISPRi), the targeting of a catalytically dead Cas protein to block transcription, is the leading technique to silence gene expression in bacteria. Genome-scale CRISPRi essentiality screens provide one data source from which rules for guide design can be extracted. However, depletion confounds guide efficiency with effects from the targeted gene. Using automated machine learning, we show that depletion can be predicted by a combination of guide and gene features, with expression of the target gene having an outsized influence. Further, integrating data across independent CRISPRi screens improves performance. We develop a mixed-effect random forest regression model that learns from multiple datasets and isolates effects manipulable in guide design, and apply methods from explainable AI to infer interpretable design rules. Our method outperforms the state-of-the-art in predicting depletion in an independent saturating screen targeting purine biosynthesis genes in Escherichia coli. Our approach provides a blueprint for the development of predictive models for CRISPR technologies in bacteria. Competing Interest Statement: The authors have declared no competing interest.