bioRxiv
New Results

CASowary: CRISPR-Cas13 guide RNA predictor for transcript depletion

Alexander Krohannon, Mansi Srivastava, Simone Rauch, Rajneesh Srivastava, Bryan C. Dickinson, Sarath Chandra Janga
doi: https://doi.org/10.1101/2021.07.26.453663
Alexander Krohannon
1Department of BioHealth Informatics, School of Informatics and Computing, Indiana University Purdue University, 719 Indiana Ave St 319, Walker Plaza Building, Indianapolis, Indiana 46202
Mansi Srivastava
1Department of BioHealth Informatics, School of Informatics and Computing, Indiana University Purdue University, 719 Indiana Ave St 319, Walker Plaza Building, Indianapolis, Indiana 46202
Simone Rauch
2Department of Chemistry, The University of Chicago, Chicago, Illinois, USA
Rajneesh Srivastava
1Department of BioHealth Informatics, School of Informatics and Computing, Indiana University Purdue University, 719 Indiana Ave St 319, Walker Plaza Building, Indianapolis, Indiana 46202
Bryan C. Dickinson
2Department of Chemistry, The University of Chicago, Chicago, Illinois, USA
Sarath Chandra Janga
1Department of BioHealth Informatics, School of Informatics and Computing, Indiana University Purdue University, 719 Indiana Ave St 319, Walker Plaza Building, Indianapolis, Indiana 46202
3Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, 5021 Health Information and Translation Sciences (HITS), 410 West 10th Street, Indianapolis, Indiana 46202
4Department of Medical and Molecular Genetics, Indiana University School of Medicine, Medical Research and Library Building, 975 West Walnut Street, Indianapolis, Indiana, 46202
  • For correspondence: scjanga@iupui.edu

Abstract

The recent discovery of the gene editing system CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and its associated proteins (Cas) has resulted in its widespread use for improved understanding of a variety of biological systems. Cas13, a lesser-studied Cas protein, has been repurposed to allow for efficient and precise editing of RNA molecules. The Cas13 system utilizes base complementarity between a crRNA/sgRNA (CRISPR RNA or single guide RNA) and a target RNA transcript to bind preferentially to the target transcript alone. Compared with the upstream regulatory regions of protein-coding genes in the genome, the transcriptome is significantly more redundant, and many transcripts share long stretches of identical nucleotide sequence. Transcripts also exhibit complex three-dimensional structures and interact with an array of RBPs (RNA Binding Proteins), both of which further limit the scope of effective target sequences. As a result, there is currently no method to predict whether a specific sgRNA will effectively knock down a transcript. Here we present CASowary, a novel machine learning-based computational tool that predicts the efficacy of an sgRNA. We used publicly available RNA knockdown data from Cas13 characterization experiments for 555 sgRNAs targeting the transcriptome in HEK293 cells, in conjunction with transcriptome-wide protein occupancy information on RNA. Our model utilizes a Decision Tree architecture with a set of 112 sequence and target availability features to classify sgRNA efficacy into one of four classes, based on the expected level of target transcript knockdown. After accounting for noise in the training data set, the noise-normalized accuracy exceeds 70%. Additionally, highly effective sgRNA predictions have been experimentally validated using an independent RNA-targeting Cas system, CIRTS, confirming the robustness and reproducibility of our model's sgRNA predictions.
By comparing a transcriptome-wide protein occupancy map generated using POP-seq in HeLa cells against a publicly available protein-RNA interaction map in HEK293 cells, we show that CASowary can predict high-quality guides for numerous transcripts in a cell line-specific manner. Application of CASowary to whole transcriptomes should enable rapid deployment of CRISPR/Cas13 systems, facilitating the development of therapeutic interventions linked with aberrations in RNA regulatory processes.
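
To make the three constraints named in the abstract concrete (guide-target base complementarity, sequence redundancy across transcripts, and protein occupancy of the target site), the following is a minimal illustrative sketch in Python. It is not CASowary's implementation and does not reproduce its 112 features or its Decision Tree model. The function names, the toy sequences, the 8-nt site length, and the occupancy intervals are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

# Watson-Crick pairing for RNA (no wobble pairs; a simplifying assumption).
COMPLEMENT = {"A": "U", "C": "G", "G": "C", "U": "A"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def guide_for_site(transcript: str, start: int, length: int) -> str:
    """Hypothetical helper: spacer that base-pairs with transcript[start:start+length]."""
    site = transcript[start:start + length]
    if len(site) != length:
        raise ValueError("target site runs past the end of the transcript")
    return reverse_complement(site)


def off_target_transcripts(
    guide: str, transcriptome: Dict[str, str], target_id: str
) -> List[str]:
    """List other transcripts that contain a perfect match to the guide's target site."""
    site = reverse_complement(guide)
    return [
        tid for tid, seq in transcriptome.items()
        if tid != target_id and site in seq
    ]


def occupied_fraction(
    start: int, length: int, occupied: List[Tuple[int, int]]
) -> float:
    """Fraction of the site covered by protein-occupied intervals.

    Intervals are half-open (start, end) and assumed not to overlap each other.
    """
    end = start + length
    covered = sum(max(0, min(end, e) - max(start, s)) for s, e in occupied)
    return covered / length


# Toy data, invented for this example only.
transcriptome = {
    "tx1": "AUGGCUACGUUAGCCGAUCGGAUCCA",
    "tx2": "GGGCUACGUUAGCCAAA",
}
guide = guide_for_site(transcriptome["tx1"], start=3, length=8)
shared = off_target_transcripts(guide, transcriptome, target_id="tx1")
blocked = occupied_fraction(3, 8, occupied=[(0, 5), (9, 20)])
print(guide, shared, blocked)
```

In this toy case the candidate site in tx1 also occurs in tx2, and half of it lies under an occupied interval, so the guide would be a poor choice on both counts. A real predictor would combine many such sequence and availability features rather than apply them as hard filters.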

Competing Interest Statement

The authors have declared no competing interest.

Footnotes

  • https://github.com/Janga-Lab/CASowary

  • Abbreviations

    AUC: Area Under the Curve
    Cas: CRISPR associated
    cDNA: complementary DNA
    CIRTS: CRISPR-Cas-Inspired RNA Targeting System
    CRISPR: Clustered Regularly Interspaced Short Palindromic Repeats
    crRNA: CRISPR RNA
    DNA: Deoxyribonucleic Acid
    GEO: Gene Expression Omnibus
    IDT: Integrated DNA Technologies
    IGV: Integrative Genomics Viewer
    lncRNA: long non-coding RNA
    ncRNA: non-coding RNA
    NIH: National Institutes of Health
    PAM: Protospacer Adjacent Motif
    POP-seq: Protein Occupancy Profile sequencing
    RBP: RNA Binding Protein
    ROC: Receiver Operating Characteristic
    sgRNA: single guide RNA
  • Copyright 
    The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
    Posted July 27, 2021.

    Subject Area

    • Bioinformatics