Abstract
A single guide RNA (sgRNA) directs the Cas9 nuclease for gene-specific scission of double-stranded DNA. High Cas9 activity is essential for efficient gene editing to generate gene deletions and gene replacements by homologous recombination. However, cleavage efficiency is below 50% for more than half of randomly selected sgRNA sequences in human cell culture screens or model organisms. Here, we used in vitro assays to determine intrinsic molecular parameters for maximal sgRNA activity, including correct folding of sgRNAs and Cas9 structural information. From a comparison of over 10 data sets, we find that major constraints in sgRNA design originate from maintaining the secondary structure of the sgRNA, the sequence context of the seed region, GC context and detrimental motifs, but we also find considerable variation among different prediction tools when applied to different data sets. To aid selection of efficient sgRNAs, we developed the web-based PlatinumCRISPr, an sgRNA design tool that evaluates base-pairing and known sequence composition parameters for optimal design of highly efficient sgRNAs for Cas9 genome editing. We applied this tool to select sgRNAs to efficiently generate gene deletions in Drosophila Ythdc1 and Ythdf, which bind to N6-methylated adenosines (m6A) in mRNA. However, we discovered that generating small deletions with sgRNAs and Cas9 leads to ectopic reinsertion of the deleted DNA fragment elsewhere in the genome. These insertions can be removed by standard genetic recombination and chromosome exchange. These new insights into sgRNA design and the mechanisms of CRISPR/Cas9 genome editing advance the use of this technique for safer applications in humans.
Competing Interest Statement
The authors have declared no competing interest.
Footnotes
We added data analysing the cleavage efficiency of sgRNAs from over 10 different data sets evaluating sgRNA mutagenicity. We further compared PlatinumCRISPr with 9 other sgRNA selection tools and combinations thereof. Our analysis shows large variation between sgRNA selection tools applied to different data sets, with PlatinumCRISPr performing best in Drosophila, and in combination with the Wong Score for all data sets.