A CRISPR/Cas9 pattern at mutation sites in the WGS data
There is broad consensus that CRISPR/Cas9 has off-target effects, but the best methods to address this in vivo are uncertain. We appreciate the interest of Editas Inc., Intellia Inc., and others in our observation and the questions they raise about the potential for off-target mutations by CRISPR.1 Editas and Intellia have suggested that the variation found in the CRISPR-treated animals, compared to the non-treated animal, is simply the result of parental inheritance and not off-target CRISPR effects. To begin to address this question, we inspected the variants originally reported in our Correspondence, as well as newly generated variant calls from comparison of each animal’s WGS to the mouse reference genome (mm10). In addition, we performed TOPO cloning followed by Sanger sequencing to detect multiple alleles. We detected sites where numerous alleles were present, more so than would be expected by simple inheritance and with a pattern consistent with DNA breakage followed by repair (examples in Figures 1–6). These heterozygous mutations were mostly within 7–10 bp of NGG or NGA nucleotide sequences, the preferred Protospacer Adjacent Motif (PAM), or recognition site, for SpCas9.
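The proximity check described above can be illustrated with a minimal sketch. It is not the pipeline used in this Correspondence: the function name `pams_near`, the regex-based scan, and the `window` parameter are assumptions made for illustration only. It lists NGG or NGA motifs (and their reverse complements, CCN or TCN) that lie within a chosen distance of a variant position.

```python
import re

# Assumption: a plain regex scan on both strands; hypothetical helper,
# not the method used to generate the figures in this Correspondence.
PAM_FORWARD = re.compile(r"(?=([ACGT]G[GA]))")  # NGG or NGA on the + strand
PAM_REVERSE = re.compile(r"(?=([CT]C[ACGT]))")  # CCN or TCN, i.e. a PAM on the - strand

def pams_near(seq, pos, window=10):
    """Return sorted (start, strand, motif) tuples for PAM-like 3-mers
    whose nearest base is within `window` bp of `pos` (0-based)."""
    seq = seq.upper()
    hits = []
    for strand, pattern in (("+", PAM_FORWARD), ("-", PAM_REVERSE)):
        for match in pattern.finditer(seq):
            start = match.start()
            end = start + 2  # last base of the 3-mer
            if start <= pos <= end:
                distance = 0
            else:
                distance = min(abs(pos - start), abs(pos - end))
            if distance <= window:
                hits.append((start, strand, match.group(1)))
    return sorted(hits)
```

A lookahead pattern is used so that overlapping motifs (for example TGG followed immediately by GGA) are each reported.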
Sanger Sequencing Validation of a CRISPR/Cas9 pattern at mutation sites in the WGS data
Since rare WGS variants might be attributed to WGS artifact, we next employed an alternate method (Sanger sequencing) to explore the possibility that CRISPR/Cas9 induced the pattern of mutations at variation sites we described in our original Correspondence. Primer pairs were designed for each site, and PCR followed by TOPO cloning and sequencing was performed (see Methods). We found more than 2 alleles at multiple sites. One such site was chr4:66,453,495, for which we had observed multiple alleles (Figure 3). Sanger sequencing at chr4:66,453,495 (Figure 4) demonstrated a total of 8 different alleles, including both the anticipated wild-type and deletion alleles as well as 6 other alleles, including novel SNVs. Several of these novel alleles were not visualized in the WGS reads (Figure 3), but of note, allele 8 was present in the WGS read. Similarly, we observed multiple alleles by WGS around the deletion at chrX:123,734,765 (Figure 5). Sanger sequencing at chrX:123,734,765 demonstrated 7 alleles at this site (Figure 6). Alleles 2 and 7 found by Sanger sequencing (Figure 6) can be seen in WGS reads (Figure 5). Thus our combined deep WGS analysis and Sanger sequencing experiments at our originally described mutation sites demonstrated multiple alleles that are inconsistent with Mendelian inheritance.
Possible interpretations and mechanisms of the combined WGS and Sanger sequencing data
One possible alternative interpretation of these findings is that both the WGS and Sanger findings are due to artifact. However, given that we have found these multiple alleles using more than one method, we posit that the finding of multiple, novel alleles at our originally described CRISPR/Cas9 mutation sites by two different methods suggests that these are in fact CRISPR/Cas9-induced variants. One possible mechanism underlying our findings is that the displaced non-target strand is prone to hydrolysis at the 5’ end of the PAM sequence.2 Insufficient repair of this hydrolysis could explain why SNVs were detected. Such mechanisms would be much more difficult to detect, if at all, in vivo and without WGS. An alternate but related mechanism would be that the observed CRISPR-induced SNVs might be due to translesion synthesis after double strand breaks, which has been described in E. coli.3
Perhaps highly relevant are the potential roles of miRNAs present in eukaryotes in vivo, which are approximately 22 nucleotides long, the same size as the seed sequence of a gRNA. If any of these have a PAM sequence (1 in 8 chance), they could potentially act as a gRNA. Additionally, there are multiple reports of miRNAs in the nucleus, making it possible that Cas9 may be associating with gRNA-like endogenous sequences within the same subcellular compartment as DNA.4 Furthermore, precursors of siRNAs are another type of small ncRNA transcript with the long stem-loop structures that are a vital structure of the gRNA. All these ncRNA species are seen in substantially high numbers during embryogenesis.5 This could implicate them as having a role in off-targeting of Cas9 specifically in injected embryos.
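One way to arrive at the quoted 1 in 8 figure is to count the three-nucleotide sequences that match NGG or NGA. The sketch below does this by enumeration. It assumes that the four bases are equally likely and independent at each position, which is a simplification and not a property of real genomes or transcripts.

```python
from fractions import Fraction
from itertools import product

# Assumption: uniform, independent base composition (A, C, G, T each 1/4).
trimers = ["".join(bases) for bases in product("ACGT", repeat=3)]

# A trimer is PAM-like if it reads NGG or NGA.
pam_like = [t for t in trimers if t[1] == "G" and t[2] in "GA"]

pam_probability = Fraction(len(pam_like), len(trimers))
print(pam_probability)  # 1/8
```

Under the same assumption, NGG alone would match 4 of 64 trimers (1 in 16), so the 1 in 8 figure depends on counting both NGG and NGA.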
Proof of any of these particular mechanisms awaits further studies, but there is no obvious mechanism by which simple Mendelian inheritance can explain variants observed in the CRISPR-treated mice.
Comparison to previous WGS CRISPR/Cas9 in vivo papers
There are surprisingly few studies using WGS, and casual comparisons miss important differences. The Iyer et al. paper surveyed CRISPR-treated F1 hybrid mice for off-target mutations, focused on indels.6 In contrast, we surveyed CRISPR-treated F0 inbred mice for indels and SNVs. Additionally, Iyer et al. studied the off-target effects after non-homologous end-joining (NHEJ), while we studied off-targeting after homology directed repair (HDR). HDR requires a donor template, in our case a single-stranded oligonucleotide DNA molecule (ssODNA), which itself might be mutagenic, and even more so in combination with CRISPR-Cas9.7 The difference in results may reflect differences in the strains, the filtering methods needed for hybrid mice with high levels of heterozygosity, gRNAs, technique, etc. (as we addressed), but cannot be attributed solely to the use of colony controls. The two studies are similar in terms of sample size and lack of parental controls.
Claims regarding widespread heterozygosity in the inbred FVB line
There is significant heterozygosity observed in F03 and F05. Genetic drift could not plausibly account for the observed heterozygosity, given the experimental design. Following our standard practice for murine transgenesis, we ordered 3- to 8-week-old oocyte donors and 8-week-old stud males from Jackson Labs. We did not breed these mice in-house. All the stud males and oocyte donors were ordered within a few weeks of one another. In fact, this is what JAX recommends to avoid genetic drift issues as part of their Genetic Stability Program. These freshly ordered mice were used exclusively for the purpose of rd1 repair and were not kept past 6 months of age. Based on the JAX order, the parents that generated both the stud and oocyte donor were likely to be siblings of the stud, as it is common practice to use sibling matings to generate a colony of inbred mice. Thus, F03 and F05 could essentially be considered clones of one another and would be expected to be homozygous. Instead, we observed extensive heterozygosity, which was validated by Sanger sequencing (Figures 4, 6, and 7, Table). The heterozygosity in F03 and F05 is unlikely to be parentally inherited. The colony control FVB/N was also purchased from JAX and not bred in-house.
Off-target mutations that passed all 3 pipelines were called “heterozygous” if reads were equal between the mutant allele and reference (+/-10%)
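The calling rule stated above can be sketched as follows. The exact formula is not spelled out in the text, so this example assumes that "equal (+/-10%)" means an alternate-allele read fraction within 0.50 +/- 0.10. The function name `call_zygosity` and the label "other" are hypothetical.

```python
def call_zygosity(ref_reads, alt_reads, tolerance=0.10):
    """Classify a variant site from reference and mutant read counts.

    Assumption: a site is "heterozygous" when the mutant-allele fraction
    lies within 0.50 +/- `tolerance`. This is one reading of the stated
    rule, not a confirmed description of the original pipeline.
    """
    total = ref_reads + alt_reads
    if total == 0:
        return "no_call"  # no coverage, nothing to classify
    alt_fraction = alt_reads / total
    if abs(alt_fraction - 0.5) <= tolerance:
        return "heterozygous"
    return "other"  # e.g. homozygous, mosaic, or low-frequency calls
```

For example, a site with 27 reference reads and 23 mutant reads has a mutant fraction of 0.46 and would be labeled heterozygous under this reading, whereas a site with 2 reference reads and 48 mutant reads would not.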
The Editas and Intellia re-analysis of our sequencing data has limited relevance to our Correspondence. The chief issue is that Editas notes the many variants between cases or controls and the mm10 reference sequence, which is from a C57BL/6 strain. The appropriate reference sequence is the FVB/N strain sequence. Nonetheless, because we originally wanted to rule out any common germline mutations, we specifically excluded all variants present in the mm10 sequences (C57BL/6) or any of the 35 other reference strains present in dbSNP.
Heterozygosity in mice varies by strain and breeding, and use of the highly inbred FVB/N mice from JAX without in-house breeding is an experimental advantage.8–10 Inbreeding leads to a reduction in heterozygosity within the population.11 In 1988, FVB/N mice (which are blind because of the Pde6brd1 mutation) were imported from NIH to Dr. Taketo at The Jackson Laboratory. In 1991, these were re-derived at F50 into the foundation stocks facility at The Jackson Laboratory (FVB/NJ). There is no evidence for widespread SNVs between mice in this line. No heterozygosity has been described. In contrast, the Oey et al. paper cited in the Editas Inc. letter, which reported variation between littermates, is based on a line that is a C57BL/6J x C3H/HeJ cross. These mice carry the agouti viable yellow (Avy) allele (this is why their mice show agouti coat colors and not black like the C57BL/6J strain). The number of backcrosses done in their colony is not reported.12 Moreover, the Avy line is known to have a poor DNA-repair mechanism and a high spontaneous cancer rate.13,14 Hence, the colony used in Oey et al. is predisposed to SNVs and mutations. Table 2 of Oey et al. notes at least 1130 heterozygous variants shared by their two littermates, suggesting theirs is not a typical inbred line. An inbred, essentially clonal strain is not the same as a strain that was insufficiently backcrossed and crossed to a line predisposed to mutation. Moreover, in our observation, over 50% of the 2,035 total SNVs (339 unique to F03, 299 unique to F05, and 1,397 shared between the two) and over 30% of the over 160 total indels (47 unique to F03, 11 unique to F05, and 117 shared between the two) were reported at unexpected off-target sites, were read as heterozygous, and were absent in the control (see Figure 1 and Table 1). Again, heterozygous SNVs and indels should be an exceedingly rare event in this inbred line.
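The variant counts quoted in this paragraph can be tallied with a short sketch. Only the six counts come from the text. The dictionary layout and the helper name `summarize` are illustrative assumptions.

```python
# Counts quoted in the text for the two CRISPR-treated mice (F03, F05).
snvs = {"unique_F03": 339, "unique_F05": 299, "shared": 1397}
indels = {"unique_F03": 47, "unique_F05": 11, "shared": 117}

def summarize(counts):
    """Return (total, fraction shared between F03 and F05)."""
    total = sum(counts.values())
    return total, counts["shared"] / total

snv_total, snv_shared_fraction = summarize(snvs)
indel_total, indel_shared_fraction = summarize(indels)

print(snv_total, indel_total)  # 2035 175
```

The indel counts listed sum to 175, which is consistent with the phrase "over 160" used in the text.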
Furthermore, the number of observed SNVs, if due to genetic drift, is estimated to take over 3.5 years (without any backcrossing) and would still be expected to be homozygous.
Intellia claims that their re-analysis of the WGS data from our Correspondence shows hundreds of thousands of heterozygous sites in each of the three FVB mice. This number was not validated by any other method (e.g., Sanger sequencing of a commercially available FVB mouse at suspected heterozygous sites), and may represent a combination of insufficient filtering and false positive WGS reads. Evidence that Intellia’s heterozygous sites are likely false positives comes from several sources. First, Wong et al, in the only paper to report WGS sequencing in the FVB mouse, reported no heterozygous variants, and no other published report describes any widespread heterozygosity in FVB mice.8 Second, Wong et al report only 115,228 total private SNPs in the FVB/NJ strain, so the total number of heterozygous sites reported by Intellia in each mouse is higher than the total number of private FVB SNPs previously described. Third, using Sanger sequencing confirmation, Wong et al report that 2 of 127 of their SNPs were false positives, but Intellia does not confirm any of their proposed variants by a secondary method. Fourth, to avoid false positives, our original Correspondence required that variants pass 3 different filtering methods, not be known common germline variants, and never have been seen in dbSNP. Fifth, we confirmed many of our variants by Sanger sequencing (original Correspondence Figure and Figures 4, 6, and 7 in the present Correspondence). In summary, the amount of heterozygosity in the inbred FVB line (the standard line for murine transgenesis) claimed by Intellia is not reflected in the literature and has not been validated by independent experiments.
Relatedness between F03 and F05
The clonality between F03 and F05 can be discerned in our posted WGS data by the identity at all non-mutant call alleles. The WGS filtering pipeline in our Correspondence was not designed to simply determine all of the sequencing differences between the cases and controls. Nucleotides known to be commonly mutated in the germline were all rejected and did not appear in the final list of mutant genes (see Methods from the original Correspondence). If we were to assume long-standing genetic drift between the cases and the control, which are both from the original inbred line, we would expect these changes to be homozygous, and the most expedient way to eliminate variant calls that were due to this drift would be to add a filtering step that removes all homozygous calls. While this extra filtering step might lead to some false negative calls of true homozygous mutations, it would still leave over 1000 heterozygous mutations (which is more than 50% of the total mutations reported, Figures 4, 6 and 7, Table). These heterozygous mutations cannot be explained by long standing differences between inbred cases and control, as such differences would be homozygous. Therefore, genetic drift does not account for the number of mutations, most specifically the level of heterozygosity observed, leading one to at least consider the source as CRISPR therapy intervention.
Sequence read depth differences between cases and control
When we originally designed the HDR study, we fully expected to observe little to no off-targeting in the CRISPR/Cas9-treated mice. The FVB/NJ control inbred line genome was already publicly available at 50x coverage in the mouse genome project (http://www.sanger.ac.uk/science/data/mouse-genomes-project, ftp://ftp-mouse.sanger.ac.uk/REL-1303-SNPs_Indels-GRCm38/) based on a published WGS study.8 However, we chose to sequence an available colony control to rule out any mutations that might be introduced because of differences in our local sequencing protocols and apparatus. Therefore, to save resources, we sequenced the control mouse at 30x coverage and the cases at 50x. We noted in the original Correspondence that all mutation calls in the 50x sequenced cases had a read depth of at least 23x. For the 30x sequenced control, approximately 97% (2145/2210) of the wild-type sites were covered at 20x or greater. Of the remaining sites, 53/65 were sequenced at greater than 15x. The remaining 12 mutation loci (7 SNVs and 5 indels) had read depths greater than 10x. It is possible that these few lower read loci are false positives. It is also possible that many of the reads in our cases that fell slightly below the 23x cutoff and were not called are actually false negatives, and that the true mutation rate is even higher than we reported. To secondarily test some of these loci, we performed Sanger sequencing for some of the mutations in the original Correspondence and have included more in the present Correspondence (Figures 4, 6, and 7).
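The control read-depth breakdown given in this paragraph can be checked with a few lines of arithmetic. The numbers are those quoted in the text. The variable names are illustrative only.

```python
# Control (30x) read-depth breakdown at the called mutation sites, as quoted.
total_sites = 2210
at_least_20x = 2145
remaining = total_sites - at_least_20x   # sites below 20x
above_15x = 53
above_10x = remaining - above_15x        # lowest-depth loci

fraction_20x = at_least_20x / total_sites
print(f"{fraction_20x:.1%}")             # 97.1%
print(remaining, above_10x)              # 65 12
```

The 12 lowest-depth loci match the 7 SNVs and 5 indels mentioned in the text.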
Identity of some mutations in cases
Concern was expressed that despite the fact our CRISPR-Cas9-treated mice were mosaics, there was high similarity between WGS read depths in the SNVs. While this could be explained by parental inheritance, this could alternatively be explained by Cas9/ssODNA introducing mutations during early embryonic development, specifically at the 1-, 2-, or 4-cell stages when levels of Cas9 are very high. HDR may have occurred at a later stage in development resulting in a different degree of mosaicism. This could also account for the novel indels between the two animals (at regions not predicted by current algorithms), many of which are read as heterozygous (Table).
The finding of identical variants between the two CRISPR treated mice may also be explained by the filtering and/or the upper limits of the sensitivity of our study design. It is likely that CRISPR-Cas9 caused mutations at a particular off-target site at a high rate, and that many different alleles were created in this mutagenesis. However, since we sequenced at 50x and only accepted calls greater than 23x, we would not call many mutations that were lower frequency. The mutations called and reported in our original Correspondence may thus simply reflect the high frequency mutations and calls, but there may be multiple other mutations at the same genomic loci. In the present Correspondence, we validate multiple alleles at several loci (Figures 1–6), confirming the utility of deep sequencing by multiple methods. Future studies with alternate off-target calling methods (e.g., CIRCLE-Seq) or higher depth sequencing and different filtering protocols will directly answer this question.
Sample size (power) question
Restoration of sight in Pde6brd1 mice was the primary outcome of our original study that began in 2015 to test CRISPR homology directed repair (HDR) of a single point mutation.15 Off-target analysis was a secondary outcome reported in our Correspondence.1
In our study, only two of the eleven founders showed successful HDR.15 Tissue from these two and a colony control underwent WGS. Thus, as Editas points out, the sample size in our report was small—one control and two cases. This number is nearly identical to that of Iyer et al.’s Nature Methods correspondence,6 which is commonly cited to indicate Cas9 has limited off-target effects in vivo. Neither our study nor the Iyer et al. study used parental controls. Indeed, Iyer et al. (2015) states that, “To control for strain-specific variants, we also sequenced a C57BL/6J and a CBA animal from our breeding colonies.” The reason for this approach is practical: injecting CRISPR-Cas9 requires multiple zygotes, typically gathered from many females mated with many males. In our case, 56 zygotes were harvested from six pregnant females bred to six stud males and injected with CRISPR-Cas9. Exact parentage is difficult to assess, due to this technical aspect as well as the highly inbred nature of this strain. We agree that future studies where in vivo off-targeting is ascertained by WGS should be designed with parental controls.
Issues regarding gRNA guide design
Are there other reasons we may have detected off-target mutations? Editas suggests the guide RNA was suboptimal, and this may be correct. We used the online software from Benchling (San Francisco, CA) to design several gRNAs, and achieved a high on-target cleavage rate with only one guide in vitro. This one gRNA was used in vivo. Since we aimed to rescue sight by repair of a specific rd1 sequence by HDR, our rd1-specific gRNA had to target a relatively short sequence, and our sequence optimization options were limited. In contrast, a gene-disruption strategy using non-homologous end joining (NHEJ), which can target many regions across a gene, typically gives the flexibility to choose from far more gRNAs. Although a less perfect gRNA might be expected to hit more off-target sites, it would still be predicted to be restrained to homologous sites. Instead, we observed mutations at sites that showed little homology to the gRNA. This raises important questions. Are guide optimization studies performed by algorithms in silico or performed in immortalized cell lines predictive of guide function in vivo? Collection of more in vivo data using WGS will help address this question.
Consideration of other NGS studies
Our original Correspondence was limited to five references, so two additional references were not included because of methodological differences.16,17 For example, the Nakajima et al study used exome sequencing, whereas our Correspondence used whole genome sequencing.16 Mianne et al used 9x read depth and 1.5% assembly gaps for their WGS.17 It is uncommon today for WGS to be performed at such low coverage, since filtering is likely to exclude so many regions due to poor quality. This may result in many false negative calls versus our 50x coverage. As in our study, in Mianne et al the sequenced control was a wild type mouse, not a parent. Mianne et al used a “standard mutation detection tool to search for potential sequence variations.” The identity of these tools is not known to us. Mianne et al chiefly examined predicted off-target sites and sites surrounding the on-target site. Mianne et al then used another unidentified SNV detection tool to look only at coding sequences and found 42 SNVs. They went on to eliminate most of these because they had “low allele frequencies,” but again, this is with an average of 9x coverage, and none were evaluated by Sanger sequencing before being excluded. Mianne et al Sanger sequenced 7 coding region SNVs, 6 of which were predicted to be false positives, and verified 1 real SNV.
Minor labeling discrepancies in the original Correspondence
There are labeling discrepancies in supplemental Figure 3 of the Correspondence. In panel 3a, the prediction of the top-10 off-target sites by the Benchling software was originally performed when the guide was designed and used the mm9 build of the mouse genome. When the WGS analysis was performed later, the mm10 build was used, so Figure 3b-d use labels from the mm10 build. Regardless of the build, the sequences in panel 3a are the same, but for consistency, the chromosomal locations and gene names were relabeled using the mm10 build. For clarity, descriptive column titles were added, the first (Pde6b) sequence was removed in 3a, and the last five nucleotides of Pcnt in panel 3b were corrected (Revised Supplementary Figure 3).
Questions regarding mutation rate
The total number of mutations detected in our Correspondence specifically excludes common germline variants, many of which were described by references such as the Uchimura et al manuscript,18 in which C57BL/6J mice (JAX mice from Charles River) were used as wild-type control mice. Uchimura et al estimated 101.5 heterozygous SNVs per generation in one control mouse and 92.7 in the other (Uchimura et al Table 1), which is an order of magnitude less than the number of heterozygous mutations we found between F03 and F05 (Table), and this excludes known common germline mutations. Also, Uchimura et al performed WGS that included libraries amplified by PCR. Our correspondence included no PCR amplification in our WGS. It is unclear to what extent PCR amplification, which is itself known to introduce errors, could account for some of the mutation frequency observed in the Uchimura study.
Conclusion
The summary statements in our Correspondence reflect observations of a secondary outcome following successful achievement of the primary outcome using CRISPR to treat blindness in Pde6brd1 mice. As the scientific community considers the role of WGS in off-target analysis, future in vivo studies are needed where the design and primary outcome focus on CRISPR off-targeting. We agree, of course, that a range of WGS controls are needed that include parents, different gRNAs, different versions of Cas9, and comparisons of different in vivo protocols. We look forward to the publication of such studies. Combined, these results will be essential to fully understand off-targeting and can be used to create better algorithms for off-target prediction in vivo. Overall, we are optimistic that some form of CRISPR therapy will be successfully engineered to treat blindness.
Methods
TOPO Cloning and Sanger Sequencing
Mutated regions were amplified using primers (Integrated DNA Technologies), Biolase DNA polymerase (Bioline) and dNTP mix (New England Biolabs), and subsequently TOPO cloned using TOPO-TA cloning kit (ThermoFisher). Colonies containing the insert were expanded, and PCR amplification of the insert was performed using M13 primers. Crude PCR products were sent for Sanger sequencing (Functional Biosciences).
Primers:
M13 Forward: GTAAAACGACGGCCAGT
M13 Reverse: CAGGAAACAGCTATGAC
Indel chrX:123734364 Forward: CCCTTCACGTTAAACATATTGGA
Indel chrX:123734364 Reverse: TTGACTTACTTTTATATCCAGCCACTT
Indel chr4:66453492 Forward: TTTGGGATGATGGAGGAGAG
Indel chr4:66453492 Reverse: TCATTGTGCCACCAAGAAAC
Footnotes
The authors have no direct or indirect financial connections to any CRISPR company or to any related companies.