Abstract
Homology directed repair (HDR) induced by site specific DNA double strand breaks (DSB) with CRISPR/Cas9 is a precision gene editing approach that occurs at low frequency in comparison to indel forming non homologous end joining repair. In order to obtain high HDR frequency in mammalian cells, we delivered donor DNA to a DSB site by engineering a Cas9 protein fused to a high-affinity monoavidin domain to accept biotinylated DNA donors. In addition, we used the cationic polymer, polyethylenimine, to efficiently deliver our Cas9-DNA donor complex into the nucleus, thereby avoiding drawbacks such as cytotoxicity and limited in vivo translation associated with the more commonly used nucleofection technique. Combining these strategies led to an improvement in HDR rates of up to 90% on several test loci (CXCR4, EMX1, TLR). Our approach offers a cost effective, simple and broadly applicable editing method, thereby expanding the CRISPR/Cas9 genome editing toolbox.
Introduction
CRISPR/Cas9 is a gene editing approach directed by short guide RNA (sgRNA), which induces double strand breaks (DSBs) at specific genomic locations. DSBs are usually repaired in the cell by non-homologous end joining (NHEJ) or the rarer homology directed repair (HDR). NHEJ often generates deletions and insertions (indels) at the target site whereas HDR utilizes a donor DNA (dDNA) template to repair the lesion (Mali et al, 2013; Cong et al, 2013), resulting in single-point mutations or larger insertions (Schwank et al, 2013). However, the isolation of low frequency HDR-edited cell clones is time consuming and difficult. Current approaches often select Cas9 expressing cells via fluorescent markers and/or antibiotic resistance (Mali et al, 2013; Cong et al, 2013; Schwank et al, 2013; Van Trung Chu et al, 2015), to subsequently isolate only HDR-edited clones from the population of selected cells. While selection raises the percentage of successfully modified clones, it reduces the overall number of cells, and requires insertion of a relevant selection marker, usually via plasmid. Without chemical or biochemical interventions, HDR rates of below 1% are considered normal (Miyaoka et al, 2016).
Pharmaceutical interventions that block NHEJ can also improve HDR. Chu et al. inhibited the action of DNA ligase IV or Ku70, which mediate NHEJ, using a variety of methods, including pharmacological (SCR7), protein-based (expression of viral Ad4) and/or shRNA-based inhibition. With these methods, decreases in NHEJ (and corresponding increases in HDR) of 4-fold, 8-fold and 5-fold were achieved, respectively (Van Trung Chu et al, 2015). Other studies have shown that cell cycle phase can influence DNA repair, where NHEJ occurs more often in the G1 phase and HDR only increases in late S/G2. For example, nocodazole arrested cells in S/G2 phase can exhibit 38% HDR with Cas9 ribonucleoprotein (RNP) (Lin et al, 2014). However, the applicability of this method is cell line dependent, since in the MDA-MB-4687 breast cancer cell line model, nocodazole, vincristine or colchicine cause a G1 arrest (Holt et al, 1997).
HDR can be improved by optimisation of the dDNA homology arm (HA) length and structure. Zhang et al. demonstrated 26% HDR with double cut plasmid donors with long (600 bp) HA length (Zhang et al, 2017). Designing dDNA reflecting aspects of the Cas9 DSB mechanism also impacts HDR frequency. The Cas9 complex releases the distal strands (single strand or double) almost immediately post cut, while the proximal strand is released later (Shibata et al, 2017; Richardson et al, 2016). Richardson et al. (Richardson et al, 2016) benefitted from dDNA that was antisense to the released genomic DNA as well as from asymmetric HA for capturing resected genomic regions. These approaches improved HDR by 2.6-fold compared to symmetric single strand dDNA and 4-fold over double stranded dDNA; thus the overall rate of HDR (57%) surpassed chemical and genetic interventions (Richardson et al, 2016) (further summarized in Supplemental Table 1).
In summary, HDR efficiency ranges from fractions of 1% (Van Trung Chu et al, 2015; Lin et al, 2014; Miyaoka et al, 2016; Gutschner et al, 2016; Davis & Maizels, 2016) to ~60% (Richardson et al, 2016; Lee et al, 2017b), influenced by cell cycle (Gutschner et al, 2016), drug treatment (Van Trung Chu et al, 2015; Lin et al, 2014; Li et al, 2017), upregulation of DNA repair proteins (Rad51/52) (Shao et al, 2017), siRNA knockdown of NHEJ factors (BRCA1) (Davis & Maizels, 2016), donor design (Zhang et al, 2017; Richardson et al, 2016), timed protein degradation, loci, cell type and transfection method. However, a consistent and common approach has not yet been defined. Many pharmacological interventions are possible in cell culture; however, these interventions may not be appropriate in vivo (Lee et al, 2017a; Yanik et al, 2017).
Transfection methods also influence the percentage of modified cells, where the current state-of-the-art method is nucleofection. Here, the Cas9 RNP complex is delivered directly into the cell nucleus, which results in lower off-target DSBs (Kim et al, 2014; Kouranova et al, 2016) and reduced nuclease lag time (Lin et al, 2014), thereby improving both NHEJ and HDR. The advantage of using a pre-formed Cas9 RNP complex versus plasmid delivery is that it avoids the lag time associated with mRNA and sgRNA transcription, Cas9 mRNA translation, and formation of intracellular RNP, balanced against the rates of sgRNA and protein degradation. Questions remain as to whether nucleofection is applicable in clinical settings other than ex-vivo editing and transplantation (Lin et al, 2017).
Cationic polymer RNP delivery of negatively charged conjugates is a potentially flexible delivery platform that has thus far drawn limited attention in CRISPR/Cas9 applications. To date, there have been only two novel attempts using this approach: sgRNA:donor chimeras and CRISPRGOLD applying PAsp(DET) polyplexes (Lee et al, 2017b; 2017a). Taking into account cost, biodegradation (Wen et al, 2009), and proven nucleic acid transfection (Boussif et al, 1995), we decided to use Cas9 RNP delivery by polyethylenimine (PEI). PEI enters the cell by size dependent entry mechanisms (clathrin-mediated endocytosis, caveolae-mediated endocytosis or macropinocytosis) (Khalil et al, 2006), escaping endosomes via a proton sponge effect and translocating to the nucleus (Oh et al, 2002). PEI has been observed to be favourable with regard to nuclear localization over poly-L-lysine polyplexes (Pollard et al, 1998). An additional benefit of PEI is microtubular active transport, which increases Cas9 concentration at the peri-nucleus (Drake & Pack, 2008; Suh et al, 2003). Editing by Cas9 could be improved by an increase in intranuclear or perinuclear RNP, benefitting from both PEI and importin mediated (protein nuclear localization sequences [NLS]) nuclear trafficking concurrently (Mali et al, 2013; Cong et al, 2013).
In this paper we describe a proof-of-concept HDR system that may provide a simple and broadly applicable gene editing solution. The keys to our system of improved HDR efficiency are combining Cas9 engineered donor colocalization (Cas9 mono-avidin:biotin dDNA) and application of dual nuclear localization modes (PEI and NLS), such that the probability of colocalizing active RNP with dDNA at the time of DSB is increased.
Results
Protein Engineering and Characterisation of PEI:Cas9MAV Polyplexes
We speculate that the current low HDR frequency by CRISPR/Cas9 is primarily due to the inefficient delivery of Cas9 to the nucleus and/or inefficient delivery of dDNA to the DSB. To enhance Cas9 delivery to the nucleus, we used PEI. In order to optimize delivery of dDNA to the DSB, we forced the conjugation of the dDNA to Cas9 through the high affinity binding system. The avidin-biotin system is facile to implement compared to multi-step chemical conjugation of sgRNA and dDNA to create chimeras, as the biotin modification can be introduced by a PCR primer extension. Previous studies (Lee et al, 2017b) reported that colocalizing the dDNA using a chimeric crRNA improved HDR, but not significantly compared to cycle arrest strategies (10-35%) (Lee et al, 2017b; Jinek et al, 2012). In addition, the crRNA:dDNA chimera was exceptionally short in length (crRNA length 42, dDNA length 87 including the HAs), which could hinder the ability of the dDNA to interact with the DSB site. We hypothesized that HDR can be further improved by increasing the linker distance between the conjugated dDNA and the protein, thereby providing additional flexibility for the dDNA to capture the released genomic DNA. Thus, we engineered a Cas9-monoavidin (Cas9MAV) construct, whereby Cas9 already containing dual NLS sequences, was further modified by MAV at its C-terminus with a 17-amino acid linker between Cas9 and MAV (Fig.1A). The detailed procedure for the formation of the modified Cas9 is described in the methods section. Combined with biotinylated dDNA, Cas9MAV should increase the local concentration of the dDNA with improved flexibility. The rationale for choosing MAV as opposed to streptavidin, which exists as a tetramer and could deliver up to four dDNA components, was to maintain tight control on number of dDNAs being delivered (i.e. exactly one dDNA per Cas9) to the DSB.
The size and molecular weight (MW) of Cas9MAV were determined by size-exclusion chromatography-multi-angle light scattering (SEC-MALS) (calculated MW: 175.474 kDa, measured MW: 174.5 kDa (Fig.1A)). The hydrodynamic radius was estimated to be 9.5 nm, which is in accordance with the presence of the flexible NLS-linker-MAV tether in Cas9MAV and with previously published Cas9 hydrodynamic radii (Carlson-Stevermer et al, 2017; Jinek et al, 2014; Nishimasu et al, 2014).
Before attempting colocalization intracellularly, formation of a Cas9MAV complex bound to biotinylated dDNA must be first demonstrated in vitro. Analysis by dynamic light scattering (DLS) (Fig.1) established that near complete conjugation of donor to protein was achieved, ensuring that the predominant protein species is conjugated and ready to provide dDNA for HDR repair. DLS was also used to validate polyplex formation. DLS measures light scattering generated by particles representing size, distribution and polydispersity. Sample mean intensity distribution can be transformed to a mean volume distribution (sample population spatial occupancy), and a mean number distribution (number of particles per population per volume). DLS determined that Cas9MAV:sgRNA is monodisperse with a 9 nm radius correlating well with SEC-MALS results of Cas9MAV alone (above, Fig. 1C and 1D). The donor DNA tested here was the 800 bp long CF3b (5’ biotinylated) and has an approximate radius of 30 nm. Preformed Cas9MAV:sgRNA RNP was mixed with CF3b in a 1:1 molar ratio and transferred back to the cuvette for DLS measurements. CF3b was able to complex completely with Cas9MAV:sgRNA as no residual free CF3b dDNA could be detected upon RNP addition (Fig 1C and 1D). It was observed that the radius of the final Cas9MAV:sgRNA:CF3b RNP complex was smaller with a hydrodynamic radius reduced to 6 nm, although a small population was observed at ~ 16 nm. We speculate that the reduction in the RNP radius of gyration is due to the conjugation of the negative phosphodiester backbone to the MAV portion of Cas9MAV, which presumably restricts the free rotation of the dDNA. In all cases the 1:1 molar ratio of the interactions of the sequential RNP formulation is confirmed by the absence of the free macromolecule populations (i.e. CF3b and Cas9MAV:sgRNA populations disappear when sgRNA:Cas9MAV:CF3b forms).
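The 1:1 molar mixing of RNP and donor described above can be illustrated with a short calculation. This is a minimal sketch rather than the protocol used here: it assumes the common rule-of-thumb value of about 650 g/mol per base pair of double stranded DNA, ignores the mass of the sgRNA and the biotin tag, and uses a hypothetical function name and a hypothetical protein amount.

```python
# Rule-of-thumb average molar mass of one base pair of double stranded DNA (g/mol).
# Assumed value for illustration only.
DSDNA_G_PER_MOL_PER_BP = 650.0

# Calculated molecular weight of Cas9MAV reported in the text (175.474 kDa), in g/mol.
CAS9MAV_G_PER_MOL = 175_474.0


def equimolar_donor_mass_ng(protein_ng: float, donor_bp: int) -> float:
    """Return the donor mass (ng) that matches the protein amount mole for mole."""
    if protein_ng < 0 or donor_bp <= 0:
        raise ValueError("protein_ng must be >= 0 and donor_bp must be > 0")
    protein_nmol = protein_ng / CAS9MAV_G_PER_MOL  # ng / (g/mol) = nmol
    donor_g_per_mol = donor_bp * DSDNA_G_PER_MOL_PER_BP
    return protein_nmol * donor_g_per_mol  # nmol * (g/mol) = ng


# Hypothetical example: 1000 ng of Cas9MAV mixed with the 800 bp CF3b donor.
donor_ng = equimolar_donor_mass_ng(1000.0, 800)
print(f"{donor_ng:.1f} ng of donor per 1000 ng of Cas9MAV")
```

Under these assumptions an 800 bp donor is roughly three times heavier than Cas9MAV, so an equimolar mixture contains about 3 ng of donor for every 1 ng of protein.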
DLS confirmed the qualitative assessment by agarose gel shift of Cas9MAV:biotin donor complexation. Fig.1B shows the donors with and without Cas9MAV (purple and green boxes, respectively). When Cas9MAV and the biotin donors were loaded on an agarose gel, no migration was detected due to the increase in molecular weight of the complex.
Next, DLS confirmed encapsulation of the RNP complex by PEI (Fig.1E and 1F). PEI was added directly to the sample described in Figure 1C and 1D and incubated at room temperature for 10 min before DLS measurements. By intensity and volumetric measurements, we observed two species of 134 nm (+/−10) and 530 nm (+/−10) radii. The Cas9MAV:sgRNA:CF3b population was no longer detectable, indicating that it formed a polyplex with PEI. Considering the number of species in each population, PEI polyplexes of 134 nm radius were the most prevalent. Thus, based on this radius and prior knowledge of PEI, it is most likely that caveolae-mediated and macropinocytosis mechanisms (Khalil et al, 2006) are employed by the Cas9MAV:sgRNA:dDNA:PEI polyplex to cross the cell membrane in transfection experiments, since these mechanisms are more compatible with objects of 130 nm.
High transfection efficiency with PEI:Cas9MAV, resulting in high indel frequency in human and murine cell lines
High transfection efficiency must be obtained for the proposed Cas9:MAV system to function. To track transfection of the PEI:Cas9MAV complex, we used fluorescent PEI. Fig.1G, left panel shows fluorescent light microscopy of non-transfected and transfected cells with green fluorescent PEI:Cas9MAV complex with no dDNA (see supplemental methods). This confirms that transfection has occurred qualitatively (positive in the green fluorescence imaging channel). Quantitative measurements were performed using flow cytometry and 95% of cells were shown to have been transfected by the (carboxyfluorescein) FAM-labelled PEI:Cas9MAV complex (Fig.1G, right). With the demonstration of successful cellular uptake and DLS experiments verifying intact RNP formation, transfection and nuclease activity can now be investigated in multiple cell lines.
Six cell lines (Human: HEK293T, MCF-7, HeLa, U2OS and Mouse: 4T1, 3T3) were chosen to reflect both easy- (HEK293T, HeLa) and difficult-to-transfect (4T1, MCF-7) cells, two species, and a variation in tissue of origin/disease (kidney, breast cancer, cervical cancer, osteosarcoma). To examine multiple loci, RNPs were formed by complexing Cas9MAV with sgRNA for FLCN, ACC1 and EMX1 for human cell lines (Fig.1H) and FLCN & ACC1 for mouse cell lines (Fig.1I). NHEJ indels were introduced in all cell lines by transfection with PEI:Cas9MAV with no dDNA, and the frequency was determined by a T7 endonuclease assay. Indel frequencies (50-70%) showed broad equivalence across the human cell lines (Fig. 1H). In mouse 4T1 and 3T3 cell lines, two loci were tested (ACC1 and FLCN) and indel frequencies of 50-60% were observed (Fig. 1I). These results demonstrate that PEI:Cas9MAV could be successfully introduced into cell lines other than HEK293T.
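For readers unfamiliar with how a T7 endonuclease gel is converted into an indel frequency, the following sketch shows one widely used estimate, indel % = 100 × (1 − sqrt(1 − fraction cleaved)). The text does not state which formula was applied to the gels here, so this is an assumption; the function name and band intensities are hypothetical.

```python
import math
from typing import Sequence


def t7_indel_percent(uncut: float, cut_bands: Sequence[float]) -> float:
    """Estimate indel frequency (%) from T7 endonuclease gel band intensities.

    uncut     -- integrated intensity of the undigested PCR product band
    cut_bands -- integrated intensities of the cleavage product bands
    """
    cleaved = sum(cut_bands)
    total = uncut + cleaved
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    fraction_cleaved = cleaved / total
    # The square root corrects for random re-annealing of mutant and wild-type
    # strands: only heteroduplexes are cut by the enzyme.
    return 100.0 * (1.0 - math.sqrt(1.0 - fraction_cleaved))


# Hypothetical densitometry values (arbitrary units).
print(t7_indel_percent(uncut=25.0, cut_bands=[45.0, 30.0]))
```

With these illustrative values, 75% of the product is cleaved, which corresponds to an estimated indel frequency of 50%.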
High Rates of HDR achieved by combining PEI and Cas9MAV
To investigate the impact of PEI nuclear co-delivery and the impact of Cas9MAV upon donor design, we outlined a simple comparison experiment at the CXCR4 loci, focusing on four double stranded dDNA designs (Fig. 2A), with biotinylated and non-biotinylated versions. Each donor was designed with a HindIII restriction digest sequence within 5 bp of the DSB. The intention was to explore the design rules for double stranded donors using equal HAs of 600 bp (CF1), 400 bp (CF3) and asymmetric short donors (Fas1 and Fas2), the latter being sub-optimal designs as their HAs are significantly truncated. In our initial experiments (Fig. 2B) we assessed the indel frequency of non-biotinylated donors, finding that with short asymmetric dDNA, indel formation persisted. This was observed to occur for both Cas9MAV and a version of Cas9 lacking MAV, but still possessing dual NLSs (Cas9NLS), as well as for cells treated with the drug, nocodazole (200 µg/ml). Overall Fas1 and Fas2 averaged 57% and 54% indels, respectively, whereas CF1 and CF3 averaged only 5% and 24%. CF1 dDNA saw no observable indels with Cas9MAV and nocodazole treated variants.
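The design rule that each donor carries a HindIII site within 5 bp of the DSB can be checked programmatically. The sketch below is illustrative only: the toy donor sequence, the cut position and the function name are hypothetical and are not the CF1, CF3, Fas1 or Fas2 sequences; only the HindIII recognition sequence (AAGCTT) is a known fact.

```python
from typing import Optional

HINDIII_SITE = "AAGCTT"  # HindIII recognition sequence


def site_distance_from_cut(donor: str, cut_index: int,
                           site: str = HINDIII_SITE) -> Optional[int]:
    """Return the smallest gap (bp) between the cut position and any copy of the site.

    cut_index marks the break between donor[cut_index - 1] and donor[cut_index].
    Returns None when the site is absent from the donor.
    """
    donor = donor.upper()
    best = None
    start = donor.find(site)
    while start != -1:
        end = start + len(site)
        if cut_index < start:
            gap = start - cut_index  # site lies to the right of the cut
        elif cut_index > end:
            gap = cut_index - end    # site lies to the left of the cut
        else:
            gap = 0                  # cut falls inside or at the edge of the site
        best = gap if best is None else min(best, gap)
        start = donor.find(site, start + 1)
    return best


# Hypothetical toy donor: 10 bp left arm, 2 bp spacer, HindIII site, 10 bp right arm.
toy_donor = "ACGTACGTAC" + "GG" + HINDIII_SITE + "TTGCATGCAT"
gap = site_distance_from_cut(toy_donor, cut_index=10)
print(gap, gap is not None and gap <= 5)
```

In this toy case the site starts 2 bp from the assumed cut position, so the donor would satisfy the 5 bp rule.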
Nocodazole is intended to aid HDR by cell cycle arrest. For non-biotinylated donors (CF1, CF3, Fas1 and Fas2) using RNPs formed with Cas9NLS or Cas9MAV (Fig. 2C), we observed no significant difference between drug treated and non-drug treated cells. Shorter non-biotinylated donors generally result in lower HDR percentages, as they are less likely to form stable, hybridised species with genomic DNA than longer donors; however, these donors still benefit from PEI nuclear delivery, as shown in Fig. 2B.
An improved HDR percentage was observed for all donor constructs that were biotinylated, including the shorter Fas1b and Ras2 (biotinylated version of Fas2) (Fig.2D). Having excluded the need for drug treatments, we conclude that the HDR improvement we observe is a consequence of Cas9MAV and PEI transfection. Figure 2D shows an improvement in HDR percentage by virtue of colocalization when using shorter donor constructs. Figure 2E summarizes the overall impact of localization by Cas9MAV and shows that at the CXCR4 loci, Cas9MAV utilisation contributes a 20% improvement in HDR rates for all donors in comparison with Cas9NLS.
We wished to validate the performance of the biotinylated dDNA and Cas9MAV by investigating whether the genomic edits persisted at the mRNA level. We used Fas1b and its associated HindIII site to edit cells, which were split for DNA and RNA extraction at the point of cell harvesting. DNA extraction and PCR were performed and the insert was validated again by restriction digest (Fig. 2F). Genomic DNA was degraded by DNase I treatment before reverse transcription (see supplemental methods). Two primer sets for two separate extractions demonstrated the presence of the edit at the mRNA level (Fig. 2G, H). Using cDNA derived PCR products, we determined that restriction cleavage was present and that the edited DNA sequence had been transcribed to mRNA, demonstrating the persistence of the edit. Furthermore, the edits persist after significant dilution and re-growth of cells post passaging (Supplemental Figure S2). Samples edited with CF1, CF3 and Fas1/Ras2b were prepared for Sanger sequencing and all samples returned the correct sequence (Fig. 2I).
Traffic Light Reporter Assay
Given the high HDR frequencies observed on the CXCR4 loci, we wanted to demonstrate the versatility of our PEI:Cas9MAV system by applying it to other loci. It was important not to choose loci and commensurate sgRNAs of equivalent performance as this would not give a clear evaluation of the methodologic generality of the system, nor provide insight on the influence of variables such as loci accessibility and sgRNA performance. We thus chose EMX1 and a Traffic Light Reporter (TLR) system (Certo et al, 2011), to obtain a more realistic appraisal of performance.
The CXCR4 locus is usually associated with high indel frequency. To check whether our method was applicable at loci with low indel frequency, the TLR assay appeared an appropriate choice, since the indel frequency observed in the TLR assay is historically low (12%) (Robert et al, 2015). Evaluation was performed using a Cas9NLS plasmid control as an indel forming positive control (Supplemental Figure S1). We observed a decline in NHEJ (0%) (Fig. S1) at the 48 hr time point (5’b and 3’b dDNA variants (Supplemental Table S2)) and HDR frequencies of 1-3%. At the 96 hr time point complete HDR of DSBs will have occurred, and there has been sufficient time for the Cas9 plasmid NHEJ control to reach an analytical signal (RFP expression) for flow cytometry. A frequency of 17.1% for NHEJ was observed at 96 hrs for the Cas9 plasmid control, reflecting previous NHEJ frequencies with this assay (Robert et al, 2015), and a fluorescent microscope image was included for reference in Figure 3 as a qualitative assessment of expression. We observed that it was possible to detect, with both fluorescence microscopy and flow cytometry, the mixed NHEJ/HDR population in cells edited with the PEI:Cas9MAV system (Fig. 3A, 3B, 3E and supplemental Figure S1). NHEJ frequency for PEI:Cas9MAV at 48 hrs ranges from 0.01-0.14%, but increases to 0.23-0.61% at 96 hrs, representing a small heterogeneous population in edited cells. HDR frequency for 5’b, 3’b and dual biotin modified 5’3’ donors with the PEI:Cas9MAV system by 96 hrs gave a range of 20-32% HDR with an average HDR percentage of 25% for all donors. NHEJ was a minor contributor to edits, with less than 2% for all biotinylated donors. Comparisons of the 5’b donor (96 hrs) and control (96 hrs) are given as representative examples in Figure 3. Full plots are presented in supplemental Figure S1A,B and a tabular summary of results is presented in supplementary Fig. S1C.
EMX1
The HDR frequencies achieved by our system in both CXCR4 and TLR experiments represented an improvement on previous pharmacological interventions (Van Trung Chu et al, 2015; Certo et al, 2011). In keeping with the trends observed for CXCR4 gene loci (Fig. 2), we surmise that EMX1 indel frequencies (Fig. 3I) are a metric for sgRNA quality and genomic loci accessibility. Based on the indel frequencies we observed, we predict a lower HDR percentage than observed for CXCR4, but greater than TLR. We considered FLCN as an alternative, but chose the loci:sgRNA combination with lower indel frequency (Fig. 3I). An additional stringency was introduced to the dDNA by truncating the equal HAs to 200 bp, on either side of a 6-repeat stop codon sequence and BamHI site. The donor is compromised in comparison to the ideal HAs of 400 and 600 bp used for CF3/CF1. BamHI was used to rapidly assess the percentage of HDR occurring (Fig. 3K). The experiment involved two donors (Fig. 3K), where biotinylation at the 5’ end of the forward primer is denoted by ‘b’.
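Because a restriction site is introduced only by successful templated repair, the HDR percentage from a BamHI (or HindIII) digest is typically estimated as the fraction of PCR product that is cut. The sketch below assumes this simple band-ratio estimate; the function name and the band intensities are hypothetical and do not correspond to the gels in Figure 3.

```python
from typing import Sequence


def digest_hdr_percent(uncut: float, digested_bands: Sequence[float]) -> float:
    """Estimate HDR frequency (%) as the digested fraction of the PCR product.

    uncut          -- intensity of the band that resisted restriction digest
    digested_bands -- intensities of the restriction fragments
    """
    digested = sum(digested_bands)
    total = uncut + digested
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    return 100.0 * digested / total


# Hypothetical densitometry values (arbitrary units).
print(digest_hdr_percent(uncut=10.0, digested_bands=[55.0, 35.0]))
```

Unlike the T7 endonuclease estimate, no square-root correction is applied, since every amplicon carrying the inserted site is expected to be cut. Gel densitometry remains semi-quantitative, so values obtained this way are best treated as approximate.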
The non-biotinylated donor generated an HDR percentage of 60% with high (<70%) indel formation, suggesting mono-allelic editing; partial incomplete HDR leading to indel formation and reflecting the semi-quantitative inaccuracies of gel-based assessments. This is similar to the observation of the short double stranded donors used in CXCR4 investigations (Fas1/Fas2 137 bp) and in the TLR study (111 bp total) generating lower HDR occurrence. Biotinylated EMX1 donor was used with the PEI:Cas9MAV system, resulting in indels dropping to less than 10% and HDR occurrence increased to 90% (Fig. 3 L&M).
We also wished to consider the impact of donor DNA being inserted randomly where the Cas9 DSBs were generated. A simple experiment was performed where the sgRNA for EMX1 was mismatched with a biotinylated CF3 CXCR4 donor. EMX1 retained its capacity for indel generation (Fig. 3I) and the absence of insert was validated as HindIII digestion did not result in cleavage (Fig. 3J). While not a complete proof of non-insertion where Cas9MAV cleaves DNA, it offers some indication that non-matched donors do not result in HDR.
PEI Transfection Method and Cas9MAV impacts to HDR: A comparison with Lipofectamine 3000
Central to our system are PEI transfection and MAV modification offering an alternative nuclear trafficking approach and dDNA-Cas9 co-localization at the site of the DSB, respectively. Nucleofection was shown to result in equal indel formation at CXCR4 loci (Fig. S3) and an improvement in indel formation at EMX1. Comparing Cas9NLS and Cas9MAV with nucleofection suggests that nucleofection renders the NLS sequence redundant. Early experiments with the lipofection technique suggested that nuclear localization is a minor aspect of lipofection with cytosolic delivery being most common for DNA payloads (Akita et al, 2004). We analyzed transfection by Lipofectamine 3000 as a substitute method for PEI. Lipofection shares many characteristics with PEI, such as payload (DNA/protein) protection from metabolic degradation, high transfection rates and low cytotoxicity (Fig. S6). Liposomes escape lysosomal degradation by diffusing freely within cells, rather than engaging in microtubule transport (Cardarelli et al, 2016). However, lipofection lacks nuclear localization and microtubule transport, and thus provides an ideal comparison transfection method to PEI along with other derivatives such as CRISPRmax that achieve high transfection and HDR editing (~20%) (Yu et al, 2016).
Figure 4A and 4B detail the lipofectamine experiments conducted at the CXCR4 locus using the same donors and editing system as in Figure 2. Each plate was transfected with RNP:LP3K, and biotin or non-biotinylated donors, respectively. DNA extraction from Fas2 donor treated cells failed, so comparison of this donor and its biotinylated variant, Ras2, was excluded. Figure 4A compares indel frequencies between biotin and non-biotinylated donors. It is observed as before (Figure 2 and 3 for CXCR4 and EMX1, respectively) that indels decline where biotinylated donors are used, with CF1 being an aberration in this instance with persistent indels. Decreased indel formation is indicative of increased HDR. This was confirmed by Fig. 4A and 4B, where biotinylated donors outperform non-biotinylated donors (CF1:CF1b = 85% to 89%; Fas1:Fas1b = 55% to 90%; CF3:CF3b = 46% to 92%), but do not achieve the same HDR percentage as those cells transfected by PEI (~99% Fig. 2). The difference between the biotin/non-biotinylated donors suggests that donor DNA colocalization to the DSB increases HDR. Finally, the impact of PEI was considered. LP3K effectively enables donor delivery to the cell’s cytoplasm, and the substantially lower HDR rates observed for non-biotinylated donors compared to those observed with PEI suggest that the nuclear localization potential of PEI plays a significant role. If sufficient donor DNA is present in the nucleus, the chance of interaction between donor template and distally released genomic DNA from the Cas9 RNP increases, but would likely be governed by a random walk diffusion regime, as tethered localization through Cas9MAV is not enabled. Further mechanistic evaluation is beyond the scope of this paper, but both PEI and MAV modifications appear to have statistically relevant effects on the HDR frequencies achieved in this work (Fig.2).
NGS sequencing Confirmation of Editing Frequencies
To consider the frequency of HDR at a greater depth of sequencing we designed an Ion Torrent sequencing approach. We took 7 CXCR4 sample experiments covering HDR (with and without biotin donor), and unedited controls (Tables S5, S6). Figure 4C presents an overview of the libraries sequenced and Supplemental Table S4 details the variables of each experiment sequenced. ‘FControl’ returned a clear unedited population of sequences, replicating the canonical genomic sequence for CXCR4. Our principal interest was to evaluate the sequences returned for complete edits (full insertions of donor sequence) and partial edits (combination insert and indel formation). When FControl is compared to all other libraries (FBest, F60, BC4,5,6), the rate of HDR edits becomes apparent, showing that HDR has occurred at a 90% or greater rate (Fig. 4C&D). One concern with using non-sequencing approaches such as restriction digests for HDR or T7 for indel formation is the under- or overestimation of the editing that has occurred. The FBest library (CXCR4 with CF1b donor) is a good indication of correlation with the gel-based analysis given in Figure 2. These results suggest that suppression of indels and high editing percentage were achieved for the complete insert, with less than 5% being partial inserts/indels. In this case, the restriction digest provides a good guide for donor design performance, mirroring the indel suppression and high HDR percentage observed by NGS. When Fas1, a non-biotinylated asymmetric donor, was analysed (F60 library), we observed an unedited fraction persisting, but at a lower percentage (~2%) than was estimated from the average restriction digest value (Fig. 2C). It is worth considering that the potential for higher editing exists for Fas1, as the standard deviation margin of error suggests that even by gel analysis, the editing frequency ranges broadly (~20%).
The existence of a persistent partial HDR insert in all libraries excluding control suggests logically that no process is 100% perfect in its ability to insert at a loci. While the analytical approach (Ion Torrent) is prone to generation of indels, as are other sequencing by synthesis approaches, the partial insert frequency is greater than either enzymatic amplification errors or misreads. Partial insert reads are only observed in HDR experiment libraries suggesting that these reads are due to incomplete templating in HDR, though the mechanism cannot be speculated upon without further investigation.
Discussion
We demonstrated the potential of cationic polymer delivery of RNP with PEI as a streamlined and efficient Cas9 HDR methodology. Polymeric delivery, specifically its nuclear localization and high transfection rates, appears to be a contributing factor in the improvement of HDR rate observed at the CXCR4, TLR and EMX1 loci tested, while generating high frequency NHEJ at other loci (FLCN, 2 sgRNA targeting ACC1, and EMX1) when dDNA is not present. Further work is required to unravel the percentage contributions of the nuclear localization effect in HEK293T and other cell lines, and how generalizable the principle is to other cell lines such as embryonic stem cells with respect to RNP delivery. Additional studies should be performed on RNP delivery relating to polymer derivatization and polyplex size (Ogris et al, 1998). These polyplex properties have been shown to be key determinants of DNA transfection in different cell types, and could broaden the PEI:RNP transfection approaches available for the CRISPR/Cas9 system.
With respect to Cas9MAV and the 20% HDR efficiency increase observed over Cas9NLS without donor biotin mediated co-localization (Figure 2), PEI improves non-biotinylated dDNA delivery to the nucleus, but the monoavidin:biotin conjugate appears to relax further the design rules that usually operate for template donor during HDR. In previous studies, the necessity for double stranded dDNA with HAs of 400-600 bp has been required to achieve high HDR frequencies (Lin et al, 2014; Zhang et al, 2017). Using Cas9MAV, asymmetric short double stranded dDNA (137 bp, HA 37 bp and 93 bp, respectively) can achieve equal efficiency of HDR, mitigating the suboptimal design.
While the purpose of this research was to avoid the application of nucleofection/electroporation techniques that induce increased cell death in the cell line studied, the Cas9MAV RNP could be delivered by nucleofection as an alternative method (Supplemental Figure S3), albeit at the cost of cell viability. Nucleofection may lead to problems in translating the transfection method into in vivo studies: electroporation has a long history in animals (Adachi et al, 2002), but human studies are still under development (Jiang et al, 2015) in comparison to viral and non-viral transfection methodologies.
To codify the experimental results, the loci:sgRNAs combinations chosen were loosely termed the ‘good’, the ‘bad’ and the ‘ugly’, by indel frequency without HDR. The ‘good’ refers to a loci where the sgRNA scores highly in design programs and indel frequency is in excess of 60-80% as indicated by T7 endonuclease assay. We hypothesize that for HDR, a consistently high occurrence of DSBs is required with high recognition of target by sgRNA. We acknowledge that T7 assays cannot reflect conservative DSB repairs, but offer only an approximate measure of cut efficiency. This approximation refers specifically to the performance at CXCR4. The 'bad' is a loci (EMX1) where the indel frequency is around 60% denoting lower DSB frequency or higher conservative DSB repair. Success was determined by achieving HDR above 60% to be consistent with the hypothesis that indel and HDR frequencies in this system invert. The 'ugly' is an HDR assay using the traffic light reporter system (Certo et al, 2011), that in literature examples exhibits very low HDR frequency ranging from 1-4%. Correspondingly, TLR experiments show comparatively low NHEJ at 12% or less, even with Cas9 plasmid systems, reflective of the loci:sgRNA combined inefficiency. In this case, the determination of success was the suppression of NHEJ in favour of HDR and achieving HDR in excess of 4% without drug treatment, siRNA knockdowns, or other interventions (Robert et al, 2015). In the three loci tested, we observed high percentages of HDR occurrence and suppression of NHEJ, with varying degrees of HDR between the loci due to experimental factors. We caution that any Cas9 experiment’s final editing efficiency will be governed by a combination of two factors: 1) loci accessibility (condensed/uncondensed chromatin) and sgRNA quality (sequence recognition) will govern the ability to create DSBs, whereas 2) donor sequence and insert size influence the templating repair itself. 
While the latter is partially mitigated by application of the PEI:Cas9MAV system, the TLR experiment showed the importance of an efficient sgRNA: in this experiment the global HDR percentage was only 25%, reflecting the combined effect of locus and sgRNA quality.
The ability of the Cas9MAV system to remove many of the donor design constraints (length, asymmetry, single/double strand, HA length) offers an intriguing opportunity, which we hope to explore by testing other loci and by varying the size of the non-genomic DNA spliced between the HAs. Our results do not stand in isolation. Recent publications exploring the role of donor localization to the nucleus and, importantly, to the DSB site (Carlson-Stevermer et al, 2017; Ma et al, 2017) support the view that our system affects the mechanisms of DSB repair. It is uncertain at this point which repair pathway and which protein effectors of repair, such as DNA ligase, Ku heterodimer and Rad51/51, among others, are involved and at what stage. Figure 4D summarises our current model, focused on the benefit derived from biotin-mediated donor localization with two nuclear delivery approaches.
In contrast to other CRISPR/Cas9 methodologies involving plasmid or mRNA delivery, our system does not require the cell to transcribe sgRNA and translate protein, which can impede intracellular RNP assembly. By coupling the donor to Cas9, we remove the issues of a freely diffusing donor (plasmid or linear) and compartmentalisation of the donor in the cytosol. If the donor is not present, it is more likely that canonical DSB repair (NHEJ and conservative blunt end repair (Bétermier et al, 2014)) will occur rather than HDR. Recent work using aptamer:streptavidin localisation introduced ssODN to nuclei (Carlson-Stevermer et al, 2017), achieving ~6% HDR. This rate is lower than what could have been achieved with extracellular RNP assembly and hence an increased concentration of functional RNP:donor conjugates within the cell. In future work we hope to extend the transfection principle and Cas9 methodology to difficult-to-transfect cells other than MCF-7 and 4T1, and to develop a cell-type-specific mechanism for improving cationic polymer delivery.
Materials and Methods
Cell culture
HEK293T, MCF-7, 3T3, HeLa and U2OS cell lines were all cultured in DMEM supplemented with 10% FBS, 100 U/mL penicillin, and 100 U/mL streptomycin, and were maintained at 37°C and 5% CO2. The exception was 4T1, which was cultured in DMEM with 10% FBS, 10 mM HEPES, 1 mM sodium pyruvate, 36 mM NaHCO3, 5 µg/mL amphotericin, 0.5% gentamicin and 1% L-glutamine. Media, trypsin and FBS were supplied by Wisent. Cells were kept at low passage for experimentation, not exceeding 10 passages before starting fresh cultures from frozen stocks.
For transfections, seeding density was 3×10^5 cells per well for 6 well plates and 1×10^5 cells per well for 12 well plates, prepared on the day prior to PEI transfection. Our objective was a confluency of 60-70% at the time of transfection, so for fast growing (HeLa) or larger (U2OS) cells, seeding densities of 1×10^5 and 0.8×10^5 respectively were used for 6 well plates.
Cell Viability
Cell viability was determined by trypan blue staining, with the stain mixed in a 1:1 v/v ratio with a 5 µl sample of harvested cells. Staining was assessed using a Countess II or visually by haemocytometer. All cells were maintained at 97% viability post transfection.
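As an illustration of the arithmetic behind a manual trypan blue count, the following minimal Python sketch computes percentage viability and cell concentration. It assumes an improved Neubauer haemocytometer, where one large square holds 0.1 µl, and the 1:1 stain dilution described above. The function names are hypothetical and are not taken from the original protocol.

```python
def viability_percent(live: int, dead: int) -> float:
    """Percentage of unstained (live) cells among all counted cells."""
    total = live + dead
    if total == 0:
        raise ValueError("no cells were counted")
    return 100.0 * live / total


def cells_per_ml(mean_count_per_square: float, dilution_factor: float = 2.0) -> float:
    """Cell concentration from a haemocytometer count.

    Assumes one large square of an improved Neubauer chamber holds
    0.1 µl (1e-4 ml). A 1:1 v/v mix with trypan blue gives a
    dilution factor of 2.
    """
    return mean_count_per_square * dilution_factor * 1e4


if __name__ == "__main__":
    print(viability_percent(live=97, dead=3))  # percentage viability
    print(cells_per_ml(50))                    # cells per ml
```

Automated counters such as the Countess II report these values directly, so the sketch is only relevant to manual counts.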
sgRNA
Full sequences and synthesis by in vitro transcription are described in Supplemental Methods.
PEI preparation
100 mg of linear PEI (Polysciences) was dissolved in 100 ml of nuclease-free MilliQ water. The solution was magnetically stirred for 4 h at room temperature in a sterile Duran bottle. Once the polymer had dissolved, the pH was adjusted to 7.4 using concentrated HCl. Secondary amine protonation was assessed by ninhydrin assay on aliquots. The solution was filtered (0.22 µm), aliquoted into 1.5 ml Eppendorf tubes and stored at −20°C. Aliquots were removed as required for experiments and stored at 4°C for polyplex formation, with a maximum usage period of two weeks. PEI fluorescent labelling (FAM-PEI synthesis), secondary amine assay and transfection evaluation are described in Supplemental Methods.
Cloning of SpCas9-NLS-mono avidin
SpCas9 constructs with a C-terminal 17 aa linker followed by monoavidin were cloned by overlap extension PCR. Primers are listed in Supplemental Info, Table S2. Fragment 1 was amplified using an internal EcoRI site in pMJ915 (Lin et al, 2014) and primers SpCas9_EcoRI_445_for and NLS_18linker_rev using iProof High-Fidelity DNA Polymerase (Biorad, USA). Fragment 2 was amplified from pRSET-mSA, encoding monoavidin for high affinity binding of one biotin (Lim et al, 2013), using primers 18linker_avidin_for and avidin_NotI_rev, and VENT DNA polymerase (NEB, USA). Fragments 1 and 2 were combined and subjected to 10 cycles of overlap extension, before primers SpCas9_EcoRI_445_for and avidin_NotI_rev were added for amplification. The resulting fragment was digested with EcoRI and NotI, and ligated into pMJ806 (36) with kanamycin resistance.
pMJ915 and pMJ806 were gifts from Jennifer Doudna (Addgene plasmids #69090 and #39312). pRSET-mSA was a gift from Sheldon Park (Addgene plasmid #39860).
The resulting fusion construct contained an N-terminal hexahistidine-maltose binding protein (His-MBP) tag, followed by a tobacco etch virus (TEV) protease cleavage site and wild type SpCas9 with two C-terminal SV40 nuclear localization signals, and lastly a 17 aa linker to monoavidin. A plasmid map of Cas9MAV is included in supplemental data SD2.
Purification of Cas9 proteins
SpCas9 fusion constructs were expressed and purified essentially as described previously (36). Briefly, proteins were expressed in BL21(DE3) Rosetta2 cells grown in LB media at 18°C for 16 h following induction with 0.2 mM IPTG at OD600 = 0.8. The cell pellet was lysed in 500 mM NaCl, 5 mM imidazole, 20 mM Tris-HCl pH 8, 1 mM PMSF and 2 mM β-mercaptoethanol (β-ME), and disrupted by sonication. The cleared lysate was subjected to Ni affinity chromatography using two prepacked 5 mL HisTrap columns per 1-2 L of cell culture. The columns were washed extensively, first in 20 mM Tris pH 8.0, 250 mM NaCl, 5 mM imidazole pH 8.0, 2 mM β-ME, and then in 20 mM HEPES pH 7.5, 200 mM KCl, 10% glycerol, 0.5 mM DTT, before elution with 250 mM imidazole. The His-MBP affinity tag was removed by overnight TEV protease cleavage without dialysis.
The cleaved Cas9 protein was separated from the tag and co-purifying nucleic acids by purification on a 5 mL Heparin HiTrap column eluting with a linear gradient from 200 mM - 1 M KCl over 12 CV. Lastly, a gel filtration on Superdex 200 Increase 10/300 GL in 5% glycerol, 250 mM KCl, 20 mM HEPES pH 7.5 separated the nucleic-acid bound protein from the clean SpCas9 protein. Eluted protein was concentrated to ~10 mg/mL, flash-frozen in liquid nitrogen and stored at −80°C.
RNP Formation
2.1 µl of 1x phosphate buffered saline (sterile and 0.22 µm filtered), 1.7 µl of Cas9 (11 mg/ml) or Cas9MAV, and 2.1 µl of 300 nM to 900 nM sgRNA (concentration varied with respect to Cas9 molarity to maintain a 1:1 ratio) were combined in a sterile PCR tube, vortexed gently and incubated for 20 minutes at 25°C. For HDR experiments, 5-20 µl of 300 ng/µl biotinylated or non-biotinylated donor was added 15 minutes into the incubation, with the amount added adapted to the concentration of Cas9MAV used. Biotin:monoavidin association occurs within 5 minutes.
Donor DNA
Donor DNA was created by PCR amplification from gBlocks (CXCR4 and EMX1 loci) and, for the TLR donor, from a plasmid kindly donated by Francis Robert from the Pelletier Lab (Department of Biochemistry, McGill). Details of primers and gBlocks are found in Supplemental Data. For biotinylated donors, the forward or reverse primer was substituted with a 5’ biotinylated primer. All oligos were quantified after PCR using a NanoDrop, and the concentration of working stocks was 300 ng/µl in all cases. Oligos were used without any further purification. Oligonucleotides used in this research were purchased from IDT (Integrated DNA Technologies, USA) and BioCorp (Montreal, Quebec).
PEI:RNP Polyplex Formation
After formation of the RNP, 30 µl of PEI (25 kDa, 1 mg/ml) was added to the RNP and vortexed. The resulting solution was incubated for 20 minutes at 25°C. After incubation, 71.5 µl of DMEM was added and the tubes were incubated at 37°C for 20 minutes to equilibrate to the temperature of the cells to be transfected. The polyplex solution was added dropwise to the cells. The above protocol is sufficient for transfecting one well of a 6 well plate, or three wells of a 12 well plate. For larger transfections, quantities were scaled by the estimated cell density of the culture vessel. For 6 well plates, we estimate 1 million cells at the point of transfection.
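The scaling described above can be sketched as a simple proportional calculation. The Python snippet below is an illustrative sketch only: it assumes that every component is scaled linearly with the estimated number of cells, relative to the 1 million cells estimated for one well of a 6 well plate. The names `BASE_VOLUMES_UL`, `BASE_CELLS` and `scale_polyplex` are hypothetical, and donor DNA is left out because its volume (5-20 µl) depends on the Cas9MAV concentration used.

```python
# Volumes (µl) for one well of a 6 well plate, taken from the text.
# Donor DNA (5-20 µl) is omitted: its volume depends on the Cas9MAV concentration.
BASE_VOLUMES_UL = {
    "PBS": 2.1,
    "Cas9 or Cas9MAV": 1.7,
    "sgRNA": 2.1,
    "PEI (1 mg/ml)": 30.0,
    "DMEM": 71.5,
}

# Estimated cells in one well of a 6 well plate at transfection (from the text).
BASE_CELLS = 1_000_000


def scale_polyplex(estimated_cells: float) -> dict:
    """Scale each component linearly with cell number (an assumption)."""
    if estimated_cells <= 0:
        raise ValueError("estimated_cells must be positive")
    factor = estimated_cells / BASE_CELLS
    return {name: round(vol * factor, 2) for name, vol in BASE_VOLUMES_UL.items()}


if __name__ == "__main__":
    # Example: a vessel estimated to hold 2 million cells at transfection.
    for name, vol in scale_polyplex(2_000_000).items():
        print(f"{name}: {vol} µl")
```

Whether strictly linear scaling holds for very large vessels is not established here and would need to be checked experimentally.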
SEC-MALS
A size-exclusion chromatography multi-angle light scattering (SEC-MALS) experiment for Cas9MAV was prepared by injecting 50µl sample at 3 mg/ml onto a Superose 6 increase 10/300 GL column (Sigma) equilibrated in 20 mM HEPES, pH 7.5 and 150 mM KCl at a flow rate of 0.3 ml/min. The eluted peak was analyzed using a Wyatt miniDAWN TREOS multi-angle light scattering instrument and a Wyatt Optilab rEX differential refractometer. Data were evaluated using the ASTRA 5.3.4 software with BSA at 5 mg/ml as a reference.
Dynamic Light Scattering (DLS)
Cas9MAV/sgRNA/dDNA complex hydrodynamic radius and PEI polyplex dimensions were evaluated using a Malvern Nanosizer S dynamic light scattering instrument. Measurements were performed with biotinylated donor DNA at a concentration of 2 µM, and preformed Cas9/sgRNA concentrations were adjusted accordingly to give a 1:1 molar ratio. All samples were measured in the same buffer (150 mM KCl, 20 mM HEPES pH 7.5) at a volume of 20 µl in a Zen2112 cuvette. The flow cell was equilibrated to 37°C to mirror the cell culture conditions that the polyplexes and Cas9 variant would experience during transfection. Measurements were performed for 1 min with a minimum of 15 measurements. The PEI concentration was 1 mg/ml.
DNA extraction
Cells were harvested using 0.025% trypsin (0.5 ml per well for 12 well plates and 1 ml for 6 well plates) after a 1x PBS wash and centrifuged in 1.5 ml Eppendorf tubes at 8000k for 5 minutes. The supernatant was discarded and the pellet washed with 0.5 ml PBS. Cells were spun again and the supernatant discarded. The full extraction protocol is detailed in Supplemental Methods. Compositions of the homemade lysis, column binding, wash and elution buffers are given in Supplemental Data. Proteinase K and RNase 1 were sourced from Sigma Aldrich.
RNA extraction
Cellular RNA was extracted with Trizol following the manufacturer’s protocol (Thermofisher, USA) and cleaned up on Direct-zol columns (Zymo Research) to prepare RNA for RT-PCR validation of genomic edits. DNase I treatment for enzymatic degradation of genomic DNA was performed following the manufacturer’s protocol. DNA degradation was verified by gel electrophoresis. DNase I and buffer were supplied by NEB, USA.
PCR
Genomic PCR was performed with either Q5 2X master mix or Phusion polymerase (NEB, USA). PCR was also used for donor DNA creation, using IDT gBlocks as templates. All oligos used are specified in Supplemental Table S2.
RT-PCR
Reverse transcription was performed using the High reverse transcriptase kit (Biobasic, Canada). For the PCR step, 2 µl of the resulting solution was used in the standard Phusion master mix protocol (NEB, USA).
Gel Electrophoresis
For analysis of PCR products and enzymatic digests, 2% agarose (Biobasic, Montreal) gels were prepared and a Fluoro-loading dye (Zmtech, Montreal) was used to visualize the DNA (2 µl per 25 µl volume). Gels were run at 90 V in 1x TAE buffer. For RNA analysis, urea polyacrylamide gels (12%) were pre-run for half an hour at 130 V prior to loading and then run at 120 V for one hour in 1x TBE buffer. RNA gels were stained with an ethidium bromide staining solution (10 µl of 10 mg/ml EtBr in 30 ml TBE). Gel images were recorded using UV illumination and an imaging gel dock.
T7 Endonuclease Indel assay
The T7 indel assay was performed according to the protocol of Guschin et al (37), with products quantified by agarose gel electrophoresis. PCR amplification of edited and unedited genomic DNA was performed using Q5 polymerase. All T7 endonuclease assays were performed with a minimum of 3 biological replicates. ImageJ was used for analysis and quantification of band intensity, as detailed in Supplemental Methods.
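For orientation, indel frequency is commonly estimated from T7 endonuclease band intensities as 100 × (1 − √(1 − f)), where f is the fraction of total band intensity found in the cleavage products. The Python sketch below implements that widely used estimate. It is an assumption that this matches the calculation in Supplemental Methods, which is not reproduced here, and the function name `t7_indel_percent` is hypothetical.

```python
import math
from typing import Sequence


def t7_indel_percent(parent_band: float, cleaved_bands: Sequence[float]) -> float:
    """Estimate indel frequency (%) from background-corrected band intensities.

    parent_band   -- intensity of the uncleaved PCR amplicon
    cleaved_bands -- intensities of the T7 cleavage products
    """
    cleaved = sum(cleaved_bands)
    total = parent_band + cleaved
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    fraction_cleaved = cleaved / total
    # The square root accounts for heteroduplexes forming by random
    # re-annealing of edited and unedited strands.
    return 100.0 * (1.0 - math.sqrt(1.0 - fraction_cleaved))


if __name__ == "__main__":
    # Intensities in arbitrary units, e.g. as exported from ImageJ.
    print(t7_indel_percent(parent_band=25.0, cleaved_bands=[37.5, 37.5]))
```

As noted in the Discussion, this type of estimate cannot capture conservative DSB repair and should be read as an approximate measure of cut efficiency.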
Restriction Digest HDR assay
HDR frequency was assessed by restriction digest analysis. The chosen genomic loci were analysed with NEBcutter 2.0 to identify restriction sites present in the endogenous DNA. Enzymes whose recognition sequences were present in the endogenous locus were excluded, and enzymes whose recognition sequences were absent were used to design the donor inserts. For CXCR4 and EMX1, the donor sequences TAG-HindIII and BamHI, respectively (see Figures 2 and 3), were inserted into gBlocks with homology arms matching the genomic loci. Details of the gBlock sequences and the subsequent PCR method for creating donors of various lengths are included in the supplemental data. Restriction digest was performed on PCR amplicons from DNA extracted from edited and unedited cells, following the manufacturer’s protocol (NEB) for CutSmart high-fidelity restriction enzymes. All HDR restriction digest assays were performed with a minimum of 3 biological replicates. In brief, 5 µl of PCR amplicon was diluted in 35 µl nuclease-free water and 5 µl CutSmart buffer. 1 µl of high-fidelity restriction enzyme (HindIII or BamHI) was added, the reaction was vortexed, and the volume was adjusted to a final 50 µl. The reaction was incubated at 37°C for 15 minutes and quenched with Fluoro-DNA loading dye 6X (Zmtech Scientific, Canada), before analysis directly on 2% agarose gels. Gel images were analysed in ImageJ and percentage HDR was calculated according to equation 2 (Supplemental Methods).
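Equation 2 is given in Supplemental Methods and is not reproduced here. As a hedged illustration, the Python sketch below uses a common densitometric estimate for restriction digest assays, in which the HDR percentage is the digested band intensity divided by the total intensity of digested and undigested bands. Treating this as equivalent to equation 2 is an assumption, and the function name `hdr_percent` is hypothetical.

```python
from typing import Sequence


def hdr_percent(undigested_band: float, digested_bands: Sequence[float]) -> float:
    """Estimate HDR frequency (%) from background-corrected band intensities.

    undigested_band -- intensity of the amplicon left uncut by the enzyme
    digested_bands  -- intensities of the fragments released by HindIII or BamHI
    """
    digested = sum(digested_bands)
    total = undigested_band + digested
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    return 100.0 * digested / total


if __name__ == "__main__":
    # Intensities in arbitrary units, e.g. as exported from ImageJ.
    print(hdr_percent(undigested_band=40.0, digested_bands=[35.0, 25.0]))
```

Unlike the T7 assay, no square-root correction is applied here, because the enzyme cuts only amplicons that carry the inserted restriction site.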
TLR HDR assay
The TLR assay was performed using cells transfected by AAV virus with pCVL-TL-dsRed Reporter 2.1 (VF2468 ZFN target) EF1a inserted at a safe harbour locus (Robert et al, 2015). The TLR transfected cell line was a kind gift from Francis Robert in the Pelletier lab. Cells were plated into 12 well plates and split into 6 well plates 24 hours after transfection. Cells were then grown for 5 days, with samples taken for analysis. Control Cas9 plasmid transfections were performed by PEI transfection (1 µg plasmid DNA, 30 µl PEI and 300 µl DMEM for a 2 well transfection). PEI:Cas9MAV was transfected into cells with 5’ biotinylated, 3’ biotinylated and 5’3’ biotinylated donors. Cells were prepared for flow cytometry at 48 h and 96 h. Further details are given in Supplemental Methods.
The Cas9 plasmid was a gift from Jerry Pelletier (McGill University).
Fluorescent Microscopy of TLR assay Cells
For TLR assay cell plates (12 well and 6 well), GFP and RFP fluorescent images were collected using a Zeiss AX10 inverted fluorescence microscope and standard GFP and RFP filters. Images were acquired with an AxioCam MRm camera and Axiovision software. TIFF images were processed in ImageJ.
Cloning and Sanger sequencing
PCR products for Sanger sequencing were cloned into bacterial blunt end cloning vectors (pCR, pUCM) using blunt end cloning kits (Thermo Fisher ZeroBlunt kit and Biobasic pUCM-T cloning vector kit). Plasmids were transformed into XL1 Blue competent cells. Cells were grown on selective plates and colonies picked for overnight broth culture and subsequent mini-prepping (kit). Where the pUCM-T vector was used, 1-TAQ polymerase (NEB, USA) was used to give the adenosine overhang. Sanger sequencing was performed by the McGill University and Génome Québec Innovation Centre (Montreal, Quebec). Sequence analysis for Sanger sequencing was performed in CLC Main Workbench (QIAGEN Bioinformatics, Redwood City, CA, USA) for sequence alignment and identification of donor insert sequences.
Ion Torrent
Ion Torrent sequencing of DNA samples was performed on a ThermoFisher Scientific Ion Torrent Personal Genome Machine™ (PGM) apparatus using the Ion 316 chip kit v2 (ThermoFisher Scientific, 4483324), with a potential yield of 6 million reads. Samples were multiplexed on the 316 chip by barcoding 9 samples by PCR prior to the Ion Torrent workflow. Details of the barcoding primers are included in supplemental data. After barcoding PCR, samples were run on a 2% gel, and each band corresponding to the library amplicons (approximately 200 bp) was excised and gel extracted (Qiagen gel extraction kit). DNA was quantified by NanoDrop and diluted to 8 pmol prior to the Ion Torrent workflow.
Ion Torrent CRISPR/Cas9 read processing
Sequenced products were filtered for a minimum length of 61 nucleotides (minimum expected distance to target subsequences of interest – described below) and mapped to a custom BLAST database (v2.5.0) consisting of 14 sequences (a combination of seven barcodes [plus primers] and two expected sequences [no insertion or the insertion of a stop codon and HindIII restriction enzyme site]). BLAST mapped libraries were then analyzed for two target subsequences: 1) the presence of an insertion (TAGAAGCTT) or 2) absence of any change in the canonical sequence near the CRISPR/Cas9 insertion site (CCAGGAT). Aligned reads that did not completely overlap with the target subsequences were removed (‘truncated’ reads). Any alignments that resembled a partial insertion (e.g., TAGA, TAGAA, TAG-A, etc.) or non-insertion (based on BLAST alignments) were then analyzed for their frequency of insertions or deletions (indels) at a given position of the target site, including five nucleotides up-/down-stream of the target subsequence (see Supplemental Data). A total of 793,117 sequenced CRISPR/Cas9 products were produced, where 2.522%, 0.104%, and 97.374% of products were too short to be aligned to expected product sequences, unmappable by the BLAST algorithm, or able to be aligned by BLAST to the expected product sequences, respectively. Of the 772,286 mappable products: 728,664 (91.873% of total sequenced products) contained a perfect match to the target subsequences; 16,083 (2.028%) produced alignments that did not overlap with expected target subsequences; and the remaining 27,539 (3.472%) contained partial insertions/non-insertions.
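The classification logic described above can be sketched in a few lines of Python. This is a simplified illustration and not the authors' pipeline (the actual processing code is provided as SD8): it replaces the BLAST mapping step with exact substring matching on the forward strand, and it groups truncated reads and partial insertions under a single "other" label. The function and constant names are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

MIN_LENGTH = 61            # minimum read length, from the text
INSERTION = "TAGAAGCTT"    # stop codon (TAG) plus HindIII site (AAGCTT)
UNEDITED = "CCAGGAT"       # canonical sequence near the target site


def classify_read(seq: str) -> str:
    """Assign a read to one of four simplified categories."""
    seq = seq.upper()
    if len(seq) < MIN_LENGTH:
        return "too_short"
    if INSERTION in seq:
        return "insertion"
    if UNEDITED in seq:
        return "unedited"
    # Partial insertions, indels and truncated alignments end up here.
    return "other"


def summarise(reads: Iterable[str]) -> Dict[str, Tuple[int, float]]:
    """Return {category: (count, percent of all reads)}."""
    counts = Counter(classify_read(r) for r in reads)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {cat: (n, 100.0 * n / total) for cat, n in counts.items()}


if __name__ == "__main__":
    demo_reads = [
        "A" * 30 + INSERTION + "C" * 30,  # HDR insertion present
        "A" * 30 + UNEDITED + "G" * 30,   # unmodified target site
        "A" * 70,                         # neither subsequence found
        "CCAGGAT",                        # shorter than MIN_LENGTH
    ]
    for category, (n, pct) in summarise(demo_reads).items():
        print(f"{category}: {n} reads ({pct:.1f}%)")
```

The counts reported above are internally consistent: 728,664 + 16,083 + 27,539 = 772,286 mappable products.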
Competing Interests
The authors have no competing financial interests.
Summary of Supplemental Data sets
SD1 - Tabular summary of HDR improvement strategies
SD2 - Plasmid Map of Cas9MAV
SD3 - Excel spreadsheet for gel quantifications for T7 and Restriction Digest HDR assay
SD4 - TLR flow cytometry statistics
SD5 - PDF file of Sanger sequenced clone raw reads and chromatograms
SD6 - Analysis of Ion Torrent sequencing
SD7 - Fastq file for the Ion Torrent libraries
SD8 - Ion Torrent sequencing data processing code
Acknowledgements
The authors would like to acknowledge the financial support of the Faculty of Medicine, McGill University (Akavia), Canadian Institutes of Health Research (CIHR grant MOP-133535 to Nagar; CIHR grant MOP-142451 to Dostie), and the Natural Sciences and Engineering Research Council of Canada (NSERC Discovery grant to Blanchette). We would like to thank Francis Robert and Jerry Pelletier (helpful discussions, loan of equipment, and reagents), Bozena Samborska from Russell Jones' lab (helpful discussions, gift of reagents and cell lines), Sawn McGuirk from Julie St. Pierre's lab, Jutta Steinberger from Jerry Pelletier's lab, Sidong Huang (material gifts of cell lines), Alba Guarne and Kalle Gehring (providing access to DLS and SEC-MALS, respectively).