Abstract
Prokaryotic Argonaute proteins (pAgos) constitute a diverse group of endonucleases of which some mediate host defense by utilizing small interfering DNA guides (siDNA) to cleave complementary invading DNA. This activity can be repurposed for programmable DNA cleavage. However, currently characterized DNA-cleaving pAgos require elevated temperatures (≥65°C) for their activity, making them less suitable for applications that require moderate temperatures, such as genome editing. Here we report the functional and structural characterization of the siDNA-guided DNA-targeting pAgo from the mesophilic bacterium Clostridium butyricum (CbAgo). CbAgo displays a preference for siDNAs that have a deoxyadenosine at the 5’-end and thymidines in the sub-seed segment (siDNA nucleotides 2-4). Furthermore, CbAgo mediates DNA-guided DNA cleavage of AT-rich double stranded DNA at moderate temperatures (37°C). This study demonstrates that certain pAgos are capable of programmable DNA cleavage at moderate temperatures and thereby expands the scope of the potential pAgo–based applications.
Introduction
Eukaryotic Argonaute proteins (eAgos) play a key role in RNA interference (RNAi) processes1–3. As the core of the multiprotein RNA-induced silencing complex (RISC), eAgos bind small non-coding RNA molecules as guides to direct the RISC complex towards complementary RNA targets3–5. Reflecting their physiological function, variation among eAgos is observed with respect to the presence or absence of a catalytic site, and to their potential to interact with other proteins6. Depending on the eAgo and on the sequence complementarity between guide and target RNA, eAgo-guide complexes either catalyze endonucleolytic cleavage of the target RNA7 or indirectly silence the target RNA by repressing its translation and promoting its degradation through recruitment of additional silencing factors8. Independent of the mechanism, eAgo-mediated RNA binding generally results in sequence-specific silencing of gene expression. As such, eAgos can coordinate various cellular processes by regulating intracellular RNA levels.
Prokaryotes also encode Argonaute proteins (pAgos)9,10. Various pAgos share a high degree of structural homology with eAgos as both pAgos and eAgos adopt the same four domain (N-PAZ-MID-PIWI) architecture9–12. Despite their structural homology, several recently characterized pAgos have distinct functional roles and different guide and/or target preferences compared to eAgos. For example, several pAgos have been implicated in host defense by directly targeting DNA instead of RNA13–16. One of the best characterized mechanisms that pAgos utilize is DNA-guided DNA interference, which is demonstrated for pAgos from Thermus thermophilus (TtAgo), Pyrococcus furiosus (PfAgo), and Methanocaldococcus jannaschii (MjAgo)13–15,17–20. These pAgos use 5’-end phosphorylated small interfering DNAs (siDNAs) for recognition and successive cleavage of complementary DNA targets. This mechanism enables both TtAgo and PfAgo to mediate host defense against invading nucleic acids. Prokaryotes lack homologs of eukaryotic enzymes that are involved in guide biogenesis21. Instead, both TtAgo and MjAgo - besides the canonical siDNA-dependent target cleavage termed ‘slicing’ - exhibit an alternative nuclease activity termed ‘chopping’14,17. Chopping facilitates autonomous generation of small DNA fragments from dsDNA substrates. Subsequently, these DNA fragments generated during chopping can serve as siDNAs for canonical slicing14,17.
TtAgo and PfAgo can be programmed with short synthetic siDNAs, which allows them to target and cleave dsDNA sequences of choice in vitro13,15. This activity has enabled the repurposing of PfAgo as a universal restriction endonuclease for in vitro molecular cloning22. In addition, a diagnostic TtAgo-based application termed NAVIGATER (Nucleic Acid enrichment Via DNA Guided Argonaute from Thermus thermophilus) was developed, which enables enhanced detection of rare nucleic acids with single nucleotide precision23. In analogy with the now commonly used CRISPR-Cas9 and CRISPR-Cas12a enzymes24–26, it has also been suggested that pAgos could be repurposed as next-generation genome editing tools27. However, due to their thermophilic nature (optimum activity temperature ≥65°C) and low levels of endonuclease activity at the relevant temperatures (20-37°C), it is unlikely that the well-studied TtAgo, PfAgo and MjAgo are suitable for genome editing. The quest for a pAgo that can cleave dsDNA at moderate temperatures has resulted in the characterization of the Argonaute protein from Natronobacterium gregoryi (NgAgo), which was claimed to be the first pAgo suitable for genome editing purposes28. However, the study reporting this application has been retracted after a series of reproducibility issues28–30. Instead, it has been suggested that NgAgo targets RNA rather than DNA31.
Although considerable efforts have been made to elucidate the mechanisms and biological roles of pAgos, these efforts have mainly focused on pAgo variants from (hyper)thermophiles. This has left a large group of mesophilic pAgos unexplored. We here report the characterization of the Argonaute protein from the mesophilic bacterium Clostridium butyricum (CbAgo). We demonstrate that CbAgo can utilize siDNA guides to cleave both ssDNA and dsDNA targets at moderate temperatures (37°C). In addition, we have elucidated the macromolecular structure of CbAgo in complex with a siDNA guide and complementary ssDNA target in a catalytically competent state. CbAgo displays an unusual preference for siDNAs with a deoxyadenosine at the 5’-end and thymidines in the sub-seed segment (siDNA nt 2-4). The programmable DNA endonuclease activity of CbAgo provides a foundation for the development of pAgo-based applications at moderate temperatures.
Materials and methods
Plasmid construction
The CbAgo gene was codon harmonized for E. coli BL21 (DE3) and inserted into a pET-His6 MBP TEV cloning vector (obtained from the UC Berkeley MacroLab, Addgene #29656) by ligation independent cloning (LIC) with oligonucleotides oDS067 and oDS068 (Table S1). The resulting protein expression construct encodes the CbAgo polypeptide sequence fused to an N-terminal tag comprising a hexahistidine sequence, a maltose binding protein (MBP) and a Tobacco Etch Virus (TEV) protease cleavage site.
Generation of the Double mutant
The CbAgo double mutant (D541A, D611A) was generated using a protocol adapted from the Quick Directed Mutagenesis Kit instruction manual (Stratagene). The primers were designed using the web-based program primerX (http://bioinformatics.org/primerx).
CbAgo expression and purification
The CbAgo WT and DM proteins were expressed in E. coli BL21(DE3) Rosetta™ 2 (Novagen). Cultures were grown at 37°C in LB medium containing 50 μg ml-1 kanamycin and 34 μg ml-1 chloramphenicol until an OD600nm of 0.7 was reached. CbAgo expression was induced by addition of isopropyl β-D-1-thiogalactopyranoside (IPTG) to a final concentration of 0.1 mM. During expression, cells were incubated at 18°C for 16 hours with continuous shaking. Cells were harvested by centrifugation and lysed by sonication (Bandelin Sonopuls; 30% power, 1 s on/2 s off for 5 min) in lysis buffer containing 20 mM Tris-HCl pH 7.5, 250 mM NaCl and 5 mM imidazole, supplemented with an EDTA-free protease inhibitor cocktail tablet (Roche). The soluble fraction of the lysate was loaded on a nickel column (HisTrap HP, GE Healthcare). The column was extensively washed with wash buffer containing 20 mM Tris-HCl pH 7.5, 250 mM NaCl and 30 mM imidazole. Bound protein was eluted by increasing the concentration of imidazole in the wash buffer to 250 mM. The eluted protein was dialysed at 4°C overnight against 20 mM HEPES pH 7.5, 250 mM KCl and 1 mM dithiothreitol (DTT) in the presence of 1 mg TEV protease (expressed and purified according to Tropea et al. (2009)55) to cleave off the His6-MBP tag. Next, the cleaved protein was diluted in 20 mM HEPES pH 7.5 to lower the final salt concentration to 125 mM KCl. The diluted protein was applied to a heparin column (HiTrap Heparin HP, GE Healthcare), washed with 20 mM HEPES pH 7.5, 125 mM KCl and eluted with a linear gradient of 0.125-2 M KCl. Next, the eluted protein was loaded onto a size exclusion column (Superdex 200 16/600 column, GE Healthcare) and eluted with 20 mM HEPES pH 7.5, 500 mM KCl and 1 mM DTT. Purified CbAgo protein was diluted in size exclusion buffer to a final concentration of 5 μM. Aliquots were flash frozen in liquid nitrogen and stored at −80°C.
Co-purification of nucleic acids
CaCl2 and proteinase K (Ambion) were added to 500 pmoles of purified CbAgo in SEC buffer to final concentrations of 5 mM CaCl2 and 250 μg/mL proteinase K. The sample was incubated for 4 hours at 65°C. The nucleic acids were separated from the organic fraction by adding Roti phenol/chloroform/isoamyl alcohol pH 7.5-8.0 in a 1:1 ratio. The top layer was isolated and the nucleic acids were precipitated by adding 99% ethanol in a 1:2 ratio, supplemented with 0.5% linear polyacrylamide as a carrier. This mixture was incubated overnight at −20°C and centrifuged in a table-top centrifuge at 16,000 g for 30 min. Next, the nucleic acid pellet was washed with 70% ethanol and dissolved in 50 μL MilliQ water. The purified nucleic acids were treated with either 100 μg/mL RNase A (Thermo), 2 units DNase I (NEB) or both for 1 hour at 37°C, resolved on a denaturing urea polyacrylamide gel (15%) and stained with SYBR gold.
Single-stranded activity assays
Unless stated otherwise, 5 pmoles each of CbAgo, siDNA and target were mixed (1:1:1 ratio) in 2x reaction buffer containing 20 mM Tris-HCl (pH 7.5) supplemented with 500 μM MnCl2. The target was added after CbAgo and siDNA had been incubated for 15 min at 37°C. The complete reaction mixture was then incubated for 1 hour at 37°C. The reaction was terminated by adding 2x RNA loading dye (95% formamide, 0.025% bromophenol blue, 5 mM EDTA) and heating for 5 minutes at 95°C. The samples were then resolved on a 20% denaturing (7 M urea) polyacrylamide gel. The gel was stained with SYBR gold nucleic acid stain (Invitrogen) and imaged using a G:BOX Chemi imager (Syngene).
Double-stranded activity assay
In two half reactions, 12.5 pmoles of CbAgo was loaded with 12.5 pmoles of either forward or reverse siDNA in reaction buffer containing 10 mM Tris-HCl, 10 μg/ml BSA and 250 μM MnCl2. The half reactions were incubated for 15 min at 37°C. Next, both half reactions were mixed together and 120 ng target plasmid was added, after which the mixture was incubated for 1 hour at 37°C. After the incubation, the target plasmid was purified from the mixture using a DNA clean and concentrate kit (DNA Clean & Concentrator™-5, Zymo Research) according to the supplied protocol. The purified plasmid was subsequently cut using either EcoRI-HF (NEB) or SapI-HF (NEB) in CutSmart buffer (NEB) for 30 min at 37°C. A 6x DNA loading dye (NEB) was added to the plasmid sample prior to resolving it on a 0.7% agarose gel stained with SYBR gold (Invitrogen).
Crystallization
To reconstitute the CbAgo DM-siDNA-target DNA complex, siDNA and target DNA were pre-mixed at a 1:1 ratio, heated to 95°C, and slowly cooled to room temperature. The formed dsDNA duplex (0.5M) was mixed with CbAgo DM in SEC buffer at a 1:1:4 ratio (CbAgo DM:duplex DNA), and MgCl2 was added to a final concentration of 5 mM. The sample was incubated for 15 minutes at 20°C to allow complex formation. The complex was crystallized at 20°C using the hanging drop vapour diffusion method by mixing equal volumes of complex and reservoir solution. Initial crystals were obtained at a CbAgo DM concentration of 5 mg/ml with a reservoir solution consisting of 4 M sodium formate. Data were collected from crystals obtained using a complex concentration of 4.3 mg/ml and a reservoir solution containing 3.8 M sodium formate and 5 mM NiCl2 at 20°C. For cryoprotection, crystals were transferred to a drop of reservoir solution and flash-cooled in liquid nitrogen.
X-ray diffraction data were measured at beamline X06DA (PXIII) of the Swiss Light Source (Paul Scherrer Institute, Villigen, Switzerland). Data were indexed, integrated, and scaled using autoPROC (Vonrhein et al., 2011). Crystals of the CbAgo-siDNA-target DNA complex diffracted to a resolution of 3.55 Å and belonged to space group P6₃22, with one copy of the complex in the asymmetric unit. The structure was solved by molecular replacement using Phaser-MR (McCoy et al., 2007). The structure of TtAgo in complex with guide and target DNA strands (PDB: 5GQ9) was used as the search model after removing loops and truncating amino acid side chains. Phases obtained using the initial molecular replacement solution were improved by density modification using phenix.resolve (Terwilliger, 2004) and phenix.morph_model (Terwilliger et al., 2013). The atomic model was built manually in Coot (Emsley et al., 2010) and refined using phenix.refine (Afonine et al., 2012). The final model of the complex contains CbAgo residues 1-463 and 466-748, guide DNA residues 1–16, and target DNA residues (−18)–(−1).
Structure analysis
Core root mean square deviations (rmsd) of structure alignments were calculated using Coot SSM superpose (Krissinel et al., 2004). Intramolecular interactions were analysed using PDBePISA (Krissinel and Henrick, 2007). Figures were generated using PyMOL (Schrödinger).
Single-Molecule Experimental Set-Up
Single-molecule FRET measurements were performed with a prism-type total internal reflection fluorescence microscope. Cy3 and Cy5 molecules were excited at 532 nm and 637 nm, respectively. The resulting Cy3 and Cy5 fluorescence signal was collected through a 60X water immersion objective (UplanSApo, Olympus) with an inverted microscope (IX73, Olympus) and split by a dichroic mirror (635dcxr, Chroma). Scattered laser light was blocked by a triple notch filter (NF01-488/532/635, Semrock). The Cy3 and Cy5 signals were recorded using an EM-CCD camera (iXon Ultra, DU-897U-CS0-#BV, Andor Technology) with an exposure time of 0.1 s. All single-molecule experiments were done at room temperature (22 ± 2°C).
Fluorescent DNA and RNA preparation
The RNAs with amine modification (amino-modifier C6-U phosphoramidite, 10-3039, Glen Research) were purchased from STPharm (South Korea) and the DNAs with amine modification (internal amino modifier iAmMC6T) from Ella Biotech (Germany). The guide and target strands were labeled with donor (Cy3) and acceptor (Cy5), respectively, using the NHS-ester form of Cy dyes (GE Healthcare). 1 μL of 1 mM DNA/RNA dissolved in MilliQ H2O was added to 5 μL of freshly prepared labeling buffer (sodium tetraborate, 380 mg/10 mL, pH 8.5). 1 μL of 20 mM dye (1 mg in 56 μL DMSO) was added and the mixture was incubated overnight at room temperature in the dark, followed by washing and ethanol precipitation. The labeling efficiency was ~100%.
Single-molecule sample preparation
A microfluidic chamber was incubated with 20 μL streptavidin (0.1 mg/mL, Sigma) for 30 s. Unbound streptavidin was washed away with 100 μL of buffer T50 (10 mM Tris-HCl [pH 8.0], 50 mM NaCl). Fifty microliters of 50 pM acceptor-labelled target construct was introduced into the chamber and incubated for 1 min. Unbound labeled constructs were washed away with 100 μL of buffer T50. The CbAgo binary complex was formed by incubating 10 nM purified CbAgo with 1 nM of donor-labeled guide in a buffer containing 50 mM Tris-HCl [pH 8.0] (Ambion), 1 mM MnCl2, and 100 mM NaCl (Ambion) at 37°C for 20 min. For binding rate (kon) measurements, the binary complex was introduced into the fluidics chamber using a syringe during the measurement. The experiments were performed at room temperature (23 ± 1°C).
For fluorescence guide-loading experiments, 1 μL of 5 μM His-tagged apo-CbAgo was incubated with 1 μL of 1 μg/ml biotinylated anti-6x His antibody (Abcam) for 10 min before CbAgo was immobilized on the single-molecule surface. Afterward, the mixture was diluted 500x in T50 and 50 μL was loaded in the microfluidic channel for 30 s incubation, followed by washing with 100 μL of T50 buffer. Cy3-labeled ssDNA (0.1) was applied to the microfluidic chamber in imaging buffer (50 mM Tris-HCl pH 8.0, 100 mM NaCl, 1 mM MnCl2, 1 mM Trolox ((±)-6-Hydroxy-2,5,7,8-tetramethylchromane-2-carboxylic acid), supplemented with an oxygen-scavenging system (0.5 mg/mL glucose oxidase (Sigma), 85 mg/mL catalase (Merck), and 0.8% (v/v) glucose (Sigma))).
Single-molecule data acquisition and analysis
CCD images with a time resolution of 0.1 or 0.3 s were recorded, and time traces were extracted from the CCD image series using IDL (ITT Visual Information Solutions). Co-localization between Cy3 and Cy5 signals was carried out with a custom-made mapping algorithm written in IDL. The extracted time traces were processed using Matlab (MathWorks) and Origin (Origin Lab).
The binding rate (kon) was determined by first measuring the time between the introduction of the CbAgo binary complex into the microfluidic chamber and the docking of the first CbAgo-guide complex to a target, and then fitting the time distribution with a single-exponential growth curve, A(1 − e^(−kon·t)). The dissociation rate was estimated by measuring the dwell time of a binding event. The dwell time distribution was fitted with a single-exponential decay curve, A·e^(−t/Δτ).
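As a minimal sketch of the dwell-time analysis described above, the snippet below estimates the time constant Δτ of a single-exponential dwell-time distribution. It deliberately swaps the histogram fit used in this study for the maximum-likelihood estimator (the sample mean), which needs no binning. The function names, the simulated dwell times and the 10 s time constant are illustrative assumptions, not measured values.

```python
import math
import random


def mle_time_constant(dwell_times):
    """Maximum-likelihood estimate of tau for p(t) = (1/tau) * exp(-t/tau).

    For an exponential distribution the MLE of tau is the sample mean.
    """
    if not dwell_times:
        raise ValueError("need at least one dwell time")
    return sum(dwell_times) / len(dwell_times)


def exponential_decay(t, amplitude, tau):
    """Single-exponential decay curve A * exp(-t / tau)."""
    return amplitude * math.exp(-t / tau)


def exponential_growth(t, amplitude, rate):
    """Single-exponential growth curve A * (1 - exp(-rate * t))."""
    return amplitude * (1.0 - math.exp(-rate * t))


def simulate_dwell_times(tau, n, seed=0):
    """Draw n hypothetical dwell times (s) from an exponential with mean tau."""
    rng = random.Random(seed)
    return [rng.expovariate(1.0 / tau) for _ in range(n)]


if __name__ == "__main__":
    true_tau = 10.0  # hypothetical time constant in seconds, not a measured value
    dwells = simulate_dwell_times(true_tau, n=5000)
    print(f"estimated tau = {mle_time_constant(dwells):.2f} s")
```

Note that binding events longer than the observation window are truncated in real traces, so the sample mean underestimates Δτ for very stable complexes.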
Fluorescence competition experiments
MBP-tagged CbAgo was immobilized on the quartz surface using an anti-MBP antibody. An equimolar mixture of let7 DNA guide (Cy3 labeled) and let7 RNA guide (Cy5 labeled) in imaging buffer was introduced to the microfluidic chamber. After 5 minutes, 10 snapshots of independent fields of view with simultaneous illumination were collected to estimate the amount of guide molecules bound to protein. Movies were taken for 200 s (2000 frames) at continuous illumination of Cy3 and Cy5 molecules to determine the dwell times of the binding events. Dwell times were binned in a histogram and fitted with a single exponential decay curve.
FRET targeting experiments of ATTT and AAAA guide target combinations
100 pM of target construct annealed with a biotin handle was flushed into the microfluidic chamber. After incubation for 1 min, the microfluidic chamber was rinsed with 100 μL T50 buffer. 10 nM of apo-CbAgo was loaded with 1 nM of ATTT seed DNA guide or with AAAA seed DNA guide at 37°C for 30 minutes in imaging buffer, after which the mixture was introduced into the microfluidic chamber. Movies of 200 s were taken at continuous illumination of the Cy3 signal. Site-specific protein-target interactions were identified as FRET signals and were further analysed.
Results
CbAgo mediates siDNA-guided ssDNA cleavage
CbAgo was successfully expressed in E. coli from a codon-optimized gene using a T7-based pET expression system and purified (Supplementary Figure S1A). To determine the guide and target binding characteristics of CbAgo, we performed single-molecule experiments using Förster resonance energy transfer (FRET). We immobilized either Cy5-labeled single stranded RNA or DNA targets (FRET acceptor) on a polymer-coated quartz surface (Figure 1A). Next, we introduced CbAgo in complex with either a Cy3-labeled siRNA or siDNA guide (FRET donor) and recorded the interactions. Strikingly, CbAgo could utilize both siRNAs and siDNAs to bind DNA or RNA targets (Figure 1B). To test which guide is preferentially bound by CbAgo, we performed a competition assay in which CbAgo was immobilized in the microfluidic chamber and an equimolar mixture of siDNAs and siRNAs was introduced. While only short-lived interactions (average dwell time: 0.48 seconds) were observed for siRNA, siDNA was strongly bound (average dwell time: 44 seconds) by CbAgo (Figure 1C). This result suggests that CbAgo utilizes siDNA rather than siRNA as a guide.
CbAgo is phylogenetically most closely related to the clade of halobacterial pAgos, which also includes the pAgo from Natronobacterium gregoryi (NgAgo) (Figure 1D and Supplementary Figure S2). A multiple sequence alignment of CbAgo with other pAgos (Supplementary Figure S1B) suggests that CbAgo contains the conserved DEDX catalytic residues (where X can be a D, H or N) which are essential for nuclease activity in ‘slicing’ Agos32. In the case of CbAgo, this concerns residues D541, E577, D611 and D727.
To confirm whether CbAgo is indeed an active nuclease, we performed in vitro activity assays in which CbAgo was loaded with either synthetic siDNAs or siRNAs (21 nucleotides in length). Next, the complexes were incubated at 37°C with 45-nucleotide complementary single stranded RNA or DNA target oligonucleotides. While no activity was found in any of the combinations in which siRNAs or target RNAs were used, CbAgo was able to cleave target DNAs in a siDNA-dependent manner (Figure 1E). In agreement with the predicted DEDD catalytic site (Supplementary Figure S1B), alanine substitutions of two of the aspartic acids (D541A, D611A) in the expected catalytic tetrad abolished the nuclease activity, demonstrating that the observed siDNA-guided ssDNA endonucleolytic activity was indeed catalyzed by the DEDD catalytic site. To further investigate the full temperature range at which CbAgo is active, we performed additional cleavage assays at temperatures ranging from 10-95°C. While CbAgo displayed the highest activity at its physiologically relevant temperature (37°C), CbAgo also catalyzed siDNA-guided target DNA cleavage at temperatures as low as 10°C and as high as 50°C (Figure 1F).
When CbAgo-siDNA complexes and target ssDNA substrates (45nt) were mixed in equimolar amounts, cleavage of the target DNA was not complete after one hour incubation (Figure 1E). Therefore, we investigated the substrate turnover kinetics of CbAgo by monitoring the cleavage assays in a time course using variable CbAgo:siDNA:target DNA ratios (Figure 1G). A rapid burst of activity was observed during the first minute, likely indicating the first target binding and cleavage event. This stage was followed by a slow steady state, suggesting that under these conditions the CbAgo-siDNA complex slowly dissociates from the cleaved target DNA product before being able to bind and cleave a new target DNA strand. The cleavage kinetics were confirmed using single-molecule assays which demonstrated that the CbAgo-siDNA complex remains bound to the DNA target (N=21) for several minutes (Figure 1H), which prevents CbAgo-siDNA complexes from binding and cleaving new DNA targets. Thus, while CbAgo functions as a multi-turnover nuclease enzyme, its steady-state rate is limited by product release.
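A biphasic time course of this kind is often summarised with a burst equation, in which product forms in a fast exponential phase followed by a linear steady-state phase. The sketch below only illustrates that general model. The function name and all parameter values are hypothetical and were not fitted to the data in Figure 1G.

```python
import math


def burst_product(t, burst_amplitude, k_burst, v_steady):
    """Product formed at time t under a simple burst model.

    P(t) = burst_amplitude * (1 - exp(-k_burst * t)) + v_steady * t
    """
    if t < 0:
        raise ValueError("time must be non-negative")
    return burst_amplitude * (1.0 - math.exp(-k_burst * t)) + v_steady * t


if __name__ == "__main__":
    # Hypothetical parameters: amounts in pmol, time in minutes.
    for minutes in (0, 1, 5, 30, 60):
        product = burst_product(minutes, burst_amplitude=1.0, k_burst=3.0, v_steady=0.02)
        print(f"{minutes:>3} min: {product:.2f} pmol cleaved")
```

In such a model the burst amplitude is commonly interpreted as the amount of active enzyme-guide complex, and the linear slope as the product-release-limited turnover.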
Structure of CbAgo in the cleavage-competent conformation
To investigate the molecular architecture of CbAgo in light of its biochemical activity, we crystallized CbAgoDM in complex with both a 21-nt siDNA and a 19-nt DNA target, and solved the structure of the complex at 3.54 Å resolution (Figure 2 and Table S1). Like other Agos, CbAgo adopts a bilobed conformation in which one of the lobes comprises the N-terminal, linker L1, and PAZ domains, which are linked by linker L2 to the other lobe comprising the MID and PIWI domains. Nucleotides 2-16 of the siDNA form a 15-base-pair A-form-like duplex with the target DNA (Figure 2A). The 5’-terminal nucleotide of the siDNA is anchored in the MID domain pocket, where the 5’-phosphate group of the siDNA makes numerous interactions with MID domain residues and the C-terminal carboxyl group of CbAgo (Supplementary Figure S3). To test whether the interactions with the 5’-phosphate group of the siDNA are important for CbAgo activity, we performed target DNA cleavage assays in which we used siDNAs with a 5’ phosphate or a 5’ hydroxyl group (Supplementary Figure S4). As observed for other pAgos33,34, CbAgo is able to utilize both siDNAs for target DNA cleavage, but it cleaves target DNA much more efficiently when the siDNA contains a 5’-phosphate group. This is in agreement with the siDNA-protein interactions observed in the crystal structure. Furthermore, the backbone phosphates of the siDNA seed segment form hydrogen-bonding and ionic interactions with specific residues in the MID, PIWI and L1 domains (Supplementary Figure S3). At the distal end of the siDNA-target DNA duplex, the N-domain residue His35 caps the duplex by stacking onto the last base pair. After this point, the remaining 3’-terminal nucleotides of the siDNA are disordered, while the target DNA bends away from the duplex and enters the cleft between the N-terminal and PAZ domains.
In agreement with other ternary pAgo complexes18,35,36, the PAZ domain pocket, which normally binds the 3’ end of the guide in a binary Ago-guide complex, is empty.
CbAgo is phylogenetically closely related to TtAgo (Figure 1D). However, CbAgo is 63 amino acids (9.2%) longer than TtAgo (748 amino acids vs. 685 amino acids) and CbAgo and TtAgo share only 23% sequence identity. Superposition of the CbAgo complex structure with the structure of TtAgo bound to a siDNA and DNA target (PDB: 4NCB) (Figure 2C) reveals that the macromolecular architecture and conformation of these TtAgo and CbAgo structures are highly similar (Core root mean square deviation of 3.0 Å over 563 residues), with differences found mostly in the loop regions. This agrees with the fact that loops of thermostable proteins are generally more compact and shorter37,38. In the TtAgo structure, which is thought to represent a catalytically competent state, a ‘glutamate finger’ side chain (Glu512TtAgo) is inserted into the catalytic site completing the catalytic DDED tetrad35. Similarly, the corresponding residue in CbAgo (Glu577) is located within a flexible loop and is positioned near the other catalytic residues (Figure 2D; Asp541, Asp611, and Asp727). All pAgos and eAgos characterized to date cleave the target strand in between nucleotide 10 and 11 of the target strand. In line with the consensus, the catalytic residues of CbAgo perfectly align with the scissile phosphate linking these nucleotides in our structure (Figure 2D). This observation implies that this structure represents the cleavage competent conformation of CbAgo.
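To make the target numbering concrete, the following sketch predicts the two products of a cleaved ssDNA target under the canonical rule stated above (cleavage between target nucleotides 10 and 11, counted from the nucleotide opposite the guide 5’ end). It assumes full guide-target complementarity. The function names and the example sequences are hypothetical and are not the oligonucleotides used in this study.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def predict_cleavage_products(target, guide):
    """Return (5' product, 3' product) of the target strand.

    Assumes the guide is fully complementary to one site in the target and
    that the bond between target nucleotides t10 and t11 is cleaved, where
    t1 is the target nucleotide opposite guide nucleotide 1.
    """
    if len(guide) < 11:
        raise ValueError("guide too short to define the t10/t11 bond")
    target = target.upper()
    start = target.find(reverse_complement(guide))
    if start == -1:
        raise ValueError("guide is not fully complementary to the target")
    # t1 is the last nucleotide of the site on the target strand, so t10 sits
    # at index (start + len(guide) - 10) and t11 directly 5' of it.
    cut = start + len(guide) - 10
    return target[:cut], target[cut:]


if __name__ == "__main__":
    guide = "TGAGGTAGTAGGTTGTATAGT"  # hypothetical 21-nt siDNA
    target = "A" * 12 + reverse_complement(guide) + "C" * 12  # hypothetical 45-nt target
    five_prime, three_prime = predict_cleavage_products(target, guide)
    print(len(five_prime), len(three_prime))
```

The product lengths depend on where the guide site sits within the target, so expected band sizes on a denaturing gel should be recomputed for each guide-target pair.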
Only 15 siDNA-target DNA base pairs are formed in the complex, which suggests that additional siDNA-target DNA binding is not essential for target DNA cleavage. To determine the minimum siDNA length that CbAgo requires for target binding, we performed single-molecule fluorescence assays. First, CbAgo was immobilized on a surface and next it was incubated with 5’-phosphorylated Cy3-labelled siDNAs (Figure 2E). These assays demonstrate that CbAgo can bind siDNAs with a minimal length of 12 nucleotides. Next, we determined the minimum siDNA length for CbAgo-siDNA mediated target DNA cleavage (Figure 2F). In line with the observation that CbAgo adopts a cleavage-competent conformation when only 15 base pairs are formed, CbAgo can cleave target DNAs when programmed with siDNAs as short as 14 nt (forming 13 siDNA-target DNA base pairs) under the tested conditions. This resembles the activity of PfAgo, MjAgo, and MpAgo, which require siDNAs with a minimal length of 15 nt to catalyze target DNA cleavage14,15,34. Only TtAgo has been reported to mediate target DNA cleavage with siDNAs as short as 9 nt12.
CbAgo associates with plasmid-derived siDNAs in vivo
It has previously been demonstrated that certain pAgos co-purify with their guides and/or targets during heterologous expression in Escherichia coli13,16. To determine whether CbAgo also acquires siDNAs during expression, we isolated and analyzed the nucleic acid fraction that co-purified with CbAgo. Denaturing polyacrylamide gel electrophoresis revealed that CbAgo co-purified with small nucleic acids with a length of ~12-19 nucleotides (Figure 3A). These nucleic acids were susceptible to DNase I but not to RNase A treatment, indicating that CbAgo acquires 12-19 nucleotide long siDNAs in vivo, which fits with its observed binding and cleavage activities in vitro (Figure 1 and 2).
We cloned and sequenced the siDNAs that co-purified with CbAgo to determine their exact length and sequence. The majority of the siDNAs had a length of 16 nucleotides and were complementary to the plasmid used for expression of CbAgo (Figure 3B and 3C). Likewise, the siRNAs and siDNAs that co-purify with the Rhodobacter sphaeroides Ago (RsAgo) and TtAgo, respectively, are also mostly complementary to their expression plasmids13,16. As both TtAgo and RsAgo have been demonstrated to interfere with plasmid DNA, this suggests that CbAgo might also play a role in protecting its host against invading DNA. However, no significant reduction of plasmid content could be detected during or upon expression of CbAgo in E. coli (Supplementary Figure S6). We also investigated whether CbAgo co-purified with nucleic acids that were enriched for certain motifs. Sequence analysis revealed that most siDNAs co-purified with CbAgo contain a deoxyadenosine at their 5’ ends (Figure 3D). In addition, we observed an enrichment of thymidine nucleotides in the three positions directly downstream of the siDNA 5’ end (nt 2-4) (Figure 3D).
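The length and motif analysis described above can be sketched as a simple tally over sequenced reads. The snippet below is an illustrative assumption about how such an analysis might be set up, not the pipeline used in this study. The function names are hypothetical and the toy reads are invented.

```python
from collections import Counter


def length_distribution(reads):
    """Count how many reads have each length."""
    return Counter(len(read) for read in reads)


def position_frequencies(reads, n_positions=4):
    """Per-position nucleotide frequencies for the first n_positions.

    Position 1 (index 0) is the guide 5' end. Reads that are too short to
    cover a position are skipped for that position.
    """
    result = []
    for pos in range(n_positions):
        counts = Counter(read[pos].upper() for read in reads if len(read) > pos)
        total = sum(counts.values())
        result.append({base: counts[base] / total for base in "ACGT"} if total else {})
    return result


if __name__ == "__main__":
    # Toy reads invented for illustration; not sequencing data from this study.
    toy_reads = [
        "ATTTGCCGATCGGATC",
        "ATTTCAGGCTAAGCTG",
        "GTTACCGGATTACGAT",
        "ATCTGGCATCGATCGA",
    ]
    print(dict(length_distribution(toy_reads)))
    for i, freqs in enumerate(position_frequencies(toy_reads), start=1):
        print(f"nt {i}: {freqs}")
```

For real data, the per-position frequencies would normally be compared with the base composition of the expression plasmid, so that enrichment is not confused with background sequence bias.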
The sequence of the siDNA affects CbAgo activity
To investigate if the 5’-terminal nucleotide of the siDNA affects the activity of CbAgo, we performed cleavage assays. CbAgo was loaded with siDNA guides with varied nucleotides at position 1 (g1N) and incubated with complementary target DNAs (Figure 4A). Surprisingly, the highest cleavage rates were observed when CbAgo was loaded with siDNAs containing a 5’-T, followed by siDNAs containing a 5’-A. CbAgo loaded with 5’-G or 5’-C siDNAs displayed slightly lower initial cleavage rates. For other pAgos, the g1N preference observed in vivo is also not reflected in the in vitro activities; TtAgo (which preferentially co-purifies with g1C siDNAs) as well as PfAgo and MpAgo (of which the in vivo g1N preferences are unknown) demonstrate no clear preference for a specific g1N during in vitro cleavage reactions13,17,34. Instead, the preference of TtAgo for 5’-C siDNAs is determined by specific recognition of a guanosine nucleotide in the corresponding position (t1) in the target DNA17. Indeed, TtAgo structures and models have revealed base-specific interactions with target strand guanine, while base-specific interactions with the 5’-terminal cytidine in the siDNA are less obvious17. Similarly, we observe no obvious base-specific interactions with the 5’-terminal cytidine in the structure of the CbAgo complex (Supplementary Figure S7). When we investigated potential base-specific interactions with the base at the opposing target strand t1 position, we observed that the t1 thymine base is not placed in the t1 binding pocket as has been observed in TtAgo, RsAgo and hAGO217,39,40. Instead, the thymine base is flipped and stacks on Phe557, which also caps the siDNA-target DNA duplex (Supplementary Figure S7). At present, we are unable to rationalize the preferential co-purification of 5’-adenosine siDNAs with CbAgo.
In order to characterize the seed segment of CbAgo, and to test whether the seed length changes depending on the nature of the guide and the target (i.e. DNA vs. RNA), we performed additional single-molecule binding assays. The length of the seed was determined based on the minimal number of complementary nucleotide pairs between guide and target that were required to achieve a stable binding event. We first tested the sub-seed (nt 2-4), a 3-nt motif involved in initial target recognition in hAgo241,42. When only the sub-seed segment of the siDNA is complementary to the DNA and RNA targets, CbAgo-siDNA complexes bound to the DNA target with an average dwell time 58-fold longer compared to RNA target-binding (Figure 4B). When nt 2-7 of the guide were complementary to the target, the CbAgo-siDNA complex stably bound to both target DNA and RNA beyond our observation time of 300 s. This suggests that CbAgo prefers DNA targets over RNA targets and that the seed segment of the siDNAs bound by CbAgo comprises nucleotides 2-7.
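The guide segments named above (g1, sub-seed nt 2-4, seed nt 2-7) and the extent of guide-target complementarity can be expressed as a short sketch. The helper names and sequences below are hypothetical. The code assumes the target is written 5’ to 3’ with its 3’-terminal nucleotide (t1) positioned opposite guide nucleotide 1.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def guide_segments(guide):
    """Split a guide (5'->3') into the segments named in the text (1-based)."""
    guide = guide.upper()
    return {
        "g1": guide[0],
        "sub_seed": guide[1:4],  # nt 2-4
        "seed": guide[1:7],      # nt 2-7
    }


def paired_from_position_2(guide, target):
    """Count contiguous Watson-Crick pairs starting at guide nt 2.

    RNA sequences are accepted by treating U as T.
    """
    guide = guide.upper().replace("U", "T")
    target = target.upper().replace("U", "T")
    pairs = 0
    for i in range(1, min(len(guide), len(target))):
        # guide nt (i + 1) faces target nt t(i + 1), counted from the target 3' end
        if COMPLEMENT.get(guide[i]) != target[-(i + 1)]:
            break
        pairs += 1
    return pairs


if __name__ == "__main__":
    guide = "ATTTGCAGTC"  # hypothetical guide, shortened for readability
    print(guide_segments(guide))
    print(paired_from_position_2(guide, "CCCCCTAAAT"))  # sub-seed match only
    print(paired_from_position_2(guide, "GACTGCAAAT"))  # fully complementary
```

Counting starts at position 2 because the 5’-terminal guide nucleotide is anchored in the MID domain pocket and does not pair with the target.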
Next, we set out to investigate whether CbAgo displays a preference for siDNAs with a TTT sub-seed (nt 2-4) in vitro, similar to the observed sequence preference for siDNAs that co-purified with CbAgo in vivo. CbAgo was incubated with siDNAs in which the sub-seed was varied and complementary target DNAs were added. In contrast to the 5’-base preference, the TTT sub-seed preference that we observed in vivo is also reflected in vitro: CbAgo displays the highest target cleavage rates when programmed with TTT sub-seed siDNAs (Figure 4C). To confirm these findings, we performed single-molecule assays in which we compared the target binding properties of CbAgo-siDNA complexes containing siDNAs with either a TTT or an AAA sub-seed segment. These assays demonstrate that the dwell time of CbAgo loaded with a TTT sub-seed siDNA on a target was 18-fold longer compared to CbAgo loaded with siDNA containing an AAA sub-seed (Figure 4D). Combined, these data indicate that CbAgo displays a preference for siDNAs containing a TTT sub-seed segment.
A pair of CbAgo-siDNA complexes can cleave double stranded DNA
Thermophilic pAgos have successfully been used to generate double stranded DNA breaks in plasmid DNA13,15. As each pAgo-siDNA complex targets and cleaves a single strand of DNA only, two individual pAgo-siDNA complexes are required for dsDNA cleavage, each targeting a different strand of the target dsDNA. Although all pAgos characterized so far appear to lack the ability to actively unwind or displace a dsDNA duplex substrate, it has been proposed that, at least in vitro, thermophilic pAgos rely on elevated temperatures (>65 °C) to facilitate local melting of the dsDNA targets, allowing each strand of the DNA to be targeted individually. However, CbAgo is derived from a mesophilic organism and we therefore hypothesize that it is able to mediate protection against invading DNA at moderate temperatures (37°C). To test if CbAgo can indeed cleave dsDNA targets at 37°C, we incubated apo-CbAgo and pre-assembled CbAgo-siDNA complexes with a target plasmid. Previous studies showed that the ‘chopping’ activity of siDNA-free apo-TtAgo and apo-MjAgo can result in plasmid linearization or degradation, respectively14,17. We observed that apo-CbAgo converted the plasmid substrate from a supercoiled to an open-circular state, possibly by nicking one of the strands, but did not observe significant linearization or degradation of the plasmid DNA (Figure 5A). When the plasmid was targeted by CbAgo loaded with a single siDNA, we also observed loss of supercoiling (Figure 5A). As this activity was not observed with nuclease-deficient CbAgoDM, we conclude that apo-CbAgo and CbAgo-siDNA complexes generate nicks in dsDNA plasmid targets with their DEDD catalytic site. When using two CbAgo-siDNA complexes, each targeting one strand of the plasmid, we observed that a fraction of the target plasmid DNA becomes linearized (Figure 5A). This implies that CbAgo-siDNA complex-mediated nicking of each of the target plasmid DNA strands resulted in the generation of a double stranded DNA break.
Next, we investigated if the spacing between the two siDNAs affects the ability of CbAgo to cleave the plasmid. The most efficient plasmid linearization was achieved when the siDNAs were orientated exactly or almost opposite to each other (Figure 5A).
Finally, we investigated whether the GC-content of the target DNA plays a role during DNA targeting by CbAgo. For TtAgo, it has been observed that AT-rich DNA is cleaved more efficiently than GC-rich DNA17. To test if such a preference also exists for CbAgo, we designed a target plasmid containing 16 gene fragments of 100 base pairs complementary to sequences from the human genome, with an increasing GC content (Figure 5B). CbAgo-siDNA complexes were only able to generate dsDNA breaks in gene fragments with a GC-content of 31% or lower (Figure 5C). This indicates that, at least in vitro, the GC-content is an important factor that determines target DNA cleavage by CbAgo.
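As an illustration of how candidate target fragments could be screened against this observation, the following minimal Python sketch computes the GC-content of a sequence and compares it with the 31% value reported above. The function names and example sequences are hypothetical and not part of this study, and the cut-off is an empirical in vitro observation for the tested 100-bp fragments rather than a strict rule.

```python
def gc_content(seq: str) -> float:
    """Return the GC-content of a DNA sequence as a percentage."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)


def within_observed_gc_range(seq: str, max_gc: float = 31.0) -> bool:
    """Check whether a fragment falls at or below the GC-content at which
    dsDNA breaks were observed in vitro (31%, taken from the text above).
    This is a rough screen under that assumption, not a cleavage predictor."""
    return gc_content(seq) <= max_gc


if __name__ == "__main__":
    # Hypothetical example fragments (not sequences used in the study).
    for name, fragment in [("AT-rich", "ATTTAATTAGATTTAATTAG"),
                           ("GC-rich", "GCGCCGGATCGCGGCCATGC")]:
        print(name, round(gc_content(fragment), 1),
              within_observed_gc_range(fragment))
```

Such a screen only reflects base composition; it does not account for local sequence context or for the spacing between the two siDNAs.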
Discussion
Several prokaryotic Argonaute proteins have been demonstrated to protect their host against invading nucleic acids, such as plasmid DNA13,15,16. Similar to TtAgo and RsAgo, CbAgo co-purifies with guides which are preferentially acquired from the plasmid used for its heterologous expression in E. coli. In addition, CbAgo mediates programmable DNA-guided DNA cleavage in vitro. This suggests that, similar to the phylogenetically related TtAgo, CbAgo can also interfere with plasmid DNA via DNA-guided DNA interference.
Sequencing of the nucleic acids that co-purified with CbAgo revealed that CbAgo preferentially associates with siDNAs with a 5’-ATTT-3’ sequence at their 5’ end. It was previously shown that the guide RNA utilized by eAgos can be divided into functional segments. These segments are (from 5’ to 3’) the anchor nucleotide (nt 1), the seed (nt 2-8) and sub-seed segments (nt 2-4), and the central (nt 9-12), 3’ supplementary (nt 13-16) and tail (nt 17-21) segments41,43. Extending this knowledge to the siDNAs that co-purified with CbAgo, CbAgo preferentially associates with siDNAs that have a 5’-terminal adenosine anchor (nt 1) and a T-rich sub-seed. In RNAi pathways, the preference for a specific 5’-terminal nucleotide is important for guide RNA loading into a subset of eAgos44–46. Similarly, several pAgos including RsAgo, TtAgo, and now CbAgo also preferentially associate with specific 5’-terminal nucleotides in vivo13,16. However, for both CbAgo and TtAgo, there is no clear preference for siDNAs with that specific 5’-base during cleavage assays in vitro. Rather than having a functional importance, the preference of pAgos for a specific nucleotide at the siDNA 5’ end might be a consequence of siDNA generation and/or loading, as has been demonstrated for TtAgo17. Several studies on human Ago2 have described the importance of the sub-seed segment (nt 2-4) in its RNA guides41,42,47. For hAgo2, a complete match between the guide RNA sub-seed segment and the target RNA triggers a conformational change that first exposes the remainder of the seed (nt 5-8), and eventually the rest of the guide. This facilitates progressive base pairing between the guide RNA and the target48. However, a specific nucleotide preference in the sub-seed segment, as we have observed for CbAgo, has not been described for any other Argonaute protein.
The preference for the T-rich sub-seed is not only observed in the in vivo acquired siDNAs, but also plays a clear role during target binding and cleavage assays in vitro. This may reflect a structural preference for these thymidines in the cleft of the PIWI domain. We have not been able to obtain diffracting crystals of CbAgo in complex with siDNAs that have a 5’-ATTT-3’ sequence at the 5’-end. Future research will thus be necessary to determine the structural basis of the apparent preference for these nucleotides at these positions. We hypothesize that this bias might reflect the mesophilic nature of CbAgo, which might have better access to AT-rich dsDNA fragments, both for siDNA acquisition and for target cleavage.
Several DNA-targeting pAgos have been repurposed for a range of molecular applications, including cloning, recombineering and nucleic acid detection methods22,23,49,50. Additionally, the potential repurposing of pAgos for genome editing applications has previously been discussed27. However, all characterized DNA-cleaving pAgos to date originate from thermophilic prokaryotes and are solely active at elevated temperatures, which limits the potential repurposing of pAgos for applications that require moderate temperatures, such as genome editing. The biochemical characterization of CbAgo reported herein is the first example of a pAgo that catalyzes siDNA-guided dsDNA cleavage at 37°C, indicating that the pool of mesophilic pAgos contains candidates that – in theory – can be utilized for potential applications that require moderate temperatures, such as genome editing. If CbAgo or other mesophilic pAgos could be harnessed for genome editing, they would have certain advantages over the currently well-established genome editing tools CRISPR-Cas9 and CRISPR-Cas12a. While CRISPR-based genome editing tools can be programmed with a guide RNA to target DNA sequences of choice, target DNA cleavage additionally requires the presence of a protospacer adjacent motif (PAM) next to the targeted sequence (5’-NGG-3’ for Cas9 and 5’-TTTV-3’ for Cas12a)51. This limits the possible target sites of Cas9 and Cas12a. In contrast, pAgos do not require a PAM for DNA targeting, which would make them much more versatile tools than CRISPR-associated nucleases. However, PAM binding by Cas9 and Cas12a also promotes unwinding of dsDNA targets52–54, which subsequently facilitates strand displacement by the RNA guide, and eventually R-loop formation. The absence of such a mechanism in pAgos might explain their limited nuclease activity on dsDNA targets.
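The segment nomenclature above can be made concrete with a minimal Python sketch that splits a guide into the named segments. The 1-based boundaries are those listed in this paragraph for eAgo guide RNAs; applying them to siDNAs is an assumption for illustration only (the single-molecule assays above indicate a seed of nt 2-7 for CbAgo). The function names and the example guide are hypothetical.

```python
# 1-based, inclusive segment boundaries as listed for eAgo guide RNAs.
SEGMENTS = {
    "anchor": (1, 1),
    "sub_seed": (2, 4),
    "seed": (2, 8),
    "central": (9, 12),
    "supplementary": (13, 16),
    "tail": (17, 21),
}


def split_guide(guide: str) -> dict:
    """Split a guide (written 5' to 3') into its functional segments.
    Guides shorter than 21 nt simply yield truncated or empty segments."""
    guide = guide.upper()
    return {name: guide[start - 1:end] for name, (start, end) in SEGMENTS.items()}


def has_attt_5prime(guide: str) -> bool:
    """Check for the 5'-ATTT-3' motif (adenosine anchor plus TTT sub-seed)
    that is enriched among siDNAs co-purified with CbAgo."""
    return guide.upper().startswith("ATTT")


if __name__ == "__main__":
    example = "ATTTGCATGCATGCATGCATG"  # hypothetical 21-nt guide
    for name, segment in split_guide(example).items():
        print(f"{name:>13}: {segment}")
    print("5'-ATTT motif:", has_attt_5prime(example))
```

Note that the sub-seed is contained within the seed, so the segments in this mapping overlap by design.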
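To illustrate how a PAM requirement restricts the number of targetable positions, the following minimal Python sketch counts occurrences of a PAM on both strands of a sequence, using the two motifs named above (5’-NGG-3’ and 5’-TTTV-3’). The function names and example sequence are hypothetical. The sketch only counts motif occurrences and ignores all other determinants of Cas9 or Cas12a targeting, such as the position of the protospacer relative to the PAM.

```python
import re

# IUPAC codes needed for the two PAMs discussed in the text.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]", "V": "[ACG]"}


def pam_to_regex(pam: str) -> str:
    """Translate a PAM written with IUPAC codes into a regular expression."""
    return "".join(IUPAC[base] for base in pam.upper())


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def count_pam_sites(seq: str, pam: str) -> int:
    """Count (possibly overlapping) PAM occurrences on both strands."""
    # A lookahead makes overlapping matches count separately.
    pattern = re.compile(f"(?=({pam_to_regex(pam)}))")
    strands = (seq.upper(), reverse_complement(seq))
    return sum(len(pattern.findall(strand)) for strand in strands)


if __name__ == "__main__":
    target = "ATTTAATTAGATTTAATTAGGATTTC"  # hypothetical AT-rich sequence
    print("NGG sites:", count_pam_sites(target, "NGG"))
    print("TTTV sites:", count_pam_sites(target, "TTTV"))
```

A PAM-independent nuclease would, in principle, not be subject to this positional restriction, although target accessibility remains a separate constraint.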
Here, we have demonstrated that CbAgo does not strictly rely on other proteins when targeting AT-rich dsDNA sequences in vitro. As such, this study provides a foundation for future efforts to improve double stranded DNA target accessibility of pAgos and to facilitate the further development of pAgo-based applications at moderate temperatures.
Author contributions
J.W.H. and J.v.d.O. conceived the project and designed the biochemical experiments, which were performed by J.H. and J.K. Single-molecule experiments were designed by S.C., T.J.C. and C.J. and performed by S.C. and T.J.C. X-ray crystallographic analysis was designed and performed by D.C.S. under the supervision of M.J. J.W.H., D.C.S., C.H., M.J., C.J. and J.v.d.O. wrote the manuscript. All authors read and approved the manuscript.
Acknowledgements
We are grateful to Meitian Wang, Vincent Olieric, and Takashi Tomizaki at the Swiss Light Source (Paul Scherrer Institute, Villigen, Switzerland) for assistance with X-ray diffraction measurements. This work was supported by grants from the Netherlands Organization of Scientific Research (NWO; ECHO grant 711013002 and NWO-TOP grant 714.015.001) to J.v.d.O., by a Swiss National Science Foundation (SNSF) Project Grant to M.J. (SNSF 31003A_149393), and by long-term postdoctoral fellowships from the European Molecular Biology Organization (EMBO) to D.C.S. (ALTF 179-2015 and aALTF 509-2017). M.J. is an International Research Scholar of the Howard Hughes Medical Institute and a Vallee Scholar of the Bert L & N Kuggie Vallee Foundation. C.J. was supported by a Vidi grant (864.14.002) of the Netherlands Organization for Scientific Research.
Footnotes
5 Oxford Nanoimaging Ltd, Oxford, United Kingdom