Simple cloning of large natural product biosynthetic gene cluster by CRISPR/Cas12a-mediated fast direct capturing strategy

Direct cloning of biosynthetic gene clusters (BGCs), even from unculturable microbial genomes, has revolutionized natural product-based drug discovery. However, it remains very challenging to clone large BGCs (e.g. > 80 kb) efficiently, especially from high-GC samples such as Streptomyces genomic DNA. In this study, by combining the advantages of CRISPR/Cas12a cleavage and bacterial artificial chromosome (BAC) library construction, we developed a simple, fast and efficient in vitro platform for direct cloning of large BGCs, named CAT-FISHING (CRISPR/Cas12a-mediated fast direct biosynthetic gene cluster cloning). The platform was demonstrated by the efficient direct cloning of large DNA fragments from bacterial artificial chromosomes or high-GC (>70%) Streptomyces genomic DNA. Moreover, surugamides, encoded by a captured 87-kb gene cluster, were expressed and identified in a cluster-free Streptomyces chassis. These results indicate that CAT-FISHING is now poised to revolutionize bioactive small molecule (BSM) drug discovery and lead a renaissance of interest in microorganisms as a source of BSMs for drug development.


SIGNIFICANCE STATEMENT
Natural products (NPs) are one of the most important resources for drug leads. One bottleneck of NP-based drug discovery is the inefficiency of current BGC cloning approaches. To address this, we established a simple, fast and efficient direct BGC cloning method, CAT-FISHING, by combining the advantages of CRISPR/Cas12a (e.g. high specificity) and bacterial artificial chromosome (BAC) libraries (e.g. suitability for large, high-GC DNA fragments). As demonstrations, a series of DNA fragments ranging from 49 kb to 139 kb were successfully cloned. After further optimization, our method was able to efficiently clone and express an 87-kb, GC-rich (76%) surugamides BGC in a Streptomyces chassis with reduced time cost. The CAT-FISHING method presented in this study should greatly facilitate NP discovery.

INTRODUCTION
Microorganisms, especially Streptomyces, remain unrivalled in their ability to produce bioactive small molecules (BSMs), some of which reached the market without any chemical modification, a testimony to the remarkable potential of Streptomyces to produce novel drugs. The exploration of Streptomyces genomes has revealed a large unexploited pool of novel biosynthetic gene clusters (BGCs), responsible for new but silent BSMs (1,2). However, the cloning of BGCs from Streptomyces is often very difficult because of their high GC content and large size. Of the characterized BGCs, 92% (1760/1910) are smaller than 85 kb and 40% (756/1910) have a GC content above 70%. In Streptomyces, 84% (534/634) of BGCs have a GC content over 70% (Supplementary Figure S1). To date, various processes have been developed for BGC cloning, such as genomic library (i.e., genome editing applications (19,20), CRISPR/Cas12a has been widely used in nucleic acid-based diagnostic applications (21-23), small molecule detection (24,25) etc. Moreover, it is worth noting that CRISPR/Cas12a offers clear advantages in DNA assembly owing to its programmable endonuclease activity and the sticky DNA ends with 4- or 5-nt overhangs that it generates (19). Based on these features, Li et al. developed a CRISPR/Cas12a-based DNA assembly standard, namely C-Brick (26).
Subsequently, a DNA assembly method (namely, CCTL) was reported (27). These features suggested that large BGCs could be cloned directly by using CRISPR/Cas12a.
Bacterial artificial chromosome (BAC) library construction is a classical method for cloning large DNA fragments, but it is time consuming and labour intensive, as well as technically demanding (6,28).
However, compared to other PCR-based or recombination-based cloning methods that have recently emerged (

MATERIAL AND METHODS
Strains, plasmids and media. The strains and plasmids used in this work are listed in Supplementary Table S2. Escherichia coli and its derivatives were cultivated on Luria-Bertani (LB) agar plates (tryptone 10 g/L, yeast extract 5 g/L and NaCl 10 g/L, pH = 7.2). Streptomyces and its derivatives were cultivated on soybean flour-mannitol (SFM) agar plates (soybean flour 20 g/L, mannitol 20 g/L and agar 20 g/L, pH = 7.  (6). The transconjugants were screened and verified by PCR using the primers listed in Supplementary Table S5. After fermentation, production of the target natural product was qualitatively analysed using a high-resolution Q-Exactive Hybrid Quadrupole-Orbitrap mass spectrometer (Thermo Scientific, Waltham, MA).

Design and workflow of CAT-FISHING
By combining the DNA cleavage activity of CRISPR/Cas12a with the unique features of BAC library construction, an in vitro platform (designated CAT-FISHING) for large BGC cloning has been developed. The workflow of CAT-FISHING is presented in Figure 1 and includes three steps. The first step is capture plasmid construction and CRISPR/Cas12a-based plasmid digestion. In this step, two homology arms (each containing at least one PAM site) that flank the target BGC were selected and amplified by PCR. The designated capture plasmid, consisting of the BAC plasmid backbone, the two homology arms and a selection marker (e.g., an antibiotic resistance gene, a counter-selection gene or lacZ), was then constructed by DNA assembly. Under the guidance of crRNAs, the two selected PAM motif regions on the left and right homology arms were simultaneously digested by CRISPR/Cas12a, yielding the linear capture plasmid. The second step is genomic DNA isolation and CRISPR/Cas12a-based genome digestion. Genomic DNA plugs from the target strain were prepared according to the BAC library construction protocol, and the genomic DNA was digested by the CRISPR/Cas12a system guided by the same two crRNAs used in step one. The last step is ligation and transformation. The linear capture plasmid and the digested genomic DNA from steps one and two, respectively, were mixed and ligated by T4 DNA ligase. The ligation products were then introduced into E. coli by electroporation, and clones carrying the target BGC were identified among the transformants by PCR-based screening.
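The selection of PAM sites on each homology arm can be illustrated with a short script. This is only a minimal sketch, not the authors' design procedure: it assumes the 5'-TTTV-3' PAM typical of commonly used Cas12a orthologs (the text does not state which ortholog or PAM was used), takes the 18-nt spacer length mentioned in the discussion, and the function names and the example arm sequence are hypothetical.

```python
PAM_LEN = 4       # assumed TTTV PAM (T-T-T followed by A, C or G)
SPACER_LEN = 18   # spacer length reported for the crRNAs in this study


def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T sequence."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))


def find_cas12a_sites(seq, spacer_len=SPACER_LEN):
    """List candidate Cas12a target sites on both strands.

    Each hit is (strand, pam_start, pam, protospacer), where pam_start is the
    0-based index of the PAM in the strand as read 5'->3' (for "-" this is an
    index into the reverse complement). The protospacer is the spacer_len
    bases immediately 3' of the PAM.
    """
    seq = seq.upper()
    sites = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for i in range(len(s) - PAM_LEN - spacer_len + 1):
            pam = s[i:i + PAM_LEN]
            if pam[:3] == "TTT" and pam[3] in "ACG":
                start = i + PAM_LEN
                sites.append((strand, i, pam, s[start:start + spacer_len]))
    return sites


# Hypothetical 30-bp stretch of a homology arm, for illustration only.
arm = "GGCC" + "TTTC" + "ACGTACGTACGTACGTAC" + "GGCC"
for strand, pos, pam, protospacer in find_cas12a_sites(arm):
    print(strand, pos, pam, protospacer)
```

In practice the candidate protospacers would still need to be checked for uniqueness against the whole genome before a crRNA pair is chosen, which this sketch does not attempt.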

Evaluation of CAT-FISHING cloning efficiency
bioRxiv preprint, this version posted June 25, 2020; https://doi.org/10.1101/2020.06.25.170191. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
The principle underlying CAT-FISHING is the cohesive-end ligation of two linearized DNA fragments by T4 DNA ligase. However, unlike the widely used restriction endonuclease-based DNA cloning methods, here the cohesive ends were generated by paired crRNA-guided CRISPR/Cas12a cleavage. This study therefore evaluated and compared the cloning efficiencies achieved with the two kinds of cohesive ends, generated by NEB restriction endonucleases and by CRISPR/Cas12a, respectively. As shown in Supplementary Figure S2, plasmid pGY2020 was constructed from the pCC2-FOS Fosmid vector (Epicentre); it contains two PAM sites (PAM1 and PAM2) as well as two restriction endonuclease sites (EcoRI and HindIII). A specific DNA fragment (the ampicillin resistance gene AmpR) was cloned into pGY2020 by either the CRISPR/Cas12a-based or the NEB restriction enzyme-based method. The number of clones obtained with the CRISPR/Cas12a-based method was 34% lower (P > 0.05) than that obtained with the NEB restriction enzyme-based method. There was also no significant difference in the true positive rate between these two cloning methods (

Cloning of a target DNA fragment from a BAC plasmid
To further demonstrate CAT-FISHING in a simplified system, a 137-kb BAC plasmid, pBAC-ZL, was used to evaluate cloning performance on large DNA fragments. As shown in Figure 2C, a 50-kb fragment and an 80-kb fragment could be obtained by cleavage with CRISPR/Cas12a guided by the corresponding crRNAs. Under the guidance of the corresponding crRNA pairs, the BAC plasmid was digested by CRISPR/Cas12a, and 50-kb and 80-kb target bands were observed on the agarose gel after PFGE (Supplementary Figure S5). By using the corresponding capture plasmids, the two target DNA fragments were also successfully cloned from the BAC plasmid, as shown in Figure  were the right clones. For the 80-kb DNA fragment, the number of transformants and the true positive rate were both lower, and about 50% of the transformants were the correct clones. These results indicated that CAT-FISHING could achieve high cloning rates for the 50-kb and 80-kb DNA fragments from the BAC plasmid. It should also be noted that the 80-kb DNA fragment was clearly more difficult to clone.
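The band sizes expected on the PFGE gel follow from simple arithmetic on a circular molecule cut at two sites. The sketch below is illustrative only: the 137-kb plasmid size comes from the text, but the cut coordinates are hypothetical placeholders, since the actual crRNA target positions on pBAC-ZL are not given here.

```python
def double_cut_fragments(plasmid_len, cut_a, cut_b):
    """Return the two fragment lengths (bp) produced by cutting a circular
    molecule of plasmid_len bp at 0-based positions cut_a and cut_b."""
    for cut in (cut_a, cut_b):
        if not 0 <= cut < plasmid_len:
            raise ValueError("cut position lies outside the plasmid")
    if cut_a == cut_b:
        raise ValueError("the two cut positions must differ")
    inner = abs(cut_b - cut_a)          # fragment between the two cuts
    return inner, plasmid_len - inner   # and the remainder of the circle


# Hypothetical cut coordinates chosen to give the 50-kb and 80-kb targets.
print(double_cut_fragments(137_000, 10_000, 60_000))   # 50-kb target + rest
print(double_cut_fragments(137_000, 20_000, 100_000))  # 80-kb target + rest
```

The few bases of single-stranded overhang left by Cas12a are ignored here, as they are far below the resolution of PFGE.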

Direct cloning of target BGCs from Streptomyces genomic DNA
To directly clone large BGCs from genomic DNA, a very simple procedure for fast cloning was developed, as shown in Figure 3A. After genomic DNA isolation and subsequent CRISPR/Cas12a digestion, the resulting genomic DNA mixture could be used directly for ligation and transformation, without prior isolation of DNA fragments by PFGE and purification. In this study, a 49-kb paulomycin gene cluster (33), an 87-kb surugamides gene cluster (34) and a 139-kb candicidin gene cluster (35) were selected from the chromosome of S. albus J1074 to demonstrate this method. As shown in Table 2, the 49-kb paulomycin gene cluster (GC content 71%) and the 87-kb surugamides gene cluster (GC content 76%) were successfully captured by CAT-FISHING, as confirmed by PCR and restriction mapping (Figure 3B-3C). Additionally, the 139-kb candicidin gene cluster (GC content 75%) was also captured with CAT-FISHING, albeit with a much lower efficiency. These results indicate that CAT-FISHING is a simple and fast method for directly cloning large BGCs from high-GC genomic DNA samples (Supplementary Table S1).
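For readers who want to check the GC content of a candidate cluster before attempting capture, the calculation is straightforward. The following is a minimal sketch with a hypothetical helper name; it counts only unambiguous A/C/G/T bases, which is one common convention and an assumption here, not necessarily how the percentages quoted above were computed.

```python
def gc_percent(seq):
    """Return GC content (%) of a DNA sequence, ignoring ambiguous bases."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


# Toy sequence for illustration; a real BGC would be read from a FASTA or
# GenBank record of the annotated cluster.
print(gc_percent("GGCCGGCA"))
```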

Expression of the target BGC in a cluster-free Streptomyces chassis strain
To thoroughly check the sequence and functional integrity of the BGCs captured by CAT-FISHING, as well as to demonstrate that this route provides access to genome mining, a captured 87-kb surugamides gene cluster was expressed in a cluster-free Streptomyces chassis strain. The aac (3)
In this study, crRNA with an 18-nt spacer was used to evaluate the cloning efficiency of CAT-FISHING. As shown in Figure 2A-2B, the number of transformants obtained by CAT-FISHING was lower than that of the control, although the difference was not significant (P > 0.05). On the other hand, non-specific cleavage by CRISPR/Cas12a could be minimized by decreasing the Cas12a concentration and shortening the cleavage time. As a result, when a purified 137-kb BAC plasmid was used to demonstrate CAT-FISHING, the plasmid was almost completely digested and no non-specific cleavage products appeared on the agarose gel (Figure 2C and Supplementary Figure S5). To some extent, these results support the relatively high efficiency of cloning 50-kb and 80-kb DNA fragments from the BAC plasmid (Figure 2D-2E). In addition, owing to its high efficiency, CAT-FISHING could also be used as an efficient in vitro system for editing large DNA fragments.
PFGE is a powerful and essential tool for isolating large DNA fragments. During BAC library construction, however, PFGE is time consuming, as it often takes 16-24 hours to separate DNA fragments of a specific size (8). Moreover, the subsequent steps, such as DNA elution and purification, can drastically decrease DNA integrity and yield as well as the efficiency of subsequent ligation or transformation. In this study, after many preliminary tests, we found that, following agarose digestion, the mixture resulting from a CRISPR/Cas12a-treated genomic DNA plug could be used directly for ligation with the vector and subsequent electro-transformation (Figure 3A). Because neither PFGE nor the preparation of purified high-molecular-weight genomic DNA fragments is needed, the cloning process in CAT-FISHING is greatly simplified compared with previously reported approaches for cloning large DNA fragments (e.g. ExoCET, CATCH, TAR etc.) (3,5,10,18). By applying CAT-FISHING, we found that the proportion of correct clones containing the 49-kb target BGC was 4-5%, and that for the 87-kb target BGC was 2-4% (Figure 3B-3C and Table 2). Probably owing to the complexity of the unpurified DNA mixture and the high activity of T4 DNA ligase, many short DNA fragments or incomplete pieces of target BGCs were inserted into the capture plasmids. It is therefore reasonable to predict that, if necessary, the cloning performance for a 139-kb BGC could be substantially improved through DNA fragment isolation and purification. Nevertheless, compared with a BAC library, in which often only a few correct clones can be screened out of thousands (i.e., 1/1000 to 1/2000) (6,37), CAT-FISHING is a simpler method with a greater efficiency for cloning target BGCs.
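The practical meaning of these positive rates can be made concrete by estimating how many colonies must be screened to find at least one correct clone. The sketch below is a back-of-the-envelope calculation, not part of the reported method: it assumes each colony is independently correct with probability p, so that the chance of at least one hit among n colonies is 1 - (1 - p)^n, and the 95% confidence target and function name are arbitrary choices for illustration.

```python
import math


def colonies_to_screen(positive_rate, confidence=0.95):
    """Smallest n with P(at least one correct clone among n) >= confidence,
    assuming independent colonies that are each correct with positive_rate."""
    if not 0 < positive_rate < 1:
        raise ValueError("positive_rate must be between 0 and 1")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be between 0 and 1")
    return math.ceil(math.log(1 - confidence) / math.log(1 - positive_rate))


# Lower bounds of the reported ranges, plus a random BAC library at 1/1000.
for label, rate in [("49-kb BGC", 0.04), ("87-kb BGC", 0.02),
                    ("BAC library", 1 / 1000)]:
    print(label, colonies_to_screen(rate))
```

Under these assumptions, a few dozen to roughly 150 colonies suffice at a 2-4% positive rate, whereas a random library at 1/1000 requires on the order of 3000.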
Streptomyces are the source of a majority of antibiotic classes in current clinical and agricultural use (38-41). S. albus J1074 is one of the most widely used Streptomyces chassis for genome mining (42).
In this study, S. albus Del14, an S. albus J1074-derived cluster-free chassis strain (29), was used to demonstrate the sequence and functional integrity of the BGCs obtained by CAT-FISHING. As shown in Figure 4, the 87-kb target BGC was successfully expressed, and the corresponding NPs, surugamides (inhibitors of cathepsin B), were confirmed by LC-MS. In genome mining, BGC cloning and expression is the essential starting point for the subsequent bioactivity analysis and structure identification of target BSMs. The current results present a case study for NP production in Streptomyces by applying CAT-FISHING. Lastly, as an in vitro manipulation platform that is not limited to actinomycetes, CAT-FISHING could easily be extended to genome mining in fungi and other microbial resources (43). Based on the results described above, we conclude that, by combining the advantages of CRISPR/Cas12a cleavage and BAC library construction, CAT-FISHING offers a simple, fast and efficient direct cloning strategy for targeting large BGCs from high-GC genomic DNA. In addition to genome editing, DNA assembly, and nucleic acid and small molecule detection, this study expands the application of CRISPR/Cas12a to the direct cloning of large DNA fragments in vitro.
This innovation of a fundamental platform technology for genome mining, through application of the CRISPR/Cas12a system, should facilitate the discovery of novel BSMs from microbial sources.

ACKNOWLEDGEMENT