Comparative analyses of gene expression in common marmoset and human pluripotent stem cells (PSCs) identify factors enhancing homologous recombination efficiency in the HPRT locus of human PSCs

A previous study assessing the efficiency of the genome editing technology CRISPR-Cas9 for knock-in gene targeting in common marmoset (marmoset; Callithrix jacchus) embryonic stem cells (ESCs) unexpectedly identified innately enhanced homologous recombination (HR) activity in marmoset ESCs (cmESCs). Here, we compared gene expression in marmoset and human pluripotent stem cells (PSCs) using transcriptomic and quantitative PCR (qPCR) analyses and found that five HR-related genes (BRCA1, BRCA2, RAD51C, RAD51D and RAD51) were upregulated in marmoset cells. Four of these upregulated genes enhanced HR efficiency with CRISPR-Cas9 in human pluripotent stem cells. Thus, the present study provides a novel insight into species-specific mechanisms for the choice of DNA repair pathways.


Introduction
Repair of DNA double-strand breaks (DSBs) is indispensable for maintaining genomic stability (Schipler and Iliakis, 2013). From prokaryotes to eukaryotes, conserved repair pathways for DSBs have been identified: nonhomologous end-joining (NHEJ); homology-directed repair (HDR), including homologous recombination (HR); and microhomology-mediated end-joining (MMEJ), also known as alternative nonhomologous end-joining. The choice of repair pathway depends on a variety of factors, including DSB complexity, cell type, cell cycle, and species (Buerstedde and Takeda, 1991; Schipler and Iliakis, 2013; Scully et al., 2019; Sung and Klein, 2006).
The CRISPR-Cas9 genome editing technology (Jinek et al., 2012) relies on endogenous pathways for repairing DSBs in cells (Cong et al., 2013). Although the combination of CRISPR-Cas9 and a double-stranded DNA (dsDNA) donor enables precisely targeted integration or deletion of long sequences by HDR, the efficiency of this process is limited by competing DSB repair pathways, mainly NHEJ. Previous studies have demonstrated that the efficiency of Cas9-mediated HDR can be modified through the use of small molecules that stimulate the HDR-related factor RAD51 (Pinder et al., 2015; Song et al., 2016) or that inhibit the NHEJ-related factor LIG4 (Chu et al., 2015; Maruyama et al., 2015). Additionally, modification of Cas9 by fusion with DSB repair-related factors such as RAD52, CtIP, MRE11A (Charpentier et al., 2018; Shao et al., 2017; Tran et al., 2019), or a dominant negative mutant of P53BP1 (Jayavaradhan et al., 2019) can improve HDR efficiency. Overexpression of several DSB-related genes, including RAD52 (Shao et al., 2017) and RAD18 (Nambiar et al., 2019), has also been reported to enhance Cas9-mediated HDR efficiency.
In a previous study (Yoshimatsu et al., 2019a), we showed that common marmoset (marmoset; Callithrix jacchus) ESCs (cmESCs) have an unusual capacity for DSB repair, characterized by a high HR ratio. In knock-in (KI) experiments using a PLP1-EGFP vector targeting the 1st exon of the PLP1 gene, we found rates of homologous recombinants of 90% with the use of CRISPR-Cas9 and 80% without its use (Yoshimatsu et al., 2019a). Moreover, we observed high HR ratios in a variety of other loci, including ACTB, PLP1 (targeting the 2nd, 5th, and 6th exons), FOXP2, PRDM1, DPPA3 (Yoshimatsu et al., 2019a; Yoshimatsu et al., 2020; Yoshimatsu et al., 2019b), and NANOS3 (Figure S1). On the basis of these results, we have now explored possible mechanisms for the HR-biased DSB repair in cmESCs. Through interspecies analyses of gene expression, we have identified factors whose ectopic overexpression enhances the HR ratio with CRISPR-Cas9 in human induced pluripotent stem cells (hiPSCs).

Ethical statement
Recombinant DNA experiments were approved by the Recombinant DNA Experiment Safety Committee of Keio University (approval number: 27-023 and 27-034).
For passaging, cells were pre-treated with 10 μM Rho-associated coiled-coil-containing protein kinase inhibitor Y-27632 (Wako) in ESM at 37°C for an hour. The cells were then incubated in CTK solution (Reprocell) at 37°C for 30 sec, mechanically separated from feeder cells, and dissociated by gentle pipetting. The isolated cells were plated onto new feeder cells in ESM supplemented with 10 μM Y-27632. Twenty-four hours later, Y-27632 was removed from the medium. Medium change was performed daily. Prior to the seeding of hESCs/iPSCs, feeder cells were seeded onto a gelatin-coated 100 mm cell culture dish in Dulbecco's modified Eagle's medium (Thermo Fisher Scientific) supplemented with 10% inactivated fetal bovine serum (Thermo Fisher Scientific).

For transfection (Day 0), we used the NEPA21 Super Electroporator (Nepagene) as described previously (Yoshimatsu et al., 2019b). Ten μg of HPRT-TV and 5 μg each of the Cas9/gRNA and overexpression vectors were diluted in 100 μl of OPTI-MEM (Thermo Fisher Scientific). The etKA4 hiPSCs (>1 × 10^7 cells) were suspended in the solution and subjected to electroporation. Transfected cells were plated onto new mitomycin-C-treated G418-resistant SNL76/7 feeder cells on a 100-mm cell culture dish in ESM supplemented with 10 μM Y-27632. On Day 2, selection was initiated by adding 100 ng/ml G418 (an analog of neomycin; Sigma) to the ESM; selection was performed for six days. On Day 8, the concentration of G418 was doubled, and selection was continued for an additional three days. On Day 11, the number of G418-resistant colonies was counted. The cells were then subjected to further selection in 10 μM 6-thioguanine (6TG) for five days. On Day 16, the number of 6TG-resistant colonies was counted. Experimental data were included only when at least 80 G418-resistant colonies survived.
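The colony counting described above can be turned into an HR ratio with a few lines of code. The following is a minimal sketch, assuming that the HR ratio is the number of 6TG-resistant colonies (Day 16) divided by the number of G418-resistant colonies (Day 11); the class name, function name, and colony counts are hypothetical and serve only as an illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Inclusion threshold stated in the text: at least 80 G418-resistant colonies.
MIN_G418_COLONIES = 80


@dataclass
class SelectionCounts:
    g418_resistant: int  # colonies counted on Day 11
    tg_resistant: int    # colonies counted on Day 16, after 6TG selection


def hr_ratio(counts: SelectionCounts) -> Optional[float]:
    """Return 6TG-resistant / G418-resistant colonies.

    Returns None when the replicate fails the inclusion criterion.
    The ratio definition itself is an assumption of this sketch.
    """
    if counts.g418_resistant < MIN_G418_COLONIES:
        return None
    return counts.tg_resistant / counts.g418_resistant


# Hypothetical colony counts, for illustration only.
replicates = [
    SelectionCounts(g418_resistant=120, tg_resistant=42),
    SelectionCounts(g418_resistant=60, tg_resistant=30),   # excluded (< 80)
    SelectionCounts(g418_resistant=200, tg_resistant=64),
]
ratios = [r for r in map(hr_ratio, replicates) if r is not None]
```

Replicates that fail the inclusion criterion are dropped before any mean or standard deviation is computed, so a dish with too few surviving colonies cannot distort the reported ratio.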

Genotyping
Southern blotting was performed as described previously (Yoshimatsu et al., 2019a) using the digoxigenin (DIG) probe system. Genomic DNA samples were digested with BglII and EcoRV by overnight incubation at 37°C. We used the PCR DIG Probe Synthesis Kit (Roche) for DIG-labeled probe production. The human HPRT-specific probe (492 bp) was amplified from human genomic DNA using the primers TGCATATCTGGGATGAACTCTGG and AAATGGGACATTTGTGTGTCACC. Molecular sizes were confirmed using DIG-labeled DNA Molecular Weight Marker II (λDNA digested with HindIII) (Sigma; #11218590910).

Quantitative reverse-transcription PCR (qPCR)
RNA extraction, reverse transcription, and PCR were performed as described previously (Yoshimatsu et al., 2019a). Three biological and technical repetitions of the qPCR analyses were performed. Quantification was performed using the relative standard curve method, and endogenous expression of glyceraldehyde 3-phosphate dehydrogenase (GAPDH) was used as an internal control. Primers, either newly designed or taken from previous work, amplified both human and marmoset cDNA sequences but not murine cDNA sequences, which avoided contamination from the MEFs used for PSC culture (Yaglom et al., 2014; Yoshimatsu et al., 2019a). The following primers were used: GAPDH-forward (TGGGCTCTCCTGATGCCTGTA) and BRCA2-reverse (GTATACCAGCGAGCAGGCCG).
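The relative standard curve method with GAPDH normalization can be sketched as follows. This is an illustrative outline, not the authors' pipeline: it assumes that each gene has a dilution series whose threshold cycles (Ct) are fitted by a straight line against log10 of the relative input quantity, and that a sample's target quantity is divided by its GAPDH quantity. All function names and numbers are hypothetical.

```python
def fit_standard_curve(log10_quantities, ct_values):
    """Least-squares fit of Ct = slope * log10(quantity) + intercept."""
    n = len(ct_values)
    mean_x = sum(log10_quantities) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_quantities)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_quantities, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantity_from_ct(ct, slope, intercept):
    """Invert the standard curve to get a relative quantity from a Ct value."""
    return 10 ** ((ct - intercept) / slope)


def normalized_expression(target_ct, gapdh_ct, target_curve, gapdh_curve):
    """Target quantity divided by GAPDH quantity for the same sample."""
    target_qty = quantity_from_ct(target_ct, *target_curve)
    gapdh_qty = quantity_from_ct(gapdh_ct, *gapdh_curve)
    return target_qty / gapdh_qty


# Hypothetical ten-fold dilution series (relative quantities 1, 0.1, 0.01, 0.001).
dilutions = [0.0, -1.0, -2.0, -3.0]
target_curve = fit_standard_curve(dilutions, [20.0, 23.3, 26.6, 29.9])
gapdh_curve = fit_standard_curve(dilutions, [15.0, 18.3, 21.6, 24.9])

relative_level = normalized_expression(
    target_ct=23.3, gapdh_ct=15.0,
    target_curve=target_curve, gapdh_curve=gapdh_curve,
)
```

Because every sample is read off a gene-specific curve, this approach does not assume identical amplification efficiency for the target gene and GAPDH, which is relevant when the same primer pair is used on cDNA from two species.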

RNA-seq analysis
Transcriptome data were obtained from cmESCs as described previously (GSE138944). We used the deposited RNA-seq data from human ESCs/iPSCs (GSE53096) (Ma et al., 2014) as a reference. Marmoset mRNA was sequenced on an Illumina HiSeq2500 and the obtained nucleotide sequences were mapped against the Callithrix jacchus genome (Callithrix_jacchus_cj1700_1.1; https://www.ncbi.nlm.nih.gov/assembly/GCF_009663435.1/) by STAR (ver.2.5.3a). The number of mapped reads was counted by featureCounts (1.5.2) and normalized by the TMM method in the edgeR package in R (Robinson et al., 2010). The normalized expression levels were log2-transformed, z-scored, and visualized using the pheatmap library. In the statistical analysis, Welch's t-test was performed between normalized gene expression levels of human samples and marmoset samples, and the resulting p values were processed with Bonferroni correction to obtain adjusted p values. In the present study, adjusted p values less than 0.05 were defined as significant; -log10-transformed adjusted p values were plotted on the vertical axis using the ggplot2 library in R.
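The transformation and correction steps described above can be illustrated with a short sketch. It is written in Python rather than R and is not the authors' script; the pseudocount added before the log2 transform and the use of the sample standard deviation for z-scoring are assumptions, and all input values are hypothetical.

```python
import math
from statistics import mean, stdev


def log2_zscore(values, pseudocount=1.0):
    """log2-transform one gene's normalized values, then z-score across samples.

    The pseudocount is an assumption of this sketch (it avoids log2(0)).
    """
    logged = [math.log2(v + pseudocount) for v in values]
    mu = mean(logged)
    sd = stdev(logged)  # sample standard deviation (n - 1)
    if sd == 0:
        return [0.0 for _ in logged]
    return [(x - mu) / sd for x in logged]


def bonferroni(p_values):
    """Bonferroni correction: multiply each p value by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def neg_log10(p):
    """Value plotted on the vertical axis of a volcano-style plot."""
    return -math.log10(p)


# Hypothetical per-gene p values from Welch's t-test.
raw_p = [0.01, 0.04, 0.5]
adjusted = bonferroni(raw_p)
significant = [p < 0.05 for p in adjusted]
```

Bonferroni correction is conservative: with three tests, a raw p value of 0.04 is no longer significant after adjustment, whereas 0.01 still is.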

Western blotting
Western blotting was performed using Wes (Automated Western Blots with Simple Western; ProteinSimple) according to the manufacturer's instructions. As primary antibodies, we used polyclonal Rad51 H-92 antibody (1:50 dilution; sc-8349; Santa Cruz) and monoclonal α-tubulin antibody (1:25000 dilution; T9026; Sigma), which was used as an internal control for RAD51 protein quantification. ImageJ software was used to quantify the intensities of the Rad51 (37 kDa) and α-tubulin (50 kDa) bands, and RAD51 expression was then normalized against α-tubulin expression.

Statistical analysis
All data in this study are expressed as means ± S.D. Statistically significant differences were determined using Welch's t-test; p values < 0.05 (designated by *) and p values < 0.01 (designated by **) were interpreted as statistically significant.
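For reference, Welch's t statistic and its Welch-Satterthwaite degrees of freedom can be computed as in the following sketch, which uses only the Python standard library. Converting the statistic to a p value requires the cumulative distribution of Student's t, which is not included here; a statistics package (for example `scipy.stats.ttest_ind` with `equal_var=False`) would normally be used for that step. The function names and sample values are hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom).

    Unlike Student's t-test, the two groups are not assumed to share a variance.
    """
    va = variance(a) / len(a)  # squared standard error of group a
    vb = variance(b) / len(b)  # squared standard error of group b
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def significance_label(p):
    """Map a p value to the asterisk notation used in this study."""
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return ""


# Hypothetical measurements for two groups.
t_stat, dof = welch_t([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

The degrees of freedom are generally not an integer, which is expected for Welch's test.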
Initially, we investigated HR- and NHEJ-related gene expression in human and marmoset PSCs. In hPSCs, several studies have shown that the HR ratio is less than 50%, generally around 30%, even with the use of site-specific nucleases such as ZFN, TALEN, and CRISPR-Cas9 (Hockemeyer et al., 2009; Hockemeyer et al., 2011; Sebastiano et al., 2011; Takayama et al., 2017). We used RNA-seq data of hESCs/iPSCs deposited in databases (Ma et al., 2014) to compare gene expression with cmESCs as described previously. We merged and normalized PSC RNA-seq data derived from both species.
HR- and NHEJ-related genes that can be used for comparing fold changes between hPSCs and cmESCs are listed under hsa03440 and ko03450 in KEGG (Figure 1D). In light of the high HR activity in cmESCs, we then focused on HR-related genes that showed increased expression in cmESCs compared to hPSCs. To validate the results of the transcriptome analysis, we performed an interspecies qPCR analysis using primer sets specifically designed for these human and marmoset genes. We also designed primers specific for human and marmoset GAPDH for normalization. Preliminary screens using the designed primers showed that an interspecies comparison of expression was not feasible for several genes (BABAM1 and POLD1/2) owing to a lack of accuracy in the post-qPCR melt curve analysis (data not shown).

Enhancement of the HR/RI ratio with CRISPR-Cas9 in hiPSCs by overexpression of the defined factors
To explore the effect of high expression of the five HR-related genes (RAD51D, RAD51C, BRCA2, RAD51, and BRCA1) in cmESCs, we overexpressed these genes in a gene targeting experiment at the hypoxanthine-guanine phosphoribosyltransferase (HPRT) locus in the male hiPSC line etKA4. The HPRT protein functions in the purine salvage pathway, synthesizing inosine monophosphate and guanosine monophosphate from hypoxanthine and guanine, respectively (Torres and Puig, 2007). HPRT deficiency results in the loss of susceptibility to 6TG, a toxic analog of guanine (Sharp et al., 1973; Wahl et al., 1975). We selected the HPRT targeting system as it has been frequently used to assess the HR ratio in mammalian male PSCs (Meek et al., 2010; Thomas and Capecchi, 1987; Zwaka and Thomson, 2003).
We constructed a knock-in/knock-out system for the human HPRT gene (Materials and Methods). Following KI, the 2nd exon of HPRT was completely replaced with a PGK-Neo cassette, which resulted in the loss of functional mRNA expression from the HPRT Neo allele (Figure 3A). Initial G418 selection (both homologous recombinants and non-recombinants survived) and subsequent 6TG selection (only homologous recombinants survived) enabled robust quantification of the HR ratio without the need to genotype individual clones (Figure 3B). In addition, we constructed a PX459-based Cas9/gRNA vector containing an sgRNA sequence targeting the 2nd intron of HPRT, which did not recognize HPRT-TV.
Because genomic cleavage of the HPRT 2nd intron after transfection of the Cas9/gRNA vector, followed by NHEJ- or MMEJ-mediated introduction of a small or large deletion, could produce an undesired knock-out allele, we initially tested transfection of only the Cas9/gRNA vector into the etKA4 hiPSCs. In this initial test, no 6TG-resistant colonies were obtained from 1 × 10^7 transfected cells (n = 3), showing that NHEJ- or MMEJ-mediated deletion in the intronic region has a negligible effect on the assessment of the HR ratio in the hiPSCs.
After co-transfection of the Cas9/gRNA vector and HPRT-TV, and serial G418 and 6TG selection, we confirmed by Southern blotting that all analyzed G418- and 6TG-resistant (NeoR+6TGR) clones were hemizygous HPRT Neo recombinants (Figure 3C).
Next, we quantified HR ratios with the use of CRISPR-Cas9. When the mock vector was transfected with HPRT-TV and the Cas9/gRNA vector, an HR ratio of approximately 30% was achieved (Figure 3D; 0.336 ± 0.026, n = 4); this HR ratio is comparable with previous results using hPSCs (Takayama et al., 2017). Overexpression of only RAD51 did not result in a significant enhancement of the HR ratio (0.256 ± 0.163, n = 3).

Discussion
In this study, through comparative analyses of gene expression in human and marmoset PSCs, we have identified four genes (RAD51C, RAD51D, BRCA1, and BRCA2) whose single or combined overexpression increased HR ratios in hiPSCs. Intriguingly, we also observed that overexpression of RAD51 did not enhance the HR ratio in hiPSCs; it is also possible that RAD51 overexpression cancelled the enhancement of the HR ratio by the other four genes. In the Supplementary Discussion, we further discuss how these genes are involved in the HR machinery, and how the effect of RAD51 overexpression differs from that reported in some previous studies. The results presented here also suggest that the converse effect is possible. We demonstrated that overexpression of four factors (RAD51C, RAD51D, BRCA1, and BRCA2), which were highly expressed in cmESCs, contributed to the enhancement of HR ratios in hPSCs. Thus, overexpression of several NHEJ factors that were expressed at low levels in cmESCs (including FEN1, DCLRE1C (Artemis), NHEJ1, and XRCC4) may contribute to decreased HR ratios in hPSCs. Further analyses are required to evaluate the robustness of the effects of the four HR factors, for example on KI at other loci and in different cell lines and species; nevertheless, our investigation indicates that overexpression of these factors may improve the HR ratio with CRISPR-Cas9 in hiPSCs.