Abstract
Small regulatory RNAs (sRNAs) are involved in antiviral defense and gene regulation. Although roles of RNA-dependent RNA Polymerases (RdRPs) in sRNA biology are extensively studied in nematodes, plants and fungi, understanding of RdRP homologs in other animals is still lacking. Here, we study sRNAs in the ISE6 cell line, which is derived from the black-legged tick, an important vector of human and animal pathogens. We find abundant classes of ∼22nt sRNAs that require specific combinations of RdRPs and sRNA effector proteins (Argonautes or AGOs). RdRP-dependent sRNAs possess 5’- monophosphates and are mainly derived from RNA polymerase III-transcribed genes and repetitive elements. Knockdown of RdRPs misregulates genes including RNAi-related genes and the regulator of immune response Dsor1. Sensor assays demonstrate that Dsor1 is downregulated by RdRP through the 3’UTR that contains a target site of RdRP-dependent repeat-derived sRNAs. Consistent with viral gene repression by the RNAi mechanism using virus-derived small interfering RNAs, viral transcripts are upregulated by AGO knockdown. On the other hand, RdRP knockdown unexpectedly results in downregulation of viral transcripts. This effect is dependent on Dsor1, suggesting that antiviral immunity is enhanced by RdRP knockdown through Dsor1 upregulation. We propose that tick sRNA pathways control multiple aspects of immune response via RNAi and regulation of signaling pathways.
Author summary RNA-dependent RNA Polymerases (RdRPs) are essential for biogenesis of small regulatory RNAs (sRNAs) in many organisms such as plants and fungi, but its general importance in animals besides nematodes remains controversial because experimental evidence is lacking. By using a tick cell line, we demonstrate that RdRP-dependent sRNAs are abundantly expressed and tick RdRPs regulate gene expression. These results indicate that ticks have unexpectedly complex sRNA biogenesis pathways that are essential for proper gene regulation. Interestingly, we show that RdRP negatively regulates immune response by changing expression of a gene that is essential in an immunity-related signaling pathway. Because ticks are important vectors of human and animal pathogens, studying the novel tick sRNA pathways and their effects on immune signaling may lead to a better understanding of vector-virus interactions.
Introduction
Foreign nucleic acids such as transposable elements (TEs), phages and viruses pose constant threats to host cells. To inactivate invading agents, cells are equipped with defense mechanisms that use short fragments of nucleic acids to distinguish those foreign nucleic acids from their own genetic materials and silence them [1].
In prokaryotes, a common defense mechanism involves CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) that are arrays of short sequences derived from phages and produce short RNAs against the foreign nucleic acids [2]. Each of the short phage sequences produces a guide RNA that binds to the effector protein typically to degrade foreign DNA or RNA in a sequence-specific manner. Another major class of defense systems using small RNAs (sRNAs) involves another family of sRNA binding proteins known as Argonautes (AGOs). The AGO system is widely conserved in various organisms and can be found in both eukaryotes and prokaryotes [3].
The large diversity of sRNA pathways is assumed to be a result of a relentless arms race between the host cell and the invading nucleic acids. In bilaterian animals, three distinct AGO-dependent sRNA pathways, namely the PIWI-interacting RNA (piRNA), small interfering RNA (siRNA) and microRNA (miRNA) pathways, are widely present [4]. These pathways were initially characterized in a few model animals including Drosophila, C. elegans and mice [5]. Expression of the piRNA pathway components is virtually restricted within gonads in these organisms and they appear to be specialized in silencing transposons while also playing roles in gene regulation [6, 7]. However, recent studies argue against the notion that active piRNA production is generally confined in gonads as abundant piRNAs have been detected in somatic tissues in many arthropods, even in some insects [8–12].
The siRNA pathway is believed to be a major mechanism to control viruses in insects [13]. Mutants of the core siRNA factors exhibit elevated sensitivity to various viruses [14] and siRNAs against the virus are efficiently produced in infected cells via a specific processing mechanism [15, 16]. Interestingly, the siRNA factors are among the most rapidly evolving genes potentially because they must catch up with the rapidly changing viruses. Due to the rapid evolution, the siRNA pathway shows significant evolutionary diversity [17, 18]. This was first recognized by a comparison between the insect and nematode RNAi mechanisms. The clearest difference is the essential involvement of RNA-dependent RNA polymerases (RdRPs) in worms but not in flies for robust RNA silencing [19–21]. RdRPs are known to produce various sRNAs during RNA silencing in fungi and plants [22, 23] whereas the gene family was believed to have been lost in the animal lineage until genome sequence analyses of non-model organisms started to discover RdRP genes in a broad range of animals [18,24,25].
The sequences of most RdRP genes support that the genes had been vertically transferred and were not introduced to the animals by the horizontal transfer, suggesting that RdRPs may have conserved biological roles in those lineages [26]. The findings that the organisms with no piRNA pathway genes always retain the RdRP genes have led to the notion that the ping-pong amplification mechanism in the piRNA pathway and RdRP- dependent sRNA production pathway have overlapping roles in reinforcing silencing activity against transposable elements (TEs)[8–11]. On the other hand, a past study analyzed sRNAs in another RdRP-positive lineage cephalochordates, but there was no evidence for the production of worm-like secondary sRNAs, such as those with 5’-triphospophate modification and 5’-purine enrichment with their sRNA populations [26]. Experimental evidence for the functional involvement of RdRPs in sRNA biogenesis in animals outside of Nematoda is currently lacking.
To directly test if RdRPs are involved in the production of sRNAs in arachnids, we use a cell line from the model tick Ixodes scapularis and provide experimental evidence that abundant classes of RdRP-dependent sRNAs regulate the expression of genes in tick cells. There are distinct classes of sRNAs produced through the activity of at least two different RdRP genes. We further demonstrate the involvement of RdRP-dependent sRNAs in regulation of genes including the ERK pathway component Dsor1. Knockdown of one of the RdRPs unexpectedly resulted in a reduction of specific viral transcripts in a Dsor1- dependent manner, suggesting that the RdRP allows for viral gene expression potentially by regulating the host’s immune response by lowering the Dsor1 level. In summary, tick RdRPs are essential for the biogenesis of specific sRNAs, play roles in gene regulation and controlling viral transcript levels. This study unveils previously overlooked pathways that are potentially broadly conserved in ticks.
Results
RNAi factors are diversified in the Ixodes genome
The broad presence of recognizable RdRP genes places arachnids in a unique position in the arthropod lineage [11]. We hypothesized that arachnids might have previously unrecognized sRNA pathways fueled by the enzymatic activity of RdRPs to produce antisense RNA molecules.
To identify RdRP genes expressed in ISE6 cells, we first performed transcriptome assembly by sequencing ribosomal RNA-depleted RNAseq libraries (Table S1, Sheet7). In the assembled transcriptome, we found 3 genes that were similar to C. elegans RdRP genes (Figure S1, Gene IDs and contig sequences are on Table S1, Sheet4). These sequences were predicted to contain the entire sequence of the conserved RdRP domain, suggesting that they were genuine RdRP genes (Figure 1A and B). We named them IscRdRP1, 3 and 4. Two (RdRP1 and RdRP3) out of the three RdRP genes were expressed at >10 TPM in our transcriptome data (Figure 1C), and we characterized these two RdRPs in the present study.
sRNA pathways often employ specific AGO proteins as their effectors [27]. We identified 6 contigs that correspond to AGO genes in the ISE6 transcriptome (Figure 1A and S2). We verified the expression of these genes by qPCR and tested the specificity of qPCR primers by introducing dsRNAs against cognate genes from regions that did not overlap with the qPCR amplicons (Figure S3A). In all cases, we observed a strong (40-90%) reduction of the tested genes upon knockdown, confirming the specificity of the qPCR assay. Therefore, we concluded that these tested genes were indeed expressed in ISE6 cells.
Among the tested genes, we found a contig in our transcriptome data that matched two annotated genes (ISCI012408 and ISCI004800). This contig was potentially derived from two genomic loci with very similar sequences and the sequences similar to ISCI012408 and ISCI004800 were next to each other at both loci (Figure S3B). This contig showed the highest similarity to the Drosophila AGO3 (dAGO3) (Figure 1A), and raised a possibility that these two ISCI entries represented fragments of a dAGO3 homolog. We consider the two ISCI entries a single gene throughout the manuscript and renamed this IscAGO3 because knockdown using dsRNAs derived from either of the annotated sequences resulted in depletion of both ISCI012408 and ISCI004800 (Figure S3C). We found another gene belonging to the PIWI-clade (IscAub). Other genes were similar to AGO-clade Argonautes, which were identified in a previous study [28]; an ortholog of the miRNA-class AGO (IscAgo- 78) and three other genes that were relatively distant from miRNA AGOs (IscAgo-16, IscAgo-30 and IscAgo-96) (Figure 1A). The predicted protein sequences of Ago-16, Ago-30 and Ago-96 contained the entire PIWI domain (Figure 1B) and their catalytic residues were also conserved (Figure S2), suggesting that they were functional slicer enzymes [29].
Using our total RNAseq data, the expression levels of these AGO genes in ISE6 cells were determined (Figure 1C). The PIWI-related genes (IscAGO3 and IscAub) were highly expressed (>50 and >90 TPM, respectively) in ISE6 cells, which were assumed to be derived from the neural lineage [30]. Although PIWI proteins were previously believed to be confined in the animal gonad in general [31], our observation was consistent with the recent findings that piRNA mechanisms are active broadly in arachnid somatic tissues [8, 11]. Expression of the PIWI proteins was further confirmed by Western blotting using antibodies against asymmetric di-methyl-arginine, a post-translational modification that is conserved among PIWI proteins [32]. The dsRNA against IscAub decreased the signal, suggesting that IscAub was the major PIWI protein modified with asymmetric di-methyl-arginine in ISE6 cells (Figure S3D). The other four AGO genes were also highly expressed (10-250 TPM, Figure 1C). Therefore, our RNAseq data indicated that components of multiple sRNA pathways including distinct PIWI/AGO genes as well as RdRP genes were expressed in ISE6 cells.
We next studied the localization of the homologs of AGOs and RdRPs, we cloned the putative ORFs of Ago-16, Ago-30, RdRP1 and RdRP3 into a mammalian expression vector with an N-terminal EGFP tag. Using the plasmids, we transfected HEK293T cells and analyzed their subcellular localization by confocal microscopy. Successful expression of the fusion proteins was confirmed by Western blotting analysis (Figure S3E). The fluorescent signals in transfected cells were detected mainly in the cytoplasm for all of the RdRP/AGO constructs (Figure 1D). Although such a heterologous experimental system might not accurately reflect their natural subcellular localization as seen with mislocalization of PIWI proteins expressed in cells lacking an active piRNA processing pathway [33], our results suggested that these proteins could localize in the cytoplasm at least under certain conditions.
The sRNA repertoire of ISE6 cells
To understand the tick sRNA repertoires, we performed sRNAseq analysis (Table S1). To understand their biogenesis mechanisms, we also generated sRNA libraries from ISE6 cells after knocking down the AGO/PIWI/RdRP genes, and each of the libraries yielded ∼13-20 million reads that could be mapped to the ISE6 genome (Table S1). sRNA reads in these libraries showed a bimodal distribution with peaks at ∼22nt and in the 26-29nt range, which typically represents peaks of miRNAs/siRNAs and piRNAs, respectively (Figure S4A).
To find clues to their functions and biogenesis mechanisms, we categorized sRNA reads based on their genomic origins (Figure 2A). The sRNA reads were sequentially mapped to the reference sequences in different categories, including miRNAs, RNA polymerase III (RNAP III) transcribed genes, rRNAs, snoRNAs, protein-coding genes and repetitive sequences (See Table S1 sheet2 for the details of the reference sequences). In the control library transfected with GFP dsRNA, ∼15% of the library was comprised of annotated miRNAs in miRBase (ver 22). As this class of sRNAs was strongly reduced upon the knockdown of Ago-78, which encoded the miRNA AGO ortholog (Figure 2A), this result confirmed the major role of Ago-78 in the miRNA pathway. We did not observe strong effects on miRNAs when other AGOs were knocked down (Figure 2A and Figure S4B), suggesting that other AGOs might support functions of other sRNA classes. Repetitive sequences produced multiple classes of sRNAs. More than 40% of 22nt and 25-30nt species were derived from repeats (Figure S4C), and they might represent repeat- associated siRNAs and piRNAs as seen in the fly system [5]. Indeed, the 25-30nt species derived from repeats were strongly decreased when PIWI genes were knocked down (Figure S4D, 25-30nt). On the other hand, the 22nt species showed no strong reduction upon knockdown of any of the factors (Figure S4D, 22nt), suggesting that there were multiple classes of repeat-associated 22nt sRNAs depending on distinct biogenesis mechanisms as discussed later.
We also found a group of abundant sRNA reads derived from various genes that were known to be transcribed by RNAP III such as SRP RNA, RNase P, RNase MRP and tRNAs (Figure 2A). The read counts of sRNAs in this category accounted for ∼9% of the control library, which was nearly as abundant as miRNAs (∼15%) as a class. The sRNAs were mapped to both sense and antisense directions with respect to the direction of transcription of their host genes, excluding the possibility that they were mere degradation products of abundant RNAP III transcripts (Figure 2B). Furthermore, these sRNAs were virtually eliminated when RdRP1 was knocked down, and their dependence on RdRP1 was verified by Northern blotting (Figure S5A and B), indicating that ISE6 cells possess molecular mechanisms to produce sRNAs that are different from those known in Drosophila or C. elegans (Figure 2A).
The sRNA species may regulate levels of the host ncRNA species. However, the level of the 300nt product of SRP RNA showed no clear difference between control and any of the knockdown samples (Figure S5B, second panel from the bottom). The effects of sRNAs on their host transcripts remain unclear.
Chemical structures of tick sRNAs
The chemical structures of 5’- and 3’-terminal nucleotides of sRNAs often reflect their biogenesis mechanisms because RNA processing and modifying enzymes leave characteristic functional groups at these ends [34, 35]. In general, AGO-bound sRNAs possess 5’-monophosphate groups that are recognized by the 5’ binding pocket of the MID domain [36], with a notable exception of nematode secondary siRNAs which possess 5’- triphosphate groups [27]. piRNAs, fly siRNAs and plant miRNAs have 2’-O-methyl groups at their 3’ nucleotides, whereas animal miRNAs carry hydroxyl groups at the equivalent position [37]. The 2’ modification status at the 3’-nucleotide could be analyzed by oxidizing RNA samples with a periodate, as the presence of vicinal free 2’-, 3’-OH species makes the RNA molecule amenable to oxidization and resulting oxidized RNA molecules lack a 3’-OH group that is required for the 3’ linker ligation for sRNA library construction [38, 39]. Although piRNA species were efficiently enriched in our oxidized sRNA library (Figure S4E), sRNAs from RNAP III-transcribed genes were depleted, indicating that the latter had free 2’-, 3’-OH groups (Figure 2C). To further support this conclusion, we verified the results by Northern blotting. ß-elimination of the 3’ nucleotides occurs when oxidized RNA species are incubated in an alkaline solution, resulting in faster migration ß-eliminated RNA species on the denaturing gel (Horwich et al., 2007). After ß-elimination, piRNAs remained at the same size, while miRNAs migrated more rapidly, consistent with the previously known 3’ structures of their counterparts in other animals (Figure 2D). We observed faster migration of sRNAs from RNAP III-transcribed genes after ß-elimination, confirming the conclusion that they had 2’-OH species at the 3’-nucleotide. Although this was different from the known structure of the fly siRNA, recent reports also showed that TE-derived sRNAs in arachnids had free 2’- OH at their 3’-ends [8,10,11,26].
We also analyzed the 5’ chemical structures of the sRNAs. The standard sRNA cloning protocol is selective for 5’-mono-phosphorylated species by taking advantage of the substrate specificity of the RNA ligase [41]. The efficient inclusion of the sRNAs derived from RNAP III-transcribed genes in our libraries suggested that they harbored monophosphate groups at their 5’ ends. To confirm this, we prepared an sRNA library after removing sRNA species with 5’-monophosphate groups by Terminator exonuclease [42] followed by dephosphorylation and re-phosphorylation of the 5’ ends using T4 polynucleotide kinase, allowing the resulting libraries to enrich 5’ di- or tri-phosphorylated sRNAs (Figure 2C, bottom). sRNAs from RNAP III-transcribed genes were not enriched in the 5’ mono-P- depleted library when compared with the regular 5’-mono-P-enriched library, supporting the hypothesis that these sRNAs were 5’-mono-phosphorylated.
Taken together, we concluded that the novel sRNA species from RNAP III transcribed genes carried a 5’-mono-phosphorylated group and were not modified at the 2’-position of the 3’-nucleotide.
Evolutionary conservation of sRNA production from RNAP III-transcribed genes
If RdRP-dependent sRNAs play important biological roles, one would expect the production of similar sRNA species to be conserved in evolutionarily distant tick species. We reanalyzed sRNA libraries from the Asian longhorned tick (H. longicornis) [43–45]. Phylogenetic analysis suggested that H. longicornis and Ixodes species shared the last common ancestor ∼200 million years ago [46]. We identified sequences homologous to the RNase P, RNase MRP and SRP RNA genes in the H. longicornis genome, and found that sRNAs were mapped to both strands of these loci (Figure 2E and S6A). Importantly, they showed peaks at 22nt on both strands, suggesting that they were produced by specific processing machineries (Figure 2F and S6B).
Therefore, the production of 22nt species from RNAP III transcribed genes was broadly conserved in ticks. Furthermore, the presence of similar sRNA species in libraries made from tick animals and saliva suggested that the sRNA production was not restricted in cultured cell lines. The deep conservation of sRNA production from RNAP III transcribed genes suggests the biological importance of this sRNA class.
Various sRNAs are produced from select coding genes
Although the fraction of sRNAs that mapped to coding exons was small (Figure 2A and Table S2), the production of sRNAs from both strands suggested the involvement of RdRPs. We analyzed sRNA reads mapping to individual annotated protein-coding genes and collected genes that produced sRNA reads at >35RPM on average in the knockdown libraries (Figure 3A and Supplementary Data). Most of the loci showed a strong reduction (>40%) in at least one library compared to the GFP-KD control (28 out of 39 loci, Figure 2B and Supplementary Data). sRNAs from some loci were reduced in more than one sample with frequent overlaps between the RdRP3-Ago-16 and Aub-AGO3 combinations (Figure 3A). On the other hand, no locus showed reductions both in RdRP1- and RdRP3-knockdown libraries. These results suggested that certain combinations of factors formed sRNA processing pathways and the two RdRPs belonged to different pathways. The size distributions of sRNAs from individual loci roughly corresponded to their processing dependencies, where 22nt and 25-29nt peaks tended to be RdRP- and PIWI-dependent, respectively (Figure 3B, lower panels, Supplementary Data).
ISCI012234, which encodes a homolog of histone H1, produced the highest number of sRNAs and the sRNAs were dependent on RdRP3 and Ago-16 (Figure 3B). As mature histone mRNAs are generally not poly-adenylated and polyA tails inhibit the antisense production by plant RDR6 [47, 48], an interesting possibility was that the polyA status might correlate with the sRNA production efficiency from the mRNA. We identified mRNAs enriched in rRNA-depleted RNAseq libraries when compared with polyA-enriched RNAseq libraries [49] (Figure S7A, Table S3) and tested whether the mRNAs produced sRNAs more efficiently than the other mRNAs. Indeed, we observed a slight but statistically significant negative correlation between the polyA status and the sRNA production (Figure S7B). Two other histone-like genes produced sRNAs at relatively high levels (Figure S7A and C). Many mRNAs were relatively strongly enriched in the rRNA-depleted RNAseq libraries without producing many sRNAs, suggesting that other factors also affect the sRNA production (Figure S7C). The production of sRNAs from PolyA (-) group was significantly higher than that from the polyA (+) group, supporting the hypothesis that polyA tails inhibit the production of RdRP-dependent sRNAs (Figure S7D).
Because ISCI012234 produced the highest sRNA read counts, we tested whether RdRP3-dependent sRNAs affect its expression. However, the expression level of this mRNA showed no consistent change (see below and Table S3). Therefore, molecular functions of RdRP3-dependent sRNAs remained unknown. Global analysis of sRNAs from protein- coding genes revealed that multiple biogenesis mechanisms were involved in the production of coding gene-derived sRNAs. The fact that each RdRP was involved in the production of sRNAs from a small number of loci suggested that the RdRPs selectively recognize their substrates for sRNA production.
sRNAs produced from repeats
A common role for metazoan RNAi/piRNA pathways is silencing of TEs [27]. To study TE- derived sRNAs, repetitive sequences were identified by the RepeatModeler2/RepeatMasker pipeline [50] and a genome-wide annotation of the repetitive sequences was obtained (Materials and Methods). To test which of the sRNA factors might be working together within the same sRNA biogenesis pathways, we counted the numbers of TEs whose sRNAs were commonly reduced in multiple knockdown libraries (Figure 4A). We used 67 repeats that produced abundant sRNA reads (>800 rpm on average in the knockdown libraries) for this analysis. As expected, the repetitive sequences often produced sRNAs that were reduced (<60%) upon knockdown of the PIWI genes (30 out of the 67 TEs examined), and many of these showed reduced levels of sRNAs in all of the three PIWI-family knockdown libraries (dsAub, dsAGO3-1 and dsAGO3-2, 13 out of the 30 TEs producing PIWI-family dependent sRNAs). Large overlaps were seen with Ago-16-RdRP3 and Ago-30-RdRP1 combinations, suggesting that the AGOs and RdRPs might work together to produce repeat associated- sRNAs. Interestingly, very few loci overlapped between the three groups, similar to the observation with the sRNAs from coding genes (Figure3). All these results again suggested that these groups of processing factors represented sRNA production pathways that largely independently operate to produce their own classes of sRNAs.
To gain further insights, we analyzed individual repeat families (Figure 4B). We noticed that repeat families producing larger numbers of sRNAs tended to be RdRP- dependent whereas families that produced PIWI-dependent sRNAs tended to produce fewer sRNAs (Figure 4B, upper). When their sRNA sizes were analyzed, families that produced RdRP1- or RdRP3-dependent sRNAs showed clear 22nt peaks and families that produced PIWI-dependent sRNAs showed peaks at ∼25-28nt (Figure 4B, bottom). The peaks at the expected sizes of their corresponding classes were strongly reduced by knockdown of the RdRP or PIWI protein, confirming the specific roles of these machineries in producing the respective classes of sRNAs. We verified biogenesis mechanisms of the TE-derived piRNA from family-423 (Figure S5B, second panel). The ∼27nt band recognized by the probe was strongly reduced by Aub or AGO3 knockdown, as expected from the sRNAseq results (Supplementary Data). In addition, we noticed that there was a less abundant species at ∼22nt, whose expression was reduced in RdRP1-knockdown cells (Figure S5B). This indicated that some repeats produce both 22nt sRNAs and piRNAs. Furthermore, we occasionally observed repeats whose 22nt peaks were reduced (e.g. rnd-1_family-1111TE, Supplementary Data) or increased (e.g. rnd−5_family−5812, Figure 4B and Supplementary Data) upon Aub knockdown in addition to the reduction of the 25-28nt piRNA peaks. Therefore, interactions between these pathways should not be ruled out.
To test if repeat-associated sRNAs silence expression of repeats, the levels of transcripts from repeats were analyzed after knocking down Ago-16, RdRP1 or RdRP3 (Table S4). To our surprise, very few repeats were misregulated. The most significantly misregulated repeat in the Ago-16 knockdown libraries was rnd-6_family-4937, which was also most significantly misregulated in the RdRP3 knockdown libraries. As this repeat produced the second highest number of RdRP3-Ago-16 dependent sRNAs (Figure 4B), these results suggest that the RdRP3-Ago-16 axis may silence repeats.
Roles for sRNA factors in gene regulation
To clarify whether the new sRNA pathways described here had roles in gene regulation, we analyzed the total RNAseq data of ISE6 cells after knocking down Ago-16, RdRP1 or RdRP3 (Figure 5).
Upon knockdown of these genes, we detected 47-84 genes to be differentially expressed compared to the control GFP KD sample (adjusted p-value <0.05, Figure 5A-C and Table S3). GO-term analysis revealed enrichment of the biological process categories related to RNAi and response to dsRNAs upon knockdown of Ago-16 or RdRP3 (Figure 5D and F, Supplementary PDF). In particular, Dicer homologs were upregulated in both libraries, while AGO homologs were up- and down-regulated in RdRP3 and Ago-16 knockdown libraries, respectively. Although the possibility of off-target effects of the introduced Ago-16 dsRNAs on their homologs could not be excluded, these results strongly suggested auto-regulation of the genes in sRNA-related pathways. Upon RdRP1 knockdown, stress-response-related genes were often down-regulated (Supplementary PDF).
Misregulation of the extracellular signal-regulated kinase (ERK) protein (ISCI005428, hereafter IscDsor1 after the fly gene name Dsor1) upon RdRP1-knockdown caught our attention because it was most strongly upregulated in this dataset (Figure 5B highlighted by blue). The annotation of Ixodes genes was incomplete and gene models generally lacked UTRs. We noticed that there was a strong peak of RdRP1-dependent sRNAs in the downstream region of the IscDsor1 CDS (Figure 6A). The total RNAseq data showed continuous signals for ∼4kb after the IscDsor1 coding region, suggesting that the signal represented the 3’ UTR of IscDsor1 (Figure 6A). Consistent with this idea, RT-PCR using primers that bind the 3’ end of the CDS and the 3’ end of the putative 3’ UTR yielded products having the correct sequence in a reverse-transcription-dependent manner (Figure S8A and B). The region where a large number of sRNAs were mapped corresponded to the rnd-1_family-272 sequence in our repeat annotation, which showed similarity to LTR/Gypsy family transposons (Table S1). Therefore, the sRNAs targeting IscDsor1 might be produced from other copies of this TE and act in trans.
The reciprocal changes in the targeting sRNAs and the target mRNA suggested direct regulation (Figure 6B and C). We first verified that IscDsor1 was upregulated in RdRP1 knockdown cells by qPCR (Figure 6C, qPCR panel). A statistically significant increase in the IscDsor1 level was also observed upon Ago-96 knockdown in addition to RdRP1 knockdown, suggesting that Ago-96 might also be involved in the regulation of IscDsor1 (Figure 6C). To test if RdRP1 regulates IscDsor1 through its 3’UTR, we cloned IscDsor1 3’UTR after the firefly luciferase coding region of the pmirGLO/Fer-Luc2/Act-hRluc vector [51]. After depleting RdRP1 or RdRP3 in ISE6 cells, we transfected the IscDsor1 3’UTR sensor plasmid and performed luciferase assays (Figure 6D). Upon knockdown of RdRP1, we detected ∼3-fold upregulation (p=0.002) of the sensor expression, whereas RdRP3 knockdown had no effect. These results demonstrated that RdRP1 regulates IscDsor1 through the 3’UTR, presumably through the production of repeat associated sRNAs.
Roles for sRNA factors in controlling viral RNA levels
Besides endogenous gene regulation, invertebrate RNAi pathways play roles in antiviral defense [52]. In ticks, a previous study demonstrated that RNAi factors including some AGOs were involved in controlling tick-borne viruses in tick cells and animals [28, 45]. We were interested in testing if RdRPs control viral transcript levels in ISE6 cells.
In a recent study, a set of persistently infecting viruses in the ISE6 culture were identified by next-generation sequencing of putative viral particles [53]. We profiled sRNAs derived from the viral sequences in the knockdown sRNAseq libraries. In the control library, sRNAs mapping to the viral genomes were abundantly present (Figure 7A). The reads were distributed across the entire genomes with no strong enrichment in particular regions. The size distribution of the mapped reads showed a strong peak at 22nt without a recognizable peak at the piRNA size (Figure 7B). This was consistent with the vsiRNA seen in TBEV- infected ticks and cells [28, 45].
We sought to determine if vsiRNAs were dependent on any of the AGOs or RdRPs. Interestingly, vsiRNAs were still produced upon knockdown of any of the RNAi factors (Figure 7C, upper panel). This suggested that these sRNA factors were dispensable for vsiRNA production although functional redundancy between the factors might hinder the detection of the effects. We tested if the knockdown of these factors had any effects on the abundance of viral transcripts (Figure 7C lower panel). In our total RNAseq libraries, upregulation of some of the viruses upon knockdown of Ago-16 was observed (Figure 7C, lower panel), consistent with the previous study using TBEV/LGTV [28]. Furthermore, we observed unexpected downregulation of some viruses upon knockdown of RdRP1 (Figure 7C, lower panel). While we observed no reduction of vsiRNA in RdRP1 knockdown samples, these results suggested that RdRP1 might have a role in maintaining the levels of viruses independently from vsiRNA production.
We hypothesized that the reduced viral gene expression in RdRP1-knockdown cells was due to an enhanced immune response by upregulation of Dsor1. To test this, we knocked down RdRP1 and Dsor1 simultaneously (Figure 7D). As expected, Dsor1 knockdown caused upregulation of viral transcripts, and RdRP1 knockdown caused a decrease of viral transcripts, consistent with our RNAseq data. When both RdRP1 and Dsor1 were knocked down together, no significant downregulation was observed compared with Dsor1 single knockdown. These results suggested that RdRP1 downregulated viral transcripts via upregulation of Dsor1.
In summary, our results showed that the components of the sRNA pathway play roles in the regulation of mRNA expression primarily to regulate genes related to sRNA pathways. Furthermore, RdRPs appeared to be involved in controlling viral transcripts in a vsiRNA- independent manner. While the biological significance of these mechanisms in normal gene regulation and virus-tick interactions needs to be studied in the future, this study unveiled novel and unexpected regulatory mechanisms involving tick-specific sRNA factors (Figure 7E).
Discussion
In contrast to the established roles of RdRPs in plants, fungi and worms, their roles remain unclear in other animals. Although RdRP genes were found in many arthropods, their roles in sRNA production were not experimentally demonstrated mainly due to the lack of suitable experimental systems [8–11]. Analysis of sRNA chemical structures of sRNAs from various animals possessing RdRP genes did not find evidence for the production of sRNAs containing terminal structures similar to those of RdRP-dependent sRNAs in worms[26]. Based on these results, the importance of RdRPs in animals outside of Nematoda remained controversial.
In the present study, we demonstrated the presence of abundant classes of RdRP- dependent sRNAs in tick cells. Some of them are expressed as highly as the most abundant miRNA genes expressed in the cell line (Figure 2A), implying that they played important biological roles. The conservation of the catalytic site in the tick RdRPs suggested they are active enzymes potentially producing antisense RNA species [54, 55].
A large fraction of RdRP-dependent sRNAs was derived from RNAP III-transcribed genes (Figure 2A), pointing to a potential functional link. Transcription by RNAP III is terminated by the presence of a stretch of 5 or more Ts on the non-template strand [56] and the presence of short U-tails is a signal for clearance by the quality control mechanisms [57–59]. Therefore, these U-tails may also act as a signal for RdRP1 to produce their antisense RNAs in ticks. In fact, in C. elegans, 3’-oligouridylation acts as a signal for RdRP-dependent sRNA production in artificial RNAi or silencing of rRNA transcription upon erroneous pre- rRNA production [60, 61]. In addition, histone mRNAs are oligourydilated before they are subjected to degradation [48]. Therefore, oligo-U tails may be a signal for RdRP recognition.
However, the distance between the positions of abundant tick sRNAs and the 3’- ends of RNAP III transcripts showed no obvious trends in contrast to the expectation that the antisense RNA production may be initiated at a certain distance from the U-tails [60]. Additional evidence to support this hypothesis is lacking. Alternatively, RdRPs might physically interact with RNAP III during transcription, similarly to how RDR2 in Arabidopsis recognizes RNAP IV products to produce their antisense strands [55]. The functional links between RdRP1 and the RNAP III machineries remain unclear, and this deserves further investigation.
What might be the roles for the tick endogenous sRNA pathways? The production of antisense sRNAs from RNAP III-dependent genes appears to be conserved in the two tick species, I. scapularis and H. longicornis, suggesting that this pathway has a conserved role (Figure 2E-F). However, we observed no discernible effects of RdRP1-KD on the expression levels of the RNAP III product SRP RNA, suggesting that the sRNAs may play roles independently of gene regulation in cis (Figure S5B). In fission yeast, genes transcribed by RNAP III are involved in the compartmentalization of the genome in the nucleus by defining boundaries between regions with distinct chromosomal states [62, 63]. It is interesting to speculate that the tick sRNA pathway might also contribute to the higher order organization of the genome in the tick nucleus. On the other hand, as demonstrated with IscDsor1, tick sRNAs can also down-regulate mRNAs containing sRNA targets at least in some cases (Figure 6).
The function of RdRP3-dependent pathway is even more mysterious. RdRP3 knockdown caused misregulation of 63 genes (Table S3), many of which were also misregulated in Ago-16 knockdown (Figure 5A and C). This was consistent with the overlapping dependencies of sRNAs on these factors (Figures 3 and 4). The misregulated genes included many sRNA-related factors including Dicer homologs. Autoregulatory mechanisms have also been described in worms, suggesting that this is a common phenomenon [64]. However, as these loci do not produce abundant RdRP3-dependent sRNA species, how the RdRP3-Ago-16 axis controls gene expression remains unknown. It is important to identify direct targets to understand their molecular functions. Studying gene regulation using multiple approaches such as proteomics and chromatin structure analysis will be important. Characterization of biochemical properties of the AGOs may also provide clues to the molecular functions for the novel sRNA species. In addition to molecular analysis, biological roles for RdRP-dependent sRNAs need to be investigated especially in the in vivo context.
One of the main roles for invertebrate RNAi pathways is the antiviral response. In animals including worms, the roles for RdRPs in anti-viral defense mechanisms remain to be studied. In plants, an Arabidopsis mutant of the RdRP gene RDR6 exhibited normal antiviral responses, suggesting that RDR6 is dispensable [23, 65]. On the other hand, in tobacco, knockdown of the RDR6 gene caused higher susceptibility to viruses especially at high temperatures [66]. Viruses often encode proteins to suppress hosts’ RNAi mechanisms (Viral Suppressors of RNA silencing or VSRs), and complex interactions between VSRs and RNAi factors often complicate the interpretation of experimental data [67]. Therefore, the contribution of VSRs needs to be taken into consideration to understand interactions between viruses and host or vector cells. The knowledge regarding the RdRP-dependent sRNA pathways obtained here will help us untangle the complex interactions at the molecular level.
Various signaling pathways, including the ERK pathway, play roles in antiviral immunity [68]. It is interesting that the integral ERK pathway factor, Dsor1 is upregulated upon RdRP1 knockdown in tick cells, suggesting that RdRP1 might negatively regulate the immune response (Figure 6). In addition, some of the persistently infecting viruses were reduced upon RdRP1 knockdown whereas knockdown of Ago-16 tended to have the opposite effect (Figure 7). Therefore, our results suggested complex interactions between the host sRNA pathways and the virus rather than simple antiviral roles of sRNA factors. This is at least in part mediated by Dsor1 as RdRP1 knockdown did not decrease viral transcripts when Dsor1 was also knocked down. The direct and indirect roles of tick RdRPs in the life-cycle of tick-bone viruses should be studied in the future.
Tick cell lines have been used as an in vitro model to study interactions between tick- borne viruses and vectors [69]. Based upon the previously established tools, we provide a genomics platform to study the ticks’ sRNA pathways and viruses by annotating endogenous sRNAs and their biogenesis pathways. For convenience, we made the genomics resources available to the research community, including ISE6 and H. longicornis UCSC genome browser assembly tracks, RNAseq mapping data tracks and the sRNA size distribution charts. The UCSC assembly hubs with the RNAseq mapping tracks are available at https://data.cyverse.org/dav-anon/iplant/home/okamuralab/trackhub/Isc_ISE6/IscaI1_hub.txt and https://data.cyverse.org/dav-anon/iplant/home/okamuralab/trackhub/ucscgb_haeL2018/hubHaeL2018.txt. Together with the initial characterization of the sRNA pathways presented in this study, these resources will facilitate studies related to gene regulation and virus-vector interactions that are mediated by sRNAs. Furthermore, as RNAi technologies depend on sRNA factors, detailed understanding through the characterization of the sRNA factors may pave the way for the development of RNAi-based pesticides.
Materials and methods
Tick cell culture and dsRNA transfection
Ixodes scapularis embryonic 6 (ISE6) cells were obtained from ATCC and cultured according to the published protocol at 34 degrees C [70]. The dsRNA transfection was performed using Effectene (QIAGEN). Cells were seeded at 1x10∧6 /ml in 2ml fresh L-15B medium on a 6-well plate. 400ng dsRNA was diluted in 100ul Buffer EC and 3.2ul Enhancer was added. The mixture was incubated for 2-5min at room temperature. Then, 10ul Effectene was added and incubated for 5-10min at room temperature. The mixture was added to the culture and the cells were incubated for 7-10 days. After the first incubation, the dsRNA transfection procedure was repeated again and incubated for 7-10 days to ensure the maximal efficacy of RNAi.
Plasmids and dsRNA production
cDNA was amplified using the total RNA of ISE6 cells as a template. Contaminating genomic DNA was removed by treating total RNA samples with the Turbo DNA-free kit (Ambion) and cDNA was synthesized using Superscript III (Invitrogen) according to the manufactures’ instructions. The cDNA encoding Ago-16, Ago-30, RdRP1 and RdRP3 were amplified using primers listed in Table S5, and clones were obtained by inserting the amplified cDNAs in the NotI-XbaI sites of pEGFP with a modified multiple-cloning-site sequence [71]. The sequences of the inserts were verified by sequencing.
For the preparation of templates for dsRNA synthesis, ∼500bp fragments of sRNA factors were amplified from cDNA of ISE6 cells using primers listed in Table S1 using iProof High- Fidelity Master Mix (Bio-Rad) and the amplicons were treated with XhoI to insert them in the XhoI site of pLitmus28i (NEB). To obtain templates for in vitro transcription, LitmusA and LitmusB primers were used (Table S5). 5ul of the PCR product was used in a 20ul in vitro transcription reaction using Megascript T7 kit (Ambion). dsRNA was purified by Phenol/Chloroform extraction followed by ethanol precipitation as described previously [71].
RNA extraction and RNAseq library preparation
Total RNA was extracted from ISE6 cells using Trizol-LS (Invitrogen) according to the manufacturer’s instructions. For sRNA libraries of ISE6 cells depleted of sRNA factors and their control samples, 1ug total RNA was used for library construction using the TruSeq Small RNA Library Preparation kit (Illumina). For 5’-tri-P libraries, the sRNA fraction (∼15- 35nt) was isolated by gel extraction and RNA species bearing 5’-OH were monophosphorylated by T4 polynucletide kinase (NEB). This was followed by treatment by a terminator exonuclease (Epicentre), dephosphorylation by Calf Intestine Phosphatase (NEB) and re-phosphorylation by T4 polynucletide kinase (NEB). For the oxidized sRNA library, the sRNA fraction (∼15-35nt) from 50 ug total RNA was isolated by gel extraction and treated with 25mM NaIO4 dissolved in 60 mM Borax buffer and incubated for 30 minutes at room temperature in the dark. Both 5’-tir-P enriched and oxidized samples were subjected to a sRNA library construction by the TruSeq Small RNA Library Preparation kit (Illumina). The resulting libraries were sent to BGI (Hongkong) for sequencing on a Hiseq2000. For total RNAseq analysis, three sets of knockdown experiments were performed independently, and total RNA samples extracted by Trizol-LS (Invitrogen) were sent to BGI (Hongkong) for ribosomal RNA depletion using Ribo-Zero Gold rRNA Removal Kit (Human/Mouse/Rat) (Epicentre) followed by library construction using non-stranded (Replicate 1) or stranded (Replicates 2 and 3) TruSeq mRNA Library Prep Kit (Illumina). The resulting libraries were sequenced on Hiseq4000.
Identification of repetitive sequences
Repetitive sequences in the ISE6 genome (IscaI1; https://metazoa.ensembl.org/Ixodes_scapularis_ise6/Info/Index) were identified using the RepeatModeler2 pipeline [50]. The reference TE sequences were downloaded from Repbase < https://www.girinst.org/server/RepBase/> and the RepBaseRepeatMaskerEdition-20181026.tar file was used for annotation.
Analysis of sRNAseq data
sRNA libraries were analyzed following the previously established pipeline [72]. Adaptor sequences were removed by using fastx_clipper and collapsed by fastx_collapser and converted by fasta_formatter (The tools were downloaded from <http://hannonlab.cshl.edu/fastx_toolkit/commandline.html>). The fasta files were used for mapping using the ISE6 genome sequence (IscaI1) [73] using bowtie1.3.0 < http://bowtie-bio.sourceforge.net/index.shtml> without allowing any mismatches. For identification of sRNA sources, genome-mapping reads were mapped to the following reference sequences:
1. miRNAs (miRbase Release 22.1) [74], RNAP III transcripts (downloaded from RNAcentral) [75], rRNAs from NCBI and mRNAs (Vectorbase IscaI1.0) [73]. As 5S rRNA loci are separated from the rDNA repeats and transcribed by RNAP III, we added the 5S rRNA sequences to the “RNAP III” group. The reference sequences are summarized in Table S1.
Phylogenetic tree construction
Based on multiple sequence alignment with MUSCLE, we used ModelFinder [76] to determine the best-fit model and obtained branch supports with the ultrafast bootstrap [77] implemented in the IQ-TREE software [78]. The structural prediction was performed on the ColabFold [79, 80]
Analysis of total RNAseq data
Total RNAseq libraries were analyzed as described previously [72]. The adaptor sequences were trimmed using Cutadapt [81] with the default quality cutoff value (20). Gene expression was quantified by salmon [82] using the Vectorbase IscaI1.0 annotation or a fasta file containing the viral genome sequences persistently present in the ISE6 culture (Table S2; Nakao et al., 2017). Genome mapping was performed by bowtie2 <http://bowtie-bio.sourceforge.net/bowtie2/index.shtml> with the default setting. Differential gene expression analysis was done using the DESeq2 package [83], with the cut-off of adjusted- p-value set to 0.05. GO-term enrichment analysis of misregulated genes was conducted using clusterProfiler [84], with GO terms obtained from eggNOG-mapper [85] and TRAPID [86]. The transcriptome was assembled using Trinity [87]. A blast database was constructed using the assembled cDNA sequences, and homologs of RNAi factors were identified by tblastn https://blast.ncbi.nlm.nih.gov/Blast.cgi?CMD=Web&PAGE_TYPE=BlastDocs&DOC_TYPE= Download using the bait sequences (RRF-1, EGO-1 and RRF-3 from C. elegans; AGO1, AGO2, Aub, AGO3 and PIWI from D. melanogaster). MUSCLE (Figure 1A and S1) or CLUSTAL-O (Figure S3) was used for alignment.
Northern blotting
Northern blotting was done as described previously [71]. Briefly, 10ug total RNA samples were separated on a 15% Sequagel (National Diagnostics) and transferred onto a positively charged nylon membrane and hybridized using DNA or LNA probes. The probe sequences are listed in Table S5.
Western blotting
Western blotting was performed as previously described [88]. Table S5 lists antibodies used in this study.
Accession number
The small RNA library data produced for this study are deposited at NCBI SRA under GSE183810.
Author contributions
C.F., K.To., L.-L.C., and K.O. planned and performed bioinformatics analysis. M.L., K.Ts., T.T., J.I. and K.O. planned and performed experiments. C.F. analyzed transcriptome data and designed analyses. K.Ts performed experiments using HEK293T cells. M.L. generated total RNAseq libraries, cloned RdRP/AGO cDNAs and characterized the proteins. L.-L.C. processed sequencing data and performed initial characterization of mRNAs and sRNAs. K.To. identified repeats and performed sRNA analysis. T.T. produced tick plasmids and designed H. longicornis analysis. J.I. performed other experiments including Western blots, Northern blots and qPCR. K.O. produced sRNA libraries. K.O. and T.T. provided funds. C.F. and K.O. wrote the initial draft. All authors edited and approved the manuscript.
Supplementary Materials
Supplementary Tables
Table S1. General bioinformatics information.
Sheet1: Statistics of the sRNAseq libraries.
Sheet2: Reference sequences used for classification of sRNA origins.
Sheet3: ISCI IDs of genes with sequences that matched with the dsRNA sequences used for the knockdown.
Sheet4: Trinity contigs and Gene IDs of the AGO/RdRP genes analyzed in this study.
Sheet5: References for amino acid sequences of AGO/RdRPs used for the phylogenetic analysis.
Sheet6: Repeat families identified by RepeatModeler2/RepeatMasker
Sheet7: Statistics of the total RNAseq libraries.
Sheet8: List of the H. longicornis sRNA libraries analyzed in this study.
Table S2. Summary of sRNA analysis
Sheets1-10: Related to Figure 2A and S4. sRNA read counts for each category are shown. Each sheet reports the numbers in each of the knockdown libraries. The sum of sense and antisense reads (C18-C30), the number of sense reads (S18-S30) and the number of antisense reads (AS18-AS30) that matched the reference sequence in the category in the header row are reported for each length.
Sheet11: Normalized counts of sRNAseq and total RNAseq reads mapping to the persistently present viruses.
Table S3. Summary of DGE analysis
The results of DGE analysis using the total RNAseq libraries by the Salmon-DESeq2 pipeline are summarized. Sheet1: Ago-16 KD vs GFP control, Sheet2 RdRP1 KD vs GFP control, Sheet3: RdRP3 KD vs GFP control, Sheet4: TPM values for individual libraries.
Table S4. Summary of expression analysis for repeats
The results of DGE analysis using the total RNAseq libraries by the Salmon-DESeq2 pipeline are summarized. Sheet1: Ago-16 KD vs GFP control, Sheet2 RdRP1 KD vs GFP control, Sheet3: RdRP3 KD vs GFP control, Sheet4: TPM values for individual libraries. These analyses were done using a reference file containing both protein-coding genes and repeats and data only for repeats are shown.
Table S5. Materials used in this study.
The workbook contains the information of oligos (Sheet1), cell lines (Sheet2) and antibodies (Sheet3).
Table S6. List of commonly misregulated genes upon knockdown of RdRPs and viral infection. Transcriptome data of cells infected with tick-borne viruses are from [89]
Supplementary Data
https://data.cyverse.org/dav-anon/iplant/home/okamuralab/TickMS_supplementaryData/Cyverse.TE_cDNA_links.html Links to the sortable tables of CDS- and TE-derived sRNAs in the knockdown library can be found in TE_cDNA_links.html. On the Normalized and Relative levels pages, normalized read counts (RPM) and relative levels compared to the control (dsGFP) library were used. TEs with more than 50RPM and Coding Genes with more than 3.5RPM on average in KD libraries are included in this website.
Full tables including those with fewer reads can be downloaded from the following links. CDS:
Repeats:
Supplementary PDF
Contains dot plots and Gene-GO networks visualizing the results of GO enrichment analysis for misregulated genes upon Ago-16/RdRP1/RdRP3 knockdown.
Acknowledgments
The authors are grateful to members of the K.O. laboratories in TLL and NAIST for discussion. We thank Dr. Kentaro Yoshii for critically reading the manuscript. We thank ATCC for ISE6 cells and BGI for sequencing. Research in K.O.’s group was supported by the National Research Foundation, Prime Minister’s Office, Singapore under its NRF Fellowship Programme (NRF2011NRF-NRFF001-042), Temasek Life Sciences Laboratory core funding and the JSPS Fund for the Promotion of Joint International Research (Returning Researcher Development Research, 17K20145). Work in the T.T.’s group was supported by Takeda Science Foundation. The content is solely the responsibility of the authors and does not necessarily represent the official views of these agencies.
Footnotes
New results regarding regulation of persistently infecting viruses by RdRP1 via the ERK pathway have been added.
References
- 1.↵
- 2.↵
- 3.↵
- 4.↵
- 5.↵
- 6.↵
- 7.↵
- 8.↵
- 9.
- 10.↵
- 11.↵
- 12.↵
- 13.↵
- 14.↵
- 15.↵
- 16.↵
- 17.↵
- 18.↵
- 19.↵
- 20.
- 21.↵
- 22.↵
- 23.↵
- 24.↵
- 25.↵
- 26.↵
- 27.↵
- 28.↵
- 29.↵
- 30.↵
- 31.↵
- 32.↵
- 33.↵
- 34.↵
- 35.↵
- 36.↵
- 37.↵
- 38.↵
- 39.↵
- 40.
- 41.↵
- 42.↵
- 43.↵
- 44.↵
- 45.↵
- 46.↵
- 47.↵
- 48.↵
- 49.↵
- 50.↵
- 51.↵
- 52.↵
- 53.↵
- 54.↵
- 55.↵
- 56.↵
- 57.↵
- 58.
- 59.↵
- 60.↵
- 61.↵
- 62.↵
- 63.↵
- 64.↵
- 65.↵
- 66.↵
- 67.↵
- 68.↵
- 69.↵
- 70.↵
- 71.↵
- 72.↵
- 73.↵
- 74.↵
- 75.↵
- 76.↵
- 77.↵
- 78.↵
- 79.↵
- 80.↵
- 81.↵
- 82.↵
- 83.↵
- 84.↵
- 85.↵
- 86.↵
- 87.↵
- 88.↵
- 89.↵
- 90.