Genome-wide identification of Aedes albopictus long noncoding RNAs and their association with dengue and Zika virus infection

The Asian tiger mosquito, Aedes albopictus (Ae. albopictus), is an important vector that transmits arboviruses such as dengue (DENV), Zika (ZIKV) and Chikungunya virus (CHIKV). Long noncoding RNAs (lncRNAs) are known to regulate various biological processes, but knowledge of Ae. albopictus lncRNAs and their functional role in virus-host interactions is still limited. Here, we identified and characterized the lncRNAs in the genome of Ae. albopictus and evaluated their potential involvement in DENV and ZIKV infection. Using 148 public datasets, we identified a total of 10,867 novel lncRNA transcripts, of which 5,809, 4,139, and 919 were intergenic, intronic and antisense respectively. The Ae. albopictus lncRNAs shared many characteristics with those of other species, such as short length, low GC content, and low sequence conservation. RNA-sequencing of Ae. albopictus cells infected with DENV and ZIKV showed that the expression of lncRNAs was altered upon virus infection. Target prediction analysis revealed that Ae. albopictus lncRNAs may regulate the expression of genes involved in immunity and other metabolic and cellular processes. To verify the role of lncRNAs in virus infection, we generated mutations in lncRNA loci using CRISPR-Cas9 and discovered that mutations at two lncRNA loci, namely XLOC_029733 (novel lncRNA transcript id: lncRNA_27639.2) and LOC115270134 (known lncRNA transcript id: XR_003899061.1), resulted in enhancement of DENV and ZIKV replication. The results presented here provide an important foundation for future studies of lncRNAs and their relationship with virus infection in Ae. albopictus.

Author summary

Ae. albopictus is an important vector of arboviruses such as dengue and Zika. Studies of virus-host interaction at the gene expression and molecular level are crucial, especially for devising methods to inhibit virus replication in Aedes mosquitoes. Previous reports showed that, besides protein-coding genes, noncoding RNAs such as lncRNAs are also involved in virus-host interaction. In this study, we report a comprehensive catalog of novel lncRNA transcripts in the genome of Ae. albopictus. We also show that the expression of lncRNAs was altered upon infection with dengue and Zika. Additionally, depletion of certain lncRNAs resulted in increased replication of dengue and Zika, suggesting a potential association of lncRNAs with virus infection. The results of this study provide a new avenue for the investigation of mosquito-virus interactions that may pave the way to the development of novel methods of vector control.

size of 2650.1 bp. Short length is a typical feature of lncRNAs; hence, our findings are congruent with lncRNAs identified in other living organisms [8], [35], [42]. In addition, Ae. albopictus known and novel lncRNAs exhibited slightly lower GC content than protein-coding transcripts (Fig 2B). This lower GC content, or AT enrichment, is another notable feature of lncRNA transcripts [8], [9], [35], [36], [40]. The mean GC contents of novel lncRNAs, known lncRNAs and protein-coding transcripts were 42%, 44%, and 48% respectively. We also analyzed the GC content of other noncoding sequences in the genome, namely the 5'UTR and 3'UTR, and found that both had lower GC content than protein-coding transcripts. The average GC contents of the 5'UTR and 3'UTR were 44% and 36% respectively.
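The GC comparison above comes down to a per-transcript percentage averaged over each transcript class. The following is a minimal Python sketch of that calculation, not the pipeline used in the study; the function names and the toy sequences are hypothetical, and skipping ambiguous bases such as N is an assumption.

```python
def gc_content(seq: str) -> float:
    """Return the percentage of G and C among the unambiguous bases of seq.

    Ambiguous symbols (for example N) are ignored; this is an assumption,
    not a detail taken from the study.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T") + seq.count("U")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / total


def mean_gc(seqs) -> float:
    """Return the mean of the per-transcript GC percentages."""
    values = [gc_content(s) for s in seqs]
    if not values:
        raise ValueError("no sequences given")
    return sum(values) / len(values)


# Toy sequences invented for illustration only (not data from the study).
toy_lncrnas = ["ATATGCAT", "AATTGGCA"]
toy_mrnas = ["GCGCATGC", "ATGCGGCC"]

print("toy lncRNA mean GC:", mean_gc(toy_lncrnas))
print("toy mRNA mean GC:", mean_gc(toy_mrnas))
```

Averaging per-transcript percentages weights every transcript equally regardless of length; pooling all bases before dividing would give a length-weighted figure instead.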

As previous studies showed that lncRNAs demonstrate low sequence conservation even among closely related species [8], [9], [ [44]. BLAST algorithm bit score was used as an indicator for sequence similarity [8], [9]. In general, protein-coding transcripts were found to be more conserved than lncRNAs (Fig 3A). Ae. albopictus lncRNAs showed high degree of sequence similarity (E-value < 10 indicating that they are likely genus specific (Fig 3B)

the expressed genes showed that the replicates were grouped into their three distinct groups (S1 Figure).
This unsupervised clustering also showed that the data formed two main clusters, i.e., uninfected, and  (Figure 4B). Similar lncRNAs were found to be differentially expressed: 11 lncRNAs were upregulated and 12 lncRNAs were downregulated in both DENV and ZIKV-infected cells (Table 2). The similar changes seen in these 23 lncRNAs post-infection suggest that they may be key players, regardless of the type of viral infection. On the other hand, the number of differentially expressed protein-coding mRNAs was higher, i.e., 1,718 and 1,098 mRNAs were differentially expressed (|FC| ≥ 2, FDR < 0.05) in DENV and ZIKV-infected cells respectively. The list of differentially expressed lncRNAs can be found in S2 Table.

We randomly selected ten differentially expressed lncRNAs from RNA-seq analysis for qPCR.
The ten chosen candidates were upregulated or downregulated in both DENV and ZIKV-infected samples. We validated the differentially expressed lncRNAs by qPCR using the same RNA samples used for Illumina sequencing. RPS17 and RPL32 were used as housekeeping genes [46], [47]. Our preliminary tests showed that the expression levels of the two housekeeping genes were unchanged in DENV and ZIKV-infected C6/36 cells, qualifying them as suitable references in this study.
Overall, the qPCR results of all ten lncRNAs were consistent with the RNA-seq data, validating the bioinformatic approach used in this study (Fig 5).

Functional analysis of the differentially expressed lncRNAs

To determine the putative regulatory roles of Ae. albopictus lncRNAs upon virus infection, we searched for protein-coding genes located within 100 kb (upstream and downstream) of differentially expressed lncRNAs as potential cis-regulated genes [48]-[50] and found a total of 198 and 153 coding genes that fulfilled this criterion in the DENV and ZIKV transcriptomes respectively. As lncRNAs can also regulate genes in trans or in distal regions [48], [51]-[53], we searched for protein-coding genes that were co-expressed with the differentially expressed lncRNAs based on Pearson correlation (coefficient > 0.95 or < -0.95, P-value < 0.05). In the DENV transcriptome, 110 coding genes were co-expressed with the differentially expressed lncRNAs. In the ZIKV transcriptome, we detected 38 coding genes whose expression was highly correlated with that of differentially expressed lncRNAs. We categorized the co-expressed genes as putative trans-regulated genes in Ae. albopictus cells upon DENV and ZIKV infection. Results of the co-expression analysis between lncRNAs and protein-coding mRNAs can be found in S3 Table.

We selected the potential cis and trans-regulated genes for functional annotation and enrichment analyses. Previous reports demonstrated that upon virus infection in Aedes mosquitoes, there were changes in the expression level of host immune-related genes [13], [45], [54]-[56]. By conducting functional annotation using Pannzer [30], we discovered 13 lncRNAs that potentially regulated genes involved in immune-related functions (S4 Table). Based on gene ontology (GO), these genes were [45].
In this study, we discovered lncRNAs that potentially regulate genes that were also involved in serine-type endopeptidase activity (GO:0004252, GO:0004253, GO:0004254) and metalloendopeptidase activity (GO:0004222).
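The cis and trans target searches described in this section can be illustrated with a short Python sketch. It is a sketch under stated assumptions rather than the authors' implementation: the tuple layout for genomic features, the function names, and the toy coordinates and expression values are all hypothetical. Distance is measured as the gap between feature boundaries, which is one possible reading of "within 100 kb", and the P-value filter applied in the study is omitted here.

```python
import math

CIS_WINDOW = 100_000  # 100 kb upstream and downstream, as stated in the text


def cis_genes(lnc, genes, window=CIS_WINDOW):
    """Return ids of genes on the same scaffold within `window` bp of the lncRNA.

    `lnc` and every entry of `genes` are (id, scaffold, start, end) tuples;
    this layout is an assumption made for the example.
    """
    _, scaffold, start, end = lnc
    hits = []
    for gene_id, g_scaffold, g_start, g_end in genes:
        if g_scaffold != scaffold:
            continue
        # Gap between the two intervals; 0 when they overlap.
        gap = max(g_start - end, start - g_end, 0)
        if gap <= window:
            hits.append(gene_id)
    return hits


def pearson(x, y):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return math.nan  # undefined for constant expression profiles
    return sxy / math.sqrt(sxx * syy)


def coexpressed(lnc_expr, gene_expr, cutoff=0.95):
    """Return ids of genes with |r| > cutoff against the lncRNA profile.

    Only the coefficient threshold is applied; no P-value is computed.
    """
    return [gene_id for gene_id, expr in gene_expr.items()
            if abs(pearson(lnc_expr, expr)) > cutoff]


# Toy data invented for illustration only.
lnc = ("lnc_toy", "scaffold_1", 500_000, 502_000)
genes = [
    ("geneA", "scaffold_1", 450_000, 455_000),  # 45 kb upstream
    ("geneB", "scaffold_1", 700_000, 710_000),  # 198 kb downstream
    ("geneC", "scaffold_2", 500_500, 501_000),  # different scaffold
]
lnc_expr = [1.0, 2.0, 3.0, 4.0]
gene_expr = {
    "geneX": [2.0, 4.1, 5.9, 8.0],
    "geneY": [8.0, 6.1, 3.9, 2.0],
    "geneZ": [3.0, 1.0, 4.0, 2.0],
}

print("cis candidates:", cis_genes(lnc, genes))
print("trans candidates:", coexpressed(lnc_expr, gene_expr))
```

Taking the absolute value of the coefficient captures both the positive (> 0.95) and negative (< -0.95) thresholds given in the text with a single comparison.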

We then performed functional enrichment analysis of lncRNA-regulated genes using g:Profiler [31]. We discovered that 32 and 22 GO terms were significantly enriched (P-value < 0.05) in the ZIKV and DENV transcriptomes respectively (Supplementary information 8, Supplementary Figure 2). A total of 8 GO terms were significant and shared by both transcriptomes, suggesting that lncRNAs may target genes that perform similar regulatory functions upon DENV and ZIKV infections (S5 Table). Among the significant GO terms were cellular processes, ion/protein binding, metabolic processes, and DNA binding.
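The logic behind this kind of enrichment analysis can be sketched as a hypergeometric over-representation test, followed by an intersection of the significant terms from the two transcriptomes. The Python example below is illustrative only: it does not call g:Profiler and does not reproduce its procedure, it applies no multiple-testing correction, and all function names, term labels and counts are hypothetical.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """Return P(X >= k) for a hypergeometric draw.

    N: genes in the background, K: background genes annotated with the term,
    n: genes in the query list, k: query genes annotated with the term.
    """
    upper = min(n, K)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


def significant_terms(pvalues, alpha=0.05):
    """Return the set of terms whose P-value is below alpha."""
    return {term for term, p in pvalues.items() if p < alpha}


# Toy counts invented for illustration: (k, n, K, N) for each term.
toy_denv = {"term_A": (8, 50, 100, 10_000), "term_B": (1, 50, 400, 10_000)}
toy_zikv = {"term_A": (6, 40, 100, 10_000), "term_C": (5, 40, 120, 10_000)}

denv_p = {term: hypergeom_pvalue(*counts) for term, counts in toy_denv.items()}
zikv_p = {term: hypergeom_pvalue(*counts) for term, counts in toy_zikv.items()}

# Terms significant in both toy analyses.
shared = significant_terms(denv_p) & significant_terms(zikv_p)
print("shared significant terms:", sorted(shared))
```

Exact integer arithmetic with `math.comb` avoids floating-point underflow for large backgrounds, at the cost of speed; a statistics library would be the usual choice for genome-scale runs.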