Functional Transcriptome Analysis of Bladder Cancer Cell Lines Persistently Infected with Oncolytic Newcastle Disease Virus

Background Newcastle disease virus (NDV) has been an attractive virotherapy agent that targets various type of human cancers while leaving normal cells unharmed. Wild-type NDV strain AF2240 has been found to persistently infect subpopulation of cancer cells in vitro, making the cells less susceptible to NDV-mediated oncolysis. It is proposed that transcriptome profiling of NDV persistently infected bladder cancer cell lines will provide insights to understand such occurrence by identifying specific pathways associated with NDV persistent infection due to transcriptomic dysregulation. Results Transcriptome profiling revealed a total of 63 and 134 differentially expressed genes (DEGs) from NDV persistently infected TCCSUPPi and EJ28Pi bladder cancer cells relative to their uninfected controls, respectively. Of the 63 DEGs identified for TCCSUPPi cells, 25 DEGs were upregulated (log2 fold-change ≥ 0) and 38 DEGs were downregulated (log2 fold-change ≤ 0). These genes were significantly enriched in the molecular function of calcium binding (GO:0005509) and DNA-binding transcription repressor activity, RNA polymerase II-specific (GO:0001227) and the enriched important upregulated pathways were mainly heme metabolism, TGF-beta signaling and spermatogenesis. As for EJ28Pi, 55 DEGs were upregulated (log2 fold-change ≥ 0) and 79 DEGs were downregulated (log2 fold-change ≤ 0). These DEGs resulted in significantly enriched molecular function such as protein domain specific binding (GO:0019904) and RNA polymerase II regulatory region sequence-specific DNA binding (GO:0000977). The enriched important upregulated pathways were allograft rejection, KRAS signaling up and interferon gamma response. Other important pathways that were downregulated in both the NDV-persistently infected cell lines were angiogenesis, apoptosis, and xenobiotic metabolism. Conclusion The transcriptome profiles (RNA-Seq) of these cell lines suggest that evasion of apoptosis and increase in TGF-beta signaling and interferon gamma response activities are crucial for establishment of NDV persistent infection in bladder cancer cells. Findings from this study provide the molecular basis that warrant further study on how bladder cancer cells acquired NDV persistent infection. Resolving the mechanism of persistent infection will facilitate the application of NDV for more effective treatment of bladder cancer.


Introduction
Newcastle Disease Virus (NDV) is an oncolytic virus belonging to a promising novel class of virotherapy agent against human cancers while leaving normal cells unharmed [1,2]. NDV is a member of the new genus Avulavirus within the family Paramyxoviridae with a nonsegmented negative-stranded RNA genome of about 15.9 kb long [3,4] that contains six genes [5], which encodes for at least eight proteins [6,7]. The six essential structural proteins: nucleoprotein (NP), phosphoprotein (P), and the large polymerase protein (L) form the nucleocapsid [8,9] while haemagglutinin-neuraminidase (HN) and fusion protein (F) attached in the lipid bilayer of membrane in the external envelope [10,11], and inner layer of the virion envelope is formed by the matrix protein (M). The HN protein mediates the binding of virus to cell surface receptors through its haemagglutinating activity and it also displays neuraminidase activity [10,11]. The F protein is responsible for fusion between the viral envelope and the target membrane. NDV can also be grown in cell culture which causes cell fusion, resulting from the co-expression of the F and HN proteins that are responsible for viral infectivity and virulence [12,13]. The two additional non-structural proteins, V and W are formed by the RNA editing process during P gene transcription [14]. Studies have also discovered that the V protein plays an important role in preventing interferon responses and apoptosis in chicken cells, but not in human cells. These species-specific effects of V protein make it a determinant of host range restriction [6,14]. The W protein also play a role in the viral replication and pathogenesis [14].
NDV has been shown to have higher oncolytic specificity against most cancer cell types as compared to normal cells [15]. Once it is inside the cancer cell, the virus replicates and has the propensity to infect the surrounding cancer cells. The virus also stimulates the host anti-cancer immunological system [16,17]. Different strains of NDV such as 73-T, MTH68, Italian, Ulster, and HUJ have shown to exhibit an oncolytic activity [17][18][19][20]. In addition, the oncolytic effects of two Malaysian strains of NDV, AF2240 and V4-UPM, have also been studied in vitro and in vivo on several human tumour such as breast cancer [2], astrocytoma cell lines [21], and leukemia [22]. However, the wild-type NDV AF2240 has been found to persistently infect subpopulation of colorectal cancer cells that exhibit a smaller plaque morphology and resist NDV-mediated oncolysis [23]. Thus, persistency of infection poses a potential problem in maximizing the efficacy of this virus for the use of cancer treatment. Although, wild-type NDV has been utilized in clinical trials [24][25][26][27][28], the risk of persistent infection still remains unchanged [29]. There is currently no report on the dysregulated transcripts signature of bladder cancer cell lines persistently infected with oncolytic Newcastle disease virus to understand the molecular basis underlying the development of persistent infection. Thus, understanding the mechanisms by which cancer cells acquire persistent infection and reversing these mechanisms will be crucial in ensuring the clinical efficacy of NDV for cancer treatment.
To explore the molecular mechanism that supports the establishment of NDV persistent infection in bladder cancer cell lines, we employed whole transcriptome sequencing (RNA-Seq) on the persistently infected cell lines by next-generation sequencing (NGS) technology to compare gene expression patterns in these cells and that of the untreated control cells to enable identification of dysregulated genes and signaling pathways associated with persistent infection. The findings from this paper suggest that evasion of apoptosis and increase in TGFbeta signaling and interferon gamma response activities are crucial for establishment of NDV persistent infection in bladder cancer cells.

Exploratory analysis of the RNA-Seq data
Sample distance was explored to evaluate similarities and dissimilarities between samples

Global changes of transcript expression in persistently infected cell lines
To identify the differentially expressed genes (DEGs) that accumulate due to NDV persistent infection in TCCSUP and EJ28 cell lines, gene expression levels were quantified based on fragments per kilobase of exon per million fragments mapped (FPKM) [31] with threshold (qvalue) of false discovery rate (FDR) set at 0.05 to judge the statistical significance of the differentially expressed transcripts in the samples. Only transcripts with log2 fold-change  0 or log 2 fold-change  0 and q-value 0.05 are considered differentially expressed genes (DEGs) between persistently infected cells and non-infected control cells. Any genes detected in either non-infected control cells or persistently infected cells based on the above set criteria were considered as uniquely expressed.
Based on FPKM method calculated from DESeq2 package [32], a total of 197 genes were identified in response to NDV persistent infection in TCCSUP and EJ28 cell lines.  Fig. 4C).
The number of downregulated genes was found to be higher than upregulated genes when comparing persistently infected cells with their control (Fig. 2A). In contrast, the comparison between EJ28Pi and TCCSUPPi cells showed that the number of upregulated genes were higher than downregulated genes ( Fig. 2A). Furthermore, EJ28Pi had the highest number (78) of DEGs while TCCSUPPi had the lowest number (7)  The number of genes that was altered due to persistent NDV infection in EJ28Pi cells were found to be greater by two-fold than those observed in TCCSUPPi cells with only 0.51% number of co-regulated gene (UPK3BL1). Although UPK3BL1 gene is found to be coexpressively regulated in both the NDV persistently infected cell lines, it was observed to be upregulated in TCCSUPPi but downregulated in EJ28Pi. Thus, the finding suggests that there is a great difference in transcripts expression between NDV persistent infection in TCCSUPPi .05) and absolute value of the log2 ratio >2). The horizontal bar above the dendrogram represents untreated (turquoise blue) and treated (dark peach) samples. Level of gene expression was indicated by a range between red (high expression) and blue (low expression).

Functional annotation of the DEGs
To further investigate the biological significance of the identified putative differentially expressed genes (DEGs) in the two-way comparisons of the libraries from EJ28Pi vs TCCSUPPi, TCCSUPPi vs TCCSUP, and EJ28Pi vs EJ28, statistically significant differentially expressed transcripts with p <0.05 were used to test against the background set of all genes with GO annotations in metascape [33]. A total of 55 GO terms were discovered in  Table   3).
While for EJ28Pi vs EJ28, a total of 44 GO terms were discovered and also respectively     5A) and that could be the reason why NDV persistently infected cancer cell lines maintains their normal growth and proliferation. It is well known that NDV induces apoptosis and kills cancer cells through activation of apoptotic signaling pathways thus it is an exciting discovery that this pathway is downregulated in the established persistently infected cell lines. Note: The arrow up () indicates that the pathway is upregulated in TCCSUPPi and the arrow down () indicates that the pathway is downregulated in TCCSUPPi.

Calcium ion binding genes in TCCSUPPi
Two classical and important cadherin superfamily genes were significantly expressed in NDV persistently infected TCCSUPPi cells (FDR <0.05). These genes are cadherin 2 (CDH2) and cadherin 5 (CDH5) with CDH2 upregulated at 0.92 fold and CDH5 downregulated at -2.21 fold. Both cadherin 2 (CDH2) and cadherin 5 (CDH5) are protein-coding type of genes and are calcium-dependent cell-cell adhesion molecules that encode five extracellular cadherin repeats known as transmembrane region. They allow cells to make homophilic interactions and play a vital role in endothelial adherens junction assembly and maintenance, making the cells more connected to the actin cytoskeleton through formation of alpha-catenin [36]. Moreover, these key genes have been reported to be involved in tissue morphogenesis [37,38] and their misexpression is implicated in human cancer [39][40][41]. Thus, it is likely that upregulation of CDH2 may be associated with increased interconnection between the persistent cells leading to the maintenance of cellular structure, cell mobility and intercellular transport [42], and downregulation of CDH5 may lead to reduced calcium ion binding and intercellular junctions disorganization in persistent TCCSUPPi.

Repression of stress proteins and transcriptional genes in TCCSUPPi
Genes involved in heat shock proteins and homeobox subfamilies were revealed to be downregulated in NDV persistent TCCSUPPi cells. The downregulated genes found for heat shock includes HSPA1B, HSPA4 and HSPB7, and those of homeobox family were HOXA13 and HOXB13 (Supplementary Table 5). Expression of heat shock protein family A (Hsp70) member 1B (HSPA1B) and, heat shock protein family A (Hsp70) member 4 (HSPA4) as well as heat shock protein family B (small) member 7 (HSPB7) were reported to be frequently overexpressed in bladder cancer cell lines [43,44]. This is due to the remarkable differences in bladder cancer expression compared with normal urothelium [45], leading to an intense investigation and suggestion for biomarkers of bladder cancer treatment response and prognosis [46]. Thus, since the heat shock proteins were found to be significantly downregulated (FDR <0.05) in TCCSUPPi cells, it can be said that NDV may be utilized to specifically target those family genes to enhance therapeutic strategies for patients with bladder cancer.
Additionally, homeobox transcription factor gene family such as HOXA13 and HOXB13 were also shown to be downregulated in persistent TCCSUPPi cells. These genes are members of homeobox family which encode the homeodomain proteins that play an important role in Mullerian duct development, gut and urogenital tract [47,48]. Repression or expression of HOX genes have both been reported to be linked with cancer where they act as either tumour suppressor or proto-oncogenes [49,50]. Mutations of HOXA13 gene have been reported to be associated with abnormalities in mice and a serious uterine development effect in human.
HOXA13 has been described to have the same regional level of expression in both adult and embryo [47,51]. Both HOXA13 and HOXB13 genes act to provide protein that bind to a specific region of DNA and co-regulate functional activity of other genes [52]. The proteins produced by these set of genes are called transcription factors that are often referred to as homeobox protein family. Many RNA viruses (such as micro RNA virus, influenza, vesicular stomatitis virus, rhabdovirus etc.) were reported to inhibit host cell gene expression [53,54] with NDV inhibiting transcription inhibitor genes [55]. Based on this repression of transcriptional and stress protein genes in TCCSUPPi cells, it can be hypothesized that NDV requires reduction of mRNA output of cells for its effective replication in cancer cells.

Novel apoptosis, cell cycle and immune-related genes in EJ28Pi
It is established that infection of cancer cells with NDV trigger an innate immune response that leads to apoptosis and cell death, making NDV an efficient oncolytic mediator. In this study, four novel apoptosis related genes, including etoposide-induced gene 24 (EI24), epithelial membrane protein 2 (EMP2), tumour protein p53 (TP53), and prostaglandin E synthase (PTGES) were revealed to be significantly downregulated in EJ28Pi cells (FDR <0.05) (Supplementary Table 6). EI24 gene is known to suppress cell growth by inducing apoptosis through activation of caspase 9 and mitochondrial pathways [56,57]. EMP2 gene encodes tetraspan protein of PMP22/EMP family and functioning to signaled cell death and apoptosis [58], overexpression of this gene has resulted in cancer progressions in various types of tissues [59,60]. TP53 gene is activated in response to cellular stress to help in regulating the expression of the target genes by inducing cell cycle arrest, DNA repair and apoptosis [61] while PTGES is reported to induce proinflammatory cytokine interleukin 1 beta (IL1B) and involve in inducing apoptosis via TP53 gene activation.  Table 6).

Solute carrier and potassium voltage-gated channel genes in EJ28Pi
Three solute carrier superfamily genes (SLC) including solute carrier family 1 member 5

Validation of RNA-Seq data by qPCR
To validate the expression of RNA-Seq data obtained from the persistent TCCSUPPi and EJ28Pi cells, three genes with annotations from statistical analysis of RNA-Seq were randomly selected from the TCCSUPPi RNA-Seq data and two genes from EJ28Pi RNA-Seq data. The selected genes are HOXA13, HOXB13 and OTX1. RT-qPCR analysis showed similar expression patterns to that observed in RNA-Seq analysis of TCCSUPPi cells (Fig. 6A) (Fig. 6B). These results demonstrated that the change in gene expression deduced by RNA-Seq reflected the actual transcriptome differences between persistently infected bladder cancer cell lines (TCCSUPPi and EJ28Pi) and that of the uninfected control cells (TCCSUP and EJ28). Thus, confirming that the NGS data was accurately quantified and valid for inferring biological significance. .

Discussion
To better understand the mechanisms by which bladder cancer cell lines acquire NDV persistent infection, we employed RNA sequencing to profile the cellular transcriptome of the NDV-persistently infected TCCSUPPi and EJ28Pi cell lines compared to their uninfected controls. Our results indicated that there were considerable differences in the level of gene expressions between persistently infected bladder cancer cell lines (TCCSUPPi and EJ28Pi)

A B
and their respective uninfected controls (TCCSUP and EJ28), and that several genes were uniquely differentially expressed in the NDV-persistently infected cells. We were able to discover more than (>190) differentially expressed genes (DEGs) associated with NDV persistent infection when compared with previous study using microarray where there were approximately more than (>300) DEGs identified [64]. This could be due to the use of different mRNA quantification techniques and variation in the use of NDV dose. In addition, the high number of gene expression identified in persistent EJ28Pi relative to those identified in persistent TCCSUPPi cells (Figure 2A & B) could be attributed to the fact that each bladder cancer cell line contains a heterogenous subset of cells that respond differently to NDVmediated oncolysis.
Our findings on persistently infected TCCSUPPi corroborate with previous report [64] where some genes such as CDH2 and CDH5 cadherin superfamily are involves in calcium binding function. These key genes have been reported to be involved in tissue morphogenesis [37,38] and their dysregulated expression have been implicated in human cancer [39][40][41]. CDH2 is often referred to as cadherin switching encoding N-cadherin and is known to regulate many biological processes [65,66]. Our results reveal that this gene is upregulated in persistent TCCSUPPi cells and is consistent with the previous findings that reported CDH2 gene is to be commonly upregulated in bladder cancer [67]. Moreover, CDH2 gene is found to play a significant role in the transition between epithelial and mesenchymal cells (EMT) [68]. On the other hand, CDH5 gene plays a role in the morphology of a vascular muscle and stability, and is also reported to be overexpressed in aggressive human cutaneous melanoma cells as compared to nonaggressive melanoma [69]. Herein, we found that this gene was significantly downregulated in persistent TCCSUPPi cells making the cells less aggressive because repression of CDH5 gene is associated with lack of vascular-like formation networks [70], implying that dysregulated expression of this gene in persistent TCCSUPPi cells may disrupt the formation of vasculogenic networks as observed in melanoma.
Other important genes that were implicated in TCCSUPPi are stress proteins such as HSPA1B, HSPA4 and HSPB7 as well as transcription factor genes comprising of HOXA13 and HOXB13, respectively. Heat shock or stress proteins are expressed in normal cells however their expression are upregulated when triggered by cellular stressors such as cytotoxic agents and hypoxia [45]. These set of genes are overexpressed in various types of cancer including bladder cancer cells. Their varied expression in bladder cancers compared to normal bladder tissue has led to the investigation as being a cancer biomarker. Contrary to this, we observed a significant repression of these heat shock family proteins (HSPA1B, HSPA4 and HSPB7) in the established NDV persistent TCCSUPPi cell line, demonstrating that NDV has potential application for bladder cancer treatment strategies by targeting these class of heat shock proteins and may serve synergistic effect with some of the current therapeutic measures of bladder cancers such as chemotherapy and radiation therapy. Moreover, HOXA13 and HOXB13 homeobox transcription genes were also repressed in NDV persistent TCCSUPPi cells. Repression or overexpression of HOX genes have both been reported to be linked with cancer where they act as either tumour suppressor or proto-oncogenes [49,50]. HOXA13 and HOXB13 genes provide protein that bind to a specific region of DNA and co-regulate the functional activity of other genes in cells [52]. Therefore, we hypothesize that repression of transcriptional and stress proteins genes in TCCSUPPi cells may be required for NDV to effectively replicate and continue its cycle within the host cells. Further study needs to be carried out to unravel the relationship of NDV and repression of these genes in bladder cancer cells.
On the other hand, the DEGs found to be implicated in EJ28Pi were genes associated with apoptosis such as EI24, EMP2, TP53 and PTGES, cell cycle and growth such as MNT, NDRG4, ZNF346, FGF14 and FGF9, immune related genes that included FASLG, SP1 and STAT6, as well as solute carrier family SLC1A5, SCL43A3, and SLC48A1 and potassium voltage-gated channel genes comprising of KCNJ10, KCNJ12 and KCNJ5. NDV is reported to mediate its oncolytic activity by inducing apoptosis in many types of cancer cells [18,[71][72][73] and induction of apoptosis in cancer may lead to upregulation of various kind of apoptosis-related genes [74,75] and apoptosis-related pathways [76]. Contrary to our findings, we observed that apoptosis- On the basis of functional annotation analysis of the DEGs identified in persistent TCCSUPPi cells, we found that the significantly enriched GO terms were response to unfolded protein and topologically incorrect protein for biological processes, collagen-containing extracellular matrix and endoplasmic reticulum for cellular component, calcium ion biding and DNAbinding transcription repressor activity, RNA polymerase II specific for molecular function.
The results of GO term for biological process and cellular component in our findings is consistent with the previous studies which reported that NDV induces autophagy to favour its replication in cell via activation of cellular stress that lead to interference with endoplasmic reticulum (ER) function resulting in unfolding of proteins in ER, a term known as ER stress [77,78]. Unfolded protein response is a process by which ER employs certain mechanism to prevent cell death [79]. Recent findings have demonstrated that this event may be associated with response to autophagy which plays a vital role in response to stress stimuli like oncolytic viruses such as NDV [80,81].
On the other hand, the identified DEGs in persistent EJ28Pi cells were significantly enriched with the GO terms such as negative regulation of cell proliferation and cell cycle check point for biological processes, membrane region and membrane raft for cellular component, protein domain specific binding and RNA polymerase II regulatory region sequence-specific DNA binding for molecular function. Viruses either directly damage the cell's DNA or disrupt basic mechanism of cell division such as the segregation of chromosome and DNA replication [82].
This usually results in the activation of cell cycle checkpoints, a mechanism implemented by cells to repair damage or commit suicide when the cell's damage fails to repair. In the present study, we found that cell cycle checkpoint is activated as part of the biological event occurring due to NDV persistent infection in EJ28 bladder cancer cell line. Studies have localized the presence of viral structural proteins in both the membrane region and membrane raft of the host cell and demonstrated their contribution for virus entry, assembly and budding [83]. In NDV persistent EJ28Pi cells, the cellular involvement of membrane raft is obvious, suggesting that NDV may be using the membrane raft component of the host cell in facilitating its replication and infectious cycle processes to persistently infect the EJ28 cells through protein domain specific binding and RNA polymerase II regulatory sequence-specific DNA binding.
It is interesting to note that the enriched GO terms in persistent TCCSUPPi cells were entirely different from those found in persistent EJ28Pi cells. This suggests that NDV persistent infection in different bladder cancer cell lines involve different sets of biological processes, cellular component and molecular function for each individual cell line. However, the exact mechanism for this difference is unclear and may need further investigation.
In terms of the disrupted molecular pathways that were enriched due to NDV persistent infection in these two bladder cancer cell lines, 15 pathways were dysregulated in persistent TCCSUPPi cells with 9 upregulated and 6 downregulated pathways. The top three most enriched upregulated pathways in this cell were heme metabolism, TGF-beta signaling and spermatogenesis while the downregulated pathways were estrogen response late, complement, and hypoxia. These annotated pathways provided valuable insight in revealing the molecular cascades interaction for TCCSUP bladder cancer cell line to develop NDV persistent infection.

One important pathway involved in NDV persistent infection in TCCSUP cells is TGF-beta
signaling, which is an essential pathway that regulates many cellular processes such as cell growth, proliferation, differentiation and migration [84], and dysregulation of this pathway contributes to various types of pathologies that includes cancer. It is reported that enrichment of TGF-beta signaling is a result of an induced action of epithelial-mesenchymal transition (EMT) that facilitates tumour migration and invasion [85]. Moreover, TGF-beta signaling is associated with release of mitogenic growth factor that stimulate proliferation and survival of cancer which correlates with cancer angiogenesis, and metastasis [85]. Therefore, data from Indeed, NDV is known as a potent stimulator of type I interferon but defect in interferon signaling enhances NDV replicative ability in cancer cells and this has been the basis for its selective replication in cancer. Evidence of proinflammatory cytokine particularly type I IFN production upon NDV infection in both normal and cancer cells has been document [86][87][88][89]. However, differences in the level of proinflammatory cytokine between cancer cells and normal tissue as well as their role in NDV-mediated oncolysis is not clearly understood. Taken together, twelve (12) [90][91][92] and these genes were implicated in bladder cancer progression [93,94] and poor prognosis [95]. The different expression of HOXA13, HOXB13 and OTX1 genes between TCCSUPPi and TCCSUP as well as HOXB13 gene between EJ28Pi and EJ28 suggest that regulatory mechanism of these genes varied between these cell lines.
Findings from this study reflected the actual differences in expression profiles computed by RNA-Seq between NDV persistent bladder cancer cell lines and their uninfected control cells.

Conclusion
We employed RNA sequencing analysis to identify commonly dysregulated transcriptomic profiles and disrupted molecular pathways in NDV persistently infected bladder cancer cell lines. Through this analysis, we were able to identify several important dysregulated genes as well as pathways that seems to be the key players in development of NDV persistent infection. We

Cells and virus
Two human bladder cancer cell line, TCCSUP and HTB-5, were purchased from the European Malaysia (UPM) and was propagated in a specific pathogen free (SPF) embryonated chicken eggs of 9 to 11 days old at 37∘C for 72 hours as previously described [23]. Allantoic fluid were harvested from the eggs and presence of virus was confirmed by haemagglutination assay [96] while virus purification was based on previously established protocol [97]. The resulting virus pellet were stored in -80  C freezer until use.

Establishment of NDV persistent infection
Persistently infected cells were developed as described previously [23]. The cells were allowed to grow until sub-confluent monolayer was formed. The newly established NDV persistently infected bladder cancer cell line was designated as TCCSUPPi.

RNA extraction and quality control
Total RNA was extracted for each of the persistently infected cells as well as the cell lines without NDV infection (negative control) using Qiagen RNeasy micro kit (Qiagen, Germany) following the manufacturer's protocol. Genomic DNA was removed from the RNA samples

MultiQC analysis
The FASTQ file for the raw sequence data was subjected to quality control checks (QC) using the FastQC software [98] and the verified sequence data has been submitted to NCBI Gene Expression Omnibus (GEO) with accession number GSE140902. The dataset was available at https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE140902. The results and the log files (output) generated by FastQC were collected and applied to MultiQC to have an overall summary statistics of the data in a single standalone HTML report file while saving the directory of the remaining parsed data for further downstream application [99].

Alignment of the sequencing read to a reference
The FASTQ mapping was performed using HISAT2 that is fast and suitable for splice site alignment [100]. In brief, reference human genome file sequence in FASTA format was obtained from Ensembl genome browser [101] and the sequences were used to map each of the paired-end reads to the hg19 human reference genome by using HISAT2. The mapping rate for the samples were above 95% and shows high mapping rate. Mapping rate were summarized by type to gain an insight into which samples were properly mapped and which was not. Reads that mapped to the hg19 human reference genome was used to quantify the transcriptome assembly using StringTie [102].

Gene expression analysis
Quantification of the transcripts abundance was carried out using Salmon's quasimapping strategy which counts the number of reads that is mapped to a given transcript [103] calculated in fragments per kilobase of gene model per million mapped (FPKM) of the coding genes in each samples. Differential transcripts analysis (DEGs) was carried out using DESeq2 package [32] via tximport in R environment.

Functional and pathway enrichment analysis
Functional enrichment analysis of the differentially expressed genes (DEGs) was carried out using metascape (http://metascape.org/gp/index.html), an online platform for comprehensive gene annotation analysis of omics data [33]. Gene ontology (GO) analysis of the DEGs was ranked by enrichment score of 1/log10 p value and curated in MS Excel. Hallmark pathways enrichment analysis was performed using FGSEA R-package for fast preranked gene set enrichment analysis (GSEA) [104] and the background was set using Molecular Signatures Databases (MSigDB) [105]. The identified GO terms and hallmark pathways with a corrected p <0.05 were considered significantly enriched.

RT-qPCR analysis of RNA-Seq data
Total RNAs from both TCCSUPPi and EJ28Pi cells were used for RT-qPCR analysis with uninfected TCCSUP and EJ28 as negative controls. A subset of three genes (OTX1, HOXA13 and HOXB13) with annotations from RNA-Seq analysis were randomly selected for RT-qPCR analysis. In brief, each RNA sample was converted to cDNA using QuantiTect Reverse Transcriptase kit (Qiagen, Germany) following the manufacturer's protocol. The cDNA samples with Quantinova SYBR Green PCR kit (Qiagen, Germany) were prepared in a final volume of 20 μL and then subjected to RT-qPCR using LightCycler® 480 instrument (Roche Diagnostics). qPCR reaction was performed under the following conditions: 95ºC for 2 minutes for pre-denaturing step, followed by 40 cycles of denaturation step at 95ºC for 5 seconds, annealing at 60ºC for 30 seconds, extension at 72ºC for 30 seconds and finished with a cooling step at 4ºC for 2 minutes. Primers for these three unigenes were purchased from Qiagen and each sample were performed in triplicates. The mRNA expression of all genes was quantified using 2 -CT method with housekeeping genes GAPDH and TBP.

Statistical analysis
Data analysis was performed using R/Bioconductor and Seaborn library in Python. Results of differentially expressed genes was visualised using EhancedVolcano package in R/Bioconductor. For gene expression studies, the fold change statistical significance was determined using One-way statistical analysis of variance (ANOVA). Statistical analysis details are described in the figure legends. All data are expressed as mean ± SD (Standard Deviation) (n = 3). To determine if datasets show a normal distribution, the Shapiro-Wilk test was used and P-values of less than 0.05 were considered as statistically significant.

Acknowledgement
This study was supported by the Ministry Energy, Science, Technology, Environment and Climate Change (MESTECC) Malaysia Flagship Fund, reference number: FP0514B0021-2(DSTIN).

Author contributions
A.V., U.A., S.C.C., D.M.C., S.L.C., S.A., and Y.K., designed the study. A.T performed the qPCR experiments. U.A. performed the wet and dry lab work as well as analysed the data. U.A wrote the paper. All authors reviewed the manuscript.

Availability of data and materials
All data supporting the results in this article are included in the present and supplementary files except the raw and processed sequencing data that have been submitted to the NCBI Sequence Read Archive with accession number: PRJNA543209 (https://www.ncbi.nlm.nih.gov/bioproject/PRJNA543209) and Gene Expression Omnibus (GEO) with accession number GSE140902 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE140902).

Ethics approval and consent to participate
Not application.

Consent for publication
Not applicable.