The Expression Pattern of miR-17, −24, −124 and −145 as Diagnostic Factor for Metastatic Gastric Cancer; a Lesson from Gastric Cancer Stem cells

Background Distant metastasis of Gastric Cancer (GC) causes more than 700 000 deaths worldwide. Cancer Stem Cells (CSCs) are a subpopulation of cancer cells responsible for aggressiveness and chemoresistance in clinical settings. MicroRNAs (miRNAs) emerge as important players in regulating self-renewal and metastasis in CSCs. Understanding the role of miRNAs in CSCs offer a potential diagnostic tool for GC patients. This study is aimed to identify miRNAs that target both stemness and metastasis in gastric cancer stem cells (GCSCs) and differentially expressed in metastatic GC patients as diagnostic biomarkers for GC metastasis. Methods We investigate the gene expression profile of patients using the GEO database and Rstudio software. To obtain the regulatory networks and miRNAs, the STRING and miRwalk database used. The gastric cancer tissues were obtained from Iranian National Tumor Bank (INTB) to validate the results. Results Our results indicated three important regulatory cores affecting the immune system’s regulation, tumor progress, and metastasis. Based on the bioinformatics results, four miRNAs miR-17-5p, miR-24-3p, miR-124-3p, and miR-145-5p, were selected, and their expression pattern was evaluated in 10 patients’ metastatic tumors compared to 10 nonmetastatic tumors by real-time PCR. The expression level of mir-17, −24, and −124 was upregulated about 8, 10, 60 folds, respectively, and miR-145 was down-regulated 4.5 folds in metastatic tumors compared to nonmetastatic tumors. Conclusion the high expression level of miR-17, −24, −124, and low level of miR-145 in GC patients’ samples could be a potential biomarker for the presence of GCSCs and the diagnosis of metastasis.


Introduction
According to The Global Cancer Observatory (GCO), GC is the fifth common Cancer with 1033701 incidences per year and remains the third cause of death with 782 685 death worldwide [1]. The survival rate of GC patients decreases dramatically based on distance metastasis. The genetic signature, environmental factors, and cancer stem cells have been identified to contribute to GC metastasis. Among genes, the OPCML, RNASE1, YES1, ACK1, and all of the suppressor genes, have been reported to have an essential role in metastatic GC [2]. Moreover, several transcription factors (FOXO4, POU2F2, Snail, Twist) [3], cytokines and signaling pathways (inflammatory cytokines, transforming growth factor-β (TGF-β), tumor necrosis factor (TNF), mitogen-activated protein kinase (MAPK) signaling pathway, and PI3K/Akt/NF-κB/Snail) may affect on self-renewal and also metastasis of GCSCs [4,5]. Another regulator factors are small non-coding RNAs that bind the 3′UTRs of target mRNAs. They can regulate several biological pathways and cell features by targeting multiple components of signaling pathways in GC [6][7][8][9][10]. Our previous studies provided evidence related to miRNAs' regulatory role on CSCs and subsequently control EMT and metastasis in melanoma and breast cancer [11][12][13]. Therefore, in the present study, we aim to study the differentially expressed miRNAs to determine a signature for predicting metastasis in GC patients.
Different strategies were performed to select the molecules associate with initiation and metastasis tumors.
Among them, bioinformatics approaches and advanced technologies such as high-throughput sequencing systems such as R-sequencing and DNA microarray provide an enormous amount of data that can be used in different aspects of cancer biology such as diagnostics, cancer progress, and prediction [14]. Moreover, these bioinformatic tools provide opportunities to combine and merge other available datasets to gain a dataset with many samples and specific criteria. These approaches can be applied to correct the batch effect through standard normalization techniques such as gene fuzzy score (GFS), Combat [15].
This study aimed to (1) identify genes that can be biomarkers for GC metastasis. To this end, we combined two different datasets containing metastatic and nonmetastatic samples and detect the differentially expressed genes by bioinformatics tools. Further, we aimed to (2) determine the mechanism by which selected genes regulate GC metastasis and explore pathways involved in tumor growth, regulation of the immune system, and metastasis. Finally, we sought to (3) identify the possible key microRNAs contributing to stemness and metastasis in metastatic and nonmetastatic GC patients using a mirwalk database and validate them by real-time PCR.

Patients and samples
Twenty GC tissues, including 10 metastatic and 10 nonmetastatic samples, were used upon approval of the Iranian National Tumor Bank (INTB) and based on tumor bank regulations. According to local authorities, the Ethics Committee of the Tumor Bank of the Iranian Cancer Institute had obtained patients' approval.
All procedures in the present study were performed following the relevant guidelines and regulations of the Royan Institute for stem cell biology and technology and approved by the Institutional Review Board and Ethics Committee of the Royan Institute, Tehran, Iran (IR.ACECR.ROYAN.REC.1398.93). All participants signed a written informed consent form to enroll in the study. Patients' histopathological information, including tumor size and depth of invasion, lymph-vascular and perineural invasion, grade, and the clinical tumor/node/metastasis, was recorded and pathologically staged using the TNM staging method [16] at INTB (Supplementary Table 1). Preparation and preservation of the samples were based on Tumor Bank standard operating protocols, and all specimens were frozen using nitrogen vapor within 20 minutes after surgery.

Data Acquisition
The GEO database was used to select array datasets containing nonmetastatic (no invasion and migration) and metastatic GC samples. Subsequently, two gene expression profiles based on Affymetrix Human Genome U133A Array and Affymetrix Human Genome U133 Plus 2.0 Array platforms have been selected for more analysis. The GSE29272 contained 9 nonmetastatic and 17 metastatic samples in the GPL96 platform. The GSE57303 was including 8 nonmetastatic and 13 metastatic samples in the GPL570 platform.

Data Processing
Two expression profiles of GC with different platforms containing nonmetastatic and metastatic GC samples have been downloaded from the GEO database. In the first step, the Affy package has been used to read the data, and subsequently, the rma function is applied to normalize the datasets mentioned above.
In the following, gene symbols are replaced with probe IDs, and the two datasets with 13237 similar genes combined. The ComBat function from SVA and limma package is used to remove the batch effect and detect differentially expressed genes, respectively (Fig. 1a). Furthermore, to visualize data, the ggplot2, pheatmap, gplots, and msmsTests, packages have been used.

KEGG Pathways enrichments and Gene ontology
Kyoto Encyclopedia of Genes and Genomes (KEGG) is a useful database that uses the biological system's data to reveal the impact of genes on different concepts such as signaling pathways, diseases, and chemical compounds [17]. Gene ontology is a useful standard project which provides data concerning gene annotation and gene products. The data are available in three different categories cellular component (CC), molecular function (MF), and biological process (BP) [18]. Enrichr is a free source and freely available database to visualize and enrichments of unbiased gene list. Enrichr uses three types of assessment tests to evaluate the significant overlap between its gene library and the input list [19,20]. The data presented in this database are classified in different concepts; particularly for this study, we only used KEGG pathways analysis and Gene Ontologybiological process. P-value < 0.05 was selected as cut-off criteria for overrepresented terms.

Construction of PPI Network and Module Analysis
PPI network provides a large amount of information concerning the biological function or interaction of proteins in different cellular processes and signaling pathways. To obtain PPI networks for upregulated and downregulated genes, the STRING database (Search Tool for the Retrieval of Interacting Genes/Proteins) is used. STRING illustrate networks through different methods such as known experimental interaction, interactions observed in other organism or used a specific algorithm to predict interactions [21]. We used STRING software in Cytoscape and generated a PPI network for DEGs with a confidence score of more than 0.4 and the maximum number of interactors = 0 as cut-off criteria.

Real-time polymerase chain reaction (RT-PCR)
According to the manufacturer's instructions, total RNAs were extracted from nonmetastatic and metastatic tumor samples using TRIzol reagent (Qiagen). A spectrophotometer determined the concentration of the RNAs. 18S rRNA gene was used as the internal control to normalize the expression level of selected genes.
Primer sequences were listed in Supplementary Table 2.

MiRNAs validation by real-time PCR
MiRNAs were evaluated by using SYBR green qRT-PCR. In brief, 1ug of total RNA containing the miRNAs was polyadenylated by poly (A) polymerase and reverse transcript to cDNA by RT enzyme.
According to the manufacturer's instructions, the first-strand cDNA synthesis reaction was provided in the Zist Royesh kit (Iran). Each reaction was performed in a final volume of 10 μL containing diluted cDNA and PCR master mix, and all reactions were run in triplicates. According to the manufacturer's protocol, the qRT-PCR reaction was performed using Applied Biosystems real-time PCR instruments (ABI). The expression levels of miRNAs were normalized against internal controls U6 primer as a reference gene control.

Statistical analysis
Raw data were normalized by the rma package and log-transformed in RStudio. Only DEGs with adjusted P-value < 0.05 and fold change >1 considered differentially expressed genes in metastatic and nonmetastatic samples. We also used the Student's t-test statical test to assess the polymerase chain reaction results , and only genes and microRNAs with a p-value < 0.05 are considered significant. The Spearman's rank correlation test was used to identify the relation between the DEGs, genes, and microRNAs, and the correlations with P-value < 0.05 were considered statistically significant.

Identification of differentially expressed genes (DEGs)
To find differentially expressed genes between metastatic and nonmetastatic GC, we selected two datasets with GSE number GSE29272 including 9 nonmetastatic and 17 metastatic samples with GPL96 GSE57303 contained 8 and 13 nonmetastatic and metastatic samples, respectively with GPL570 platform. Affy and limma packages were used with adjusted P-value < 0.05 and |logFC| ≥ 1 as the cut-off to analyze the data.
We found 107 genes differentially expressed between two groups; 73 down-regulated and 34 upregulated

KEGG pathways analysis and biological process
To find the impact of down-or upregulated genes on the progression of GC, we investigate the Kyoto Encyclopedia of Genes and Genomes (KEGG) and biological process-gene ontology (GO) by the Enrichr (Supplementary Table 4A, 4B and 5A, 5B). The results indicated that down-regulated genes are primarily involved in the immune response to infectious including Epstein-Barr virus (EBV), staphylococcus aureus, influenza A, tuberculosis, cellular response to interferon-gamma, and inflammatory response. Moreover, they are responsible for cytokine-mediated signaling pathways, NOD-like receptor signaling pathways, and Th17 cell differentiation. In contrast, upregulated genes are primarily involved in steroid biosynthesis, Wnt signaling pathway, folate biosynthesis, positive regulation of ubiquitin-protein transferase activity, positive regulation of epithelial cell migration, and cellular response to epidermal growth factor stimulus (Table 1).
Moreover, according to Enrichr, the upregulated genes can be found in colorectal adenocarcinoma tissue, categorized in gastrointestinal (GI) cancers.

Chromosomal location
Classifying the chromosomal location of genes involved in complex diseases such as cancer would be a significant step toward understanding these molecular patterns. In this regard, we investigate the up-and down-regulated genes' location to detect the chromosomes that met the most change of gene expression during its progress from the nonmetastatic stage to the metastatic stage. According to table 2, it can be noticed that chromosomes 1, 11, 20, and X with 10, 7, 4, and 5 variations, respectively, were the main chromosomes that face changes in expression pattern during the development of GC from nonmetastatic to metastatic stages.

Construction of the Protein-Protein interaction network and identification of core networks
Protein-protein interaction (PPI) networks are critical components, which regulate a wide array of cellular processes. To deeply understand DEGs' effect in GC's progress from nonmetastatic to metastatic stages, Cytoscape-String plugin was used to generate PPI networks. In this regard, we previously published a list of genes that are involved in self-renewal and metastasis in GC patients (Supplementary Table 6) [22].
Therefore, we added the up-and down-regulated genes with the mentioned genes that contributed to stemness and metastasis in the Cytoscape-string plugin. Then genes that had a connection with metastatic and stemness genes at the same time were selected. Subsequently, we found two central cores for upregulated ( Fig. 2a, b) and one core for downregulated genes consisting of five genes from this category (Fig. 2c). The first core was containing LGR5 identified as one of the GCSCs markers. It is a critical gene that showed a connection with both stemness and metastatic genes.
Moreover, it had a connection with VILLIN1 as an essential player in metastasis (Fig. 2a). The second core was Ubiquitin Conjugating Enzyme E2 C (UBE2C), which is involved in the cell cycle, has connections with both stemness and metastatic genes, and can promote tumor progression (Fig. 2b). We found one pivotal core for down-regulated genes consist of five genes CXCL12, CAV1, CXCL13, CCL5, IDO1, that have connections with both stemness and metastatic genes (Fig. 2c). In this core, CXCL12, CAV1 is the top 2 genes that play vital roles in the regulation of the immune system, proliferation, and metastasis [23][24][25][26].
Moreover, by using Enrichr, we found that the mentioned core is involved in the chemokine signaling pathway, cytokine-cytokine receptor interaction, positive regulation of cell migration, and positive regulation of T cell migration.

Expression of Snail and Sox2 in nonmetastatic and metastatic tumors
We evaluated the expression of SNAIL I/II and SOX2 genes, which are known contributors in self-renewal and EMT pathways [5,[27][28][29][30][31], in 20 samples (10 nonmetastatic and 10 metastatic) using RT-PCR. All patients had gastric adenocarcinoma; age varied from 38 years old in both sexes. They were from different ethnicities in the Middle East. Notably, Azari patients had the highest frequency. All metastatic patients showed invasion and the tumor size more than T2 (Table 3). Analysis of RT-PCR showed significant upregulation of SNAILI/II by 1.64 and 5.32 times respectively, (P<0.01, Fig. 3a) in metastatic samples as well as SOX2 and HOXC6 by 7.25 and 1.16 respectively (P<0.02, Fig. 3b). These data suggested that all metastatic samples reveal evidence for stemness and EMT characteristics. However, it should be confirmed by further experiments.

MicroRNAs selection and expression level
To identify the cluster of microRNAs that can control the mentioned regulatory cores, the mirWalk database was used. All the core genes were placed in the mirwalk database, and as a result, four microRNAs, hasmir-17-5p, has-mir-24-3p, has-mir-124-3p, and has-mir-145-5p that were able to target different genes at the same time in the cores were selected for expression evaluation in tumors (Table 4). To confirm the predicted miRNAs, 20 samples, 10 nonmetastatic and 10 metastatic were set and followed with RT-PCR.

Correlation between mRNA-mRNA, miRNAs-miRNAs, and mRNA-miRNAs
At first steep, we investigated the correlation between the mRNAs and found that SNAIL1 has a positive correlation with SOX2 and HOXC6 (P.value < 0.02) (Fig. 4a). Then we found a positive correlation between miR-124-3p with miR-17-5p and miR-24-3p with a p-value under 0.001. There is also a positive correlation with a p-value of 0.003 between miR-17-5p and mir-24-3p (Fig. 4b). There was not any significant correlation between mir-145-5p with other miRNAs. Finally, our results indicated that miR-17-5p and SOX2, and miR-145-5p and SNAIL1 with P-value 0.001 and 0.04 have a positive and negative correlation, respectively (Fig. 4c).
Our results indicated that 107 genes were differentially expressed between 47 patients in two nonmetastatic and metastatic groups. Nineteen of 34 upregulated genes were involved in metastasis and were associated with the Wnt signaling pathway, positive regulation of epithelial cell migration, and cellular response to epidermal growth factor stimulus. Interestingly, most 71 down-regulated genes are involved in immune response and inflammation, including Epstein-Barr virus infection, which is one of the main reasons that cause GC, cytokine-mediated signaling pathway, and Th17 cell differentiation.
The present study's second goal was to explore the suggested mechanisms and signaling pathways by which selected genes regulate GC metastasis. In this regard, we constructed networks with differentially expressed genes along with the most important stemness and metastatic genes related to GCSCs. The results led us to three main cores of networks containing LGR5, UBE2C, CXCL12, CXCL13, CCL5, CAV1, and IDO1.
Leucine-Rich Repeat Containing G Protein-Coupled Receptor 5 (LGR5) among upregulated genes has an enormous effect on the tumorigenicity and metastasis through positive regulation NANOG, OCT4, SOX2, AICDA, PRRX1, TWIST1, and BMI1 genes and promote self-renewal in GC [43][44][45]. Also, co-expression of LGR5 with VIL1 induces metastasis of GC [46,47]. Ubiquitin Conjugating Enzyme E2 C (UBE2C) was another core of our constructed networks connected to essential genes like MYC, BMI1, TP53, CDKN1A CXCL8. Up-regulation of UBE2C plays an important role in anaphase-promoting complex/cyclosome, loss of control on spindle checkpoint signaling, promote ERK signaling pathway, activation of Wnt/β-catenin and PI3K/Akt signaling pathways, and increasing cell proliferation in GC. The high expression level of UBE2C is associated with poor prognostic in GC [48][49][50]. Also, it has been reported that the downregulation of UBE2C reduces metastasis through increasing E-cadherin expression [51]. Another core of our constructed network was contained CXCL12, CXCL13, CCL5, CAV1, and IDO1. The CXCL12, CAV1 were the top two hub genes in this network that also have been reported to have the leading role in proliferation and metastasis in GC through ERK/PI3K signaling pathway, PI3K/Akt/mTOR, inducing F-actin, and c-MET (Hepatocyte Growth Factor Receptor)/ STAT3-ZEB1 axis [52][53][54][55]. most of the down-regulated genes are identified as chemokines with important role in inflammation and mainly cell-cell adhesion and promoting EMT in GC [54][55][56][57]. A few studies are regards to the function of CXCL12, CXCL13, CCL5, CAV1, and IDO1 in GCSCs. In the present study, we found a considerable number of down-regulated genes that were connected to stemness and metastatic genes through CD44 and they contributed in different signaling pathways including NF-kappa B signaling pathway and TNF signaling pathway.
Finally, we sought to identify the possible key microRNAs contributing to stemness and metastasis in metastatic and nonmetastatic GC patients. Therefore, first, we evaluated SOX2, HOXC6, SNAILI/II, which can regulate metastasis and self-renewal in tumor samples. Then the expression of the selected miRNAs was assessed in 20 non-(metastatic) tumors. Interestingly, the results indicated the up-regulation of SOX2, HOXC6, SNAILI/II, and miR-17-5p, miR-24-3p, miR-124-3p metastatic samples that were associated with down-regulation of miR-145. It has been suggested that the overexpression of SOX2 leads to poor overall survival in breast cancer, pancreatic carcinoma, and also GC. SOX2 affects not only CSCs but has documented effects on chemoresistance [58][59][60][61][62]. On the other hand, in parallel with SOX2, the overexpression of SNAILI/II can increase the metastatic ability [63,64] and develop chemoresistance against genotoxic agents by suppressing the cell death pathway [65]. In the present study, a positive correlation was observed between SNAIL1 and SOX2, and even HOXC6. Interestingly, the role of HOCX6 in cell growth and survival of hepatocellular carcinoma and cervical cancer [66,67] has been reported previously. It can also induce drug resistance in oral and breast cancer cells [68,69] and promote metastasis and invasion by inducing MMP9 in GC and reducing overall survival [68][69][70].
Moreover, we found a positive correlation between miR-17-5p and SOX2. Previously, the up-regulation of mir-17-5p has been reported in the colon, prostate, liver cancer, and GC, leading to poor overall survival in patients. The up-regulation of mir-17-5p through activation of STAT3 can eventually induce cell proliferation and metastasis [71][72][73][74][75].
Another finding of the present study was the up-regulation of mir-24-3p in metastatic gastric that can be a novel microRNAs with an essential role in the progression and metastasis of GCs. It has been reported that miR-24-3p induces immortality and cell proliferation in breasts by targeting p27Kip1 and even promoting metastasis and invasion of bladder cancer [76,77]. Moreover, miR-24-3p can induce proliferation, metastasis, and invasion in hepatocellular carcinoma and lung cancer by targeting SOX7 [78,79].
Unlike other studies, we found up-regulation mir-124-3p in metastatic GCs. However, it has been reported as a tumor suppressor effect in breast cancer, bladder cancer, and osteosarcoma by targeting SPHK1, CBL, AURKA, and STAT3 [80][81][82][83]. It is also declared that a low level of mir-124-3p is related to the high level of metastasis and poor survival in GC patients [84]. Searching the validated targets of mir-124-3p by Enrichr showed that these genes are involved in Proteoglycans in Cancer, PI3K-Akt signaling pathway, and VEGF signaling pathway. There was a positive correlation between mir-17-5p, mir-24-3p, and mir-124-3p in the present study.
MiR-145-5p was the last miRNAs in the present study that was down-regulated. Its expression had a negative correlation with SNAILI in metastatic GCs. In parallel with our results, miR-145 prevents metastasis by targeting EMT markers such as ZEB2 and N-cadherin and cell proliferation by targeting COX-2 [85,86]. Furthermore, it is demonstrated that mir-145-5p can inhibit esophageal squamous cell carcinoma progression by targeting SP1/NFkB signaling pathway [87]. We found that mir-145-5p can affect MAPK signaling pathway, signaling pathways regulating pluripotency of stem cells, TGF-beta signaling pathway, cell cycle, PI3K-Akt signaling pathway, mTOR signaling pathway, and hippo signaling pathway.
In conclusion, we reported for the first time that up-regulation of miR-17-5p, miR-24-3p, miR-124-3p, and down-regulation of miR-145-5p could be associated with metastasis induction, self-renewal promoting, and GC progression. On the other hand, metastatic cancer tissues revealed the up-regulation of SOX2, SNAILI/II, and HOXC6 that could be an ideal marker for molecular pathology of metastatic GC.