Integrated Bioinformatics Analysis Deciphering the microRNA Regulation in Protein-Protein Interaction Network in Lung Adenocarcinoma

Background The architecture of the protein-protein interaction (PPI) network in any organism relies on their gene expression signature. microRNAs (miRNAs) have recently emerged as major post transcriptional regulators that control PPI by targeting mainly untranslated regions of the gene encoding proteins. Here, we aimed to unveil the role of miRNAs in the PPI network for identifying potential molecular targets for lung adenocarcinoma (LUAD). Materials and methods The expression profiles of miRNAs and mRNAs were collected from the NCBI Gene Expression Omnibus (GEO) database (GSE74190 and GSE116959). Abnormally expressed mRNAs from the data were appointed to construct a PPI network and hence incorporated with the miRNA-mRNA regulatory network. The miRNAs and mRNAs in this network were subjected to functional enrichment. Through the network analysis, hubs were identified and their mutation rate and probability of cooccurrence were calculated. Results We identified 17 miRNAs and 429 mRNAs signature for differentially altered transcriptome in LUAD. The combined miRNA–mRNA regulatory network exhibited scale-free characteristics. Network analysis showed 5 miRNA (including hsa-miR-486-5p, hsa-miR-200b-5p, and hsa-miR-130b-5p) and 10 mRNA (including ASPM, CCNB1, TTN, TPX2, and BIRC5) which expressively contribute in the LUAD. We further investigated the hub genes and noticed that ASPM and TTN had the maximum rate of mutation and possessed a high tendency of cooccurrence in LUAD. Conclusion This study provides a unique network approach to the exploration of the underlying molecular mechanism in LUAD. Identified mRNAs and miRNAs may therefore, serve as significant prognostic predictors and therapeutic targets.


Introduction
Lung adenocarcinoma (LUAD), the most common primary lung cancer, mainly manifests in the form of non-small cell lung cancer (NSCLC) (Myers and Wallen 2020). As with other tumors, lung adenocarcinoma shows heterogeneity by high rates of genetic mutation (Dong et al. 2020). Despite the availability of diverse technologies, such as stereotactic radiotherapy, targeted therapy, minimal invasion methods and immunotherapy, the long-term survival is still poor in LUAD (Ferrara et al. 2018). One of the main reasons for this could be a delayed diagnosis. Thus, it is necessary to understand the molecular mechanisms behind lung adenocarcinoma genesis and metastasis, and identify effective biomarkers of this disease.
MicroRNAs (miRNAs) are the 19-25 nucleotides long single-stranded non-coding RNAs, that regulate genes by mainly binding to the 3' untranslated regions (UTRs) of their target genes (Berindan-Neagoe et al. 2014). In recent, miRNAs were demonstrated to be significantly dysregulated in LUAD tissues and could be a potential target for this disease (Petkova et al. 2022;Han and Li 2018).
In the post-genomic era, biological networks offer exciting possibilities in understanding gene regulation and pattern recognition for autonomous disease identification. Protein-protein interaction (PPI) network is conventionally a static network, where proteins are represented by nodes and interactions by edges (Barabasi and Oltvai 2004). However, in reality, PPI network has dynamic property, which is intrinsically regulated by complex cellular mechanisms through time and space (Liang and Li 2007). Recent studies showed the significance of the miRNA regulation on protein-protein interaction networks (Rahimpour et al. 2022;Du et al. 2019;Lei et al. 2019;Liang and Li 2007). Moreover, different methods have been developed to predict miRNA targets accurately at the genomic level (John et al. 2004;Lewis et al. 2003). But much effort had not been given to understand the relationship between miRNA regulations on PPI in LUAD from a regulatory network approach.
Gene Expression Omnibus (GEO) is a public data repository of array-and sequence-based functional genomics data (Edgar et al. 2002). Owing to the advancement of high-throughput bioinformatics technologies, we aimed to re-analyze the mRNA (GSE116959) and miRNA (GSE74190) expression profile data sets in LUAD from GEO to unveil more specific molecular targets associated with the tumorigenesis of the disease. We sought to construct a regulatory network combining miRNA-mRNA and protein-protein interactions in LUAD. Our study presented an integrative bioinformatics data analysis that could be appointed for the improvement in the clinical consequences aided through endorsement of the potential biomarkers in LUAD.

Data
The transcriptomic data used in this publication were collected from the National Center of [miRBase release 9.1 miRNA ID version] platform used for the miRNA profile data.

Screening of Dysregulated miRNA and mRNA
In our study, GEO2R (http://www.ncbi.nlm.nih.gov/geo/geo2r/), an R based interactive web tool, was used to identify differentially expressed miRNAs and mRNAs between tumor and adjacent normal samples separately within each dataset (Barrett et al. 2013). A log fold change (logFC) >2 was set as a cut off for differentially expressed miRNAs and mRNAs. Probes without a corresponding gene symbol were filtered. The data were processed through the paired samples t-test and corrected using the Benjamini -Hochberg method. The adjusted P-value <0.05 cut off was applied. The log2 normalized expression values of differentially expressed miRNAs and mRNAs were employed for the complete unsupervised clustering using online tool MORPHEUS (https://software.broadinstitute.org/morpheus/). Further, Principal component analysis (PCA) and volcano plot of the same were performed using the R based online tool ClustVis(https://biit.cs.ut.ee/clustvis/) and the MATLAB v.R2018a respectively (Metsalu and Vilo 2015).

Protein-Protein Interaction Network Construction
The Protein-protein interaction (PPI) network was constructed and analyzed by using the STRING: functional protein association network database (https://string-db.org). The dysregulated mRNA signature in LUAD was set as the input and the highest confidence (0.90) of active interaction was chosen. The isolated nodes were hidden and supervised clustering (Kmeans) was performed for recognizing three clusters in the provided data set.

Development of miRNA-mRNA Expression Network
Interactions between dysregulated miRNAs and mRNAs were anticipated using miRWalk 3.0 (http://mirwalk.umm.uni-heidelberg.de/) (Sticht et al. 2018), which compiled the prediction results of both TargetScan (Agarwal et al. 2015) and miRDB (Chen and Wang 2020), and a maximum score (= 1) was considered for the prediction analysis in miRWalk. The miRNA-mRNA interaction network was incorporated with the PPI network for a better understanding of the underlying mechanism in LUAD. The Cytoscape v.3.8.0 (https://cytoscape.org/) was used for visualizing the regulatory network between miRNA-mRNA. Cytohubba, a cytoscape plug-in, was employed to predict top 5 miRNAs and top 10 mRNAs based on their degree. A network with these 5 miRNAs and their common target mRNAs was constructed and analyzed.
BoxPlotR (http://shiny.chemgrid.org/boxplotr/), an online tool was used to create violin plots for the degree centrality (DC), closeness centrality (CC), and betweenness centrality (BC) data of the nodes present in the developed interactive network.

Pathway Analysis
Pathway analysis was carried out for differentially expressed mRNA signatures in LUAD using the open access, manually curated, peer-reviewed pathway database Reactome (https://reactome.org/). A cut-off of p value < 0.05 was set to recognize the altered signaling pathways.

Screening and Overall Survival Analysis of the Hub Genes
The mutation rates and the tendency of cooccurrence of the top 10 hub genes were examined with the cBioPortal platform (http://www.cbioportal.org). Further, the overall survival plots for the hub miRNAs and mRNAs were plotted using the Kaplan-Meier plotter (https://kmplot.com/analysis/) (Gyorffy et al. 2013) to determine their prognostic characteristics.

Genetic Alteration in miRNAs and mRNAs Expression in LUAD
Our analysis revealed 17 miRNAs and 429 mRNAs to be differentially expressed in LUAD tissues (Supplementary Table S1

Interaction of Dysregulated Proteins in LUAD
The 429 differentially expressed mRNAs were assigned for the PPI network formation. The network exposed two major clusters with GNG11 (Guanine nucleotide-binding protein subunit gamma) and CCNB2 (G2/mitotic-specific cyclin-B2).
GNG11 was identified to be a key hub protein that interacted with 15 other proteins including (Membrane-associated tyrosine-and threonine-specific cdc2-inhibitory kinase), and CDCA5 (Sororin) (Supplementary Figure S1B). The details of the PPI network were provided in Supplementary Table S3.

miRNA-mRNA Interacting Network Analysis
The role of miRNA on PPI network in LUAD was investigated by constructing a miRNA-mRNA regulatory network and visualized using Cytoscape ( Figure. Table S4).
Degree of a node in a network implies the number of edges connected with the given node.
Moreover, the degree distribution can be defined as the probability distribution of degrees over the network. In this study, the degree of distribution of the nodes in the regulatory network was studied, and the power-law distribution was observed (Figure.   Figure. 2C-E). However, the scale-free distribution of the miRNA-mRNA network indicate the existance of functionally relevant nodes in the development of LUAD.

Functional Enrichment Analysis of the Dysregulated miRNAs and mRNAs
Gene ontology (GO) enrichment analyses for differentially expressed miRNAs and mRNAs were performed. We identified the top 5 most significant GO terms of each group.
Differentially expressed miRNAs were enriched in signal transduction, regulation of nucleic acid metabolism, on the BP level. On CC level, they were enriched in nucleus, cytoplasm and lysosome. Transcription factor activity, GTPase activity, protein serine/ threonine kinase activity and phosphatase activity were significant on MF level for dysregulated miRNAs ( Figure. 3A).
However, differentially expressed mRNAs were enriched in extracellular matrix (ECM) organization, caveola assembly, nitric oxide transport, on the BP level. Cell surface and extracellular regions were enriched on CC level for dysregulated mRNAs. On MF level, calcium ion binding, lipoprotein particle binding, heparin binding were noteworthy ( Figure. 3B).

Extracellular Matrix Organisation Was the Most Enriched Pathway in LUAD
Considering the signature 429 dysregulated mRNAs, a pathway enrichment analysis was done by using the Reactome pathway analysis database. Top 10 most enhanced pathways were shown ( Figure. Supplementary Table S6.

Hub Selection and Analysis
To identify the hub nodes in the network we obtained Cytohubba plug-in of Cytoscape. hsa-miR-486-5p, hsa-miR-200b-5p, hsa-miR-130b-5p, hsa-miR-183-5p, and hsa-miR-139-5p were the top 5 hub miRNAs with the degree 119, 110, 102, 97, and 85 respectively, in the network. Oncopoint analysis of the hub genes with the cBioPortal showed that TTN (38%) and ASPM (14%) have the exceptionally high genetic mutation rates in LUAD (Supplementary Figure   S2). We also analyzed the tendency for the cooccurrence of these hub genes (Supplementary Table S7). The result showed TTN and ASPM also had a high tendency of cooccurrence in LUAD with a log2 odd ratio 1.368 and p < 0.001.

Discussion
Understanding the elementary molecular mechanisms of LUAD development to the identification of the therapeutic targets could considerably improve patient prognosis. Large expression profiling data sets have demonstrated to be beneficial in determining the regimen for cancer treatment (Deb et al. 2020;Dong et al. 2020;Klaeger et al. 2017;Zhang et al. 2020).
In the post-genomic era, biological networks become a useful tool to understand gene Also, the higher CAV1 (Caveolin 1) expression, which leads to caveola assembly, induce filopodia formation, cell migration and metastasis in LUAD cells (Ho et al. 2002). In summary, our results are consistent with these previous reports.
We further analyzed the miRNA-mRNA regulatory network for identifying specific key genes and miRNAs related to LUAD. Among identified top 10 hub genes, ASPM had the highest degree at 20. ASPM gene, which involved in mitotic spindle regulation, is responsible for the coordination of cell cycle. A recent study by Deb et al. showed alteration of phosphorylation pattern in S425 position in ASPM targeted by Serine/threonine-protein kinase Nek2 (NEK2), is a common signature pattern for five cancer types including LUAD (Deb et al. 2020).Wang et al. assessed ASPM expression and the functional significance in LUAD and reported that increasing level of expression of ASPM was positively correlated with poor disease prognosis ). However, ASPM was also reported to be a potential biomarker candidate for bladder cancer (Saleh et al. 2020;Xu et al. 2019) breast cancer , and prostate cancer (Pai et al. 2019). CCNB1, another hub gene with second highest degree (19), an important mitosis initiator and is a key member of the cyclin family. This gene was previously reported as candidates for the development and prognosis of lung cancer (Arinaga et al. 2003;Soria et al. 2000;Wang et al. 2020), hepatocellular carcinoma (Huang et al. 2018;Long et al. 2019), breast cancer (Ding et al. 2014).In the present study, we found that ASPM and CCNB1 were significantly upregulated in LUAD samples compared to the paired adjacent normal samples. Furthermore, overall survival analysis showed that the LUAD patients having higher expression of these two genes (ASPM and CCNB1) related with the poor overall survival (Supplementary Figure S3A).
In other hand, hsa-miR-486-5p and hsa-miR-139-5p were under-expressed with degree 119 and 85 respectively. hsa-miR-486-5p, having highest degree in the network, could serve as a potential biomarker for LUAD. Tian et al. validated the significant downregulation of hsa-miR-486-5p in NSCLC tissue, serum and cell samples (Tian et al. 2019). Our data showed hsa-miR-486-5p regulated both ASPM and CCNB1 during the alteration of cellular activities in LUAD patients. In addition, Rui et al. reported a significant involvement of hsa-miR-200b-5p expression in SPC-A1 LUAD cells (Rui et al. 2010). High expression of hsa-miR-130b-5p was associated with the poor overall survival and act as an oncogenic miRNA in NSCLC patients (Hirono et al. 2019). In our study, hsa-miR-130b-5p regulated hub gene SPAG5, CENPF and TTN. The miRNA, hsa-miR-183-5p is involved in cellular migration and invasion (Zaporozhchenko et al. 2016). In contrast to many other studies, we found hsa-miR-183-5p to be upregulated in LUAD patients. However, it plays a significant oncogenic role in other cancer types including breast (Macedo et al. 2017), gastric (Cao et al. 2014), and pancreatic cancer (Lu et al. 2016). Another downregulated miRNA, hsa-miR-139-5p was reported to be associated with occurrence along with development of NSCLC and may serve as a tumor suppressor gene for LUAD (Yong-Hao et al. 2019). The overall survival plots of the hub miRNAs in LUAD were plotted using KM plotter (Supplementary Figure S3B).
We further screened the mutation rates of these 10 LUAD hub genes with cBioPortal platform.
TTN which is a key component in the vertebrate striated muscles assembly and functioning, has correlation with lung cancer upon missense mutation (Cheng et al. 2019).We observed that TTN was regulated by all five hub miRNAs, worth mentioning. A cooccurrence study among genes showed ASPM had significantly higher cooccurrence log odd ratios with CENPF, TTN, SPAG5, TPX2, and KIF2C. Hub gene KIF2C was reported as a regulator of cell signaling pathway, significantly contribute in LUAD progression (Bai et al. 2019).Survival analysis showed that a higher expression of TPX2, KIF2C is significantly related to the poor overall survival in LUAD patients (Ma et al. 2019) (Supplementary Figure S3A). CENPF, another critical hub gene in our network, is associated with the centromere-kinetochore complex and plays a vital role in tumor development in lung ), breast (O'Brien et al. 2007Sun et al. 2019), prostate (Shahid et al. 2018) and nasopharynx (Cao et al. 2010). BIRC5, which is overexpressed in our data, performed a role in apoptosis and cell proliferation ).CCNB2 which was essential for the regulation of the cell cycle at the G2/M (mitosis) transition, played a significant role in LUAD development Zhang et al. 2020). A current study by He et al. reported SPAG5 as "As emerging oncogene" for its overexpression in cancer types . SPAG5 was also reported to be upregulated in most of the LUAD cell lines (Kim et al. 2019), which further confirmed our analysis.
Nevertheless, these predictions ought to be validated exhaustively through the clinical experiments and accomplished execution in LUAD therapy and treatment.

Conclusion
From the above discussion it can be concluded that, it is a network-based comprehensive reinvestigation of the transcriptomic da ta-sets has been obtained from NCBI GEO. Our study aimed to unveil the role of miRNAs in PPI network of LUAD. miRNA-mRNA regulatory network has identified 5 hub miRNAs and 10 hub mRNAs which may serve as potential and miRNAs (B) in LUAD; Table S1:Dysregulated (logFC>2) miRNA signature in LUAD; Table S2: Dysregulated (logFC>2) mRNA signature in LUAD; Table S3: Details of protein-protein interaction network in LUAD; Table S4: Interactions of miRNA-mRNA regulatory network in LUAD; Table S5: Node details of miRNA-mRNA regulatory network in LUAD; Table S6: Proteins enriched in the extracellular matrix organization pathway;

Funding:
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.  showing that 17 dysregulated miRNAs targeting 321 dysregulated mRNAs. Yellow nodes in the regulatory network represent the mRNAs and blue nodes represent the miRNAs.B. Degree distribution of the regulatory network. C. The difference of degree centrality between miRNAs and mRNAs. The miRNA nodes showed an aberrantly higher degree centrality than mRNA nodes. D. The difference of closeness centrality between miRNAs and mRNAs. The miRNA nodes showed a significantly higher closeness centrality than mRNA nodes. E. The difference of betweenness centrality between miRNAs and mRNAs. The miRNA nodes had a significantly higher betweenness centrality than mRNAs. P-values were calculated based on Mann -Whitney U test.BoxPlotR (http://shiny.chemgrid.org/boxplotr/) was used to create violin plots for the degree centrality (DC), closeness centrality (CC), and betweenness centrality (BC) data of the nodes present in the developed interactive network.