Abstract
The success of plant-based medicine has been overlooked since the advent of the modern pharmaceutical industry. Despite this neglect by the multimillion-dollar drug industry, people in some parts of the world still rely entirely on medicinal plants. In this study, we return to these traditional medicinal practices to uncover their underlying mechanisms and to advance phytochemical-based drug development. We screened the Medicinal Plant Database Bangladesh 1.0 (MPDB1.0) and its ongoing extension, MPDB2.0, to find traditionally used medicinal plants and their active compounds. Here, mangiferin, extracted from Mangifera indica, is shown to interact with the cell cycle regulator Cyclin-dependent Kinase 4 (CDK4). CDK4 is differentially expressed in Glioblastoma multiforme (GBM), Brain Lower Grade Glioma (LGG), and Sarcoma (SARC). CDK4 expression is linked to patients’ survival rate, and its consistent expression across different stages makes it attractive as a diagnostic tool and drug target. Altogether, this study suggests that mangiferin, an active compound extracted from the mango tree, can work as a potential anticancer drug, and that leveraging recent advances in sequencing and gene expression data can accelerate phytochemical-based drug discovery.
Introduction
Cancer is expected to overtake all other non-communicable diseases to become the leading cause of death and the single most important barrier to increasing life expectancy in every country of the world in the 21st century (Bray, Ferlay, and Soerjomataram 2018). In 2015, the World Health Organization (WHO) estimated cancer to be the first or second leading cause of death before age 70 years in 91 of 172 countries, and the third or fourth leading cause in an additional 22 countries. According to the GLOBOCAN 2018 estimates, 18.1 million new cancer cases and 9.6 million cancer deaths occurred in 2018. In terms of incidence, lung cancer was the most commonly diagnosed cancer (11.6% of the total cases), closely followed by female breast cancer (11.6%), prostate cancer (7.1%), and colorectal cancer (6.1%) (Bray, Ferlay, and Soerjomataram 2018). The number of cancer patients in Bangladesh is estimated at between 1.3 and 1.5 million, with about 0.2 million new patients added each year (Uddin et al. 2016; Noronha et al. 2012). Although it draws on approximately five decades of systemic drug delivery and establishment, the repertoire of chemotherapeutic drugs that forms the standard cancer treatment still has intrinsic problems such as toxicity and limited efficacy (Desai et al. 2008). In the last decade, the identification of medicinal plants with significant cytotoxic potential useful for the development of cancer therapeutics has become a center of attention, with many areas still unexplored (Al-kalaldeh, Abu-dahab, and Afifi 2010). More than 1000 plant species have been identified with significant anticancer potential (Mukherjee et al. 2001). The isolation of the vinca alkaloid vinblastine (Balunas and Kinghorn 2005) from the Madagascar periwinkle, Catharanthus roseus G. Don. (Apocynaceae), bolstered the use of medicinal plants as a promising source of anticancer medication.
Vinblastine, in combination with vincristine and other cancer chemotherapeutic drugs, is used for the treatment of a spectrum of cancers such as leukemias, lymphomas, advanced testicular cancer, breast and lung cancers, and Kaposi’s sarcoma (Cragg and Newman 2005). The discovery of paclitaxel (Taxol) (Butler 2004) from the bark of the Pacific Yew, Taxus brevifolia Nutt. (Taxaceae), is further evidence of the success of natural product drug discovery. The use of various parts of Taxus brevifolia, from which paclitaxel was discovered, and of other Taxus species (e.g., Taxus canadensis Marshall, Taxus baccata L.) by several Native American tribes highlights the value of indigenous knowledge of medicinal plants (Cragg and Newman 2005). Another potent plant-derived active compound, homoharringtonine (Norman et al. 1985), was isolated from the Chinese tree Cephalotaxus harringtonia var. drupacea (Sieb and Zucc.) (Cephalotaxaceae) and has been used successfully in China in a racemic mixture with harringtonine for the treatment of acute myelogenous leukemia (Cragg and Newman 2005). Elliptinium, a derivative of ellipticine, isolated from the Fijian medicinal plant Bleekeria vitensis A.C. Sm., is marketed in France for the treatment of breast cancer (Cragg and Newman 2005).
With a distinguished heritage of herbal medicine for primary health care among the South Asian countries, Bangladesh is estimated to be home to thousands of species of medicinal plants. These native plants are a considerable source of Unani, Ayurvedic, and homeopathic medicines in Bangladesh (Gani et al., 2003), with their use rooted in folklore and the century-old knowledge of traditional medicine practitioners (Mohammed et al. 2010).
Unfortunately, a major portion of these plants has not yet been studied extensively in terms of chemical, pharmacological, and toxicological properties to explore their bioactive compounds, which may prove to be a rich source for new anticancer drug discovery (Khatun et al. 2014). A comprehensive database covering all endemic medicinal plants can serve as a foundation for future drug discovery. Extensive, curated databases of this kind already exist, such as IMPPAT (Indian Medicinal Plants, Phytochemistry And Therapeutics), which comprises 1742 Indian medicinal plants, 9596 phytochemicals, and 1124 therapeutic uses spanning 27074 plant-phytochemical associations (Mohanraj, Karthikeyan, Chand, et al. 2018). CMKb (Customary Medicinal Knowledge) is another such endeavor for storing, preserving, and circulating aboriginal Australian medicinal plant knowledge (Gaikwad et al. 2008). In this article, we use cheminformatic approaches to examine how several antineoplastic bioactive compounds found in indigenous Bangladeshi plants bind to proteins of cancer-related pathways.
We have identified mangiferin, a xanthonoid extracted from the bark and leaves of the mango tree (Mangifera indica), as a potential anticancer drug that targets the cell cycle regulator Cyclin-dependent kinase 4 (CDK4). This study leveraged curated information on the medicinal plants of Bangladesh to screen phytochemicals for potential anticancer drugs, and used large-scale RNA sequencing data to analyze CDK4 expression across cancer sub-types and cancer stages, its correlation with survival events, and its co-expression with other genes. This study is a step toward broadening our understanding of traditional plant-based medicine and bridging this ancient knowledge with cutting-edge, large-scale cancer datasets.
Materials and Methods
Database creation
MPDB 2.0 is the continuation of MPDB 1.0 (http://www.medicinalplantbd.net/), which contains information about medicinal plants from Bangladesh such as scientific name, family name, local name, utilized part, location, ailment, active compounds, and the PubMed ID of the related research article. To acquire this information, ∼75 research, survey, and review articles (published in both national and international journals until September 2019) were considered. As most of these articles lack information about active ingredients, we used the scientific names of these plants in a second round of PubMed searches to find reported active compounds extracted from them.
Identification of interacting proteins
The interactions between plant-based active compounds and human proteins were investigated using STITCH 4.0 (http://stitch.embl.de/), a web server focusing on the interactions between proteins and small molecules. STITCH (Search Tool for Interactions of Chemicals) integrates information about such interactions from metabolic pathways, crystal structures, binding experiments, and drug–target relationships (Kuhn et al. 2008).
Molecular docking
The 3-D structures of the proteins were obtained from RCSB PDB (http://rcsb.org), the single global archive for experimentally determined, atomic-level three-dimensional structures of biological macromolecules in PDB format (Rose et al., 2017). Chemical structures of the active compounds were obtained from PubChem (https://pubchem.ncbi.nlm.nih.gov), a public repository for information on chemical substances and their biological activities (S. Kim et al. 2016; Wang et al. 2010; 2012; 2009). The phytochemical compounds were docked against the proteins using PyRx (https://pyrx.sourceforge.io/), open-source software for virtual molecular screening that docks small-molecule libraries to a macromolecule in order to discover lead compounds with a desired biological function (Dallakyan and Olson 2015). To dock the targeted phytochemicals into the identified protein binding pockets (Sousa, Fernandes, and Joa 2006) and to estimate the binding affinities of the docked ligands, the molecular docking program AutoDock Vina (Oleg and J. 2011), as integrated in the PyRx virtual screening tool, was employed. Each protein PDB file was converted into a PDBQT file containing the protein atom coordinates, partial charges, and solvation parameters, and the ligand files (SDF) were likewise converted into PDBQT format (Saddala et al. 2016).
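As an illustration of how docking output of this kind can be summarized, the following minimal Python sketch extracts binding affinities from AutoDock Vina output. It assumes the docked poses were saved as raw Vina PDBQT text, in which each pose is preceded by a `REMARK VINA RESULT:` line whose first number is the affinity in kcal/mol. PyRx also reports these values in its results table, so this is not necessarily how the values in this study were collected. The function names and the sample values are hypothetical.

```python
import re

# Matches the first number (affinity in kcal/mol) on a Vina result line.
VINA_RESULT = re.compile(r"^REMARK VINA RESULT:\s+(-?\d+(?:\.\d+)?)")


def parse_vina_affinities(pdbqt_text):
    """Return the affinity of every pose found in Vina output text."""
    affinities = []
    for line in pdbqt_text.splitlines():
        match = VINA_RESULT.match(line)
        if match:
            affinities.append(float(match.group(1)))
    return affinities


def best_affinity(pdbqt_text):
    """Return the most negative affinity, or None if no pose is present."""
    affinities = parse_vina_affinities(pdbqt_text)
    return min(affinities) if affinities else None


# Placeholder output for illustration only; not data from this study.
SAMPLE_OUTPUT = """MODEL 1
REMARK VINA RESULT:    -9.6      0.000      0.000
ENDMDL
MODEL 2
REMARK VINA RESULT:    -9.1      1.852      2.407
ENDMDL
"""

print(parse_vina_affinities(SAMPLE_OUTPUT))
print(best_affinity(SAMPLE_OUTPUT))
```

Taking the minimum reflects the usual convention that a more negative affinity corresponds to a better predicted binding pose.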
Large-scale gene expression analysis
Over the last few years, cancer-related large-scale RNA sequencing data have become available through TCGA and GTEx (Consortium 2015; Lonsdale et al. 2013; Weinstein et al. 2013). These data have become more accessible through GEPIA (Gene Expression Profiling Interactive Analysis) (Tang et al. 2017) and its recent update, GEPIA2 (Tang et al. 2019). For gene expression analysis in cancer sub-types, the differential expression method LIMMA was used with a log2FC cutoff of 1 and a q-value cutoff of 0.01. In the stage-wise expression analysis, only major stages were considered. For the patients’ survival analysis, overall survival was used with a 95% confidence interval. Correlation analysis was based on the Pearson correlation test.
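The two cutoffs stated above can be read as a simple filter on per-cancer results. The Python sketch below shows that filter under stated assumptions: the fold change is compared by absolute value, both comparisons are strict, and the record type, function name, cancer labels, and numbers are hypothetical placeholders rather than values from this study. The actual computation was performed by the GEPIA2 web server.

```python
from typing import NamedTuple


class DiffExpr(NamedTuple):
    """One tumor-versus-normal comparison for a single cancer sub-type."""
    cancer: str
    log2fc: float   # log2 fold change, tumor relative to normal
    q_value: float  # multiple-testing adjusted p-value


LOG2FC_CUTOFF = 1.0   # cutoff stated in the text
Q_VALUE_CUTOFF = 0.01  # cutoff stated in the text


def is_differential(result, log2fc_cutoff=LOG2FC_CUTOFF, q_cutoff=Q_VALUE_CUTOFF):
    """True when the change is both large enough and statistically supported."""
    return abs(result.log2fc) > log2fc_cutoff and result.q_value < q_cutoff


# Placeholder records for illustration only; not results from this study.
placeholder_results = [
    DiffExpr("TYPE_A", 1.8, 0.001),   # passes both cutoffs
    DiffExpr("TYPE_B", 0.4, 0.0005),  # fold change too small
    DiffExpr("TYPE_C", -1.3, 0.2),    # q-value too large
]

significant = [r.cancer for r in placeholder_results if is_differential(r)]
print(significant)
```

Requiring both conditions guards against calling a gene differentially expressed when the change is large but noisy, or statistically clear but too small to matter.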
Statistical analysis and graphs
Statistical analyses and graphs were either generated by the web servers mentioned in the manuscript or, in other cases, produced with the statistical programming language R (3.6.0). R code is available from the authors upon request. Final figures were prepared in Adobe Illustrator (version 24.0.1) without compromising the details of the analysis.
Results
Anticancer compounds are prevalent among traditional plant extracts
We began by gathering information on traditionally used medicinal plants of Bangladesh from the Medicinal Plant Database Bangladesh (MPDB1.0) (Ashraf et al. 2014), which contains 353 plants, with known active compounds for 78 of them. We extended this search to include information on 2,349 additional plants, along with their known active compounds. The extended dataset is referred to as MPDB 2.0 (unpublished) in this manuscript. Among the resulting 2,702 traditionally used plants, we searched for published active compounds reported to play a role in regulating diabetes, respiratory disorders, jaundice, cancer, diarrhea, skin disease, and other ailments. Interestingly, the highest number of active compounds, 42, was reported for anticancer activity (Figure 1a). Unfortunately, most of these compounds have neither entered clinical trials nor had their anticancer mechanisms elucidated.
Cell cycle regulatory genes are targeted by identified active compounds
Using the active compounds reported for cancer (Figure 1a), we searched for their interacting human proteins. Out of 42 active compounds, we found interacting protein hits for 12 compounds (tannic acid, quercetin, betacyanin, amaranthin, mangiferin, voacangine, quinic acid, andrographolide, luteolin, apigenin, rutin, gallic acid). Among these compounds, tannic acid (Bridgeman, Nguyen, and Kishore 2018), quercetin (J. Jeong et al. 2009), mangiferin (Núñez Selles, Daglia, and Rastrelli 2016), voacangine (Y. Kim, Jung, and Kwon 2012), quinic acid (Singh, Chauhan, and Tripathi 2018), andrographolide (Peng et al. 2018), luteolin (Cook 2018), apigenin (Yan et al. 2017), rutin (Khan et al. 2019), and gallic acid (Liu et al. 2012) have already been reported for their anticancer activities. This suggests that screening traditionally used medicinal plants yields compounds with potential anticancer properties.
For these 12 compounds, we identified 83 interacting protein targets. These targets are mostly enriched for cell cycle regulator proteins. We selected 6 active compounds (andrographolide, luteolin, apigenin, quercetin, amaranthin, and mangiferin) (Figure 1b) and 7 interacting proteins (CCR3, CDK2, MAPK8, TP53, HIBCH, NOS3, and CDK4) to test their binding affinity through a molecular docking study. Binding affinities range from −7.8 kcal/mol to −9.6 kcal/mol (Figure 2, Table 1). In general, the more negative the binding affinity, the better the predicted binding between a ligand and a macromolecule (Dallakyan and Olson 2015). This result indicates that these plant-based active compounds are predicted to bind the identified protein targets with favorable affinity.
CDK4 is differentially expressed in multiple cancers
The combination of mangiferin and Cyclin-dependent kinase 4 (CDK4) is of particular interest because CDK4 inhibition is a potential strategy for cancer treatment (Goel et al. 2018). As a first step toward targeting CDK4, we sought to determine in which types of cancer CDK4 is differentially expressed. We analyzed the gene expression data of CDK4 in 33 cancer sub-types: ACC (Adrenocortical carcinoma), BLCA (Bladder Urothelial Carcinoma), BRCA (Breast invasive carcinoma), CESC (Cervical squamous cell carcinoma and endocervical adenocarcinoma), CHOL (Cholangiocarcinoma), COAD (Colon adenocarcinoma), DLBC (Lymphoid Neoplasm Diffuse Large B-cell Lymphoma), ESCA (Esophageal carcinoma), GBM (Glioblastoma multiforme), HNSC (Head and Neck squamous cell carcinoma), KICH (Kidney Chromophobe), KIRC (Kidney renal clear cell carcinoma), KIRP (Kidney renal papillary cell carcinoma), LAML (Acute Myeloid Leukemia), LGG (Brain Lower Grade Glioma), LIHC (Liver hepatocellular carcinoma), LUAD (Lung adenocarcinoma), LUSC (Lung squamous cell carcinoma), MESO (Mesothelioma), OV (Ovarian serous cystadenocarcinoma), PAAD (Pancreatic adenocarcinoma), PCPG (Pheochromocytoma and Paraganglioma), PRAD (Prostate adenocarcinoma), READ (Rectum adenocarcinoma), SARC (Sarcoma), SKCM (Skin Cutaneous Melanoma), STAD (Stomach adenocarcinoma), TGCT (Testicular Germ Cell Tumors), THCA (Thyroid carcinoma), THYM (Thymoma), UCEC (Uterine Corpus Endometrial Carcinoma), UCS (Uterine Carcinosarcoma), and UVM (Uveal Melanoma) (Figure 3). CDK4 is highly expressed in GBM, LGG, and SARC (Figure 3). This observation was further tested by comparing CDK4 expression between tumor and normal cells in GBM, LGG, and SARC. CDK4 expression differs significantly in GBM and LGG, but the difference in SARC is not significant owing to insufficient normal cell data (Figure 4). Additionally, CDK4 expression was analyzed at different stages of GBM, LGG, and SARC (Figure 5).
CDK4 expression is consistent across stages and higher than that of TP53 at every stage, which supports its potential use as a diagnostic marker and drug target candidate (Figure 5a, 5b). CCR3 expression is used as a negative control (Figure 5c). This gene expression analysis indicates that CDK4 expression is a prime indicator of uncontrolled cell division during cancerous growth, and the comparison with TP53 strengthens this evidence.
CDK4 as potential mangiferin based drug target
Patient survival analysis is a major consideration when selecting a gene as a diagnostic tool or drug target. We analyzed the relationship between patients’ survival and CDK4 expression (Figure 6). In both SARC (Figure 6a) and LGG (Figure 6b), patients’ survival is directly correlated with higher expression of CDK4. These data are consistent with the expression profile of CDK4 at different stages of SARC, LGG, and GBM shown earlier (Figure 5). As CDK4 expression is directly linked to patients’ survival events, we next compared the CDK4 expression level across the 33 cancer sub-types (Figure 3) in which CDK4 expression was detected, and found that CDK4 expression is significantly higher than TP53 expression in all of these sub-types (Figure 7). Altogether, CDK4 expression profiling across 33 cancer sub-types and its correlation with patients’ survival emphasize its importance as a diagnostic tool and anticancer target, in line with the target suggested by our phytochemical screening (Figure 1, 2).
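The survival analysis in this study was produced by the GEPIA2 web server. For readers unfamiliar with such analyses, the Python sketch below shows a Kaplan-Meier product-limit estimate, a standard way to estimate overall survival from follow-up times that include censored patients. That this estimator corresponds to the server's computation is an assumption, and the function name and the follow-up values are hypothetical placeholders, not data from this study.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times  -- follow-up time for each patient
    events -- 1 if death was observed at that time, 0 if censored
    Returns a list of (event_time, survival_probability) pairs.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")

    # Survival only changes at times where at least one death occurred.
    event_times = sorted({t for t, e in zip(times, events) if e})

    survival = 1.0
    curve = []
    for t in event_times:
        # Patients still under observation just before time t.
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        survival *= 1.0 - deaths / at_risk
        curve.append((t, survival))
    return curve


# Placeholder follow-up (arbitrary time units); the second patient is censored.
placeholder_times = [1, 2, 3, 4]
placeholder_events = [1, 0, 1, 1]

curve = kaplan_meier(placeholder_times, placeholder_events)
print(curve)
```

In a comparison of expression groups, a curve of this kind would be computed separately for patients with high and low CDK4 expression; how the groups are split (for example, at the median) is a further assumption not stated in the text.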
Additionally, we identified other genes (METTL1, METTL21B, TSFM, OS9, MARCH9, TSPAN31) that are co-expressed with CDK4 (Figure 8). CDK4 expression correlates with METTL1 (Figure 8a), METTL21B (Figure 8b), TSFM (Figure 8c), OS9 (Figure 8d), MARCH9 (Figure 8e), and TSPAN31 (Figure 8f), with R values of 0.74, 0.61, 0.61, 0.57, 0.5, and 0.41, respectively. This co-expression profile is helpful for developing a gene set, alongside CDK4, whose expression levels can be used to demonstrate the effect of mangiferin.
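The R values above are Pearson correlation coefficients between the expression of CDK4 and that of each co-expressed gene across samples. As a minimal sketch of how such a coefficient is computed, the following Python function takes two equal-length lists of expression values. The function name and the toy vectors are hypothetical and are not expression data from this study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")

    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)

    # Sum of co-deviations, and the root sum of squared deviations of each variable.
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))

    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariation / (spread_x * spread_y)


# Toy vectors for illustration only; not expression data from this study.
gene_a = [1.0, 2.0, 3.0, 4.0]
gene_b = [2.0, 4.0, 6.0, 8.0]   # rises with gene_a
gene_c = [8.0, 6.0, 4.0, 2.0]   # falls as gene_a rises

print(pearson_r(gene_a, gene_b))
print(pearson_r(gene_a, gene_c))
```

A value near 1 indicates that the two genes rise and fall together across samples, a value near 0 indicates no linear relationship, and a value near −1 indicates opposite trends.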
Discussion
In recent years, owing to the efforts of the Moonshot℠ project and growing interest among local scientists in developing medicinal plant databases (Ashraf et al. 2014; Brito and Brito 1993; Babu et al. 2006; Mohammed et al. 2010; Mohanraj, Karthikeyan, Vivek-Ananth, et al. 2018), the use of phytochemicals in drug development is reviving. Our endeavor started with MPDB1.0 (Ashraf et al. 2014) and was extended through MPDB2.0. This database provides a large resource with which to begin screening. From our screening, we identified mangiferin. This simple compound is not only present in our food but is also used in traditional medicinal practice. It has been known as a treatment for diabetes, infection, and cancer (Kavitha et al. 2013; Tolosa et al. 2013). Vimang®, a natural medicine produced from Mangifera indica extracts, is used as an anti-inflammatory phytomedicine (Rajendran et al., 2014). Previous studies suggest that mangiferin regulates the Mitogen Activated Protein Kinase pathway and progression through the G2/M phase of the cell cycle (Lv et al. 2013). Consistent with that, we identified MAPK8, CDK2, and CDK4 in our list of interacting proteins (Figure 2). Fundamentally, this study identified one of the cell cycle regulators, CDK4, which was also predicted in other complementary studies.
Interestingly, CDKs are known to be upregulated in cancer, and CDK inhibitors are being tested in anticancer drug trials (Goel et al. 2018; J. H. Jeong et al. 2009; Yin et al. 2018; Malumbres and Barbacid 2009; Vijayaraghavan et al. 2018). The finding that mangiferin targets CDK4, an anticancer drug target, therefore supports our approach (Figure 2). Even five years ago, it was not straightforward to test our hypothesis and screen active compounds from MPDB1.0 for their anticancer activity. During the last few years, the availability of cancer-related RNA sequencing datasets such as TCGA and GTEx has made this possible (Consortium 2015; Lonsdale et al. 2013; Weinstein et al. 2013). Our analysis linking mangiferin with CDK4 was accelerated by the use of these large-scale RNA sequencing data. Our analysis of CDK4 expression in the different cancer sub-types opens the door to developing CDK4 as a cancer diagnostic tool, as it is expressed prominently in multiple cancer sub-types across different stages and correlates with patients’ survival (Figure 3, 4, 5, 6, 7). It also highlights cyclin-dependent kinases as a general class of targets to study with mangiferin and its derivatives in the near future.
Based on the current study, we hypothesize that mangiferin targets the cell cycle regulator CDK4, which is upregulated in the majority of cancers, and that mangiferin, an extract derived from the mango tree (Mangifera indica), can be used as an inhibitor of CDK4 to suppress its expression and thus as a potential anticancer drug (Figure 9).
Conclusions
Mangiferin, an inexpensive and easily available extract derived from the mango tree (Mangifera indica), is a potential CDK4 inhibitor. Because both an extraction procedure for purified mangiferin and commercially purified mangiferin are available, we can move quickly to test the hypothesis that mangiferin inhibits CDK4 to regulate cancerous growth. Fundamentally, this study paves the way for developing affordable anticancer drugs.
Conflicts of Interest
The authors declare no conflicts of interest.
Authors’ contributions
R.K.C., N.H., and M.S.R. collected the information on medicinal plants and their active compounds. R.K.C. and N.H. studied the interacting proteins for active compounds. S.A. executed the molecular docking. M.A.A. did the large-scale gene expression analysis. S.S. did the gene expression analysis for the heatmap. T.A.F. is responsible for the curation of the database. A.A.T. and F.M. contributed from the MPDB1.0 team. M.K.H. supervised R.K.C. and N.H., and M.A.S. supervised M.S.R. for this project. The manuscript draft was written by M.K.H. and S.S. E.H.A. provided his expert opinion to improve the manuscript. M.K.H., M.A.S., and M.A.A. coordinated the project. M.A.A. adopted the project idea, designed experiments, prepared final figures, and wrote the manuscript.
Data availability
Data used or produced in this study are either available in the public domain or mentioned in the manuscript. In addition, the authors welcome requests to share data required by reviewers and other researchers.
Funding statement
This project is currently not funded.
Acknowledgement
The authors would like to thank researchers around the globe for their relentless efforts to compile information on medicinal plants.