The efficacy and safety of Momordica charantia L. in animal models of type 2 diabetes mellitus: A systematic review and meta-analysis

Background: Momordica charantia L. (Cucurbitaceae) has been used for decades in Asia, South America, and Africa to control hyperglycemia in people with type 2 diabetes mellitus. However, a meta-analysis of clinical trials found only very low-quality evidence of its efficacy. To potentially increase the certainty of evidence, we evaluated the effect of M. charantia L. compared with vehicle on glycemic control in animal models of type 2 diabetes mellitus.

Methods: Review authors searched the MEDLINE, Web of Science, Scopus, and CINAHL databases without language restriction through April 2019. Two authors independently evaluated full texts, assessed the risk of bias, and extracted data. We analyzed the influence of study design and evidence of publication bias.

Results: The review included 66 studies involving 1861 animals, with follow-up ranging from 7 to 90 days. Twenty-nine studies (43.9%) used Wistar albino rats, and 37 (56.1%) used male animals. Thirty-two (48%) used an aqueous extract of fresh fruits. Compared with vehicle control, M. charantia L. reduced fasting plasma glucose (FPG) (42 studies, 815 animals; SMD −6.86 [95% CI: −7.95, −5.77]) and glycosylated hemoglobin A1c (3 studies, 59 animals; SMD −7.76 [95% CI: −12.50, −3.01]). The FPG effect was large in the Wistar albino rat subgroup (SMD −10.29 [95% CI: −12.55, −8.03]). After adjustment for publication bias, the FPG effect became non-significant (SMD −2.46 [95% CI: −5.10, 0.17]). We downgraded the evidence to moderate quality due to poor methodological quality, high risk of bias, unexplained heterogeneity, suspected publication bias, and lack of a standardized dose.

Conclusion: M. charantia L. lowers elevated plasma glucose levels in animal models of type 2 diabetes mellitus. Publication bias and poor methodological quality call for future research to focus on standardizing dose with chemical markers and on measures to improve preclinical type 2 diabetes mellitus studies.

Systematic review registration: CRD42019119181

Introduction
each database through April 2019. They also screened reference lists of included studies and reviews for additional eligible studies not retrieved by the search. The search strategy combined MeSH terms and keywords. The search terms were divided into three components. The population component comprised the terms "animals," "animal," "animals model," "preclinical studies," "experimental animals," "experimental animal," "laboratory animal," "laboratory animals," "rodents," "rodent," "rabbits," "rabbit," "rats," "rat," "diabetic rats," "animal disease model," "mice," and "mouse." The intervention component's terms were "Momordica charantia," "bitter melon," "bitter gourd," and "karela." The last component comprised the terms "diabetes mellitus, type 2," "non-insulin dependent diabetes mellitus," "NIDDM," "glucose metabolic disorders," "metabolic diseases," "hyperlipidemia," "hyperglycemia," "insulin resistance," and "glucose intolerance." The three search components were combined with the Boolean operator "AND," while the keywords within each component were combined with "OR." Search filters for the identification of preclinical studies in PubMed were applied to increase search efficiency [34]. Review authors did not restrict language during the search and identification of studies. The final searches for each database were re-run just before the final analyses to retrieve the most recent studies eligible for inclusion. S2 Appendix details the search strategies and their results for the PubMed, Scopus, and CINAHL databases.
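The Boolean structure described above (terms joined by "OR" within a component, components joined by "AND") can be illustrated with a minimal Python sketch. The term lists are taken from the text; the helper names `or_block` and `build_query` are hypothetical, and the sketch omits database-specific field tags, MeSH qualifiers, and the preclinical search filter, so it should not be read as the exact strategy reported in S2 Appendix.

```python
# Illustrative sketch only: assembles a generic Boolean query string.
# Helper names are hypothetical; real database syntax (field tags,
# MeSH qualifiers, filters) is not reproduced here.

population = [
    "animals", "animal", "animals model", "preclinical studies",
    "experimental animals", "experimental animal", "laboratory animal",
    "laboratory animals", "rodents", "rodent", "rabbits", "rabbit",
    "rats", "rat", "diabetic rats", "animal disease model", "mice", "mouse",
]
intervention = ["Momordica charantia", "bitter melon", "bitter gourd", "karela"]
condition = [
    "diabetes mellitus, type 2", "non-insulin dependent diabetes mellitus",
    "NIDDM", "glucose metabolic disorders", "metabolic diseases",
    "hyperlipidemia", "hyperglycemia", "insulin resistance",
    "glucose intolerance",
]


def or_block(terms):
    """Join the quoted terms of one component with OR, in parentheses."""
    return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"


def build_query(*components):
    """Join the OR-blocks of all components with AND."""
    return " AND ".join(or_block(component) for component in components)


query = build_query(population, intervention, condition)
print(query)
```

Keeping each component as a separate list makes it straightforward to add or drop a synonym without touching the combination logic.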

Study design and animal models eligibility
Review authors included experimental animal studies if they had a randomized or non-randomized controlled design, were original full articles with data presented numerically or graphically, and were conducted in animal models of type 2 diabetes mellitus. The animal models were carefully assessed to include those that closely mimic at least some aspects of the pathophysiology of human type 2 diabetes mellitus, such as insulin resistance and β-cell failure, to ensure construct validity [35]. The review included animals of any sex, age, species, and strain. However, it excluded human, in vitro, ex vivo, and in silico studies, as well as before-after studies without a description of the control group.

Intervention and comparison eligibility
The preclinical intervention group included animals from studies that evaluated the efficacy or

The authors assigned grade "A" to studies with full information about the plant species, identification of the specimen, and a deposited voucher specimen. They assigned grade "B" to studies with partial information about the plant species, such as studies that did not report specimen identification and a voucher specimen, and those with inaccurate taxonomic information. Finally, they assigned grade "C" to studies with incomplete or no information about the plant species, or about specimen identification and a voucher specimen.

Methodological quality and risk of bias assessment
Review authors used SYRCLE's risk of bias tool to assess the risk of bias for each included preclinical animal study [37]. The tool assesses the domains of random sequence generation, baseline characteristics, allocation concealment, random housing, blinding of investigators/caregivers, random outcome assessment, blinding of the assessor, incomplete outcome data, selective outcome reporting, and other sources of bias. Each criterion was rated as high, low, or unclear risk of bias. The authors also used a modified CAMARADES checklist to assess the methodological quality of the included studies. This checklist combines the reporting of several measures to reduce bias with some indicators of external validity. The quality indicators are based on 10 criteria: 1) peer-reviewed publication; 2) statement of control of temperature; 3) random allocation to treatment or control; 4) blinded caregiver/investigator; 5) blinded assessment of outcome; 6) use of co-interventions/co-morbid; 7) appropriate animal model (age, sex, species, strain); 8) sample size calculation; 9) compliance with animal welfare regulations; 10) statement of potential conflict of interests [38]. Each study was given a quality score out of a possible total of 10 points. Finally, the authors calculated the mean score and categorized studies as "low quality" (mean score 1-5) or "high quality" (mean score 6-10).

Heterogeneity assessment
We used the I² statistic to quantify heterogeneity in primary studies [40]. An I² of 75% or more was considered indicative of substantial heterogeneity [41,42]. Sensitivity analysis was done to examine potential factors that influence heterogeneity in the primary outcome (FPG). For this analysis we considered risk of bias score and methodological quality score, and performed subgroup analyses by study design (randomized and non-randomized), duration of treatment, dose,
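As a rough illustration of the I² statistic and the 75% threshold described above, the following Python sketch computes Cochran's Q from inverse-variance weights and derives I² as (Q − df) / Q, truncated at zero. This is the standard textbook definition and is offered as an assumption about the computation; the function names and the example effect sizes are hypothetical and are not data from the included studies.

```python
# Illustrative sketch only: I-squared from study effects and variances.
# Function names and example values are hypothetical, not review data.

def cochran_q(effects, variances):
    """Return Cochran's Q and the inverse-variance (fixed-effect) pooled estimate."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    return q, pooled


def i_squared(effects, variances):
    """Return I-squared as a percentage: max(0, (Q - df) / Q) * 100."""
    q, _ = cochran_q(effects, variances)
    df = len(effects) - 1
    if df <= 0 or q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


def is_substantial(i2_percent, threshold=75.0):
    """Apply the threshold stated in the text (I-squared of 75% or more)."""
    return i2_percent >= threshold


# Hypothetical standardized mean differences and their variances.
example_effects = [-6.0, -9.5, -3.2, -12.1]
example_variances = [1.2, 2.0, 0.8, 3.5]

i2 = i_squared(example_effects, example_variances)
print(f"I-squared = {i2:.1f}%, substantial: {is_substantial(i2)}")
```

Because I² is a ratio of excess variation to total variation, it does not by itself indicate the source of heterogeneity, which is why the text pairs it with sensitivity and subgroup analyses.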

Publication bias for each outcome was assessed by testing the asymmetry of the funnel plot using Egger's test [43]. For this assessment, we only considered meta-analyses of ten or more studies, because test power is generally too low to distinguish chance from real asymmetry when fewer primary studies are included [43,44]. When publication bias was detected, the trim and fill method was used to correct the probable publication bias by imputing

The majority of included studies, 51 (77.3%), used rats, whereas 12 (18.2%) used mice and only 3 (4.5%) used rabbits. Regarding the strains used, 29 (43.9%) studies used Wistar albino rats, 11 (16.7%) used Sprague-Dawley rats, and the remaining 14 different strains are shown in Table 1. However, three of the included studies did not specify the strain used.

Among the studies included, 37 (56.1%) used male animals, six (9.1%) used only female animals,

Four of the 66 included studies were given a taxonomical validation score of "A" because they

(Table 4).

Methodological quality
The quality score of the majority of included studies, 51 (77.3%), was between 2 and 3. The median score was 3 (interquartile range 1), which means that these studies had poor

consistently favored M. charantia L. (Fig 3).

The data from three preclinical studies were pooled for assessment of HbA1c (Fig 4). There was

(Fig. 4). The review authors downgraded the evidence to very low-quality due to a severe

Although such an increase was not statistically significant, all three studies favored the intervention (Fig. 4). The I² of 93% indicated the presence of heterogeneity among the individual studies.

The data from 13 preclinical studies were pooled for assessment of triglycerides (Fig 6). Results showed very low-quality evidence that M. charantia L. significantly lowered TG levels in the treated group (n = 142) compared with the vehicle control group (n = 87), with an SMD of −9.12 (95% CI: −11.76, −6.49). The I² of 92% indicated the presence of substantial heterogeneity among individual studies.

The HDL-c was assessed by integrating data from eight studies (Fig 6). There was low quality-

The LDL-c level in the M. charantia L. treated group (n = 72) was significantly decreased compared with that observed in the vehicle control group (n = 50), with an SMD of −6.71 (95% CI: −9.06, −4.36). The I² of 89% indicated the presence of heterogeneity (Fig 6).

L. treated groups compared with the vehicle control groups. The I² values were 77%, 64% and 82%, respectively, indicating the presence of significant heterogeneity (Fig 7).

with the vehicle control groups. The I² values were 84% and 89%, respectively, indicating the presence of significant heterogeneity (Fig 8).

the presence of moderate heterogeneity in individual studies (Fig 9).

Competing interests
We wish to confirm that there are no known conflicts of interest associated with this publication and there has been no significant financial support for this work that could have influenced its