Metabolomics profile of umbilical cord blood is associated with maternal pre-pregnant obesity in a prospective multi-ethnic cohort displaying health disparities

Maternal obesity has become a growing global health concern that impacts fetal health and subsequently predisposes the offspring to medical conditions later in life. However, the molecular link between abnormal fetal metabolomic profiles and maternal obesity has not yet been fully elucidated. In this study, we report new discoveries from the newborn cord blood metabolomes associated with a case-control maternal obesity cohort, collected from multi-ethnic populations in Hawaii, including rarely reported Native Hawaiian and other Pacific Islanders (NHPI). This cohort displays significant maternal obesity disparities by the average income of subjects' residential areas and by health insurance. An elastic net penalized logistic regression model was constructed to associate cord blood metabolomics and demographic/physiological information with maternal obesity, with an AUC as high as 0.947. We identify 29 metabolites as early-life biomarkers that manifest the intrauterine effect of maternal obesity.


Introduction
Obesity is a global health concern. While some countries have a relative paucity of obesity, in the United States obesity affects 38% of adults (1, 2). Maternal obesity has likewise risen to epidemic proportions in recent years and can impose significant risk on both the mother and the unborn fetus. Recently, research has focused on the association between maternal health during pregnancy and the subsequent effects on the future health of offspring (3). Since the inception of Barker's hypothesis in the 1990s, efforts to connect intrauterine exposures with the development of disease later in life have been the subject of many studies (4). Obesity and its accompanying morbidities, such as diabetes, cardiovascular diseases and cancers, are of particular interest, as considerable evidence has shown that maternal metabolic irregularities may have a role in genotypic programming in offspring (5,6). Identifying markers of predisposition to health concerns or diseases would present an opportunity for early identification and potential intervention, thus providing life-long benefits (7-9).
Previous studies have found that infants born to obese mothers consistently demonstrate elevated adiposity and are at more substantial risk of developing metabolic disease (10). While animal models have been used to demonstrate early molecular programming under the effect of obesity, human research to elucidate the underlying mechanisms in the origins of childhood disease is lacking (11). In Drosophila melanogaster, offspring of females given a high-sucrose diet exhibited metabolic aberrations at both the larval and adult developmental stages (12,13). Although Drosophila melanogaster is an invertebrate model, mammalian lipid and carbohydrate systems show a high level of conservation in it (14,15). In a mouse model of maternal obesity, progeny demonstrated significant elevations of both leptin and triglycerides when compared with offspring of control mothers of normal weight (5). The authors proposed that epigenetic modifications of obesogenic genes during intrauterine fetal growth play a role in adaptation to an expected future environment. Recently, Tillery et al. used a primate model to examine the origins of metabolic disturbances and altered gene expression in offspring subjected to maternal obesity (16). The offspring consistently displayed significant increases in triglyceride levels and fatty liver disease on histologic preparations. However, human studies that explore the fetal metabolic consequences of maternal obesity are still needed.
Metabolomics is the study of small molecules using high-throughput platforms, such as mass spectrometry (17). It is a desirable technology that can detect distinct chemical imprints in tissues and body fluids (18). The field of metabolomics has shown great promise in various applications including early diagnostic marker identification (19), where a set of metabolomics biomarkers can differentiate samples of two different states (e.g. disease and normal states). Cord blood metabolites provide information on fetal nutritional and metabolic health (20), and could provide an early window for detecting potential health issues among newborns. Previously, some studies have reported differential metabolite profiles associated with pregnancy outcomes such as intrauterine growth restriction (21) and low birth weight (22). For example, abnormal lipid metabolism and significant differences in the relative amounts of amino acids were found in metabolomic signatures in cord blood from infants with intrauterine growth restriction in comparison to normal-weight infants (21). In another study, higher phenylalanine and citrulline levels but lower glutamine, choline, alanine, proline and glucose levels were observed in cord blood of low-birth-weight infants (22). However, thus far no metabolomics studies have been reported that specifically investigate the impact of maternal obesity on metabolomics profiles in fetal cord blood (21-24).
This study aims to investigate metabolomics changes in fetal cord blood associated with obese (BMI>30) and normal pre-pregnant weight (18.5<BMI<25) mothers, in a multi-ethnic population including Native Hawaiian and other Pacific Islanders (NHPI). NHPI is a particularly under-represented minority population across most scientific studies. To ensure the quality of the study, we enrolled mothers undergoing elective C-sections without any clinically known gestational diseases. In addition to the cord blood samples of their babies at birth, we collected comprehensive electronic medical records (EMR) from the subjects, as well as other maternal and paternal parameters such as ethnicities. This study has not only revealed maternal health disparities on the island of Oahu, Hawaii, but also discovered metabolomic links between cord blood and maternal pre-pregnant obesity. These metabolites are potential early-life biomarkers affected by maternal obesity.

Study population
We performed a multi-ethnic case-control study at Kapiolani Medical Center for Women and Children, Honolulu, HI from June 2016 through June 2017. The study was approved by the Western IRB (WIRB Protocol 20151223). To avoid confounding by the inflammation that accompanies labor and natural births (25), we recruited women scheduled for full-term cesarean section at ≥ 37 weeks gestation. All subjects fasted for at least 8 hours before the scheduled cesarean delivery. Patients meeting inclusion criteria were identified from pre-admission medical records with pre-pregnancy BMI ≥ 30.0 (cases) or 18.5-25.0 (controls). The pre-pregnancy BMIs were also confirmed during enrollment. Women with preterm rupture of membranes (PROM), labor, multiple gestations, pre-gestational diabetes, hypertensive disorders, HIV or HBV, as well as cigarette smokers and chronic drug users, were excluded. Demographic and clinical characteristics were recorded, including residential zip code, insurance type, maternal and paternal age, maternal and paternal ethnicity, mother's pre-pregnancy BMI, net weight gain, gestational age, parity and gravidity. A total of 57 samples (28 cases and 29 controls) were collected.
bioRxiv preprint. doi: https://doi.org/10.1101/264374. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY 4.0 International license.

Sample collection, preparation and quality control
Cord blood was collected under sterile conditions at the time of cesarean section in the operating room, using a Pall Medical cord blood collection kit with 25 mL citrate phosphate dextrose (CPD). The umbilical cord was cleansed with a chlorhexidine swab before collection to ensure sterility. The volume of collected blood was measured and recorded before aliquoting into conical tubes for centrifugation. The tubes were centrifuged at 200g for 10 minutes with the brake off, and plasma was collected. The plasma was centrifuged at 350g for 10 minutes with the brake on, aliquoted into polypropylene cryotubes, and stored at -80 °C.

Metabolome profiling
The plasma samples were thawed and extracted with 3-vol cold organic mixture of ethanol: chloroform and centrifuged at 4 °C at 13500 rpm for 20 min. The supernatant was split for lipid and amino acid profiling with an Acquity ultra performance liquid chromatography coupled to a Xevo TQ-S mass [...] obtained for each metabolite. The detected metabolites from GC-MS were annotated and combined using an automated mass spectral data processing (AMSDP) software package (26). The levels of lipids and amino acids detected from LC-MS were calculated with calibration curves established with reference standards.

Metabolomics data processing
We conducted data pre-processing similar to the previous report (27). Briefly, we used K-Nearest Neighbors (KNN) method to impute missing metabolomics data (28). To adjust for the offset between high and low-intensity features, and to reduce the heteroscedasticity, the logged value of each metabolite was centred by its mean and autoscaled by its standard deviation (29). We used quantile normalization to reduce sample-to-sample variation (30). We applied partial least squares discriminant analysis (PLS-DA) to visualize how well metabolites could differentiate the obese from normal samples.
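The log transformation, autoscaling and quantile normalization steps described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names and toy intensities are hypothetical, the natural logarithm is an assumption (the log base is not stated), KNN imputation is omitted, and ties in quantile normalization are broken by position rather than averaged.

```python
import math

def autoscale_log(values):
    """Log-transform one metabolite across samples, then center by the mean
    and divide by the sample standard deviation (autoscaling)."""
    logged = [math.log(v) for v in values]  # natural log is an assumption
    n = len(logged)
    mean = sum(logged) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in logged) / (n - 1))
    return [(x - mean) / sd for x in logged]

def quantile_normalize(samples):
    """Give every sample (a list of metabolite values) the same distribution.
    The reference distribution is the mean of the k-th smallest value
    across samples; each value is replaced by the reference at its rank."""
    n_features = len(samples[0])
    sorted_samples = [sorted(s) for s in samples]
    reference = [
        sum(s[k] for s in sorted_samples) / len(samples)
        for k in range(n_features)
    ]
    normalized = []
    for s in samples:
        order = sorted(range(n_features), key=lambda i: s[i])
        out = [0.0] * n_features
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized

# Hypothetical intensities: one metabolite across three samples,
# then two samples with three metabolites each.
print(autoscale_log([1.0, 10.0, 100.0]))
print(quantile_normalize([[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]))
```

After autoscaling, each metabolite has mean 0 and unit standard deviation on the log scale, so high- and low-intensity features contribute on a comparable footing.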

Classification modeling and evaluation
To reduce the dimensionality of our data (230 metabolites vs 57 samples), we selected the unique metabolites associated with separating obese and normal status. To achieve this, we used a penalized logistic regression method called elastic net, as implemented in the glmnet R package (31). The elastic net method selects metabolites that have non-zero coefficients as features, guided by two penalty parameters, alpha and lambda (31). Alpha sets the degree of mixing between lasso (alpha=1) and ridge regression (alpha=0). Lambda controls the amount of coefficient shrinkage regardless of the value of alpha. When lambda equals zero, no shrinkage is performed and the algorithm selects all the features. As lambda increases, the coefficients are shrunk more strongly and the algorithm retains only the features whose coefficients remain non-zero. To find the optimal parameters, we performed 10-fold cross-validation and chose the values that yielded the smallest mean squared error (MSE) of prediction. We then used the metabolites selected by the elastic net to fit the regularized logistic regression model. Three parameters were tuned: cost, which controls the trade-off between regularization and correct classification; logistic loss; and epsilon, which sets the tolerance of the termination criterion for optimization.
To construct and evaluate the model, we divided samples into 5 folds. We trained the model on four folds (80% of data) using leave-one-out cross-validation (LOOCV) and measured model performance on the remaining fold (20% of data). We repeated this training and testing five times, once for each combination of folds. We plotted the receiver-operating characteristic (ROC) curve for the predictions from all folds using the pROC R package.
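To make the roles of alpha and lambda concrete, the sketch below evaluates the elastic net penalty in the form used by glmnet, lambda * (alpha * L1 + (1 - alpha) / 2 * L2), and applies the textbook single-coordinate update for a standardized predictor under squared-error loss. This is an illustration only, not the fitted logistic model of this study: the function names are hypothetical and the starting coefficients are made-up values.

```python
def elastic_net_penalty(coefs, alpha, lam):
    """Penalty lam * (alpha * sum|b| + (1 - alpha) / 2 * sum b^2)."""
    l1 = sum(abs(b) for b in coefs)
    l2 = sum(b * b for b in coefs)
    return lam * (alpha * l1 + (1.0 - alpha) / 2.0 * l2)

def soft_threshold(z, gamma):
    """Move z toward zero by gamma; values within gamma of zero become 0."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0

def shrink(z, alpha, lam):
    """Elastic net update of one coefficient whose unpenalized value is z
    (standardized predictor, squared-error loss; illustrative only)."""
    return soft_threshold(z, lam * alpha) / (1.0 + lam * (1.0 - alpha))

def count_selected(unpenalized, alpha, lam):
    """Number of coefficients that remain non-zero after shrinkage."""
    return sum(1 for z in unpenalized if shrink(z, alpha, lam) != 0.0)

# Hypothetical unpenalized coefficients for three metabolites.
unpenalized = [0.9, -0.4, 0.05]
for lam in (0.0, 0.2, 1.0):
    print(lam, count_selected(unpenalized, alpha=0.5, lam=lam))
```

With lambda equal to zero every feature is kept, and the number of retained features falls as lambda grows, which is the behaviour the cross-validation over lambda exploits.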
To adjust for confounding by other clinical covariates such as ethnicity, gravidity and parity, we rebuilt the metabolomics model above with these factors included.
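The evaluation scheme in this section (five outer folds, LOOCV within each training set, and an ROC curve over the pooled held-out predictions) can be sketched as follows. This is a simplified illustration under stated assumptions: the function names, the random seed and the unstratified fold assignment are hypothetical, and model fitting is left as a comment.

```python
import random

def outer_folds(n_samples, k=5, seed=0):
    """Shuffle sample indices and deal them into k roughly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def loocv_splits(train_indices):
    """Yield (inner_train, held_out) pairs for leave-one-out CV."""
    for held_out in train_indices:
        yield [i for i in train_indices if i != held_out], held_out

def auc(labels, scores):
    """Area under the ROC curve: the fraction of (positive, negative)
    pairs in which the positive scores higher; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))

folds = outer_folds(57)  # 57 samples, as in this cohort
for test_fold in folds:
    train = [i for fold in folds if fold is not test_fold for i in fold]
    inner = list(loocv_splits(train))
    # A model would be tuned on `inner`, refit on `train`, and used to
    # score the samples in `test_fold`; scores are pooled across folds.
    print(len(train), len(test_fold), len(inner))
```

Pooling the held-out scores from all five folds before calling `auc` gives one ROC summary in which every sample is scored by a model that never saw it.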

Analysis on metabolite features
We used the Classification And REgression Training (CARET) R package to rank metabolites with its model-based approach (32). In this approach, each metabolite was assigned a score that estimates its contribution to the model performance (33). These scores were scaled to have a maximum of 100. We performed metabolomic pathway analysis on the metabolites chosen by the elastic net method using the Consensus Pathway DataBase (CPDB). We used the rcorr function in the Hmisc R package to compute the correlations among clinical and metabolomics data.
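As an illustration of the correlation step, the sketch below computes pairwise Pearson correlation coefficients between named variables. Pearson's r is an assumption here: rcorr can also compute Spearman rank correlations, and the text does not say which type was used. The function names and the toy values are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(columns):
    """Map every ordered pair of variable names to its Pearson r."""
    names = list(columns)
    return {
        (a, b): pearson_r(columns[a], columns[b])
        for a in names for b in names
    }

# Hypothetical values for four subjects: one clinical covariate and
# one scaled metabolite level.
data = {
    "parity": [0.0, 1.0, 2.0, 3.0],
    "metabolite_x": [0.2, 0.1, 0.6, 0.9],
}
for pair, r in correlation_matrix(data).items():
    print(pair, round(r, 3))
```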

Data availability
The metabolomics data generated by this study are deposited in the Metabolomics Workbench (ID1312).

Cohort subject characteristics
Our cohort consisted of three ethnic groups: Caucasian, Asian and Native Hawaiian and other Pacific Islander (NHPI). Women undergoing scheduled cesarean delivery were included based on the previously described inclusion and exclusion criteria (Methods). Demographic and clinical characteristics of the obese and control groups are summarized in Table 1. In the Caucasian group (10 mothers), 6 were categorized as non-obese and 4 as obese. In the Asian group (23 mothers), 16 were categorized as non-obese and 7 as obese. In the NHPI group (24 mothers), 7 (24% of all controls) were categorized as non-obese and 17 (61% of all cases) as obese. Babies of obese mothers have significantly (P=0.03) higher birth weight than those of the normal pre-pregnant weight group, consistent with earlier observations (34,35).

The cohort displays disparities of maternal obesity
To explore a possible relationship between maternal obesity and socioeconomic status, we retrieved the residential zip code and insurance type of each subject. As a surrogate for each subject's income, we used the average annual personal income in that person's zip code, based on IRS data for the year 2016. A two-tailed t-test shows that the difference in annual income between the residential areas of the obese and control groups is statistically significant (p=0.04, Figure 1A). We next examined the differences in the subjects' insurance carriers. In the state of Hawaii, the lowest tier of insurance is Quest care, which covers the Medicaid and less wealthy population. Mid-tier coverage is provided by HMSA HMO, HMA, TriCare, and United Health Care. The highest and most expensive health coverage is HMSA PPO, in which patients pay a premium to have more choice of healthcare providers. Figure 1B shows the number of obese and non-obese subjects across 11 different insurance carriers operating in Hawaii. Around 57% of the obese group are insured by Quest, compared to 28% of the control group, whereas 38% of the control group are insured by HMSA PPO, compared to 17% of the obese group. A two-way ANOVA of obesity vs. high and low insurance tier yields a p-value of 0.06, just above the conventional 0.05 threshold, which suggests a trend toward socioeconomic disparities among the mothers.

Preliminary assessment of metabolomics results
We detected a total of 230 metabolites, including 79 untargeted and 151 targeted metabolites (11 amino acids and 140 lipids). To test whether these metabolites allow clear separation between the obese and normal-weight subjects, we used elastic net regularization based logistic regression rather than the partial least squares (PLS) model, a routine supervised multivariate method that yielded only modest accuracy (AUC=0.62, Fig 1S). Elastic net regularization overcomes the limitations of either ridge or lasso regularization alone, and combines their strengths to identify an optimized set of metabolites [25]. Using the optimized regularization parameters (Fig. 2S

Combined predictive model for maternal obesity with consideration of confounding
Some demographic and physiological factors, such as maternal/paternal ethnicity and parity (Table 1), may confound the metabolite signatures above. To check this, we conducted two analyses. First, we explored the correlations among the demographic factors and metabolomics data. Several metabolites are evidently correlated with maternal and paternal ethnicity, gravidity, and/or parity (Figure 3). For example, maternal ethnicity is positively correlated with 2-hydroxy-3-methylbutyric acid. Next, we built a logistic regression model using the above-mentioned four covariates (parity, gravidity, maternal and paternal ethnicity) [...]
To elucidate the biological processes in newborns that may be affected by maternal obesity, we performed pathway enrichment analysis on the 29 metabolite features, using the Consensus Pathway Database (CPDB) tool (36). To obtain the most comprehensive pathway list, we combined multiple pathway databases including KEGG, Wikipathways, Reactome, EHMN and SMPDB. Ten pathways are enriched with adjusted p-value q<0.05 (Figure 4B). Among them, alanine and aspartate metabolism is the most significantly enriched pathway (q=0.004). Transmembrane transport of small molecules and SLC-mediated transmembrane transport are also significantly enriched (q=0.004 and q=0.01 respectively).

Ethnicity association of metabolites
Our earlier correlational analysis suggested that maternal ethnicity may be correlated with the 2-hydroxy-3-methylbutyric acid level. To confirm this, we conducted 2-way ANOVA statistical tests and indeed obtained a significant p-value (P=0.023, chi-square test). We thus stratified the levels of 2-hydroxy-3-methylbutyric acid by ethnicity (Figure 5). There is no significant difference among normal pre-pregnant weight subjects across the three ethnic groups (Figure 5A). However, in cord blood samples associated with obese mothers, the concentration of 2-hydroxy-3-methylbutyric acid is much higher in NHPI than in Caucasians (p=0.05) or Asians (p=0.04) (Figure 5B). 2-hydroxy-3-methylbutyric acid originates mainly from ketogenesis through the metabolism of valine, leucine and isoleucine (37). Since all subjects fasted for 8 hours before the C-section, we expect confounding from diet to be minimized among the three ethnic groups. Thus the higher 2-hydroxy-3-methylbutyric acid level may indicate a higher efficiency of ketogenesis in babies born to obese NHPI mothers.

Discussion
This study aims to distinguish key cord blood metabolites associated with maternal pre-pregnancy obesity. The novelty of the study lies in several aspects. First, we have collected a unique multi-ethnic population in Hawaii, which includes Asians, NHPI and Caucasians. Secondly, this is the first human metabolomics study that is also connected to maternal obesity disparities, demonstrated by geographical and health insurance analyses. Thirdly, we use a state-of-the-art metabolomics technology platform coupling GC-MS and LC-MS, which allows us to detect hundreds of metabolites simultaneously. Lastly, we use a state-of-the-art method, elastic net based logistic regression, that drastically improves the classification accuracy on cord blood metabolomics data.
We conducted rigorous statistical modeling and found that metabolites can distinguish the two maternal groups with accuracy as high as AUC=0.97 (or 0.947 after adjusting for confounding effects).
Metabolomics pathway analysis on the metabolite features in the model identified 10 significant pathways. Among them, alanine and aspartate metabolism was previously reported to be associated with obesity (38). Transmembrane transport was identified as another significant pathway. The transmembrane transport pathway corresponds to the acylcarnitine metabolites in the features. Acylcarnitines are known to transport fatty acids across the mitochondrial membrane (39). Among all metabolites and physiological/demographic features selected by the combined model, galactonic acid has the largest impact on the model performance (importance score = 86%). Galactonic acid was previously shown to be associated with diabetes in a mouse model, through a proposed mechanism of oxidative stress (40). Among the physiological factors, maternal ethnicity has the largest impact (importance score = 84%).
Very few cord blood metabolomics studies have examined associations with maternal obesity directly, or with birth weight (22,41,42). In a recent Hyperglycemia and Adverse Pregnancy Outcome (HAPO) Study, Lowe et al. reported that the branched-chain amino acids valine and leucine/isoleucine, as well as phenylalanine, AC C4, AC C3 and AC C5, are associated with maternal BMI in a meta-analysis over 4 large cohorts (400 subjects in each) (42). In another study associating cord blood metabolomics with low birth weight (LBW), Ivorra et al. found that LBW newborns (birth weight < 10th percentile, n = 20) had higher levels of phenylalanine and citrulline than control newborns (birth weight between the 75th and 90th percentiles, n = 30) (22). They also found lower levels of choline, proline, glutamine, alanine and glucose in LBW newborns; however, there were no significant differences between the mothers of the two groups. In our study, isoleucine is also identified as one of the 29 metabolite features related to maternal obesity. Although alanine itself is not selected by the model as a maternal obesity biomarker in cord blood, we did find that alanine and aspartate metabolism is enriched in the cord blood samples associated with the maternal obesity group.
Notably, our study has identified 5 metabolites that have not previously been reported in the literature in association with obesity or maternal obesity: galactonic acid, L-arabitol, indoxyl sulfate, 2-hydroxy-3-methylbutyric acid and citric acid. Except for citric acid, all of these metabolites are increased in cord blood samples associated with obesity. 2-hydroxy-3-methylbutyric acid concentrations varied by ethnicity, but only in babies born to obese pre-pregnant mothers. 2-hydroxy-3-methylbutyric acid is known to accumulate to high levels during ketoacidosis and fatty acid breakdown. Therefore, the elevation of 2-hydroxy-3-methylbutyric acid is likely due to increased cellular ketoacidosis and fatty acid breakdown in newborns of obese pre-pregnant mothers. To the best of our knowledge, this is the first study that shows differences in 2-hydroxy-3-methylbutyric acid concentrations among different ethnicities. Additionally, indoxyl sulfate is a metabolite of the amino acid tryptophan. As tryptophan is commonly found in fatty food, red meat and cheese, it is possible that the high levels of indoxyl sulfate detected in the cord blood associated with obese pre-pregnant mothers are due to a maternal high-fat diet. Conversely, citric acid, a compound associated with the citric acid cycle (43), is decreased in the cord blood associated with obese pre-pregnant mothers. This could be related to lower vegetable and fruit consumption among obese pre-pregnant mothers. In all, the data suggest that maternal obesity may impact offspring cord blood metabolites. Further research into the specific mode of action of these metabolites would be beneficial in understanding their association with maternal obesity.
One limitation of the study is the modest sample size, a consequence of the stringent inclusion and exclusion criteria. To avoid confounding from labor and vaginal delivery, we only targeted mothers having elective C-sections. We also excluded obese mothers who had known complications during pregnancy, such as pre-gestational diabetes, smoking, and hypertension. These criteria helped to improve the quality of the metabolomics data, at the cost of sample size. Aware of this potential issue, we assessed the regression model using cross-validation and a hold-out testing dataset, rather than using another validation cohort. The second caveat is that the extent of confounding due to maternal diet is unknown, although all subjects fasted for 8 hours before the cesarean section. Fasting states are commonly employed in metabolomics studies. Thirdly, we determined the subjects' ethnicity by self-reporting rather than genotyping, due to the restrictions of the currently approved IRB protocol. Additionally, there have been debates on the use of BMI as an indicator of obesity (44), and more direct measures of body fat could be considered, such as skin-fold thickness measurements, bioelectrical impedance and dual-energy X-ray absorptiometry (45,46). Lastly, a longitudinal follow-up study on the developmental trajectory of offspring of obese mothers would provide further insights. We plan to conduct a larger-scale maternal-offspring obesity study that addresses all the issues above. Nevertheless, this study has established relationships between cord blood metabolomics and maternal pre-pregnant obesity, which in turn is associated with socioeconomic disparities.

Conclusion
In this study, we identified 29 metabolites that are associated with maternal obesity, 5 of which are previously unreported in the literature. These metabolites have the potential to be maternal obesity-related biomarkers in newborns that warrant dietary interventions in early life.