Abstract
Current dietary recommendations are often generalized, conflicting, and highly subjective, depending on the biases of the source. This results in confusion, skepticism, and frustration in the general population. As an alternative, we propose an objective, integrated, automated, algorithmic approach to diet and supplement recommendations, powered by artificial intelligence that analyzes individualized molecular data from the gut microbiome, the human host, and their interactions. This platform enables precise, personalized, and data-driven nutritional recommendations, consisting of foods and supplements and based on each individual’s molecular data, to support healthy homeostasis. We describe the application of this precision technology platform to populations with depression, anxiety, irritable bowel syndrome (IBS), and type 2 diabetes (T2D). We show that our precision nutritional recommendations improved clinical outcomes by 36% in severe cases of depression, 40% in severe cases of anxiety, and 38% in severe cases of IBS, and improved the T2D risk score, which was validated against clinical measurements of HbA1c, by more than 30%. Our data support the integration of precision food and supplements into the standard of care for these chronic conditions.
Introduction
There is strong epidemiological and molecular evidence that the human diet contributes significantly to the onset and progression of many chronic diseases and cancers [1][2][3][4][5][6]. Dietary advice, however, has been controversial and conflicting. Scientific publications often contradict one another, with one demonstrating health benefits while another suggests potential harm for the same foods or diets [7][8][9][10]. As an example, one epidemiological study reports that daily consumption of only 100 mL of sugary drinks, even when the source is fruit juice, raises cancer rates significantly [11]. Other studies report that daily consumption of fruit or fruit juice can actually prevent cancer [12][13][14]. Dairy consumption has also generated contradictory claims: some evidence supports an association with a higher incidence of some cancers and chronic diseases, some studies show no such correlations, other studies demonstrate significant benefits, and the USDA recommends dairy to Americans [3][15][16][17][18]. Likewise, recent publications provide conflicting data on the effects of saturated fats on cardiovascular and metabolic diseases [15][16][19][20]. Red meats have been touted as healthy sources of critical nutrients, such as vitamin B12 and dietary protein, and even shown to provide health benefits; yet there is strong epidemiological and mechanistic evidence that they may be responsible for increased rates of inflammatory diseases and cancers [1][21][22]. To complicate the landscape further, current nutritional guidelines are influenced by the food industry, which promotes the increased consumption of sugars, dairy, and meat [23][24].
The large body of conflicting nutritional information has manifested in the creation of hundreds of diets, each claiming health benefits. This leaves the consumer confused and frustrated, with little direction or confidence in results. Nutritional science finds itself in this predicament for several reasons. First and foremost, the vast majority of nutritional research doesn’t account for the contributions of an individual’s gut microbiome, despite a multitude of evidence proving that the impact is significant [25][26][27][28][29]. Secondly, researchers have mistakenly assumed that each food is either good or bad for all humans without understanding that the same food can have very different effects on different people. There is now strong scientific evidence that genetics plays a minor role, and the gut microbiome plays a major role in the effects of nutrition on human physiology [30][31]. This was elegantly demonstrated in a large twins study of metabolic disease parameters [28]. It is clear that the gut microbiome not only influences how food is digested, but also converts many of the molecular ingredients found in foods into beneficial or harmful secondary metabolites that have profound effects on human physiological functions, such as neurotransmitter production, immune system activation and deactivation (inflammation), immune tolerance, and carbohydrate metabolism [32][33][34–36].
Technological advances have made a paradigm shift in nutritional science possible. Each food can now be viewed as a container of molecular ingredients (i.e. micronutrients), rather than a homogeneous material that is either good or bad for human health. Additionally, the metabolic functions of each person’s unique microbiome need to be identified and quantified in order to “prescribe” specific foods containing molecular ingredients that will be converted to beneficial (associated with health) secondary metabolites by that person’s individual gut microbiome. Microbial functions also need to be quantified so each person can avoid foods that contain molecular ingredients that will be converted by their gut microbiome into harmful (associated with disease) secondary metabolites. In addition to the approaches that utilize known microbial metabolites and their micronutrient precursors, machine-learned models from large clinical studies need to be applied to each person’s gut microbiome analysis to guide the best choice of foods and supplements. Such models have already been developed for the selection of carbohydrate-rich foods and can minimize blood glucose levels, especially postprandial levels [26][27]. Besides the human microbiome, an analysis of human gene expression must be integrated into the systems biology view of the human body to enable the understanding of the network interactions among the dietary molecular ingredients, the microbiome, and human physiology. This understanding is critical because human gene expression, whether modulated by genetic, nutritional, microbial, or other environmental factors, has a direct connection with the onset and progression of chronic diseases [37][38][39][40][41].
In this report, we describe an evidence-based approach to improve clinical outcomes using precision diet and supplements that are computed by artificial intelligence algorithms using each individual’s molecular data and the self-reported phenotype. This approach is 100% algorithmic and data-driven, and there are no humans involved in making the nutritional recommendations. The molecular data are obtained from stool or a combination of stool and blood samples using highly accurate and reproducible, clinically validated, and clinically licensed metatranscriptomic tests [42][43]. These data are used to quantify the activity of microbial metabolic pathways, specific microbial taxa (at the strain level), and human gene expression levels. This information is then converted to precise and personalized diet and supplement recommendations. An overview of this approach is shown in Figure 1 and described in detail in the sections below. When this approach is used, it can reduce the symptoms of several important and highly prevalent chronic diseases.
Figure 1. AI algorithm-generated precision nutritional recommendations. Stool and blood samples are collected from participants. The RNAs are extracted, sequenced, and quantified using clinically validated laboratory and bioinformatics methods. Microbial genes, including KOs, are quantified from the RNA sequencing data of each participant. Functional scores are subsequently computed as a weighted function of relevant KOs. Finally, personalized nutrition recommendations are computed using all functional scores and phenotypes.
Methods and Results
Human subjects, ethical considerations, and study design
The clinical studies described here were approved by a federally accredited Institutional Review Board (IRB). All samples and metadata were obtained from human subjects at least 18 years old and residing in the USA at the time of participation. All study participants consented to participating in the studies. The study design was non-conventional; while there was no control arm, the subjects were blinded to the fact that they were participating in an interventional study with clinical endpoints. All study participants were recruited into a “wellness study” that asked them questions. These questions were drawn from clinically validated surveys for depression (PHQ9), anxiety (GAD7), and IBS (Rome IV criteria), or were requests for their HbA1c lab results.
Metatranscriptomic analyses of stool and transcriptomic analyses of capillary blood samples
The molecular analyses for the studies reported here focus on sequencing messenger RNAs (mRNAs) isolated from human stool and blood samples. Stool samples were collected and analyzed as previously reported [44]. Briefly, stool samples were collected by the study participants using the Viome commercial kits that included ambient temperature preservation solution and pre-paid return mailers. Stool metatranscriptomic analyses (RNA sequencing, RNAseq) were performed using automated, clinically validated laboratory and bioinformatics methods. Results consist of quantitative strain-, species-, and genus-level taxonomic classification of all microorganisms, and quantitative microbial gene and KO (KEGG Ortholog, KEGG = Kyoto Encyclopedia of Genes and Genomes) expression levels. The matching blood samples were collected and analyzed as previously described [45]. Briefly, blood samples were collected by the study participants using the Viome commercial kits that included ambient temperature preservation solution and pre-paid return mailers. The kits include lancets and minivettes that enable easy and accurate collection of small volumes of blood from a finger prick. Transcriptomic analyses (RNA sequencing, RNAseq) were performed using automated, clinically validated laboratory and bioinformatics methods that require 50 microliters of capillary blood. Test results consist of quantitative human gene expression data.
All microorganisms that live in the intestines obtain their energy by converting chemical substrates into products, using metabolic pathways that consist of enzymes. Substrates are typically the molecular ingredients found in foods, and products are biochemicals usually referred to as secondary metabolites. Metatranscriptomic analysis of the gut microbiome enables the quantification of thousands of microbial pathways using the KEGG database. As part of the efficacy trials we describe here, we identified an average of 377 strains, 363 species, and 102 genera per stool sample, and a total of 2,930 strains, 2,007 species, and 451 genera in all stool samples combined. In addition, we identified an average of 1,916 KOs and 202 pathways per stool sample, and 5,467 KOs and 262 pathways in all stool samples combined. In blood samples, we quantified the expression of an average of 11,687 genes per sample and 15,434 genes in all samples combined.
Metabolic pathways and functional scores
We have designed functional scores to quantify certain biological phenomena that are relevant to human physiology and healthy homeostasis, for example leaky gut, inflammation, gas production, protein fermentation, cellular health, and mitochondrial health. Functional scores are weighted functions (Score = C_1 F_1 + C_2 F_2 + … + C_n F_n, where F_i is the i-th feature and C_i is its weight) of components from the molecular data from the gut microbiome and/or blood transcriptome. The components that make up the functional scores can be taxa, microbial pathways, human pathways, or other functional scores. Functional scores range from 0 to 100, with 100 representing the highest possible activity. For example, a microbiome-derived functional score such as Butyrate Production Pathways is calculated as a weighted function consisting of the expression levels of many known butyrate-associated KOs (Figure 2) [46]. The expression level of each KO from the Butyrate Production Pathways score is quantified using the metatranscriptomic stool test and the score is then computed using the weighted formula [47]. The weights attributed to each KO within a pathway are determined by a combination of domain knowledge and statistical analyses obtained from a collection of 200,000 stool and blood samples [48][49]. By quantifying each enzymatic member (KO) of the butyrate pathways using microbial and human gene expression data, accurate pathway activities can be calculated [50].
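The weighted formula can be illustrated with a minimal Python sketch. It is not the production scoring code: the KO labels (KO_1, KO_2, KO_3), the weights, and the feature values are hypothetical, and the sketch assumes that each feature has already been normalized to the 0 to 100 range and that the weights are non-negative and sum to 1, so that the weighted sum also falls between 0 and 100. The clipping step is a safeguard added for the illustration only.

```python
from typing import Mapping


def functional_score(features: Mapping[str, float],
                     weights: Mapping[str, float]) -> float:
    """Compute Score = C_1*F_1 + ... + C_n*F_n and clip it to [0, 100].

    `features` maps a component name (e.g. a KO) to its normalized
    activity; `weights` maps the same names to their weights.
    Components missing from `features` are treated as zero activity
    (an assumption of this sketch).
    """
    raw = sum(weights[name] * features.get(name, 0.0) for name in weights)
    return max(0.0, min(100.0, raw))


# Hypothetical inputs: three butyrate-associated KOs with made-up values.
example_features = {"KO_1": 80.0, "KO_2": 50.0, "KO_3": 20.0}
example_weights = {"KO_1": 0.5, "KO_2": 0.3, "KO_3": 0.2}

if __name__ == "__main__":
    # 0.5*80 + 0.3*50 + 0.2*20, i.e. about 59 for these made-up inputs
    print(functional_score(example_features, example_weights))
```

Because functional scores can themselves be components of other scores, the same function could in principle be applied hierarchically, with the output of one call used as a feature in another.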
Figure 2. Microbial butanoate (butyrate) production pathways based on KEGG annotations [46].
Making personalized nutritional recommendations
Precision nutritional recommendations are computed using the Viome AI Recommendation Engine and are designed to address an individual’s biological patterns at the functional level by boosting beneficial (health-associated) and suppressing harmful (disease-associated) activities with molecular ingredients from foods and supplements (Figure 1). This approach is built on the concept that on a molecular level, a particular food or supplement may be beneficial for one person, but harmful to a different person. Table 1 shows examples of this concept. The Viome AI Recommendation Engine uses a complex set of algorithms to determine the final food and supplement recommendation. The algorithms are developed from the domain knowledge (publications on microbial and human physiology, food science, clinical trials, etc.), phenotypic information, and extensive clinical studies from which machine-learned models of nutritional modulation of the microbiome and human physiology were developed (e.g. [26]). These algorithms are applied to the functional scores and phenotype for each person whose stool, or stool and blood samples are analyzed. Phenotype is determined from information provided by a participant in the wellness questionnaire, such as symptom assessment, known health conditions, allergies, and medications.
Table 1. Examples of specific foods and supplements, and reasons for recommendation to consume or avoid.
The Viome AI Recommendation Engine considers compounds (molecular ingredients) in foods and supplements that can support the healthy functions of both the gut microbiome and the human host. These compounds include specific polysaccharides, polyphenols, vitamins, minerals, amino acids, fatty acids, and many phytochemicals. This approach highlights the concept that a single food is more than simply its macronutrient content and that foods from the same family can have very different molecular compositions. For example, an almond is a source of many phytonutrients and compounds such as kaempferol (flavonoid), naringenin (flavonoid), ferulic acid (phenolic acid), oxalic acid, phytic acid, quercetin, procyanidin B2 and B3, and magnesium; vitamins such as retinol, alpha-tocopherol, vitamin K, and vitamin D; phytosterols such as beta-sitosterol; fatty acids such as oleic acid, linoleic acid, and palmitic acid; and specific amino acids [51]. The decision to recommend a specific compound and its amount depends on the values of multiple functional scores. After considering all inputs, the recommendation engine classifies foods into one of four categories based on the molecular composition of each food. The food categories are superfoods, enjoy foods, minimize foods, and avoid foods, which are consumer-friendly names that correspond to the recommended servings per day for each food.
Personalized supplement recommendations follow the same logic, considering all inputs to identify compounds that are beneficial or harmful to an individual’s functional scores and phenotype. Supplements include minerals, vitamins, botanicals or herbs, food extracts, enzymes, phospholipids, amino acids, prebiotics, and probiotics. When considering individual functional scores, supplement ingredients commonly believed to be beneficial may not be recommended. For example, turmeric is a commonly consumed supplement for its anti-inflammatory properties, but has also been shown to increase bile flow [52]. For individuals with a high Bile Acid Metabolism Pathway functional score, turmeric supplementation may be more harmful than beneficial. A high Bile Acid Metabolism Pathway score suggests that the microbial activity of transforming bile salts into bile acids is high. While such biotransformation is part of a balanced gut microbiota and bile acid homeostasis, excessive intestinal bile acids may promote a pro-inflammatory environment and play a role in the development of gastrointestinal diseases [53][54].
The process of categorizing foods and supplements (determining the servings or dose) includes prioritizing scores that need improvement and considering conflicts within the recommendations. An example is shown in Table 2: a low Energy Production Pathway functional score will yield recommendations based on compounds that contribute to the score activity, one of which is alpha-lipoic acid (ALA). Spinach and broccoli are recommended due to the ALA content that is a critical cofactor for mitochondrial energy production enzymes such as pyruvate dehydrogenase (PDH), alpha-ketoglutarate dehydrogenase (alpha-KGDH), and branched-chain ketoacid dehydrogenase (BCKDC) [55]. However, when considering additional score results, broccoli and spinach will be placed on the avoid food list due to broccoli’s glucosinolate content and spinach’s oxalate content. Instead, tomatoes and peas are recommended as sources of ALA to support the Energy Production Pathway functional score.
Table 2. Food recommendations case example.
There are circumstances where a beneficial compound cannot be obtained from food due to an allergy or other health-related issue, or due to the lack of sufficient amounts in food. Personalized supplements help support those gaps in nutrition. In Table 2, pistachios are recommended due to their CoQ10 content. However, if an individual has an allergy to pistachios or exhibits small intestinal bacterial overgrowth (SIBO) symptoms, pistachios will be placed on the avoid or minimize food list. In this situation, CoQ10 can be provided through a supplement as the recommendation engine associates it as beneficial for the same score and/or phenotypic conditions.
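The conflict handling described in the two paragraphs above can be sketched as follows. This is an illustrative simplification, not the Viome AI Recommendation Engine: the function name `resolve` and the lookup table `FOOD_SOURCES` are hypothetical, and the table contains only the foods and compounds named in the text. The sketch assumes that the avoid list has already been derived from other functional scores and from the phenotype.

```python
# Compound -> foods that supply it (only the examples named in the text).
FOOD_SOURCES = {
    "alpha-lipoic acid": ["spinach", "broccoli", "tomatoes", "peas"],
    "CoQ10": ["pistachios"],
}


def resolve(compounds, avoid_foods, food_sources=FOOD_SOURCES):
    """Map beneficial compounds to foods, skipping foods on the avoid list.

    Returns (foods, supplements). A compound is routed to the supplement
    list only when no permitted food source remains for it.
    """
    foods, supplements = [], []
    for compound in compounds:
        allowed = [f for f in food_sources.get(compound, [])
                   if f not in avoid_foods]
        if allowed:
            # Preserve order and avoid listing the same food twice.
            foods.extend(f for f in allowed if f not in foods)
        else:
            supplements.append(compound)
    return foods, supplements


if __name__ == "__main__":
    # Broccoli (glucosinolates), spinach (oxalates), and pistachios
    # (allergy or SIBO symptoms) are excluded for this hypothetical person.
    avoid = {"broccoli", "spinach", "pistachios"}
    print(resolve(["alpha-lipoic acid", "CoQ10"], avoid))
```

With these inputs, tomatoes and peas remain as food sources of alpha-lipoic acid, while CoQ10 falls back to a supplement because its only listed food source is excluded.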
Efficacy of precision recommendations
We performed single-arm, blinded, interventional studies to measure the clinical efficacy of precision recommendations in reducing the symptoms of irritable bowel syndrome (IBS), depression, and anxiety.
The primary endpoints for the three conditions were:
For IBS, the IBS Symptom Severity Score (IBS-SSS, the Rome Foundation), a clinically validated questionnaire on a scale of 0 through 500, with 300+ indicating severe IBS, 175 to 300 indicating moderate IBS, and 75 to 175 indicating mild IBS.
For depression, the PHQ9 score, a clinically validated questionnaire of 9 questions that yields a score of 0 through 27, with 15+ indicating moderately severe or severe depression, 10 to 15 indicating moderate depression, and 5 to 10 indicating mild depression [56].
For anxiety, the GAD7 score, a clinically validated questionnaire of 7 questions that yields a score of 0 through 21, with 15+ indicating severe anxiety, 10 to 15 indicating moderate anxiety, and 5 to 10 indicating mild anxiety [57].
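The severity bands for the three endpoints can be encoded as a small lookup, sketched below. The stated ranges share their boundary values (for example, 10 appears in both "5 to 10" and "10 to 15"), so this sketch assumes that each boundary belongs to the more severe band, consistent with the "15+" and "300+" notation. The label "below mild" for scores under the lowest cutoff is also an assumption, as is the use of the single label "severe" for the top PHQ9 band, which the text describes as moderately severe or severe.

```python
from bisect import bisect_right

# Instrument -> (lower cutoffs for mild, moderate, severe; maximum score).
BANDS = {
    "IBS-SSS": ([75, 175, 300], 500),
    "PHQ9": ([5, 10, 15], 27),
    "GAD7": ([5, 10, 15], 21),
}
LABELS = ["below mild", "mild", "moderate", "severe"]


def severity(instrument: str, score: float) -> str:
    """Return the severity band for a questionnaire score.

    A score equal to a cutoff is assigned to the higher band
    (assumption of this sketch).
    """
    cutoffs, max_score = BANDS[instrument]
    if not 0 <= score <= max_score:
        raise ValueError(f"{instrument} score must be in 0..{max_score}")
    return LABELS[bisect_right(cutoffs, score)]


if __name__ == "__main__":
    for name, value in [("PHQ9", 16), ("GAD7", 10), ("IBS-SSS", 120)]:
        print(name, value, severity(name, value))
```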
There are several reasons this pilot trial was blinded. The main one is that the trial participants were recruited to participate in a wellness study in which we collected survey information. None of the diseases or symptoms were mentioned to the potential participants. We created a single “wellness questionnaire” that included questions from all three of the above questionnaires, as well as a number of additional general wellness questions. This way, participants were not aware that we were performing a condition-specific study for any of the three conditions (hence “blinded”).
Each interventional study had a single arm. At time point T1, each participant was given precision nutritional recommendations (diet and supplements) and asked to fill out the wellness questionnaire. At time point T2 each subject was asked to fill out the wellness questionnaire again. There were no additional communications with the participants between the two time points.
Figures 3A, 3B, and 3C show the changes in the interventional study endpoints following the precision recommendations for the three conditions studied.
Figure 3A. Results showing the efficacy of precision nutritional recommendations on subjects with IBS (N=118) over a mean time period of 161 days, measured using the IBS-SSS score. Bars on the right are mean +/− standard deviation of IBS-SSS score.
Figure 3B. Results showing the efficacy of precision nutritional recommendations on subjects with depression (N=143) over a mean time period of 144 days, measured using the PHQ9 score. Bars on the right are mean +/− standard deviation of PHQ9 score.
Figure 3C. Results showing the efficacy of precision nutritional recommendations on subjects with anxiety (N=101) over a mean time period of 158 days, measured using the GAD7 score. Bars on the right are mean +/− standard deviation of GAD7 score. Note that moderate anxiety did not improve as significantly as the others (p=0.09).
We performed another interventional study to test the efficacy of our precision nutritional recommendations on people with type 2 diabetes (T2D). The primary endpoint for this study was a T2D risk score built using data from the gut microbiome metatranscriptomic analyses of over 50,000 subjects, and validated against an independent cohort of over 2,200 subjects (cite). At time point T1, participants donated a stool sample from which their T2D risk score was calculated, and they were provided their precision nutritional recommendations. At time point T2, these participants donated a second stool sample from which their T2D risk score was calculated again. They were also asked to rate their adherence to the recommendations as low or high.
To evaluate the effect of adhering to the precision recommendations, we compared the difference in T2D risk scores between the “low adherence” cohort and the “high adherence” cohort. The two cohorts were matched by (a) starting risk score of +/− 10 points (i.e., within 20 points) and (b) period of adherence of +/− 60 days (i.e., within a total of 120 days), yielding a cohort size of N=1456, evaluated over a mean time period of 332 days. Figure 4 shows the difference in the change of the T2D risk score (y axis) between the low-adherent participants (blue box plot) and high-adherent participants (green box plot). We observe a statistically significant risk score improvement for people with high adherence (p=1.99e-05), with a mean risk score reduction of 7 points, which translates to 30% improvement (over the initial risk score of ∼23).
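The matching criteria and the percentage figure above can be made concrete with a short sketch. The function names and the record fields (`start_score`, `days`) are hypothetical, and the matching rule reflects one reading of the criteria: two participants are treated as comparable when their starting risk scores differ by at most 10 points and their adherence periods differ by at most 60 days. The actual matching procedure may differ.

```python
def is_matched(a: dict, b: dict,
               max_score_gap: float = 10.0,
               max_days_gap: float = 60.0) -> bool:
    """True when two participants satisfy both matching criteria.

    Each participant is a dict with `start_score` (T2D risk score at T1)
    and `days` (time between T1 and T2); both field names are
    illustrative.
    """
    return (abs(a["start_score"] - b["start_score"]) <= max_score_gap
            and abs(a["days"] - b["days"]) <= max_days_gap)


def percent_improvement(initial_score: float, reduction: float) -> float:
    """Express an absolute score reduction as a percentage of the start."""
    return 100.0 * reduction / initial_score


if __name__ == "__main__":
    # Mean reduction of 7 points from an initial risk score of about 23.
    print(round(percent_improvement(23, 7), 1))  # 30.4
```

A reduction of 7 points from a starting score of about 23 is 7/23, or roughly 30%, which is the figure reported for the high-adherence cohort.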
Figure 4. Participants who adhered to precision recommendations reduced their T2D risk score significantly.
Discussion
We describe a novel technology platform that overcomes some of the major shortcomings of nutritional science and improves clinical outcomes in several highly prevalent chronic diseases [58][59][60]. Traditional nutritional research has relied on several methods. Epidemiological studies have been used to derive nutritional rules from large populations. These have yielded interesting results that have guided certain national-scale dietary recommendations, such as the current USDA guidelines in the USA. Many clinical and pre-clinical studies have also tested the effects of certain foods and diets on various health outcomes. Additionally, thousands of different supplements, including prebiotics and probiotics, and their combinations have been tested for their ability to improve certain symptoms. All these approaches have yielded varied results for different people; that is, some people benefit from the intervention and others do not, or some people are harmed by certain foods or diets, and others are not.
Over the last decade, it has become clear that the gut microbiome plays a crucial role in modulating the human physiology by regulating its immune system, metabolic functions, hormones, and neurotransmitters. It is important to note that the microbiome’s influence on human physiology is exerted via its functions, and not merely the taxonomy. For example, the fact that a person’s gut microbiome contains Faecalibacterium prausnitzii, which is a well-known butyrate producer, does not mean that it is producing butyrate. Only when it encounters the correct fiber substrate will it produce this important metabolite. Escherichia coli is a bacterial species that contains very beneficial members that produce vitamins K and B12, and help human hosts efficiently absorb iron; it also has deadly members, such as the enterohemorrhagic strains [61][62][63]. Therefore, identifying the species of Escherichia coli in a gut microbiome is not very informative, given the vastly different functions that members of this taxon can perform.
The key to understanding the effects of the gut microbiome on human physiology and health is to quantify the microbial functions using metatranscriptomic, metaproteomic, or metabolomic approaches. We have chosen metatranscriptomics as the best approach, as it enables quantitative measurements of microbial gene expression that can identify both protein-based and metabolite-based effects on human physiology, identify molecular ingredients in foods that can modulate microbial functions, and still provide the highest resolution of taxonomic classification (down to the strain level), which can guide some aspects of the nutritional recommendations. Another essential aspect of understanding the relationship between human physiology, the microbiome, and diet is to integrate the human molecular data into the equation. We use a whole blood transcriptome test that can quantify the expression levels of approximately 12,000 human genes from capillary (finger prick) blood.
We also want to emphasize that foods should be viewed as containers of molecular ingredients, instead of objects that we can visually recognize as onions, peppers, etc. The reason for this is that each food contains many different molecular ingredients, as exemplified above, that can exert a multitude of health-related effects on the host, either directly or via the microbiome metabolism.
Here, we describe a highly personalized, data-driven, AI recommendation system that uses each person’s molecular data (from stool, or stool and blood) and phenotype (symptoms, medications, etc.) to compute the best diet and supplements for each person. The recommendations are computed in two stages. First, the specific molecular ingredients that should be consumed or avoided are identified, based on which microbial or human functions they support. Next, these molecular ingredients are mapped onto foods that are available in stores. Using this approach, we provided food and supplement recommendations to study participants and collected the clinical outcomes measures at baseline (prior to delivering the nutritional recommendations) and at follow-up. We present the data that show significant improvement in clinical outcomes for all conditions and disease activity levels. This novel precision nutrition approach that uses highly individualized molecular data and machine-learned algorithms should be considered by healthcare professionals and health coaches as an effective supplement to their existing therapeutic strategies.
Limitations of the study
The studies presented here were single arm interventional studies without control arms. However, the participants were blinded to the fact they were in an interventional study, which minimized any placebo effect. We also did not capture the level of adherence to the nutritional recommendations for the studies on IBS, depression, and anxiety, which could significantly affect the observed results. The diet and supplement recommendations were delivered digitally, and no effort was made to either monitor or improve compliance. All of these limitations will be addressed in future studies.
References
- [1].
- [2].
- [3].
- [4].
- [5].
- [6].
- [7].
- [8].
- [9].
- [10].
- [11].
- [12].
- [13].
- [14].
- [15].
- [16].
- [17].
- [18].
- [19].
- [20].
- [21].
- [22].
- [23].
- [24].
- [25].
- [26].
- [27].
- [28].
- [29].
- [30].
- [31].
- [32].
- [33].
- [34].
- [35].
- [36].
- [37].
- [38].
- [39].
- [40].
- [41].
- [42].
- [43].
- [44].
- [45].
- [46].
- [47].
- [48].
- [49].
- [50].
- [51].
- [52].
- [53].
- [54].
- [55].
- [56].
- [57].
- [58].
- [59].
- [60].
- [61].
- [62].
- [63].