PT - JOURNAL ARTICLE
AU - Anna Niehues
AU - Daniele Bizzarri
AU - Marcel J.T. Reinders
AU - P. Eline Slagboom
AU - Alain J. van Gool
AU - Erik B. van den Akker
AU - Peter A.C. ’t Hoen
AU - the BBMRI-NL BIOS and Metabolomics Consortia
TI - Metabolic predictors of phenotypic traits can replace and complement measured clinical variables in transcriptome-wide association studies
AID - 10.1101/2022.02.01.478610
DP - 2022 Jan 01
TA - bioRxiv
PG - 2022.02.01.478610
4099 - http://biorxiv.org/content/early/2022/02/02/2022.02.01.478610.short
4100 - http://biorxiv.org/content/early/2022/02/02/2022.02.01.478610.full
AB - Transcriptome-wide association studies (TWAS) can provide valuable insights into biological and disease-underlying mechanisms. For studying clinical effects, the availability of (confounding) phenotypic traits is essential. The (re)use of RNA-seq or other omics data can be limited by missing, incomplete, or inaccurate phenotypic information. One possible solution is the use of molecular predictors that infer clinical or behavioral phenotypic traits. Such predictors have been developed based on different omics data types and are being applied in various studies. In this study, we applied 17 metabolic predictors to infer various traits, including diabetes status or exposure to lipid medication. We evaluated whether these metabolic surrogates can be used as an alternative to reported information for studying the respective phenotypes using TWAS. Our results revealed that in most cases, the use of metabolic surrogates yields results similar to those obtained with reported information, making them suitable substitutes for such studies. The application of metabolomics-derived surrogate outcomes opens new possibilities for reuse of multi-omics data sets, especially in situations where availability of clinical metadata is limited. Missing or incomplete information can be complemented by these surrogates, thereby increasing the size of available data sets.
Studies would likely also benefit from the use of such surrogates to correct for potential biological confounding. This should be further investigated.

Author summary
Transcriptome-wide association studies (TWAS) can be used to associate gene expression levels with phenotypic traits. These associations can provide insights into biological mechanisms, including those that underlie diseases. Such studies require molecular profiling data from a large number of individuals as well as information on the phenotypic trait of interest. Biobanks that collect samples and corresponding molecular data also collect information on phenotypic traits or clinical information. However, this information can be heterogeneous or incomplete, or a certain piece of information could be missing entirely. In this study, we apply metabolic predictors to infer various traits, including diabetes status or exposure to lipid medication. We evaluate whether these metabolic surrogates can be used as an alternative to reported information for studying the respective phenotypes using TWAS. Our results reveal that in many cases, the use of metabolic surrogates yields results similar to those obtained with reported information, making them suitable substitutes for such studies. The possibility of using these surrogate outcomes can thus increase the size of data sets for studies where phenotypic information is incomplete or missing.

Competing Interest Statement
The authors have declared no competing interest.