bioRxiv
New Results

Interoperable and scalable data analysis with microservices: Applications in Metabolomics

Payam Emami Khoonsari, Pablo Moreno, Sven Bergmann, Joachim Burman, Marco Capuccini, Matteo Carone, Marta Cascante, Pedro de Atauri, Carles Foguet, Alejandra Gonzalez-Beltran, Thomas Hankemeier, Kenneth Haug, Sijin He, Stephanie Herman, David Johnson, Namrata Kale, Anders Larsson, Steffen Neumann, Kristian Peters, Luca Pireddu, Philippe Rocca-Serra, Pierrick Roger, Rico Rueedi, Christoph Ruttkies, Noureddin Sadawi, Reza M Salek, Susanna-Assunta Sansone, Daniel Schober, Vitaly Selivanov, Etienne A. Thevenot, Michael van Vliet, Gianluigi Zanetti, Christoph Steinbeck, Kim Kultima, Ola Spjuth
doi: https://doi.org/10.1101/213603
Payam Emami Khoonsari
1Department of Medical Sciences, Clinical Chemistry, Uppsala University, Uppsala, Sweden
Pablo Moreno
2European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, United Kingdom
Sven Bergmann
3Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
4Swiss Institute of Bioinformatics, Lausanne, Switzerland
Joachim Burman
5Department of Neuroscience, Uppsala University, Uppsala, Sweden
Marco Capuccini
6Department of Information Technology, Uppsala, Sweden
7Department of Pharmaceutical Biosciences, Uppsala University, Uppsala, Sweden
Matteo Carone
7Department of Pharmaceutical Biosciences, Uppsala University, Uppsala, Sweden
Marta Cascante
8Department of Biochemistry and Molecular Biomedicine, Faculty of Biology, Universitat de Barcelona, Barcelona, Spain
9Institute of Biomedicine of the Universitat de Barcelona (IBUB) and Associated Unit to CSIC, Barcelona, Spain
Pedro de Atauri
8Department of Biochemistry and Molecular Biomedicine, Faculty of Biology, Universitat de Barcelona, Barcelona, Spain
9Institute of Biomedicine of the Universitat de Barcelona (IBUB) and Associated Unit to CSIC, Barcelona, Spain
Carles Foguet
8Department of Biochemistry and Molecular Biomedicine, Faculty of Biology, Universitat de Barcelona, Barcelona, Spain
9Institute of Biomedicine of the Universitat de Barcelona (IBUB) and Associated Unit to CSIC, Barcelona, Spain
Alejandra Gonzalez-Beltran
10Oxford e-Research Centre, Department of Engineering Science, University of Oxford, Oxford, United Kingdom
Thomas Hankemeier
11Division of Analytical Biosciences, Leiden Academic Centre for Drug Research, Leiden University, Leiden, The Netherlands
Kenneth Haug
2European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, United Kingdom
Sijin He
2European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, United Kingdom
Stephanie Herman
1Department of Medical Sciences, Clinical Chemistry, Uppsala University, Uppsala, Sweden
7Department of Pharmaceutical Biosciences, Uppsala University, Uppsala, Sweden
David Johnson
10Oxford e-Research Centre, Department of Engineering Science, University of Oxford, Oxford, United Kingdom
Namrata Kale
2European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, United Kingdom
Anders Larsson
12National Bioinformatics Infrastructure Sweden, Uppsala University, Uppsala, Sweden
7Department of Pharmaceutical Biosciences, Uppsala University, Uppsala, Sweden
Steffen Neumann
13Department of Stress and Developmental Biology, Leibniz Institute of Plant Biochemistry, Halle, Germany
14German Centre for Integrative Biodiversity Research (iDiv), Halle-Jena-Leipzig, Germany
Kristian Peters
13Department of Stress and Developmental Biology, Leibniz Institute of Plant Biochemistry, Halle, Germany
Luca Pireddu
15CRS4: Center for Advanced Studies, Research and Development in Sardinia, Pula, Italy
Philippe Rocca-Serra
10Oxford e-Research Centre, Department of Engineering Science, University of Oxford, Oxford, United Kingdom
Pierrick Roger
16CEA, LIST, Laboratory for Data Analysis and Systems' Intelligence, MetaboHUB, Gif-sur-Yvette, France
Rico Rueedi
3Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
4Swiss Institute of Bioinformatics, Lausanne, Switzerland
Christoph Ruttkies
13Department of Stress and Developmental Biology, Leibniz Institute of Plant Biochemistry, Halle, Germany
Noureddin Sadawi
17Faculty of Medicine, Department of Surgery & Cancer, Imperial College London, London, United Kingdom
Reza M Salek
18International Agency for Research on Cancer, 150 cours Albert Thomas, 69372 Lyon CEDEX 08, France
Susanna-Assunta Sansone
10Oxford e-Research Centre, Department of Engineering Science, University of Oxford, Oxford, United Kingdom
Daniel Schober
13Department of Stress and Developmental Biology, Leibniz Institute of Plant Biochemistry, Halle, Germany
Vitaly Selivanov
8Department of Biochemistry and Molecular Biomedicine, Faculty of Biology, Universitat de Barcelona, Barcelona, Spain
9Institute of Biomedicine of the Universitat de Barcelona (IBUB) and Associated Unit to CSIC, Barcelona, Spain
Etienne A. Thevenot
16CEA, LIST, Laboratory for Data Analysis and Systems' Intelligence, MetaboHUB, Gif-sur-Yvette, France
Michael van Vliet
11Division of Analytical Biosciences, Leiden Academic Centre for Drug Research, Leiden University, Leiden, The Netherlands
Gianluigi Zanetti
15CRS4: Center for Advanced Studies, Research and Development in Sardinia, Pula, Italy
Christoph Steinbeck
2European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, United Kingdom
19Friedrich-Schiller-University, Jena, Germany
Kim Kultima
1Department of Medical Sciences, Clinical Chemistry, Uppsala University, Uppsala, Sweden
Ola Spjuth
7Department of Pharmaceutical Biosciences, Uppsala University, Uppsala, Sweden

Abstract

Developing a robust and performant data analysis workflow that integrates all necessary components whilst still being able to scale over multiple compute nodes is a challenging task. We introduce a generic method based on the microservice architecture, where software tools are encapsulated as Docker containers that can be connected into scientific workflows and executed in parallel using the Kubernetes container orchestrator. The access point is a virtual research environment which can be launched on demand on cloud resources and desktop computers. IT-expertise requirements on the user side are kept to a minimum, and established workflows can be re-used effortlessly by any novice user. We validate our method in the field of metabolomics on two mass spectrometry studies, one nuclear magnetic resonance spectroscopy study and one fluxomics study, showing that the method scales dynamically with increasing availability of computational resources. We achieved a complete integration of the major software suites, resulting in the first turn-key workflow encompassing all steps for mass-spectrometry-based metabolomics, including preprocessing, multivariate statistics, and metabolite identification. Microservices are a generic methodology that can serve any scientific discipline and open up new types of large-scale integrative science.

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. All rights reserved. No reuse allowed without permission.
Posted July 13, 2018.
Subject Area

  • Bioinformatics