PT - JOURNAL ARTICLE
AU - Yasset Perez-Riverol
AU - Pablo Moreno
TI - Scalable data analysis in proteomics and metabolomics using BioContainers and workflows engines
AID - 10.1101/604413
DP - 2019 Jan 01
TA - bioRxiv
PG - 604413
4099 - http://biorxiv.org/content/early/2019/04/11/604413.short
4100 - http://biorxiv.org/content/early/2019/04/11/604413.full
AB - Recent improvements in mass spectrometry instruments and new analytical methods are increasing the intersection between proteomics and big data science. In addition, bioinformatics analysis is becoming an increasingly complex and convoluted process involving multiple algorithms and tools. A wide variety of methods and software tools have been developed for computational proteomics and metabolomics during recent years, and this trend is likely to continue. However, most computational proteomics and metabolomics tools are designed for single desktop applications, limiting the scalability and reproducibility of the data analysis. In this paper, we give an overview of the key steps of metabolomics and proteomics data processing, including the main tools and software used to perform the data analysis. We discuss the combination of software containers with workflow environments for large-scale metabolomics and proteomics analysis. 
Finally, we introduce to the proteomics and metabolomics communities a new approach for reproducible and large-scale data analysis based on BioContainers and two of the most popular workflow environments: Galaxy and Nextflow.
Abbreviations: AWS, Amazon web services; DDA, Data-dependent acquisition; DIA, Data-independent acquisition; DSL, Domain-specific language; FDR, False discovery rate; HPC, High-performance computer; HPCS, High-performance computing systems; HUPO, Human Proteome Organization; IO, Input-output; MRM, Multiple reaction monitoring; MS, Mass spectrometry; MS/MS, Tandem mass spectrometry; LC-MS, Liquid chromatography–mass spectrometry; LSF, IBM Platform LSF; PR, Pull request; PRM, Parallel reaction monitoring; PSI, Proteomics Standards Initiative; REST API, Representational State Transfer programming interface; SRM, Selected reaction monitoring