Rhea: a transparent and modular R pipeline for microbial profiling based on 16S rRNA gene amplicons

PeerJ. 2017 Jan 11:5:e2836. doi: 10.7717/peerj.2836. eCollection 2017.

Abstract

The importance of 16S rRNA gene amplicon profiles for understanding the influence of microbes in a variety of environments coupled with the steep reduction in sequencing costs led to a surge of microbial sequencing projects. The expanding crowd of scientists and clinicians wanting to make use of sequencing datasets can choose among a range of multipurpose software platforms, the use of which can be intimidating for non-expert users. Among available pipeline options for high-throughput 16S rRNA gene analysis, the R programming language and software environment for statistical computing stands out for its power and increased flexibility, and the possibility to adhere to most recent best practices and to adjust to individual project needs. Here we present the Rhea pipeline, a set of R scripts that encode a series of well-documented choices for the downstream analysis of Operational Taxonomic Units (OTUs) tables, including normalization steps, alpha- and beta-diversity analysis, taxonomic composition, statistical comparisons, and calculation of correlations. Rhea is primarily a straightforward starting point for beginners, but can also be a framework for advanced users who can modify and expand the tool. As the community standards evolve, Rhea will adapt to always represent the current state-of-the-art in microbial profiles analysis in the clear and comprehensive way allowed by the R language. Rhea scripts and documentation are freely available at https://lagkouvardos.github.io/Rhea.

Keywords: 16S rRNA gene; Alpha-diversity; Beta-diversity; Correlations; Host-microbe interactions; Microbial profile; Microbiome analysis; R script; Statistical analysis.

Grants and funding

This work was supported by the German Research Foundation (DFG) and the Technical University of Munich (TUM) in the framework of the Open Access Publishing Program. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.