bioRxiv
New Results

TAMPA: interpretable analysis and visualization of metagenomics-based taxon abundance profiles

Varuni Sarwal, Jaqueline Brito, Serghei Mangul, David Koslicki
doi: https://doi.org/10.1101/2022.04.28.489926
Varuni Sarwal
1Department of Computer Science, University of California Los Angeles, 580 Portola Plaza, Los Angeles, CA 90095, USA
Jaqueline Brito
2Department of Clinical Pharmacy, University of Southern California, Los Angeles, California, 90089, United States
Serghei Mangul
2Department of Clinical Pharmacy, University of Southern California, Los Angeles, California, 90089, United States
David Koslicki
3Department of Computer Science and Engineering, The Pennsylvania State University, University Park, PA, USA
4Department of Biology, The Pennsylvania State University, University Park, PA, USA
5Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, PA, USA
  • For correspondence: dmk333@psu.edu

Abstract

Metagenomic taxonomic profiling aims to predict the identity and relative abundance of taxa in a given whole genome sequencing metagenomic sample. A recent surge in computational methods that aim to accurately estimate taxonomic profiles, called taxonomic profilers, has motivated community-driven efforts to create standardized benchmarking datasets, standardized taxonomic profile formats, and a benchmarking platform to assess tool performance. While this standardization is essential, there is currently a lack of tools to visualize the standardized output of the many existing taxonomic profilers, and benchmarking studies rely on single-value metrics to compare tools with one another and with benchmarking datasets. Here we report the development of TAMPA (Taxonomic metagenome profiling evaluation), a robust and easy-to-use method that allows scientists to interpret and interact with taxonomic profiles produced by the many different taxonomic profilers, beyond the standard metrics used by the scientific community. We demonstrate the unique ability of TAMPA to provide important biological insights into the taxonomic differences between samples that are otherwise missed by commonly utilized metrics. Additionally, we show that TAMPA can help visualize the output of taxonomic profilers, enabling biologists to effectively choose the most appropriate profiling method to use on their metagenomics data. TAMPA is available on GitHub, Bioconda and Galaxy Toolshed at https://github.com/dkoslicki/TAMPA, and is released under the MIT license.

Introduction

Metagenomics has become an essential tool to study microbiomes due to improvements in technology and bioinformatic algorithms. Taxonomic metagenome profiling aims to predict the identity and relative abundances of taxa in a given whole genome sequencing (WGS) metagenomic sample. A recent surge in computational methods that aim to accomplish this, called taxonomic profilers, has motivated community-driven efforts to create standardized benchmarking datasets 1–3, standardized taxonomic profile formats 4, and a benchmarking platform to assess tool performance on simulated data 5. While this standardization is essential, there is currently a lack of tools to visualize the standardized output of the many existing taxonomic profilers, and benchmarking studies rely on single-value metrics to compare tools with one another and with benchmarking datasets. Indeed, the only two WGS taxonomic profiling visualization and analysis tools that do exist are either integrated into a single taxonomic profiling method 6, or lack the flexibility and interpretability needed for the analysis and visualization of multiple taxonomic profiles 7. Neither of these tools is designed for or compatible with the community-driven output formats mentioned above.

Despite the availability of flexible and interactive visualization tools in the area of amplicon microbial analysis (such as 16S rRNA studies), similar methods are yet to be developed for WGS metagenomics. For example, metacoder 8 is a tool that allows for visualizing, analyzing, and manipulating amplicon microbial data. However, metacoder is not designed for WGS metagenomic analyses and cannot be used for analyzing and visualizing metagenomic taxonomic profiles, because amplicon analyses rely on Operational Taxonomic Units, a concept that is not relevant to metagenomic studies. Similarly, the software package EMPress 9, described in a recently published preprint, is an interactive phylogenetic tree viewer that is not explicitly intended for the visualization of WGS taxonomic profiles.

Additionally, the lack of tools that provide an interpretable visualization of multiple taxonomic profiles limits the ability of the biomedical community to select a tool. When WGS metagenomic data is generated and a scientist wishes to determine which of the dozens 10–23 of taxonomic profilers to use, they typically rely on benchmark studies 1,24,25. These benchmark studies often use simulated data that does not accurately reflect the scientist's samples of interest. Alternatively, scientists can run their own simulation and benchmarking study tailored to their use case, but this requires a significant time investment 2. Often scientists resort to simply picking a familiar tool regardless of its performance characteristics. Given the substantial variability in the performance of taxonomic profiling tools 1,24,25, this may result in misinterpretation of their data, and relying on a single taxonomic profiling tool can lead to an interpretation of the data 26 (e.g., the presence of Bubonic plague in the New York subway system) that is later found to be inaccurate 27.

Furthermore, after a taxonomic profiler is utilized, scientists can encounter difficulty when interpreting statistical measures of differences between the estimated taxonomic frequencies and the ground truth, as well as when comparing differences between tools. This difficulty is further compounded when new or emerging statistical measures are used to characterize differences between metagenomic samples, where confusion can arise about how to interpret such measures 9, 28.

To empower biomedical researchers with a robust and easy-to-use metagenomic taxonomic profile analysis and visualization platform, we have developed the software package TAMPA (Taxonomic metagenome profiling evaluation). Our platform assists scientists in contextualizing, assessing, and extracting insight from taxonomic profiles produced by multiple taxonomic profilers when applied to either real or simulated data. TAMPA is designed to allow users to effectively analyze one or more taxonomic profiles produced by any of the numerous taxonomic profiling methods. Additionally, TAMPA can operate on the widely utilized and community-developed BIOM 29 and CAMI 1 profiling formats. We demonstrate the utility of TAMPA by showing how it illuminates important biological differences between samples and conditions that are otherwise missed by commonly utilized statistical metrics. When gold standard taxonomic profiles are available, we show how TAMPA can augment existing benchmarking platforms such as OPAL 5 by being incorporated within the tool and providing an interpretable visualization of the profiles. Additionally, we show that TAMPA can enable biologists to choose an appropriate profiling method to use on their real data when a ground truth taxonomic profile is not available, since TAMPA allows users to quickly ascertain similarities or differences in the predictions made by multiple taxonomic profiling tools.

Results

TAMPA: interpretable analysis and visualization of metagenomics-based taxon abundance profiles

TAMPA is a computational tool that allows the user to visualize one or more taxonomic profiles produced by taxonomic profiling methods, and to contextualize, assess, and extract insight from the results of multiple taxonomic profilers. TAMPA is run from a command line interface: it takes taxonomic profile files as input and produces a plot of the relative abundances of the taxa in a given sample. The input profile files are first parsed and converted into Python objects; the ETE toolkit is then used to convert these objects into trees.

Several command line options control the visualization of the tree. Users can choose among multiple graph layout formats, including pie, bar, circle and rectangular. They can further customize the graph by choosing a scaling option (log, sqrt, power) and other parameters such as the branch vertical margin, leaf separation, label font size, figure width and height, and image resolution. When the number of samples is very large and the graph becomes crowded, users can display only the nodes with abundance higher than a particular threshold, and/or add labels only to specific parts of the graph, such as the leaf nodes. Users can also choose whether to plot the L1 error or to normalize the relative abundances of the samples.

TAMPA allows users to analyze one or more samples of interest, and supports both single input taxonomic profiles and input profiles accompanied by a ground truth. Users can also specify alternate taxonomies and restrict the visualization to a particular taxonomic rank, and TAMPA can be used to study the impact of filtering low-abundance taxa. While the default database used for reading the input is the NCBI taxdump database, users can specify a different database dump file.
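The parsing, thresholding, and scaling steps described above can be illustrated with a minimal sketch. This is not TAMPA's own code: the function names (`parse_profile`, `filter_by_rank_and_threshold`, `scale`) are hypothetical, the column layout assumes a CAMI-style tab-separated profile (`TAXID`, `RANK`, `TAXPATH`, `TAXPATHSN`, `PERCENTAGE`), and the formulas chosen for the scaling options are assumptions made only for illustration.

```python
import math
from typing import Dict, List, NamedTuple


class TaxonRecord(NamedTuple):
    taxid: str
    rank: str
    taxpath: str
    name: str
    percentage: float


def parse_profile(text: str) -> List[TaxonRecord]:
    """Parse a CAMI-style profile: '@' lines are metadata, data rows are tab-separated."""
    records = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("@") or line.startswith("#"):
            continue  # skip blank lines, metadata, and the '@@TAXID ...' column header
        taxid, rank, taxpath, taxpathsn, percentage = line.split("\t")[:5]
        # The last element of the pipe-separated name path is this taxon's own name.
        name = taxpathsn.split("|")[-1]
        records.append(TaxonRecord(taxid, rank, taxpath, name, float(percentage)))
    return records


def filter_by_rank_and_threshold(
    records: List[TaxonRecord], rank: str, min_percentage: float
) -> Dict[str, float]:
    """Keep taxa at one rank whose abundance is at least min_percentage."""
    return {
        r.name: r.percentage
        for r in records
        if r.rank == rank and r.percentage >= min_percentage
    }


def scale(value: float, method: str = "linear", power: float = 2.0) -> float:
    """Rescale an abundance for display (the formulas here are illustrative assumptions)."""
    if method == "linear":
        return value
    if method == "log":
        return math.log1p(value)  # log(1 + x), so that zero abundance maps to zero
    if method == "sqrt":
        return math.sqrt(value)
    if method == "power":
        return value ** power
    raise ValueError(f"unknown scaling method: {method}")


# A toy profile, invented for this example.
EXAMPLE_PROFILE = (
    "@SampleID:example\n"
    "@Version:0.9.1\n"
    "@Ranks:superkingdom|phylum\n"
    "@@TAXID\tRANK\tTAXPATH\tTAXPATHSN\tPERCENTAGE\n"
    "2\tsuperkingdom\t2\tBacteria\t100.0\n"
    "1224\tphylum\t2|1224\tBacteria|Proteobacteria\t60.0\n"
    "1239\tphylum\t2|1239\tBacteria|Firmicutes\t39.5\n"
    "201174\tphylum\t2|201174\tBacteria|Actinobacteria\t0.5\n"
)

if __name__ == "__main__":
    phyla = filter_by_rank_and_threshold(parse_profile(EXAMPLE_PROFILE), "phylum", 1.0)
    for name, pct in phyla.items():
        print(name, pct, round(scale(pct, "sqrt"), 3))
```

In this toy profile, a threshold of 1% removes the low-abundance phylum before plotting, which is the kind of filtering whose impact the text suggests studying.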

To demonstrate the ability of TAMPA to provide an interpretable analysis and visualization of metagenomics-based taxon abundance profiles, we apply it to the outputs of three profilers from the publicly available CAMI dataset 1–3: MetaPhyler 14; mOTU 15; and Taxy-Pro 30. We demonstrate three major ways in which TAMPA provides a novel way to visualize the outputs of existing profilers and visualization platforms.

TAMPA enables effective comparison of the outputs of multiple profilers

First, TAMPA can be used to compare the outputs of multiple profilers and reveal insight even when traditional metrics report no differences. TAMPA does this by identifying which specific clades contributed to metric values, thus revealing biological differences that could otherwise be overlooked when looking only at metric values. We choose two profilers with an identical UniFrac score on a particular sample, Taxy-Pro and MetaPhyler, and demonstrate the specific differences in their predictions of taxonomic profiles using TAMPA at the phylum level (Figure 1), as well as at other taxonomic levels (Figures S1–S5).
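To make the point about identical metric values concrete, here is a small sketch with made-up numbers. It uses the L1 distance as a simpler stand-in for UniFrac, and the profiles `truth`, `tool_a`, and `tool_b` are invented for illustration; they are not CAMI results. Both hypothetical tools are equally far from the truth by the summary metric, yet they err on different clades.

```python
from typing import Dict

Profile = Dict[str, float]  # clade name -> relative abundance in percent


def l1_distance(a: Profile, b: Profile) -> float:
    """Sum of absolute abundance differences over all clades seen in either profile."""
    return sum(abs(a.get(c, 0.0) - b.get(c, 0.0)) for c in set(a) | set(b))


def per_clade_difference(a: Profile, b: Profile) -> Profile:
    """Signed difference a - b for every clade seen in either profile."""
    return {c: a.get(c, 0.0) - b.get(c, 0.0) for c in sorted(set(a) | set(b))}


# Invented phylum-level profiles (percentages), for illustration only.
truth = {"Proteobacteria": 50.0, "Firmicutes": 30.0, "Bacteroidetes": 20.0}
tool_a = {"Proteobacteria": 60.0, "Firmicutes": 20.0, "Bacteroidetes": 20.0}
tool_b = {
    "Proteobacteria": 50.0,
    "Firmicutes": 30.0,
    "Bacteroidetes": 10.0,
    "Actinobacteria": 10.0,
}

if __name__ == "__main__":
    # The single-value metric cannot tell the two tools apart ...
    print(l1_distance(tool_a, truth), l1_distance(tool_b, truth))
    # ... but the per-clade view shows that they disagree with the truth in different places.
    print(per_clade_difference(tool_a, truth))
    print(per_clade_difference(tool_b, truth))
```

Here `tool_a` shifts abundance between two phyla that are truly present, while `tool_b` reports a phylum that is absent from the truth; a per-clade visualization is intended to surface exactly this kind of distinction.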

Figure 1:

Visualization of the taxonomic profiles of two tools with identical UniFrac scores of 4, Taxy-Pro vs MetaPhyler, using TAMPA on the CAMI dataset at the phylum level. The size of discs represents the total amount of relative abundance at the corresponding clade in the ground truth, or the tool prediction if that clade is not in the ground truth. If the tool predictions agree, a disc is colored half orange and half teal. The proportion of teal to orange changes with respect to the disagreement in prediction of that clade’s relative abundance between the two tools being compared. Highlighted text represents clades where the difference between the relative abundances of the prediction and ground truth exceeds 30%.

Figure 2:

Visualization of the taxonomic profile of a top performing CAMI tool in terms of L1 norm, MetaPhyler, vs the ground truth using TAMPA on the CAMI dataset at the phylum level. The size of discs represents the total amount of relative abundance at the corresponding clade in the ground truth, or the tool prediction if that clade is not in the ground truth. If the tool prediction and the ground truth agree, a disc is colored half orange and half teal. The proportion of teal to orange changes with respect to the disagreement in that clade’s relative abundance between the two profiles being compared. Highlighted text represents clades where the difference between the relative abundances of the prediction and ground truth exceeds 30%.

Figure 3:

Visualization of the taxonomic profile of the lowest performing tool in terms of L1 norm, mOTU, vs the ground truth using TAMPA on the CAMI dataset at the phylum level. The size of discs represents the total amount of relative abundance at the corresponding clade in the ground truth, or the tool prediction if that clade is not in the ground truth. If the tool prediction and the ground truth agree, a disc is colored half orange and half teal. The proportion of teal to orange changes with respect to the disagreement in that clade’s relative abundance between the two profiles being compared. Highlighted text represents clades where the difference between the relative abundances of the prediction and ground truth exceeds 30%.

Second, even when tool performance is distinguishable by traditional numerical metrics, TAMPA can be used to quickly ascertain how tool predictions differ from the ground truth profile. For example, we chose the top (Figure 2) and bottom (Figure 3) performing tools in terms of the L1 norm according to the CAMI challenge, MetaPhyler and mOTU respectively, and demonstrate that TAMPA can illuminate important biological differences between each tool and the ground truth at the phylum level (Figures 2 and 3), as well as at all other taxonomic ranks (Figures S6–S15).
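The figure captions state that clade labels are highlighted when the difference between predicted and ground truth relative abundance exceeds 30%. The sketch below shows one possible reading of that rule; it is not taken from the TAMPA source. It assumes the difference is measured relative to the larger of the two abundances, and the function name `highlighted_clades` and the example profiles are hypothetical.

```python
from typing import Dict, List

Profile = Dict[str, float]  # clade name -> relative abundance in percent


def highlighted_clades(
    prediction: Profile, truth: Profile, threshold: float = 0.30
) -> List[str]:
    """Return clades whose relative disagreement exceeds the threshold.

    Assumption: disagreement = |prediction - truth| / max(prediction, truth),
    so a clade present in only one of the two profiles has disagreement 1.0.
    """
    flagged = []
    for clade in sorted(set(prediction) | set(truth)):
        p = prediction.get(clade, 0.0)
        t = truth.get(clade, 0.0)
        largest = max(p, t)
        if largest == 0.0:
            continue  # absent from both profiles, nothing to compare
        if abs(p - t) / largest > threshold:
            flagged.append(clade)
    return flagged


# Invented phylum-level profiles (percentages), for illustration only.
truth = {"Proteobacteria": 50.0, "Firmicutes": 30.0, "Bacteroidetes": 20.0}
prediction = {
    "Proteobacteria": 45.0,
    "Firmicutes": 15.0,
    "Bacteroidetes": 20.0,
    "Actinobacteria": 5.0,
}

if __name__ == "__main__":
    print(highlighted_clades(prediction, truth))
```

Under this reading, a clade that is halved in the prediction and a clade predicted but absent from the truth are both flagged, while a small shift in a dominant clade is not. Other definitions of the 30% difference (for example, absolute percentage points) would flag a different set.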

TAMPA augments existing benchmarking platforms

TAMPA can be used to augment existing benchmarking platforms. We have integrated TAMPA into the taxonomic profiling benchmarking platform OPAL 5 in order to provide biological insight when scientists and tool developers aim to benchmark and compare taxonomic profilers (Figure S16). OPAL is a popular web-based tool used to compute commonly used performance metrics for profiler outputs. While OPAL provides global metrics and visualizations, it is unable to provide specific information on the taxonomic differences in the profiles. With the inclusion of TAMPA in OPAL, users can now quickly ascertain the performance of the tools being analyzed at a level of resolution not possible before. For example, by utilizing the figures returned by TAMPA, a user can quantify tool performance on a particular taxonomic clade of interest. Based on our results (e.g., Supplementary Figure S1), we show that TAMPA can highlight important taxonomic differences easily missed by statistical metrics, thus enabling biologists to choose the most appropriate profiling method to use on their data.

Discussion

Metagenomics has emerged as a technology of choice for analyzing microbial communities, with thousands of WGS metagenomic samples being produced annually 31. Taxonomic profiling is an important first step in analyzing these kinds of data. Hence, TAMPA will be of broad interest to scientists engaged in such research, because the results of this first step can now be interpreted more easily, allowing scientists to quickly contextualize, assess, and extract insight from taxonomic profiles instead of relying primarily on statistical summaries or manual manipulation. Indeed, TAMPA was effectively applied in the second round of the Critical Assessment of Metagenome Interpretation (CAMI) competition 32, where it was used to visualize the taxa that were most difficult to classify correctly. We present this tool as an example of how adoption of the standards put forward by the CAMI consortium can help facilitate tool development.

Code availability

TAMPA is provided in a platform-independent fashion via Bioconda 33 (https://anaconda.org/vsarwal/tampa) and is integrated into the Galaxy Toolshed 34 for easy “point and click” analysis by less computationally inclined users (https://toolshed.g2.bx.psu.edu/repository?repository_id=7b5054a8c1e84051).

The source code is available at https://github.com/dkoslicki/TAMPA. All code required to produce the figures and analyses in this paper is freely available at https://github.com/Addicted-to-coding/TAMPA_publication.

Methods

TAMPA was run on the profiling datasets generated in the CAMI challenge. The profile files were extracted from the GitHub repository of the CAMI challenge: https://github.com/CAMI-challenge/firstchallenge_evaluation/tree/master/profiling/data/profile_submissions. The description.property file, found in the corresponding subdirectory of each tool at https://github.com/CAMI-challenge/firstchallenge_evaluation/tree/master/profiling/data/profile_submissions, was used to map the anonymous name to the tool name. We limited our analysis to Sample 1 of the high complexity dataset, denoted by CAMI_HIGH_S001. We studied tools with the highest and lowest precision, recall and UniFrac score. The following command was used to run TAMPA: python src/profile_to_plot.py -i tool.profile -g ground_truth rank -b basename -k linear -o.

TAMPA accepts several command line options from the user, such as the input and ground truth profiles, the output file name, and the sample of interest, as well as visualization options including font size, label size and label width, layout, and leaf separation. It then parses them into parameters for the plot. Depending on the values the user supplies for these parameters, the ete3 toolkit is used to generate a tree, which is saved when the tool finishes running.
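As a rough picture of the tree-generation step, the sketch below builds a taxonomy tree from pipe-separated lineages and their abundances. TAMPA itself uses the ete3 toolkit for this; the plain-Python `Node` class, `build_tree`, and `render` here are hypothetical stand-ins chosen so that the example has no external dependencies, and the input rows are invented.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Node:
    name: str
    abundance: float = 0.0
    children: Dict[str, "Node"] = field(default_factory=dict)


def build_tree(rows: List[Tuple[str, float]]) -> Node:
    """Build a tree from (pipe-separated lineage, percentage) rows."""
    root = Node("root")
    for lineage, percentage in rows:
        node = root
        for name in lineage.split("|"):
            # Walk down the lineage, creating nodes that do not exist yet.
            node = node.children.setdefault(name, Node(name))
        node.abundance = percentage  # the row describes the last taxon in the lineage
    return root


def render(node: Node, depth: int = 0) -> List[str]:
    """Return an indented text rendering of the tree, children in name order."""
    lines = [f"{'  ' * depth}{node.name} ({node.abundance:g})"]
    for child in sorted(node.children.values(), key=lambda n: n.name):
        lines.extend(render(child, depth + 1))
    return lines


# Invented rows, for illustration only.
ROWS = [
    ("Bacteria", 100.0),
    ("Bacteria|Proteobacteria", 60.0),
    ("Bacteria|Firmicutes", 40.0),
]

if __name__ == "__main__":
    print("\n".join(render(build_tree(ROWS))))
```

A plotting library would then map each node's abundance to a disc size and apply the layout and label options supplied on the command line.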

Data availability

TAMPA was run on the .profile files produced by the top and bottom performing taxonomic profilers. The taxonomic profiles represent the taxonomic identities and relative abundances of microbial community members from metagenome samples. The profile files used to run TAMPA are freely available in the GitHub repository of the CAMI challenge: https://github.com/CAMI-challenge/firstchallenge_evaluation/tree/master/profiling/data/profile_submissions.

Supplementary Materials

Supplementary Figures

Figure S1:

Visualization of the taxonomic profiles of two tools with identical UniFrac scores of 4, Taxy-Pro vs MetaPhyler, using TAMPA on the CAMI dataset at the class rank. Note the differences in taxa predictions even though the tools have identical UniFrac scores.

Figure S2:

Visualization of the taxonomic profiles of two tools with identical UniFrac scores of 4, Taxy-Pro vs MetaPhyler, using TAMPA on the CAMI dataset at the order rank.

Figure S3:

Visualization of the taxonomic profiles of two tools with identical UniFrac scores of 4, Taxy-Pro vs MetaPhyler, using TAMPA on the CAMI dataset at the family rank.

Figure S4:

Visualization of the taxonomic profiles of two tools with identical UniFrac scores of 4, Taxy-Pro vs MetaPhyler, using TAMPA on the CAMI dataset at the genus rank.

Figure S5:

Visualization of the taxonomic profiles of two tools with identical UniFrac scores of 4, Taxy-Pro vs MetaPhyler, using TAMPA on the CAMI dataset at the species rank.

Figure S6:

Visualization of the taxonomic profiles of a top performing CAMI tool, MetaPhyler, vs the ground truth using TAMPA on the CAMI dataset at the class level.

Figure S7:

Visualization of the taxonomic profiles of a top performing CAMI tool, MetaPhyler, vs the ground truth using TAMPA on the CAMI dataset at the order level.

Figure S8:

Visualization of the taxonomic profiles of a top performing CAMI tool, MetaPhyler, vs the ground truth using TAMPA on the CAMI dataset at the family level.

Figure S9:

Visualization of the taxonomic profiles of a top performing CAMI tool, MetaPhyler, vs the ground truth using TAMPA on the CAMI dataset at the genus level.

Figure S10:

Visualization of the taxonomic profiles of a top performing CAMI tool, MetaPhyler, vs the ground truth using TAMPA on the CAMI dataset at the species level.

Figure S11:

Visualization of the taxonomic profiles of the lowest performing tool, mOTU vs the ground truth using TAMPA on the CAMI dataset at the class level.

Figure S12:

Visualization of the taxonomic profiles of the lowest performing tool, mOTU vs the ground truth using TAMPA on the CAMI dataset at the order level.

Figure S13:

Visualization of the taxonomic profiles of the lowest performing tool, mOTU vs the ground truth using TAMPA on the CAMI dataset at the family level.

Figure S14:

Visualization of the taxonomic profiles of the lowest performing tool, mOTU vs the ground truth using TAMPA on the CAMI dataset at the genus level.

Figure S15:

Visualization of the taxonomic profiles of the lowest performing tool, mOTU vs the ground truth using TAMPA on the CAMI dataset at the species level.

Figure S16:

Incorporation of TAMPA into OPAL

References

  1. Sczyrba, A. et al. Critical Assessment of Metagenome Interpretation-a benchmark of metagenomics software. Nat. Methods 14, 1063–1071 (2017).
  2. Meyer, F. et al. Tutorial: Assessing metagenomics software with the CAMI benchmarking toolkit. 2020.08.11.245712 (2020) doi:10.1101/2020.08.11.245712.
  3. Mangul, S. et al. Systematic benchmarking of omics computational tools. Nat. Commun. 10, 1393 (2019).
  4. Meyer, F. et al. Tutorial: Assessing metagenomics software with the CAMI benchmarking toolkit. Nat. Protoc. (under revision).
  5. Meyer, F. et al. Assessing taxonomic metagenome profilers with OPAL. Genome Biol. 20, 51 (2019).
  6. Asnicar, F., Weingart, G., Tickle, T. L., Huttenhower, C. & Segata, N. Compact graphical representation of phylogenetic data and metadata with GraPhlAn. PeerJ 3, e1029 (2015).
  7. Ondov, B. D., Bergman, N. H. & Phillippy, A. M. Interactive metagenomic visualization in a Web browser. BMC Bioinformatics 12, 385 (2011).
  8. Foster, Z. S. L., Sharpton, T. J. & Grünwald, N. J. Metacoder: An R package for visualization and manipulation of community taxonomic diversity data. PLoS Comput. Biol. 13, e1005404 (2017).
  9. Cantrell, K. et al. EMPress enables tree-guided, interactive, and exploratory analyses of multi-omic datasets. 2020.10.06.327080 (2020) doi:10.1101/2020.10.06.327080.
  10. Koslicki, D. & Falush, D. MetaPalette: a k-mer Painting Approach for Metagenomic Taxonomic Profiling and Quantification of Novel Strain Variation. mSystems 1, (2016).
  11. Piro, V. C., Lindner, M. S. & Renard, B. Y. DUDes: a top-down taxonomic profiler for metagenomics. Bioinformatics 32, 2272–2280 (2016).
  12. Silva, G. G. Z., Cuevas, D. A., Dutilh, B. E. & Edwards, R. A. FOCUS: an alignment-free model to identify organisms in metagenomes using non-negative least squares. PeerJ 2, e425 (2014).
  13. Segata, N. et al. Metagenomic microbial community profiling using unique clade-specific marker genes. Nat. Methods 9, 811–814 (2012).
  14. Liu, B., Gibbons, T., Ghodsi, M., Treangen, T. & Pop, M. Accurate and fast estimation of taxonomic profiles from metagenomic shotgun sequences. BMC Genomics 12 Suppl 2, S4 (2011).
  15. Sunagawa, S. et al. Metagenomic species profiling using universal phylogenetic marker genes. Nat. Methods 10, 1196–1199 (2013).
  16. Nguyen, N.-P., Mirarab, S., Liu, B., Pop, M. & Warnow, T. TIPP: taxonomic identification and phylogenetic profiling. Bioinformatics 30, 3548–3555 (2014).
  17. Lu, J., Breitwieser, F. P., Thielen, P. & Salzberg, S. L. Bracken: estimating species abundance in metagenomics data. PeerJ Comput. Sci. 3, e104 (2017).
  18. Koslicki, D., Foucart, S. & Rosen, G. WGSQuikr: fast whole-genome shotgun metagenomic classification. PLoS One 9, e91784 (2014).
  19. Milanese, A. et al. Microbial abundance, activity and population genomic profiling with mOTUs2. Nat. Commun. 10, 1014 (2019).
  20. Shi, L. & Chen, B. A Vector Representation of DNA Sequences Using Locality Sensitive Hashing. 726729 (2019) doi:10.1101/726729.
  21. Marcelino, V. R. et al. CCMetagen: comprehensive and accurate identification of eukaryotes and prokaryotes in metagenomic data. Genome Biol. 21, 103 (2020).
  22. LaPierre, N., Alser, M., Eskin, E., Koslicki, D., Mangul, S. Metalign: efficient alignment-based metagenomic profiling via containment min hash. (Github).
  23. Kim, D., Song, L., Breitwieser, F. P. & Salzberg, S. L. Centrifuge: rapid and sensitive classification of metagenomic sequences. Genome Res. 26, 1721–1729 (2016).
  24. McIntyre, A. B. R. et al. Comprehensive benchmarking and ensemble approaches for metagenomic classifiers. Genome Biol. 18, 182 (2017).
  25. Lindgreen, S., Adair, K. L. & Gardner, P. P. An evaluation of the accuracy and speed of metagenome analysis tools. Sci. Rep. 6, 19233 (2016).
  26. Afshinnekoo, E. et al. Erratum: Geospatial Resolution of Human and Bacterial Diversity with City-Scale Metagenomics. Cell Syst 1, 72–87e (2015).
  27. Ackelsberg, J. et al. Lack of Evidence for Plague or Anthrax on the New York City Subway. Cell Syst 1, 4–5 (2015).
  28. McClelland, J. & Koslicki, D. EMDUniFrac: exact linear time computation of the UniFrac metric and identification of differentially abundant organisms. J. Math. Biol. 77, 935–949 (2018).
  29. McDonald, D. et al. The Biological Observation Matrix (BIOM) format or: how I learned to stop worrying and love the ome-ome. Gigascience 1, 7 (2012).
  30. Klingenberg, H., Aßhauer, K. P., Lingner, T. & Meinicke, P. Protein signature-based estimation of metagenomic abundances including all domains of life and viruses. Bioinformatics 29, 973–980 (2013).
  31. Leinonen, R., Sugawara, H., Shumway, M. & International Nucleotide Sequence Database Collaboration. The sequence read archive. Nucleic Acids Res. 39, D19–21 (2011).
  32. Meyer, F. et al. Critical Assessment of Metagenome Interpretation - the second round of challenges. bioRxiv 2021.07.12.451567 (2021) doi:10.1101/2021.07.12.451567.
  33. Grüning, B. et al. Bioconda: sustainable and comprehensive software distribution for the life sciences. Nat. Methods 15, 475–476 (2018).
  34. Blankenberg, D. et al. Dissemination of scientific software with Galaxy ToolShed. Genome Biol. 15, 403 (2014).
  35. Meyer, F. et al. Critical Assessment of Metagenome Interpretation: the second round of challenges. Nat. Methods 1–12 (2022).
Posted April 29, 2022.