Abstract
We developed mbctools, a Python package that provides a cross-platform tool for processing amplicon data from various organisms in metabarcoding studies. It handles the most common tasks in metabarcoding pipelines, such as paired-end merging, primer trimming, quality filtering, sequence denoising, and zero-radius operational taxonomic unit (ZOTU) filtering, and can process multiple genetic markers simultaneously. mbctools is a menu-driven program that requires no command-line expertise and documents each analysis for reproducibility. Designed to run in a console, it offers an interactive experience guided by keyboard input: it assists users throughout data processing and hides the complexity of command lines, letting them concentrate on selecting the parameters to apply at each step. Our workflow uses VSEARCH to process fastq files derived from amplicon-based Next-Generation Sequencing data. VSEARCH is a versatile open-source tool for processing amplicon sequences, offering high speed, efficient memory usage, and the ability to handle large datasets. It provides functions for tasks such as dereplication, clustering, chimera detection, and taxonomic assignment, and is thus very efficient at retrieving the overall diversity of a sample. To accommodate the diversity of metabarcoding projects, mbctools makes it easy to reprocess datasets with adjusted parameters. mbctools can also be launched in headless mode, making it suitable for integration into pipelines running in High-Performance Computing environments. mbctools is available at https://github.com/GuilhemSempere/mbctools and https://pypi.org/project/mbctools/.
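To make the processing steps named above more concrete, the following minimal Python sketch assembles VSEARCH command lines for merging, quality filtering, dereplication, denoising, and chimera removal. It is an illustration under stated assumptions, not the mbctools implementation: the function name, file naming scheme, and default thresholds are hypothetical, and primer trimming is omitted. Only the VSEARCH options shown are taken from the VSEARCH command-line interface.

```python
from pathlib import Path


def build_vsearch_commands(sample, r1, r2, outdir="out",
                           max_ee=1.0, min_size=8, alpha=2.0):
    """Return VSEARCH command lines (as argument lists) for one sample.

    Hypothetical helper: names, paths, and defaults are illustrative
    assumptions and do not reflect how mbctools is written.
    """
    out = Path(outdir)
    merged = str(out / f"{sample}.merged.fastq")
    filtered = str(out / f"{sample}.filtered.fasta")
    derep = str(out / f"{sample}.derep.fasta")
    denoised = str(out / f"{sample}.denoised.fasta")
    zotus = str(out / f"{sample}.zotus.fasta")

    return [
        # 1. Merge paired-end reads.
        ["vsearch", "--fastq_mergepairs", str(r1), "--reverse", str(r2),
         "--fastqout", merged],
        # 2. Quality filtering on the maximum expected error.
        ["vsearch", "--fastq_filter", merged,
         "--fastq_maxee", str(max_ee), "--fastaout", filtered],
        # 3. Dereplication, keeping abundance annotations.
        ["vsearch", "--derep_fulllength", filtered,
         "--sizeout", "--output", derep],
        # 4. Denoising (UNOISE algorithm) into candidate ZOTUs.
        ["vsearch", "--cluster_unoise", derep,
         "--minsize", str(min_size), "--unoise_alpha", str(alpha),
         "--sizeout", "--centroids", denoised],
        # 5. De novo chimera removal on the denoised sequences.
        ["vsearch", "--uchime3_denovo", denoised, "--nonchimeras", zotus],
    ]


commands = build_vsearch_commands("S1", "S1_R1.fastq", "S1_R2.fastq")
for cmd in commands:
    print(" ".join(cmd))
```

With VSEARCH installed and on the PATH, each argument list could be passed to `subprocess.run(cmd, check=True)`; the sketch only prints the commands so that it runs without VSEARCH.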
Competing Interest Statement
The authors have declared no competing interest.
Footnotes
Only the formatting was changed for the review accepted by PCI, i.e. the logo.