TY  - JOUR
T1  - metaPR<sup>2</sup>: a database of eukaryotic 18S rRNA metabarcodes with an emphasis on protists
JF  - bioRxiv
DO  - 10.1101/2022.02.04.479133
SP  - 2022.02.04.479133
AU  - Daniel Vaulot
AU  - Clarence Wei Hung Sim
AU  - Denise Ong
AU  - Bryan Teo
AU  - Charlie Biwer
AU  - Mahwash Jamy
AU  - Adriana Lopes dos Santos
Y1  - 2022/01/01
UR  - http://biorxiv.org/content/early/2022/02/06/2022.02.04.479133.abstract
N2  - In recent years, metabarcoding has become the method of choice for investigating the composition and assembly of microbial eukaryotic communities, and an increasing number of environmental datasets are being published. Although unprocessed sequence files are often publicly available, processed data, i.e. sequences clustered as operational taxonomic units (OTUs) or amplicon sequence variants (ASVs), are rarely at hand in a comparable format. This hampers comparative studies between different environments and datasets, for example examining the biogeographical patterns of specific groups/species, as well as analysing the micro-genetic diversity within these groups. Here, we present a newly assembled database of processed 18S rRNA metabarcodes that are annotated with the PR2 reference sequence database. This database, called metaPR2, contains 41 datasets corresponding to more than 4,000 samples and 73,000 ASVs. The database is accessible both through a web-based interface (https://shiny.metapr2.org) and as an R package, and should prove very useful to all researchers working on protist diversity in a variety of systems. Competing Interest Statement: The authors have declared no competing interest.
ER  - 