RT Journal Article
SR Electronic
T1 SCRAPP: A tool to assess the diversity of microbial samples from phylogenetic placements
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 2020.02.28.969980
DO 10.1101/2020.02.28.969980
A1 Pierre Barbera
A1 Lucas Czech
A1 Sarah Lutteropp
A1 Alexandros Stamatakis
YR 2020
UL http://biorxiv.org/content/early/2020/03/02/2020.02.28.969980.abstract
AB Microbial ecology research is currently driven by the continuously decreasing cost of DNA sequencing and the improving accuracy of data analysis methods. One such analysis method is phylogenetic placement, which establishes the phylogenetic identity of the anonymous environmental sequences in a sample by means of a given phylogenetic reference tree. However, assessing the diversity of a sample remains challenging, as traditional methods do not scale well with the increasing data volumes and/or do not leverage the phylogenetic placement information. Here, we present SCRAPP, a highly parallel and scalable tool that uses a molecular species delimitation algorithm to quantify the diversity distribution over the reference phylogeny for a given phylogenetic placement of the sample. SCRAPP employs a novel approach to cluster phylogenetic placements, called placement space clustering, to efficiently perform dimensionality reduction, so as to scale to large data volumes. Furthermore, it utilizes the phylogeny-aware molecular species delimitation method mPTP to quantify diversity. We evaluated SCRAPP using both simulated and empirical datasets. We used simulated data to verify our approach. Tests on an empirical dataset show that SCRAPP-derived metrics can classify samples by their diversity-correlated features equally well or better than existing, commonly used approaches. SCRAPP is available at https://github.com/pbdas/scrapp