PT - JOURNAL ARTICLE
AU - Steinegger, Martin
AU - Mirdita, Milot
AU - Söding, Johannes
TI - Protein-level assembly increases protein sequence recovery from metagenomic samples manyfold
AID - 10.1101/386110
DP - 2018 Jan 01
TA - bioRxiv
PG - 386110
4099 - http://biorxiv.org/content/early/2018/08/07/386110.short
4100 - http://biorxiv.org/content/early/2018/08/07/386110.full
AB - The open-source de-novo Protein-Level ASSembler Plass (https://plass.mmseqs.org) assembles six-frame-translated sequencing reads into protein sequences. It recovers 2 to 10 times more protein sequences from complex metagenomes and can assemble huge datasets. We assembled two redundancy-filtered reference protein catalogs, 2 billion sequences from 640 soil samples (SRC) and 292 million sequences from 775 marine eukaryotic metatranscriptomes (MERC), the largest free collections of protein sequences.