RT Journal Article
SR Electronic
T1 Sensitive protein sequence searching for the analysis of massive data sets
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 079681
DO 10.1101/079681
A1 Martin Steinegger
A1 Johannes Söding
YR 2017
UL http://biorxiv.org/content/early/2017/03/24/079681.abstract
AB Sequencing costs have dropped much faster than Moore’s law in the past decade, and sensitive sequence searching has become the main bottleneck in the analysis of large metagenomic datasets. We developed the parallelized, open-source software MMseqs2 (mmseqs.org), which improves on current search tools across the full range of the speed-sensitivity trade-off, achieving sensitivities better than PSI-BLAST at more than 400 times its speed. MMseqs2 offers great potential to better exploit large-scale (meta)genomic data.