RT Journal Article
SR Electronic
T1 Bakta: Rapid & standardized annotation of bacterial genomes via alignment-free sequence identification
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 2021.09.02.458689
DO 10.1101/2021.09.02.458689
A1 Oliver Schwengers
A1 Lukas Jelonek
A1 Marius Dieckmann
A1 Sebastian Beyvers
A1 Jochen Blom
A1 Alexander Goesmann
YR 2021
UL http://biorxiv.org/content/early/2021/09/02/2021.09.02.458689.abstract
AB Command line annotation software tools have continuously gained popularity compared to centralized online services due to the worldwide increase of sequenced bacterial genomes. However, results of existing command line software pipelines heavily depend on taxon-specific databases or sufficiently well-annotated reference genomes. Here, we introduce Bakta, a new command line software tool for the robust, taxon-independent, thorough and nonetheless fast annotation of bacterial genomes. Bakta conducts a comprehensive annotation workflow including the detection of small proteins, taking into account replicon metadata. The annotation of coding sequences is accelerated via an alignment-free sequence identification approach that in addition facilitates the precise assignment of public database cross-references. Annotation results are exported in GFF3 and INSDC-compliant flat files as well as comprehensive JSON files facilitating automated downstream analysis. We compared Bakta to other rapid contemporary command line annotation software tools in both targeted and taxonomically broad benchmarks including isolates and metagenomic-assembled genomes. We demonstrated that Bakta outperforms other tools in terms of functional annotations, the assignment of functional categories and database cross-references whilst providing comparable wall clock runtimes. Bakta is implemented in Python 3 and runs on macOS and Linux systems. It is freely available under a GPLv3 license at https://github.com/oschwengers/bakta.
An accompanying web version is available at https://bakta.computational.bio. Competing Interest Statement: The authors have declared no competing interest.