ABSTRACT
Complex cellular functions are usually encoded by a set of genes in one or a few organized genetic loci in microbial genomes. MacSyFinder uses these properties to model and then annotate cellular functions in microbial genomes. It does so by integrating the identification of each individual gene at the level of the molecular system. We present a major release of MacSyFinder (Macromolecular System Finder), MacSyFinder version 2 (v2). This new version is written in Python 3 (>= 3.7). The code has been improved and rationalized to ease future maintenance. Several new features have been added to allow more flexible modelling of the systems. We introduce a more intuitive and comprehensive search engine that identifies all the best candidate systems, as well as sub-optimal ones, that respect the models’ constraints. We also introduce macsydata, a new companion tool that enables the easy installation and broad distribution of the models developed for MacSyFinder (macsy-models) from GitHub repositories. Finally, we have updated, improved, and made available the popular MacSyFinder models for this new version: TXSScan to identify protein secretion systems, TFFscan to identify type IV filaments, CONJscan to identify conjugative systems, and CasFinder to identify CRISPR-associated proteins.
Competing Interest Statement
The authors have declared no competing interest.
Footnotes
Email addresses: Bertrand Néron: bneron{at}pasteur.fr
Eduardo Rocha: erocha{at}pasteur.fr
Sophie Abby: sophie.abby{at}univ-grenoble-alpes.fr