PT - JOURNAL ARTICLE
AU - Jonathon Brenner
AU - Laurynas Kalesinskas
AU - Catherine Putonti
TI - Exploring the Diversity of Bacillus whole genome sequencing projects using Peasant, the Prokaryotic Assembly and Annotation Tool
AID - 10.1101/132084
DP - 2017 Jan 01
TA - bioRxiv
PG - 132084
4099 - http://biorxiv.org/content/early/2017/04/28/132084.short
4100 - http://biorxiv.org/content/early/2017/04/28/132084.full
AB - Background: The persistent decrease in cost and difficulty of whole genome sequencing of microbial organisms has led to a dramatic increase in the number of species and strains characterized from a wide variety of environments. Microbial genome sequencing can now be conducted by small laboratories and as part of undergraduate curricula. While sequencing is routine in microbiology, assembly, annotation and downstream analyses still require computational resources and expertise, often necessitating familiarity with programming languages. To address this problem, we have created a light-weight, user-friendly tool for the assembly and annotation of microbial sequencing projects. Results: The Prokaryotic Assembly and Annotation Tool, Peasant, automates the processes of read quality control, genome assembly, and annotation for microbial sequencing projects. High-quality assemblies and annotations can be generated by Peasant without the need for programming expertise or high-performance computing resources. Furthermore, statistics are calculated so that users can evaluate their sequencing project. To illustrate the computational speed and accuracy of Peasant, the SRA records of 322 Illumina platform whole genome sequencing assays for Bacillus species were retrieved from NCBI, assembled and annotated on a single desktop computer.
From the assemblies and annotations produced, a comprehensive analysis of the diversity of over 200 high-quality samples was conducted, looking at both the 16S rRNA phylogenetic marker and the Bacillus core genome. Conclusions: Peasant provides an intuitive solution for high-quality whole genome sequence assembly and annotation for users with limited programming experience and/or computational resources. The analysis of the Bacillus whole genome sequencing projects exemplifies the utility of this tool. Furthermore, the study conducted here provides insight into the diversity of the species and is the largest such comparison conducted to date.