PT - JOURNAL ARTICLE
AU - Suma Tiruvayipati
AU - Tan Wen Ying
AU - Timothy Barkham
AU - Swaine L. Chen
TI - GBS-SBG - GBS Serotyping by Genome Sequencing
AID - 10.1101/2021.06.16.448630
DP - 2021 Jan 01
TA - bioRxiv
PG - 2021.06.16.448630
4099 - http://biorxiv.org/content/early/2021/06/17/2021.06.16.448630.short
4100 - http://biorxiv.org/content/early/2021/06/17/2021.06.16.448630.full
AB - Group B Streptococcus (GBS; Streptococcus agalactiae) is the most common cause of neonatal meningitis and a rising cause of sepsis in adults. Recently, it has also been shown to cause foodborne disease. As with many other bacteria, the polysaccharide capsule of GBS is antigenic, enabling its use for strain serotyping. Recent advances in DNA sequencing have made sequence-based typing attractive, and it has been implemented for several other bacteria, including Escherichia coli, the Klebsiella pneumoniae species complex, and Streptococcus pyogenes. For GBS, existing whole-genome sequencing (WGS)-based serotyping systems do not cover all known GBS serotypes (specifically, the subtypes of serotype III), and none is compatible with both of the two most common data types, raw short reads and assembled sequences. Here, we create a serotyping database (GBS-SBG, GBS Serotyping by Genome Sequencing), with associated scripts and running instructions, that can be used to call all currently described GBS serotypes, including subtypes of serotype III, using both direct short-read- and assembly-based typing. We achieved higher concordance using GBS-SBG on a previously reported data set of 790 strains. We further validated GBS-SBG on a new set of 572 strains, achieving 99.8% concordance with PCR-based molecular serotyping using either short-read- or assembly-based typing.
The GBS-SBG package is publicly available and will accelerate and simplify serotyping by sequencing for GBS.

DATA SUMMARY
The GBS-SBG package is open source and available on GitHub under the MIT license (URL - https://github.com/swainechen/GBS-SBG).
Accession numbers of the sequencing reads and reference sequences from earlier reports that were used in the study are provided within the article and the supplementary data.
The WGS data for the 572 isolates used in the study are available at https://www.ncbi.nlm.nih.gov/bioproject/PRJNA293392

Competing Interest Statement
The authors have declared no competing interest.