PT - JOURNAL ARTICLE
AU - Danielle J. Ingle
AU - Mary Valcanis
AU - Alex Kuzevski
AU - Marija Tauschek
AU - Michael Inouye
AU - Tim Stinear
AU - Myron M. Levine
AU - Roy M. Robins-Browne
AU - Kathryn E. Holt
TI - EcOH: In silico serotyping of E. coli from short read data
AID - 10.1101/032151
DP - 2015 Jan 01
TA - bioRxiv
PG - 032151
4099 - http://biorxiv.org/content/early/2015/11/18/032151.short
4100 - http://biorxiv.org/content/early/2015/11/18/032151.full
AB - The lipopolysaccharide (O) and flagellar (H) surface antigens of Escherichia coli are targets for serotyping that have traditionally been used to identify pathogenic lineages of E. coli. As serotyping has several limitations, public health reference laboratories are increasingly moving towards whole genome sequencing (WGS) for the rapid characterisation of bacterial isolates. Here we present a method to rapidly and accurately serotype E. coli isolates from raw, short read sequence data, leveraging the known genetic basis for the biosynthesis of O- and H-antigens. Our approach bypasses the need for de novo genome assembly by directly screening WGS reads against a curated database of alleles linked to known E. coli O-groups and H-types (the EcOH database) using the software package SRST2. We validated our approach by comparing in silico results with those obtained via serological phenotyping of 197 enteropathogenic (EPEC) isolates. We also demonstrated the utility of our method to characterise enterotoxigenic E. coli (ETEC) and the uropathogenic E. coli (UPEC) epidemic clone ST131, and for in silico serotyping of foodborne outbreak-related isolates in the public GenomeTrakr database.