Abstract
The biology of bacterial cells is, in general, based on the information encoded on circular chromosomes. Regulation of chromosome replication is an essential process that mostly takes place at the origin of replication (oriC). Identifying large numbers of oriC sequences is a prerequisite for systematic studies that could yield insights into oriC function as well as novel drug targets for antibiotic development. Current methods for identifying oriC sequences rely on chromosome-wide nucleotide disparities and are therefore limited to fully sequenced genomes, leaving a superabundance of genomic fragments unstudied. Here, we present γBOriS (Gammaproteobacterial oriC Searcher), which accurately identifies oriC sequences on gammaproteobacterial chromosomal fragments by employing motif-based DNA classification. Using γBOriS, we created BOriS DB, which currently contains 25,827 oriC sequences from 1,217 species, making it the largest available database of oriC sequences to date.
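To illustrate what a chromosome-wide nucleotide disparity looks like, the following minimal sketch computes a cumulative GC skew, a commonly used disparity measure whose minimum often lies near the replication origin on a complete chromosome. It is not the γBOriS method, and the function names `cumulative_gc_skew` and `skew_minimum` are hypothetical, chosen only for this example. Because the minimum is only meaningful relative to the whole chromosome, the same calculation on an isolated fragment says little about where oriC lies.

```python
from itertools import accumulate


def cumulative_gc_skew(seq: str) -> list[int]:
    """Running sum of +1 for each G and -1 for each C along the sequence."""
    steps = (1 if base == "G" else -1 if base == "C" else 0 for base in seq.upper())
    return list(accumulate(steps))


def skew_minimum(seq: str) -> int:
    """Index of the first position where the cumulative GC skew is lowest.

    Assumption: seq is a complete, non-empty chromosome sequence; on a
    fragment the result is only a local minimum.
    """
    skew = cumulative_gc_skew(seq)
    return min(range(len(skew)), key=skew.__getitem__)


if __name__ == "__main__":
    toy = "CCCCAAGGGG"  # toy sequence, not real genomic data
    print(cumulative_gc_skew(toy))
    print(skew_minimum(toy))
```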