TY  - JOUR
T1  - Tempus et Locus: a tool for extracting precisely dated viral sequences from GenBank, and its application to the phylogenetics of primate erythroparvovirus 1 (B19V)
JF  - bioRxiv
DO  - 10.1101/061697
SP  - 061697
AU  - Alice R. Carter
AU  - Derek Gatherer
Y1  - 2016/01/01
UR  - http://biorxiv.org/content/early/2016/07/04/061697.abstract
N2  - The presence of data in the “collection_date” field of a GenBank sequence record is of great assistance in the use of that sequence for Bayesian phylogenetics using “tip-dating”. We present Tempus et Locus (TeL), a tool for extracting such sequences from a GenBank-formatted sequence database. TeL shows that 60% of viral sequences in GenBank have collection date fields, but that this varies considerably between species. Primate erythroparvovirus 1 (human parvovirus B19 or B19V) has only 40% of its sequences dated, of which only 112 are of more than 4 kb. 100 of these are from B19V sub-genotype 1a and were collected from a mere 6 studies conducted in 5 countries between 2002 and 2013. Nevertheless, Bayesian phylogenetic analysis of this limited set gives a date for the common ancestor of sub-genotype 1a in 1990 (95% HPD 1981-1996) which is in reasonable agreement with estimates of previous studies where collection dates have been assembled by more laborious methods of literature search and direct enquiries to sequence submitters. We conclude that although collection dates should become standard for all future GenBank submissions of virus sequences, accurate dating of ancestors is possible with even a small number of sequences if sampling information is high quality.
ER  - 