TY  - JOUR
T1  - UniqTag: Content-derived unique and stable identifiers for gene annotation
JF  - bioRxiv
DO  - 10.1101/007583
SP  - 007583
AU  - Shaun Jackman
AU  - Joerg Bohlmann
AU  - Inanç Birol
Y1  - 2014/01/01
UR  - http://biorxiv.org/content/early/2014/08/01/007583.abstract
N2  - Summary: When working on an ongoing genome sequencing and assembly project, it is rather inconvenient when gene identifiers change from one build of the assembly to the next. The gene labelling system described here, UniqTag, addresses this common challenge. UniqTag assigns a unique identifier to each gene that is a representative k-mer, a string of length k, selected from the sequence of that gene. Unlike serial numbers, these identifiers are stable between different assemblies and annotations of the same data without requiring that previous annotations be lifted over by sequence alignment. We assign UniqTag identifiers to nine builds of the Ensembl human genome spanning seven years to demonstrate this stability. Availability and implementation: The implementation of UniqTag is available at https://github.com/sjackman/uniqtag. Supplementary data and code to reproduce it is available at https://github.com/sjackman/uniqtag-paper
ER  - 
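
The abstract describes each identifier as a representative k-mer selected from the gene's sequence, but it does not state the selection rule. The sketch below is one plausible reading, not the authors' implementation: it assumes the representative is the k-mer found in the fewest genes, with ties broken lexicographically. The function names `kmers` and `assign_tags`, the toy sequences, and the fallback for sequences shorter than k are all hypothetical.

```python
from collections import Counter


def kmers(seq, k):
    """Return the set of distinct substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def assign_tags(genes, k):
    """Map each gene name to a representative k-mer.

    Assumed rule (not taken from the abstract): pick the k-mer that
    occurs in the fewest genes, breaking ties lexicographically.
    """
    # Count the number of genes in which each k-mer appears.
    counts = Counter()
    for seq in genes.values():
        counts.update(kmers(seq, k))

    tags = {}
    for name, seq in genes.items():
        candidates = kmers(seq, k)
        if not candidates:
            # Assumption: a sequence shorter than k is used whole.
            tags[name] = seq
            continue
        tags[name] = min(candidates, key=lambda km: (counts[km], km))
    return tags


# Two toy "builds": g1 gains one trailing base in the second build.
build1 = {"g1": "ACGTT", "g2": "ACGGA"}
build2 = {"g1": "ACGTTA", "g2": "ACGGA"}

tags1 = assign_tags(build1, k=3)
tags2 = assign_tags(build2, k=3)
print(tags1)
print(tags2)
```

In this toy case the tag for `g1` is the same in both builds even though its sequence changed slightly, which is the kind of stability the abstract contrasts with serial numbers. Because the tag depends only on sequence content, no alignment between builds is needed to carry it over.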