New Results

UniqTag: Content-derived unique and stable identifiers for gene annotation

Shaun Jackman, Joerg Bohlmann, Inanҫ Birol
doi: https://doi.org/10.1101/007583
Shaun Jackman
1 Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC, Canada
2 Graduate Program in Bioinformatics, University of British Columbia, Vancouver, BC, Canada
Joerg Bohlmann
3 Michael Smith Laboratories, University of British Columbia, Vancouver, BC, Canada
4 Department of Forest Science, University of British Columbia, Vancouver, BC, Canada
Inanҫ Birol
1 Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC, Canada
5 Department of Medical Genetics, University of British Columbia, Vancouver, BC, Canada

ABSTRACT

Summary: When working on an ongoing genome sequencing and assembly project, it is inconvenient when gene identifiers change from one build of the assembly to the next. The gene labelling system described here, UniqTag, addresses this common challenge. UniqTag assigns each gene a unique identifier that is a representative k-mer, a string of length k, selected from the sequence of that gene. Unlike serial numbers, these identifiers are stable between different assemblies and annotations of the same data, without requiring that previous annotations be lifted over by sequence alignment. To demonstrate this stability, we assign UniqTag identifiers to nine builds of the Ensembl human genome spanning seven years.

Availability and implementation: The implementation of UniqTag is available at https://github.com/sjackman/uniqtag

Supplementary data and the code to reproduce it are available at https://github.com/sjackman/uniqtag-paper
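
The idea of a content-derived identifier in the summary above can be illustrated with a short sketch. The abstract does not say how the representative k-mer is chosen, so the rule used here is an assumption made for illustration: pick the k-mer that occurs in the fewest genes, and break ties by taking the lexicographically smallest. The function names `kmers` and `assign_tags`, the toy sequences, and the value of k are likewise hypothetical and are not taken from the UniqTag implementation.

```python
from collections import Counter


def kmers(seq, k):
    """Return the set of distinct k-mers of seq (seq itself if it is shorter than k)."""
    if len(seq) <= k:
        return {seq}
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def assign_tags(genes, k):
    """Map each gene name to one representative k-mer of its sequence.

    Assumed rule (not confirmed by the abstract): choose the k-mer found in
    the fewest genes, breaking ties lexicographically.
    """
    # Count, for every k-mer, the number of genes that contain it.
    counts = Counter()
    for seq in genes.values():
        counts.update(kmers(seq, k))

    # Pick the rarest k-mer of each gene; the k-mer string itself breaks ties.
    return {
        name: min(kmers(seq, k), key=lambda kmer: (counts[kmer], kmer))
        for name, seq in genes.items()
    }


# Toy "build 1" of an annotation: two genes that share a common prefix.
build1 = {"geneA": "ACGTACGTTT", "geneB": "ACGTACGAAA"}
# Toy "build 2": geneA gains one base at its end, and the serial names change.
build2 = {"g0001": "ACGTACGAAA", "g0002": "ACGTACGTTTG"}

tags1 = assign_tags(build1, k=4)
tags2 = assign_tags(build2, k=4)
print(tags1)
print(tags2)
```

In this toy example the tag chosen for each gene depends only on sequence content, so the gene that was extended by one base keeps the same tag in both builds even though its name changed. The sketch does not guarantee uniqueness: two genes with identical sequences would receive the same tag, and handling that case is left out here.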

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.
Posted August 01, 2014.