RT Journal Article
SR Electronic
T1 Uniform Resolution of Compact Identifiers for Biomedical Data
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 101279
DO 10.1101/101279
A1 Sarala M. Wimalaratne
A1 Nick Juty
A1 John Kunze
A1 Greg Janée
A1 Julie A. McMurry
A1 Niall Beard
A1 Rafael Jimenez
A1 Jeffrey Grethe
A1 Henning Hermjakob
A1 Tim Clark
YR 2017
UL http://biorxiv.org/content/early/2017/08/18/101279.abstract
AB Compact identifiers have been widely used in biomedical informatics both formally and informally. They consist of two parts: 1) a unique prefix or namespace indicating the assigning authority and 2) a locally assigned database identifier sometimes called an accession number. The former is used to avoid global identifier collisions when integrating separately managed datasets that are run by different communities and consortia under a variety of autonomous data management systems and practices. This bi-partite identifier approach predates the invention of the Web, but can be leveraged to work more harmoniously with it. Identifiers.org and N2T.net are two meta-resolvers that take any given identifier from over 500 source databases and reliably redirect it to its original source on the Web. Identifiers.org is based at the European Molecular Biology Laboratory–European Bioinformatics Institute (EMBL-EBI) and serves the biomedical domain, whereas N2T.net (Name-to-Thing) is based at the California Digital Library (CDL), University of California Office of the President, and is domain-agnostic. Both resolvers, while derived from independently developed code bases with different features and objectives, can now uniformly resolve compact identifiers using a set of common procedures and redirection rules. Here we report on significant further work by our teams toward a more unified approach to making compact identifiers available for long-term use in an ecosystem supporting formal citation of primary scholarly research data. This approach is intended to be robust beyond the operational and funding scope of any one organization, enabling long-term resolution of persistent archived data, whether it is cited in the literature or referenced on the Web at large.
We demonstrate that multiple resolvers with fundamentally different underlying code bases, organizational settings and international alignments can readily support this approach. As part of this project, we have deployed public, production-quality resolvers using a common registry of prefix-based redirection rules. We believe these products and our approach will be of significant help to publishers, authors and others implementing persistent, machine-resolvable citation of research data in compliance with emerging science policy recommendations and funder requirements.