bioRxiv
New Results

One tagger, many uses: Illustrating the power of ontologies in dictionary-based named entity recognition

Lars Juhl Jensen
doi: https://doi.org/10.1101/067132
Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
  • For correspondence: lars.juhl.jensen@cpr.ku.dk

Abstract

Automatic annotation of text is an important complement to manual annotation, because the latter is highly labour intensive. We have developed a fast dictionary-based named entity recognition (NER) system and addressed a wide variety of biomedical problems by applying it to text from many different sources. We have used this tagger both in real-time tools to support curation efforts and in pipelines for populating databases through bulk processing of all of Medline, the open-access subset of PubMed Central, NIH grant abstracts, FDA drug labels, electronic health records, and the Encyclopedia of Life. Despite the simplicity of the approach, it typically achieves 80–90% precision and 70–80% recall. Many of the underlying dictionaries were built from open biomedical ontologies, which further facilitate integration of the text-mining results with evidence from other sources.
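
To make the idea of dictionary-based NER concrete, the following is a minimal sketch of longest-match dictionary lookup, together with the precision and recall measures quoted above. It is not the tagger described in this preprint: the toy dictionary, the placeholder identifiers (`EX:0001`, `EX:0002`), and the function names `tag` and `precision_recall` are all assumptions made for illustration.

```python
import re

# Hypothetical toy dictionary: lowercase name -> identifier.
# The identifiers are placeholders, not entries from a real ontology.
DICTIONARY = {
    "p53": "EX:0001",
    "tumor protein p53": "EX:0001",
    "breast cancer": "EX:0002",
}

# Longest dictionary name, measured in whitespace-separated tokens.
MAX_TOKENS = max(len(name.split()) for name in DICTIONARY)


def tag(text):
    """Return (start, end, matched_text, identifier) for each dictionary hit."""
    # Character offsets of every word token in the text.
    tokens = [(m.start(), m.end()) for m in re.finditer(r"\w+", text)]
    hits = []
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate starting at token i first.
        for n in range(min(MAX_TOKENS, len(tokens) - i), 0, -1):
            start, end = tokens[i][0], tokens[i + n - 1][1]
            candidate = " ".join(text[s:e].lower() for s, e in tokens[i:i + n])
            if candidate in DICTIONARY:
                hits.append((start, end, text[start:end], DICTIONARY[candidate]))
                i += n  # skip past the matched tokens
                matched = True
                break
        if not matched:
            i += 1
    return hits


def precision_recall(predicted, gold):
    """Precision and recall of predicted annotations against a gold standard."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall
```

In this sketch, mapping several names to the same identifier is what lets synonyms resolve to one entity, which is roughly the role an ontology-derived dictionary would play; a production tagger would additionally need to handle issues such as spelling variants and ambiguous names.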

Footnotes

  • This work was in part funded by the Novo Nordisk Foundation (NNF14CC0001) and the National Institutes of Health (U54 CA189205-01).

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.
Posted August 02, 2016.

Subject Area

  • Bioinformatics