bioRxiv
New Results

Wikidata as a semantic framework for the Gene Wiki initiative

Sebastian Burgstaller-Muehlbacher, Andra Waagmeester, Elvira Mitraka, Julia Turner, Tim Putman, Justin Leong, Paul Pavlidis, Lynn Schriml, Benjamin M. Good, Andrew I. Su
doi: https://doi.org/10.1101/032144
Author affiliations
1) The Scripps Research Institute, La Jolla, CA, United States: Sebastian Burgstaller-Muehlbacher, Julia Turner, Tim Putman, Benjamin M. Good, Andrew I. Su
2) micelio.be, Antwerp, Belgium: Andra Waagmeester
3) University of Maryland Baltimore, Baltimore, MD, United States: Elvira Mitraka, Lynn Schriml
4) The University of British Columbia, Vancouver, BC, Canada: Justin Leong, Paul Pavlidis

Abstract

Open biological data is distributed across many resources, which makes it difficult to integrate, update and disseminate quickly. Wikidata is a growing, open community database that can serve this purpose and that also integrates tightly with Wikipedia.

To improve the state of biological data and to facilitate its management and dissemination, we imported all human and mouse genes and all human and mouse proteins into Wikidata. In total, 59,530 human genes and 73,130 mouse genes were imported from NCBI, and 27,662 human proteins and 16,728 mouse proteins were imported from the Swiss-Prot subset of UniProt. Because Wikidata is open and can be edited by anybody, our corpus of imported data serves as a starting point for the integration of further data by scientists, the Wikidata community and citizen scientists alike. The first use case for this data is to populate Wikipedia Gene Wiki infoboxes directly from Wikidata. This enables immediate updates of the Gene Wiki infoboxes as soon as the data in Wikidata is modified. Although Gene Wiki pages currently exist only on the English-language Wikipedia, the multilingual nature of Wikidata allows the imported data to be used in all 280 language editions of Wikipedia. Beyond the Gene Wiki infobox use case, a powerful SPARQL endpoint and up-to-date export functionality (e.g. JSON, XML) make the data convenient for scientists to reuse.

In summary, we created a fully open and extensible resource for human and mouse molecular biology and biochemistry data. This resource enriches all Wikipedias with structured information and serves as a new linking hub for the biological semantic web.
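
The SPARQL access mentioned in the abstract can be illustrated with a short script. What follows is a minimal sketch, not the authors' code. Several details are assumptions that the abstract does not state: the endpoint URL, the property identifiers P351 (Entrez Gene ID) and P703 (found in taxon), the item Q15978631 (Homo sapiens), and every function name. The sample response is a hand-written placeholder in the standard SPARQL JSON results layout, not real Wikidata output, so the sketch runs without network access.

```python
import json
from urllib.parse import urlencode

# Assumption: public Wikidata query service URL (not given in the abstract).
SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"


def build_human_gene_query(limit=5):
    """Return a SPARQL query for items that carry an Entrez Gene ID
    (assumed property P351) and are found in Homo sapiens
    (assumed property P703, assumed item Q15978631)."""
    return (
        "SELECT ?gene ?geneLabel ?entrez WHERE {\n"
        "  ?gene wdt:P351 ?entrez ;\n"
        "        wdt:P703 wd:Q15978631 .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        f"}} LIMIT {int(limit)}"
    )


def build_request_url(query):
    """Encode the query as a GET request URL that asks for JSON results."""
    return SPARQL_ENDPOINT + "?" + urlencode({"query": query, "format": "json"})


def parse_bindings(response_text):
    """Flatten a SPARQL JSON result document into a list of plain dicts."""
    data = json.loads(response_text)
    rows = []
    for binding in data["results"]["bindings"]:
        rows.append({
            "item": binding["gene"]["value"],
            "label": binding.get("geneLabel", {}).get("value", ""),
            "entrez": binding["entrez"]["value"],
        })
    return rows


# Hypothetical placeholder response; the values are NOT real Wikidata data.
SAMPLE_RESPONSE = json.dumps({
    "head": {"vars": ["gene", "geneLabel", "entrez"]},
    "results": {"bindings": [{
        "gene": {"type": "uri",
                 "value": "http://www.wikidata.org/entity/Q_PLACEHOLDER"},
        "geneLabel": {"type": "literal", "value": "EXAMPLE_GENE"},
        "entrez": {"type": "literal", "value": "0000"},
    }]},
})

if __name__ == "__main__":
    query = build_human_gene_query(limit=3)
    print(build_request_url(query))
    print(parse_bindings(SAMPLE_RESPONSE))
```

To query the live service, the URL from `build_request_url` could be fetched with any HTTP client and the response body passed to `parse_bindings`. The network call is left out so that the sketch stays self-contained.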

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.
Posted November 19, 2015.