bioRxiv
New Results

Outbreak.info: A standardized, searchable platform to discover and explore COVID-19 resources and data

Ginger Tsueng, Julia Mullen, Manar Alkuzweny, Marco Cano, Benjamin Rush, Emily Haag, Outbreak Curators, Alaa Abdel Latif, Xinghua Zhou, Zhongchao Qian, Kristian G. Andersen, Chunlei Wu, Andrew I. Su, Karthik Gangavarapu, Laura D. Hughes
doi: https://doi.org/10.1101/2022.01.20.477133
Ginger Tsueng
1 The Scripps Research Institute
For correspondence: gtsueng@scripps.edu
Julia Mullen
2 The Scripps Research Institute
Manar Alkuzweny
2 The Scripps Research Institute
Marco Cano
2 The Scripps Research Institute
Benjamin Rush
3 Outbreak Curators
Emily Haag
2 The Scripps Research Institute
Alaa Abdel Latif
2 The Scripps Research Institute
Xinghua Zhou
2 The Scripps Research Institute
Zhongchao Qian
2 The Scripps Research Institute
Kristian G. Andersen
2 The Scripps Research Institute
Chunlei Wu
2 The Scripps Research Institute
Andrew I. Su
4 Scripps Research Institute
Karthik Gangavarapu
2 The Scripps Research Institute
Laura D. Hughes
2 The Scripps Research Institute

Abstract

To combat the ongoing COVID-19 pandemic, scientists have been conducting research at breakneck speed, producing over 52,000 peer-reviewed articles within the first 12 months. In contrast, a little over 1,000 peer-reviewed articles were published within the first 12 months of the SARS-CoV-1 pandemic starting in 2002. In addition to publications, there has also been an upsurge in clinical trials to develop vaccines and treatments, scientific protocols to study SARS-CoV-2, methodology for epidemiological modeling, and datasets ranging from molecular studies to social science research. One of the largest challenges has been keeping track of the vast amounts of newly generated, disparate data and research that exist in independent repositories. To address this issue, we developed outbreak.info, which provides a standardized, searchable interface to heterogeneous data resources on COVID-19 and SARS-CoV-2. Unifying metadata from 14 data repositories, we have assembled a collection of over 200,000 publications, clinical trials, datasets, protocols, and other resources as of October 2021. We used a rigorous schema to enforce a consistent format across different data sources and resource types, and linked related resources where possible. This enables users to quickly retrieve information across data repositories, regardless of resource type or repository location. Outbreak.info also integrates this research library with spatiotemporal genomics data on SARS-CoV-2 variants and epidemiological data on COVID-19 cases and deaths. The web interface provides interactive visualizations and reports to explore the unified data and generate hypotheses. In addition to providing a web interface, we also publish the data we have assembled and standardized through a high-performance public API and an R package. Finally, we discuss the challenges inherent in combining metadata from scattered and heterogeneous resources and provide recommendations to streamline this process to aid scientific research.
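
As a rough illustration of what enforcing a consistent format across sources can involve, the following Python sketch maps records from two differently shaped sources onto one record type and then searches across them. Everything in it is hypothetical: the source names (repo_a, repo_b), the field names, and the Resource type are assumptions made for the example and are not the outbreak.info schema, API, or R package.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class Resource:
    """Hypothetical unified record (illustration only, not the outbreak.info schema)."""
    identifier: str
    resource_type: str
    name: str
    source: str
    date_published: Optional[str] = None


# Hypothetical mappings: unified field name -> field name used by each source.
FIELD_MAPS: Dict[str, Dict[str, str]] = {
    "repo_a": {"identifier": "pmid", "name": "title", "date_published": "pub_date"},
    "repo_b": {"identifier": "trial_id", "name": "official_title", "date_published": "start_date"},
}

# Hypothetical resource type assigned to every record from a given source.
RESOURCE_TYPES: Dict[str, str] = {"repo_a": "Publication", "repo_b": "ClinicalTrial"}

REQUIRED_FIELDS = ("identifier", "name")


def normalize(source: str, raw: Dict[str, Any]) -> Resource:
    """Convert one source-specific record into the unified Resource format."""
    mapping = FIELD_MAPS[source]
    # Reject records that lack a required field instead of storing partial data.
    missing = [f for f in REQUIRED_FIELDS if not raw.get(mapping[f])]
    if missing:
        raise ValueError(f"{source} record is missing required fields: {missing}")
    return Resource(
        # Prefix with the source name so identifiers stay unique across sources.
        identifier=f"{source}:{raw[mapping['identifier']]}",
        resource_type=RESOURCE_TYPES[source],
        name=str(raw[mapping["name"]]).strip(),
        source=source,
        date_published=raw.get(mapping["date_published"]),
    )


def search(resources: List[Resource], keyword: str) -> List[Resource]:
    """Case-insensitive keyword match on the name, regardless of type or source."""
    needle = keyword.lower()
    return [r for r in resources if needle in r.name.lower()]


if __name__ == "__main__":
    records = [
        normalize("repo_a", {"pmid": "123", "title": "SARS-CoV-2 spike structure",
                             "pub_date": "2020-05-01"}),
        normalize("repo_b", {"trial_id": "T9", "official_title": "Vaccine trial for SARS-CoV-2"}),
    ]
    for hit in search(records, "sars-cov-2"):
        print(hit.resource_type, hit.identifier, hit.name)
```

Once every record shares the same fields, a single query can span publications and clinical trials without source-specific handling, which is the benefit the abstract attributes to a common schema.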

Competing Interest Statement

The authors have declared no competing interest.

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.
Posted January 21, 2022.
Subject Area

  • Bioinformatics