Abstract
Biomedical data, in particular omics datasets, are being generated at an unprecedented rate. This is due to the falling costs of generating experimental data, improved accuracy, and better accessibility of different omics platforms such as genomics, proteomics and metabolomics [1,2]. As a result, the number of datasets deposited in public repositories from the various omics approaches has increased dramatically in recent years. With strong support from scientific journals and funders, public data sharing is increasingly considered good scientific practice: it facilitates the confirmation of original results, increases the reproducibility of analyses, enables the exploration of new or related hypotheses, fosters the identification of potential errors, and discourages fraud [3]. This increase in public deposition of omics results is a good starting point, but it opens up a series of new challenges. For example, the research community must now find more efficient ways of storing, organizing and providing access to biomedical data across platforms. These challenges range from achieving a common representation framework for the datasets and associated metadata from different omics fields to the availability of efficient methods, protocols and file formats for data exchange between multiple repositories. There is therefore a great need for new platforms and applications that make it possible to search datasets across different omics fields and make such information accessible to the end user. The FAIR paradigm describes a set of guiding principles that address many of these issues and aims to make data Findable, Accessible, Interoperable and Re-usable (https://www.force11.org/group/fairgroup/fairprinciples).
Abbreviations
- API: Application Programming Interface
- bioCADDIE: biomedical healthCAre Data Discovery and Index Ecosystem
- CV: Controlled Vocabulary
- DAC: Data Access Committee
- DOI: Digital Object Identifier
- EGA: European Genome–Phenome Archive
- EuroPMC: Europe PubMed Central
- GNPS: Global Natural Products Social Molecular Networking
- GPMDB: Global Proteome Machine Database
- MassIVE: Mass spectrometry Interactive Virtual Environment
- PRIDE: PRoteomics IDEntifications database
- OmicsDI: Omics Discovery Index
- URL: Uniform Resource Locator