Abstract
Underutilized sheep and goat breeds can adapt to challenging environments owing to their genetic composition. Integrating publicly available genomic datasets with new data will facilitate genetic diversity analyses; however, this process is complicated by discrepancies between datasets, such as outdated assembly versions or differing data formats. Here we present the SMARTER-database, a collection of tools and scripts that standardize genomic data and metadata, mainly from SNP chip arrays, on global small ruminant populations, with a focus on reproducibility. The SMARTER-database harmonizes genotypes for about 12,000 sheep and 6,000 goats to a uniform coding and assembly version. Users can access the genotype data via FTP and interact with the metadata through a web interface or programmatically from their own scripts, enabling efficient filtering and selection of samples. These tools will allow researchers to focus on the crucial aspects of adaptation and to contribute to livestock sustainability by leveraging the rich dataset provided by the SMARTER-database.
Availability & Implementation
The code is available as open-source software under the MIT license at https://github.com/cnr-ibba/SMARTER-database.
Competing Interest Statement
The authors have declared no competing interest.
Footnotes
* paolo.cozzi{at}ibba.cnr.it
List of abbreviations
- API: Application Programming Interface
- CI: Continuous Integration
- EBI: European Bioinformatics Institute
- EVA: European Variation Archive
- FAO: Food and Agriculture Organization
- FTPS: File Transfer Protocol Secure
- GNU: GNU’s Not Unix
- GPS: Global Positioning System
- HTTP: HyperText Transfer Protocol
- JSON: JavaScript Object Notation
- MAF: Minor Allele Frequency
- ODM: Object-Document Mapper
- PATO: Phenotype And Trait Ontology
- REST: Representational State Transfer
- RESTful: REST-compliant systems
- rsID: Reference SNP cluster ID
- SNP: Single Nucleotide Polymorphism
- UML: Unified Modeling Language
- URL: Uniform Resource Locator
- VCF: Variant Call Format
- WGS: Whole Genome Sequencing