Abstract
Background Vaccinium floribundum Kunth, known as "mortiño," is a shrub species endemic to the Andean region and adapted to the harsh conditions of high-altitude ecosystems. It plays an important ecological role as a pioneer species in the aftermath of deforestation and human-induced fires within paramo ecosystems, which underscores its conservation value. While previous studies have offered insights into the genetic diversity of mortiño, the comprehensive genomic studies needed to fully understand the unique adaptations of this species and its population status are still missing, highlighting the importance of generating a reference genome for this plant.
Results ONT and Illumina sequencing were used to establish a reference genome for this species. Three different de novo genome assemblies were generated and compared for quality, continuity and completeness. The Flye assembly was selected as the best and refined by filtering out short ONT reads, screening for contaminants, and scaffolding the genome. The final assembly has a genome size of 529 Mb, containing 1,317 contigs and 97% complete BUSCOs, indicating a highly complete assembly. Additionally, an LAI of 12.93 further categorizes this assembly as a reference genome.
Conclusions The genome of V. floribundum reported in this study is the first reference genome generated for this species, providing a valuable tool for further studies. Given the quality and completeness parameters obtained, this high-quality genome will not only help uncover the genetic mechanisms responsible for the species' unique traits and adaptations to high-altitude ecosystems, but will also contribute to conservation strategies for a species endemic to the Andes.
Competing Interest Statement
The authors have declared no competing interest.
Footnotes
Institutional Addresses: Martina Albuja-Quintana, Gabriela Pozo, Milton Gordillo-Romero, Carolina E. Armijos, María de Lourdes Torres: Diego de Robles S/N y Pampite, Cumbayá, Quito, Ecuador, 170901
Email Addresses: Martina Albuja-Quintana: malbujaq@usfq.edu.ec, Gabriela Pozo: gpozo@usfq.edu.ec, Milton Gordillo-Romero: mgordillo@usfq.edu.ec, Carolina E. Armijos: carmijos@unc.edu, María de Lourdes Torres: ltorres@usfq.edu.ec
Data Availability
The genome assembly and all sequencing data have been deposited in the GenBank database under BioProject PRJNA1071645 (SAMN39706721). The script used for the assembly and annotation of this genome is described in protocols.io (dx.doi.org/10.17504/protocols.io.n92ldmo4nl5b/v1). All supporting data and materials are available in the GigaScience GigaDB database.
Abbreviations
- m.a.s.l: meters above sea level
- UV: ultraviolet light
- BUSCO: Benchmarking Universal Single-Copy Orthologs
- LAI: Long Terminal Repeat Assembly Index
- QUSF: USFQ herbarium
- HMW-DNA: High Molecular Weight DNA
- CTAB: Cetyltrimethylammonium bromide
- SRE: short-read eliminator
- ONT: Oxford Nanopore Technologies
- MaSuRCA: Maryland Super-Read Celera Assembler
- QUAST: Quality Assessment Tool for Genome Assemblies
- LTR: long terminal repeat
- BWA: Burrows-Wheeler Aligner
- NCBI: National Center for Biotechnology Information
- EST: Expressed sequence tag
- HMM: hidden Markov model
- gff: general feature format
- AGAT: Another Gff Analysis Toolkit
- Gb: gigabase
- Kb: kilobase
- bp: base pair
- Mb: megabase
- AED: Annotation Edit Distance
- CDS: coding sequence