Abstract
BACKGROUND The Mongolian gerbil (Meriones unguiculatus) has been used as a model organism for research on the auditory and visual systems, stroke/ischemia, epilepsy, and aging since 1935, when laboratory gerbils were separated from their wild counterparts. In this study, we report genome sequencing, assembly, and annotation, further supported by transcriptome data from 27 different tissue samples.
FINDINGS The genome was assembled from Illumina HiSeq 2000 reads, giving a final assembly size of 2.54 Gbp with contig and scaffold N50 values of 31.4 Kbp and 500.0 Kbp, respectively. Compared with the k-mer estimated genome size of 2.48 Gbp, the assembly appears to be complete. The genome annotation, supported by transcriptome data from 27 tissue samples, identified 36 019 predicted protein-coding genes. A BUSCO search of 3023 mammalian groups found 86% of curated single-copy orthologs among the predicted genes, indicating a high level of completeness of the genome.
CONCLUSIONS We report a de novo assembly of the Mongolian gerbil genome, further enhanced by annotation with transcriptome data from several tissues. Sequencing of this genome increases the utility of the gerbil as a model organism by opening the way to the use of now widely used genetic tools.
The data sets supporting the results of this article are available in the China National GeneBank CNSA repository, accession ID: CNP0000340.
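For readers unfamiliar with the assembly statistics quoted in the findings, the following is a minimal illustrative sketch, not the pipeline used in this study, of how an N50 value and the ratio of assembly size to k-mer estimated genome size can be computed. The function names `n50` and `size_ratio` and the toy contig lengths are assumptions made for illustration; only the 2.54 Gbp and 2.48 Gbp figures come from the abstract.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L
    together cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("at least one sequence length is required")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


def size_ratio(assembly_gbp, kmer_estimate_gbp):
    """Ratio of assembled size to the k-mer estimated genome size."""
    return assembly_gbp / kmer_estimate_gbp


# Hypothetical toy contig lengths (not data from this study).
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))

# Figures reported in the abstract: 2.54 Gbp assembled, 2.48 Gbp estimated.
print(round(size_ratio(2.54, 2.48), 3))
```

A ratio close to 1 is consistent with the statement that the assembly appears to be complete, although it does not by itself rule out collapsed repeats or redundant sequence.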
Footnotes
Email addresses for authors:
- Shifeng Cheng: chengshifeng{at}caas.cn
- Yuan Fu: fuyuan{at}genomics.cn
- Yaolei Zhang: zhangyaolei{at}genomics.cn
- Wenfei Xian: xianwenfei{at}caas.cn
- Hongli Wang: wanghongli{at}genomics.cn
- Benedikt Grothe: grothe{at}lmu.de
- Xin Lu: liuxig{at}genomics.cn
- Xun Xu: xuxun{at}genomics.cn
- Achim Klug: achim.klug{at}ucdenver.edu
- Elizabeth A McCullagh: elizabeth.mccullagh{at}ucdenver.edu
* co-first authors
Abbreviations
- bp: base pair
- BUSCO: Benchmarking Universal Single-Copy Orthologs
- CDS: coding sequence
- LINEs: long interspersed elements
- LTRs: long terminal repeats
- Myr: million years
- NCBI: National Center for Biotechnology Information
- RefSeq: Reference sequence
- RNA-seq: high-throughput messenger RNA sequencing
- RIN: RNA integrity number
- SINEs: short interspersed elements