RT Journal Article
SR Electronic
T1 SMRT sequencing yields the chromosome-scale reference genome of tea tree, Camellia sinensis var. sinensis
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 2020.01.02.892430
DO 10.1101/2020.01.02.892430
A1 Qun-Jie Zhang
A1 Wei Li
A1 Kui Li
A1 Hong Nan
A1 Cong Shi
A1 Yun Zhang
A1 Zhang-Yan Dai
A1 Yang-Lei Lin
A1 Xiao-Lan Yang
A1 Yan Tong
A1 Dan Zhang
A1 Cui Lu
A1 Chen-feng Wang
A1 Xiao-xin Liu
A1 Wen-Kai Jiang
A1 Xing-Hua Wang
A1 Xing-Cai Zhang
A1 Zhong-Hua Liu
A1 Evan E. Eichler
A1 Li-Zhi Gao
YR 2020
UL http://biorxiv.org/content/early/2020/01/02/2020.01.02.892430.abstract
AB Tea is the oldest and most popular nonalcoholic beverage consumed in the world. It provides abundant secondary metabolites that account for its diverse flavors and health benefits. Here we present the first high-quality chromosome-length reference genome of C. sinensis var. sinensis, using long-read single-molecule real-time (SMRT) sequencing and Hi-C technologies to anchor the ∼2.85-Gb genome assembly into 15 pseudo-chromosomes with a scaffold N50 length of ∼195.68 Mb. We annotated at least 2.17 Gb (∼74.13%) of repetitive sequences and predicted 40,812 high-confidence protein-coding genes in the ∼2.92-Gb genome assembly. This accurately assembled genome allows us to comprehensively annotate functionally important gene families, such as those involved in the biosynthesis of catechins, theanine and caffeine. The contiguous genome assembly provides the first view of the repetitive landscape, allowing us to accurately characterize retrotransposon diversity. The large tea tree genome is dominated by a handful of Ty3-gypsy long terminal repeat (LTR) retrotransposon families that recently expanded to high copy numbers. We uncover the latest bursts of numerous non-autonomous LTR retrotransposons that may interfere with the propagation of autonomous retroelements. This reference genome sequence will greatly facilitate the improvement of agronomically important traits relevant to tea quality and production.