RT Journal Article
SR Electronic
T1 SMRT sequencing yields the chromosome-scale reference genome of tea tree, Camellia sinensis var. sinensis
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 2020.01.02.892430
DO 10.1101/2020.01.02.892430
A1 Qun-Jie Zhang
A1 Wei Li
A1 Kui Li
A1 Hong Nan
A1 Cong Shi
A1 Yun Zhang
A1 Zhang-Yan Dai
A1 Yang-Lei Lin
A1 Xiao-Lan Yang
A1 Yan Tong
A1 Dan Zhang
A1 Cui Lu
A1 Chen-feng Wang
A1 Xiao-xin Liu
A1 Wen-Kai Jiang
A1 Xing-Hua Wang
A1 Xing-Cai Zhang
A1 Zhong-Hua Liu
A1 Evan E. Eichler
A1 Li-Zhi Gao
YR 2020
UL http://biorxiv.org/content/early/2020/01/02/2020.01.02.892430.abstract
AB Tea is the oldest and most popular nonalcoholic beverage consumed in the world. It provides abundant secondary metabolites that account for its diverse flavors and health benefits. Here we present the first high-quality chromosome-length reference genome of C. sinensis var. sinensis, using long-read single-molecule real-time (SMRT) sequencing and Hi-C technologies to anchor the ∼2.85-Gb genome assembly into 15 pseudo-chromosomes with a scaffold N50 length of ∼195.68 Mb. We annotated at least 2.17 Gb (∼74.13%) of repetitive sequences and predicted 40,812 protein-coding genes with high confidence in the ∼2.92-Gb genome assembly. This accurately assembled genome allows us to comprehensively annotate functionally important gene families, such as those involved in the biosynthesis of catechins, theanine and caffeine. The contiguous genome assembly provides the first view of the repetitive landscape, allowing us to accurately characterize retrotransposon diversity. The large tea tree genome is dominated by a handful of Ty3-gypsy long terminal repeat (LTR) retrotransposon families that recently expanded to high copy numbers. We uncover the latest bursts of numerous non-autonomous LTR retrotransposons that may interfere with the propagation of autonomous retroelements.
This reference genome sequence will greatly facilitate the improvement of agronomically important traits relevant to tea quality and production.