RT Journal Article
SR Electronic
T1 Evolution and variation of 2019-novel coronavirus
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 2020.01.30.926477
DO 10.1101/2020.01.30.926477
A1 Chenglong Xiong
A1 Lufang Jiang
A1 Yue Chen
A1 Qingwu Jiang
YR 2020
UL http://biorxiv.org/content/early/2020/01/30/2020.01.30.926477.abstract
AB Background: The current outbreak caused by the novel coronavirus (2019-nCoV) in China has become a worldwide concern. As of 28 January 2020, there were 4631 confirmed cases and 106 deaths, and 11 countries or regions were affected. Methods: We downloaded the genomes of 2019-nCoVs and similar isolates from the Global Initiative on Sharing Avian Influenza Database (GISAID) and the nucleotide database of the National Center for Biotechnology Information (NCBI). Lasergene 7.0 and MEGA 6.0 were used to calculate genetic distances between the sequences, to construct phylogenetic trees, and to align amino acid sequences. Bayesian coalescent phylogenetic analysis, implemented in the BEAST software package, was used to estimate molecular clock characteristics such as the nucleotide substitution rate and the time to the most recent common ancestor (tMRCA) of 2019-nCoVs. Results: The isolate EPI_ISL_403928 differed from the other 2019-nCoVs in its phylogenetic trees and genetic distances for the whole-length genome and for the coding sequences (CDS) of the polyprotein (P), spike protein (S), and nucleoprotein (N). At the amino acid level, there were 22, 4, and 2 variations in P, S, and N, respectively. The nucleotide substitution rates, from high to low, were 1.05 × 10⁻² nucleotide substitutions/site/year (95% HPD interval 6.27 × 10⁻⁴ to 2.72 × 10⁻²) for N, 5.34 × 10⁻³ (5.10 × 10⁻⁴ to 1.28 × 10⁻²) for S, 1.69 × 10⁻³ (3.94 × 10⁻⁴ to 3.60 × 10⁻³) for P, and 1.65 × 10⁻³ (4.47 × 10⁻⁴ to 3.24 × 10⁻³) for the whole genome.
At these nucleotide substitution rates, the most recent common ancestor of 2019-nCoVs appeared about 0.253 to 0.594 years before the epidemic. Conclusion: Our analysis suggests that at least two different viral strains of 2019-nCoV are involved in this outbreak, which may have begun a few months before it was officially reported. Abbreviations: CoVs, coronaviruses; 2019-nCoV, 2019-novel coronavirus; SARS-CoV, severe acute respiratory syndrome coronavirus; MERS-CoV, Middle East respiratory syndrome coronavirus; CDS, coding sequence; tMRCA, the most recent common ancestor; GISAID, the Global Initiative on Sharing Avian Influenza Database; ESSs, effective sample sizes.