ABSTRACT
The novel coronavirus (nCoV) outbreak that began in the city of Wuhan, China, in December 2019 has now spread to various countries across the globe, triggering heightened containment efforts. This human pathogen is a member of the genus Betacoronavirus and carries a 30-kilobase, single-stranded, positive-sense RNA genome. Understanding the evolution, zoonotic transmission, and source of this novel virus would help accelerate containment and prevention efforts. The present study reports a detailed analysis of 2019-nCoV genome evolution and identifies potential candidate peptides for vaccine development. This nCoV genotype might have evolved from a bat-CoV by accumulating non-synonymous mutations, indels, and recombination events. The structural proteins Spike (S) and Membrane (M) showed extensive mutational changes, whereas the Envelope (E) and Nucleocapsid (N) proteins were highly conserved, suggesting that differential selection pressures were exerted on 2019-nCoV during evolution. Interestingly, the 2019-nCoV Spike protein contains a 39-nucleotide sequence insertion (5’-aAT GGT GTT GAA GGT TTT AAT TGT TAC TTT CCT TTA CAA Tca-3’) that shares homology with a genomic sequence of the fish Myripristis murdjan, a species abundant in the Indo-Pacific Ocean. Furthermore, we identified eight high-binding-affinity (HBA) CD4 T-cell epitopes in the S, E, M, and N proteins that can be commonly recognized by HLA-DR alleles of the populations of Asia and the Asia-Pacific region. These immunodominant epitopes can be incorporated into a universal subunit CoV vaccine. Diverse HLA types and variations in epitope binding affinity may contribute to the wide range of immunopathological outcomes of the circulating virus in humans. Our findings emphasize the need for continuous surveillance of CoV strains in live animal markets, both to better understand viral adaptation to the human host and to develop practical solutions that prevent the emergence of novel pathogenic CoV strains.