The Essential Facts of Wuhan Novel Coronavirus Outbreak in China and Epitope-based Vaccine Designing against COVID-19

The Wuhan novel coronavirus disease (COVID-19) outbreak has become a global outbreak, prompting the scientific community to seek a definitive cure against this deadly virus, which has already caused the deaths of numerous infected people. To date, no antiviral therapy or vaccine is available that can effectively combat the infection caused by this virus. This study was conducted to design possible epitope-based subunit vaccines against the SARS-CoV-2 virus using the approaches of reverse vaccinology and immunoinformatics. Through successive computational experiments, three possible vaccine constructs were designed, and one construct was selected, based on a molecular docking study, as the best vaccine expected to act effectively against SARS-CoV-2. Later, molecular dynamics simulation and in silico codon adaptation experiments were carried out to check the biological stability of the selected vaccine and to find an effective strategy for its mass production. Hopefully, this study will support the present efforts of researchers to secure a definitive treatment against this lethal virus.


Origin of Coronavirus, Their Morphology, Pathology and Others
Coronaviruses (CoVs) are enveloped, positive-sense RNA viruses that are surrounded by crown-shaped, club-like spike projections on their outer surface [1] [2]. The coronavirus spike proteins are glycoproteins embedded in the viral envelope. The spike protein attaches to specific cellular receptors, which initiates structural changes in the spike protein and leads to penetration of the cell membrane, resulting in the release of the viral nucleocapsid into the cell [3]. The viral spike protein includes an N-terminal region, which is crucial for the identification of coronaviruses [4]. Coronaviruses have a large RNA genome, ranging in size from 26 to 32 kilobases, and are capable of adopting distinct modes of replication [5]. Like all other RNA viruses, coronaviruses undergo genome replication and transcription of mRNAs upon infection. A full-length negative-strand RNA is synthesized and serves as the template for the full-length genomic RNA [3] [6]. Coronaviruses are a large family of viruses belonging to the family Coronaviridae and the order Nidovirales [7], and they are common in many different species of animals, including camels, cattle, cats, and bats [8]. This group of viruses can cause a wide variety of respiratory illnesses in mammals and birds. In humans, they cause the common cold, which can progress to severe illness such as pneumonia in elderly and immunocompromised people, for example hospital patients [9]. These viral pathogens were responsible for the Middle East Respiratory Syndrome (MERS) and Severe Acute Respiratory Syndrome (SARS) outbreaks and for the 2019 novel coronavirus (SARS-CoV-2) outbreak in China [10]. Coronaviruses in the form of SARS, MERS and the novel coronavirus are lethal, leading to a large number of deaths. The coronavirus subfamily is classified into four genera: the alpha-, beta-, gamma- and delta-coronaviruses. Human coronavirus (HCoV) infections are caused by alpha- and beta-CoVs. CoVs are common human pathogens, and 30% to 60% of the Chinese population is positive for anti-CoV antibodies [11].
The viral infections caused by CoVs generally involve the upper respiratory tract and can extend to the lower respiratory tract. Immunocompromised people, as well as elderly people and infants, are more vulnerable and susceptible to this group of viruses [12].
HCoVs have been well adapted to humans for many decades and have been prevalent in human populations. The first human coronaviruses (HCoVs) were identified in the 1960s in patients with the common cold. Since then, more HCoVs have been detected, including those behind the two major outbreaks, SARS and MERS, two pathogens that, upon infection, can cause fatal respiratory disease in humans [13].
The most common coronaviruses among humans are 229E, NL63, OC43, and HKU1; some coronaviruses can evolve and cause human disease, becoming new human coronaviruses. Three recent examples of these are SARS-CoV, SARS-CoV-2 and MERS-CoV [14]. SARS-CoV was the causal agent of the severe acute respiratory syndrome outbreaks in 2002 and 2003 in Guangdong Province, China. MERS-CoV was the pathogen responsible for the severe respiratory disease outbreaks in 2012 in the Middle East [15].
Chinese scientists were able to sequence the viral genome from the collected samples using next-generation sequencing and identified the cause as a SARS-like coronavirus. The genetic sequence is now available to scientists all around the world, which should aid faster diagnosis of further cases [2] [16] [17]. The full genome sequence data of SARS-CoV-2 on the Global Initiative on Sharing All Influenza Data (GISAID) platform enabled faster development of point-of-care real-time RT-PCR diagnostic tests specific for the novel coronavirus disease, COVID-19 [18]. Association of the genome sequence with other evidence demonstrated that SARS- [...] 4520 infected cases were reported on 28th January [57] [58]. The alarming rise in confirmed cases outside China led the governments of other countries, including the U.S., France, Japan, Australia, Germany and New Zealand, to start evacuating their citizens from Wuhan, China [59]. Approximately 170 deaths and 7,711 confirmed cases, with 1,737 new cases, were reported by the Chinese National Health Commission on 29th January [60]. The U.S. government evacuated 240 Americans, mostly diplomats and their family members, from the epicenter of the virus outbreak on 29th January by chartered airplane [61]. Another report stated that, on the same day, an ANA airplane chartered by the Japanese government landed at Haneda Airport after evacuating 206 Japanese nationals [62].
The World Health Organization (WHO) held a meeting at its headquarters in Geneva on 30th January and declared the outbreak a Public Health Emergency; the committee also said that the declaration was made with priority given to the countries outside China [63].
On 30th January, the U.S. State Department advised its citizens not to travel to China as the death toll jumped to 204, with 9,692 infected cases worldwide [64]. On 31st January, Chinese national health authorities reported 213 deaths in China and 9,826 confirmed cases globally, of which 9,720 were in China [82]. Within 48 hours, the number of infected cases on the cruise ship jumped sharply to 61, as reported on 7th February, 2020 [83]. Another report on the same date announced the death of the Chinese doctor Li Wenliang, who had tried to warn about the destructiveness of the coronavirus; by then, more than 630 fatalities and more than 30,000 infected cases had been reported worldwide [84] [85]. As of 12th February, 2020, the number of infected cases on the cruise ship had increased around threefold to 174. Health experts consider this cruise ship a large cluster of infected patients outside China and have compared it to a "floating petri dish", because infectious disease spreads more easily in such a confined area [86]. The National Health Commission of China announced 1,843 deaths and 56,249 confirmed cases in Hubei, China on 15th February, 2020 [87]. The first European fatality, an 80-year-old man in France, was reported on the same date [88]. On 18th February, 2020, health authorities reported 1,933 deaths in Hubei alone and more than 70,000 cases [89]. China's National Health Committee reported on 19th February, 2020 that blood plasma can be extracted from recovered patients to treat seriously ill patients [90]. Japan's Health Ministry announced the deaths of an 87-year-old man and an 84-year-old woman on the cruise ship on 20th February, 2020, and South Korea's first fatality was reported on the same date [91]. The youngest coronavirus-infected patient, a child under the age of 10, was reported in Japan on 21st February, 2020 [92].
As of 22nd February, 2020, 2,239 deaths and 75,567 infected cases, including 346 infected cases in South Korea, had been reported to the World Health Organization (WHO) [93]. On 23rd February, 2020, the WHO welcomed China's report of a fall in new deaths in mainland China but expressed concern about infected cases in other countries, as the total number of infected cases in South Korea stood at 433 on the same date [94].
To date, there are no effective antiviral therapies that can combat coronavirus infections, and hence the treatments are only supportive. The use of interferons in combination with ribavirin is somewhat effective; however, the effectiveness of this combined remedy needs to be further evaluated [95]. This experiment was carried out to design novel epitope-based vaccines, using the approaches of reverse vaccinology, against four proteins of SARS-CoV-2: the nucleocapsid phosphoprotein, which is responsible for genome packaging and viral assembly [96]; the surface glycoprotein, which is responsible for the membrane fusion event during viral entry [97] [98]; the ORF3a protein, which is responsible for viral replication, characterized virulence, viral spreading and infection [99]; and the membrane glycoprotein, which mediates the interaction of virions with cell receptors [100].
Reverse vaccinology refers to the process of developing vaccines in which the novel antigens of a virus or other organism are detected by analyzing the genomic and genetic information of that particular organism. In reverse vaccinology, bioinformatics tools are used to identify and analyze these novel antigens by dissecting the genome and genetic makeup of a pathogen. The approach also allows scientists to understand which antigenic segments of a virus or pathogen should be given more emphasis during vaccine development. This method is a quick, efficient, easy and cost-effective way to design vaccines. Reverse vaccinology has been used successfully to develop vaccines against many viruses, e.g., the Zika virus and the Chikungunya virus [101] [102].

Materials and Methods:
The current experiment was conducted to develop potential vaccines against the Wuhan novel coronavirus (strain SARS-CoV-2) (Wuhan seafood market pneumonia virus) exploiting the strategies of reverse vaccinology (Figure 01).

Strain Identification and Selection
The strain of the SARS-CoV-2 was selected by reviewing numerous entries of the online database of National Center for Biotechnology Information or NCBI (https://www.ncbi.nlm.nih.gov/).

Antigenicity Prediction and Physicochemical Property Analysis of the Protein Sequences
The antigenicity of the protein sequences was determined using the online tool VaxiJen v2.0 (http://www.ddg-pharmfac.net/vaxijen/VaxiJen/VaxiJen.htm). During the antigenicity prediction, the prediction accuracy threshold was set at 0.4 in the tumor model [103]- [105]. The tumor model was used because it generated excellent results (compared to the other models) in both leave-one-out cross-validation (LOO-CV) and external validation in numerous studies. The accuracy, sensitivity, and specificity of a prediction by the VaxiJen v2.0 server depend on the prediction accuracy threshold; for this reason, the threshold of 0.4 was used to improve the accuracy of the prediction. The server is a user-friendly web tool for effective prediction of the antigenicity of proteins [106]- [108]. Only the highly antigenic protein sequences were selected for further analysis. Next, the selected antigenic protein sequences were analyzed with ExPASy's online tool ProtParam (https://web.expasy.org/protparam/) to determine their physicochemical properties, i.e., the number of amino acids, theoretical pI, extinction coefficient, aliphatic index and grand average of hydropathicity (GRAVY) [109].
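Two of the physicochemical properties named above can be reproduced with a few lines of code. The following is a minimal sketch, not the ProtParam implementation: it assumes the standard Kyte-Doolittle hydropathy scale for GRAVY and the commonly used aliphatic index formula, X(Ala) + 2.9·X(Val) + 3.9·(X(Ile) + X(Leu)), where X is the mole percentage of a residue. The function names are illustrative.

```python
from collections import Counter

# Kyte-Doolittle hydropathy values (assumed scale for the GRAVY calculation).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Sum of hydropathy values divided by the number of residues."""
    sequence = sequence.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


def aliphatic_index(sequence: str) -> float:
    """Relative volume occupied by aliphatic side chains (Ala, Val, Ile, Leu)."""
    sequence = sequence.upper()
    counts = Counter(sequence)

    def mole_percent(aa: str) -> float:
        return 100.0 * counts[aa] / len(sequence)

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


if __name__ == "__main__":
    peptide = "QLESKMSGK"  # a short peptide used only as sample input
    print(f"GRAVY: {gravy(peptide):.3f}")
    print(f"Aliphatic index: {aliphatic_index(peptide):.2f}")
```

A negative GRAVY value indicates a hydrophilic sequence overall, while a higher aliphatic index indicates a larger share of aliphatic side chains.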

Prediction of T-cell and B-cell Epitopes
The online epitope prediction server of the Immune Epitope Database, or IEDB (https://www.iedb.org/), was used for T-cell and B-cell epitope prediction. The database contains a huge collection of experimental data on T-cell epitopes and antibodies [110]. The NetMHCpan EL 4.0 prediction method was used for MHC class-I restricted CD8+ cytotoxic T-lymphocyte (CTL) epitope prediction (for the HLA-A*11:01 allele), and the MHC class-II restricted CD4+ helper T-lymphocyte (HTL) epitopes were predicted (for the HLA-DRB1*04:01 allele) using the Sturniolo prediction method [111]. Ten of the top twenty MHC class-I and MHC class-II epitopes were randomly selected based on their antigenicity scores (AS). The B-cell lymphocytic (BCL) epitopes were predicted using the BepiPred linear epitope prediction method, and those more than ten amino acids long were selected for analysis [112] [113].

Prediction of the Selected Epitopes
The epitopes selected in the previous step were then subjected to transmembrane topology prediction using the TMHMM v2.0 server (http://www.cbs.dtu.dk/services/TMHMM/), which predicts the transmembrane helices of proteins [114]. The antigenicity of the epitopes was determined with the online VaxiJen v2.0 (http://www.ddg-pharmfac.net/vaxijen/VaxiJen/VaxiJen.htm) server. The threshold was kept at 0.4 in the tumor model to improve the accuracy of the prediction [106]- [108].
The allergenicity of the selected epitopes was predicted using two online tools, AllerTOP v2.0 (https://www.ddg-pharmfac.net/AllerTOP/) and AllergenFP v1.0 (http://ddg-pharmfac.net/AllergenFP/). The AllerTOP server generates more accurate results than the AllergenFP server [115] [116]; for this reason, the results predicted by AllerTOP were given priority. The toxicity of the selected epitopes was predicted using the ToxinPred server (http://crdd.osdd.net/raghava/toxinpred/) with its SVM (support vector machine) based method, keeping all the parameters at their defaults. The SVM is a machine learning technique that the server uses to differentiate toxic from non-toxic peptides [117]. After the antigenicity, allergenicity and toxicity tests, the epitopes that were found to be antigenic, non-allergenic and non-toxic were considered the best predicted epitopes and were selected for the later phases of the experiment.
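The three-way selection rule described above (antigenic, non-allergenic, non-toxic) can be sketched as a simple filter. This is only an illustration under stated assumptions: the record type, field names and example values are hypothetical, an epitope is treated as antigenic when its score is at or above the 0.4 threshold, and "priority" for AllerTOP is interpreted here as letting the AllerTOP verdict decide allergenicity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EpitopePrediction:
    """Hypothetical record collecting the server verdicts for one epitope."""
    sequence: str
    antigenicity_score: float   # score from the antigenicity server
    allergen_allertop: bool     # True if AllerTOP predicts an allergen
    allergen_allergenfp: bool   # True if AllergenFP predicts an allergen
    toxic: bool                 # True if the toxicity server predicts a toxin


def is_selected(e: EpitopePrediction, threshold: float = 0.4) -> bool:
    """Keep epitopes that are antigenic, non-allergenic and non-toxic."""
    antigenic = e.antigenicity_score >= threshold
    # Assumption: the AllerTOP result decides; AllergenFP is kept for reference.
    non_allergenic = not e.allergen_allertop
    return antigenic and non_allergenic and not e.toxic


def select_epitopes(predictions, threshold: float = 0.4):
    return [e for e in predictions if is_selected(e, threshold)]


if __name__ == "__main__":
    # Made-up values, for demonstration only.
    candidates = [
        EpitopePrediction("PEPTIDEAA", 0.75, False, True, False),
        EpitopePrediction("PEPTIDEBB", 0.31, False, False, False),
        EpitopePrediction("PEPTIDECC", 0.90, True, True, False),
    ]
    for epitope in select_epitopes(candidates):
        print(epitope.sequence)
```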

Cluster Analysis of the MHC Alleles
The cluster analysis of the MHC alleles was carried out to identify the alleles of the MHC class-I and class-II molecules with similar binding specificities, using the online tool MHCcluster 2.0 (http://www.cbs.dtu.dk/services/MHCcluster/) [118]. During the analysis, all the parameters were kept at their defaults, and all the HLA supertype representatives (MHC class-I) as well as the HLA-DR representatives (MHC class-II) were selected. The NetMHCpan-2.8 prediction method was used for analyzing the MHC class-I alleles.

Generation of the 3D Structures of the Selected Epitopes
The 3D structures of the selected epitopes were generated using the PEP-FOLD3 (http://bioserv.rpbs.univ-paris-diderot.fr/services/PEP-FOLD3/) online tool. Only the best epitopes from the previous steps, i.e., those that met the selection criteria of high antigenicity, non-allergenicity and non-toxicity, were taken for 3D structure prediction [119]- [121].

Molecular Docking of the Selected Epitopes
Molecular docking analysis is one of the essential steps in reverse vaccinology to design vaccines. In this step, peptide-protein docking was performed to predict the binding of epitopes with various antibodies or the MHC receptors [122].

3D Structure Refinement and Validation
Protein structure refinement is an important step in vaccine design because protein 3D structures predicted by online tools may lack the true, native structure. Refinement can convert a low-resolution predicted model into a high-resolution predicted model that closely resembles the native structure of the protein. The 3D structures of the constructed vaccines were refined using the online protein refinement tool 3Drefine (http://sysbio.rnet.missouri.edu/3Drefine/). The tool is a quick and efficient 3D structure refining tool (it works on the i3Drefine protocol), since it can refine a 300 amino acid long protein in just 5 minutes [143] [114]. Next, the validation was conducted by analyzing the Ramachandran plots, which were generated using the PROCHECK (https://servicesn.mbi.ucla.edu/PROCHECK/) server [145][146].

Vaccine Protein Disulfide Engineering
Disulfide bonds are important features of proteins because they provide stability. For this reason, vaccine protein disulfide engineering was carried out with the aid of the Disulfide by Design 2 v12.2 (http://cptweb.cpt.wayne.edu/DbD2/) server, which predicts the sites within a protein structure that have a higher possibility of undergoing disulfide bond formation [147]. During disulfide engineering, the intra-chain, inter-chain and Cβ for glycine residue options were selected, the χ3 angle was kept at -87° or +97° ± 5° and the Cα-Cβ-Sγ angle was kept at 114.6° ± 10°. The χ3 angle was set at -87° or +97° ± 5° because the server generated numerous disulfides when the default angle (+97° ± 30° and -87° ± 30°) was used; the χ3 tolerance was therefore narrowed to reduce the number of candidate disulfides. Furthermore, studies have estimated that the Cα-Cβ-Sγ angle peaks near 115° and covers a range from 105° to 125° in known disulfides. For this reason, the Cα-Cβ-Sγ angle was kept at its default (114.6° ± 10°) [148] [149].
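The effect of narrowing the χ3 tolerance can be illustrated with a small geometric check. The sketch below is not the Disulfide by Design 2 algorithm; it only tests whether a candidate angle falls inside the windows quoted above, and the function names are illustrative.

```python
def chi3_within_window(chi3_deg: float,
                       centers=(-87.0, 97.0),
                       tolerance: float = 5.0) -> bool:
    """True if the chi3 torsion lies within +/- tolerance of either center."""
    return any(abs(chi3_deg - center) <= tolerance for center in centers)


def ca_cb_sg_within_window(angle_deg: float,
                           center: float = 114.6,
                           tolerance: float = 10.0) -> bool:
    """True if the C-alpha/C-beta/S-gamma angle lies within the allowed window."""
    return abs(angle_deg - center) <= tolerance


if __name__ == "__main__":
    # A chi3 of 70 degrees passes the default +/-30 window
    # but is rejected by the narrower +/-5 window used in this study.
    print(chi3_within_window(70.0, tolerance=30.0))
    print(chi3_within_window(70.0, tolerance=5.0))
```

With the ± 30° default, any χ3 between 67° and 127° (or between -117° and -57°) is accepted, which explains why far fewer candidate pairs remain at ± 5°.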

Protein-Protein Docking
In protein-protein docking, the constructed SARS-CoV-2 vaccines were analyzed by docking against some MHC alleles as well as the Toll-like receptors (TLRs [...] specific algorithm in which a lower score and lower energy represent a better result [160]- [163]. From the docking experiment, the single best vaccine was selected based on the docking score. The docked structures were visualized with the PyMOL tool [164].

Molecular Dynamic Simulation
The molecular dynamics (MD) simulation study of the best selected vaccine construct was carried out using the online MD simulation tool iMODS (http://imods.chaconlab.org/). It is a fast, user-friendly and effective online molecular dynamics simulation server that predicts the deformability, B-factor (mobility profiles), eigenvalues, variance, covariance map and elastic network of the protein complex. For a protein complex, the stability depends on the ability to deform at each of its amino acids. The eigenvalue represents the motion stiffness of the protein complex, and a lower eigenvalue represents easier deformability of the complex. The server also determines and measures the protein flexibility [165]- [169].

Codon Adaptation and In Silico Cloning
The best predicted vaccine from the previous steps was reverse translated to a possible DNA sequence that is expected to express the vaccine protein in a target organism. The cellular machinery of that organism could use the codons of the newly adapted DNA sequence efficiently to produce the desired vaccine. Codon adaptation is a necessary step of vaccine design because it provides an effective prediction of the DNA sequence of a vaccine construct. An amino acid can be encoded by different codons, and different organisms prefer different codons, which is known as codon bias. Codon adaptation predicts the codon for each amino acid that should work most effectively and efficiently in a specific organism. The best predicted vaccine was used for codon adaptation with the Java Codon Adaptation Tool, or JCat server (http://www.jcat.de/). The server aims to ensure the maximal expression of the protein in a target organism. The prokaryotic E. coli strain K12 was selected at the JCat server, and rho-independent transcription terminators, prokaryotic ribosome binding sites and the cleavage sites of the SgrAI and SphI restriction enzymes were avoided.
The optimized DNA sequence was then taken, and SgrAI and SphI restriction sites were attached to the N-terminal and C-terminal sites, respectively. Finally, the SnapGene restriction cloning module was used to insert the newly adapted DNA sequence between the SgrAI and SphI restriction sites of the pET-19b vector [170]- [174].
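The reverse translation and restriction site steps can be sketched in a few lines. This is a simplified illustration and not the JCat algorithm: the one-codon-per-residue table below is an assumed set of codons often described as frequent in E. coli, the recognition patterns (SgrAI as CRCCGGYG, SphI as GCATGC) are taken from common enzyme listings and should be verified before any practical use, and the concrete SgrAI flank is just one sequence matching that pattern.

```python
import re

# Illustrative preferred codons (assumption; not JCat's actual weights or output).
PREFERRED_CODON = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}

# Recognition patterns (R = A/G, Y = C/T), assumed from enzyme catalogs.
SGRAI_SITE = re.compile(r"C[AG]CCGG[CT]G")
SPHI_SITE = re.compile(r"GCATGC")


def reverse_translate(protein: str) -> str:
    """Map each residue to one codon; output length is 3 x protein length."""
    return "".join(PREFERRED_CODON[aa] for aa in protein.upper())


def has_site(dna: str, site: re.Pattern) -> bool:
    """True if the recognition pattern occurs anywhere in the sequence."""
    return site.search(dna.upper()) is not None


def add_flanks(dna: str, five_prime: str = "CACCGGCG",
               three_prime: str = "GCATGC") -> str:
    """Attach one SgrAI-compatible site upstream and the SphI site downstream."""
    return five_prime + dna + three_prime


if __name__ == "__main__":
    dna = reverse_translate("QLESKMSGK")  # sample peptide input
    # Internal sites must be absent before the flanking sites are added.
    print(has_site(dna, SGRAI_SITE), has_site(dna, SPHI_SITE))
    print(add_flanks(dna))
```

Checking the adapted sequence for internal sites before adding the flanks matters because an internal SgrAI or SphI site would cause the insert itself to be cut during cloning.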

Identification, Selection and Retrieval of Viral Protein Sequences
The SARS-CoV-2 (Wuhan seafood market pneumonia virus) was identified from the NCBI

Antigenicity Prediction and Physicochemical Property Analysis of the Protein Sequences
Two proteins, the nucleocapsid phosphoprotein and the surface glycoprotein, were identified as potent antigens and hence used in the next phases of the experiment (Table 03).
However, numerous in vitro and in vivo studies should be carried out on these predictions to determine the degree of their accuracy.

Determination
The MHC class-I and MHC class-II epitopes were determined for potential vaccine construction. The IEDB (https://www.iedb.org/) server generates a large number of epitopes. However, based on the antigenicity scores, ten epitopes were selected from the top twenty epitopes because the epitopes generated almost similar AS and percentile scores. Later, the epitopes with high antigenicity, non-allergenicity and non-toxicity were selected for vaccine construction. The B-cell epitopes were also selected based on their antigenicity, non-allergenicity and length (sequences of more than 10 amino acids). Table 04 and Table 05 list the potential T-cell epitopes of the nucleocapsid phosphoprotein, and Table 06 and Table 07 list the potential T-cell epitopes of the surface glycoprotein. Table 08 lists the predicted B-cell epitopes of the two proteins, and Table 09 lists the epitopes that met the mentioned criteria and were selected for further analysis and vaccine construction.
Table 09. List of the epitopes that followed the selection criteria (high antigenicity, non-allergenicity and non-toxicity) and were selected for vaccine construction.

Cluster Analysis of the MHC Alleles
The

Generation of the 3D Structures of the Epitopes and Peptide-Protein Docking
After 3D structure prediction of the selected epitopes, the peptide-protein docking was conducted to find out, whether all the epitopes had the ability to bind with the MHC class-I as well as MHC

Vaccine Construction
After successful docking, three vaccines were constructed from the selected epitopes, which are expected to be directed against SARS-CoV-2. To construct the vaccines, three different adjuvants were used, i.e., beta defensin, L7/L12 ribosomal protein and HABA protein, and different linkers, i.e., EAAAK, GGGS, GPGPG and KK, were used at their appropriate positions.
The PADRE sequence is an important sequence that was used in vaccine construction. It has the capability to increase the potency of the vaccines with minimal toxicity. Moreover, the PADRE sequence also improves the CTL response, thus ensuring a potent immune response [175]. The newly constructed vaccines were designated CV-1, CV-2 and CV-3 (Table 11).
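The assembly of a construct from an adjuvant, the PADRE sequence, linkers and epitopes can be sketched as plain string concatenation. The layout below is an assumption made for illustration only (adjuvant, EAAAK, PADRE, then CTL epitopes joined by GGGS, HTL epitopes joined by GPGPG, and B-cell epitopes joined by KK); the actual arrangement of CV-1, CV-2 and CV-3 is the one given in Table 11. The PADRE sequence used here (AKFVAAWTLKAAA) is the commonly cited one and is likewise an assumption, as are the function name and the placeholder inputs.

```python
PADRE = "AKFVAAWTLKAAA"  # commonly cited PADRE sequence (assumption)


def build_construct(adjuvant: str, ctl_epitopes, htl_epitopes, bcl_epitopes,
                    padre: str = PADRE) -> str:
    """Join the components in one hypothetical linker layout.

    Each epitope list is expected to contain at least one sequence.
    """
    parts = [
        adjuvant,
        "EAAAK",                        # rigid linker after the adjuvant
        padre,
        "GGGS",
        "GGGS".join(ctl_epitopes),      # CTL epitopes
        "GPGPG",
        "GPGPG".join(htl_epitopes),     # HTL epitopes
        "KK",
        "KK".join(bcl_epitopes),        # B-cell epitopes
    ]
    return "".join(parts)


if __name__ == "__main__":
    # Placeholder inputs; which epitope belongs to which class is illustrative.
    construct = build_construct(
        adjuvant="MADJUVANTPLACEHLDER".replace("J", "K").replace("U", "V"),
        ctl_epitopes=["QLESKMSGK", "GVLTESNKK"],
        htl_epitopes=["LIRQGTDYKHWP"],
        bcl_epitopes=["TSNFRVQPTESI"],
    )
    print(len(construct), construct)
```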

Constructs
The results of the antigenicity, allergenicity and physicochemical property analysis are listed in Table 12. All the three vaccine constructs were found to be antigenic as well as non-allergenic.

Secondary and Tertiary Structure Prediction of the Vaccine Constructs
From the secondary structure analysis, it was determined that CV-1 had the highest percentage of amino acids (67.1%) in coil formation as well as the highest percentage of amino acids (8%) in beta-strand formation. However, CV-3 had the highest percentage of amino acids (37.8%) in alpha-helix formation (Figure 05 and Table 13 [...] generating the 3D structures of the query vaccine constructs [177]. The results of the 3D structure analysis are listed in Table 14 and illustrated in Figure 06.

3D Structure Refinement and Validation
The three vaccine constructs were refined and then validated in the 3D structure refinement and validation step. The PROCHECK server (https://servicesn.mbi.ucla.edu/PROCHECK/) divides the Ramachandran plot into four regions: the most favored region (represented by red color), the additional allowed region (represented by yellow color), the generously allowed region (represented by light yellow color) and the disallowed region (represented by white color).
According to the server, a valid protein (the best quality protein) should have over 90% of its amino acids in the most favored region. The additional allowed region and generously allowed region might also contain some percentage of the amino acids of the protein. Moreover, no amino acid should reside in the disallowed region [178]- [180].
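The acceptance rule stated above can be expressed as a short check on the residue counts per Ramachandran region. This is a minimal sketch with illustrative names and made-up counts; it does not reproduce how PROCHECK assigns residues to regions.

```python
def ramachandran_summary(most_favored: int, additional_allowed: int,
                         generously_allowed: int, disallowed: int) -> dict:
    """Percentage per region plus a pass flag for the quality rule."""
    total = most_favored + additional_allowed + generously_allowed + disallowed
    percentages = {
        "most_favored": 100.0 * most_favored / total,
        "additional_allowed": 100.0 * additional_allowed / total,
        "generously_allowed": 100.0 * generously_allowed / total,
        "disallowed": 100.0 * disallowed / total,
    }
    # Rule from the text: over 90% in the most favored region, none disallowed.
    percentages["passes"] = percentages["most_favored"] > 90.0 and disallowed == 0
    return percentages


if __name__ == "__main__":
    # Hypothetical residue counts, for demonstration only.
    print(ramachandran_summary(460, 35, 5, 0))
```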
The 3D protein structures generated in the previous step were refined for further analysis and

Vaccine Protein Disulfide Engineering
In protein disulfide engineering, disulfide bonds were generated within the 3D structures of the vaccine constructs. In the experiment, the amino acid pairs with a bond energy value of less than 2.00 kcal/mol were selected. Since about 90% of the native disulfide bonds in proteins have an energy value of less than 2.2 kcal/mol, a bond energy of 2.00 kcal/mol was chosen as the cut-off value for better prediction [181]. CV-1 generated 10 amino acid pairs with the capability to form disulfide bonds. However, only one pair was selected because it alone had a bond energy of less than 2.00 kcal/mol: 276 Ser-311 Arg. CV-2 and CV-3 generated 4 and 5 pairs of amino acids, respectively, that might form disulfide bonds, but no pair showed a bond energy of less than 2.00 kcal/mol. The selected amino acid pair of CV-1 formed the mutant version of the original vaccine (Figure 08).
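The energy cut-off amounts to a simple filter over the candidate residue pairs returned by the server. The sketch below assumes each candidate is available as a (residue 1, residue 2, energy) tuple; the residue labels and energy values are invented for illustration and are not results of this study.

```python
ENERGY_CUTOFF_KCAL_MOL = 2.00  # cut-off used in the text


def select_disulfide_pairs(candidates, cutoff: float = ENERGY_CUTOFF_KCAL_MOL):
    """Keep pairs whose predicted bond energy is strictly below the cut-off."""
    return [(res1, res2, energy)
            for res1, res2, energy in candidates
            if energy < cutoff]


if __name__ == "__main__":
    # Hypothetical candidate pairs: (residue 1, residue 2, energy in kcal/mol).
    candidates = [
        ("10 Ala", "45 Gly", 3.41),
        ("72 Ser", "118 Arg", 1.62),
        ("150 Leu", "201 Lys", 2.00),
    ]
    for pair in select_disulfide_pairs(candidates):
        print(pair)
```

A pair at exactly 2.00 kcal/mol is excluded, since the criterion in the text is "less than" the cut-off.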

Protein-Protein Docking Study
The protein-protein docking study was carried out to find the best constructed COVID-19 vaccine. The vaccine construct with the best result in molecular docking was considered the best vaccine construct. According to the docking results, CV-1 was the best constructed vaccine: it showed the best (lowest) scores in the docking as well as in the MM-GBSA study. However, CV-2 showed the best binding affinity (ΔG scores) with DRB3*0202 (- [...] showed the best results in the protein-protein docking study, it was considered the best vaccine construct among the three constructed vaccines (Figure 09 & Table 15). Later, the molecular dynamics simulation and in silico codon adaptation studies were conducted only on the CV-1 vaccine.

Molecular Dynamics Simulation
The results of the molecular dynamics simulation of the CV-1-TLR8 docked complex are illustrated in Figure 10. Dynamic simulation of proteins allows easy determination of the stability and physical movements of their atoms and molecules [182]. The simulation was therefore carried out to determine the relative stability of the vaccine protein. The deformability graph of the complex shows peaks that represent the regions of the protein with a moderate degree of deformability (Figure 10b).
The B-factor graph of the complex allows easy visualization and comparison of the NMA and PDB fields of the docked complex (Figure 10c). The eigenvalue of the docked complex is depicted in Figure 10d. The CV-1 and TLR8 docked complex generated a quite good eigenvalue of 3.817339e-06. The variance graph shows the individual variance as red bars and the cumulative variance as green bars (Figure 10e). Figure 10f depicts the covariance map of the complex, where red represents correlated motion between a pair of residues, white indicates uncorrelated motion and blue marks anti-correlated motion. The elastic map of the complex refers to the connections between the atoms, and darker gray regions indicate stiffer regions (Figure 10g) [167]- [169].

Codon Adaptation and In Silico Cloning
Since the CV-1 protein had 596 amino acids, the probable DNA sequence of CV-1 after reverse translation would contain 1788 nucleotides. The codon adaptation index (CAI) value of 1.0 for CV-1 indicated that the DNA sequence contained a high proportion of the codons preferred by the cellular machinery of the target organism, E. coli strain K12 (codon bias).
For this reason, the production of the CV-1 vaccine should proceed efficiently [183] [184].
The GC content of the improved sequence was 51.34% (Figure 11). The predicted DNA sequence of CV-1 was inserted into the pET-19b vector plasmid between the SgrAI and SphI restriction sites. Since the DNA sequence did not contain restriction sites for the SgrAI and SphI restriction enzymes, SgrAI and SphI restriction sites were conjugated at the N-terminal and C-terminal sites, respectively. The newly constructed vector is illustrated in Figure 12.
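The quantities reported above (sequence length, GC content and CAI) follow from short calculations. The sketch below assumes the usual definition of the CAI as the geometric mean of the relative adaptiveness values of the codons in a sequence; the weight table is a tiny hypothetical example, not the E. coli K12 table used by JCat, and codons missing from the table are simply skipped.

```python
import math


def gc_content(dna: str) -> float:
    """Percentage of G and C nucleotides in the sequence."""
    dna = dna.upper()
    return 100.0 * sum(base in "GC" for base in dna) / len(dna)


def codon_adaptation_index(dna: str, weights: dict) -> float:
    """Geometric mean of the relative adaptiveness of each scored codon."""
    dna = dna.upper()
    codons = [dna[i:i + 3] for i in range(0, len(dna) - len(dna) % 3, 3)]
    logs = [math.log(weights[codon]) for codon in codons if codon in weights]
    return math.exp(sum(logs) / len(logs))


if __name__ == "__main__":
    # 596 amino acids correspond to 596 x 3 nucleotides.
    print(596 * 3)

    # Hypothetical relative adaptiveness values for two leucine codons.
    weights = {"CTG": 1.0, "CTA": 0.1}
    print(codon_adaptation_index("CTGCTG", weights))  # only the preferred codon
    print(codon_adaptation_index("CTGCTA", weights))  # one rare codon lowers the CAI
    print(gc_content("ATGC"))
```

A CAI of 1.0 is reached only when every scored codon is the most preferred codon for its amino acid, which is what a fully adapted sequence is designed to achieve.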

Discussion
The current study was designed to construct possible vaccines against the Wuhan novel coronavirus 2019 (SARS-CoV-2), the cause of the recent outbreak of the deadly viral disease COVID-19 in China. The pneumonia has already caused the deaths of several thousand people worldwide. For this reason, possible vaccines were predicted in this study to fight this lethal virus. To carry out the vaccine construction, four candidate proteins of the virus were identified and selected from the NCBI database. Only highly antigenic sequences were selected for further analysis, since highly antigenic proteins can induce a better immunogenic response [185,201]. Because the nucleocapsid phosphoprotein and the surface glycoprotein were found to be antigenic, they were taken into consideration for vaccine construction.
The physicochemical property analysis was conducted for the two predicted antigenic proteins.
The extinction coefficient can be defined as the amount of light that is absorbed by a particular compound at a certain wavelength [186] [187]. The surface glycoprotein had the highest predicted extinction coefficient, 148960 M⁻¹ cm⁻¹. The aliphatic index of a protein corresponds to the relative volume occupied by the aliphatic amino acids in the side chains of the protein, for example alanine and valine [188] [189]. The surface glycoprotein also had the higher predicted aliphatic index of the two proteins (84.67), which indicates that it has a greater proportion of aliphatic amino acids in its side chains than the nucleocapsid phosphoprotein. The grand average of hydropathicity (GRAVY) value of a protein is calculated as the sum of the hydropathy values of all the amino acids of the protein, divided by the number of residues in its sequence [190]. Surface [...] cell that cause the final destruction of the target antigen [191]- [195]. The possible T-cell and B-cell epitopes of the selected proteins were determined by the IEDB (https://www.iedb.org/) server.
The epitopes with high antigenicity, non-allergenicity and non-toxicity were selected for vaccine construction. The B-cell epitopes predicted by the server that were more than ten amino acids long were taken into consideration, and the antigenic and non-allergenic epitopes among them were selected for vaccine construction. However, most of the epitopes were found to reside within the cell membrane.
The cluster analysis of the MHC alleles that may interact with the selected epitopes during the immune response showed quite good interaction among the alleles. Next, the 3D structures of the selected epitopes were generated for the peptide-protein docking study. The docking was carried out to find out whether all the epitopes had the capability to bind with their respective MHC class-I and MHC class-II alleles. Since all the epitopes generated quite good docking scores, it can be concluded that all of them had the capability to bind with their respective targets and induce a potential immune response. Among the selected epitopes, QLESKMSGK, LIRQGTDYKHWP, GVLTESNKK and TSNFRVQPTESI generated the best docking scores.
After the successful docking study, the vaccine construction was performed. Linkers were used to connect the T-cell and B-cell epitopes among themselves and also with the adjuvant sequences as well as the PADRE sequence. The vaccines, with three different adjuvants, were constructed and designated CV-1, CV-2 and CV-3. Since all three vaccines were found to be antigenic, they should be able to induce a good immune response. Moreover, since all of them were predicted to be non-allergenic, they should not cause any allergenic reaction within the body as per in silico [...] in the study could overcome such difficulties [196]- [200]. Finally, based on the strategies employed in this study, CV-1 is recommended as the best vaccine candidate for an effective worldwide treatment against SARS-CoV-2 infection. However, further in vivo and in vitro experiments are suggested to strengthen the findings of this study.

Conclusion
SARS-CoV-2 has caused one of the deadliest outbreaks in recent times. Prevention of this newly emerging infection is very challenging as well as essential.

Conflict of Interest
Authors declare no conflict of interest regarding the publication of the manuscript.

Data Availability Statement
The authors have made all the data generated during the experiment and analysis available within the manuscript.

Funding Statement
Authors received no specific funding from any external sources.