Exploring different virulent proteins of human respiratory syncytial virus for designing a novel epitope-based polyvalent vaccine: Immunoinformatics and molecular dynamics approaches

Human Respiratory Syncytial Virus (RSV) is one of the leading causes of lower respiratory tract infections (LRTI). It infects people of all age groups, the majority of whom are infants and children. Severe RSV infections cause numerous deaths every year, predominantly among children. Despite several efforts to develop a vaccine against RSV, no approved or licensed vaccine is yet available to control the infection effectively. Therefore, in this study, immunoinformatics tools were used to design and construct a multi-epitope polyvalent vaccine against the RSV-A and RSV-B strains of the virus. The T-cell and B-cell epitopes were predicted and then extensively tested for antigenicity, allergenicity, toxicity, conservancy, homology to the human proteome, transmembrane topology, and cytokine-inducing ability. The most promising epitopes (i.e., 13 CTL epitopes, 9 HTL epitopes, and 10 LBL epitopes), all exhibiting full conservancy, were then selected for designing the peptide fusion with appropriate linkers, with hBD-3 as the adjuvant. The peptide vaccine was modeled, refined, and validated to further improve its structural attributes. Molecular docking analysis with specific TLRs was then carried out, which revealed excellent interactions and global binding energies. Additionally, molecular dynamics (MD) simulation indicated that the interactions between the vaccine and the TLR were stable. Furthermore, immune simulations were used to imitate and predict the potential immune response generated by administration of the vaccine. Following this overall evaluation, in silico cloning was carried out to generate recombinant pETite plasmid vectors for subsequent mass production of the vaccine peptide in E. coli. However, further in vitro and in vivo experiments are needed to validate its efficacy against RSV infections.


Introduction
The Human Respiratory Syncytial Virus (hRSV), a member of the family Paramyxoviridae, is known to be the primary cause of lower respiratory tract infections (LRTI), including [...] the target pathogen genome [35,37]. In our research, a polyvalent epitope-based vaccine blueprint was produced that could elicit a significant immune response to both RSV-A and RSV-B, targeting the phosphoprotein (P protein), nucleoprotein (N protein), fusion glycoprotein (F protein), and major surface glycoprotein (mG protein) of these viruses. Since RSV-A is more prevalent than RSV-B, as a model, the vaccine was developed using [...] [2]. For the T-cell and B-cell epitope prediction, the RSV-A P protein, N protein, F protein, and mG protein were used, and the epitopes that showed 100% conservancy in both subtypes and met the other selection criteria were then selected for vaccine construction. The criteria for selecting the epitopes were antigenicity (the parameter that measures whether the epitopes stimulate a high antigenic response), non-allergenicity (to ensure that the epitopes do not cause any unintended allergic reaction inside the body), non-toxicity, conservancy across the selected organisms, and non-homology to the human proteome. It is, therefore, expected that the vaccine will be effective against both subtypes, RSV-A and RSV-B. The most common vaccine target for RSV is the F protein [574 amino acids (aa) in length], which is highly conserved in both RSV subtypes. Together with the mG protein, the F protein mediates the attachment and fusion of the virus to its target cells, thus facilitating viral entry [2,38]. The F1 (aa 137-574) and F2 (aa 1-109) subunits form a homotrimer in the mature F protein, and the F1 subunit is required for membrane fusion. The F protein has two different conformations, i.e., the pre-fusion and post-fusion conformations [39,40].
The protein rearranges to a more stable post-fusion form during infection to allow viral entry into the host cell. Neutralizing antibodies recognize at least two antigenic sites (sites II and IV) that are present on both the pre-fusion and post-fusion forms of F [41][42][43]. In this study, the precursor F0 protein was targeted to retrieve all of the potential antigenic epitopes. The possible conformational change of the F protein, as well as the cleavage sites of the protein sequence, were taken into account while generating the potential epitopes [39,40]. The viral genome of RSV is surrounded by the N protein, and the P protein is a vital component of the viral RNA-dependent RNA polymerase complex, which is necessary for the proper replication and transcription of RSV [44]. Therefore, in our study, these four proteins were used as possible targets to design a vaccine to suppress these viral proteins, preventing viral entry and thus interfering with the life cycle of the virus.

The two major types of T-cells, cytotoxic T-cells and helper T-cells, are both considered essential for the successful design of the vaccine [47]. Cytotoxic T-cells are important for the specific recognition of major histocompatibility complex class I (MHC-I) or CD8+ cytotoxic T-lymphocytic (CTL) epitopes on the surface of antigen-presenting cells (APCs). Additionally, helper T-cells are a crucial component of adaptive immunity; they interact with major histocompatibility complex class II (MHC-II) or CD4+ helper T-lymphocytic (HTL) epitopes on the surface of APCs. They function in activating B-cells, macrophages, and even cytotoxic T-cells [48,49]. On the other hand, B-cells produce antigen-specific immunoglobulins after their activation [50]. They can identify solvent-exposed antigens via membrane-bound immunoglobulins called B-cell receptors (BCRs) [51].
B-cell epitopes are important for defense against viral infections because they are essential immune system components that activate an adaptive immune response to a specific viral infection. Therefore, B-cell epitopes are used as one of the crucial building blocks of the subunit vaccine. There are two types of B-cell epitopes: linear B-cell epitopes (LBL) and conformational B-cell epitopes, also known as continuous and discontinuous B-cell epitopes, respectively [52].

The T-cell and B-cell epitope prediction was performed using the Immune Epitope Database or IEDB (https://www.iedb.org/), which contains extensive experimental data on antibodies and epitopes [53]. For [...] Henceforth, based on their ranking, the top-scored HTL and CTL epitopes that were found to be common to all of the selected corresponding HLA alleles were considered for further analyses. All parameters were kept at their default values during the T-cell epitope prediction. Subsequently, B-cell epitopes of the proteins were predicted using the BepiPred linear epitope prediction method 2.0, maintaining all the default parameters. Using a Random Forest algorithm trained on epitope and non-epitope amino acids obtained from crystal structures, the BepiPred-2.0 server predicts linear B-cell epitopes from a protein sequence.

Following this, a sequential prediction smoothing was conducted. Residues with scores greater than the threshold (default value of 0.5) were considered part of an epitope [54]. Finally, the top-scored LBL epitopes containing more than ten amino acids were regarded as potential candidates for further analysis.
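
The thresholding step described above can be illustrated with a minimal Python sketch. It assumes that per-residue scores have already been obtained (for example, exported from BepiPred-2.0); the function name `linear_epitopes`, the example sequence, and the scores are hypothetical and are not taken from this study. Residues scoring above the 0.5 threshold are grouped into contiguous stretches, and only stretches longer than ten residues are kept.

```python
from typing import List, Tuple


def linear_epitopes(sequence: str, scores: List[float],
                    threshold: float = 0.5,
                    min_length: int = 11) -> List[Tuple[int, str]]:
    """Return (1-based start, peptide) for each run of residues scoring above threshold.

    min_length=11 keeps only runs containing more than ten residues.
    """
    if len(sequence) != len(scores):
        raise ValueError("sequence and scores must have the same length")
    epitopes = []
    start = None  # index where the current above-threshold run began
    for i, score in enumerate(scores):
        if score > threshold:
            if start is None:
                start = i
        elif start is not None:
            if i - start >= min_length:
                epitopes.append((start + 1, sequence[start:i]))
            start = None
    # Close a run that extends to the end of the sequence.
    if start is not None and len(sequence) - start >= min_length:
        epitopes.append((start + 1, sequence[start:]))
    return epitopes


# Hypothetical 20-residue example: residues 3-14 score above 0.5.
toy_sequence = "MKTAYIAKQRQISFVKSHFS"
toy_scores = [0.3, 0.4] + [0.6] * 12 + [0.2] * 6
print(linear_epitopes(toy_sequence, toy_scores))
```

The sketch does not reproduce the server's own smoothing; it only shows how a score threshold and a length cut-off combine to yield candidate LBL epitopes.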

Conformational or discontinuous B-cell epitopes are critical components for inducing antibody-mediated humoral immunity within the body. While designing a vaccine, efficient conformational B-cell epitopes should be included alongside the LBLs to elicit a better immunogenic response against the pathogen. The conformational B-cell epitopes of the modeled 3D structure of the vaccine were predicted using the online server IEDB ElliPro (http://tools.iedb.org/ellipro/) with the default parameters of a minimum score of 0.5 and a maximum distance of 6 angstroms [55]. ElliPro uses three algorithms to approximate the protein shape as an ellipsoid, calculate the residue protrusion index (PI), and cluster neighboring residues based on their PI values [56]. ElliPro calculates a score for each output epitope as the average PI value over the residues of that epitope. An ellipsoid with a PI value of 0.9 contains 90% of the protein residues, while the remaining 10% lie outside it. The PI value of each epitope residue was calculated from the residue's center of mass lying outside the largest possible ellipsoid [57].

In this step, several methods for predicting conservancy, antigenicity, allergenicity, and toxicity were used to evaluate the initially predicted T-cell and B-cell epitopes. To assess the conservancy of the chosen epitopes, the conservancy prediction method of the IEDB server (https://www.iedb.org/conservancy/) was used [58]. Additionally, the components of the vaccine should be highly antigenic, non-allergenic, and non-toxic. In this step, the antigenicity determination tool VaxiJen v2.0 (http://www.ddg-pharmfac.net/vaxijen/VaxiJen/VaxiJen.html) was used again to determine antigenicity [45]. Two different tools, AllerTOP v2.0 (https://www.ddg-pharmfac.net/AllerTOP/) and AllergenFP v1.0 (http://ddg-pharmfac.net/AllergenFP/), were then used to obtain the highest precision in the prediction of allergenicity. Both tools are based on the auto cross-covariance (ACC) transformation of protein sequences into uniform equal-length vectors.

In addition, the ToxinPred server (http://crdd.osdd.net/raghava/toxinpred/) was used to predict the toxicity of all epitopes using the Support Vector Machine (SVM) prediction method, keeping all parameters at their default values.
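
To make the idea of an ACC transformation concrete, the following Python sketch implements one common form of it: for each pair of descriptors and each lag, the products of descriptor values at positions i and i + lag are averaged over the sequence. The descriptor table, the function name `acc_transform`, and the peptides are hypothetical toy values chosen for illustration; the actual descriptors and lag settings used by AllerTOP and AllergenFP are not reproduced here.

```python
from typing import Dict, List


def acc_transform(sequence: str, descriptors: Dict[str, List[float]],
                  max_lag: int) -> List[float]:
    """Auto cross-covariance (ACC) transform of a peptide.

    For every descriptor pair (j, k) and every lag from 1 to max_lag, computes
    sum_i E[i][j] * E[i + lag][k] / (n - lag), where n is the sequence length.
    The result always has d * d * max_lag entries, regardless of n.
    """
    n = len(sequence)
    if n <= max_lag:
        raise ValueError("sequence must be longer than max_lag")
    rows = [descriptors[aa] for aa in sequence]  # n rows of d descriptor values
    d = len(rows[0])
    vector = []
    for j in range(d):
        for k in range(d):
            for lag in range(1, max_lag + 1):
                total = sum(rows[i][j] * rows[i + lag][k] for i in range(n - lag))
                vector.append(total / (n - lag))
    return vector


# Toy two-value descriptors for three residues (hypothetical, illustration only).
toy_descriptors = {"A": [1.0, 0.0], "K": [-1.0, 2.0], "L": [2.0, -1.0]}

short_vec = acc_transform("AKLAK", toy_descriptors, max_lag=2)
long_vec = acc_transform("AKLAKLLKAAKL", toy_descriptors, max_lag=2)
print(len(short_vec), len(long_vec))  # both vectors have 2 * 2 * 2 = 8 entries
```

Peptides of different lengths map to vectors of the same length, which is what allows sequences of varying size to be compared by a single classifier.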

The SVM is a widely accepted machine learning technique for toxicity prediction since it can differentiate toxic and non-toxic epitopes quite efficiently [61]. Finally, the transmembrane topology of all the epitopes was predicted using the TMHMM v2.0 server (http://www.cbs.dtu.dk/services/TMHMM/) to determine whether the epitopes were exposed inside or outside, keeping the parameters at their default values. TMHMM uses an algorithm called N-best (or 1-best in this case) to predict the most probable location and orientation of transmembrane helices in the sequence [62].

The conservancy analysis of the specified epitopes was conducted using the epitope conservancy analysis module of the IEDB server (https://www.iedb.org/conservancy/) [58]. The epitopes that were found to be fully conserved among the selected strains were taken for the construction of the vaccine, since this facilitates the broad-spectrum activity of the polyvalent vaccine across the two selected RSV subtypes. The homology of the human proteome [...] epitopes were selected as non-homologous pathogen peptides that showed no hits below the e-value inclusion threshold [67]. Among all the initially selected epitopes, those found to be highly antigenic, non-allergenic, non-toxic, fully conserved, and non-homologous to the human proteome were considered the most promising epitopes, and only these were used in the construction of the vaccine.

[...] against respiratory infections [69][70][71].
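
The full-conservancy criterion described above can be sketched in a few lines of Python. This is a simplified illustration that treats an epitope as conserved in a strain only when it occurs there as an exact substring; it is not the IEDB conservancy tool itself, and the strain sequences and peptides below are artificial placeholders rather than RSV data.

```python
from typing import Dict


def conservancy(epitope: str, strain_sequences: Dict[str, str]) -> float:
    """Percentage of strain sequences that contain the epitope as an exact match."""
    if not strain_sequences:
        raise ValueError("at least one strain sequence is required")
    hits = sum(1 for seq in strain_sequences.values() if epitope in seq)
    return 100.0 * hits / len(strain_sequences)


def fully_conserved(epitope: str, strain_sequences: Dict[str, str]) -> bool:
    """True when the epitope is found in every supplied strain sequence."""
    return conservancy(epitope, strain_sequences) == 100.0


# Artificial placeholder sequences standing in for the two subtypes.
toy_strains = {
    "subtype_A": "ACDEFGHIKLMNPQRS",
    "subtype_B": "ACDEFGHIKLMNPQRT",
}

print(conservancy("DEFGHIKLM", toy_strains))   # present in both sequences
print(conservancy("MNPQRS", toy_strains))      # present in subtype_A only
```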

The epitopes were also appended to the pan HLA-DR epitope (PADRE) sequence. The PADRE sequence stimulates immune responses by enhancing the capacity of the CTL vaccine epitopes [34]. In the conjugation of the CTL, HTL, and LBL epitopes, the AAY, GPGPG, and KK linkers were used, respectively. The EAAAK linkers provide effective separation of the domains of a bifunctional fusion protein [72], while the GPGPG linkers are ideal for preventing junctional epitope production and optimizing immune processing and presentation [73]. The AAY linker is also commonly used in in silico vaccine design [...]

To build a timely and successful immune response to the pathogenic attack, the constructed vaccine should be strongly antigenic. [...] and effective [80][81][82][83][84]. Moreover, the tertiary or 3D structure of the vaccine construct was determined using the RaptorX online server (http://raptorx.uchicago.edu/). The server predicts the tertiary or 3D structure of a query protein using an easy and powerful template-based method [85]. Furthermore, RaptorX uses a deep learning method to enable distance-based protein folding. This server was also rated first in contact prediction in both CASP12 and CASP13, making it an ideal server for 3D structure determination [86].

Tertiary structure prediction of proteins using computational methods also requires extensive refinement to turn predicted models of lower resolution into models that closely match the native protein structure. Therefore, the tertiary structure of the proposed vaccine model was further refined using the GalaxyRefine module of the GalaxyWEB server (http://galaxy.seoklab.org/). The server uses dynamics simulation, and its refinement approach was tested in CASP10 [87,88]. [...] (https://prosa.services.came.sbg.ac.at/prosa.php) was also used.
A z-score that expresses the overall quality of a query protein structure is generated by the ProSA-web server. A z-score that falls within the range of z-scores of all experimentally determined protein chains in the current PDB database indicates a higher quality of the query protein [91].

[...] further design the disulfide bonds within the vaccine proteins [92]. The tool was developed using computational approaches to predict the protein structure [93,94], and the algorithm of [...] PDB files may be generated for selected disulfides [95].
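
The linker scheme described earlier in this section (EAAAK for domain separation, and AAY, GPGPG, and KK for joining the CTL, HTL, and LBL epitopes, respectively) can be sketched as a simple string assembly. The component order, the linkers placed at the junctions between epitope groups, and the function name `build_construct` are assumptions made for illustration; the bracketed tokens are placeholders, not the actual adjuvant, PADRE, or epitope sequences.

```python
from typing import List


def build_construct(adjuvant: str, padre: str, ctl: List[str],
                    htl: List[str], lbl: List[str]) -> str:
    """Join vaccine components with the linkers named in the text.

    EAAAK follows the adjuvant; AAY, GPGPG, and KK join the CTL, HTL, and
    LBL epitopes, respectively. The overall component order is an assumption.
    """
    parts = [
        adjuvant,
        "EAAAK",             # rigid linker separating the adjuvant domain
        padre,
        "AAY",
        "AAY".join(ctl),     # CTL epitopes joined by AAY
        "GPGPG",
        "GPGPG".join(htl),   # HTL epitopes joined by GPGPG
        "KK",
        "KK".join(lbl),      # LBL epitopes joined by KK
    ]
    return "".join(parts)


# Placeholder tokens stand in for real sequences.
construct = build_construct(
    adjuvant="[hBD-3]",
    padre="[PADRE]",
    ctl=["[CTL1]", "[CTL2]"],
    htl=["[HTL1]", "[HTL2]"],
    lbl=["[LBL1]", "[LBL2]"],
)
print(construct)
```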

The χ3 angle was held at -87° or +97° (±10°) during the experiment to discard many of the putative disulfides that were generated using the default angles of +97° ±30° and -87° ±30°. [...] to have an energy value of less than 2.2 kcal/mol [92].

[...] organisms, an amino acid can be encoded by more than one codon, which is known as codon bias. Therefore, a codon adaptation study is carried out to predict appropriate codons that effectively encode each amino acid in a specific organism. The Java Codon Adaptation Tool or JCat server (http://www.jcat.de/) was used for codon optimization [112], and the optimized codon sequence was further analyzed for expression parameters, codon adaptation index (CAI), and GC content (%). The optimum CAI value is 1.0, while a score of > 0.8 is considered acceptable, and the optimum GC content ranges from 30 to 70% [113]. For the in silico cloning simulation, the pETite vector plasmid was selected; it contains a small ubiquitin-like modifier (SUMO) tag as well as a 6x polyhistidine (6X-His) tag, which facilitate the solubilization and affinity purification of the recombinant vaccine construct [114]. Also, 6X-His can facilitate the swift detection of the recombinant vaccine construct in immunochromatographic assays [115].

List of the proteins with their accession numbers used in the vaccine designing study.

[...] HTL epitopes as well as top B-cell epitopes with lengths over ten amino acids were taken into consideration for further analysis. Following this, several criteria were applied to filter the best epitopes: high antigenicity, non-allergenicity, non-toxicity, conservancy, and human proteome non-homology. Furthermore, the cytokine-inducing ability (i.e., IFN-γ, IL-4, and IL-10) of the HTL epitopes was also considered to determine whether they can induce at least one of these cytokines. Finally, the epitopes that met these criteria were listed as the most promising epitopes in Table 02 and were later used for the construction of the vaccine.
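
The selection criteria listed above amount to a conjunction of independent filters, which the following Python sketch makes explicit. The `Epitope` record, its field names, and the candidate entries are hypothetical; in practice each flag would come from the corresponding prediction server. Requiring HTL epitopes to induce at least one cytokine is this sketch's reading of the criterion, stated here as an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Epitope:
    sequence: str
    kind: str                       # "CTL", "HTL", or "LBL"
    antigenic: bool
    allergenic: bool
    toxic: bool
    conservancy: float              # percent conservancy across the subtypes
    human_homolog: bool
    induces_cytokine: bool = False  # IFN-γ, IL-4, or IL-10; checked for HTL only


def is_promising(e: Epitope) -> bool:
    """Apply the selection criteria; HTL epitopes must also induce a cytokine."""
    passes_core = (e.antigenic and not e.allergenic and not e.toxic
                   and e.conservancy == 100.0 and not e.human_homolog)
    if e.kind == "HTL":
        return passes_core and e.induces_cytokine
    return passes_core


def select_promising(candidates: List[Epitope]) -> List[Epitope]:
    return [e for e in candidates if is_promising(e)]


# Hypothetical candidates with placeholder names instead of real sequences.
candidates = [
    Epitope("peptide-1", "CTL", True, False, False, 100.0, False),
    Epitope("peptide-2", "CTL", True, True, False, 100.0, False),        # allergenic
    Epitope("peptide-3", "HTL", True, False, False, 100.0, False, True),
    Epitope("peptide-4", "HTL", True, False, False, 100.0, False, False),  # no cytokine
    Epitope("peptide-5", "LBL", True, False, False, 50.0, False),        # not conserved
]
print([e.sequence for e in select_promising(candidates)])
```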

Results of the secondary structure analysis of the vaccine construct.

The 3D structure of the vaccine protein produced by the RaptorX server was refined to predict a structure that closely resembles the native protein structure. The refined protein structure was then validated by evaluating the Ramachandran plot generated by the PROCHECK server and the z-score generated by the ProSA-web server. The Ramachandran plot study found that in the most [...] and IgG + IgM antibodies) (Fig 9A). The simulation also predicted that the concentrations of active B cells (Fig 9B and Fig 9C), plasma B cells (Fig 9D), helper T cells (Fig 9E and Fig 9F), and cytotoxic T cells (Fig 9H and Fig 9I) would steadily increase, reflecting the vaccine's capacity to generate a very strong secondary immune response and robust immune memory. However, Fig 9G shows that the concentration of regulatory T cells would gradually decrease throughout the phases of the injections, which represents a decrease in the suppression of vaccine-induced immunity by regulatory T cells [124].

In comparison, the rise in macrophage and dendritic cell concentrations showed that these APCs presented the antigen competently (Fig 9J and Fig 9K). The simulation result also predicted that the constructed vaccine could generate numerous types of cytokines, including IFN-γ, IL-23, IL-10, and IFN-β, some of the most critical cytokines for producing an immune response to viral infections (Fig 9L). Therefore, the overall immune simulation analysis showed that, after administration, the proposed polyvalent multi-epitope vaccine would be able to elicit a robust immunogenic response.

[...] promote the vaccine's purification during downstream processing [125]. The newly built recombinant plasmid has been designated "Cloned_pETite" (Fig 10). Thereafter, the Mfold [...]

[...] health concerns [127][128][129]. As a result, bioinformatics and immunoinformatics techniques have been developed and widely utilized to design novel subunit vaccines that are safe, effective, efficient, and low-cost alternatives to current preventive measures [130,131].

[...] conduct structure-function studies. Thus, antibodies can recognize any area of the antigen that is exposed to solvent. B-cell epitopes can be split into two categories, linear and conformational: conformational B-cell epitopes are made up of patches of solvent-exposed atoms from residues that are not necessarily sequential, while LBL epitopes are made up of sequential residues. Antibodies that identify LBL epitopes can recognize denatured antigens, whereas conformational B-cell epitopes are no longer recognized once the antigen is denatured [144]. T-cell and B-cell epitopes have been predicted for the selected RSV proteins using the IEDB server. The most conserved epitopes with high antigenicity, non-allergenicity, and non- [...] and IFN-γ [161,162].
It also facilitates the chemotaxis of immature DCs and T cells through its interaction with chemokine receptor 6 (CCR6), as well as the chemotaxis of monocytes through its interaction with CCR2 [163]. This peptide also promotes and activates myeloid DCs and natural killer (NK) cells [157,162].

[...] the refined structure showed a very high percentage of amino acids in the Ramachandran favored region. Following that, the refined structure was used for disulfide engineering. Disulfide engineering of the vaccine construct was conducted using the DbD2 v12.2 server to increase its stability.

The server can determine the B-factor of regions involved in disulfide bonding and identify potential disulfides that increase the protein's thermal stability [92]. All residue pairings in a [...]

TLRs, which are found on leukocytes and in tissues, play a major role in innate immunity activation by identifying invading pathogens, including viruses like RSV, and sending out signals that promote inflammation-related components [167]. TLRs such as TLR2, TLR1, TLR6, TLR3, and TLR4 are found on leukocytes and can interact with RSV to boost immune responses [168]. Within the lungs, TLR2 interactions with RSV increase neutrophil migration and dendritic cell activation. TLR2 exists as a heterodimer complex with either TLR1 or TLR6 on the surface of immune cells and tissues [169]. According to genetic analysis and vaccine studies, TLR2 signaling appears to be critical in RSV recognition [170][171][172]. TLR2/TLR1 or TLR2/TLR6 complexes can recognize RSV and greatly enhance early innate inflammatory responses [173][174][175]. Previous research has also suggested that the signals generated by TLR2 and TLR6 activation are critical for viral replication control [168]. [...] the production of proinflammatory cytokines and chemokines [183]. TLRs that are critical in RSV pathogenesis were considered for the docking analysis. Thus, TLR-1, TLR-2, TLR- [...] Table.

The immune simulation analysis of the proposed vaccine demonstrated that the vaccine could [...] RSV initially affects the upper respiratory tract, due to which the immune system may be stimulated against the virus predominantly at the mucosal surfaces [190]. Previous clinical studies also reported that IgG and IgM play significant roles against RSV infections [191].

Furthermore, the simulation analysis revealed that the concentration of regulatory T cells would gradually decrease throughout the phases of the vaccine doses, which indicates a potential decrease in the suppression of vaccine-induced immunity by regulatory T cells [192].

Hence, the proposed vaccine construct is predicted to produce an effective immunogenic [...]

The E. coli cell culture system is considered the most widely recommended system for the mass production of recombinant proteins. The codon adaptation analysis gave good results, with a CAI value of 0.98 and a GC content of 50.23%; any CAI value above 0.80 and a GC content of 30% to 70% are considered promising scores [34,186,193]. Following this, the optimized [...] free energy is often considered better than the higher maximal free energy score, which indicates the protein to be more stable. It can, therefore, be reported that the predicted vaccine could be very stable upon transcription [133]. Overall, this study suggests that the proposed vaccine [...] that were 100% conserved; they could therefore be efficient against the two selected viruses.
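
The codon adaptation criteria quoted above (CAI above 0.80 and GC content between 30% and 70%) can be checked with a short Python sketch. It computes GC content directly and the CAI as the geometric mean of per-codon relative adaptiveness weights, which is the standard definition of the index. The weight table and DNA string below are toy values, not real E. coli codon usage data or the optimized vaccine sequence, and this is not the JCat implementation.

```python
import math
from typing import Dict


def gc_content(dna: str) -> float:
    """Percentage of G and C bases in a DNA sequence."""
    dna = dna.upper()
    if not dna:
        raise ValueError("empty sequence")
    return 100.0 * sum(dna.count(base) for base in "GC") / len(dna)


def cai(dna: str, weights: Dict[str, float]) -> float:
    """Codon adaptation index: geometric mean of the codons' relative adaptiveness."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    codons = [dna[i:i + 3] for i in range(0, usable, 3)]
    if not codons:
        raise ValueError("sequence must contain at least one full codon")
    log_sum = sum(math.log(weights[codon]) for codon in codons)
    return math.exp(log_sum / len(codons))


def acceptable(cai_value: float, gc_percent: float) -> bool:
    """Thresholds quoted in the text: CAI > 0.8 and 30% <= GC <= 70%."""
    return cai_value > 0.8 and 30.0 <= gc_percent <= 70.0


# Toy relative adaptiveness weights (hypothetical, for illustration only).
toy_weights = {"GCG": 1.0, "GCC": 0.5, "AAA": 1.0}
toy_dna = "GCGAAAGCC"

print(round(gc_content(toy_dna), 2), round(cai(toy_dna, toy_weights), 4))
print(acceptable(0.98, 50.23))  # values reported for the optimized sequence
```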

In addition, high antigenicity, non-allergenicity, and non-toxicity as well as non-homology (to the human proteome) were also considered to be the criteria for choosing the most promising