In-silico design and assessment of OprD based multi-epitope vaccine against Acinetobacter baumannii

The Gram-negative opportunistic pathogen Acinetobacter baumannii is notorious for causing a plethora of nosocomial infections, predominantly respiratory diseases and bloodstream infections. Because it has developed resistance to last-resort antibiotics, its treatment is becoming increasingly difficult. Despite numerous therapeutic developments, no vaccine is available against this ubiquitous pathogen. It is therefore timely to formulate a rational vaccine strategy against this superbug. Considering the importance of Outer Membrane Porin D (OprD) as a potential vaccine candidate, we methodically combined the most persistent epitopes present in A. baumannii strains, using different immunoinformatic approaches, to design a multi-epitope vaccine. The proposed vaccine contains highly immunogenic linear B-cell, cytotoxic T lymphocyte, and helper T lymphocyte epitopes of the outer membrane porin OprD. The finalized epitopes are conserved across A. baumannii strains. The final 3D structure of the construct was predicted, refined, and validated using several in silico approaches. Favorable binding of the protein and adjuvant to TLR4 suggested a high immunogenic potential for the designed vaccine. MD simulations indicated that the protein structure is highly stable. Immune simulations showed a prominent increase in the levels of the immune response. The proposed vaccine model is predicted to be thermostable, immunogenic, water-soluble, and non-allergenic. However, this study is purely computational and needs to be validated by follow-up wet laboratory studies to confirm the safety and immunogenicity of our multi-epitope vaccine.

has characterized the multi-drug resistant (MDR) Acinetobacter spp. as an urgent threat, emphasizing the need for extensive research on therapeutics development (Giammanco et al., 2017). Additionally, A. baumannii has been listed by the World Health Organization (WHO) among the top-priority pathogens that pose the greatest threat to public health (Harding et al., 2018).
Burgeoning resistance rates have left very few treatment options and rendered the existing treatment approaches largely ineffective. Existing approaches usually involve monotherapy or combination therapy with antibiotics such as polymyxins, tigecycline, and sometimes aminoglycosides. However, their undesirable pharmacokinetic properties, toxicity (nephrotoxicity, neurotoxicity), and resistance have led to clinical failure (Shrivastava et al., 2018). Moreover, some of these antibiotics are useful only in combination, whereas others are associated with an increased fatality rate. Therefore, new therapeutic options are urgently needed to treat multidrug-resistant A. baumannii infections (Isler et al., 2018).
Taking this into account, researchers have worked extensively to devise new cost-effective therapeutic strategies that predominantly involve chemo-immunotherapies and the identification of new epitopes for active or passive immunization against the MDR pathogen. Newly proposed vaccine candidates against this pathogen include outer membrane proteins and porins such as Omp34 kDa (Jahangiri et al., 2018), OprC and OmpA in combination with Pal (Lei et al., 2019), the outer membrane protein nuclease NucAb, FilF, BamA (Singh et al., 2017), phospholipase D (Zadeh Hosseingholi et al., 2014), pili subunit hemagglutinin (Homenta et al., 2014), and functional exposed amino acid BauA (Sefid et al., 2015). Despite these advancements, no FDA-approved vaccine has reached the market because of underlying limitations such as high toxicity, low immunogenicity, insolubility when expressed, or complex compositions.
Many studies have validated the use of outer membrane proteins (OMPs) as successful vaccine candidates in Gram-negative bacteria (GNBs) such as Legionella pneumophila, Mannheimia haemolytica, Aeromonas salmonicida, Anaplasma marginale, Bartonella henselae, Campylobacter jejuni, and avian pathogenic E. coli (APEC) (Ayalew et al., 2010; Diao et al., 2020; Hove et al., 2020; Moumène et al., 2015). Proteomic analysis has generated a large amount of information on different families of porins, such as the OmpA and OprD families in Pseudomonas aeruginosa (Chevalier et al., 2017). OprD is primarily involved in carbapenem uptake. Recent studies have implicated downregulation of OprD and upregulation of efflux systems in the development of mutation-mediated resistance (Chevalier et al., 2017; Zeng et al., 2014). The active contribution of OprD to resistance makes it an important target for combating this pathogen.
According to previous studies, OprD of A. baumannii could act as a putative vaccine candidate (Kim et al., 2016). However, information on the immunogenic capability of OprD in A. baumannii is lacking. This prompted the present study, which aims to assess the capacity of OprD to elicit an immune response through bioinformatics analysis. In the current study, we selected immunogenic B-cell and T-cell epitopes of OprD of A. baumannii. This paper concisely describes the in-silico strategies employed in vaccine design. The region of OprD with the greatest immunogenic potential was selected as a novel vaccine candidate that could potentially be used for designing therapeutic or prophylactic peptide vaccines against A. baumannii. Additional bioinformatic analysis was carried out, as described in the following section.

Signal peptide and localization prediction
To identify possible signal peptides in the sequence, the online tools SignalP 5.0 (Almagro Armenteros et al., 2019), available at http://www.cbs.dtu.dk/services/SignalP-5.0/, and LipoP (Juncker et al., 2003), available at http://www.cbs.dtu.dk/services/LipoP/, were used. SignalP 5.0 uses artificial neural networks to improve signal peptide prediction (Almagro Armenteros et al., 2019). LipoP 1.0 locates signal peptidase I and II cleavage sites within the protein sequence (Juncker et al., 2003). Protein localization was predicted with DeepLoc, a template-free algorithm that achieves high accuracy by employing deep neural networks (Almagro Armenteros et al., 2017). Two servers were used to predict putative transmembrane domains (TMBs): PRED-TMBB and BOCTOPUS2. PRED-TMBB pinpoints transmembrane strands in proteins of Gram-negative bacteria (GNB) using a Hidden Markov Model (HMM) (Tsirigos et al., 2016). BOCTOPUS2, in turn, predicts topology by detecting the backbone hydrogen-bonding restraints that could be used to form large TMBs (Hayat and Elofsson, 2012).

Analysis of conserved region
A preliminary analysis of the OprD protein sequence of A. baumannii, taken as representative of all strains of the bacterium, was conducted using BLASTp. The BLAST search was performed against the non-redundant protein sequences (nr) database of bacteria using the BLOSUM62 matrix. The retrieved sequences were aligned by multiple sequence alignment (MSA) using BLAST software to obtain the conserved regions. To evaluate the reliability of this protein sequence, it was used as a query in a BLASTp search against the non-redundant database restricted to Homo sapiens (taxid:9606). Homologous non-human proteins in other GNBs were also identified and interpreted to assess the inter-species and intra-species conservation of the selected protein.
B lymphocytes of the immune system detect and bind B-cell epitopes (BCEs) harbored within foreign molecules. Prediction of BCEs is important in designing vaccine constructs as well as diagnostic tests (Larsen et al., 2006). Several servers were used to identify potential BCEs. The BepiPred-2.0 webserver was used to predict linear B-cell epitopes with a random forest algorithm; this method is reported to outperform other tools for sequence-based predictions (Jespersen et al., 2017). For precise estimation of the linear BCEs, a Hidden Markov model, Thornton's method, and a Support Vector Machine (SVM) were utilized by ABCpred, BCPREDS, and SVMTriP, respectively (Saha and Raghava, 2006; Yao et al., 2020). Using several tools to find BCEs yields good-quality results (Faria et al., 2011).

Evaluation of Cytotoxic T lymphocytes (CTL)
Identifying specific peptide patterns that elicit a strong MHC-restricted cytotoxic T cell response is a vital step in vaccine formulation (Zhao et al., 2003). To predict CTL epitopes, the NetCTL 1.2 webserver was used, which combines artificial neural network (ANN) and matrix-based predictions of TAP transport efficiency, MHC class I affinity, and proteasomal cleavage (Larsen et al., 2007). The default threshold value of 0.75 was used.

Helper T-Lymphocyte (HTL) Epitope Mapping
HTL epitopes were identified with the NetMHCII 2.3 webserver. This server calculates binding affinities of peptides to MHC II molecules, including 7 mouse H2 alleles, 20 HLA-DQ, 9 HLA-DP, and 25 HLA-DR class II alleles. It has been reported to be a highly effective tool for accurately predicting the binding affinities of peptides towards MHC class II molecules.
The results are reported as IC50 values in nM, along with a percentile rank relative to a set of 1,000,000 random natural peptides. IC50 values of < 500 nM represent strong binding affinities, so this criterion was chosen (Jensen et al., 2018).
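The IC50 criterion above can be illustrated with a minimal Python sketch. It assumes the server predictions have already been parsed into records; the record layout, the peptide strings, the allele labels, and the IC50 values below are hypothetical placeholders and not results from this study.

```python
from typing import List, NamedTuple

# Cutoff stated in the text: peptides with IC50 < 500 nM are retained.
IC50_CUTOFF_NM = 500.0


class BindingPrediction(NamedTuple):
    peptide: str     # hypothetical placeholder sequence
    allele: str      # hypothetical allele label
    ic50_nm: float   # predicted IC50 in nM


def select_binders(predictions: List[BindingPrediction],
                   cutoff_nm: float = IC50_CUTOFF_NM) -> List[BindingPrediction]:
    """Keep predictions below the cutoff, strongest binder (lowest IC50) first."""
    kept = [p for p in predictions if p.ic50_nm < cutoff_nm]
    return sorted(kept, key=lambda p: p.ic50_nm)


# Toy input for illustration only.
toy_predictions = [
    BindingPrediction("PEPTIDEAAAAAAAA", "HLA-DRB1*01:01", 730.5),
    BindingPrediction("PEPTIDECCCCCCCC", "HLA-DRB1*15:01", 499.9),
    BindingPrediction("PEPTIDEDDDDDDDD", "HLA-DRB1*01:01", 42.0),
]
selected = select_binders(toy_predictions)
```

Sorting by ascending IC50 is a convenience for manual inspection; the text itself specifies only the cutoff.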

Multi-Epitope Subunit Vaccine Designing
Correct positioning of the selected epitopes plays a significant role in yielding maximum immunization. Improper joining of peptides without linkers and adjuvant may lead to the synthesis of a completely new protein with unknown properties (Farhadi et al., 2015). To avoid such errors, the selected B-cell, HTL, and CTL epitopes were connected sequentially by means of suitable linkers, namely GPGPG to connect linear B-cell epitopes with HTLs and AAY to connect CTLs. Apart from enhancing the immunogenicity of the construct, these linkers also improve epitope presentation and avert the production of junctional epitopes (Tahir ul Qamar et al., 2020). Adding an appropriate adjuvant is fundamental in boosting the immunogenic behavior of the vaccine (Sun et al., 2018). The 548-amino-acid GroEL (HSP60) of Salmonella typhi has shown promise as an immunogenic protein (Chitradevi et al., 2013). Therefore, its protein sequence (AN: NP_458769.1) was retrieved from NCBI and saved in FASTA format. It was attached to the amino terminus of the designed construct via the EAAAK linker, a rigid helical linker that does not permit interaction of the construct with other regions of the protein, thus conferring stability (Choi et al., 2019).
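The assembly scheme described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the exact construct: the ordering (adjuvant, EAAAK, B-cell and HTL epitopes joined by GPGPG, then CTL epitopes joined by AAY) is one reading of the text, the function name is hypothetical, and the sequences in the usage example are toy placeholders.

```python
from typing import List

# Linkers named in the text.
ADJUVANT_LINKER = "EAAAK"   # rigid linker between adjuvant and epitopes
BCELL_HTL_LINKER = "GPGPG"  # joins linear B-cell and HTL epitopes
CTL_LINKER = "AAY"          # joins CTL epitopes


def assemble_construct(adjuvant: str,
                       b_cell: List[str],
                       htl: List[str],
                       ctl: List[str]) -> str:
    """Concatenate adjuvant and epitopes with linkers (assumed ordering)."""
    gpgpg_block = BCELL_HTL_LINKER.join(b_cell + htl)
    construct = adjuvant + ADJUVANT_LINKER + gpgpg_block
    if ctl:
        # Assumption: AAY also bridges the GPGPG block and the first CTL epitope.
        construct += CTL_LINKER + CTL_LINKER.join(ctl)
    return construct


# Toy placeholder sequences, not the epitopes selected in this study.
toy_construct = assemble_construct(
    adjuvant="MKT",
    b_cell=["DDDD", "EEEE"],
    htl=["FFFF", "HHHH"],
    ctl=["IIII", "KKKK"],
)
```

Keeping the linkers as named constants makes it straightforward to test alternative orderings before submitting a construct to downstream servers.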

Assessment of Physicochemical Aspects and Peptide Solubility
The physicochemical properties of the designed vaccine construct were predicted with the ExPASy ProtParam webserver. The parameters calculated were the total number of residues, molecular weight in kDa, in-vivo and in-vitro half-life, aliphatic index, theoretical pI, instability index, and grand average of hydropathicity (GRAVY) (Gasteiger et al., 2005). To gauge the solubility of the peptide, the Protein-Sol webserver was employed. This webserver uses a fast protein sequence-based calculation of solubility. The prediction is reported on a 0-1 scale, with values greater than the average value of 0.45 considered soluble (Hebditch et al., 2017).
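Two of the listed parameters have simple closed forms and can be reproduced offline as a sanity check. The sketch below assumes the standard definitions: GRAVY as the mean Kyte-Doolittle hydropathy per residue, and the aliphatic index as X(Ala) + 2.9 X(Val) + 3.9 [X(Ile) + X(Leu)], where X is the mole percent of each residue. The function names are illustrative and the code is not the ProtParam implementation.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Grand average of hydropathicity: mean hydropathy per residue."""
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


def aliphatic_index(sequence: str) -> float:
    """Aliphatic index from mole percentages of Ala, Val, Ile and Leu."""
    n = len(sequence)

    def mole_percent(aa: str) -> float:
        return 100.0 * sequence.count(aa) / n

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))
```

A positive GRAVY value points to overall hydrophobicity and a negative one to hydrophilicity, while a higher aliphatic index is generally associated with greater thermostability.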

Immunogenicity Profiling
AllerTOP v 2.2 was employed to investigate the allergenicity of the vaccine. It classifies proteins using the k-nearest neighbors (kNN) machine learning algorithm and shows an accuracy of 85.3% at 5-fold cross-validation (Dimitrov et al., 2014). The VaxiJen v 2.0 and ANTIGENpro tools were used to assess the antigenicity of the peptide. VaxiJen predicts the immunogenic potential of a construct with an accuracy ranging from 70% to 89% (Flower et al., 2017). ANTIGENpro is reported to predict protein antigenicity with an accuracy greater than 75.5% (Magnan et al., 2010).

Determination of the Secondary Structure
To determine the secondary structure of the polypeptide, the PSIPRED webserver was employed, which processes a position-specific scoring matrix using two feed-forward neural networks and has an accuracy of 84.2% (McGuffin et al., 2000).

Derivation of 3-Dimensional Structure of Vaccine
To predict the 3D structure of the peptide, the I-TASSER (Iterative Threading ASSEmbly Refinement) webserver was used, which performs structure folding and remodeling through Monte Carlo simulations based on an improved knowledge-based force field. It consists of three key components: (i) hydrogen-bonding networks, (ii) basic statistical potentials, and (iii) threading-based restraints from LOMETS (Yang et al., 2014). It is reported to be a top-ranked server for structure prediction in the last five CASP experiments (Zhang, 2008).

Improvement of Derived Protein Structure
The initial models were then refined, a step that transforms initial models into structures comparable in accuracy and quality to experimental structures (Feig, 2017). The ModRefiner server (http://zhanglab.ccmb.med.umich.edu/ModRefiner) was used for preliminary refinement to achieve an overall improvement in the physical structure (Xu and Zhang, 2011). Next, the GalaxyRefine webserver (http://galaxy.seoklab.org/cgibin/submit.cgi?type=REFINE) was employed to further enhance the structural quality. This method involves rebuilding and repacking of side chains as well as overall structure relaxation by molecular dynamics simulations (Heo et al., 2013).

Structure Validation
The structural quality of the protein construct was evaluated and confirmed with several quality assessment measures, such as Ramachandran plot analysis of the distribution of torsion angles in the structure, clash scores from all-atom contact analysis, aberrations in structural conformity, and rotamer analysis. These indicators have proved to be highly valuable in determining the accuracy of local and global three-dimensional structures of proteins (Pražnikar et al., 2019). In light of this, several online web servers were utilized to confirm the refined structures. These include (Laskowski et al., 1983) was also employed to analyze the Ramachandran Plot. Ramachandran plots help in analyzing the distribution of the (φ, Ψ) torsion angles of the protein backbone to evaluate protein structures (Sobolev et al., 2020).

Mapping of discontinuous B-cell epitopes (Conformational)
More than 90% of B-cell epitopes are discontinuous, formed by residues that are distant in the primary structure of a protein but brought into proximity by folding (Mukonyora, 2015). The online tool ElliPro (http://tools.iedb.org/ellipro/) was employed to infer conformational BCEs. ElliPro is considered a state-of-the-art webserver, having achieved an AUC value of 0.732 (Ponomarenko et al., 2008).

Molecular Dynamics (MD) Analysis
GROMACS 5.0 software was used to gain a deeper understanding of the structural integrity of the protein in a simulated life-like environment (Abraham et al., 2015). The OPLS-AA force field was applied to the system, and the overall charge was noted when the PDB file was used as input. The SPC/E water model was used to solvate the cubic system, and the genion tool was employed to add ions to the system to neutralize the overall charge. Temperature was stabilized under the NVT ensemble and pressure under the NPT ensemble. Subsequently, MD simulations were performed for 10 nanoseconds (ns), and the resulting root-mean-square deviation (RMSD) and root-mean-square fluctuation (RMSF) profiles were examined.
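For readers unfamiliar with the two trajectory metrics, the following Python sketch shows how RMSD and RMSF are defined. It is a simplified illustration and not the GROMACS implementation: it assumes each frame is a list of (x, y, z) coordinates that have already been superimposed on the reference, and the function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(frame: Sequence[Coord], reference: Sequence[Coord]) -> float:
    """Root-mean-square deviation of one frame from a reference structure."""
    squared = sum((a - b) ** 2
                  for atom, ref_atom in zip(frame, reference)
                  for a, b in zip(atom, ref_atom))
    return math.sqrt(squared / len(frame))


def rmsf(frames: Sequence[Sequence[Coord]]) -> List[float]:
    """Per-atom root-mean-square fluctuation around the time-averaged position."""
    n_frames = len(frames)
    n_atoms = len(frames[0])
    fluctuations = []
    for i in range(n_atoms):
        # Time-averaged position of atom i.
        mean = [sum(f[i][k] for f in frames) / n_frames for k in range(3)]
        # Mean squared displacement of atom i from that average.
        msf = sum(sum((f[i][k] - mean[k]) ** 2 for k in range(3))
                  for f in frames) / n_frames
        fluctuations.append(math.sqrt(msf))
    return fluctuations
```

RMSD gives one value per frame and tracks global drift from the starting structure, whereas RMSF gives one value per atom (or residue) and highlights flexible regions.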

Molecular Docking Analysis
Only appropriate binding of immunogenic molecules to specific immune receptors can evoke a proper immune response. TLR4 is not only involved in generating a suitable immune response and long-lasting immunity but also plays a specific role in A. baumannii infections in vitro and in vivo (Monem et al., 2020; Pulido et al., 2020; Shi et al., 2020

Prediction of linear B-cells
The most recurrent and immunogenic epitopes were carefully selected to design the vaccine construct. BCPREDS, ABCpred, SVMTriP, and BepiPred 2.0 predicted a total of 74 epitopes, from which the final epitopes were recruited for the development of a new vaccine construct (Table 1).
Default settings were selected for the calculation of epitopes. Finally, two epitopes were carefully chosen based on their high-affinity scores (Table 1).

Prediction of Helper T Lymphocytes (HTLs)
The NetMHCII 2.3 webserver predicted MHC-II epitopes with high binding scores towards the HLA alleles, expressed as IC50 values (in nM). A total of two epitopes were selected for final incorporation into the vaccine construct (Table 1). The physicochemical properties were estimated from the FASTA-format sequence in the ExPASy ProtParam webserver to determine the basic characteristics of the vaccine. The molecular weight was calculated to be 69.9 kDa. The isoelectric point (pI) was found to be 5.51, which indicates that the protein is slightly acidic. ProtParam classified the protein as stable, with an instability index (II) value of 28.26; II values greater than 40 signify instability.
The model showing the maximum C-score of -0.45 was chosen for additional refinement (Fig. 4a). The TM-score and RMSD value were 0.66 and 3.0 ± 4.6, respectively. A model has correct topology if the TM-score is greater than 0.5, whereas values below 0.17 indicate nonspecific similarity. The primary vaccine model was refined with the ModRefiner server (Fig. 4b), followed by subsequent refinement with GalaxyRefine (Fig. 4c). After several refinements, a final model was selected based on distinct parameters that include RMSD (0.286), GDT-HA (0.9873), and MolProbity (2.046). The poor rotamers value and clash score were 0.0 and 10.7, respectively. The Ramachandran plot score was 94.5%. Therefore, further immunoinformatic analyses were performed on this carefully chosen model.

Structure Validation
According to the Ramachandran plot analysis, 94.46% of protein residues were found in the favored region (Fig. 4d). This value is consistent with the GalaxyRefine score of 94.5%. Moreover, 8.7% of residues were observed in the allowed region and only 1% of the residues were detected in the disallowed region. These percentages indicate the good quality of our predicted model. To detect possible errors in the refined model and validate the global quality of the structure, ERRAT and ProSA-web scores were determined. The Z-score of -9.54 (Fig. 4e) and the quality factor of 87.6% further endorse the good quality of our model.

Predicted Discontinuous Epitopes
A total of seven discontinuous B-cell epitopes, comprising 127 residues and with immunogenicity scores ranging from 0.641 to 0.891, were identified. The conformational epitopes varied in size from five to thirty-nine residues (Table 2, Figure 5).

Molecular dynamics analysis of multi-epitope vaccine
To assess the movement of atoms and the stability of the protein, molecular dynamics simulation was performed using GROMACS. The designed system underwent energy minimization followed by temperature and pressure equilibration. The resulting graphs showed that the system reached a desirable temperature of 299.7 K (Fig. 6a) and an average pressure of 0.92 bar with a total drift value of 0.62 (Fig. 6b). To further analyze the overall stability of the structure and how much the conformation changed between two time points, the RMSD graph was analyzed. It showed that the protein remained highly stable. The RMSD started at 0.12 nm and rose to 1.10 nm at 8.8 ns. These slight oscillations show that the model maintained stability over the simulation period (Fig. 6c). To further analyze the conformational stability, an RMSF plot was generated, in which the highest fluctuation peak was observed at 1.37 nm and the lowest at 0.15 nm.
The overall graph showed very slight oscillations where higher peaks in graph depict regions of higher flexibility (Fig. 6d).  Moreover, E. coli pET30a (+) vector was used for integration and optimal expression of the designed construct. For successful integration and cloning of the sequence into the vector, two restriction sites Xho I and Nde I were added with the help of SnapGene software (Fig. 10).

Discussion
A. baumannii has emerged as an extremely troublesome pathogen across the world, becoming the leading cause of nosocomial and neonatal deaths (Peleg et al., 2008). It has successfully developed pan-drug resistance and has thus become one of the most difficult pathogens to treat or control (Styles et al., 2020). An estimated 30 to 76% of deaths are attributed to A. baumannii infections, a figure that can reportedly increase depending on the severity of the patient's condition (Ballouz et al., 2017). Different types of potential vaccine candidates have been discovered, and many subunit vaccines consisting of a protein or set of proteins have been studied against A. baumannii, including Outer Membrane Proteins (OMPs), Outer Membrane Vesicles (OMVs), Inactivated Whole Cell Vaccines (IWCs), Ata, Bap, and several other subunit vaccines.
However, due to the complex configurations, enhanced toxicity, and evocation of non-specific immune response, none of these vaccines could make their way to market or even the clinical trials (Ni et al., 2017a). More apt and rational vaccine design to combat this pathogen is the need of the hour. Therefore, we designed a multi-epitope vaccine for the first time using innocuous and more immunogenic regions of the outer membrane porin OprD.
The goal of this study was to introduce a novel multi-epitope vaccine design by exploiting numerous in-silico approaches against A. baumannii. OprD is a vaccine candidate protein predicted by the in-silico analysis of complete genomes and several strains of MDR A. baumannii. Complex proteome analysis of complete genomes and antibiotic-resistant strains of A. baumannii was performed by Ni et al. (2017b). Outer membrane porins, being profound in nature, could instigate a considerable immune response and induce immunization upon encounter with A. baumannii. Eight porin proteins having potential roles as protective antigens and virulence factors were recognized, one of which was OprD. Several studies support the prospective role of OprD as a successful vaccine candidate (Li et al., 2014). This porin participates in the transport of basic amino acids and carbapenem uptake. It is pertinent to use bioinformatics tools for carefully choosing the regions of OprD that confer immunogenicity.
OprD is present in a wide array of clinical strains of A. baumannii (Ni et al., 2017a; Zhu et al., 2019). The antibody-mediated, or humoral, immune response is prompted by the recognition of distinct features of the pathogen that the host considers non-self (Chaplin, 2010). These distinct features are antigenic determinants, also known as epitopes, which are specifically detected by the antigen-binding sites of the antibodies of the host's immune system (Cruse and Lewis, 1999). Studies support the role of B-cell and T-cell immunity against A. baumannii infections (Chen, 2020; Morris et al., 2019). The role of the TLR4 pathway has also been reported in detail, along with the potential roles of other TLRs in mediating the interaction between antigens and the host immune system (Kim et al., 2013). We first predicted T-cell and B-cell epitopes of the protein and combined them with specific linkers to obtain a multi-epitope vaccine. These spacers play a key role in improving vaccine design (Shey et al., 2019). The previously reported cleavable linkers AAY, GPGPG, and EAAAK were fused into the final vaccine design, with AAY and GPGPG connecting the predicted peptides and EAAAK linking the adjuvant to the N-terminus of the B-cell epitopes (Dong et al., 2020). A six-His tag was joined at the C-terminus to facilitate purification at later stages. Several immunoinformatic computations revealed that the designed vaccine model contains abundant high-affinity MHC class I and class II epitopes and linear B-cell epitopes. The absence of allergenic features in this model further supports its safety as a vaccine candidate.
The molecular weight of our construct was observed to be 69.9 kDa with the capability of being soluble when expressed. Apart from defining the bioactivity of the protein, the solubility of protein also plays a crucial role in the biochemical and functional analysis (Ahmad et al., 2018).
The protein was found to be slightly acidic, with a theoretical pI of 5.51. Similarly, the protein was predicted to be stable upon expression, reinforcing the suitability of the designed construct. The protein was also observed to be hydrophobic due to the presence of aliphatic side chains, as indicated by the aliphatic index value. All these features indicate that the protein can withstand high temperatures and is thus well suited for use in endemic regions.
Information regarding the secondary, as well as the tertiary structure, is vital in vaccine designing (Shey et al., 2019). Secondary structure analysis revealed that the protein is largely comprised of coils (43%), with only 7.16% of the residues disordered. The 3D structure of the vaccine showed notable refinement and thus resulted in the attainment of the desirable properties such as Ramachandran plot values. Rama-favored regions had a score of 94.5 % with rare residues in the outlier region thus suggesting the high quality of the model.
Energy minimization was performed to stabilize the overall conformation and minimize the potential energy of the vaccine. During energy minimization, anomalous parts of the structure are repaired, giving rise to a more stable protein structure that would behave more efficiently in life-like cellular environments. The RMSD of the vaccine-TLR4 complex was predicted to be 0.5 Å, which supports the stability of the complex.
Several studies have demonstrated the role of TLR4 in protecting against A.baumannii infections. Therefore, a data-driven docking analysis was carried out to assess the possible interactions between the designed construct and TLR4. Results suggest that the designed construct could effectively stimulate the immune response. Binding energies elucidate the strong binding of TLR4 with the construct. It is strongly recommended to further investigate the interactions between TLR4 and the construct in vitro as well as in vivo.
The results obtained from immune simulations were consistent with actual immune responses. A continuous rise in immune responses was observed after repeated exposures to the antigen. IgG1, IgG3, and IgE immune responses against A. baumannii have been described in different reports (Ansari et al., 2019; Cha et al., 2018). We observed a spike in B-cell and T-cell populations, as well as a considerable peak in Th cell populations.
The first and foremost step in the validation of a potential vaccine candidate is to inspect it for immunoreactivity (Gori et al., 2009). For this purpose, it is necessary to express the target protein in an appropriate host. E. coli is a well-established expression system for attaining maximum protein expression levels (Chen, 2012). To achieve high-level expression in E. coli (strain K12), codon optimization was performed. The values of the CAI (0.99) and GC content (50.54%) were well suited for optimal expression.
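The two codon-optimization metrics reported above can be computed as in the following Python sketch. GC content is the percentage of G and C bases; the codon adaptation index (CAI) is taken here in its usual form as the geometric mean of per-codon relative adaptiveness weights. The weight table must come from a reference set of highly expressed host genes; the function names and any weights passed in are illustrative assumptions, not the values used by the optimization tool in this study.

```python
import math
from typing import Dict


def gc_content(dna: str) -> float:
    """Percentage of G and C bases in a DNA sequence."""
    dna = dna.upper()
    return 100.0 * (dna.count("G") + dna.count("C")) / len(dna)


def codon_adaptation_index(dna: str, weights: Dict[str, float]) -> float:
    """Geometric mean of relative adaptiveness weights over all codons.

    `weights` maps each codon to a value in (0, 1]; the table is an input
    supplied by the caller (hypothetical here), not built into this sketch.
    """
    dna = dna.upper()
    codons = [dna[i:i + 3] for i in range(0, len(dna) - len(dna) % 3, 3)]
    log_sum = sum(math.log(weights[codon]) for codon in codons)
    return math.exp(log_sum / len(codons))
```

A CAI close to 1 means that nearly every codon is the one preferred by the host for its amino acid.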

Conclusion
Novel and effective strategies are required to cope with difficult-to-treat A. baumannii infections.
This study makes use of several in-silico tools to design a rational vaccine that consists of multiple B-cell and T-cell (CTL and HTL) epitopes. Our designed model is predicted to be antigenic and immunogenic with reduced cytotoxic effects, and could trigger antibody-mediated protection.

Disclosure statement
No conflict of interest was reported by the authors.