In silico design of glyco-D,L-peptide antiviral molecules

Background: Most licensed antiviral drugs are nucleoside analogs. Recent research focuses on blocking the virus from entering cells at the adsorption/entry stage. In this entry mechanism the glycans present on the viral surface play a fundamental role. Homochiral L-peptides acting on this fusion mechanism have shown some inhibition of viral infection. Peptides with a regularly alternating enantiomeric sequence (L,D-peptides) can assume structures that are not accessible to the corresponding homochiral molecules. Further, L,D-peptides are less sensitive to enzymatic digestion. Aim: To design in silico an L,D-peptide with a high affinity for the viral surface glycans, and consequently able to interfere with the viral fusion mechanism. Methods: A 3α,6α-mannopentaose (3-6MP) molecule was used to simulate a viral surface glycan. Molecular Dynamics (MD) simulations of 3-6MP and a D,L-peptide in water were performed using the force field AMBER12-GLYCAM06i. The binding constant was evaluated from the trajectories. The D,L-peptide molecule was modified in its sequence, length and terminals, and finally glycosylated, to attain a very high binding constant for 3-6MP. In addition, the specific interaction between the T lymphocyte CD4 glycoprotein and the HIV envelope gp120 glycoprotein was studied through MD simulations between a D,L-peptide bound to a typical CD4 glycan and a highly conserved HIV gp120 glycan. Results: For the interaction with the 3-6MP molecule, a very effective molecule was obtained: H-D-Trp-L-Pro-D-Asn-L-Pro-D-Trp-L-Pro-D-Asn-L-Pro-OH, where the Asn residues in positions 3 and 7 are glycosylated with alpha-D-Mannopyranosyl-(1->3)-[alpha-D-mannopyranosyl-(1->6)]-alpha-D-mannopyranosyl-(1->4)-N-acetyl-beta-D-glucopyranosyl-(1->4)-N-acetyl-beta-D-glucopyranosyl-1-OH.
For the interaction with the HIV envelope, a very effective molecule able to antagonize the CD4 glycoprotein was obtained: H-D-Trp-L-Pro-D-Asn-L-Pro-D-Trp-L-Pro-D-Asn-L-Pro-OH, where the Asn residue in position 3 is glycosylated with alpha-D-galactopyranosyl-(1->4)-N-acetyl-beta-D-glucopyranosyl-(1->2)-alpha-D-Mannopyranosyl-(1->3)-[alpha-D-galactopyranosyl-(1->4)-N-acetyl-beta-D-glucopyranosyl-(1->2)-alpha-D-Mannopyranosyl-(1->6)]-beta-D-mannopyranosyl-(1->4)-N-acetyl-beta-D-glucopyranosyl-(1->4)-N-acetyl-beta-D-glucopyranosyl-1-OH. Conclusion: These glycosylated D,L-peptide molecules are very promising representatives of a new class of antiviral agents.


Introduction
There is a continuous effort in the search for new antiviral agents, aimed at developing new and more effective drugs capable of targeting an increasing number of diseases while limiting side effects. The approved drugs have different mechanisms of action during the viral life cycle 1. A few of them are entry inhibitors. One of these (Enfuvirtide) is a polypeptide with 36 units. Many polypeptides have been tested as antivirals 2,3. The viral fusion stage is often mediated by the glycoproteins present on the viral envelope 4.
Oligo-L,D-peptides can adopt structures that are not accessible to the corresponding homochiral peptides. In particular, by assuming suitable local conformations, they can form helices of different diameters 5,6. Another property of these molecules is that the peptide bonds connecting D,L and L,D dimers are less susceptible to digestion by proteolytic enzymes than those of LL dimers 7. This increases their average lifetime in the organism. The initial idea of our research was to find a short L,D-oligopeptide capable of binding the sugar moiety of the viral surface glycoproteins, thereby interfering with the fusion stage of the infection. To simulate such a glycan we used 3α,6α-mannopentaose (3-6MP, fig. 1). This is a common core structure for a large number of glycans present in glycoproteins 8.

Fig. 2. Left: H-(D-Phe-L-Lys)4-OH in β-helix conformation with 6.2 monomers per turn. Middle: the helix wraps around the oligosaccharide. Right: same as the middle, but with different colors for the oligosaccharide, the Phe residues and the Lys residues.
The rationale for the chosen octapeptide sequence was its solubility, obtained through the introduction of the charged Lys residues, and the stabilizing van der Waals interactions between the Phe residues and the oligosaccharide. The association of the two molecules was monitored by performing MD simulations, through visual inspection of the trajectory with the VMD 1.9.1 program 9 and by following the distance between the molecules' centers of gravity as a function of time.
Our simulations, however, showed that this system (fig. 2) is unstable. The peptide molecule moves away from the oligosaccharide and evolves into a globular structure in which β-turns are present (fig. 3). This happens because the peptide sequences L,D and D,L can easily form β-turn conformations 10. Therefore, the resulting association constant is low. Given this result, we decided to modify the sequence, the length and the terminals of the D,L-peptide molecule, and finally to glycosylate it, in an attempt to attain a very high binding constant. It is worth mentioning that glycopeptide antibiotic molecules have shown antiviral activity 11,12. Detailed Methods and Results are given below.

Methods
Force field: oligosaccharides are present in the system together with the peptide molecule, hence the force field chosen is AMBER12sb 13 with GLYCAM06j 14,15.

Molecular Dynamics simulations: all-atom MD simulations were performed in water using the GROMACS 4.5.6 software package 16. AMBER12sb and GLYCAM06j were obtained from the pertinent official sites 17,18. After minimization, a 150 ps equilibration in the NVT ensemble and a subsequent 150 ps equilibration in the NPT ensemble, a 200 ns MD simulation was performed in a periodic box, with 1 Na+ and 1 Cl- to simulate the ionic strength.
The following settings were used: an integration step of 2 fs, with periodic boundary conditions at constant volume and temperature (T = 310 K); the leap-frog integration scheme for the equations of motion; the default LINCS algorithm to constrain bonds involving H atoms; the Verlet cutoff scheme for neighbor searching, since parallel computing on GPUs was performed; and the Particle Mesh Ewald (PME) algorithm for long-range electrostatics. Three different simulations were performed for each peptide molecule, differing in box volume and/or relative position of the peptide and 3-6MP molecules and/or initial peptide conformation.

Carbohydrate coordinates, topology and charges: these were obtained with the Carbohydrate Builder tool on the GLYCAM-Web site glycam.org. The molecule can be submitted in condensed GLYCAM notation 19 or built by specifying the monosaccharides and their linkages.
Files with the extensions rst7 for coordinates and parm7 for topology were obtained, in AMBER style. The program glycam2gmx.pl 20,21 can be used to obtain files in GROMACS style. The command line is perl ./glycam2gmx.pl -prmtop *.parm7 -crd *.rst7 -outname *, which returns *.gro for coordinates and *.top for topology (* stands for the chosen names). These files must be merged with the corresponding peptide.gro and peptide.top files obtained with the program pdb2gmx for the peptide molecule. A single coordinate file was obtained by merging the carbohydrate coordinates into the peptide.gro file. The coordinates can be visualized with the VMD program and, if necessary, translated with the program editconf (part of the GROMACS suite). A single topology file was obtained by renaming the carbohydrate file *.top to *.itp and inserting the line #include "*.itp" in the peptide.top file. Duplicated records, atom type definitions and the water topology must be deleted.

Glycopeptide coordinates, topology and charges: these were obtained with the Glycoprotein Builder tool on the GLYCAM-Web site glycam.org.
Step 1: the coordinates of the peptide to be glycosylated are submitted in PDB format 22, without hydrogens and without CONECT records.
Step 2: the ends are fixed. If the peptide is in zwitterionic form, CONTINUE must be chosen.
Step 3: the glycan derivative is chosen with the interactive Carbohydrate Builder tool; the saccharide chain is selected, ending with the aglycon -OH; this end will be linked to the peptide; after the aglycon, branches can be introduced; terminate with the DONE command; add the glycan to the glycosylation sites: biologically likely N-linking and O-linking sites are shown; SHOW ALL gives all sites with their solvent accessible surface area; select and continue; download the current structure.
Step 4: the structure is minimized.
Step 5: the rst7 and parm7 files are downloaded. The program glycam2gmx.pl can be used to obtain files in GROMACS style. In simulations of the glycopeptide with a carbohydrate molecule, the carbohydrate coordinates and topology are merged into the glycopeptide files following the same procedure as before.

Association constant estimation: the distance between the molecules' centers of gravity was calculated with the command g_dist for each MD simulation. When the intermolecular distance was lower than a threshold dependent on the molecular weights (MW) of the molecules (in our case 1.5 nm), the molecules were considered associated. From the total time of the MD simulation ($t_{tot}$) it is possible to extract the time during which the molecules are associated ($t_{ass}$). The fraction of association time is the ratio $t_{ass}/t_{tot}$, and from it the association constant can be roughly calculated by taking the molar fraction as numerically equal to the time fraction, applying the ergodic postulate. Because the trajectories are of limited length, the evaluation of the constant is approximate.
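The time-fraction estimate described above can be sketched in Python. This is a minimal illustration, not the authors' script: it assumes evenly spaced frames (so the frame fraction equals the time fraction), and the function names are hypothetical. The formula used for the constant is an assumption as well, namely a simple 1:1 binding model with one molecule of each species in the box, $K = x / ((1-x)^2 c_0)$ with $c_0 = 1/(N_A V)$; the source's own expression may differ.

```python
AVOGADRO = 6.02214076e23  # particles per mole


def association_fraction(distances_nm, cutoff_nm=1.5):
    """Fraction of frames in which the centre-to-centre distance is below
    the cutoff. With evenly spaced frames this equals t_ass / t_tot."""
    if not distances_nm:
        raise ValueError("empty distance series")
    bound_frames = sum(1 for d in distances_nm if d < cutoff_nm)
    return bound_frames / len(distances_nm)


def association_constant(fraction, box_volume_nm3):
    """ASSUMED 1:1 model (not taken from the source): one peptide and one
    glycan in a box of volume V, K = x / ((1 - x)^2 * c0), in L/mol."""
    if not 0.0 <= fraction < 1.0:
        # fraction == 1 means no dissociation was observed, so the
        # constant cannot be estimated with this approach
        raise ValueError("fraction must be in [0, 1)")
    volume_litres = box_volume_nm3 * 1e-24  # 1 nm^3 = 1e-24 L
    c0 = 1.0 / (AVOGADRO * volume_litres)   # mol/L of a single molecule
    return fraction / ((1.0 - fraction) ** 2 * c0)


# Hypothetical distance series (nm), e.g. parsed from a g_dist output file
distances = [1.0, 2.0, 1.4, 3.0]
x = association_fraction(distances)
print(x, association_constant(x, box_volume_nm3=1000.0))
```

When the two molecules never separate, the fraction equals 1 and the function raises an error, which mirrors the situation in which the constant cannot be measured with this method.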
The association constants reported were averaged over all trajectories calculated for the same molecular system. The error was calculated as half the range of values obtained.
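As a small worked illustration of this averaging, the sketch below computes the mean of the constants from several trajectories and the error as half the range of the values. The function name and the numerical values are hypothetical and serve only to show the arithmetic.

```python
from statistics import mean


def mean_and_half_range(constants):
    """Return (mean, error), where the error is half the range of the
    values obtained from the individual trajectories."""
    if not constants:
        raise ValueError("no association constants given")
    return mean(constants), (max(constants) - min(constants)) / 2.0


# Hypothetical constants from three trajectories of the same system
k_mean, k_err = mean_and_half_range([120.0, 150.0, 180.0])
print(f"K = {k_mean} +/- {k_err}")
```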

Results
The peptide and glycopeptide sequences studied are given below together with their association constants (in mol⁻¹) with the 3-6MP molecule. In order to achieve the expected association constant, we introduced the following changes: 1) terminal ACE and NME groups were introduced to increase the number of intramolecular hydrogen bonds (HB) and so stabilize the peptide helix conformation; 2) Pro or Hyp residues were introduced to impart conformational stiffness to the helix; 3) the D-amino acid was replaced in order to introduce a range of hydrophobic and hydrogen-bonding interactions. All the following peptides behave like the peptide H-(D-Phe-L-Lys)4-OH.

For TrpProAsnPenta, six different MD simulations were performed. In all cases the result is a stable binding between TrpProAsnPenta and 3-6MP after their initial encounter in the simulation box. The binding arrangement can vary, but no dissociation occurs. This indicates a very high association constant (not measurable with the aforementioned method). The results of one MD simulation are reported as the intermolecular distance vs time (fig. 7), and the structure of the association complex obtained is shown in fig. 8.

Encouraged by these positive findings, we developed a simulation aimed at a specific case. The idea is to use for glycosylation the same cellular glycan chain preferentially exploited by the virus for adhesion in the first stage of fusion, thus creating a molecular mimicry against the virus. Considering the specific case of HIV, this means using a glycan chain of the CD4 glycoprotein, which is involved in the interaction with the viral surface glycoproteins in the early stage of recognition 4. Numerous different glycan chains are reported for this glycoprotein 24, with one, two or three branches. We chose the structure alpha-D-galactopyranosyl-...

The glycoprotein CD4 interacts with the HIV glycoprotein gp120, which can interact with it through different exposed glycans 4.
The glycan at position Asn262 of HIV gp120 was chosen for the interaction with TrpProAsn9glyco, because it has the best resolved structure in a recent quaternary complex with CD4, and its presence is essential for the fusion mechanism 25.

MD simulations were performed on TrpProAsn9glyco and N262glyco in water. In all cases the result is a stable binding between TrpProAsn9glyco and N262glyco after their initial encounter in the simulation box. As in the previous case, we found a very high association constant (not measurable with the aforementioned method). The results of one MD simulation are reported in fig. 11, which shows the structure of the association complex obtained.

Fig. 11. Structure at 73.8 ns, which represents the average of the association during the simulation; two representations are shown, one with different colors for different chemical units, the other with different colors for the two molecules, to locate them better.
It is worth mentioning that the relative positions of the molecules in the 'association state' are consistent with a hypothetical interaction between the CD4 and gp120 glycoproteins, as indicated by the opposite positions of the glycan attachment points, which are the peptide and the GlcNAc terminals, respectively. These findings are illustrated in Fig. 12.

Conclusions
Our results show that oligo-D,L-peptides can achieve a stable association with a glycan only when glycosylated. Further, the glycosylation is effective only if the glycan side chains are suitably long. Indeed, H-D-Asn-L-Pro-D-Asn(-GlcNAc)-L-Pro-D-Asn(-GlcNAc)-L-Pro-D-Asn(-GlcNAc)-L-Pro-OH, which has three sites glycosylated with a single monosaccharide, does not achieve a stable association with the 3-6MP molecule. H-D-Trp-L-Pro-D-Asn(-Penta)-L-Pro-D-Trp-L-Pro-D-Asn(-Penta)-L-Pro-OH achieves in silico the goal of strong binding to the molecule 3-6MP, which is the common core structure of a large number of glycans present in glycoproteins. H-D-Trp-L-Pro-D-Asn(-9glyco)-L-Pro-D-Trp-L-Pro-D-Asn-L-Pro-OH achieves in silico the goal of strong binding to the molecule N262glyco, where 9glyco is a common glycosylation of CD4 and N262glyco a conserved glycan chain of gp120, thereby realizing a molecular mimicry against HIV. If our in silico studies are confirmed by in vivo tests, this array of compounds will represent an entirely new class of antiviral drugs, with promising therapeutic potential. A possible synthetic route for these molecules is presented in the Appendix. The great variability achievable through modifications of the peptide sequence, length and endings, and through its glycosylation, allows fine tuning of logP, solubility and interaction specificity. Finally, it is possible to use for glycosylation the same cellular glycan chain preferentially exploited by the virus for adhesion in the first stage of fusion, thus creating a molecular mimicry against the virus. In the case of HIV, we used a glycan chain of the CD4 glycoprotein 4, which is involved in the interaction with the surface glycoproteins of the virus in the early stage of recognition.