How a Hemicarcerand Incarcerates Guests at Room Temperature Decoded with Modular Simulations

Hemicarcerands are host molecules created to study constrictive binding with guest molecules for insights into the rules of molecular complexation. However, the molecular dynamics simulations that facilitate such studies have been limited because three-dimensional models of hemicarcerands are tedious to build and their atomic charges are complicated to derive. There have been no molecular dynamics simulations of the reported water-soluble hemicarcerand (Octacid4) that explain how it uniquely encapsulates its guests at 298 K and keeps them encapsulated at 298 K in NMR experiments. Herein we report a modular approach to hemicarcerand simulations that simplifies the model building and charge derivation in a manner reminiscent of the approach to protein simulations with truncated amino acids as building blocks. We also report that apo Octacid4 in water adopts two clusters of conformations, one of which has an equatorial portal open thus allowing guests to enter the cavity of Octacid4, in microsecond molecular dynamics simulations performed using the modular approach at 298 K. Under the same simulation conditions, the guest-bound Octacid4 adopts one cluster of conformations with all equatorial portals closed thus keeping the guests incarcerated. These results explain the unique constrictive binding of Octacid4 and suggest that the guest-induced host conformational change that impedes decomplexation is a previously unrecognized conformational characteristic that promotes strong molecular complexation.


Introduction
Hemicarcerands comprise two identical bowl-shaped resorcinarene fragments tethered with four linkers (Fig. 1). These molecules, containing four equatorial portals in the linker region and two axial portals in the resorcinarene region, are developed as hosts to encapsulate small-molecule guests in a typical way that the guests enter and exit the host cavity at high temperatures and remain in the cavity at low temperatures 1,2 . The binding characteristics and solubility of hemicarcerands are governed mostly by the linker structures. Octaacid 4 (Octacid4; tethered with a 4,6-dimethylisophthalic acid fragment as shown in Fig. 1) is reportedly a unique hemicarcerand that is water soluble and can form host•guest complexes (hemicarceplexes) with small molecules in water without the need to raise temperature 3 . Therefore, the Octacid4-like hemicarceplexes are useful model systems for the study of constrictive binding, which is a type of molecular complexation that is affected by the activation energy required for a guest to enter the host cavity through a size-restricting portal [4][5][6] , to obtain insights into the rules of molecular complexation.
To date, only a few molecular dynamics (MD) simulations of hemicarcerands or hemicarceplexes have been reported for the facilitation of constructive bindings studies [7][8][9][10] . This scarcity is partly because hemicarcerands have more than 200 atoms, making their three-dimensional models tedious to build. The scarcity is also due to the complexity of hemicarcerands, all of which have multiple conformations 8,11 . As shown in Fig. 2, variations of the torsions along the four linkers of Octacid4 can result in many distinct conformations, which complicates the derivation of the conformationdependent atomic charges of the hemicarcerand. These technical complexities, as detailed below, may explain why there have hitherto been no aqueous MD simulations of Octacid4 to understand how it uniquely encapsulates various small-molecule guests at 298 K and keeps them encapsulated at the same temperature such that the bound guests can be differentiated in their NMR spectra from those in the bulk phase 3 .
Herein we report a modular method that simplifies the model building and charge derivation for MD simulations of Octacid4 and its hemicarceplexes. We also report the characterization of the host -4 -cavity and motions of the encapsulated guests that explains the unique constrictive binding of Octacid4 and offer new mechanistic insight into molecular complexation.

Results
The modular approach. In addition to the general technical complexities in hemicarcerand simulations noted above, the charge derivation for water-soluble hemicarcerands such as Octacid4 is particularly complicated. Because of the need to balance the atomic charges of the water-soluble hemicarcerand with the charges of the aqueous solvent and the charges of the small-molecule guest when using an AMBER forcefield such as FF12MClm 12 , the hemicarcerand charges need to be derived from ab initio calculations using the HF/6-31G* basis set that uniformly overestimates the polarity of the molecule. This is because (1) aqueous solvent models (such as the widely used TIP3P empirical water model) include polarization effects due to the empirical calibration to reproduce the density and enthalpy of liquid vaporization 13 , and (2) the small-molecule guest bears the restrained electrostatic potential (RESP) charges that are derived from ab initio calculations using the HF/6-31G* basis set [14][15][16] . Further, to obtain the atomic charges of a water-soluble hemicarcerand without any bias toward one particular conformation of the molecule, the hemicarcerand charges need to be obtained from ab initio calculations at the HF/6-31G*//HF/6-31G* level with multiple conformations of the molecule followed by using the Lagrange multiplier to force identical charges on equivalent atoms in these conformations 16 . These ab initio calculations and the Lagrange multiplier are computation-demanding and labor-intensive, respectively. For example, the ab initio calculation of one Octacid4 conformation at the HF/6-31G*//HF/6-31G* level took ~168 CPU hours using the computers at the University of Illinois Urbana-Champaign National Center for Supercomputing Applications.
During the course of our Octacid4 simulations, we recognized that the technical complexities described above differ little from those of MD simulations of proteins with large numbers of atoms -5 -and conformations. One known approach to circumventing the complexities of protein MD simulations is to perform the simulations using the modular concept underlying the secondgeneration AMBER forcefield 17 . This forcefield builds a protein model with truncated amino acids as building blocks and derives the atomic charges of each building block from its multiple conformations using the Lagrange multiplier to force identical charges on equivalent atoms in these conformations 16 . Given the effectiveness of the modular concept as demonstrated by the reported autonomous miniprotein folding MD simulations that achieved agreements between simulated and experimental folding times within factors of 0.6-1.4 18 , we used to the modular concept to simplify the model building and charge derivation for hemicarcerands and devised a modular method for hemicarcerand simulations. We describe below two key attributes of our method for performing the MD simulations of Octacid4.
The building block of Octacid4. Because Octacid4, similar to most hemicarcerands, has C4 symmetry, we divided it into four identical building blocks (termed HC1; Fig. 3). In this building block, C1 and C20 are designated as the head and tail atoms of the residue, respectively, similar to the designation of head and tail atoms in a truncated amino acid residue for protein simulations.
First, we assembled the four residues into a linear molecule. This assembly followed the same scheme as that for assembling protein residues-namely, constructing a covalent bond between the tail atom of the preceding residue and the head atom of the following residue. Second, we converted the linear molecule to a cyclic molecule by specifying a cross-link (viz., constructing a covalent bond) between the head atom of the first HC1 and the tail atom of the last HC1. Third, we converted the cyclic molecule to Octacid4 by specifying (i) four cross-links between C4 of the preceding HC1 and C18a of the following HC1, (ii) four cross-links between O5 of the preceding HC1 and C17a of the following HC1, and (iii) four cross-links between O16a of the preceding HC1 and C6b of the following HC1.
The Octacid4 sequence and all cross-links can be specified all together, thus making the hemicarcerand model building conceptually simple, algorithmic, and generalizable. Notably the HC1 building block can, in the same manner, be further divided into three sub-building blocks for hemicarcerands without C4 symmetry.
Atomic charges of HC1. Given the HC1 residue defined above, we obtained the atomic charges of Octacid4 via the RESP charge derivation for HC1 with the following specific conditions, in addition to the general conditions that are the same as those for protein RESP charges (such as the two-stage fitting for the methyl and methylene groups and forcing identical charges on equivalent intramolecular atoms) 16,17 .
First, we converted HC1 to HCD1 by attaching blocking groups to the junction atoms of HC1 ( Fig. 3) as these groups are needed to mimic the polar and aromatic groups abutting the junction atoms in Octacid4. We then performed the ab initio calculation of HCD1 to derive the HCD1 charges by using the Lagrange multiplier to force (1) HCD1 to have a net charge of -2 and (2) the total charge of all blocking groups to be zero, similar to the protein charge derivation from ab initio calculations of the acetyl-and N-methyl-blocked amino acids whose blocking group are used to mimic adjacent residues of the central amino acid 16 . Second, to balance the hemicarcerand charges with those of the solvent and the guest, we obtained the HCD1 RESP charges from ab initio calculations of HCD1 at the HF/6-31G*//HF/6-31G* level. Third, to avoid any bias toward one particular conformation of Octacid4, we obtained the HCD1 RESP charges from ab initio calculations using (1) two HCD1 proteins that uses the alpha helical and beta strand side-chain conformations 16 . Last, we extracted the HC1 charges from the HCD1 charges.
In contrast to the computation-demanding and labor-intensive charge derivation for the intact Octacid4 molecule, the HC1 charge derivation can be readily performed with HCD1. For example, the ab initio calculation of each of the two HCD1 conformations at the HF/6-31G*//HF/6-31G* level took 31-40 CPU hours, considerably less than the ~168 CPU hours for Octacid4 noted above.
Characterization of the Octacid4 cavity. Using the modular method described above, we performed 16 sets of 20 316-ns distinct, independent, unrestricted, unbiased, and classical isobaric-isothermal MD simulations of Octacid4 and its hemicarceplexes with seven small-molecule guests at 298 K and  (Table 1) from the 14 reported guests that formed hemicarceplexes with Octacid4 3 .
As apparent in Fig. 4, the Octacid4 cavity is confined by a set of atoms shown with the sphere model. Therefore, we used the variation in the radius of gyration (Rg) of these atoms to estimate the change in cavity volume. Overall, the standard errors for the average Rgs of the cavity for the 16 sets of MD simulations were either 0.01 Å or 0.02 Å (Table 1) Table 1).
The average and standard error of the cavity Rg for the 20 simulations at 298 K were 5.44 Å and 0.02 Å, respectively (Table 1). For the simulations at 340 K, the Rg also remained at ~5.4 Å and spiked to >5.7 Å, but the frequency of the spikes was much higher at 340 K than that at 298 K ( Fig. 5B and Fig. S1), and the average and standard error of the Rg at 340 K were 5.45 Å and 0.02 Å, respectively (Table 1). For the simulations of the DEA, p-xylene, and naphthalene hemicarceplexes, we found a cavity Rg of ~5.6 Å with a few or no spikes of >5.7 Å at 298 K and 340 K ( Fig. 5 and Fig. S1); the respective average and standard error of the cavity Rg of the simulations at 298 K were 5.64 Å and 0.02 Å for DEA, 5.56 Å and 0.02 Å for p-xylene, and 5.65 Å and 0.02 Å for naphthalene. The corresponding -9 -values at 340 K were 5.65 Å and 0.02 Å for DEA, 5.59 Å and 0.02 Å for p-xylene, and 5.66 Å and 0.02 Å for naphthalene (Table 1). These time series of Rg and their average values indicate that (1) the hemicarceplexes involving relatively bulky guests in water all adopted one cluster of conformations with their cavities larger than that of the closed conformation of apo Octacid4 in vacuo and smaller than that of the open conformation of apo Octacid4 in vacuo, and (2) these hemicarceplexes mostly kept their cavities contracted with all equatorial portals closed.

Characterization of guest motion in Octacid4.
We also performed conformational cluster analyses on the simulations at 340 K for apo Octacid4 and the seven hemicarceplexes. All hemicarceplex simulations were performed using an initial conformation in which the host adopted the open conformation of apo Octacid4 in vacuo ( Fig. 2A) and the guest was manually docked into the host cavity with its maximal dimension perpendicular to the axial axis (viz., the axis passing the two axial portals). Interestingly, as detailed below, in the aqueous MD simulations, all seven guests adopted a new orientation with their maximal dimensions parallel to the axial axis (Fig. 6). Similar results were obtained (Fig. S2) when the analyses were performed using the simulations at 298 K. For the DMSO hemicarceplex (Fig. 6A), the most populated conformation had the guest oxygen atom pointing to the equatorial portals and the two guest methyl groups pointing to the axial portals; in the average conformation of the largest conformation cluster, the guest was shrunk to a ball with the oxygen atom on one side and the two overlapping carbon atoms on the other, indicating that the guest had rotated around multiple axes.
-10 -For the EtOAc hemicarceplex (Fig. 6B), the most populated guest conformation (population: 94.7%) was not fully extended (with C1-C2-O1-C3 and C2-O1-C3-C4 being -114° and 81°, respectively), and the population of the fully extended guest conformation with corresponding torsions of -180° and 180° was 0.1%; the most populated guest conformation had its two oxygen atoms pointing to the equatorial portals and its two methyl groups pointing to the axial portals; in the average conformation of the largest conformation cluster, the guest was shrunk to a short rod with the oxygen atoms in the middle and the two carbon atoms on both ends of the rod, indicating that the guest had rotated frequently around the axial axis and less frequently around the equatorial axis (the axis passing the two opposing equatorial portals).  portals. In the average conformation of each cluster, the guest was shrunk to a long rod whose length was the same as that of the guest, indicating that these guests all had rotated exclusively around the axial axis. reportedly kept its guests encapsulated at the same temperature 3 . Perhaps the guests were incarcerated by strong intrinsic binding (viz., the complexation governed by strong nonbonded intermolecular interactions). However, the involvement of strong intrinsic binding is debatable given the weak interaction of the cavity with DMSO or 1,4-dioxane as indicated by their free spins as described above (Table 1 and Figs. 6A and 6D). An alternative explanation is therefore needed.

Discussion
Unexpectedly, our characterization studies show that upon complexation, Octacid4 adopted only Octacid4 in water adopts one cluster of conformations with closed equatorial portals. These conformations satisfactorily explain how Octacid4 can incarcerate a range of guests at 298 K, which is the temperature at which the guests enter the cavity.
-13 -Implication. Collectively, our studies suggest that Octacid4 can open one of its equatorial portals without elevating temperature before complexation and close all of its equatorial portals without lowering temperature after complexation. These interesting and unexpected capabilities enable the unique constrictive binding of Octacid4 with a range of small-molecule guests and suggest further that the guest-induced host conformational change that impedes decomplexation is a previously unrecognized conformational characteristic that is conducive to strong molecular complexation. This characteristic could broaden the theoretical dissection of the experimentally observed complexation affinity and the design of new complex systems for materials technology, data storage and processing, catalysis, drug design and delivery, and medicine by accounting for not only intrinsic binding (which is limited because, confined by the cost of chemical synthesis, only a small number of functional groups can be introduced to improve nonbonded intermolecular interactions) but also constrictive binding (which requires one or a few functional groups to trigger the formation of a host conformation that hinders decomplexation as demonstrated by the Octacid4 hemicarceplexes).

Molecular dynamics simulations.
Hemicarcerand Octacid4 or its hemicarceplex neutralized with sodium ions was solvated with the TIP3P water 13 ("solvatebox molecule TIP3BOX 8.2") using tLEaP of the AmberTools 16 package (University of California, San Francisco) and then energy-minimized for 100 cycles of steepest-descent minimization followed by 900 cycles of conjugate-gradient minimization to remove close van der Waals contacts using SANDER of the AMBER 11 package (University of California, San Francisco), forcefield FF12MClm 12 , and a cutoff of 8.0 Å for nonbonded interactions. The resulting system was heated from 5 K to 298 or 340 K at a rate of 10 K/ps under constant temperature and constant volume, and then equilibrated for 10 6 timesteps under constant temperature of 298 K or 340 K and constant pressure of 1 atm employing isotropic molecule-based scaling. Finally, a set of 20 distinct, independent, unrestricted, unbiased, isobaric-isothermal, and 316--14 -ns MD simulations of the equilibrated system was performed for Octacid4 or its hemicarceplex using PMEMD of the AMBER 16 package (University of California, San Francisco), forcefield FF12MClm 12 , and a periodic boundary condition at 298 K or 340 K and 1 atm. The 20 unique seed numbers for initial velocities of the 20 simulations were taken from Ref. 22   Root mean square deviation and radius of gyration. Carbon root mean square deviations were calculated manually using ProFit V2.6 (http://www.bioinf.org.uk/software/profit/). Radius of gyration was calculated using CPPTRAJ.

Data availability
All data generated for this study are included in this Article and its Supplementary Information. K.G.M. performed the literature search; prepared the z-matrices of the HC1 and HCD1 residues; performed energy minimization of HC1, HCD1, and Octacid4; analyzed the MD simulation result of p-xylene•Octacid4; contributed to revisions of the manuscript. Y.-P.P. conceived the modular method; designed the HC1 and HCD1 residues and all protocols for MD simulation and analysis; performed the remaining computational study; analyzed the data; wrote the manuscript.

Competing interests
The authors declare no competing interests.        MoGIO: The motion of the guest incarcerated in Octacid4. Axial spin: rotation around the axial