Abstract
The prediction of protein mutations that affect function may be exploited for multiple uses. In the context of disease variants, the prediction of compensatory mutations that reestablish functional phenotypes could aid in the development of genetic therapies. In this work, we present an integrated approach that combines coevolutionary analysis and molecular dynamics (MD) simulations to discover functional compensatory mutations. This approach is employed to investigate possible rescue mutations of a poly(ADP-ribose) polymerase 1 (PARP1) variant, PARP1 V762A, associated with lung cancer and follicular lymphoma. MD simulations show PARP1 V762A exhibits noticeable changes in structural and dynamical behavior compared with wild type PARP1. Our integrated approach predicts A755E as a possible compensatory mutation based on coevolutionary information, and molecular simulations indicate that the PARP1 A755E/V762A double mutant exhibits similar structural and dynamical behavior to WT PARP1. Our methodology can be broadly applied to a large number of systems where single nucleotide polymorphisms (SNPs) have been identified as connected to disease and can shed light on the biophysical effects of such changes as well as provide a way to discover potential mutants that could restore wild type-like functionality. This can in turn be further utilized in the design of molecular therapeutics that aim to mimic such compensatory effect.
Significance Statement
Discovering protein mutations with desired phenotypes can be challenging because of the combinatorial nature of the search. Herein we employ a methodology that combines gene SNP association with disease, direct coupling analysis, and molecular dynamics simulations to systematically predict rescue mutations. Our workflow identifies A755E as a potential rescue for the PARP1 V762A mutation, which has been associated with cancer. This methodology is general and can be applied broadly.
Introduction
The identification of disease–associated mutations that result in missense protein variants can provide avenues for therapeutic development. For example, trans–splicing is a therapeutic approach that can be employed to repair mutations at the mRNA level (1). Therefore, understanding the impact of disease variants may be of value to determine if these mutations can or should be targeted for genetic therapies.
The identification and characterization of missense mutations has been a field of active research. Some of us have developed an approach termed Hypothesis Driven-SNP-Search (HyDn-SNP-S). This approach involves a search for single nucleotide polymorphisms (SNPs) that result in missense mutations and are associated with a specific phenotype on a particular gene or genes, followed by atomistic simulations to characterize the impact of the mutation (2, 3). We have previously employed this approach to uncover and characterize various cancer-associated mutations (2, 4, 5), including the prediction and experimental confirmation of a rescue mutation for a lung cancer-associated mutation on APOBEC3H (6). Although there are successful examples of the prediction of rescue mutations, a systematic method to discover these variants would be beneficial.
Proteins evolve through a series of neutral or selectively favored mutations (7, 8) that could coevolve with corresponding compensatory mutations to maintain constraints from folded structure or function (9–11). Such coevolutionary information between residue sites can be inferred by statistical modeling of the sequences in a protein family, and it has achieved significant performance in predicting physical contacts for protein folding and protein-protein interaction prediction (12–15). The coevolutionary model has also been used to estimate mutational effects in epistasis studies (16–19). The direct coupling analysis (DCA) method is a statistical model that estimates a global probability distribution of protein sequences by inferring parameters, including covariation coupling between residues and site-wise conservation, from multiple sequence alignments (MSAs) of homologous sequences (20). As a result, DCA is a useful tool that has been successfully applied in the prediction of protein structures (21, 22), conformational changes (23), protein interactions (24), function (25), Sequence Evolution with Epistatic Contributions (SEEC) (19), and recently in protein design (25, 26).
Given the features of these two methodologies, coevolutionary analysis and SNP search can be combined to further understand the relationship between cancer-related mutations and compensatory mutations which could rescue the SNP variant (Figure 2). Working from these two origins, molecular dynamics (MD) simulations of the identified mutations can be used to contextualize their impacts in reference to the wild type structure. In this contribution we present the development of a methodology that combines HyDn-SNP-S with coevolutionary analysis to uncover possible compensatory mutations for disease variants. We apply this approach to the regulatory domain of protein PARP1 and use MD simulations to understand the mutation’s effect on the overall PARP1 structure.
Poly(ADP-ribose) polymerase 1 (PARP1) performs base excision and repairs single-stranded breaks. It acts as an ADP-ribosylating enzyme, covalently attaching ADP-ribose to proteins. The successive transfer of ADP-ribose results in a PAR chain, which acts as a signal for other DNA-repair enzymes (27). This process, known as PARsylation or PARylation, occurs on both single- and double-stranded DNA (28, 29). PARP1 is believed to perform over 90% of all cellular PARsylation activity (30). PARP1 is known to assist with the repair of single- and double-stranded breaks through several DNA repair pathways, including base excision repair (BER), nucleotide excision repair (NER), mismatch repair (MMR), homologous recombination repair (HRR), and non-homologous end joining (NHEJ) (28, 31). Despite its involvement in these various pathways, PARP1 is only essential for single-strand break repair, and it is considered non-essential for double-stranded repair (32). When it is in a position to repair strand breaks, PARP1 is believed to dimerize with the DNA-binding domain of another PARP1 (33). This dimerization is facilitated by the central automodification domain (34). PARP1 is of particular interest because of its inextricable link with the BRCA1 enzyme, known for breast cancer susceptibility (29, 35). Both PARP1 and BRCA1 are involved in homologous recombination repair (HRR), where a damaged area of DNA is resynthesized using a sister chromatid (36–38). BRCA1 is also known to perform PARsylation, which, together with RAP80, regulates HRR (39). This link has been utilized in the treatment of BRCA-mutated cancers, as evidenced by the FDA-approved use of the PARP1 inhibitor Olaparib for advanced ovarian cancer (40). Inhibiting PARP1 leads to a stalling of the replication fork and a subsequent switch to repair via the NHEJ pathway in cancer cells, but a continuation via HRR in non-cancer cells (41).
PARP1’s N-terminal domain has three zinc fingers, one responsible for interactions between domains, and the other two involved in DNA binding (43). When DNA damage occurs, PARP1 localizes to the damaged area (44). The zinc fingers bind to the exposed nucleotides, instead of the 3’ and 5’ ends at the break sites, allowing for versatility in binding other secondary arrangements of DNA (45). The catalytic region then goes through three enzymatic reactions for PARsylation, composed of initiation, elongation, and branching. Central to this process is an “ADP-ribosyltransferase (ART) signature” (Figure 1) comprised of a conserved His-Tyr-Glu (H-Y-E) triad in its nicotinamide binding pocket (46). PARsylation requires the nicotinamide adenine dinucleotide (NAD+) as a coenzyme, because PARP1 polymerizes the ADP-ribose units (47). Unfolding of the helical subdomain (HD) (Figure 1) is crucial for the activation of PARP-1, and thus, changes in stability in this region can affect the enzyme’s catalytic output or how it binds NAD+ (48). This unfolding has been proposed to occur through a two-step mechanism, first through DNA binding and secondarily through substrate binding to destabilize the folded HD structure (49). Wild type PARP1 has been found to be upregulated in different cancers (50–56). In turn, overactivation of PARP1 can lead to mitochondrial distress and cell necrosis (57–59). One particular single nucleotide polymorphism (SNP), rs1136410, results in the V762A missense mutation in the HD region of PARP1. The resulting V762A mutation has been shown to reduce the enzymatic activity of PARP1 (60). This SNP has been linked to both lung cancer and follicular lymphoma through the HyDn-SNP-S method (2, 3). The rs1136410 SNP has also been shown to serve as a protective factor against breast cancer and coronary artery disease in the Han Chinese population, but it may lead to an increased overall risk of age-related cataracts and cancers (61–65).
The ADP-ribosyltransferase (ART; red) and the helical (HD; pale orange) subdomains of the catalytic domain of PARP1 are highlighted on the 4ZZZ wild-type structure (42). Residues A755 and V762 are shown in black.
In the remainder of this paper we describe the application of the combined HyDn-SNP-S and coevolutionary analysis methods to determine whether there are possible compensatory mutations for the PARP1 V762A variant. In the next section we provide details of the coevolutionary analysis and molecular dynamics simulations methods, followed by the results of these approaches for wild type (WT) and various mutants of PARP1. Finally, concluding remarks are provided on the applicability of this combined approach.
Materials and Methods
Coevolutionary Analysis for PARP Regulatory Domain
The V762A mutation is found within the PARP regulatory domain of PARP1. To investigate evolutionary footprints for this functional domain, multiple sequence alignments (MSAs) of homologous sequences for this specific domain are obtained from the Pfam database with an entry ID of PF02877 (66). The direct coupling analysis (DCA) method (20) is then applied to the MSA dataset to extract information about the coevolutionary coupling between every pair of residues and the preference of amino acid occurrence at each residue position. As described in (20), DCA utilizes maximum entropy modeling to estimate the joint probability distribution of amino acid sequences of a protein or domain sequence:

$$P(S) = \frac{1}{Z} \exp\left( \sum_{i<j} e_{ij}(A_i, A_j) + \sum_{i} h_i(A_i) \right) \qquad (1)$$

where $S = (A_1, \ldots, A_L)$ is an aligned sequence of length $L$, $Z$ is the partition function, the positions of residues within the aligned domain or protein sequence are denoted as $i$ and $j$, and the parameters $e_{ij}$ and $h_i$ can be inferred by DCA. The parameters $e_{ij}$ quantify the coevolutionary coupling strength between residues $i$ and $j$ for all possible amino acid occurrence pairs. The amino acid bias at each single residue position is captured by the parameter $h_i$. Because exact inference of these parameters is intractable, there are multiple approximations to infer them, with different complexities and accuracies. In this work we use the mean field formulation (mfDCA) (20), which optimizes the identification of highly coupled sites but, unlike other approximations such as bmDCA (67) or arDCA (68), is not generative. Since the generative property is not needed in our context, mfDCA provides both accuracy and low computational complexity.
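The mean field inference described above can be illustrated with a minimal NumPy sketch. This is not the published MATLAB implementation: the function name `mean_field_dca`, the pseudocount value, the integer encoding of the MSA, and the choice of the last state as the gauge reference are all assumptions made for illustration, and sequence reweighting is omitted for brevity.

```python
import numpy as np

def mean_field_dca(msa, q=21, pseudocount=0.5):
    """Naive mean-field DCA sketch (hypothetical helper, no sequence reweighting).

    msa: (N, L) integer array with states in 0..q-1.
    Returns couplings e with shape (L, L, q, q) and fields h with shape (L, q),
    in the gauge where the last state has zero couplings and zero field.
    """
    n_seq, length = msa.shape
    one_hot = np.eye(q)[msa]  # (N, L, q)

    # Single-site and pairwise frequencies, regularized with a pseudocount.
    fi = (1 - pseudocount) * one_hot.mean(axis=0) + pseudocount / q
    fij = (1 - pseudocount) * np.einsum("nia,njb->ijab", one_hot, one_hot) / n_seq
    fij += pseudocount / q**2
    for i in range(length):
        fij[i, i] = np.diag(fi[i])  # keep i == j blocks consistent with fi

    # Connected correlations, dropping the last state to fix the gauge.
    c = fij - fi[:, None, :, None] * fi[None, :, None, :]
    c = c[:, :, : q - 1, : q - 1]
    c_mat = c.transpose(0, 2, 1, 3).reshape(length * (q - 1), length * (q - 1))

    # Mean-field approximation: couplings are minus the inverse correlation matrix.
    e_mat = -np.linalg.inv(c_mat)
    e = np.zeros((length, length, q, q))
    e[:, :, : q - 1, : q - 1] = (
        e_mat.reshape(length, q - 1, length, q - 1).transpose(0, 2, 1, 3)
    )
    for i in range(length):
        e[i, i] = 0.0  # self-couplings are not part of the model

    # Mean-field estimate of the local fields in the same gauge.
    h = np.log(fi / fi[:, -1:]) - np.einsum("ijab,jb->ia", e, fi)
    return e, h
```

In practice the MSA would be the integer-encoded Pfam alignment with 20 amino acids plus a gap state; the pseudocount keeps the correlation matrix invertible even when two columns are perfectly correlated.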
Calculation of a Sequence-based Energy Function for PARP1 Mutants
Using the collection of $e_{ij}$ and $h_i$ parameters estimated by DCA, a sequence-based energy function can be calculated from Equation 1 for any given aligned sequence. This energy function, or Hamiltonian (H), for a protein sequence S is expressed as:

$$H(S) = -\sum_{i<j} e_{ij}(A_i, A_j) - \sum_{i} h_i(A_i) \qquad (2)$$

where $A_i$ is the amino acid at position $i$ of S.
Calculating the energy function $H(S_{WT})$ for the wild type sequence of PARP1’s regulatory domain provides a reference energy against which amino acid changes in the sequence can be compared. This sequence Hamiltonian has been predictive of functional and non-functional effects in proteins and RNA (69–71). Any amino acid substitution in this domain updates the energy function to a mutant one, $H(S_{Mut})$. The effect of any mutant can then be estimated in terms of the difference in this sequence-based energy function:

$$\Delta H_{Mut} = H(S_{Mut}) - H(S_{WT}) \qquad (3)$$

In this context, a mutation with a more positive $\Delta H_{Mut}$ score is predicted, in general, to have an unfavorable or neutral effect, while a more negative score represents a favorable change for fitness.
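As a minimal sketch of how such a score could be evaluated, the following NumPy functions compute the sequence energy and its change upon a single substitution. It assumes the standard DCA sign convention, in which the energy is minus the sum of the couplings and fields, which is consistent with more negative scores being favorable. The names `hamiltonian` and `delta_h`, the array layout of `e` and `h`, and the integer encoding of sequences are illustrative assumptions, not the published MATLAB code.

```python
import numpy as np

def hamiltonian(seq, e, h):
    """Sequence energy: minus the sum of pair couplings (i < j) and local fields.

    seq: length-L integer sequence; e: (L, L, q, q) couplings; h: (L, q) fields.
    """
    seq = np.asarray(seq)
    idx = np.arange(len(seq))
    # pair[i, j] = e[i, j, seq[i], seq[j]]
    pair = e[idx[:, None], idx[None, :], seq[:, None], seq[None, :]]
    coupling = np.triu(pair, k=1).sum()  # count each pair once
    field = h[idx, seq].sum()
    return -(coupling + field)

def delta_h(wt_seq, position, new_state, e, h):
    """Energy change of a single substitution relative to the wild type."""
    mutant = np.array(wt_seq, copy=True)
    mutant[position] = new_state
    return hamiltonian(mutant, e, h) - hamiltonian(wt_seq, e, h)

# Toy example with three positions and two states (all values hypothetical).
e_toy = np.zeros((3, 3, 2, 2))
e_toy[0, 1, 0, 0] = e_toy[1, 0, 0, 0] = 1.0
h_toy = np.zeros((3, 2))
h_toy[2, 1] = 0.5
print(delta_h([0, 0, 1], position=1, new_state=1, e=e_toy, h=h_toy))
```

In the toy example the substitution removes a favorable coupling between positions 0 and 1, so the score is positive, which under the convention above is read as unfavorable or neutral.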
The original code for coevolutionary parameter inference by DCA and H(S) score calculation was written in MATLAB and has been published previously at https://github.com/morcoslab/coevolution-compatibility (72).
Molecular Dynamics Analysis of PARP1
Seven systems of the catalytic domain in PARP1 were considered including: wild type from crystal 4ZZZ (WT) (42, 73), a cancer mutant containing the V762A mutation (rs1136410) from crystal 5WS1 (V762A) (74), a mutant containing the V762A mutation from the WT (V762A-from-WT), single mutants containing either A755E only (A755E) or A755L only (A755L), and double mutants containing the hypothesized rescue mutations and the V762A cancer-related mutation, A755E/V762A and A755L/V762A (Table 1). None of the systems studied contained DNA. The V762A-from-WT was created by using LEaP in AMBER Tools to edit structure 4ZZZ (75, 76). Modeller was used to incorporate the missing residues into both crystal structures (77, 78). The single mutants were created by editing structure 4ZZZ using UCSF Chimera and replacing the amino acid using the Dunbrack rotamer libraries (79, 80). The double mutants were similarly created by modifying structure 5WS1. VMD and UCSF Chimera were used for visualization (79, 81).
Naming scheme for MD simulations of PARP1.
Using LEaP, chloride ions were added to neutralize the total charge of each system (75). WT, V762A, and V762A-from-WT were each solvated using TIP3P water extending at least 8 Å from the solute, and the A755E, A755L, A755E/V762A, and A755L/V762A systems were each solvated with water extending at least 12 Å from the solute (75, 82). A simulation of a WT system with a 12 Å solvent buffer was also performed, and no significant differences were observed compared with the smaller box results (Figures S4 and S7). Charged residues were assigned the default protonation state in LEaP, consistent with PROPKA (5 His had suggested protonation at N-delta by PROPKA, 3 of which were inconclusive by the electrostatic calculation and visual inspection) (83–85). The ff14SB force field was used for all protein residues (86).
AMBER molecular dynamics simulations were run using pmemd.cuda (76, 87, 88), with the NVT ensemble (number of atoms, volume, and temperature held constant) for the minimization and heating phases. The NPT ensemble (number of atoms, pressure, and temperature held constant) with the Langevin thermostat (temperature held at 300 K) was used for equilibration and production (89). The systems were run in triplicate with a 2 fs time step for the total simulation time shown in Table 1. Results for a representative trajectory of each system are shown below. All difference data between systems is presented as Variant - WT.
Cpptraj was used to analyze production dynamics (90). Normal modes were visualized using the Normal Mode Wizard in VMD (81, 91). Further data processing and graphing were performed with Gnuplot and the Matplotlib, NumPy, and statsmodels Python libraries (92–96). A FORTRAN90 program was used for the energy decomposition analysis (EDA) (97). EDA averaging was done using R (98), with the data.table, abind, and tidyverse libraries (99–101).
Results and Discussion
We developed a compensatory mutation discovery workflow comprising two computational approaches: 1) molecular dynamics simulations to investigate the effect of mutations on the protein’s structure and dynamics and 2) sequence-based coevolutionary analysis, which provides a global single mutation landscape used to screen for potential rescue mutations for the SNP variant of interest. The workflow is depicted schematically in Figure 2. Briefly, the two approaches are performed in tandem: MD is used to investigate the structural and dynamic properties of the protein and the disease variants under study, while the coevolutionary analysis yields the single mutation landscape from which possible rescue mutations are selected.
Workflow of the DCA-MD method for identifying compensatory mutations for SNP(s). The DCA method infers coevolutionary parameters from MSA(s) containing the mutated residue (see Methods). A mutational landscape of protein energy function scores for all possible single mutations is then generated to evaluate the SNP and to initially screen possible compensatory mutations. The MD method simulates and validates the effect of the SNP and of the compensatory mutation candidates.
Guided by the results from HyDn-SNP-S on PARP1, we sought to understand how the rs1136410 SNP affected the overall structure and dynamics of PARP1. Thus, we performed molecular dynamics (MD) simulations of both wild type PARP1 (WT) and the V762A PARP1 (V762A) variant structures.
Each system’s root mean square deviations (RMSDs) were stable across all simulations (see Figures S2–S3, and S8A). One way to assess the mutation’s effect on the dynamics of the system is through the use of a by-residue correlation matrix. This analysis can reveal regions of motion and dynamical correlation, anti-correlation, and no correlation within the protein (see Figures S9–S14). Based on the differences in correlated movements in Figure 3A, about half of the residues in the HD subdomain (710–770) and a fifth of the residues in the ART domain (910–960) show more correlated movement in V762A than in WT.
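For readers unfamiliar with by-residue correlation matrices, the sketch below shows one common way to compute a normalized dynamic cross-correlation matrix from aligned per-residue coordinates with NumPy. It is an illustration only and not the cpptraj analysis used in this work; the function name `cross_correlation` and the input layout (frames × residues × 3, already superimposed on a reference) are assumptions.

```python
import numpy as np

def cross_correlation(coords):
    """Normalized by-residue correlation of displacements.

    coords: (n_frames, n_residues, 3) array of aligned positions, for example
    one representative atom per residue. Values range from -1 (anti-correlated)
    to +1 (correlated). Residues with zero fluctuation would divide by zero.
    """
    disp = coords - coords.mean(axis=0)  # displacement from the mean position
    cov = np.einsum("tix,tjx->ij", disp, disp) / coords.shape[0]
    norm = np.sqrt(np.diag(cov))
    return cov / np.outer(norm, norm)

def correlation_difference(variant_coords, wt_coords):
    """Difference matrix following the Variant - WT convention."""
    return cross_correlation(variant_coords) - cross_correlation(wt_coords)
```

Positive entries of the difference matrix then indicate residue pairs that move in a more correlated fashion in the variant than in WT.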
Comparison between V762A and WT. A. Differences in correlated movement between V762A (blue) and WT (red). B. Differences in RMSF between V762A and WT. A 1 Å threshold is shown with a blue dashed line. C. Differences in EDA and hydrogen bonding between V762A and WT. Beige residues have undefined values. Differences in EDA above a threshold of 0.5 kcal/mol are marked. Hydrogen bond donor in orange, acceptor in pink; bond indicated with bold dashed line.
An analysis of the root mean square fluctuation (RMSF) can be used to identify areas of higher or lower fluctuation between a system and its reference. Detailed RMSF data can be found in Table S1 and Figures S5–S7 and S8B,C. V762A and WT differ in RMSF by more than 1 Å at residues 724, 747, 782, and 825. Each of these residues is central to flexible loops throughout the subdomains, indicating a difference in dynamics between V762A and WT. V762A impacts the active site because of its proximity to the nicotinamide binding pocket. The active site residues (879 to 889; Figure 3B) show increased fluctuation in the V762A structure compared to WT, leading to decreased structural stability in the mutant. These residues are in a flexible loop opposite the NAD+-coordinating residues in the binding pocket, and several residues interact directly with V762.
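The RMSF comparison can be sketched in a few lines of NumPy. This is a simplified stand-in for the cpptraj analysis, assuming aligned coordinates with one position per residue; the helper names `rmsf` and `flag_residues` are hypothetical, and the 1 Å default simply mirrors the threshold quoted above.

```python
import numpy as np

def rmsf(coords):
    """Per-residue root mean square fluctuation.

    coords: (n_frames, n_residues, 3) aligned positions, in Å.
    """
    disp = coords - coords.mean(axis=0)
    return np.sqrt((disp**2).sum(axis=2).mean(axis=0))

def flag_residues(rmsf_variant, rmsf_wt, residue_ids, threshold=1.0):
    """List (residue id, Variant - WT difference) where |difference| > threshold."""
    residue_ids = np.asarray(residue_ids)
    diff = np.asarray(rmsf_variant) - np.asarray(rmsf_wt)
    mask = np.abs(diff) > threshold
    return [(int(r), float(d)) for r, d in zip(residue_ids[mask], diff[mask])]
```

Applied to a variant and a WT trajectory, the returned list would correspond to the residues that cross the dashed threshold line in the RMSF difference plots.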
An energy decomposition analysis (EDA), comprised of Coulomb and van der Waals (non-bonded) interactions, was used to study all of the intermolecular interactions between individual protein residues and residue 762 (see Figures S15–S25). Residues G888 and Y889, specifically, interact more favorably with residue 762 in the V762A mutant than in WT. The reverse behavior occurs with residue N759, which is located in the same helix as V762, but opposite the loop. N759 interacts more favorably with V762 in WT than in the V762A system (Figure 3C). Further, one of the hydrogen bonds between the HD and ART subdomains, GLN 717 – THR 887, is present for 27% less of production time in the V762A system than WT (Figure 3B). The V762A mutation appears to result in a reduction in stability in the active site because the loop and helix are not held as tightly together. This instability could mean that the NAD+ may bind with lower affinity in the PARP1 V762A holoenzyme.
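To make the non-bonded decomposition concrete, the following sketch evaluates the Coulomb and Lennard-Jones interaction between the atoms of two residues for a single frame. It is not the FORTRAN90 EDA program used here: the function name `pair_energy`, the per-atom input arrays, the Lorentz-Berthelot combining rules, and the approximate Coulomb constant are assumptions, and no cutoffs or periodic images are handled.

```python
import numpy as np

COULOMB_K = 332.0637  # approximate conversion factor, kcal*Å/(mol*e^2)

def pair_energy(xyz_a, q_a, rmin_half_a, eps_a, xyz_b, q_b, rmin_half_b, eps_b):
    """Coulomb and van der Waals energy (kcal/mol) between two atom groups.

    xyz_*: (n_atoms, 3) positions in Å; q_*: partial charges in e;
    rmin_half_*: half of the Lennard-Jones minimum distance in Å;
    eps_*: Lennard-Jones well depths in kcal/mol.
    """
    r = np.linalg.norm(xyz_a[:, None, :] - xyz_b[None, :, :], axis=2)
    coulomb = COULOMB_K * q_a[:, None] * q_b[None, :] / r
    rmin = rmin_half_a[:, None] + rmin_half_b[None, :]  # arithmetic mean rule
    eps = np.sqrt(eps_a[:, None] * eps_b[None, :])      # geometric mean rule
    ratio6 = (rmin / r) ** 6
    vdw = eps * (ratio6**2 - 2.0 * ratio6)
    return float(coulomb.sum()), float(vdw.sum())
```

Averaging the two terms over the production frames for residue 762 paired with every other residue, and subtracting the WT averages, would give a per-residue difference comparable in spirit to the values discussed above.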
We then utilized a DCA-based energy scoring function ΔHMut (see Materials and Methods) to explore the single mutation landscape for all residues in the regulatory domain of PARP1 (Figure 4A). The majority of single mutations had disruptive scores for the PARP1 regulatory domain. V762A has a more positive score than WT, indicating its potential disruptive role in protein folding or stability from the perspective of coevolutionary analysis. Among all possible mutations, the two most favorable are observed at residue 755, specifically A755E and A755L. A double mutation profile generated with V762A also reports A755E and A755L as the best compensatory mutations occurring at positions other than 762 for the V762A mutant (Figure S1 and Figure 4B). This SNP-based profile directly estimates the effect of a second mutation on the original SNP variant, to uncover whether there are second mutations that reverse the effect of the SNP on the energy function score (Figure S1). Both the A755E and A755L single mutations have an effect on the protein coevolutionary score that is comparable, but opposite, to that of V762A, while the double mutations, V762A/A755E and V762A/A755L, lead to scores near WT (Figure 4B,C). A755L generates a positive epistatic effect on V762A, suggesting that the double mutant has a better fitness energy score than the additive effect of the two single mutations. A755E causes a negative epistatic effect on V762A. In summary, the coevolutionary analysis indicates that two mutations at residue 755 are promising for rescuing V762A.
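The landscape scan and the non-additivity of two mutations can be sketched as follows. The function names, the array layout, and the integer encoding are illustrative assumptions rather than the published code. The sketch also assumes the standard DCA energy convention (minus the sum of couplings and fields), and the quantity returned by `epistasis` is defined here simply as the double mutant score minus the sum of the single mutant scores; how its sign maps onto the "positive" and "negative" epistasis labels is not specified in the text and is left to the reader.

```python
import numpy as np

def energy(seq, e, h):
    """Sequence energy: minus the sum of pair couplings (i < j) and fields."""
    seq = np.asarray(seq)
    idx = np.arange(len(seq))
    pair = e[idx[:, None], idx[None, :], seq[:, None], seq[None, :]]
    return -(np.triu(pair, k=1).sum() + h[idx, seq].sum())

def single_mutation_landscape(reference, e, h):
    """(L, q) array of energy changes for every single substitution."""
    reference = np.asarray(reference)
    q = h.shape[1]
    ref_energy = energy(reference, e, h)
    landscape = np.zeros((len(reference), q))
    for i in range(len(reference)):
        for a in range(q):
            mutant = reference.copy()
            mutant[i] = a
            landscape[i, a] = energy(mutant, e, h) - ref_energy
    return landscape

def epistasis(wt, first, second, e, h):
    """Double mutant score minus the sum of the two single mutant scores.

    first, second: (position, state) tuples at different positions.
    """
    wt = np.asarray(wt)

    def score(*mutations):
        mutant = wt.copy()
        for pos, state in mutations:
            mutant[pos] = state
        return energy(mutant, e, h) - energy(wt, e, h)

    return score(first, second) - (score(first) + score(second))
```

Calling `single_mutation_landscape` on the SNP variant sequence instead of the wild type gives a second-mutation profile in the background of the SNP, which is the idea behind the double mutation profile described above.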
Coevolutionary information-based energy for PARP1 and its mutants. A. Energy landscape of mutations on the PARP regulatory domain of PARP1. B. Mutational effect of the PARP1 SNP V762A and the potential compensatory mutants, A755E/L. C. Epistatic effect of A755E and A755L on the PARP1 SNP V762A.
Working from the results of the coevolutionary analysis, we simulated the A755E and A755L single mutants to establish a baseline for those mutations. The hydrogen bond between GLN 717 and THR 887 is present 28% (OE1–OG1) and 23% (OE1–N) less of production time in the A755L system than in the WT, indicating that A755L leads to less stability in the active site (lower box of Figure 5A). The EDA revealed significant differences in the non-bonded interactions between A755E and WT, with a large number of residues in the catalytic domain showing changes larger than 1 kcal/mol (Figure 5B). This may be due to the change from a residue with no charge to one that is negatively charged. Additionally, several HD subdomain residues (717, 720, 758–759) and residue 887, which is located in the active site, all interact more favorably with residue 755 in A755L than in WT (Figure 5B). Four residues in flexible loops, two in the HD subdomain (744 and 746) and two in the ART subdomain (824 and 825), have a significantly lower RMSF in A755E and A755L than in WT (Figure 5C and S8C), suggesting that both A755E and A755L stabilize the overall structure as a result of this decreased flexibility. Based on the differences in correlated movements, the helices of the HD subdomain near the variant (residues 710–770) show more correlated movement among themselves in both A755E and A755L than in WT (Figure 5D-E).
Comparison between A755E and WT and A755L and WT. A. Differences in EDA and hydrogen bonding between A755L and WT. Beige residues have undefined values. Differences in EDA above threshold of 0.5 kcal/mol are marked. Hydrogen bond donor in orange, acceptor in pink; bonds indicated with bold dashed line. B. Differences in EDA between A755E and WT. Beige residues have undefined values. Differences in EDA above threshold of 0.5 kcal/mol are marked. C. Differences in RMSF between A755E and A755L and WT. A 1 Å threshold is shown with a blue dashed line. D. Differences in correlated movement between A755E (blue) and WT (red). E. Differences in correlated movement between A755L (blue) and WT (red).
We then simulated the A755E/V762A and A755L/V762A double mutant systems to evaluate the role of the predicted residues as compensatory mutations on the dynamics of PARP1. The hydrogen bond between GLN 717 and THR 887 is present 28% (OE1–OG1) less of production time in A755L/V762A than in the WT, indicating that A755L/V762A leads to less stability in the active site (Figure 6A). Residues 717, 720, 758–759, which are near the site of mutation, and 887, which is located in the active site, all interact more favorably with residue 762 in A755L/V762A than in WT (Figure 6A).
Comparison between A755E/V762A and WT and A755L/V762A and WT. A. Differences in EDA and hydrogen bonding between A755L/V762A and WT. Beige residues have undefined values. Differences in EDA above threshold of 0.5 kcal/mol are marked. Hydrogen bond donor in orange, acceptor in pink; bonds indicated with bold dashed line. B. Differences in correlated movement between A755E/V762A (blue) and WT (red). C. Differences in correlated movement between A755L/V762A (blue) and WT (red). D. Differences in EDA and hydrogen bonding between A755E/V762A and WT. Beige residues have undefined values. Differences in EDA above threshold of 0.5 kcal/mol are marked. Hydrogen bond donor in orange, acceptor in pink; bonds indicated with bold dashed line. E. Differences in RMSF between A755E/V762A and A755L/V762A and WT. A 1 Å threshold is shown with a blue dashed line.
There is minimal difference in correlated movements between A755E/V762A and WT, potentially indicating that A755E is a rescue mutation (Figure 6B). Based on the differences in correlated movements, residues 710–770 and residues 885–985 each show more correlated movement among themselves in A755L/V762A than in WT (Figure 6C). This impact on the HD subdomain may point to a similar or increased catalytic output for structures with A755E/L rescue mutations.
In A755E/V762A, the hydrogen bond between GLN 717 and THR 887 is present 29% (OE1–OG1) and 24% (OE1–N) less of production time than in the WT, indicating that A755E/V762A also leads to less stability in the active site (lower box of Figure 6D). There are significant differences in the non-bonded interactions between A755E/V762A and WT; residues all over the catalytic domain are impacted (Figure 6D). Similar to A755E, the changes may be due to the additional charge at position 755.
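The hydrogen bond comparison reported here can be read as a difference in occupancy between two trajectories. The sketch below uses a distance-only criterion between the donor and acceptor heavy atoms. The 3.0 Å cutoff, the omission of an angle criterion, and the function names are assumptions for illustration, not the settings of the analysis performed in this work.

```python
import numpy as np

def hbond_occupancy(donor_xyz, acceptor_xyz, cutoff=3.0):
    """Fraction of frames with donor-acceptor heavy atom distance <= cutoff (Å).

    donor_xyz, acceptor_xyz: (n_frames, 3) arrays of atom positions.
    """
    dist = np.linalg.norm(np.asarray(donor_xyz) - np.asarray(acceptor_xyz), axis=1)
    return float((dist <= cutoff).mean())

def occupancy_difference(variant_pair, wt_pair, cutoff=3.0):
    """Variant - WT occupancy difference, in percentage points."""
    variant = hbond_occupancy(*variant_pair, cutoff=cutoff)
    wt = hbond_occupancy(*wt_pair, cutoff=cutoff)
    return 100.0 * (variant - wt)
```

A negative value for a pair such as GLN 717 OE1 and THR 887 OG1 would indicate that the bond is present for a smaller share of the production time in the variant than in WT.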
At residues 724, 748, and 826, the RMSF was significantly lower in A755E/V762A than in WT (Figure 6E). At residues 960, 961, and 968, the RMSF was significantly higher in A755E/V762A than in WT (Figure 6E). These correspond to differences seen for A755E/V762A in the normal mode analysis (see Figures S26–S27), where the HD subdomain shows less motion than the ART subdomain. The A755L/V762A system, however, closely resembles the WT in its first normal mode. As these residues are indicated by RMSF in each mutant studied, their fluctuation may be important for the recognition of the cofactor, which is absent from these simulations. The added stability provided by A755L in the A755L/V762A double mutant gives strong support for its evolution as a compensatory mutation. Because HD region destabilization is necessary for PARP1 activation, this particular double mutant may more tightly control activation.
Conclusion
In this study, we have demonstrated that integrating coevolutionary analysis and MD simulations can be useful to discover and validate compensatory mutations for SNPs, using PARP1 rs1136410 as an example. A755E and A755L are first recognized by the DCA coevolutionary method as the variants that are most favorable for PARP1 structures, and the subsequent MD simulations validated that both variants stabilize the overall structure. Coevolutionary information can also be used to evaluate double mutations that contain the SNP in order to uncover rescue mutations. Both A755E and A755L lead to favorable “fitness” conditions in the context of the V762A variant from an evolutionary perspective. Additionally, the effects of A755E/L and V762A on the PARP1 protein are not purely additive, with A755E being negatively synergetic and A755L being positively synergetic (Figure 4B). MD simulations show that the cancer mutation affects the structure and dynamics of V762A PARP1 compared with WT. These results indicate that the A755E mutation, in conjunction with V762A, can resolve some of the structural and dynamical impacts, mimicking wild type. Our work can help to understand the effects of SNPs in their association with disease, such as cancer in this case, and to identify changes that could ameliorate those effects. The discovery of important compensatory mutations can be used to study how particular SNPs are not always associated with disease and can provide a roadmap for molecular therapeutic approaches aiming at reducing the negative effects of mutations. This methodology is generic in the sense that it can be applied to a large number of systems where structural and sequence data are available. The case of PARP1, presented here, is only one of many that could be studied with our integrated approach. Subsequent computational and new experimental investigation of the potential of the two proposed rescue mutations would provide further insights.
We expect future work could uncover important insights on the effect of mutations for many more genes and their associated diseases.
Author Contributions
KR, XJ, FM, and GAC designed the project. KR and EML carried out MD simulations, analyzed and interpreted data, and co-wrote the manuscript. XJ performed coevolutionary analysis, analyzed data and co-wrote the manuscript. All authors contributed to discussion and manuscript editing.
Acknowledgments
This work was supported by NIH R01GM108583 (GAC), R35GM133631 (XJ and FM), and NSF CAREER grant MCB-1943442 (FM). Computational time from CASCaM’s CRUNTCH3 cluster, partially supported by NSF CHE-1531468 (GAC), and from XSEDE Project No. TG-CHE160044 is thankfully acknowledged.
Footnotes
* faruckm{at}utdallas.edu, andres{at}utdallas.edu
Figure 1 added, figure quality/captions improved, additional MD details, and supplemental files updated.