Molecular Dynamics simulations of Alzheimer’s variants, R47H and R62H, in TREM2 provide evidence for structural alterations behind functional changes

There is strong evidence supporting the association between Alzheimer’s disease (AD) and the protein-coding variants R47H and R62H in TREM2. The TREM2 protein is an immune receptor found in brain microglia. A structural alteration could therefore have a large effect on the protein. Crystallised structures were used as a base for both the wildtype (WT) and mutated proteins. These were subjected to 300 ns of molecular dynamics (MD) simulation. Results suggest structural alterations in both mutated forms of TREM2. A large change was noted in the R47H simulation in the complementarity-determining region two (CDR2) binding loop, a proposed binding site for ligands such as APOE; a smaller change was observed in the R62H model. These differing levels of structural impact could explain the differences in TREM2-ligand binding observed in vitro.

Author Summary

A number of mutations have been found in the TREM2 protein in populations of people with Alzheimer’s and other dementias. Two of these mutations are similar in that they both cause the same coding change in the same domain of the protein. However, they produce very different results in terms of risk and of the changes observed in vitro. Why these two similar mutations are so different is largely unknown. Here we have used an in silico simulation approach to understand the structural changes which occur with each mutation. Our results suggest that the mutation which carries the higher risk, but is less commonly observed, has a much larger impact on the protein structure than the mutation which is thought to be less damaging. This structural change is observed at a part of the protein which is thought to form a binding loop, and a change here could have a big impact on the protein’s function. Further studies of this binding loop could not only improve understanding of TREM2’s role in the onset of dementia but also possibly provide a target for therapeutics.

In order to investigate the structural impact of the R47H and R62H mutations and predict possible loss of function, we carried out an in silico study of the binding domain of the protein containing the mutations. Here we describe the results of this study, and in particular the similarities and differences between the two models. Results suggest a greater effect on the binding loops by the R47H mutation, fitting with previous studies [10].

[...] These all play important roles in the protein but, as they are not suggested to be used in the binding process (the process which is affected by the AD risk mutations) and they are not crystallised, they have been excluded from this study. The domain under investigation is a V-type Ig domain which contains nine β-strands and two short α-helices, all of which are characteristic of an Ig protein domain. Both mutations, R47H and R62H, can be found on the protein surface, which is suggested to be how they affect TREM2's binding abilities, in particular its ability to bind to APOE and APOJ [12,13]. The Have your Protein Explained server (HoPE) was used to investigate the possible mutational effects prior to any simulations being run [18]. Results from the server suggest that the wildtype amino acid (arginine) at position 47 forms hydrogen bonds with the amino acids at positions 66 (threonine) and 67 (histidine), which would not be possible with the histidine mutation in this position. These bonds may be important for protein structural integrity. The wildtype residue is conserved at position 47, though histidine is observed here in some species. Residue 62, on the other hand, is less well conserved, but histidine is not observed here. There is an obvious loss of charge and size with mutated R62H, shown schematically in figure 1.
The SIFT online tool was used to predict the tolerance of the two mutations in the protein. This does not predict the effect on binding or function, but whether the mutation will be tolerated in the protein structure. The results show the R47H mutation to be tolerated with a score of 0.06 and the R62H mutation to be tolerated with a score of 0.10; this was based on 13 sequences. A score of <0.05 would result in a damaging prediction [19]. The I-Mutant server results showed a decrease in stability for both the R47H and R62H mutations [20].

MD simulations were run, in triplicate, for the WT and mutated proteins. The stability of the simulations was checked, and volume, pressure and root mean square deviation (RMSD) remained stable throughout, giving confidence in the model systems. Figure 2 depicts the structure of TREM2 which has been modelled and run through the MD simulations. The complementarity-determining region (CDR) loops, which are suggested to be key for the ligand binding process [21,22], are coloured as follows: red for CDR1, green for CDR2 and purple for CDR3. The two mutated sites are shown in dark blue; both are close to the CDR1 and CDR2 loops, with R47H lying within the CDR1 loop itself.

Figure 2 - Wild type TREM2. The structure is depicted in cartoon style with secondary structure colouring. CDR loops are coloured as follows: CDR1 = red, CDR2 = green, CDR3 = purple. The positions of the two mutated sites are coloured in dark blue and shown in full.
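As an illustration of the RMSD stability check mentioned above, the following is a minimal sketch in plain Python. It assumes each frame is a list of (x, y, z) coordinates that have already been superimposed on the reference; the function names and the toy coordinates are hypothetical and are not taken from the study, which used the GROMACS tools.

```python
import math


def rmsd(coords, reference):
    """Root mean square deviation between two coordinate sets.

    Both arguments are lists of (x, y, z) tuples in the same atom order.
    Assumption: the structures have already been superimposed.
    """
    if len(coords) != len(reference) or not coords:
        raise ValueError("coordinate sets must be non-empty and equal in length")
    total = 0.0
    for (x, y, z), (rx, ry, rz) in zip(coords, reference):
        total += (x - rx) ** 2 + (y - ry) ** 2 + (z - rz) ** 2
    return math.sqrt(total / len(coords))


def rmsd_series(trajectory):
    """RMSD of every frame measured against the first frame."""
    reference = trajectory[0]
    return [rmsd(frame, reference) for frame in trajectory]


# Toy two-atom trajectory (hypothetical coordinates, in nm).
toy_trajectory = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.3, 0.4, 0.0), (1.3, 0.4, 0.0)],
]
print(rmsd_series(toy_trajectory))
```

A series that levels off and then fluctuates around a constant value is what would usually be read as a stable simulation; a series that keeps drifting would suggest the system has not yet equilibrated.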

Mutations could be impacting the local or the global structure of TREM2. Local structural changes were first investigated in the three molecular models. The region surrounding both mutations (amino acids 43-65) was viewed (figure 3). The R62H mutation alters this local structure, with a shift in the beta sheet and a large movement of the random coil. The R47H mutation does not appear to alter this local structure in any way. As well as altering the local structure, the R62H mutation also altered the flexibility of the individual residue, i.e. the amount of movement it has. Results show the WT and R47H have a flexibility of 0.23 +/-0.02 and 0.01 respectively at amino acid 62; the R62H mutation, on the other hand, has a reduced flexibility of 0.17 +/-0.01. There is also, to a lesser extent, a reduction of flexibility across the neighbouring amino acids which surround the R62H mutation.

Arginine, which is present in the wildtype protein at both positions, is a long, extended amino acid with a chain of carbons and nitrogens. Histidine, which is the mutated form in both variants, has a ring structure with less availability for hydrogen bonding. MD simulation results show a change in positioning between the wildtype and mutated amino acids, with the wildtype protruding from the molecule in both cases and the mutated amino acid being visibly far more buried within the structure (figure 3, d-g).

Figure 3 - Graphs of the flexibility changes for the wildtype and mutated proteins at both sites, as well as the point-specific SASA, are shown in a and b. c depicts the local structural alteration, with the wildtype in secondary structure colours, R47H in red and R62H in blue. d-g show the positioning of the wildtype and mutated amino acids for the R47H wildtype and mutation, and the R62H wildtype and mutation, respectively.
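The Materials and Methods state that flexibility was analysed with gmx rmsf. Assuming that the flexibility values above correspond to a per-atom root mean square fluctuation (RMSF), the quantity can be sketched in plain Python as follows. The function name and the toy trajectory are hypothetical, and the frames are assumed to be already fitted to a common reference.

```python
import math


def rmsf(trajectory):
    """Per-atom root mean square fluctuation about the mean position.

    `trajectory` is a list of frames; each frame is a list of (x, y, z)
    tuples with the atoms in the same order.
    Assumption: frames are already fitted to a common reference.
    """
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    fluctuations = []
    for i in range(n_atoms):
        # Time-averaged position of atom i.
        mean = [
            sum(frame[i][k] for frame in trajectory) / n_frames
            for k in range(3)
        ]
        # Mean squared displacement of atom i from its average position.
        msf = sum(
            sum((frame[i][k] - mean[k]) ** 2 for k in range(3))
            for frame in trajectory
        ) / n_frames
        fluctuations.append(math.sqrt(msf))
    return fluctuations


# Toy example: atom 0 is rigid, atom 1 oscillates by 0.1 along x.
toy_frames = [
    [(0.0, 0.0, 0.0), (-0.1, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0)],
]
print(rmsf(toy_frames))
```

A lower value for a residue, as reported for position 62 in the R62H model, indicates that its atoms stay closer to their average position over the course of the simulation.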

Solvent accessible surface area (SASA) was measured for the whole protein and for the individual mutated residues. Overall SASA was reduced from 71 to 70; this small change is not significant and may not have any effect on the protein function. Amino acid specific SASA was measured for the WT and mutated proteins at the 47 and 62 sites. Here a reduction in SASA can be seen at the mutated site in each protein (figure 3b).

Finally, the distance between the two mutation sites was measured (taken to show structural shrinkage in the protein). Again, a reduction was seen in the R47H and R62H mutations when compared to the WT.

Significant structural alteration can be seen in the CDR2 loop; figure 4 shows the R47H mutation to cause a loss of beta sheet and a changing of alpha helix position [...] binding domain, here they could perform key functions in binding; it has been suggested that positive amino acids such as these play an important role [25]. The mutated residues are neutral in charge, provide less opportunity for hydrogen bonding and are buried within the binding domain; this alone could impact TREM2's ability to bind to ligands such as APOE. Further to this local change, both mutations are found in the vicinity of the CDR1 and CDR2 binding loops: R47H lies on CDR1 and R62H between the two loops. These, and other putative AD mutations, are found on the surface of the protein, where they may affect TREM2's ability to bind and function.

Solvent-accessible surface area is important when considering rates of reactions which require a protein-protein or protein-ligand interaction. A change in the SASA of either of these two amino acids, which could be key in the binding process, should therefore be considered a detrimental effect, and results showed a reduction in SASA at the mutated residue for both models [26].
A further result of note is the reduction in the distance between the amino acids for the R62H model; this measurement suggests a reduction in overall protein size and a loss of shape, two things which are again key for function. Sudom et al. recently published a paper which showed the mutated R47H protein to contain a remodelled helix in the CDR2 loop, though their crystal structure is missing residues 76-81 [17]. This study supports an altered helix structure in the CDR2 loop; we also see a loss of the beta sheet structure, which is replaced by a random coil. A random coil is far more variable, which could explain why they were unable to resolve this region of the protein and why it is missing from the crystal structure. This TREM2 domain also contains three possible N-glycosylation sites, one of which is at position 79. The alteration in structure here could be affecting the ability of TREM2 to undergo post-translational modification and could explain the altered glycosylation seen in vitro in the R47H mutated form [27].

Park et al. recently showed that the R47H mutation in TREM2 resulted in decreased protein stability; based on our models this may be due to the large alteration in the CDR2 loop structure [27]. Another study, by Atagi et al., presented strong evidence for the binding of TREM2 to APOE and, more interestingly, a lack of binding when the R47H mutation was present [14]. This is further supported by Yeh et al., who measured a decrease in TREM2's ability to bind CLU/APOJ and APOE when the R47H and R62H mutations were present. Their results are consistent with the difference in binding loop loss that we observe between the two mutations, as they observed less of a decrease in binding with the R62H mutation [12]. The binding loop degradation we observed may be the key to understanding the functional effect these mutations are having on the protein.

The evidence shown here agrees with previous studies which indicate a binding change when the R47H mutation is present. We present novel findings which show the R62H mutation to have a structural effect on the same region of the protein, albeit to a lesser extent. This provides insight into, and support for, the studies which show less of a decrease in binding ability with the R62H mutated protein compared to the R47H mutated form. Understanding the structural and functional changes which occur in this AD-associated protein increases our knowledge of the mechanisms behind the processes which cause AD and, as a result, may provide novel drug and therapeutic targets.

Materials and Methods:
The immunoglobulin domain of the TREM2 protein has previously been crystallised [10]. Both mutations were added to the structure using the modify protein function in the Accelrys software, Discovery Studio. The wildtype protein (WT) and the two mutated structures were subjected to over 300 ns of molecular dynamics (MD) simulations. MD was carried out using the GROMACS [28] software suite with the in-built Amber03 [29] force field parameters. All protein structures were placed in a cubic box, solvated using TIP3P water molecules and neutralised using Cl- ions. The particle mesh Ewald (PME) method was used to treat long-range electrostatic interactions and a 1.4 nm cut-off was applied to Lennard-Jones interactions. All of the simulations were carried out in the NPT ensemble, with periodic boundary conditions and at a temperature of 310 K. There were three steps to each simulation. 1; Energy minimisation, using the steepest descent method and a tolerance of 1000 kJ mol^-1 nm^-1. 2; A warm-up stage of 25 000 steps with a 0.002 ps time step (50 ps in total), during which atoms were restrained to allow the model to settle. 3; Finally, an MD stage run for a total of 300 ns. Root mean square deviation (RMSD) was monitored along with the total energy, pressure and volume of the simulation to check for stability.

Resulting structures were analysed for flexibility using gmx rmsf and for hydrogen bonding using gmx hbond (both available within the GROMACS suite). All proteins were visualised for structural differences using VMD. Further to this, prediction of the functional effect and stability analysis were carried out using three online servers, HoPE, SIFT and I-Mutant [18-20,30]. HoPE analyses the impact of a mutation, taking into account structural impact and contacts such as possible hydrogen bonding and ionic interactions.
The SIFT software predicts tolerated and deleterious SNPs and identifies any impact of amino acid substitution on protein function. Lastly, I-Mutant is a neural-network based predictor of protein stability changes.

Normality of distributions such as RMSD, energy, pressure and volume was tested using the Anderson-Darling test. None were normally distributed, and so all statistical differences between the wildtype and mutated simulations were calculated using the Mann-Whitney U test.

Acknowledgements

Part-funded by the European Regional Development Fund through the Welsh Government.
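As a note on the statistical comparison described in Materials and Methods, the following is a minimal plain-Python sketch of a two-sided Mann-Whitney U test. It is an illustration under stated assumptions rather than the procedure used in the study: the text does not say which software performed the test, the function name and sample values are hypothetical, and the p-value uses the normal approximation, which is only reasonable for larger samples.

```python
import math


def mann_whitney_u(sample_a, sample_b):
    """Two-sided Mann-Whitney U test using the normal approximation.

    Returns (U, p_value). Tied values receive mid-ranks and the variance
    is tie-corrected; no continuity correction is applied.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    n = n_a + n_b
    # Pool the observations, remembering which sample each came from.
    pooled = sorted([(v, 0) for v in sample_a] + [(v, 1) for v in sample_b])
    rank_sum_a = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        mid_rank = (i + 1 + j) / 2.0  # mean of the ranks i+1 .. j
        group_size = j - i
        tie_term += group_size ** 3 - group_size
        members_a = sum(1 for k in range(i, j) if pooled[k][1] == 0)
        rank_sum_a += mid_rank * members_a
        i = j
    u_a = rank_sum_a - n_a * (n_a + 1) / 2.0
    u = min(u_a, n_a * n_b - u_a)
    mean_u = n_a * n_b / 2.0
    var_u = n_a * n_b / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u, 1.0
    z = (u - mean_u) / math.sqrt(var_u)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return u, p_value


# Hypothetical per-frame values for two simulations (not data from the study).
wildtype_values = [0.21, 0.24, 0.22, 0.25, 0.23, 0.26]
mutant_values = [0.16, 0.18, 0.17, 0.15, 0.19, 0.20]
print(mann_whitney_u(wildtype_values, mutant_values))
```

Because the test works on ranks rather than raw values, it does not require the normality that the Anderson-Darling test rejected for these distributions.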