Dynamics-function relationship of N-terminal acetyltransferases: the β6β7 loop modulates substrate accessibility to the catalytic site

N-terminal acetyltransferases (NATs) are enzymes that catalyse the transfer of an acetyl group from Ac-CoA to the N-terminus of proteins, one of the most common protein modifications. Unlike NATs, lysine acetyltransferases (KATs) transfer an acetyl group onto the amine group of internal lysines. To date, little is known about the exclusive substrate specificity of NATs towards protein N-termini. All NATs and some KATs share a common fold called GNAT. The main difference between NATs and KATs is an extra hairpin loop found only in NATs, the β6β7 loop, which covers the active site like a lid. The loop is hypothesized to act as a barrier that restricts access to the catalytic site and prevents acetylation of internal lysines. We investigated the dynamics-function relationships of all available structures of NATs, covering the three domains of life. Using elastic network models and normal mode analysis, we found a common dynamics pattern conserved throughout the GNAT fold: a rigid V-shaped groove, formed by the β4 and β5 strands, and three relatively more dynamic loops, α1α2, β3β4 and β6β7. We identified two independent dynamical domains in the GNAT fold, which is split at the β5 strand. We characterized the slow dynamics of the β6β7 hairpin loop and show that its movements can significantly widen the mouth of the ligand binding site, thereby influencing its size and shape. Taken together, our results show that NATs may have access to a broader ligand specificity range than anticipated.

Author summary

N-terminal acetylation affects 80% of eukaryotic proteins and is carried out by enzymes called N-terminal acetyltransferases (NATs). They belong to the large family of acetyltransferases and adopt the GNAT fold. Interestingly, most lysine acetyltransferases (KATs), which specifically acetylate internal lysines, share the same fold. The rationale for ligand recognition by the GNAT enzymes remains unclear.
Proteins are dynamic entities that use their structural flexibility to carry out functions in living cells. By studying the dynamics across the entire NAT family, we found that the slow dynamics of the fold is strongly conserved. We also revealed the mobility of the active site lid, namely the β-hairpin loop β6β7, which is one of the main structural differences between the NATs and the KATs. The size and shape of the ligand binding site depend on the movements of that β-hairpin loop. We suggest that the flexibility of the fold should be taken into consideration when mapping NAT specificity or designing ligands.


Introduction
[...] [32,33,57-59]. Furthermore, β4 and β5 do not assemble into a sheet along their whole length despite their proximity. Instead, they form a "V shape" that splits the seven-stranded β-sheet and creates a crevice where Ac-CoA and the peptide substrate meet (Fig. 1).

In the region of helices α1 and α2, we notice a similar pattern of flexibility across all the structures, in which the loop α1α2 and the helix α2 fluctuate more. Molecular dynamics simulations of human Naa50 and Naa10 have shown that the flexibility of helix α2 decreases in the presence of a substrate [19]. This region is also involved in complex formation with the subunit Naa15 [...]

Furthermore, Naa40 has an extra N-terminal helix α0, the movements of which are correlated with strands β3 and β4 as well as with loop α3β5 (Fig. 6). By contrast, Naa80 has a shorter α1α2 loop as well as a different orientation of its β6β7 loop.

Further, we calculated how the surface area changes when the structure is displaced along the individual lowest-energy normal modes (see Experimental procedures). Reconstructing the protein side chains from our Cα model was necessary to calculate the surface area of the mouth of the tunnel using CAVER Analyst [65] (see Experimental procedures). To this end, we first defined a static clipping plane using three residues (Tyr31, Tyr73 and Tyr138) lining the entrance (see Experimental procedures). We generated a trajectory along each of the six lowest-frequency normal modes and held this clipping plane fixed for each frame of the trajectory. Note that the orientation of the clipping plane was extracted from the conformation of the X-ray structure and the same orientation was used in each frame. We then calculated the value of the surface area throughout the trajectory. We also calculated the surface area at a clipping plane 2 Å away from the mouth, where the second and third residues of the substrate sit, in order to verify that no amino acid closes off access upward of the active site.
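As a geometric illustration of the static clipping plane, the following minimal Python sketch builds a plane through three Cα positions and a second plane shifted 2 Å along its normal. The coordinates are made-up placeholders rather than values from any NAT structure, the function names are hypothetical, and the sign of the normal (into or out of the cavity) depends on the order of the three points. The surface calculation itself was performed with CAVER Analyst, which this sketch does not reproduce.

```python
import numpy as np

def plane_from_points(p1, p2, p3):
    """Return (unit normal, point on plane) for the plane through three points."""
    p1, p2, p3 = (np.asarray(p, dtype=float) for p in (p1, p2, p3))
    normal = np.cross(p2 - p1, p3 - p1)
    length = np.linalg.norm(normal)
    if length < 1e-8:
        raise ValueError("points are collinear; the plane is undefined")
    return normal / length, p1

def offset_plane(normal, origin, distance):
    """Shift a plane by `distance` along its normal, keeping its orientation."""
    return normal, origin + distance * normal

def signed_distance(points, normal, origin):
    """Signed distance of one or more points from the plane (same units as input)."""
    return (np.asarray(points, dtype=float) - origin) @ normal

# Placeholder Calpha coordinates in angstroms (NOT taken from a real structure).
ca_xyz = np.array([[0.0, 0.0, 0.0],
                   [6.0, 0.0, 0.0],
                   [0.0, 6.0, 0.0]])

normal, origin = plane_from_points(*ca_xyz)            # plane at the mouth
_, origin_2A = offset_plane(normal, origin, 2.0)       # second plane, 2 angstroms away

# A point lying on the shifted plane has zero signed distance from it.
d = signed_distance([1.0, 1.0, 2.0], normal, origin_2A)
```

Because the orientation is taken once from the X-ray structure, the same `normal` would be reused unchanged for every frame of a mode trajectory.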

The lowest-energy modes show concerted motions of the three loops β3β4, β6β7 and α1α2. These loops surround the active site, and their movements modify the shape of its mouth and its surface area (Fig. 7).

[...] Table). The loop β3β4 is 10 to 15 residues longer in Naa60 than in the other NATs, and mutations of key residues disrupt interactions with the β5, β6 and β7 strands, leading to altered catalytic efficiency and protein stability [70]. Finally, the β6β7 loop, one of the most flexible loops, contains two tyrosines conserved in most NATs, which form hydrogen bonds with the backbone of the first and second amino acids of the substrate [19].

The highly mobile α1α2 and β6β7 loops flank the catalytic core, regulating the size of the binding site. Moreover, the regions from the N-terminus to α2 and β6-β7 differ structurally among the NATs.

[...] (Table 1). From these, we excluded eleven structures for which the X-ray structure had unresolved segments within the GNAT fold (S1 Table).
We formed 10 functional groups: Naa10, Naa20, Naa40, Naa50, Naa60, Naa80, archaeal NATs, RimI, [...]

[...] where [...] is the distance vector between two Cα atoms in the configuration of the protein, [...]

We selected the first six non-trivial modes from the set of normal modes of the protein of interest. For each mode, we generated a trajectory consisting of nine snapshots; we displaced the initial Cα positions along the mode, using arbitrary amplitudes in either direction around the X-ray structure (the mode vectors were multiplied by -12, -9, -6, -3, 3, 6, 9 and 12).

We used the Molecular Modelling ToolKit (MMTK, [86]) to reconstruct the side chains in order to obtain trajectories at all-atom resolution. We calculated the 3D transformation necessary to superimpose the initial all-atom structure onto each of the snapshots of the Cα trajectories. This was done by minimizing the RMS difference between the initial all-atom structure and the Cα trace snapshots. The 3D transformations were not computed on the overall structure but locally, using an iterative process. We used sliding windows three amino acids long to compute the transformation, which was then applied only to the central amino acid, for which the side chain was reconstructed. The process was then iterated by sliding one residue along the protein sequence.
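The mode selection and displacement steps can be illustrated with a minimal Python sketch. It rests on several assumptions that are not taken from this work: it uses a generic anisotropic network model with a uniform spring constant and a 15 Å cutoff rather than the force field used here, the function names are hypothetical, the toy coordinates are an idealised helix rather than a NAT structure, mode vectors are normalised to unit length, and the undisplaced structure (factor 0) is taken to be the ninth snapshot.

```python
import numpy as np

def anm_hessian(coords, cutoff=15.0, k=1.0):
    """3N x 3N Hessian of a simple anisotropic network model (assumed model)."""
    n = len(coords)
    hessian = np.zeros((3 * n, 3 * n))
    for i in range(n):
        for j in range(i + 1, n):
            d = coords[j] - coords[i]
            dist2 = d @ d
            if dist2 > cutoff ** 2:
                continue  # no spring between distant Calpha atoms
            block = -k * np.outer(d, d) / dist2
            hessian[3*i:3*i+3, 3*j:3*j+3] = block
            hessian[3*j:3*j+3, 3*i:3*i+3] = block
            hessian[3*i:3*i+3, 3*i:3*i+3] -= block
            hessian[3*j:3*j+3, 3*j:3*j+3] -= block
    return hessian

def lowest_modes(coords, n_modes=6, cutoff=15.0, k=1.0):
    """Eigenvalues and unit-norm vectors of the first non-trivial modes."""
    evals, evecs = np.linalg.eigh(anm_hessian(coords, cutoff, k))
    # The six lowest modes are rigid-body translations and rotations; skip them.
    sel = slice(6, 6 + n_modes)
    return evals[sel], evecs[:, sel].T.reshape(n_modes, -1, 3)

def mode_trajectory(coords, mode, factors=(-12, -9, -6, -3, 0, 3, 6, 9, 12)):
    """Displace the Calpha positions along one mode; factor 0 is the input structure."""
    return np.array([coords + f * mode for f in factors])

# Toy Calpha trace: an idealised 20-residue helix (placeholder geometry).
idx = np.arange(20)
angle = np.deg2rad(100.0) * idx
coords = np.column_stack([2.3 * np.cos(angle), 2.3 * np.sin(angle), 1.5 * idx])

evals, modes = lowest_modes(coords)        # six lowest non-trivial modes
traj = mode_trajectory(coords, modes[0])   # nine Calpha snapshots along mode 1
```

With unit-norm mode vectors, each factor sets the total displacement over all Cα atoms, so the per-atom displacement depends on how the mode is distributed over the structure; a different normalisation would change the physical meaning of the multipliers.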

Visualisation and calculation of the cavity

The analysis was performed using CAVER Analyst [65]. We first selected three Cα atoms, from amino acids Tyr31, Tyr73 and Tyr138, that line the mouth of the cavity in the X-ray structure. In 3D space, these three atoms define a plane parallel to the tunnel mouth. Then a set of intersecting spheres, each with a radius of 1 Å, is placed on a line perpendicular to this plane to fill a length of 5 Å into the cavity. Using this geometrical structure as a base, we computed a cavity surface in each frame using the algorithm described in [65]. We