The 4.4 Å structure of the giant Melbournevirus virion belonging to the Marseilleviridae family

Members of Marseilleviridae, a family of icosahedral giant viruses classified in 2012, have been identified worldwide in all types of environments. The virion shows a characteristic internal membrane extrusion at the five-fold vertices of the capsid, but its structural details remain to be elucidated. We now report the 4.4 Å cryo-electron microscopy structure of the Melbournevirus capsid. An atomic model of the major capsid protein (MCP) shows a unique cup structure on the trimer that accommodates additional proteins. A polyalanine model of the penton base protein shows internally extended N- and C-termini, which indirectly connect to the internal membrane extrusion. The Marseilleviruses share the same orientational organisation of the MCPs as PBCV-1 and CroV, but do not appear to possess a protein akin to the “tape measure” of these viruses. Minor capsid proteins named PC-β, zipper, and scaffold are proposed to control the dimensions of the capsid during assembly.

regions of density which could be segmented and followed without significant ambiguity.

The major capsid protein
The amino acid sequence of the Melbournevirus MCP was aligned against those of tokyovirus, PBCV-1 and other NCLDVs, where it scores highly against tokyovirus, and within acceptable parameters against Iridovirus and PBCV-1 (Fig. S6). The MCP was predicted to possess the same "double jelly roll" motif as other NCLDV MCPs (Fig. S7). This is confirmed by the striated density on the MCP trimer extracted from the centre of the threefold axis (Fig. 3A), with interactions between the FG1 loops which extend inside the trimer with the two short α-helices (dotted red circle in Fig. 3D), and stabilized with the N-terminal domain (NTD) extended from the adjacent monomer, which functions like an anchor (arrow in Fig. 3D). Furthermore, in the Melbournevirus MCP trimer, the elongated HI1 loop interacts with the adjacent MCP monomer (arrowheads in Fig. 3D-E, Fig. S7A), and forms a unique cup structure with the relatively long DE1 and FG1 loops on the top of the MCP trimer (Fig. 3E). The cup structure serves to accommodate a uniform and symmetric "cap" region similar to that of tokyovirus (Chihara et al., 2021), which cannot be attributed to the MCP polypeptide (Fig. 3B, C). Previously, Okamoto et al. reported that an uncharacterised protein, MEL_236, appeared to have the same abundance as the MCP protein (Okamoto et al., 2018). At 16 kDa, MEL_236 is approximately the correct size (given a margin of error for SDS-PAGE size determination) to correspond with the PAS stain-sensitive 14 kDa protein of tokyovirus previously reported (Chihara et al., 2021). Genomic studies of Noumeavirus reported the orthologue of this, NMV_189, as the most abundant protein (Fabre et al., 2017). Unfortunately, in these maps the cap region is not clear enough to build a model from this sequence (arrowheads in Fig. S3D and S4D).

the internal is the interactions between the N-termini (black asterisks in Fig. 4D) and between the N-terminus and C-terminus (red asterisks in Fig. 4D). The entire structure of the penton base protein is similar to that of PBCV-1, which does not have the large insertion domain found in CroV-dependent mavirus (Fig. S8). In the cryo-EM map, some inner capsid densities also exist under the penton base protein models (arrow in Fig. 4). Another component may exist in this region and connect the penton to the inner membrane extrusion.

of the other (Fig. 2, Fig. S9). A polyalanine model was built de novo and fitted into the density. However, it is less clear whether this is a single protein or two independent ones, as the extremity of this region is relatively unsupported and as a result is resolved to only ~6 Å (Fig. S3). Our current hypothesis is that they are two independent bundles, whose polypeptide chains continue into the capsid framework and which surround the inner membrane extrusion (Fig. S8, Fig. 1B).

Other minor capsid proteins
Like tokyovirus, the mCPs of Melbournevirus form an intricate lattice network (Fig. 2). The two viruses, extremely close in genetic terms, should demonstrate high structural similarity. Thus, Melbournevirus shows the same lattice array, with a trapezoidal lattice consisting of lattice proteins (salmon in Fig. 2), which are linked by cement proteins (pale blue in Fig. 2), and with a glue/zipper protein arrangement (pink and orange, respectively, in Fig. 2) along the trisymmetron interface. These are internally supported by other proteins; one is the scaffold protein, and the other is the support protein (Fig. 2). The scaffold proteins (yellow in Fig. 2) extend from the "horseshoe" shaped terminus (bracket in Fig. 2) of one scaffold array near the pentasymmetron to the scaffold array along the trisymmetron interface. The support proteins (dark blue in Fig. 2) showed a large density and three additional smaller densities, running parallel to the trisymmetron interface and the cement protein array. The three additional small densities were not observed in the tokyovirus capsid at 7.7 Å resolution (Chihara et al., 2021). As the scaffold and support proteins (yellow and dark blue, respectively, in Fig. 2) are underneath the lattice layer of the capsid, they are more flexible, and as such resolution suffers when the map is filtered by local resolution; without local filtering, however, they are exceedingly difficult to see (Fig. 2). The glue proteins are located on the edge of the trisymmetron and serve to connect the adjacent trisymmetron. The bisymmetric structure looks like a dimer of the protein, but this is unclear at the current resolution. The zipper protein consists of ten components (orange in Fig. 2), eight of which are located along the trisymmetron interface and bind to each of the two-fold symmetric glue proteins (pink in Fig. 2).
The remaining two terminal zipper proteins are located between the glue proteins and the pentasymmetron components (arrowhead in Fig. 2), and their orientation is significantly different from that of the other zipper proteins (arrow in Fig. 2).

higher whole virus resolution, which can be considered consistent with the increased particle count. It is also ~25% smaller than Melbournevirus, at 190 nm, so the intra-particle defocus gradient should not be as large. Computationally, improvements in image processing software, specifically magnification anisotropy estimation and Ewald sphere correction, played a significant role in improving our Melbournevirus reconstruction (see Methods). As these computational methods advance further, we look forward to seeing further improvements for the study of giant viruses.

Fig. 3B, Fig. S7) and form a cup structure on the top of the MCP trimer to accommodate additional cap densities (orange in Fig. 3C). PBCV-1 MCP has a highly glycosylated "cap" region, while ASFV and faustovirus have a large additional β-sheet rich domain (Fig. S7B-D). Melbournevirus, however, has an ordered region, still at lower resolution than the main body of the MCP trimer (Figs. S3D and S4D), that does not correspond to any part of the MCP model (Fig. 3B, C). This ordered structural motif would imply the presence of protein, rather than sugars, which tend to be highly mobile. However, the reduced clarity of the density compared to that of the MCP may be caused by several factors. Firstly, the cap may be in multiple conformations across the capsid surface, resulting in loss of clarity due to icosahedral averaging. Secondly, the cap may be present at partial occupancy on the MCP trimers across the capsid surface, also resulting in loss of clarity due to icosahedral averaging. To resolve this, further investigation is necessary to identify the cap protein and determine the stoichiometry with the

of the asymmetric unit the "golf club motif".
Furthermore, the "golf club motif" was hypothesized to be caused by specific localization of the tape measure proteins (Xian et al., 2020). In our observations, Melbournevirus showed a "golf club motif" of the MCP trimer array in the same orientation as PBCV-1 and CroV (Fig. 5), but Melbournevirus lacks a tape measure protein. However, the rotation of the single MCP trimer by 60° (P1d, P2d, P3d in Fig. 5) orients it to match that of the adjacent trisymmetron and penton asymmetric unit. If we picture all the MCP trimers in a pentasymmetron asymmetric unit in the same orientation (Fig. S11), we can better visualise why the rotation of a single MCP trimer may occur. The single MCP trimer would cause a mismatch in the MCP orientational alignment of the trisymmetron interface (arrows in Fig. S11). The trisymmetron interface demonstrates a greater angle across it than the trisymmetron, which is of a gentler curvature. As such, it is likely that this orientational arrangement is a result of improving the flexibility of the capsid across the trisymmetron interface.

Without a tape measure protein, we propose that PC-β, which is present underneath the interface between the 60°-rotated Pd MCP trimer and the Pb and Pe MCP trimers, may play a role in determining trimer orientation. In a similar manner, PC-α may support the interface between the penton and two asymmetric units, while PC-γ supports the interface between the pentasymmetron and the adjacent trisymmetron. However, these components do not directly contact the 60°-rotated Pd MCP trimer. While we have previously proposed that the scaffold protein array acts as an equivalent to the tape measure protein, the horseshoe terminus of the scaffold does not appear to be able to directly interact with any of the PC proteins, as the glue/zipper protein array is present in between the horseshoe terminus and either PC-β or PC-γ (Fig. 2). PC-β may interact with the two terminal zipper proteins, which have a rotated orientation when compared to the remaining eight zipper proteins running down the edges of the glue proteins along the trisymmetron interface, while PC-γ interacts with the glue proteins themselves along two trisymmetron interfaces. As the scaffold proteins run parallel to the glue/zipper array, some interaction may also occur there, with the three acting in concert in a similar manner to the tape measure protein in controlling capsid construction. Therefore, the combination of PC-β, zipper proteins, and scaffold proteins (dotted curve in Fig. 2) is most likely to function as the tape measure protein in Melbournevirus.
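The significance of a 60° rotation for a threefold-symmetric trimer can be illustrated with a short geometric sketch. This is not part of the analysis reported here; it assumes an idealised trimer of three point-like subunits placed 120° apart on a unit circle (the `trimer` coordinates and helper names are hypothetical). Under that assumption, a 120° rotation leaves the trimer indistinguishable, whereas a 60° rotation gives the one distinct alternative orientation, equivalent to a 180° flip.

```python
import math

def rotate(points, degrees):
    """Rotate 2D points about the origin by the given angle in degrees."""
    a = math.radians(degrees)
    c, s = math.cos(a), math.sin(a)
    return [(c * x - s * y, s * x + c * y) for x, y in points]

def same_footprint(p, q, tol=1e-9):
    """True if every point of p coincides with some point of q (order ignored)."""
    if len(p) != len(q):
        return False
    return all(
        any(math.hypot(px - qx, py - qy) < tol for qx, qy in q)
        for px, py in p
    )

# Hypothetical idealised trimer: three subunit centres 120 degrees apart.
start = [(1.0, 0.0)]
trimer = rotate(start, 0) + rotate(start, 120) + rotate(start, 240)

# A 120-degree rotation maps the trimer onto itself.
unchanged_120 = same_footprint(trimer, rotate(trimer, 120))
# A 60-degree rotation does not: it is a distinct orientation.
unchanged_60 = same_footprint(trimer, rotate(trimer, 60))
# Rotations by 60 and 180 degrees are equivalent for a threefold object.
equiv_60_180 = same_footprint(rotate(trimer, 60), rotate(trimer, 180))

print(unchanged_120, unchanged_60, equiv_60_180)
```

In this idealised picture a pseudo-hexagonal trimer has only two distinguishable in-plane orientations on the lattice, which is why a single 60°-rotated trimer is sufficient to switch its alignment between neighbouring arrays.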

The lattice proteins show strong similarity to the motifs displayed in the lower resolution tokyovirus reconstruction. By using local resolution filtering in RELION 3.1 (Zivanov et al., 2020), we were able to avoid losing clarity in the more rigid lattice proteins, while simultaneously visualising the more flexible scaffold array (Fig. 2). Interestingly, this permitted visualisation of three additional weak densities extending from the previously reported support protein (dark blue in Fig. 2), which run parallel to the zipper/glue proteins and cement proteins on the trapezoidal lattice, but which themselves were not previously observed in tokyovirus. Their absence from the tokyovirus map was likely due to resolution limitations caused by decreased signal-to-noise from fewer particles and lack of dose weighting for HVEM data (Chihara et al.,

which direction of two apparently divergent paths the density should follow. As the primary sequence is unknown, and large or interacting sidechains can create a density bridge which is difficult to discriminate from backbone density, we erred on the side of caution for the time being. One of the mCPs, a repeating segment of the lattice protein (salmon in Fig. 2), has a section exposed on the internal face of the capsid (following the chain into the mCP layer created too much ambiguity in the direction of the polypeptide) which fits a short section consisting of three linked α-helices (Fig. S10).

The PDB model of the PBCV-1 penton protein was rigid-body fit to the Melbournevirus penton density. It was deemed a poor fit, so the map blocks were segmented using SEGGER (Pintilie and Chiu, 2012) and extracted segments were processed with DeepTracer (Pfab et al., 2021) using a poly-alanine sequence for chain tracing. The best of these poly-alanine chain models was rigid-body fit to fill the penton with five identical chains. ISOLDE (Croll, 2018) was used to better fit some sections of the chain. Real-space refinement and validation were carried out in PHENIX (Adams et al., 2010). Other parts of the mCPs (PC-α and lattice protein) were processed in the same way to build polyalanine models.
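Filling the penton with five identical chains amounts to placing copies of one chain at 72° intervals around the five-fold axis. The sketch below illustrates only that geometric step and is not the software procedure used here, where each copy was rigid-body fit to the density. It assumes the five-fold axis lies along z through the origin; the coordinates and the function names `rotate_about_z` and `fivefold_copies` are hypothetical.

```python
import math

def rotate_about_z(coords, degrees):
    """Rotate 3D coordinates about the z axis by the given angle in degrees."""
    a = math.radians(degrees)
    c, s = math.cos(a), math.sin(a)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in coords]

def fivefold_copies(chain):
    """Return five copies of a chain related by successive 72-degree rotations."""
    return [rotate_about_z(chain, 72 * k) for k in range(5)]

# Hypothetical three-residue C-alpha trace (in Å), for illustration only.
chain = [(10.0, 0.0, 0.0), (12.0, 1.5, 3.8), (14.0, 0.5, 7.6)]

pentamer = fivefold_copies(chain)

for k, copy in enumerate(pentamer):
    x, y, z = copy[0]
    print(f"chain {k}: first atom at ({x:.2f}, {y:.2f}, {z:.2f})")
```

Each copy keeps its distance from the axis and its height along z, so the five chains are related by exact five-fold symmetry; in practice such idealised placements would only be a starting point for fitting and refinement against the map.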