RNA-induced allosteric coupling drives viral capsid assembly in bacteriophage MS2

Understanding the mechanisms by which single-stranded RNA viruses regulate capsid assembly around their RNA genomes has become increasingly important for the development of both antiviral treatments and drug delivery systems. Here, we investigate the effects of RNA-induced allostery in a single-stranded RNA virus, the Levivirus bacteriophage MS2, using two computational methods: the Dynamic Flexibility Index (DFI) and the Dynamic Coupling Index (DCI). We show that asymmetric binding of RNA to a symmetric MS2 coat protein dimer increases the flexibility of the distant FG-loop and induces a conformational change to an asymmetric dimer that is essential for proper capsid formation. We also show that the point mutation W82R in the FG-loop creates an assembly-deficient dimer in which RNA binding has no significant effect on FG-loop flexibility. Lastly, we show that the highly flexible, disordered FG-loop of the RNA-bound asymmetric dimer not only becomes the controller of the rigid FG-loop but also enhances its dynamic coupling with all the distal positions in the dimer. This strong dynamic coupling allows highly regulated communication and unidirectional signal transduction that drives the formation of the experimentally observed capsid intermediates.

Author summary

The final stage of an RNA virus's life cycle is the assembly of a protein shell encapsulating the viral genome prior to release from the host organism. Despite rapid advancements in both experimental and theoretical biology since the mid-20th century, little is known about the underlying mechanisms of viral capsid assembly. Understanding the biophysical principles of viral capsid assembly would bring us one step closer to developing new biotechnologies, such as antivirals that inhibit this critical stage of the life cycle or artificial capsids for targeted drug/vaccine delivery. Although we limit the present study to one simple RNA virus that infects bacteria, we propose that the physical implications can extend to other RNA viruses, including the human coronavirus SARS-CoV-2. We also propose that allosteric regulation by specific protein-RNA interactions might be a general mechanism exploited by many other ribonucleoprotein complexes, such as CRISPR-Cas9, the spliceosome, or the ribosome.

Single-stranded RNA (ssRNA) viruses package their genetic material into a protein capsid as part of their replication cycle [1]. Understanding the mechanism of this packaging is important for the development of antiviral therapeutics, as well as for repurposing for delivery applications in bionanotechnology. The main experimental sources of information about viral packaging are the atomistic structures resolved through cryoEM or X-ray crystallography, which, however, provide only a static picture [...] of respective protein residues by quantifying their displacement under an applied force using linear perturbation theory. DFI has previously been shown to identify residues important to protein function, as mutations introduced at rigid positions result in dysfunctional proteins [5,6]. The DCI measures coupling between pairs of residues upon perturbation of one of them. Prior studies showed that a high DCI score between distant residues indicates long-range allosteric communication [5,6]. Details on calculating DCI and DFI and on their extension to protein-RNA complexes are provided in Methods.

Our results from applying DCI and DFI to the MS2 bacteriophage indicate a mechanism of packaging-signal-induced sequential assembly of viral capsids; more generally, the approach can also be applied to other RNA-protein complexes to identify possible RNA-mediated allosteric regulation of proteins.

Researchers have extensively studied the Levivirus bacteriophage MS2, a positive-sense ssRNA virus that encodes only four proteins: maturation, lysis, replicase, and coat [7][8][9]. X-ray crystallography (PDB ID: 1ZDH [10]) has revealed that the MS2 genome is encapsulated by 180 coat proteins that form three quasi-equivalent conformations arranged as 60 asymmetric A/B dimers and 30 symmetric C/C dimers. These dimers create a T=3 icosahedral shell with spherically symmetric three-fold (pseudo-six-fold) and five-fold axes.
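As a quick consistency check on the subunit counts quoted above, the numbers can be written out with the standard Caspar-Klug relation. This relation and the lattice indices h and k are textbook notation assumed here for illustration; they are not spelled out in the text.

```latex
\begin{align}
T &= h^2 + hk + k^2 = 1 + 1 + 1 = 3 \qquad (h = k = 1), \\
N_{\text{coat}} &= 60\,T = 180, \\
N_{\text{coat}} &= 2 \times 60\ (\text{A/B dimers}) + 2 \times 30\ (\text{C/C dimers}) = 120 + 60 = 180 .
\end{align}
```

Both routes give 180 coat proteins, in agreement with the crystallographic count.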
The FG-loops (residues 66-82) of the symmetric and asymmetric dimers, which make crucial inter-dimer contacts at the three-fold axis [...] is consistent with a prior all-atom normal-mode analysis of the same system [29,30].

Although detecting protein complex intermediates can be challenging due to their short lifespan, two significant intermediates, namely the three-fold and five-fold rings, have been identified on the MS2 capsid assembly pathway via mass spectrometry [31][32][33].

While crescent and horseshoe structures have not been experimentally detected, theoretical evidence suggests that they are two probable structures on the pathway to the five-fold ring (see [31,33]).

Higher-order capsid intermediates are formed by the alternating 1:1 association of asymmetric and symmetric dimers. Three-fold rings and five-fold rings form independently, with an RNA-bound asymmetric dimer acting as the nucleation site of capsid assembly. Crescent and horseshoe conformations are two possible intermediates that arise during the formation of the five-fold ring structure. In the full T=3 capsid, five three-fold rings converge at one five-fold ring.
The successful self-assembly of biomolecules requires long-range cooperativity, which we hypothesize will be reflected in the intermediate structures along the pathway to capsid assembly. Specifically, we predict that these intermediates will exhibit high [...]

[...] regulatory activity in capsid assembly [35,36]. Moreover, the recent discovery of previously unknown variability in MS2 capsids [37] raises questions about whether the protein-RNA interactions discussed here are specific enough to support the formation of T=3 symmetry over unconventional T=4 or hybrid T=3/T=4 symmetries. Finally, the application of DFI and DCI to other viral proteins and even larger RNA-protein systems, such as ribosomes, spliceosomes, CRISPR-Cas9, and full capsids, has the potential to reveal a general allosteric mechanism among nucleoprotein complexes that guides their function and mediates a particular directed assembly pathway; these systems will be the subject of our future investigations.

[...] depending on the specific application [38]. In recent years, the ENM has been extended and applied to RNA structures [39]. However, due to the higher flexibility of RNA compared to proteins, RNA molecules are best modeled as all-atom networks [40,41].
[...] where H is the 3N × 3N Hessian matrix composed of the second derivatives of the [...] is most consistent with a naturally occurring system [5,6].

From the response vectors, we construct a perturbation matrix A of the form:

$$A_{ij} = |\Delta R^{j}|_{i}, \qquad i, j = 1, \dots, N,$$

where $|\Delta R^{j}|_{i}$ is the magnitude of the fluctuation of residue i when residue j is perturbed. Subsequently, the DFI value of residue i is given by the net fluctuation of residue i relative to the net displacement of the entire protein:

$$\mathrm{DFI}_{i} = \frac{\sum_{j=1}^{N} |\Delta R^{j}|_{i}}{\sum_{i=1}^{N} \sum_{j=1}^{N} |\Delta R^{j}|_{i}}.$$

DFI is a metric that quantifies the relative flexibility of a residue, where higher [...] dysfunctional proteins, suggesting that these residues must be conserved across structural homologs with comparable function [5,6]. Thus, rigid residues act as hinges (like joints or pivots) in the global motion of the protein and are sequence-specific, while flexible (high-DFI) residues provide the mobility necessary for dynamical processes such as catalysis, signal transduction, and conformational changes [5,6].
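The DFI definition above can be illustrated with a minimal Python sketch. It assumes the response vectors are already available, and it does not implement the Hessian-based force response itself. The function names, the input layout `responses[j][i]`, and the toy numbers are hypothetical and are not taken from the authors' code or from MS2 data.

```python
import math


def perturbation_matrix(responses):
    """Build A with A[i][j] = |dR^j|_i from response vectors.

    Assumed (hypothetical) input layout: responses[j][i] is the 3D
    displacement (x, y, z) of residue i when residue j is perturbed.
    """
    n = len(responses)
    return [
        [math.sqrt(sum(c * c for c in responses[j][i])) for j in range(n)]
        for i in range(n)
    ]


def dfi(A):
    """DFI_i = (sum over j of A[i][j]) / (sum over all entries of A)."""
    total = sum(sum(row) for row in A)
    if total == 0:
        raise ValueError("perturbation matrix has no net displacement")
    return [sum(row) / total for row in A]


# Toy response vectors for a 3-residue chain (made-up numbers, not MS2 data).
responses = [
    [(3, 4, 0), (0, 0, 1), (0, 0, 0)],  # residue 0 perturbed
    [(0, 1, 0), (0, 0, 2), (0, 0, 1)],  # residue 1 perturbed
    [(0, 0, 0), (1, 0, 0), (0, 6, 8)],  # residue 2 perturbed
]

A = perturbation_matrix(responses)
profile = dfi(A)  # row sums 6, 4, 11 over a total of 21
for i, value in enumerate(profile):
    print(f"residue {i}: DFI = {value:.3f}")
```

In this toy input, residue 1 has the smallest share of the total response (4 of 21) and would be read as the most rigid position, while residue 2 (11 of 21) would be the most flexible. Because each value is normalized by the total displacement, the profile sums to one.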

Similarly, the DCI value of residue i is given by the fluctuation of residue i in response to perturbations at a single residue j relative to the average response of residue i due to perturbations at all other residues:

$$\mathrm{DCI}_{ij} = \frac{|\Delta R^{j}|_{i}}{\frac{1}{N} \sum_{k=1}^{N} |\Delta R^{k}|_{i}}.$$

DCI is a measure of the coupling between residue i and residue j upon perturbation at residue j. If the distance between residues i and j is significant and binding of a