Abstract
The multi-subunit chromatin remodeling complex SWI/SNF1–3 is highly conserved from yeast to humans and plays critical roles in various cellular processes including transcription and DNA damage repair4, 5. It uses the energy from ATP hydrolysis to remodel chromatin structure by sliding and evicting the histone octamer6–10, creating DNA regions that become accessible to other essential protein complexes. However, our mechanistic understanding of the chromatin remodeling activity is largely hindered by the lack of a high-resolution structure of any complex from this family. Here we report the first structure of SWI/SNF from the yeast S. cerevisiae bound to a nucleosome at near atomic resolution determined by cryo-electron microscopy (cryo-EM). In the structure, the Arp module is sandwiched between the ATPase and the Body module of the complex, with the Snf2 HSA domain connecting all modules. The HSA domain also extends into the Body and anchors at the opposite side of the complex. The Body contains an assembly scaffold composed of conserved subunits Snf12 (SMARCD/BAF60), Snf5 (SMARCB1/BAF47/ INI1) and an asymmetric dimer of Swi3 (SMARCC/BAF155/170). Another conserved subunit Swi1 (ARID1/BAF250) folds into an Armadillo (ARM) repeat domain that resides in the core of the SWI/SNF Body, acting as a molecular hub. In addition to the interaction between Snf2 and the nucleosome, we also observed interactions between the conserved Snf5 subunit and the histones at the acidic patch, which could serve as an anchor point during active DNA translocation. Our structure allows us to map and rationalize a subset of cancer-related mutations in the human SWI/SNF complex and propose a model of how SWI/SNF recognizes and remodels the +1 nucleosome to generate nucleosome-depleted regions during gene activation11–13.
Main
To gain insight into the molecular mechanisms of how SWI/SNF remodels chromatin, we purified endogenous SWI/SNF from S. cerevisiae, assembled the SWI/SNF-nucleosome complex in vitro (Extended Data Fig. 1) and determined its structure using single particle cryo-EM. The complex was assembled in the presence of the non-hydrolysable ATP analog ADP-BeFx and was determined to sub-nanometer resolution (Extended Data Fig. 2). We observed that the nucleosome is clamped between two regions of the SWI/SNF complex (Fig. 1, Supplementary Video 1). To improve the resolution, we also assembled the complex in the presence of ATPγS and determined its structure using cryo-EM (Extended Data Fig. 3). Since this structure shows features similar to the ADP-BeFx bound complex, we combined the two data sets and performed further processing (Extended Data Fig. 4). After careful 3D classification, we obtained a reconstruction of the body of SWI/SNF to an average resolution of 4.7 Å (Fig. 1a; Extended Data Fig. 4), which we refer to as the Body module of SWI/SNF. This resolution allowed the de novo model building of the SWI/SNF complex (Fig. 1b).
In addition to the bound nucleosome, the SWI/SNF complex is composed of three major modules: the Body, the Arp, and the ATPase (Fig. 2a). The Snf2 ATPase domain binds the nucleosome at super helical location (SHL) 2, the same location shown in the stand-alone structures of the Snf2 ATPase-nucleosome complexes14, 15 as well as in SWR116, Chd117, 18 and SNF2h19, but quite different from INO8020, 21 (Extended Data Fig. 5). The Arp module is composed of Arp7, Arp9, Rtt102 and the HSA domain of Snf2 and is sandwiched between the Body and the ATPase modules (Fig.1a, b). This architecture has never been observed before and is quite different from other multi-subunit remodeling complexes, including INO80 and SWR116, 20, 21 (Extended Data Fig. 5). The HSA of Snf2 plays an essential role in connecting the ATPase and Arp modules to the Body, extending into the Body and anchoring at the opposite side of the complex (Fig. 1a, b). We therefore named this region of Snf2 adjacent to the HSA the Anchor domain (Fig. 1c). This connection of the Arp module to the Body through a single α helix could explain the observed flexibility of the Arp and ATPase modules in the reconstruction as evidenced by lower estimated local resolution (Extended Data Figs. 2, 3). The functional relevance of this flexibility requires further investigation.
The 4.7 Å resolution map of the Body shows the helical nature of SWI/SNF (Fig. 1a) and enabled us to build a structural model with the help of prior knowledge of this important complex (Figs. 1b, 2; Methods). We then mapped the crosslinking data for apo SWI/SNF22 onto our model of the Body module as a validation procedure (Extended Data Table 1). Out of the 35 inter-linking pairs that were mapped onto the Body model, 27 (77%) pairs have a Cα-Cα distance within 30 Å, the maximum distance allowed by the crosslinker BS323. We also mapped 60 pairs of intra-links, of which 55 (92%) show a Cα-Cα distance within 30 Å. These comparisons support the accuracy of our model and also indicate that the structure of the SWI/SNF Body module does not change drastically upon engaging a nucleosome.
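The crosslink validation described above reduces to a distance check per residue pair. The following is a minimal sketch, assuming Cα coordinates are already available as (x, y, z) tuples in Å; the function names and the toy coordinates are hypothetical and are not taken from the study's data.

```python
import math

MAX_CA_CA = 30.0  # Å, the BS3 cutoff quoted in the text


def count_satisfied(pairs, cutoff=MAX_CA_CA):
    """Return (satisfied, total, fraction) for a list of Cα coordinate pairs.

    Each pair is ((x1, y1, z1), (x2, y2, z2)) in Å.
    """
    satisfied = sum(1 for a, b in pairs if math.dist(a, b) <= cutoff)
    return satisfied, len(pairs), satisfied / len(pairs)


# Toy coordinates (hypothetical): one pair at 5 Å, one at 31 Å.
toy_pairs = [((0, 0, 0), (3, 4, 0)), ((0, 0, 0), (0, 0, 31))]
print(count_satisfied(toy_pairs))

# The percentages reported in the text follow from the stated counts.
print(round(100 * 27 / 35), round(100 * 55 / 60))
```

The same function can be applied separately to the inter-link and intra-link lists to reproduce the two reported fractions.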
The conserved subunits Swi1/ARID1/BAF250, Swi3/SMARCC/BAF155/170, Snf12/SMARCD/BAF60 and Snf5/SMARCB1/BAF47/INI1 assemble into the body of the SWI/SNF complex (Fig. 2), consistent with these proteins forming a core module in the human SWI/SNF complexes24. Based on the positioning of different domains and their functions, we further defined four sub-modules of the scaffold: the Spine, the Hinge, the Arm and the Core (Fig. 2a).
The Spine is composed of Snf12 and the C-terminal regions of Swi3 (Fig. 2b). We identified two copies (named A and B) of Swi3 in our structure, consistent with previous crosslinking data showing multiple same-residue crosslinks within Swi322. The most striking feature of the Spine is the four-helix bundle formed by the two long helices (LH1/2) of Snf12 and the Coiled-coil domains from the two copies of Swi3 (Fig. 2b), consistent with the previous finding that the RSC homologs of Snf12 (RSC6) and Swi3 (RSC8) directly interact25. The Coiled-coil domain of Swi3 has clear leucine-zipper properties, containing hydrophobic amino acids separated by 7 residues in a helical region26. Interestingly, the crystal structure of the human dominant-negative OmoMYC homodimer27, a leucine-zipper-containing complex, can be unambiguously fitted into the two helices belonging to Swi3 by rigid body docking (Extended Data Fig. 6a). Surprisingly, the two Coiled-coil domains of Swi3 have different lengths (Fig. 2b), indicating asymmetric folding (Extended Data Fig. 6b). We speculate that this might be due to the different interactions that the two Coiled-coils are involved in during complex assembly. BAF155/170 (SMARCC1/2), the human homologues of Swi3, have been reported to form a dimer at the very first step of SWI/SNF complex assembly24. We therefore hypothesize that the two copies of Swi3 are indistinguishable at the early steps of SWI/SNF assembly, and that after engaging with other subunits, especially Snf12/SMARCD/BAF60, the symmetry is broken. Snf12 has been shown to play important roles in SWI/SNF function28, and our structure suggests that it may do so by interacting with Swi3 and contributing to the assembly of the complex. The unassigned density at the tip of the Spine shows clear β-sheet features and is directly connected to the SWIB domain of Snf12 (Extended Data Fig. 6c). This, together with the secondary structure prediction of Snf12, allowed us to assign this density to Snf12.
The Hinge is composed of the two SANT domains of Swi3 and the C-terminal helices of Snf12 (Fig. 2c). SANTB contacts the C-terminal helices of Snf12 and is in close proximity to the Core sub-module (Fig. 2c), whereas SANTA is located at the top and interacts with a C-terminal segment of Swi3A (Fig. 2c). Both SANT domains contact and sandwich the Snf2 Anchor domain (Fig. 2c), playing a key role in stabilizing the ATPase within the complex.
The Arm is composed of Snf5, the N-terminal SWIRM domains of Swi3 and C-terminus of Swp82 (Fig. 2d). The Snf5 Core repeat (RPT) domains each engage one copy of the Swi3 SWIRM domain in a similar manner as in the human BAF47/BAF155 crystal structure29 (Fig. 2d, Extended Data Fig. 7a, b). Subtle differences in the two RPT-SWIRM interfaces (Extended Data Fig. 7c) are likely due to the α helix N-terminal to RPT1, H-N, wedging between RPT1 and SWIRMA while the C-terminal region of RPT2 is packed against the opposite side of SWIRMB. The RPT1/SWIRMA connects the Arm module to the Core module by tightly associating with Swi1 (Fig. 2d). Swp82 contains an α helix that runs along Snf5/Swi3 (Fig. 2d), likely further stabilizing the Arm module. The environments that the two molecules of Swi3 experience in both the Hinge and the Arm further establish the asymmetric architecture of this homodimer (Extended Data Fig. 6b).
Swi1/ARID1/BAF250 resides in the core region of the Body, acting as a hub to integrate all other modules (Fig. 2a, e). Therefore, we name it the Core module. It clearly folds into an Armadillo (ARM) repeat structure30 (Fig. 2e, Extended Data Fig. 8a). Interestingly, BAF250a, the human homolog of Swi1, was predicted to contain an ARM domain31, consistent with the highly conserved nature of this subunit. Compared to the β-catenin structure32, the Swi1 ARM repeat domain contains extra insertion sequences (Extended Data Fig. 8a), such as the one between helices H3 and H6. In addition to the neighboring repeats, this long insertion makes extensive contacts with the Snf5 and Swi3 subunits of the Arm as well as both the Spine and the Hinge. It contacts the Arm by wrapping on top of the Swi3 SWIRMA domain and traveling back along the Snf5 H-N (Extended Data Fig. 8b). Interestingly, this insertion also forms an α helix H4 that contacts a surface on SWIRMB, whose corresponding region on SWIRMA engages with Swi1 H1 and Snf5 H-N (Extended Data Fig. 7d), emphasizing the role of Swi1 in associating with the Arm module. In addition to this long insertion associating with the Arm, H1 of Swi1 contacts the SWIRMA domain of Swi3, whereas H3 and H8 interact with Snf5 RPT1 (Extended Data Fig. 8b), thus connecting the Arm to the Core. The Swi1 ARM repeat domain also interacts extensively with the Spine sub-module. The entire top surface of the Swi1 ARM makes contacts with the helix bundle from the Spine, with the C-terminal helices H19 and H20 engaging the SWIB domain of Snf12 (Extended Data Fig. 8c).
The Core is also the major docking point of the Snf2 Anchor domain (Fig. 3a). H11 of Swi1 ARM interacts with an extended region of the Snf2 HSA domain that is absent from the crystal structure33, while H2, H6 and H9 contact the Anchor linker (Extended Data Fig. 9a). These interactions, together with the Hinge region sandwiching the Anchor helices of Snf2 (Extended Data Fig. 9b), further lock the ATPase in the complex. This observation is consistent with ARID1A being the branching subunit connecting the ATPase module with the rest of the SWI/SNF complex in humans24.
The modular architecture of the SWI/SNF complex revealed by our structure agrees well with the modules revealed by previous biochemical and proteomic studies34, 35. The conserved SWI/SNF subunits form the structural scaffold within the complex, whereas yeast-specific subunits only occupy peripheral regions. For example, Snf6 was identified to situate at the back of the complex, spanning the Core and wrapping on top of the four-helix bundle of the Spine (Extended Data Fig. 10a). Swp82 is another yeast-specific subunit, and it is also located peripherally, making limited contacts with the rest of the complex (Extended Data Fig. 10b). Based on our sequence conservation analysis (Supplementary Figures 1-5), we have also mapped a subset of invariant residues from the human cancer mutation database36 onto our SWI/SNF model. Although the majority of the mutations likely compromise structure and folding, many also map to protein-protein interfaces, contributing to different types of the human disease (Fig. 2b-e, 3a).
Our structure has also enabled us to map the interactions between the SWI/SNF complex and the nucleosome. The ATPase domain of Snf2 binds the nucleosome at SHL2 in the context of the entire complex, as reported previously for the stand-alone ATPase14, 15, 37, 38. A series of cancer patient mutations map to the Snf2 HSA-DNA interface near SHL-6, likely diminishing the remodeling efficiency by disrupting protein-DNA interactions (Fig. 3a). The yeast-specific subunit Swp82 also contacts the nucleosomal DNA near SHL-2 (Fig. 3b), likely contributing to the remodeling activity of SWI/SNF. Although the nucleosomal DNA is not deformed as was observed for Chd117, 18, there are multiple interactions between the SWI/SNF complex and the extranucleosomal DNA in our structure. First, Snf6 contacts the extranucleosomal DNA proximal to the nucleosome (Fig. 3b), in good agreement with previous site-directed DNA crosslinking experiments39. Second, at a lower threshold, we observed additional density for extranucleosomal DNA contacting the Body module (Extended Data Fig. 11), suggesting flexibility of this region of the DNA. However, when we prepared the SWI/SNF-nucleosome complex using a nucleosome with no overhanging DNA sequence (data not shown), we failed to observe stable complex formation, suggesting the importance of the extranucleosomal DNA in nucleosome binding to SWI/SNF. Interestingly, this extranucleosomal DNA also coincides with the possible trajectories of the N-terminal regions of both Swi1 and Snf5 (Extended Data Fig. 11), which have been shown to interact with acidic transcription activators40–42. This could explain how SWI/SNF is recruited by transcription activators to its target loci for chromatin remodeling, leading to an activated gene transcription.
A connecting density is observed between the histones and the Snf5 C-terminus (Fig. 3c, d), consistent with the histone crosslinking experiments22, 39. This density likely corresponds to the highly conserved Snf5 arginine anchor motif that interacts with the acidic patch of the histone octamer (Fig. 3c, d) where a number of nucleosome regulators bind43, suggesting a conserved mechanism of octamer recognition. Deletion of the RPT domains in Snf5 uncouples ATP hydrolysis by Snf2 from the chromatin remodeling activity22. Our structure suggests an anchoring role of the Arm sub-module during active remodeling, in which Snf5 locks the histones in place as the nucleosomal DNA is being translocated, thus coupling ATP hydrolysis with chromatin remodeling (Fig. 4). In contrast, this anchoring role is primarily carried out by the Arp module in other large remodeling complexes, including INO80 and SWR116, 20, 21 (Extended Data Fig. 5). It has been well documented that the natural substrate for SWI/SNF is the +1 nucleosome situated near the promoter11–13. Therefore, the extranucleosomal DNA at the exit side of the nucleosome in our structure corresponds to upstream promoter DNA, consistent with SWI/SNF’s function in generating the nucleosome-depleted regions during gene activation.
Methods
SWI/SNF purification
SWI/SNF complex was purified from a yeast strain containing a TAP tag at the C-terminus of Snf244 (obtained from the High Throughput Analysis Laboratory at Northwestern University). Tandem affinity purification was performed as follows. The tagged yeast strain was grown to an optical density at 600 nm (OD600) of 4-5 in 12 liters of YPD (3% glucose). Next, cells were harvested by centrifugation and washed with 200 ml of cold TAP Extraction Buffer (40 mM Tris pH 8, 250 mM ammonium sulfate, 1 mM EDTA, 10% glycerol, 0.1% Tween 20, 5 mM dithiothreitol [DTT], 2 mM phenylmethylsulfonyl fluoride [PMSF], 0.31 mg/ml benzamidine, 0.3 μg/ml leupeptin, 1.4 μg/ml pepstatin, 2 μg/ml chymostatin). Cells were resuspended in 150 ml cold TAP Extraction Buffer and lysed in a BeadBeater (Biospec Products). Cell debris was removed by centrifugation at 14,000 ×g at 4°C for 1 hr. For the first affinity step, 2 ml IgG Sepharose beads (GE Healthcare) were incubated with the lysate at 4°C overnight. The beads were next washed and resuspended in 4 ml cold TEV (tobacco etch virus) Cleavage Buffer (10 mM Tris pH 8, 150 mM NaCl, 0.1% NP-40, 0.5 mM EDTA, 10% glycerol). TEV cleavage using 25 µg of TEV protease was performed at room temperature for 1 hr with gentle shaking. The TEV protease-cleaved products were collected, and the IgG beads were washed with 3 column volumes (∼6 ml total) cold Calmodulin Binding Buffer (15 mM HEPES pH 7.6, 1 mM magnesium acetate, 1 mM imidazole, 2 mM CaCl2, 0.1% NP-40, 10% glycerol, 200 mM ammonium sulfate, 5 mM DTT, 2 mM PMSF, 0.31 mg/ml benzamidine, 0.3 μg/ml leupeptin, 1.4 μg/ml pepstatin, 2 μg/ml chymostatin). CaCl2 was added to the combined eluate at a final concentration of 2 mM and incubated with 0.8 ml Calmodulin Affinity Resin (Agilent Technologies) at 4°C for 2 hours.
After incubation, the beads were washed with cold Calmodulin Binding Buffer and cold Calmodulin Wash Buffer (same as Calmodulin Binding Buffer, but containing 0.01% NP-40), and bound proteins were eluted with Calmodulin Elution Buffer (15 mM HEPES pH 7.6, 1 mM magnesium acetate, 1 mM imidazole, 2 mM EGTA, 10% glycerol, 0.01% NP-40, 200 mM ammonium sulfate) at room temperature. Fractions containing the SWI/SNF complex were combined and concentrated to a final concentration of ∼4 mg/ml (280 nm absorption) using a concentrator (Amicon Ultra-4 Ultracel 30K, Millipore). Concentrated protein was aliquoted, flash frozen in liquid nitrogen and stored at −80℃.
Nucleosome reconstitution
Mono-nucleosome was reconstituted with Xenopus histones and the 601 DNA45 using the Mini Prep Cell (Bio-rad) as described previously46. The Xenopus histones were obtained from the Histone Source – the Protein Expression and Purification (PEP) Facility at Colorado State University. DNA oligonucleotides containing the 601 sequence were purchased from IDT (Integrated DNA Technology): top strand, 5’- ACCTCCCACTATTTTATGCGCCGGTATTGAACCACGCTTATGCCCAGCATCGTTAATCGATGTATATATCTGACACGTGCCTGGAGACTAGGGAGTAATCCCCTTGGCGGTTAAAACGCGGGGGACAGCGCGTACGTGCGTTTAAGCGGTGCTAGAGCTGTCTACGACCAATTGAGCGGCCTCGGCACCGGGATTCTGAT-3’; bottom strand, 5’-ATCAGAATCCCGGTGCCGAGGCCGCTCAATTGGTCGTAGACAGCTCTAGCACCGCTTAAACGCACGTACGCGCTGTCCCCCGCGTTTTAACCGCCAAGGGGATTACTCCCTAGTCTCCAGGCACGTGTCAGATATATACATCGATTAACGATGCTGGGCATAAGCGTGGTTCAATACCGGCGCAT-3’. The 601 sequence is underlined. The lyophilized DNA oligos were resuspended in water to a final concentration of ∼100 µM and mixed at 1:1 molar ratio. Annealing of the DNA was performed by incubating in boiling water for 5 min followed by gradually cooling to room temperature in 2 hours. The reconstituted nucleosome core particle (NCP) was concentrated to ∼6µM and annealed with a biotinylated RNA molecule (IDT, 5’-UAGUGGGAGGU-3’-biotin) to the top DNA strand at 1:1.5 (DNA to RNA) molar ratio at 45℃ for 5 min followed by gradually cooling to room temperature in 30-40 min. This resulted in a final concentration of the nucleosome core particle (NCP) at 5.52 µM. The annealed NCP was stored at 4℃.
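The design of the biotinylated RNA handle can be checked against the 5' end of the top strand quoted above. The sketch below assumes standard Watson-Crick pairing between the DNA overhang and the RNA; the helper name is hypothetical, and only the first 11 nucleotides of the top strand are used.

```python
# DNA base -> RNA base that pairs with it (Watson-Crick).
DNA_TO_PAIRED_RNA = {"A": "U", "C": "G", "G": "C", "T": "A"}


def rna_reverse_complement(dna):
    """Return the RNA (5'->3') that anneals antiparallel to a DNA strand (5'->3')."""
    return "".join(DNA_TO_PAIRED_RNA[base] for base in reversed(dna))


top_strand_5prime = "ACCTCCCACTA"   # first 11 nt of the top strand
biotin_rna = "UAGUGGGAGGU"          # RNA sequence quoted in the text

print(rna_reverse_complement(top_strand_5prime) == biotin_rna)
```

Under these assumptions the RNA pairs with the first 11 nucleotides of the top strand, which places the 3' biotin at the free end of the extranucleosomal DNA.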
SWI/SNF-NCP assembly
To assemble the SWI/SNF-NCP complex, we modified our approach of reconstituting Pol I/II/III pre-initiation complexes (PIC)47–49 and used the NCP to replace the nucleic acid scaffold. Specifically, 0.4 µl of the biotin-RNA-annealed NCP (0.552 µM, 1/10 of the storage concentration) was first mixed with 1 µl of the assembly buffer (12 mM HEPES pH 7.9, 0.12 mM EDTA, 12% glycerol, 8.25 mM MgCl2, 1 mM DTT, 2 mM ADP, 32 mM KF, 4 mM BeCl2 and 0.05% NP-40 [Roche]). Next, 1 µl of the concentrated SWI/SNF complex was added to this mixture and incubated at room temperature for 2 hours. Assembled complex was immobilized onto the magnetic streptavidin T1 beads (Invitrogen) which had been equilibrated with the assembly buffer plus 60 mM KCl and minus ADP-BeFx. Following washing of the beads two times using a wash buffer (10 mM HEPES, 10 mM Tris, pH 7.9, 5% glycerol, 5 mM MgCl2, 50 mM KCl, 1 mM DTT, 0.05% NP-40, 1 mM ADP, 16 mM KF, 2 mM BeCl2), the complex was eluted by incubating the beads at room temperature for 30 min with 3 µl digestion buffer containing 10 mM HEPES, pH 7.9, 10 mM MgCl2, 50 mM KCl, 1 mM DTT, 5% glycerol, 0.05% NP-40, 1 mM ADP, 16 mM KF, 2 mM BeCl2 and 0.05 unit/µl RNase H (New England Biolabs). Assembly of the SWI/SNF-NCP complex in the presence of ATPγS was performed essentially as described above, with 1 mM ATPγS (2 mM in the first assembly buffer) replacing ADP-BeFx in the buffers.
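The concentrations in the assembly reaction follow from simple dilution arithmetic. The sketch below assumes that volumes are additive and that the SWI/SNF stock contributes no ADP; the function name is hypothetical, and the numbers are the volumes and stock concentrations stated above.

```python
def final_concentration(stock_conc, stock_volume, total_volume):
    """Concentration after diluting stock_volume of a stock into total_volume."""
    return stock_conc * stock_volume / total_volume


# Volumes in µl, as stated in the text.
ncp_volume, buffer_volume, swisnf_volume = 0.4, 1.0, 1.0
total = ncp_volume + buffer_volume + swisnf_volume

ncp_final_uM = final_concentration(0.552, ncp_volume, total)   # NCP stock in µM
adp_final_mM = final_concentration(2.0, buffer_volume, total)  # ADP in assembly buffer, mM

print(f"total volume: {total:.1f} µl")
print(f"NCP: {ncp_final_uM * 1000:.0f} nM, ADP: {adp_final_mM:.2f} mM")
```

The estimate is only as good as the additive-volume assumption, which is usually acceptable at this scale but ignores pipetting error on sub-microliter volumes.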
Electron microscopy
The assembled SWI/SNF-NCP complex was first crosslinked using 0.05% glutaraldehyde under very low illumination conditions on ice for 5 min before being applied onto EM grids. Negative staining sample preparation and data collection were performed as previously described48. For cryo sample preparation, crosslinked complex (∼3.3 µl) was applied onto a 400 mesh Quantifoil grid containing 3.5 µm holes and 1 µm spacing (Quantifoil 3.5/1, Electron Microscopy Sciences). A thin carbon film was floated onto the grid before it was plasma cleaned for 10 s at 5 W power using a Solarus plasma cleaner (Gatan) equipped with air immediately before sample deposition. The sample was allowed to adsorb to the grid for 10 min at 4℃ and 100% humidity in a Vitrobot (FEI) under low illumination conditions, before being blotted for 4 s at 10 force and plunge-frozen in liquid ethane. The frozen grids were stored in liquid nitrogen until imaging.
Cryo-EM data collection was performed using a JEOL 3200FS transmission electron microscope (JEOL) equipped with a K2 Summit direct electron detector (Gatan) operating at 200kV (Extended Data Table 2). Data were collected using the K2 camera in counting mode at a nominal magnification of 30,000 × (1.12 Å per pixel). Movie series with defocus values ranging from −1.5 to −4.5 µm were collected using Leginon50. 40-frame exposures were taken at 0.3 s per frame (12 s total), using a dose rate of 8 e- per pixel per second, corresponding to a total dose of 76.5 e- Å-2 per movie series. Four datasets with a total number of 7,769 movies on the ADP-BeFx sample and four other datasets with a total number of 6,903 movies on the ATPγS sample were collected.
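The total dose quoted above can be recomputed from the dose rate, exposure time and pixel size. This is a minimal sketch using only the values stated in the text; the variable names are illustrative.

```python
dose_rate = 8.0        # electrons per pixel per second
frames = 40
frame_time = 0.3       # seconds per frame
pixel_size = 1.12      # Å per pixel

exposure = frames * frame_time                  # total exposure in seconds
dose_per_pixel = dose_rate * exposure           # electrons per pixel
total_dose = dose_per_pixel / pixel_size ** 2   # electrons per Å^2
dose_per_frame = total_dose / frames

print(f"exposure: {exposure:.1f} s")
print(f"total dose: {total_dose:.1f} e-/Å^2, per frame: {dose_per_frame:.2f} e-/Å^2")
```

The same pixel size gives 2.24 Å per pixel after 2-fold binning, which is the sampling used for particle picking in the next section.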
Image processing and three-dimensional reconstruction
Negative stain data pre-processing was performed using the Appion processing environment51. Particles were automatically selected from the micrographs using a difference of Gaussians (DoG) particle picker52. The contrast transfer function (CTF) of each micrograph was estimated using CTFFind453, the phases were flipped using CTFFind4, and particle stacks were extracted using a box size of 128 × 128 pixels. Two-dimensional classification was conducted using iterative multivariate statistical analysis and multi-reference alignment analysis (MSA-MRA) within the IMAGIC software54. Three-dimensional (3D) reconstruction of negative stained data was performed using an iterative multi-reference projection-matching approach containing libraries from the EMAN2 software package55. The initial 3D model was generated using cryoSPARC56.
Cryo-EM data were pre-processed as follows. Movie frames were aligned using MotionCor257 to correct for specimen motion. Particles were automatically selected from the aligned and dose-weighted micrographs using Gautomatch (developed by Zhang K, MRC Laboratory of Molecular Biology, Cambridge, UK) with 2-fold binning (corresponding to 2.24 Å/pixel). The CTF of each micrograph and of each particle was estimated using Gctf58. All three-dimensional (3D) classification and refinement steps together with postprocess and local resolution estimation were performed within RELION 3.059.
For the ADP-BeFx dataset, 891,573 particles were automatically picked and were subjected to an initial round of 3D classification with alignment using the density obtained from negative staining as the initial reference (Extended Data Fig. 2). The “Angular sampling interval”, “Offset search range (pix)” and “Offset search step (pix)” were set to 15 degrees, 10 and 2, respectively, for the first 50 iterations. Next, these values were set back to default (7.5 degrees, 5, 1) and the 3D classification was continued until convergence. This resulted in class 3 with 198,543 particles showing sharp structural features of SWI/SNF and nucleosome. This class was subsequently refined and further classified without alignment into 5 classes with a mask around the Arp module and the nucleosome (Extended Data Fig. 2b). Class 1 with 35,214 particles from this second round of classification showed best features of the nucleosome and was chosen to proceed with 3D auto-refinement, which yielded a structure of SWI/SNF-NCP at an overall resolution of 8.96Å (Extended Data Fig. 2c). All resolutions reported herein correspond to the gold-standard Fourier shell correlation (FSC) using the 0.143 criterion60. The ATPγS dataset with 820,117 particles was processed in a similar manner (Extended Data Fig. 3), resulting in a structure with an overall resolution of 10Å.
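The 0.143 criterion mentioned above amounts to finding where the FSC curve first drops below the threshold. The sketch below illustrates that step with linear interpolation between shells; it is not RELION's implementation, and the curve values are made up for illustration.

```python
def resolution_at_threshold(freqs, fsc, threshold=0.143):
    """Return the resolution (Å) at which the FSC first falls below threshold.

    freqs: spatial frequencies in 1/Å, increasing.
    fsc:   FSC value for each frequency shell.
    Returns None if the curve never crosses the threshold.
    """
    for i in range(1, len(fsc)):
        if fsc[i - 1] >= threshold > fsc[i]:
            # Linear interpolation between the two shells around the crossing.
            t = (fsc[i - 1] - threshold) / (fsc[i - 1] - fsc[i])
            crossing = freqs[i - 1] + t * (freqs[i] - freqs[i - 1])
            return 1.0 / crossing
    return None


# Hypothetical FSC curve: crosses 0.143 midway between 0.10 and 0.15 1/Å.
toy_freqs = [0.05, 0.10, 0.15]
toy_fsc = [0.90, 0.243, 0.043]
print(resolution_at_threshold(toy_freqs, toy_fsc))  # about 8 Å for these made-up numbers
```

In practice the curve comes from two independently refined half-maps, which is what makes the comparison a gold-standard FSC.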
To focus on the Body of SWI/SNF, we combined the particles from both samples after the first round of 3D classification with a total number of 390,573 particles (Extended Data Fig. 4). Next, signal subtraction on the nucleosome and the lower half of the Arp module was performed as previously described61, leaving the SWI/SNF Body module and the top half of the Arp module intact. Subsequently, a 3D classification was performed with only local alignment turned on. This resulted in class 5 with 61,518 particles showing the best structural features of the Body module (Extended Data Fig. 4b). Next, we unbinned and refined the original particle stack of this class, and generated masks around the Body module, the Arp module plus the ATPase density of Snf2, and the nucleosome (Extended Data Fig. 4b). 3D multi-body refinement62 was then performed on this class, which drastically improved the resolution of the Body module to 4.7Å (Extended Data Fig. 4c). The core region of the Body module has a resolution close to 4.3Å (Extended Data Fig. 4b), showing densities of bulky sidechains, which enabled us to partially build the structural model of the Body module (Fig. 1b). This body map replaced its corresponding region in the ADP-BeFx map to result in the composite map shown in Fig. 1a.
Model building
To aid in model building, we performed secondary structure prediction of the SWI/SNF subunits using the Genesilico Metaserver63. Sequence alignment of the conserved SWI/SNF subunits was performed using CLC Sequence Viewer 7 (Supplementary Figures 1-5). To build the structural model of the SWI/SNF Body module, we first performed rigid body docking of known structures into our 4.7 Å Body map. The rigid-body docking was performed in UCSF Chimera64, 65, which yielded a good fit of the following structures: the SNF5 Repeat domains (RPTs) and Swi3 SWIRM domains from the human BAF47/BAF155 complex (PDB ID 5GJK)29, the SANT domain of the yeast Swi3 (PDB ID 2YUS), and the SWIB domain of mouse BAF60a (PDB ID 1UHR). The BAF47/BAF155 heterodimer and the Swi3 SANT domain can be docked in the density map at two distinct locations, indicating two copies of these domains. Indeed, chemical crosslinking combined with mass spectrometry has shown at least two copies of Swi3 in the yeast SWI/SNF complex22, consistent with our docking experiment. Yeast Snf5 has been annotated with two SNF5 RPT domains in Pfam (http://pfam.xfam.org/protein/P18480). We observed clear density in our map that connects these two RPT domains. Next, we built homology models of these structures using Modeller66 and replaced the docked PDBs in the Body density map. Regions with missing or extra connecting density were then manually deleted or built in Coot67 based also on secondary structure predictions of these proteins.
The Snf2 Anchor domain was built manually in Coot. First, the Arp7/Arp9/Rtt102/HSA structure (PDB ID 4I6M)33 was rigid body docked into the full map, which helped in registering the HSA helix in the Body map. The HSA helix was then manually extended in Coot, with Y586 matching a sidechain density further confirming the register of this helix. Next, the Anchor domain was manually extended from the end of the HSA by following the connected density of the map. Again, secondary structure prediction was also used as a guide when extending the model in Coot. Bulky sidechain density at Y497, Y533 and W554 further confirmed the model.
The ARM repeat domain of Swi1 is located in the core region of the Body map with the highest local resolution, therefore enabling de novo model building. First, the helix density corresponding to residues 942-955 of Swi1 was chosen for modeling because it has the highest local resolution and contains a few bulky sidechain densities. Next, two α helices with poly-alanine sequence were generated in Coot, which allowed us to create a bulky residue (lysine, arginine, histidine, methionine, phenylalanine, tyrosine, and tryptophan) pattern along both directions. Subsequently, these patterns were used to search against the sequences of SWI/SNF subunits on the Sequence pattern search server (http://www-archbac.u-psud.fr/genomics/patternSearch.html), and Swi1 942-955 was one of the best hits. Further extension of this helix into connected density also matched the secondary structures of Swi1. Then, the remaining regions of the Swi1 ARM repeat domain were manually built into the density in Coot based on secondary structure prediction as well as bulky sidechain densities wherever possible. The overall architecture of the ARM repeat domain of Swi1 also matches that of the Armadillo repeat-containing protein β-catenin32 (Extended Data Fig. 8a), supporting our model of Swi1.
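The bulky-residue pattern search described above can be approximated with a regular expression. The sketch below is not the server used in the study: the pattern notation ("B" for a bulky residue, "x" for any residue), the function names and the toy sequences are all hypothetical. Because the helix direction is unknown at this stage, the pattern is searched in both orientations.

```python
import re

BULKY = "KRHMFYW"  # residue set listed in the text


def pattern_to_regex(pattern):
    """Translate a pattern of 'B' (bulky) and 'x' (any residue) into a regex string."""
    symbols = {"B": f"[{BULKY}]", "x": "."}
    return "".join(symbols[s] for s in pattern)


def search_pattern(pattern, sequences):
    """Return (name, 1-based start, matched segment) for every hit, overlaps included."""
    lookahead = re.compile(f"(?=({pattern_to_regex(pattern)}))")
    hits = []
    for name, seq in sequences.items():
        for match in lookahead.finditer(seq):
            hits.append((name, match.start() + 1, match.group(1)))
    return hits


def search_both_directions(pattern, sequences):
    """Search the pattern as read N->C and as read C->N along the helix density."""
    return {
        "forward": search_pattern(pattern, sequences),
        "reverse": search_pattern(pattern[::-1], sequences),
    }


# Toy sequences (hypothetical, not SWI/SNF subunits).
toy_sequences = {"toy_A": "AAKAAFAAAWAA", "toy_B": "GGGGSGGGG"}
print(search_both_directions("BxxBxxxB", toy_sequences))
```

A short pattern will match many places in real sequences, so hits of this kind are only candidates that still need to be checked against connected density and secondary structure prediction, as described above.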
The positioning of the SWIB domain of Snf12 aided us in building the remainder of this protein into the density. First, at the Spine tip, where the SWIB was docked, there is β-sheet like density (Extended Data Fig. 6c). This agrees with the secondary structure prediction of Snf12, which shows β-strands immediately N-terminal to the SWIB domain.
Although the resolution of this region is low, we are confident about its identity. Four long helices belonging to the Spine module directly connect to this region, two of which extend into the Snf12 densities. Therefore, we assigned these two helices to Snf12. This agreed well with the secondary structure prediction of Snf12, which shows that Snf12 contains two long helices. Next, we performed protein sequence pattern search based on the bulky sidechain densities. Based on the search results, we manually built the two helices of Snf12 in Coot.
Based on secondary structure prediction, we reasoned that the other two long helices belong to the Swi3 C-terminus. This is supported by the finding that the C-terminus of Swi3 contains a coiled-coil leucine zipper motif26 and by the presence of two copies of Swi3 in SWI/SNF. To facilitate the registering of the sequence in these long helices, we fitted the crystal structure of human OmoMYC homodimer (PDB ID 5I4Z)27 into the density and obtained a good fit (Extended Data Fig. 6a). Based on this fitting, we mapped the hydrophobic residues from Swi3 as indicated before26 and manually built the two helices in Coot. The rest of the Swi3 density could not be confidently modeled due to lower resolution and missing density, and was therefore modeled with poly-alanine.
Snf6 was also manually built in Coot based on secondary structure prediction, bulky sidechain density and prior knowledge based on chemical crosslinking and mass spectrometry data22 and site-directed DNA crosslinking experiments39. We could not confidently model Swp82; however, we were able to assign densities to this yeast-specific subunit based on crosslinking experiments22 and mapping by deletions and EM68. The N-terminal region of Swp82 forms an RSC7 homology domain, and we therefore speculate that it occupies the globular density near the Hinge; the C-terminal region crosslinks to both Snf5 and Swi3, and it was therefore assigned to the density next to Snf5 and Swi3. There are also several unassigned densities on the solvent-exposed surface of the complex. We did not identify Taf14 and Snf11 in the map (Extended Data Table 3).
The molecular model of the SWI/SNF Body module was then refined using Namdinator69 (Extended Data Table 2). To obtain the model for the full complex, we rigid-body fitted the Body, the Arp module (PDB ID 4I6M)33 and the ATPase-nucleosome bound with ADP-BeFx (PDB ID 5Z3V)15 into the map of the full complex in Coot. Then, the HSA helix was connected manually, and the DNA sequence was modified to match our sequence. The extra DNA was manually extended by 10 bp using B-form DNA in Coot. The figures were prepared using UCSF Chimera and ChimeraX70. Cα-Cα distances from crosslinked lysine pairs22 were measured in UCSF Chimera. For crosslinks involving the two molecules of Swi3, we picked the combination that gave the shortest distance as the measurement (Extended Data Table 1).
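The rule for crosslinks involving the two Swi3 copies can be expressed as a minimum over all copy combinations. The sketch below assumes the Cα positions of the crosslinked lysine are known for each copy; the function name and coordinates are hypothetical, and the actual measurements were made in UCSF Chimera.

```python
import math
from itertools import product


def shortest_crosslink_distance(candidates_a, candidates_b):
    """Shortest Cα-Cα distance (Å) over all combinations of candidate positions.

    candidates_a / candidates_b: lists of (x, y, z) positions for the same
    residue in each copy of a subunit (one entry for a single-copy subunit,
    two entries for Swi3 copies A and B).
    """
    return min(math.dist(a, b) for a, b in product(candidates_a, candidates_b))


# Toy example (hypothetical coordinates): a Swi3 lysine in copies A and B
# crosslinked to a lysine in a single-copy partner subunit.
swi3_lysine = [(0.0, 0.0, 0.0), (60.0, 0.0, 0.0)]
partner_lysine = [(48.0, 9.0, 0.0)]
print(shortest_crosslink_distance(swi3_lysine, partner_lysine))
```

Taking the minimum is a permissive choice, since the crosslinking data cannot tell which Swi3 copy was involved, so it gives each crosslink the benefit of the doubt.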
Data availability
Cryo-EM density maps have been deposited in the Electron Microscopy Data Bank (EMDB) under accession numbers EMD-XXXX (ADP-BeFx), EMD-XXXX (ATPγS), EMD-XXXX (body). Model coordinates have been deposited in the Protein Data Bank (PDB) under accession numbers XXXX (ADP-BeFx), XXXX (body).
Author contributions
Y Han and Y He conceived the project. Y Han performed most of the experiments and collected and analyzed cryo-EM data with Y He. AA Reyes and S Malik contributed to protein purification. Y Han built the models with help from Y He. Y Han and Y He wrote the manuscript, with input from all other authors.
Acknowledgement
We thank Dr. Jonathan Remis for assistance with microscope operation and data collection and Jason Pattie for computer support. We are grateful to Amy Rosenzweig, Ishwar Radhakrishnan and Susan Fishbain for helpful discussion and comments on the manuscript. We also thank the staff at the Structural Biology Facility (SBF) of Northwestern University for technical support. This work was supported by a Cornew Innovation Award from the Chemistry of Life Processes Institute at Northwestern University (to Y He), a Catalyst Award by the Chicago Biomedical Consortium with support from the Searle Funds at The Chicago Community Trust (to Y He), an Institutional Research Grant from the American Cancer Society (IRG-15-173-21 to Y He), and an H Foundation Core Facility Pilot Project Award (to Y He). Y He is supported by P01 CA092584 and U54CA193419 from NIH/NCI. Y Han is a recipient of the Chicago Biomedical Consortium Postdoctoral Research Grant.