Twist, Snail, and Sox9 form an allosterically regulated complex, the EMTosome, on a bipartite E-box site

Epithelial-mesenchymal transition (EMT) of primary tumor cells is a critical trans-differentiation event that contributes to dissemination and metastasis. The process of EMT is controlled by specific DNA-binding transcription factors (TFs) that reprogram the tumor transcriptome. In particular, the canonical EMT-TFs Twist and Snail can induce an EMT program when overexpressed in cancer cells, and both are found upregulated in metastatic cancers. Twist and Snail bind DNA directly by recognizing variants of the E-box sequence CANNTG. However, it is unclear how this binding is regulated. We have used a biochemical approach to dissect the DNA binding and protein-protein interactions that occur among these proteins. We find that Twist preferentially recognizes a dyad repeat of E-boxes that are not directly bound by Snail. Our data suggest that Twist uses its WR domain to recruit Snail into a binding complex through the Snail zinc-finger motifs. We analyzed Twist-Snail complexes in the breast carcinoma cell line SUM1315 and found evidence that they contain an additional protein partner, Sox9. Notably, we report that a native Twist complex can be displaced from its dyad binding site by consensus DNA binding sites for Snail and Sox9, even though these proteins do not contact the Twist dyad site. Taken together, our findings suggest that Snail and Sox9 interact with Twist to regulate its DNA binding ability via protein-protein interactions, thereby allosterically regulating Twist DNA binding. We designate this ternary complex the EMTosome. These results may inform efforts to therapeutically target the EMT program in order to inhibit cancer metastasis.

healing. Cancer pathobiology often includes reactivation of this program for the purpose of metastasis, where it provides the motility necessary to leave the primary tumor and enter the circulation for distant spread, as well as therapeutic resistance (2). EMT is regulated at the transcriptional level by a series of proteins considered master regulators of EMT. The major proteins involved are Twist, Snail, Slug, and Zeb (3). Further understanding of EMT is critical to our understanding of cancer and metastasis, as these proteins have been shown to be indicators of poor outcomes in cancer treatment (4).

Twist is a transcriptional activator and a type II basic helix-loop-helix (bHLH) protein that requires an obligate binding partner in order to regulate transcription (5). This binding partner is typically E12 or E47, both encoded by E2A/TCF3, with which it forms a dimer of heterodimers, or a tetramer (6). These binding partners are type I bHLH proteins, and all are part of the HLH protein superfamily (5). We have previously discovered that the optimal binding site that Twist occupies in cells in order to regulate transcription is a dyad E-box motif (CANNTGNNNNNCANNTG) with the two E-boxes spaced by 5 nucleotides, such as the one in the Alpha-2-macroglobulin (A2M) gene (6). The WR domain of Twist dimerizes Twist, thereby forming a tetramer of Twist-E47 molecules on the dyad site. The WR domain also serves as a platform for protein-protein interactions (PPIs) which occur because of the tetramer binding DNA. Snail is a transcriptional repressor and a zinc finger protein, also a master regulator of EMT, that binds to subsets of E-box sites which appear to be different from those for Twist: the specificity to differentiate between E-boxes is determined by the central NN nucleotides (CANNTG) (6) and partially by flanking sequences. A classic, canonical DNA binding site for Snail is located in the E-cadherin gene (E-cad) (6).
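
As an illustration of the two motif definitions above, the following minimal Python sketch scans a DNA string for single E-boxes (CANNTG) and for dyad E-boxes with a 5-nucleotide spacer (CANNTGNNNNNCANNTG). The function name `find_sites` and the example sequence are hypothetical and chosen only for illustration; the sequence is not the actual A2M or E-cadherin site, and the sketch scans one strand only and ignores the flanking-sequence preferences mentioned in the text.

```python
import re

# Single E-box: CA, any two bases, TG (CANNTG).
E_BOX = "CA[ACGT]{2}TG"

# Lookahead patterns so that overlapping sites are all reported.
SINGLE = re.compile(f"(?=({E_BOX}))")
# Dyad E-box: two E-boxes separated by exactly five bases.
DYAD = re.compile(f"(?=({E_BOX}[ACGT]{{5}}{E_BOX}))")


def find_sites(seq, pattern):
    """Return (0-based start, matched sequence) for every site on the given strand."""
    seq = seq.upper()
    return [(m.start(), m.group(1)) for m in pattern.finditer(seq)]


# Hypothetical sequence for illustration only (not a real promoter fragment).
example = "GGCATATGACGTACACCTGTT"

print(find_sites(example, SINGLE))
print(find_sites(example, DYAD))
```

Changing the spacer length in `DYAD` (the `{{5}}` quantifier) is the only edit needed to test other spacings.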

The role of Snail in metastasis, tumor progression, and endowing a stem-like phenotype as part of its role

made by harvesting and lysing cells in a 10 cm dish (80-90% confluency) in 100 μl of Tween 20 Lysis Buffer

To begin to assess the protein-DNA complexes associated with Twist, we performed electrophoretic mobility shift assays (EMSAs) using a canonical Twist dyad binding site from the A2M promoter and nuclear extracts from the breast carcinoma cell line SUM1315, which is well-characterized for studying EMT (16).

Additionally, we supplemented these analyses using purified Twist, Snail, E47 and Sox9 proteins and derivatives, as well as their consensus DNA binding sites, as depicted in Figure 1A-C. Whole cell lysates and nuclear extracts from SUM1315 cells were confirmed to contain Twist, E47, Snail, and Sox9, as well as the other EMT markers Slug and Vimentin (Sup Fig 1B). We used EMSA with a 32P-radiolabeled A2M DNA probe and antibodies (Sup Fig 1C) to verify the existence of a complex and the presence of these proteins in that complex (Fig 1D). Nuclear extracts from SUM1315 cells produced a consistent and reproducible complex with the A2M DNA probe (Fig 1D, Lane 2). This primary complex shows altered mobility when Twist antibody is added (Fig 1D, Lane 3), verifying the presence of Twist protein within this complex, as expected. Surprisingly, the integrity of this primary complex is greatly reduced by addition of Snail or Sox9 antibodies, suggesting that both Snail and Sox9 are additional binding partners in this multiprotein complex (Fig 1D, Lanes 4 and 5). For clarity, we denote this protein-DNA complex the "EMTosome".

We next supplemented these analyses using purified proteins from E. coli such that their native

(Twist/bHLH+WR) and 9 kDa (E47/bHLH), but they elute from the column at ~44 kDa and larger, suggesting either extended, non-globular secondary structure, or tetramer formation. Using these highly pure proteins, we tested the ability of Snail to bind to Twist and Sox9. Utilizing a GST-Snail-ZF1-4 fusion protein supplemented with nuclear extracts from SUM1315 cells, we detected associations between Snail-ZF1-4 and both Twist and Sox9, reproducibly and under harsh washing conditions (Fig 2A). The GST-Twist-WR domain shows association with Twist and Snail from lysates, suggesting that the WR domain of Twist is necessary for interactions with endogenous Twist and Snail (Fig 2A, right panel). Notably, a GST-Sox9/DIM+HMG fusion protein was also able to associate with Snail/ZF1-4 recombinant protein, providing evidence that the interaction between Snail and Sox9 may be direct (Fig 2B).

we found that the dyad E-box is required, as mutating either of the E-box binding sites abolished the interaction between Twist, Snail and Sox9 in this complex (Fig 2C, left panel). Similar data were obtained in Cos-1 cells transfected with all four proteins (Fig 2C, right panel). We also found that the ability of the WR domain of Twist to interact with Flag-purified Snail was facilitated by the addition of the DNA binding site for Snail from the E-cadherin promoter (E-cad, Fig. 2D

To further characterize the DNA binding properties of the EMTosome, we analyzed the effects of competitor DNA (Fig 3A) on the endogenous complex in SUM1315 nuclear extracts using EMSA under both equilibrium and non-equilibrium conditions, mixing the radiolabeled A2M DNA binding site with an unlabeled probe (A2M, E-cad or Sox9) at a 20x concentration. Surprisingly, we observed that binding of the SUM1315 EMTosome is eliminated by any of these three sites (Fig 3A, Lane 3) when both hot and cold probes are mixed prior to protein addition. In contrast, when the EMTosome is first bound to radiolabeled A2M DNA, even the homologous cold A2M probe was no longer capable of eliminating binding (Fig 3A, Lane 6), suggesting a low rate of dissociation of the complex once bound to DNA. Cold probes for the E-cad and Sox9 binding sites, when added after EMTosome formation, also had modest ability to diminish complex formation (Fig 3A, Lanes 7 and 8).
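
To make the expectation for the pre-mixed (equilibrium) competition concrete, the short Python sketch below computes the share of protein-bound probe that is radiolabeled when hot and cold probes are mixed before protein is added. It is a simplified textbook model and not the authors' analysis: it assumes that protein is limiting, that binding has reached equilibrium, and that both probes compete for the same binding surface. The function name `labeled_share` and the `relative_affinity` parameter are hypothetical. Under these assumptions a 20x excess of homologous cold A2M probe leaves about 1/21 (roughly 5%) of the bound complex labeled, which is why loss of signal with cold A2M is expected, whereas loss of signal with the non-homologous E-cad or Sox9 sites is not predicted by simple competition for the dyad site.

```python
def labeled_share(cold_fold_excess, relative_affinity=1.0):
    """Fraction of bound probe that is radiolabeled at equilibrium.

    cold_fold_excess: molar excess of unlabeled over labeled probe (e.g. 20 for 20x).
    relative_affinity: affinity of the cold probe relative to the labeled probe
        (1.0 for a homologous competitor; an assumed value otherwise).
    Assumes limiting protein and direct competition for the same binding surface.
    """
    return 1.0 / (1.0 + cold_fold_excess * relative_affinity)


# Homologous cold A2M probe at 20x excess, mixed with hot probe before protein addition.
print(round(labeled_share(20), 3))
```

This model does not describe the order-of-addition experiment, where the outcome depends on how quickly the pre-formed complex dissociates rather than on the equilibrium ratio.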

To confirm and extend these findings, we performed chromatin immunoprecipitation in cells expressing the core EMTosome components Twist, Snail, E47 and Sox9. We found that antibodies to Twist, Snail and Sox9 were all capable of immunoprecipitating a canonical Twist binding target, the dyad E-box site in the promoter region of the A2M gene, and a canonical Snail target, the single E-box site in the promoter region of the E-cadherin gene (Fig. 3B). These data support the ability of these three proteins to occupy the appropriate site in mammalian cells.

Twist and E47) showed strong specificity for their own canonical site but did not bind to sites recognized by the other components (Fig. 3C).

Sox9, we revisited this for specificity, prompted by the observation that Snail purification from transfected Cos-1 cells is greatly aided by the addition of the Snail E-cad binding site (Fig 2D). We found that this effect is specific for the Snail E-cad DNA binding site, as it was not observed using DNA oligos for the A2M dyad site or the SZF zinc finger protein consensus binding site (a site unrelated to the E-box) (Fig 4C). A similar effect of DNA binding is also evident upon Sox9 purification (Fig 4D)

and Sox9 (Fig 2C). With a triple transfection of Flag-Twist, HA-Snail, and Myc-E47, we observe that both Twist and E47 copurify easily in a DNA affinity procedure using a biotinylated A2M site (Fig. 5A). We also observe that Snail is capable of copurifying with Twist when Flag-Twist, HA-Snail, and Myc-E47 are triple transfected into cells, but that this copurification is eliminated on addition of the A2M DNA oligo, suggesting that complex formation may require both Sox9 and DNA binding for stability (Fig 5A). When using Cos-1 cell extracts containing HA-Twist, Myc-E47, Sox9-C-Flag, and Snail-C-Flag proteins, we observe that Twist and E47 are capable of copurifying with Sox9 on a Myc-tag antibody resin (E47). Curiously, under the same conditions HA-Twist is very poorly recovered with HA resin, and no recovery of Snail or Sox9 is seen (Fig 5B). These data suggest that, while we can recover this complex utilizing a DNA binding site of one element of the complex (Fig 2C), the complex cannot be fully recovered in the absence of DNA binding, suggesting that the DNA binding process is not fully separable from complex formation.

To further characterize the EMTosome reconstituted in cells, EMSAs with antibody supershifts were performed using nuclear extract proteins from quadruple-transfected cells (Fig 5C). By EMSA, the major complex co-migrates with the endogenous complex repeatedly observed in extracts derived from SUM1315 cells.

A marked supershift of this complex is evident when Twist and E47 antibodies are added, but surprisingly there is no effect of Sox9 antibodies. There is a modest but reproducible increase in the abundance of the major complex with the addition of Sox9 to the transfections (Fig 5C, Lane 2 compared to Lane 6), while the mobility of the complex remains overall similar to that of the SUM1315 complex (Fig 5C, Lane 10).

Addition of cold competitor A2M and E-cad DNA (cold and hot DNAs mixed prior to protein addition) showed that they robustly abrogate binding of the complex to the A2M DNA binding site. However, addition of the unlabeled Sox9 DNA binding site has a greatly reduced impact on the binding of the complex to the A2M DNA binding site (Fig 5D, Lanes 3-5). Altogether, these data suggest that we only partially