Finding reliable phenotypes and detecting artefacts among in vivo and in vitro assays to characterize the refractory transcriptional activator Sxy (TfoX) in Escherichia coli

The Sxy (TfoX) protein is required for expression of a distinct subset of the genes regulated by the cAMP receptor protein (CRP) in the model organisms Escherichia coli, Haemophilus influenzae, and Vibrio cholerae. Genetic studies have established that CRP and Sxy co-activate transcription at gene promoters containing DNA binding sites called CRP-S sites. In contrast, CRP acts without Sxy at gene promoters containing canonical CRP-N sites, suggesting that Sxy makes physical contacts with CRP and/or DNA to assist in transcriptional activation at CRP-S promoters. Despite growing interest in Sxy’s activity as a transcription factor, Sxy remains poorly characterized due to a lack of reliable phenotypes in E. coli. Experiments are further hampered by growth inhibition and formation of inclusion bodies when Sxy is overexpressed. In this study we applied diverse phenotypic and molecular assays to test for postulated Sxy functions and interactions. Mutations in conserved regions of Sxy and truncations in the Sxy C-terminus abolish transcriptional activation of a CRP-S promoter, and a 37 amino acid truncation of the C-terminus relieves the growth inhibition normally caused by Sxy overexpression. Sxy was unable to augment weakened CRP interactions to restore carbon metabolism phenotypes. Bandshift analysis and chromatin pull-down assays of Sxy-CRP-DNA interactions yielded intriguing evidence of CRP-Sxy and Sxy-DNA physical interactions. However, despite the careful application of standard protein purification protocols and quality control steps for nickel affinity column purification, protein mass spectrometry revealed the enrichment of additional DNA-binding proteins in nickel column eluates, presenting a probable source of artefactual protein-protein and protein-DNA interaction results. These findings highlight the importance of extensive controls and phenotypic assays for the study of poorly characterized and recalcitrant proteins like Sxy.

Introduction
[…] [1,5,7,9,10,18,22,24-27], yet Sxy's mode of action remains unknown. Unfortunately, in vitro characterization of Sxy has been hampered by the toxicity of overexpressed sxy and the requirement for strong denaturants to solubilise Sxy inclusion bodies [7,15,25]. We originally proposed that Sxy's role might be to bind to A+T sequences upstream of H. influenzae CRP-S sites [22], but similar sequences were not detected upstream of E. coli CRP-S sites [7]. Previous studies from our laboratories indicated that CRP and Sxy have conserved functions in E. coli and H. influenzae. First, we found that EcCRP can activate competence gene expression in a H. influenzae Δcrp mutant, and that this complementation absolutely requires Sxy [9,28]. Second, reciprocal experiments demonstrated that CRP-S promoter activity is higher when cognate pairs of EcCRP/EcSxy and HiCRP/HiSxy were co-expressed than when proteins from different species were paired [7]. These findings suggest that CRP and Sxy work best with their co-evolved protein partner, supporting a model in which these proteins physically interact in vivo.

Diverse mutations in Sxy prevent CRP-S promoter activity and relieve growth inhibition
To identify amino acids and domains required for Sxy's conserved function as an activator of competence gene expression in the Enterobacteriaceae, Pasteurellaceae, and Vibrionaceae, we aligned the Sxy protein sequence from a representative member of each genus. This identified conserved amino acids potentially important for Sxy function (Fig. 1A). All three Sxy proteins have similar lengths (209, 215, and 199 amino acids, respectively), but amino acid sequence identity was low (21-26%) and distributed throughout their lengths. Each was also predicted to contain both the TfoX-N and TfoX-C domains (domains are illustrated in Fig. 1A). To test whether either predicted domain is sufficient for transcriptional activation of a CRP-S regulated gene, we deleted the N-terminal half of EcSxy to amino acid 102 (EcSxyHIS-∆Nter) and the C-terminal half from position 108 (EcSxyHIS-∆Cter) (Fig. 1A). We also tested the requirement for the largest block of conserved amino acids in all three genera by deleting the eight amino acids between positions 118 and 125, creating mutant EcSxyHIS-mut1. Alanine (Ala) at position 174 is conserved in 98% of the 1,306 full-length Sxy orthologs annotated at EMBL InterPro; we converted alanine 174 to threonine, creating EcSxyHIS-mut2 (Fig. 1A).

To assess how each mutation in EcSxy impacted transcription activation, we measured transcriptional activity at the pilA (ppdD) CRP-S promoter and the mglB CRP-N promoter. Expression of pilA was only detected when wildtype EcSxyHIS was overexpressed, whereas none of the Sxy mutant variants induced pilA (ppdD) expression (Fig. 1B). Conversely, expression of mglB was unaffected by overexpression of wildtype Sxy or Sxy variants (Fig. 1B). Altogether, the TfoX-N and TfoX-C domains, amino acids 118-125, and Ala174 were all critical for transcriptional activation of pilA.
These various mutations in Sxy may prevent pilA expression because transcription activation requires Sxy-CRP interactions that are lost in the mutant proteins, and/or because the mutations abolish an independent function for Sxy at the pilA promoter.

Another phenotype associated with Sxy arises when sxy is overexpressed in E. coli. Overexpression of cloned sxy causes growth inhibition and induction of the RpoH stress response, and this growth inhibition phenotype manifests even at low concentrations of the inducer, IPTG [5,7]. We systematically truncated the C-terminus of histidine-tagged Sxy (EcSxyHIS) and tested for the growth inhibition phenotype. Sxy inhibition of growth was alleviated by all three C-terminal truncations, EcSxyHIS-R1, EcSxyHIS-R2, and EcSxyHIS-R3 (truncation lengths are indicated in Fig. 1A, and growth is illustrated in Fig. 1C). IPTG-induced expression of the truncated proteins was confirmed by mass spectrometry (described below), indicating that the full-length C-terminus is required for growth inhibition. When not induced by IPTG, cells were unaffected by cloned wildtype and mutant sxy genes (S1 Fig. B […] was required for metabolism of maltose, mannitol, xylose and glycerol, but not fructose or galactose (S1 Table, rows 1 and 2), and exogenous expression of EcCRPHIS from plasmid pEccrp fully restored wildtype metabolism to a ∆crp mutant (S1 Table, row 3). In contrast, complementation by HiCRPHIS was partial, restoring only xylose metabolism (S1 Table, row 4), suggesting that HiCRPHIS's lower affinity for DNA prevents it from activating all E. coli CRP-N promoters. We tested this by examining whether HiSxyHIS could restore maltose, glycerol, and mannitol fermentation by stabilizing HiCRPHIS-DNA interactions. Although co-expression of HiCRPHIS and HiSxyHIS restores CRP-S promoter function in E. coli [7], the same co-expression did not restore fermentation of maltose, glycerol, and mannitol in our phenotypic assays (S1 Table, rows 5 and 6). Further, EcCRP activity at CRP-N promoters was unaffected by expression of either EcSxyHIS or HiSxyHIS (S1 Table, rows 7 and 8), suggesting that Sxy cannot strengthen the binding of CRP to weak CRP-N sites enough to enhance transcription.
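
The pairwise identities quoted for the Sxy alignment (21-26%) depend on how identity is counted. The following is a minimal Python sketch, assuming identity is scored over alignment columns in which both sequences carry a residue; the source does not state which denominator was used. The function name and the toy sequences are hypothetical and are not Sxy sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two rows of a pairwise alignment.

    Assumption (not from the source): columns containing a gap in either
    sequence are excluded from both the numerator and the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0   # columns where both sequences have a residue
    identical = 0  # compared columns where the residues match
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        raise ValueError("no columns with residues in both sequences")
    return 100.0 * identical / compared


# Toy aligned sequences (hypothetical, for illustration only).
toy_a = "MKT-AILV"
toy_b = "MRTQA-LV"
print(f"{percent_identity(toy_a, toy_b):.1f}% identity")
```

Other conventions divide by the full alignment length or by the shorter sequence length, which lowers the reported value, so the convention should be stated alongside any identity figure.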

Protein-DNA crosslinking in vivo detects non-specific DNA binding
We next sought to test whether Sxy-DNA interactions occur in vivo using a chromatin affinity precipitation assay. In principle, formaldehyde can crosslink EcSxyHIS with DNA and proteins to which it is bound in vivo. Isolation of EcSxyHIS on a nickel affinity column will then co-purify any DNA bound and crosslinked to EcSxyHIS. To test whether Sxy physically interacts with DNA in vivo, the cloned EcsxyHIS gene was overexpressed in E. coli, formaldehyde was added to cell cultures to crosslink interacting DNA and protein molecules, and cells were lysed by sonication and fractionated by centrifugation. The cytoplasmic (soluble) fraction was then incubated with nickel-agarose resin (Ni-NTA) to bind EcSxyHIS and any crosslinked DNA. After extensive washing, EcSxyHIS was eluted with imidazole, formaldehyde crosslinking was reversed, and eluted DNA was quantified by PCR.

We hypothesized that Sxy binds specifically to CRP-S sites in living cells, which we could detect as an enrichment of CRP-S-containing loci compared to non-CRP-S-containing DNA in elution fractions. We quantified the levels of three unlinked chromosomal genes: two CRP-S regulated genes, pilA (ppdD) and comM, and a negative control non-CRP regulated gene, hns. In each fraction, all three genes were eluted in equal quantities (Fig. 2

The different affinities of EcCRPHIS and HiCRPHIS for DNA allowed us to test whether E. coli or H. influenzae Sxy could enhance binding of their cognate CRP to pilA or pilA-N DNA. Enhanced binding was predicted to manifest as an increase in the amount of pilA DNA shifted by EcCRP, and to create a detectable shift of pilA DNA when bound by HiCRP. Additionally, we predicted that simultaneous binding of CRP and Sxy to DNA would create a super-shift; in other words, a CRP-Sxy-DNA complex is expected to migrate more slowly during electrophoresis than the smaller CRP-DNA complex. However, a range of Sxy concentrations had no detectable effect on either EcCRPHIS or HiCRPHIS binding to pilA or pilA-N DNA (Fig. 3B, C, E, F, lanes 3-7).

We next tested whether EcSxyHIS or HiSxyHIS alone can bind DNA. Surprisingly, at 4 and 40 nM of EcSxyHIS, a small amount of pilA-N DNA was shifted, producing a new band at the same position as CRP-DNA binding (Fig. 3C, lanes 11 and 12). The shift was dependent on CRP's allosteric effector cAMP, confirming the presence of CRP in the binding reactions (Fig. 3E, compare lanes 2 and 4). No binding was detected when the experiment was repeated using EcSxyHIS purified from a ∆crp background, confirming that the shift arose from contaminating CRP in the EcSxyHIS preparation (Fig. 3D, lane 6). Comparison with lane 11 indicates that CRP contamination at <100 pM concentration would be sufficient to account for the observed shift (Fig. 3C, lanes 2 and 12). With HiSxyHIS and pilA-N bait DNA, increasing protein concentrations correlated with an increase in the amount of DNA retained in the gel wells (Fig. 3F, lanes 8-12).
A specific interaction between Sxy and a DNA binding site was expected to yield approximately the same sized bandshift as CRP-DNA binding, but no such band was detected, suggesting HiSxyHIS was interacting non-specifically with pilA-N DNA to retain it in the gel wells.

To test whether Sxy-DNA binding may in fact occur but is too weak to persist under electrophoretic conditions, we used formaldehyde to cross-link proteins bound to DNA in vitro before electrophoresis. A 1:1 mix of EcCRPHIS:EcSxyHIS resulted in a shift corresponding in size to a CRP-DNA complex (Fig. 3B and C, lane 13). In contrast, with the H. influenzae proteins, most bait DNA was retained in the wells of the gels, indicating the formation of large protein-DNA complexes (Fig. 3E and F, lanes 13 and 14). This super-shift also occurred in lane 14, which contained only Sxy without CRP, supporting the hypothesis that DNA-Sxy interactions exist and indicating that Sxy-DNA interactions could be stabilized with formaldehyde.

We also tested the hypothesis that CRP and Sxy may need to form protein-protein contacts prior to binding DNA. However, pre-mixing the proteins from each species before adding bait DNA had no effect on DNA binding (S3 Fig., lanes 4 and 10). Similarly, combining either CRP or Sxy alone with bait DNA before addition of the second protein also had no effect (S3 Fig., […] shift pilA DNA (the highest Sxy concentration used in the above-mentioned experiment (Fig. 3) was 2 nM). Purified EcSxyHIS proteins (10 mM) shifted 17% of DNA (Fig. 4A, lane 3), and higher EcSxyHIS concentrations resulted in stronger DNA shifts (40%, 64%, and 74%) (Fig. 4A, lanes 4-6). The discrete bandshift suggested a specific protein-DNA interaction. Further, an excess of non-specific competitor DNA (poly dI-dC) caused only a minor reduction in the observed protein-DNA interaction (Fig. 4A, lane 2).
Titrating salmon sperm DNA as a non-specific competitor revealed that once competitor DNA surpassed the concentration of bait DNA […] although a discrete bandshift was formed by adding EcSxyHIS extract to bait DNA, protein-DNA binding was low affinity and/or low specificity. Adding high concentrations of EcCRPHIS to EcSxyHIS reactions produced an additive effect resulting in protein-DNA supershifts (S6 Fig., lanes 3, 4, 8, and 9). The retention of bait DNA in wells is consistent with CRP proteins binding non-specifically to bait DNA. Non-specific protein-DNA interactions were confirmed, as competitor DNA effectively liberated DNA from wells (S6 Fig., lanes 5 and 6).

The results above indicate that EcSxyHIS and HiSxyHIS extracts contain non-specific DNA binding activity that can be detected by bandshift assays either by stabilizing protein-DNA interactions with formaldehyde (Fig. 3) or by addition of high concentrations of protein extract (Fig. 4 and S6 Fig.).
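
Percentages of shifted DNA, such as those reported for the bandshift lanes, are commonly derived from band intensities within a single lane. The sketch below assumes that approach; the source does not describe how its percentages were calculated, and the function name and intensity values are hypothetical.

```python
def percent_shifted(free_intensity: float, shifted_intensities: list[float]) -> float:
    """Percent of bait DNA in shifted species within one gel lane.

    free_intensity: background-corrected signal of the unbound (free) DNA band.
    shifted_intensities: background-corrected signals of every slower-migrating
        species in the same lane (discrete shifted bands, and DNA retained in
        the well if that is to be counted as bound).
    """
    if free_intensity < 0 or any(v < 0 for v in shifted_intensities):
        raise ValueError("intensities must be non-negative")
    shifted = sum(shifted_intensities)
    total = free_intensity + shifted
    if total == 0:
        raise ValueError("lane contains no signal")
    return 100.0 * shifted / total


# Hypothetical densitometry values for one lane (arbitrary units).
lane_free = 8300.0
lane_shifted = [1700.0]
print(f"{percent_shifted(lane_free, lane_shifted):.0f}% of bait DNA shifted")
```

Normalizing within each lane makes the value independent of loading differences between lanes, but it cannot distinguish a specific complex from non-specific retention in the well; that distinction still relies on competitor DNA and on controls such as protein purified from a ∆crp background.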

Identification of other DNA binding proteins in Sxy extracts
Previous studies of protein purification by immobilized metal affinity chromatography found that SlyD and CRP from native E. coli extracts have a high affinity for nickel affinity columns, and that these contaminating proteins elute at high imidazole concentrations along with the desired histidine-tagged protein [31,32]. The SlyD protein chaperone contains a metal binding domain with high affinity for Zn2+ and Ni2+ ions, making it an expected contaminant of metal affinity chromatography. We used mass spectrometry to examine the degree to which EcSxyHIS was enriched by nickel affinity column purification, which also allowed us to identify and quantify cytoplasmic proteins co-purifying with EcSxyHIS.

We first established that EcSxyHIS was not detected in the cytoplasm of E. coli ∆crp cells before IPTG induction, but became the 4th most abundant protein in the cytoplasm after induction (S7 Fig.). After nickel column purification, EcSxyHIS represented almost half of the eluted protein (40%), but was surpassed by SlyD (43%) (Fig. 5A). In the eluate from wildtype cultures, EcSxyHIS was only the 4th most abundant protein (6% of eluted protein) (Fig. 5B). EcSxyHIS-R3 represented only 3% of eluted protein when enriched from wildtype cells (Fig. 5C). The DNA-binding proteins Fur and CRP were the 2nd and 3rd most abundant proteins in eluates from wildtype cells

The present work used protein alignments, directed mutations, and phenotypic assays to confirm that Sxy's predicted domains and conserved amino acids are essential for transcriptional activation of a CRP-S promoter in E. coli. Growth assays confirmed that growth inhibition by Sxy requires a full-length C-terminus, which is predicted to encode a helix-hairpin-helix DNA binding domain. All experiments described in this work (both in vivo and in vitro) used histidine-tagged Sxy proteins, and multiple lines of evidence suggest that the histidine tag does not negatively impact Sxy function. For example, overexpression of either wildtype or histidine-tagged Sxy inhibits E. coli growth, and both histidine-tagged Sxy and CRP are strong activators of gene expression from CRP-S promoters (Fig. 1B and [7,22]). Having confirmed these phenotypes resulting from overexpressing histidine-tagged sxy genes, we tested the hypothesis that Sxy enhances weak CRP-DNA interactions using HiCRP's inability to fully restore sugar metabolism in an E. coli ∆crp mutant (S1 Table). However, the observed negative results are inconclusive because they may represent an insurmountable inability of HiCRP to act at all E. coli CRP-N promoters. Alternatively, the negative results in the sugar metabolism assays could reflect that CRP-N promoters lack a specific, but as yet undetected, feature required for Sxy activity. […] Table).

Quantitative PCR
Quantitative PCR was performed as described in [1], where RNA was extracted using […] templates were synthesized using the iScript cDNA synthesis kit (BioRad). qPCR was carried out using SYBR Green Supermix (BioRad) with the primers listed in S2 Table.

Sxy growth inhibition assays
E. coli W3110 containing either plasmid pEcsxy (WT), pEcsxy-R1, pEcsxy-R2, or pEcsxy-R3 was cultured in 40 ml LB with 35 µg/ml chloramphenicol in a shaking incubator at 37 °C. At an OD600 of 0.2, the culture was separated into two flasks: one induced with 1 mM IPTG and one non-induced control culture. Growth was monitored for 4 hours after IPTG induction by measuring light absorbance (OD600).
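
One way to put a number on growth inhibition in an assay of this design is to compare doubling times of the induced and non-induced cultures. The sketch below is an illustrative analysis that the source does not describe: it fits a straight line to ln(OD600) against time, which assumes the readings fall within exponential growth. The function name and all OD600 readings are hypothetical.

```python
import math


def doubling_time_hours(times_h: list[float], od600: list[float]) -> float:
    """Doubling time from a least-squares fit of ln(OD600) versus time.

    Returns math.inf when the fitted growth rate is zero or negative.
    Assumes the readings come from the exponential phase of growth.
    """
    if len(times_h) != len(od600) or len(times_h) < 2:
        raise ValueError("need at least two paired time and OD600 readings")
    log_od = [math.log(value) for value in od600]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_y = sum(log_od) / n
    var_t = sum((t - mean_t) ** 2 for t in times_h)
    if var_t == 0:
        raise ValueError("time points must not all be identical")
    cov_ty = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, log_od))
    growth_rate = cov_ty / var_t  # per hour
    if growth_rate <= 0:
        return math.inf
    return math.log(2) / growth_rate


# Hypothetical hourly readings over 4 hours, starting at OD600 = 0.2.
hours = [0.0, 1.0, 2.0, 3.0, 4.0]
non_induced = [0.20, 0.40, 0.80, 1.60, 3.20]
induced = [0.20, 0.24, 0.28, 0.33, 0.40]
print("non-induced doubling time (h):", round(doubling_time_hours(hours, non_induced), 2))
print("induced doubling time (h):", round(doubling_time_hours(hours, induced), 2))
```

In practice, dense cultures leave exponential phase and OD600 readings become non-linear at high density, so the fit should be restricted to the time window in which ln(OD600) is linear.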

Protein purification
The histidine-tagged CRP proteins (25 kDa) are described in Cameron and Redfield [22]. The histidine-tagged Sxy proteins (26 kDa) were purified under the same conditions after 2.5 […] conc.) was added to 225 µl of each elution fraction and held at 65 °C for 6 hours. Then, 9 µl salmon sperm DNA (5 mg/ml) was added to each sample just before adding 500 µl phenol:chloroform:isoamyl alcohol (25:24:1). The samples were vortexed and centrifuged at 13,000 × g for 5 minutes at room temperature. The aqueous (top) layer was collected in a fresh 1.5 ml microfuge tube and 500 µl of chloroform was added to each sample. The samples were vortexed and centrifuged at 13,000 × g for 5 minutes at room temperature. The aqueous layer was transferred to a fresh 2.0 ml microfuge tube. 5 µg of GlycoBlue (5 mg/ml), 1 µl of salmon sperm DNA (5 mg/ml, Invitrogen) and 50 µl of 3 M NaAc (pH 5.2) were added to each sample and mixed well. The DNA was precipitated with 1375 µl of 100% ethanol and incubated at -70 °C for 30 minutes (or -20 °C overnight). The samples were centrifuged at 13,000 × g for 20 minutes at 4 °C. The DNA pellets were washed with 500 µl of ice-cold 70% ethanol and air-dried for 10-15 minutes. The DNA pellets were re-suspended in 50 µl of sterile filtered HPLC water. qPCR analysis was performed for each DNA fraction (four elution fractions).
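
Enrichment of a CRP-S locus in an elution fraction can be expressed relative to the hns negative control from threshold cycle (Ct) values. The sketch below assumes the common exponential model with equal amplification efficiency for both primer pairs; the source does not specify its quantification model. The function name, the default efficiency, and the Ct values are hypothetical.

```python
def relative_abundance(ct_target: float, ct_reference: float,
                       efficiency: float = 2.0) -> float:
    """Abundance of a target locus relative to a reference locus.

    Assumption (not from the source): both primer pairs amplify with the same
    per-cycle efficiency, where 2.0 means perfect doubling each cycle.
    A value near 1 means no enrichment of the target over the reference.
    """
    if efficiency <= 1.0:
        raise ValueError("efficiency must be greater than 1")
    return efficiency ** (ct_reference - ct_target)


# Hypothetical Ct values for one elution fraction.
ct_values = {"pilA": 24.1, "comM": 24.0, "hns": 24.0}
for gene in ("pilA", "comM"):
    ratio = relative_abundance(ct_values[gene], ct_values["hns"])
    print(f"{gene} relative to hns: {ratio:.2f}")
```

With this convention, specific binding at CRP-S sites would appear as ratios well above 1 for pilA and comM, whereas ratios close to 1 in every fraction correspond to all three genes eluting in equal quantities.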

Protein mass spectrometric analysis
Protein samples were desalted using 10 kDa MWCO filters (EMD Millipore) before tryptic digest. Subsequently, the resulting peptides were separated on a Waters nanoACQUITY nano-LC and analyzed with a Waters Synapt G2 HDMS (Waters Corporation). For the LC, an Acquity UPLC HSS T3 column (75 µm × 200 mm) was used and a gradient was run from 3% acetonitrile/0.1% formic acid to 45% acetonitrile in 2 hours. Mass spectrometric acquisition was conducted using data-independent acquisition (MSE) in resolution mode and using leucine enkephalin as lockspray for mass correction. Resulting spectra were analyzed with the ProteinLynx Global