Strategies to improve scFvs as crystallization chaperones suggested by analysis of a complex with the human PHD-bromodomain SP140

Antibody fragments have great potential as crystallization chaperones for structural biology due to their ability to stabilise targets, trap certain conformations and/or promote crystal packing. Here we present an example of using a single-chain variable fragment (scFv) to determine the previously unsolved structure of the multidomain protein SP140. This nuclear leukocyte-specific protein contains domains related to chromatin-mediated gene expression and has been implicated in various disease states. The structure of two of the domains (PHD-bromodomain) was solved by crystallizing them as a complex with an scFv generated by phage display technology. SP140 maintains a similar overall fold to previous PHD-bromodomains, and the scFv CDR loops predominantly interact with the PHD, while the framework regions of the scFv make numerous interactions with the bromodomain. Analysis of our and other complex structures suggests various protein engineering strategies that might be employed to improve the usefulness of scFvs as crystallization chaperones.


Despite extensive efforts in construct design (1, 2) and screening of chemical space (3), many targets routinely fail to yield the protein crystals necessary for structural studies via X-ray diffraction. This failure may be due to poor protein stability (4), multiple conformational states leading to sample heterogeneity, or even it being the wrong time of year (5)! Antibody fragments have proven useful in overcoming some of these issues by acting as crystallization chaperones (6). They can stabilise and/or trap specific conformations of proteins as well as directly mediate crystal contacts. Mimetic binders based on non-immunoglobulin scaffolds have also been engineered and proven useful in this application (7).

Examples of target:binder complexes in the PDB are numerous and steadily growing, with several different binder types having been used, such as Fabs, scFvs, nanobodies and DARPins (see Table 1). Currently there is no clear consensus on which binder format is best, and most likely this will vary from target to target. For structural studies, binders should have a stable and rigid scaffold and regions (such as loops) that tolerate the high sequence variability that generates specificity for a target. Ideally binders should also be produced in a way that gives rise to a homogeneous preparation (i.e. without glycosylation or other modifications).

generated by phage display, we solved the structure of the PHD-bromodomain of SP140 (6G8R). As well as generating a novel structure, we analysed this complex and those reported by others to investigate some of the features of scFvs that may aid crystallization and point the way to possible improvements to this scaffold as a crystallization chaperone.

Although scFvs are useful binder scaffolds, we have found that production in Escherichia coli routinely gives low yields of less than 1 mg/L of cells for some if not most constructs. As crystallography can require several milligrams of protein, we evaluated different strategies to improve yield and purity. The most successful strategy was to produce the G-SP140-8 scFv as

Curiously the isolated scFv was always found to run lower than its predicted MW (Figure 1B ii). This may indicate some dissociation of the two domains (VH and VL) during the run, i.e. the protein overall behaves more like a long linear molecule than a single larger globular protein. After isolation by SEC, the scFv:SP140 complex was put directly into crystallization trials, and crystals with a bipyramidal morphology were found in many conditions after 1-2 days at room temperature. Unfortunately, most of these crystals diffracted only to low resolution (3-5 Å). Only crystals grown in a mixture of malonate/Jeffamine (Figure 1C) diffracted to a resolution of 2.7 Å and allowed the generation of a structure with reasonable refinement statistics (Table 2).

Figure S7). None of the CDR loops occupy the peptide-binding site of the PHD domain (Figure 2D), suggesting that the scFv would not inhibit protein activity.

More unexpected is the importance of the interactions of residues on the side of the scFv opposite to the CDR loops, which participate in crystal contacts with neighbouring copies of SP140 (Figure 2B). Crystal packing is also observed between neighbouring scFv molecules (Figure 2C).

The interaction of the scFv with both the bromodomain and PHD and its probable stabilising effect may be important given the above-noted fact that the PHD in isolation appears to be non-functional (18). This might explain our failure to crystallize the isolated protein, despite success with many similar classes of protein (20).

Figure 1D with peptide-bound SP100C structure (5FB1), with bound peptide in stick representation (hot pink) and the PHD and bromodomains coloured hot pink and pale pink respectively.
scFvs as crystallization chaperones

Of the binder:complex structures reported in the PDB (Table 1), sixty-one contain an scFv or Fv in complex with another protein, the remainder being small molecules, peptides, apo or anomalies (see Table S1 for details). As can be seen from to the gp120 subunit of the HIV-1 envelope trimer (29). We reasoned that an analysis of our and others' protein:scFv/Fv complexes would yield suggestions on how to generate scFvs with improved crystallization properties.

Table S3 shows a summary of information on the scFv/Fv and their amino acid sequences in these complexes. It can be seen from domains were sufficiently stable as a complex to not require a linker (Fv rather than scFv).

A simple analysis of the complexes (Table S3) shows that many regions of the scFv constructs are not visible in the final crystal structure; these include long C-terminal purification tags (e.g.

Another observation is that the VH-VL linker is rarely visible in structures (Table S3). If this linker could be redesigned to be more rigid for the above reasons of entropic cost, this may increase crystallization success rates. One approach may be to simply increase the serine content, which has been reported to increase stiffness (31). Alternatively a computational approach to the design could be taken (32). Care has to be taken, as linker modifications can have unintended adverse effects such as reduced affinity and stability, or oligomerization (33, 34), any of which may be disadvantageous for crystallization and target binding.

In Table S4 an analysis of the regions commonly found in Fv crystal contacts (not involving the primary Fv:target complex) shows that many crystal packing arrangements of Fv molecules are possible. These multiple possible packing arrangements may partly explain the usefulness of these binders as crystallization chaperones.

analysed is relatively small (61 in total) and that the Fv domains are from a number of species (human, mouse, rhesus, rabbit and rat), meaning a high degree of sequence variability. Even with these limitations it is perhaps surprising how frequently the same regions are found in crystal packing points with other Fv (Tables S5 and S6). Targeting of these regions with mutagenesis is likely a good strategy for improving scFvs as crystallization chaperones.
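As a rough illustration of the kind of tally behind this observation, the following Python sketch counts how often each Fv position appears in Fv:Fv crystal contacts across a set of structures and reports the positions that recur above a chosen threshold. The structure labels, position labels and the 50% cut-off are hypothetical placeholders, not values taken from Tables S4-S6, and the function names are illustrative only.

```python
from collections import Counter


def tally_contact_positions(contacts_by_structure):
    """Count, per Fv position, the number of structures in which that
    position takes part in an Fv:Fv crystal contact."""
    counts = Counter()
    for positions in contacts_by_structure.values():
        # Use a set so a position is counted at most once per structure.
        counts.update(set(positions))
    return counts


def recurrent_positions(contacts_by_structure, min_fraction=0.5):
    """Return the sorted positions found in at least `min_fraction`
    of the structures (the threshold is an arbitrary assumption)."""
    n_structures = len(contacts_by_structure)
    if n_structures == 0:
        return []
    counts = tally_contact_positions(contacts_by_structure)
    return sorted(
        pos for pos, n in counts.items() if n / n_structures >= min_fraction
    )


# Hypothetical toy data: structure label -> aligned Fv positions observed
# in Fv:Fv crystal contacts. These are NOT real contact positions.
example_contacts = {
    "structure_A": ["H13", "H43", "L60"],
    "structure_B": ["H13", "L60", "L79"],
    "structure_C": ["H13", "H85"],
    "structure_D": ["L60", "H43"],
}

hotspots = recurrent_positions(example_contacts, min_fraction=0.5)
print(hotspots)
```

Positions returned by such a tally would be candidates for mutagenesis only after inspection of the actual contacts in each structure, since a recurring position may still make different interactions in different crystal forms.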
Analysis of the alignments in Tables S5 and S6 also shows that the Fv:Fv packing frequently involves a glutamine, lysine or glutamate. These residues are associated with high entropic cost, and mutating them to alanine or a residue with lower entropy (arginine, aspartate or asparagine) has been shown in other proteins to be a successful strategy to improve crystallization success (36). Such a surface entropy reduction (SER) strategy has been used to improve the crystallization properties of an scFv that binds the EE epitope (EYMPME) (28), although without a target protein bound. Similarly, a SER approach has been used on a Fab scaffold to improve crystallization when in complex with RNA (37).

tag and a C-terminal Avi tag, creating the SP140A-c055 construct, Supplementary sequence S1. Expression and purification were performed essentially as described (42); see supplementary methods S1 for a detailed protocol. The phage selection procedure was carried out basically as described (43)

Expression and purification were done using a similar method to that described previously (1,

To isolate the scFv:SP140 complex, 3 mg of SP140 were mixed with 1 mg of scFv. The mixture was then injected onto a 104 mL Yarra SEC-2000 column (Phenomenex) connected to an NGC system (Bio-Rad) using 10 mM HEPES, 500 mM sodium chloride, 5% glycerol, 0.5 mM TCEP, pH 7.5 as the mobile phase (Figure 1B and supplementary figures S2, S3 and S4).
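The mass ratio used for complex formation can be converted to a molar ratio once the molecular weights of the two constructs are known. The Python sketch below shows the arithmetic only; the molecular weights in the usage example are placeholder values chosen to keep the numbers simple and are not the actual masses of the SP140 or scFv constructs.

```python
def micromoles(mass_mg, mw_kda):
    """Convert a protein mass in mg to micromoles.

    mg divided by kDa (i.e. by kg/mol) gives micromoles directly:
    1 mg of a 1 kDa species is 1 micromole.
    """
    if mw_kda <= 0:
        raise ValueError("molecular weight must be positive")
    return mass_mg / mw_kda


def molar_ratio(mass_a_mg, mw_a_kda, mass_b_mg, mw_b_kda):
    """Return moles of component A per mole of component B."""
    return micromoles(mass_a_mg, mw_a_kda) / micromoles(mass_b_mg, mw_b_kda)


# Masses are those stated in the text; both molecular weights are
# PLACEHOLDERS (assumed equal, 25 kDa) and must be replaced with the
# values calculated from the real construct sequences.
ASSUMED_TARGET_MW_KDA = 25.0
ASSUMED_SCFV_MW_KDA = 25.0

ratio = molar_ratio(3.0, ASSUMED_TARGET_MW_KDA, 1.0, ASSUMED_SCFV_MW_KDA)
print(f"target:scFv molar ratio = {ratio:.2f}")
```

With real molecular weights substituted, a ratio above 1 would indicate that the target is in molar excess, so that the limiting scFv should be fully bound and the surplus target separated from the complex on the SEC column.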

Fractions corresponding to the complex were pooled and concentrated to 7.7 mg/mL. Xia2 auto-processed images (44) were used to perform molecular replacement using Phaser (45). Further refinement of the structure was done using the Phenix suite of programs (46) in combination with Coot (47) and MolProbity (48), see