Purification of recombinant SARS-CoV-2 spike, its receptor binding domain, and CR3022 mAb for serological assay

Serology testing for COVID-19 is highly attractive because of the relatively short diagnosis time and the ability to test for an active immune response against SARS-CoV-2. In many types of serology tests, the sensitivity and the specificity are directly influenced by the quality of the antigens manufactured. Protein purification of these recombinantly expressed viral antigens [e.g., spike and its receptor binding domain (RBD)] is an important step in the manufacturing process. Simple and high-capacity protein purification schemes for spike, RBD, and CR3022 mAb, recombinantly expressed in CHO and HEK293 cells, are reported in this article. The schemes consist of an affinity chromatography step and a desalting step. Purified proteins were validated in ELISA-based serological tests. Interestingly, extracellular matrix proteins [most notably heparan sulfate proteoglycan (HSPG)] were co-purified from spike-expressing CHO culture with a long cultivation time. HSPG-spike interaction could play a functional role in the pathology and the pathogenesis of SARS-CoV-2 and other coronaviruses.


INTRODUCTION
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), identified in December 2019, is the causative agent of the global pandemic designated COVID-19 [1]. Serological surveys represent an invaluable tool to study the immune response, to assess the extent of the pandemic given the existence of asymptomatic cases, and to guide control measures. There is clear evidence that SARS-CoV-2 spike (S) and nucleocapsid (N) proteins are the primary viral antigens against which antibodies are raised [2]. Hence, ELISA-based serological tests rely on purified SARS-CoV-2 components that are recombinantly expressed in insect or mammalian cells.
10 mM imidazole, pH 8.0). Spent medium containing either spike or RBD was filtered using a 0.22 µm stericup and loaded onto the equilibrated column using a sample pump. Typically, a volume of 50-500 mL was loaded. After sample loading, the column was washed in three steps using 5 CVs of buffer A, 5 CVs of 4.5% (v/v) buffer B, and 10 CVs of 9% (v/v) buffer B. Protein was eluted using 100% (v/v) buffer B, and collected in fractions (1 mL/fraction) using a fraction collector. The column equilibration, sample loading, and column washing steps were conducted at 5 mL/min, while the elution step was conducted at 1 mL/min. To further reduce non-specific binding, spent medium was adjusted to 20 mM imidazole using buffer B, prior to filtration and sample loading.
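The imidazole adjustment and the wash steps described above reduce to simple mixing arithmetic. The following is a minimal sketch, not part of the published protocol. It assumes a 5-mL column (the size given in the figure legends) and an imidazole-free spent medium. The imidazole concentration of buffer B is not stated in this section, so the 500 mM value is purely hypothetical; the function names are also illustrative.

```python
def buffer_b_volume_ml(medium_ml, target_mm, buffer_b_mm, medium_mm=0.0):
    """Volume of buffer B (mL) that brings the medium to target_mm imidazole.

    Mass balance:
        medium_ml * medium_mm + v * buffer_b_mm = (medium_ml + v) * target_mm
    """
    if not (medium_mm <= target_mm < buffer_b_mm):
        raise ValueError("target must lie between medium and buffer B concentrations")
    return medium_ml * (target_mm - medium_mm) / (buffer_b_mm - target_mm)


def wash_volume_and_time(column_ml=5.0, wash_cvs=(5, 5, 10), flow_ml_min=5.0):
    """Total wash volume (mL) and duration (min) for the three wash steps."""
    volume_ml = column_ml * sum(wash_cvs)
    return volume_ml, volume_ml / flow_ml_min


# Hypothetical buffer B imidazole concentration; NOT stated in this section.
ASSUMED_BUFFER_B_MM = 500.0

# Buffer B needed to bring 500 mL of spent medium to 20 mM imidazole.
print(buffer_b_volume_ml(500.0, 20.0, ASSUMED_BUFFER_B_MM))

# Wash steps: 5 + 5 + 10 CVs on a 5-mL column at 5 mL/min.
print(wash_volume_and_time())
```

Keeping the buffer B concentration as an explicit argument means the same calculation can be reused once the actual buffer composition is known.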

SDS-PAGE
Protein fractions were analysed on a NuPAGE™ 4-12%, Bis-Tris, 1 mm, 12-well mini protein gel (Thermo Fisher Scientific) to check for size, purity, and integrity. The gel was run using NuPAGE™ MES SDS running buffer (Thermo Fisher Scientific) at a constant voltage of 200 V for 40 min. For visualization, the gel was stained using InstantBlue™ (Expedeon).

Buffer exchange and storage of SARS-CoV-2 spike and RBD
Protein fractions were pooled and buffer exchanged into storage buffer [20 mM Tris, 200 mM NaCl, 10% (v/v) glycerol, pH 8.0] using a PD-10 desalting column (Cytiva). Protein sample was aliquoted, snap frozen in liquid nitrogen, and stored at -80°C.

Protein purification of CR3022 mAb
A 5-mL HiTrap™ Protein G HP column (Cytiva) was equilibrated with 10 CVs of Protein G IgG Binding Buffer (pH 5.0; Thermo Fisher Scientific). Spent medium containing CR3022 mAb was diluted with binding buffer at a 1:1 volume ratio and filtered using a 0.22 µm stericup, prior to sample loading using a sample pump. The column was washed with 10 CVs of binding buffer. mAb was eluted using IgG Elution Buffer (pH 2.8; Thermo Fisher Scientific). Protein fractions (1 mL/fraction) were collected using a fraction collector into collection tubes containing 100 µL/tube of neutralization buffer (1 M Tris, pH 9.0). The column equilibration, sample loading, and column washing steps were conducted at 5 mL/min, while the elution step was conducted at 1 mL/min.

Protein quantification
Spike, RBD, and CR3022 mAb samples were diluted with their corresponding storage buffers, and quantified using the Pierce™ Coomassie Plus (Bradford) Assay Kit (Thermo Fisher Scientific). Bovine serum albumin (2.5 µg/mL to 25 µg/mL) was used as a protein calibration standard. Briefly, 100 µL of Bradford reagent was added to 100 µL of diluted protein sample in a 96-well microplate. After shaking for 30 s, the plate was incubated at room temperature for 10 min. Absorbance was then measured at 595 nm using a Multiskan™ FC microplate photometer (Thermo Fisher Scientific).
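The quantification step can be sketched as a standard-curve calculation. The example below is illustrative only: it assumes that the response is approximately linear over the 2.5-25 µg/mL BSA range, and the absorbance readings, the sample reading, and the 50-fold dilution factor are all hypothetical values, not data from this study.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def concentration_ug_ml(a595, slope, intercept, dilution_factor=1.0):
    """Back-calculate the undiluted sample concentration from its A595."""
    return (a595 - intercept) / slope * dilution_factor


# BSA standards spanning the range used in the text (µg/mL).
STANDARDS_UG_ML = [2.5, 5.0, 10.0, 15.0, 20.0, 25.0]
# Hypothetical A595 readings for those standards (illustration only).
HYPOTHETICAL_A595 = [0.35, 0.40, 0.50, 0.60, 0.70, 0.80]

slope, intercept = fit_line(STANDARDS_UG_ML, HYPOTHETICAL_A595)

# Hypothetical sample: A595 of 0.55 measured after a 50-fold dilution.
print(concentration_ug_ml(0.55, slope, intercept, dilution_factor=50.0))
```

Readings that fall outside the range of the standards should be re-measured at a different dilution rather than extrapolated.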

Western blot
Proteins were resolved on a NuPAGE™ 4-12%, Bis-Tris, 1 mm, 12-well mini protein gel (Thermo Fisher Scientific), and transferred onto a PVDF membrane (iBlot™ 2 Transfer Stacks; Thermo Fisher Scientific). His-tagged protein was detected using a mouse anti-6×His IgG conjugated to horseradish peroxidase (MCA1396P; Bio-Rad), and the blot was developed using Pierce™ ECL Western Blotting Substrate (Thermo Fisher Scientific).

Peptide analysis by mass spectrometry
Proteins analysed by SDS-PAGE were identified by first excising the bands from the gel and slicing them into 1 mm pieces. The gel pieces were then de-stained with 2 × 200 µL of 50% (v/v) acetonitrile, 50 mM ammonium bicarbonate (ACN/ABC). The proteins were reduced and S-alkylated with 10 mM TCEP and 55 mM MMTS, respectively, with washing in ACN/ABC after each reaction. The gel pieces were dried in a vacuum centrifuge, then rehydrated in 50 mM ABC containing 12.5 ng/µL trypsin (Promega) and 5 mM calcium chloride. Proteolytic digestion was performed by incubation at 37°C for 16 h. Peptides were extracted from the gel pieces with 15 µL of 25 mM ABC, then 20 µL of neat ACN. The extracts were combined and dried in a vacuum centrifuge, and the peptides were redissolved in 0.5% (v/v) TFA, 3% (v/v) ACN for analysis by liquid chromatography coupled to tandem mass spectrometry (LC-MS/MS) using an RSLCnano-Q Exactive HF system (Thermo Fisher Scientific). Mass spectra were searched against Chinese hamster or human proteome databases (with inserted spike and RBD sequences) using Mascot software (Matrix Science).
For identification of proteins in solution, samples were concentrated to approximately 1 g/L by centrifugal ultrafiltration and 5 μg diluted to 7 μL with water. 1 μL each of 500 mM ABC, 0.2% (w/v) ProteaseMax surfactant (Promega) in 50 mM ABC and 0.2 g/L trypsin was then added. The proteins were incubated at 37°C for 16 h then 1 μL of 5% (v/v) TFA was added to degrade the surfactant. After incubation at RT for 5 min, the peptides were desalted using a C18 spin column (Thermo Fisher Scientific) then dried and redissolved for LC-MS/MS analysis as above.
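The in-solution digest volumes above imply a particular substrate-to-enzyme ratio, which can be checked with a short calculation. This is a sketch using only the quantities stated in the text; the sample concentration is taken as exactly 1 g/L, although the text gives it as approximate, and the variable names are illustrative.

```python
protein_ug = 5.0
sample_conc_ug_per_ul = 1.0   # ~1 g/L after ultrafiltration (1 g/L = 1 µg/µL)
diluted_volume_ul = 7.0       # sample made up to 7 µL with water

# 1 µL each of the three additions listed in the text.
additions_ul = {
    "500 mM ABC": 1.0,
    "0.2% (w/v) ProteaseMax in 50 mM ABC": 1.0,
    "0.2 g/L trypsin": 1.0,
}
trypsin_conc_ug_per_ul = 0.2  # 0.2 g/L

sample_volume_ul = protein_ug / sample_conc_ug_per_ul
water_ul = diluted_volume_ul - sample_volume_ul
digest_volume_ul = diluted_volume_ul + sum(additions_ul.values())
trypsin_ug = trypsin_conc_ug_per_ul * additions_ul["0.2 g/L trypsin"]
substrate_to_enzyme = protein_ug / trypsin_ug   # w/w ratio

print(f"water to add: {water_ul} µL")
print(f"digest volume: {digest_volume_ul} µL")
print(f"protein:trypsin (w/w): {substrate_to_enzyme:.0f}:1")
```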

Size exclusion chromatography
A HiLoad™ 26/600 Superdex™ 200 pg column (Cytiva) was washed with 2 CVs of water, and equilibrated with 1.5 CVs of buffer (20 mM Tris, 200 mM NaCl, pH 8.0). A 2-mL protein sample was injected via a 5-mL sample loop. Separation was performed at a flow rate of 2 mL/min. Eluate was collected in fractions (4 mL/fraction) using a fraction collector.

Co-immunoprecipitation
Co-immunoprecipitation was conducted using a NAb™ Spin Kit (Protein G, 0.2 mL; Thermo Fisher Scientific). Briefly, the columns were equilibrated with 400 µL of binding buffer (100 mM phosphate, 150 mM NaCl, pH 7.2), and incubated with 500 µL of protein sample for 10 min at room temperature. The columns were subsequently washed 3 times with 400 µL of binding buffer, followed by 3 elution steps with 400 µL of IgG Elution Buffer. Eluates were collected in tubes containing 40 µL/tube of neutralization buffer (1 M Tris-HCl, pH 8.5).

Protein purification of spike
All the purification schemes in this study were developed for use with ÄKTA protein purification systems and similar instruments. The schemes, consisting of an affinity chromatography step and a desalting step, were designed for routine larger-scale preparative purification. The number of purification steps was intentionally minimised to reduce processing time and to strike a balance between protein yield and a protein purity that suffices for a serological test.
SDS-PAGE analysis showed that the full-length spikes purified from both HEK and CHO cultures were relatively pure and homogeneous (Fig. 2A), giving a Mw of >180 kDa that is significantly larger than the calculated Mw of 139.2 kDa. The SARS-CoV-2 S gene encodes 22 N-linked glycan sequons (N-X-S/T, where X ≠ P) per protomer [18]. The size difference of ~40 kDa indicated that the recombinant spike is a heavily glycosylated protein. The band corresponding to the spike protein was subsequently analysed using in-gel trypsin digestion in conjunction with mass spectrometry analysis. Mass spectrometric data confirmed the identity of the spike (Fig. 2B). In the work reported by Krammer and co-workers, the full-length spike appeared as a double band (~180 kDa and ~130 kDa) on a reducing SDS-PAGE, which was postulated to result from protein degradation yielding a smaller protein [11]. In contrast, our purified spike ran as a clear single band on both non-reducing (Fig. 2A) and reducing (data not shown) SDS-PAGE.
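The sequon rule quoted above (N-X-S/T, where X is any residue except proline) can be expressed as a short pattern search. The sketch below is illustrative; the function name and the demonstration peptide are hypothetical, and the peptide is not taken from the spike sequence. A lookahead is used so that overlapping sequons are all reported.

```python
import re

# Zero-width lookahead so that overlapping sequons (e.g. "NNTS") are all found.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(sequence):
    """Return 1-based positions of Asn residues in N-X-S/T sequons (X != P)."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence.upper())]


# Hypothetical peptide for illustration: N-I-T and N-G-T qualify, N-P-S does not.
demo_peptide = "MNITQNPSANGTK"
print(find_sequons(demo_peptide))  # [2, 10]
```

A sequon marks a potential glycosylation site only; whether a given site is occupied has to be established experimentally.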

Protein purification of RBD
The same purification scheme, developed for the full-length spike, was applied for the purification of RBD. Non-reducing SDS-PAGE analysis of the purified RBD showed two distinct bands (Fig. 3), a dominant band with a Mw of ~30 kDa and a side band with a Mw of ~60 kDa. As our RBD sequence (a.a. 319-541) contains, in total, 9 cysteine residues (C336, C361, C379, C391, C432, C480, C488, C525 and C538), we suspected RBD dimer formation. When we re-ran the purified RBD on a reducing SDS-PAGE, a clear single band was obtained (Fig. 3), confirming our hypothesis. The bands corresponding to the RBD dimer and RBD monomer were analysed using in-gel trypsin digestion in conjunction with mass spectrometry analysis, which confirmed the identity of RBD in both cases (Fig. 4). Based on the crystallographic data of RBD (6M0J, [19]), there are four disulphide bonds in RBD (C336-C361, C379-C432, C391-C525 and C480-C488; Fig. 5). The free C538 is likely the cysteine residue involved in inter-molecular dimerization.
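The assignment of C538 as the unpaired cysteine follows from the residue list and the disulphide pairs given above. The bookkeeping can be written out as follows; this is an illustrative sketch with hypothetical variable names, using only the residue numbers stated in the text.

```python
# Cysteine positions in the RBD construct (a.a. 319-541), as listed in the text.
cysteines = [336, 361, 379, 391, 432, 480, 488, 525, 538]

# Intramolecular disulphide bonds from the crystal structure (6M0J).
disulphides = [(336, 361), (379, 432), (391, 525), (480, 488)]

paired = {residue for bond in disulphides for residue in bond}
free_cysteines = sorted(set(cysteines) - paired)

print(free_cysteines)  # [538]
```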
Like the recombinant full-length spike, recombinant RBD also appeared larger in size. There are two N-glycosylation sites within the RBD sequence (N331 and N343) [18]. This could explain the size difference of ~4 kDa between the Mw deduced from the SDS-PAGE (~30 kDa) and the calculated Mw (26.1 kDa).

Protein purification of CR3022 mAb
CR3022 mAb was purified using a pre-packed protein G column, with a straightforward purification scheme. This IgG interacts strongly with protein G, as evidenced by its broad elution peak spanning over 16 fractions (Fig. 6). We chose a Protein G IgG Binding Buffer with a pH of 5.0 in our purification scheme for a stronger IgG interaction with protein G. The binding between protein G and IgG was shown to be pH-dependent between 2.8 and 10, strongest at pH 4 and 5, and weakest at pH 10 [20]. The downside of using an acidic binding buffer is protein precipitation, which occurred when the spent medium was diluted with the binding buffer at a 1:1 volume ratio. The yield of CR3022 mAb (57 mg/L spent medium) was similar to that estimated from HPLC using a MAbPac™ Protein A column (40-100 mg/L spent medium), suggesting that the precipitated protein was likely host cell proteins rather than the mAb. It is therefore necessary to filter the diluted spent medium with a 0.22 µm stericup, prior to sample loading onto the equilibrated protein G column. Non-reducing (

HSPG was co-purified with full-length SARS-CoV-2 spike
During our optimization of protein expression in CHO cells, a longer cell cultivation time was found to have a profound and positive effect on the full-length spike yield (data not shown). However, the spike purified from the spent medium of a longer culture did not provide the same purity as seen in Fig. 2A, despite applying the same purification scheme (Fig. 8A). Notably, we observed protein species of much higher Mw (>250 kDa) and of lower Mw (predominantly at ~70 kDa). Western blot analysis with an anti-His antibody showed that these co-purified proteins were not His-tagged (Fig. 8B), and the possibility of spike degradation due to a longer cell cultivation time was ruled out.
To identify these impurities, two impure spike fractions (Fig. 8A, lanes D and F) were subjected to in-solution trypsin digestion and mass spectrometry analysis. The two fractions gave an almost identical protein profile (Table 1). Based on the mass spectrometry data, heparan sulfate proteoglycan (HSPG, Mw 334 kDa), nidogen-1 (79 kDa) and suprabasin (64 kDa) constituted the bulk of the co-purified impurities, consistent with the impurity bands (high Mw of >250 kDa and low Mw of ~70 kDa) observed in the SDS-PAGE analysis (Fig. 8A).

Peaks corresponding to HSPG and full-length spike were incompletely resolved using a gradient elution in affinity chromatography or a size exclusion chromatography
To further optimise the affinity chromatography step, two strategies were attempted: (a) a gradient elution, and (b) a more stringent binding by adding 20 mM imidazole into the spent medium prior to sample loading.
In Fig. 9, a gradient elution [0-100% (v/v) buffer B over 5 CVs] was applied to an impure spike fraction (Fig. 8, lane D). HSPG interacts strongly with Ni sepharose, as the peak corresponding to HSPG was eluted at 47% (v/v) buffer B. Although it was possible to partially resolve the peaks corresponding to HSPG (fractions 13-16) and spike (fractions 15-24), as shown in Fig. 10A, this elution scheme was not further pursued for two reasons. First, the spike protein was eluted in a broad peak spanning over 10 fractions. This lowers the concentration of the purified spike and would mean a larger volume to process in the subsequent desalting step. The lower spike concentration is not ideal as protein is typically more stable when stored at a higher concentration. Second, some spike protein was lost through co-elution with HSPG (Fig. 10, fractions 15 and 16). Adding 20 mM imidazole into the spent medium, however, provided a clear benefit of higher purity without compromising the protein yield (Fig. 10B).
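For a linear gradient, the percentage of buffer B maps directly onto the volume delivered. The sketch below is a simplified illustration for the 0-100% gradient over 5 CVs on a 5-mL column; it ignores the system delay volume between the mixer and the column outlet (an assumption), and the function names are hypothetical.

```python
def percent_b(volume_ml, column_ml=5.0, gradient_cvs=5.0):
    """% (v/v) buffer B after volume_ml of a linear 0-100% gradient."""
    gradient_ml = column_ml * gradient_cvs
    clamped = min(max(volume_ml, 0.0), gradient_ml)
    return 100.0 * clamped / gradient_ml


def volume_at_percent_b(percent, column_ml=5.0, gradient_cvs=5.0):
    """Gradient volume (mL) at which the programmed %B reaches `percent`."""
    return percent / 100.0 * column_ml * gradient_cvs


# The HSPG peak eluted at 47% (v/v) buffer B.
print(volume_at_percent_b(47.0))   # about 11.75 mL into the 25 mL gradient
print(percent_b(12.5))             # 50.0
```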
We also attempted size exclusion chromatography (SEC) using an impure spike fraction, with the aim of simultaneously achieving (a) separation between HSPG and spike, and (b) buffer exchange. However, we obtained a broad peak (Fig. 11, line 2) with a slight shoulder (line 3). SDS-PAGE analysis confirmed that the peak corresponding to HSPG was eluted at ~116 mL, while the one corresponding to spike at ~121 mL (Fig. 12). Although SEC was not the right approach to separate HSPG from spike, we could deduce the Mw of the full-length spike from the calibration curve for a HiLoad™ 26/600 Superdex™ 200 pg column (Fig. 13A). Using an elution volume of ~121 mL, the recombinant spike has a Mw of 619 kDa (Fig. 13B), confirming the trimeric nature of this glycoprotein. This Mw is in good agreement with the one reported by Watanabe et al. using a similar full-length spike construct (670 kDa, deduced from SEC using a Superose 6 10/300 column) [18].
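The step from an apparent Mw of 619 kDa to a trimer can be made explicit by dividing by the protomer mass. The sketch below uses the ~180 kDa glycosylated protomer mass from SDS-PAGE, on the assumption that it is the appropriate reference because the secreted protein carries its glycans. Since the SDS-PAGE value is reported as a lower bound (>180 kDa), the ratio is an upper estimate. SEC also reports hydrodynamic size rather than true mass, so the result is approximate.

```python
sec_mw_kda = 619.0              # apparent Mw from the SEC calibration curve
protomer_sds_page_kda = 180.0   # glycosylated protomer, lower bound from SDS-PAGE

protomers_per_complex = sec_mw_kda / protomer_sds_page_kda
nearest_oligomer = round(protomers_per_complex)

print(f"{protomers_per_complex:.2f} protomers, nearest integer: {nearest_oligomer}")
```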
As shown in Fig. 14, none of the impurities, including HSPG, was captured by the Protein G spin column. They were present in the flowthrough and absent from the eluate. Spike, on the other hand, was captured by the Protein G spin column via CR3022. Although we did not observe a complex formed between spike and HSPG, we could not rule out their potential interaction for two reasons. First, co-immunoprecipitation relies on strong affinities between the bait (CR3022) and the primary target (spike), as well as between the primary target and the secondary target (HSPG). Low-affinity or transient interactions between proteins may not be detected, and the spike-HSPG interaction could potentially be a weak one. Second, CR3022 could compete with HSPG for spike binding. Although the co-immunoprecipitation experiment was inconclusive with respect to spike-HSPG interaction, it nonetheless confirmed (a) the functionality of the purified spike and CR3022, and (b) the incomplete resolution of the peaks corresponding to HSPG and spike in a gradient elution. Therefore, if a gradient elution was adopted during purification, the spike protein loss would be higher than the gain in purity.

Co-purification of HSPG
HSPGs are composed of unbranched, negatively charged heparan sulfate polysaccharides attached to a variety of cell surface or extracellular matrix proteins [21]. Owing to the heavily sulfated glycosaminoglycan chains, they present a global negative charge that interacts electrostatically with the basic residues of viral surface glycoproteins or viral capsid proteins. Viruses exploit these weak interactions with HSPG to increase their local concentration at the cell surface, augmenting their chances of binding a more specific receptor [21]. HSPGs are expressed in lung cells [22]. They also play a pivotal role in cellular internalization of viruses, basic peptides and polycation-nucleic-acid complexes [23].
Both heparan sulfate [24] and heparin [25] were recently shown to interact with SARS-CoV-2 spike and its RBD, further strengthening our postulation that CR3022 and HSPG could compete with one another for spike binding. This evidence in the literature, along with the experimental observations presented in this study, suggests that HSPG co-purification is not a coincidental event. We hypothesize that HSPG co-purification results from two synergistic mechanisms, as illustrated in Fig. 15A. First, HSPG interacts with spike via electrostatic interaction. As spike is being captured by the HisTrap™ column, the high local concentration of spike within the column would favour protein-protein interaction. Second, the high histidine content of HSPG (135 residues, 4.3%) further augments its binding to Ni sepharose or spike. The pKa value of the histidine side chain is 6.0, so most of these histidine residues would be deprotonated in our purification buffers of pH 8.0. It is important to note that three arginine residues and one lysine residue were removed from the spike variant used in this study (R682, R683, R985 and K986). These modifications would have compromised the affinity of HSPG for spike. Recently, the D614G mutation in spike was shown to increase the infectivity of the virus [26]. Removal of a negatively charged residue from spike could benefit its interaction with HSPG.
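The extent of histidine deprotonation at the buffer pH can be estimated with the Henderson-Hasselbalch relation. The sketch below uses the side-chain pKa of 6.0 quoted above; this is a simplifying assumption, since the pKa of an individual histidine in a folded protein can shift with its local environment. The function name is illustrative.

```python
def fraction_deprotonated(ph, pka):
    """Fraction of a titratable group in its deprotonated form (Henderson-Hasselbalch)."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


# Histidine side chain (pKa 6.0) in the pH 8.0 purification buffers.
print(round(fraction_deprotonated(8.0, 6.0), 3))   # 0.99

# Approximate chain length implied by 135 histidines at 4.3% of residues.
print(round(135 / 0.043))                          # 3140
```

Under this assumption, roughly 99% of the histidine side chains are in the neutral form that can coordinate immobilised nickel ions.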
Based on our SEC data (Fig. 13B), HSPG has a Mw of 727 kDa, in line with a previous study (400-600 kDa, purified from mouse cells) [27]. There are a number of GXXXG motifs in the protein sequence of HSPG (Fig. 15B), suggesting that this protein could form a dimer [28]. Throughout our spike purifications, we have consistently noticed that the spike purified from a HEK culture was purer than that from a CHO culture. This is in agreement with a previous report that CHO cells had a higher amount of cell-surface and intracellular HSPGs compared to HEK293 cells [33].

CONCLUSION
Recombinant spike, its RBD and CR3022, purified using the purification schemes presented here, are adequate for routine serological assays, both in terms of protein purity and yield. Although CHO is an excellent manufacturing host for spike, impurities such as HSPG are co-purified from CHO culture with a long cultivation time when a HisTrap column is used for spike purification. One potential mitigation strategy is the use of HSPG-deficient CHO cells or those defective in the biosynthesis of glycosaminoglycans [34]. HEK293, with a lower HSPG expression, is a good alternative. The potential HSPG-spike interaction certainly warrants further investigation, as it may contribute to the pathology and the pathogenesis of SARS-CoV-2 and other coronaviruses.

[…] (Fig. 8, lanes D and F), co-purified with the full-length spike, as determined by mass spectrometric analysis. Proteins are ranked from high to low abundance. Impure spike fraction 1 (Fig. 8, lane D); Impure spike fraction 2 (Fig. 8, lane F).

[…] column volumes], when an impure spike fraction (Fig. 8, lane D) was loaded onto a 5-mL HisTrap™ HP column. The x axis, left y axis, and right y axis show the elution volume in mL, UV absorbance at 280 nm, and % (v/v) of buffer B, respectively. The blue, green and brown curves represent UV absorbance at 280 nm, % (v/v) of buffer B and conductivity, respectively. HSPG interacted strongly with Ni sepharose. The peak corresponding to HSPG was eluted at 47% (v/v) buffer B.

[…] Fig. 9], when an impure spike fraction (Fig. 8, lane D) was loaded onto a 5-mL HisTrap™ HP column. HSPG (fractions 13-16) and spike (fractions 15-24) were partially resolved. (B) A more stringent binding by adding 20 mM imidazole into the spent medium, prior to sample loading onto a 5-mL HisTrap™ HP column, improved the purity of spike when the protein was purified from CHO culture with a long cultivation time.

[…] Fig. 11. Spike (line 3 in Fig. 11) has a Mw of 619 kDa, confirming its trimeric nature.