Sequence signatures of two IGHV3-53/3-66 public clonotypes to SARS-CoV-2 receptor binding domain

Since the COVID-19 pandemic onset, the antibody response to SARS-CoV-2 has been extensively characterized. Antibodies to the receptor binding domain (RBD) on the spike protein are frequently encoded by IGHV3-53/3-66 with a short CDR H3. Germline-encoded sequence motifs in CDRs H1 and H2 play a major role, but whether any common motifs are present in CDR H3, which is often critical for binding specificity, has not been elucidated. Here, we identify two public clonotypes of IGHV3-53/3-66 RBD antibodies with a 9-residue CDR H3 that pair with different light chains. Distinct sequence motifs on CDR H3 are present in the two public clonotypes and appear to be related to differential light chain pairing. Additionally, we show that Y58F is a common somatic hypermutation that results in increased binding affinity of IGHV3-53/3-66 RBD antibodies with a short CDR H3. Overall, our results advance fundamental understanding of the antibody response to SARS-CoV-2.


Introduction
Severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) is the etiological agent of coronavirus disease 2019 (COVID-19) 1,2, which primarily results in respiratory distress, cardiac failure, and renal injury in the most severe cases 3,4. The virion is decorated with the spike (S) glycoprotein, which contains a receptor-binding domain (RBD) that mediates virus entry by binding to angiotensin-converting enzyme-2 (ACE-2) receptor on the surface of host cells 1,5-7. To mitigate the devastating social and economic consequences of the pandemic, vaccines and post-exposure prophylaxes including antibody cocktails that exploit reactivity to the S protein are being developed at an unprecedented rate. Several vaccines are currently in various stages of clinical trials 8,9. Most notable are the mRNA vaccines from Pfizer-BioNTech and Moderna, which have

In this study, we define clonotypic IGHV3-53/3-66 RBD antibodies as antibodies that share the same IGL(K)V gene and have an identical CDR H3 length. Literature mining of 214 published IGHV3-53/3-66 RBD antibodies obtained from convalescent patients (Supplementary Table 1) revealed that the two most common clonotypes have a CDR H3 length of 9 amino acids and are paired with light chains IGKV1-9 (clonotype 1) and IGKV3-20 (clonotype 2), respectively (Figure

We further investigated sequence signatures of CDR H3s in clonotypes 1 and 2 (Figure 1b). In particular, we focused on amino acid residues 96, 98 and 100 in CDR H3 since these residues show clear patterns of differential amino-acid preference between clonotype 1 and clonotype 2 antibodies. Subsequently, analysis was performed on structures of BD-604 (PDB 7CH4) and CC12.1 (PDB 6XC2), which are two clonotype 1 antibodies, as well as BD-629 (PDB 7CH5) and CC12.3 (PDB 6XC4), which are two clonotype 2 antibodies.
Residue 96 is usually Leu in clonotype 1 antibodies, while an aromatic residue, usually Tyr, occupies residue 96 in clonotype 2 antibodies. While VH L96 interacts with Y489 of the RBD in clonotype 1 antibodies via van der Waals interactions, VH F/Y96 is located at the center of a π-π stacking network that involves F456, Y489 and VH Y100 (Figure 2a

clonotype 1 has a positive Φ angle, which is typically less favorable for non-Gly amino acids. In contrast, residue 100 is a highly conserved Tyr in CDR H3 of clonotype 2 antibodies (Figure 1b). Structural analysis shows that VH Y100 contributes to the π-π stacking network that is formed via the aromatic ring at VH residue 96 (see above) and an aromatic residue at VL residue 49 (Figure 2b, Supplementary Figure 2b

which may be disfavored in clonotype 1 antibodies due to the limited space where VH residue 100 is located (Supplementary Figure 5). Overall, our structural analyses provide a structural basis for the differential signature sequence motifs in CDR H3 between clonotype 1 and clonotype 2 antibodies.
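The clonotype definition used in this study (same light-chain V gene and identical CDR H3 length) and the per-position comparison of CDR H3 residues can be illustrated with a minimal sketch. The records, the function names `group_clonotypes` and `position_frequencies`, and the toy sequences are assumptions made for illustration only; they are not the data in Supplementary Table 1 and not the authors' pipeline. Positions are plain string indices; mapping them to the residue numbering used in the text (96, 98, 100) is not attempted here.

```python
from collections import Counter, defaultdict

# Hypothetical records: (antibody name, light-chain V gene, CDR H3 sequence).
# The sequences are invented placeholders, not real antibody sequences.
antibodies = [
    ("toy1", "IGKV1-9", "AALAGAGAA"),
    ("toy2", "IGKV1-9", "AALAGTGAA"),
    ("toy3", "IGKV3-20", "AAYADAYAA"),
    ("toy4", "IGKV3-20", "AAFAEAYAA"),
    ("toy5", "IGKV1-9", "AAAAAAAAAAA"),  # longer CDR H3, so a different group
]


def group_clonotypes(records):
    """Group antibodies by (light-chain V gene, CDR H3 length)."""
    groups = defaultdict(list)
    for name, light_v, cdrh3 in records:
        groups[(light_v, len(cdrh3))].append((name, cdrh3))
    return dict(groups)


def position_frequencies(sequences):
    """Count amino acids at each position of equal-length sequences."""
    lengths = {len(s) for s in sequences}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same length")
    return [Counter(column) for column in zip(*sequences)]


clonotypes = group_clonotypes(antibodies)
for key, members in sorted(clonotypes.items()):
    seqs = [seq for _, seq in members]
    freqs = position_frequencies(seqs)
    print(key, len(members), [dict(c) for c in freqs])
```

Grouping on the pair (light-chain V gene, CDR H3 length) keeps the sketch aligned with the definition in the text; a stricter definition (for example, one that also requires a shared J gene) would only change the dictionary key.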

Incompatibility of CDR H3 between clonotype 1 and clonotype 2 antibodies
To understand the influence of light-chain usage in CDR H3 sequences, we performed a structural alignment of RBD-bound CDR H3 from two clonotype 1 antibodies, namely BD-604 and CC12.1, and two clonotype 2 antibodies, namely BD-629 and CC12.3 (Supplementary Figures 2c-2f). While the CDR H3 conformations are similar within each clonotype (RMSD ranges from 0.27 to 0.41 Å), they are quite different between clonotypes (RMSD ranges from 0.77 Å to 1.5 Å). Although our sample size is small, this analysis suggests that antibodies from clonotypes 1 and 2 have different preferences for their CDR H3 conformations. Such differential preference of CDR H3 conformations may be partly influenced by light-chain usage, as indicated by the structural analyses above on VH residues 96, 98, and 100 (Figure 2, Supplementary Figures 2 and 5).
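For readers unfamiliar with the RMSD values quoted above, the quantity can be sketched as follows. This is a minimal illustration that assumes the two CDR H3 loops have already been superposed and that atoms are matched one-to-one in the same order; the text does not state which atoms or which alignment procedure were used, so the function name `rmsd` and the toy coordinates are assumptions, not the authors' method.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched lists of (x, y, z) points.

    Assumes the structures are already superposed; no fitting is done here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Toy coordinates in Å (invented): loop_b is loop_a shifted by 1 Å along z.
loop_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
loop_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (2.0, 0.0, 1.0)]
print(rmsd(loop_a, loop_b))
```

In practice, the superposition step (for example, a least-squares fit on the RBD or on the loop backbone) determines what the RMSD measures, so the choice of reference atoms should be reported alongside the value.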

To experimentally examine the compatibility between CDR H3 and the light chains from clonotype 1 and clonotype 2 antibodies, we focused on antibodies COV107-23 (clonotype 1) and COVD21-C8 (clonotype 2). The heavy-chain sequences of these two antibodies only differ by four amino acids in CDR H3, namely VH residues 96, 98, 99, and 100 (Supplementary Figure 6a). Of note, COV107-23 uses IGHJ4, which is seldom observed among clonotype 1 antibodies but highly preferred in clonotype 2 antibodies (Figure 1c

Table 2). The conformations of CDR H3 indeed differ when paired with different light chains, as exemplified by the 3.3 Å displacement of VH G97 near the tip of CDR H3 and different side-chain orientations of VH T98 (Figure 3b). In addition, a type I' β-turn is observed at the tip of CDR H3 in COV107-23 when paired with its native light chain but not with the light chain from COVD21-C8 (Figure 3c). These observations demonstrate that the conformation of CDR H3 changes substantially when IGKV1-9 in COV107-23 is swapped to IGKV3-20, which abolishes the binding to RBD (Figure 3a). The CDR H3 conformation is therefore a determinant for compatibility between the CDR H3 sequence and the light chain in IGHV3-53/3-66 RBD antibodies.

Compatibility of different CDR H3 variants with IGKV1-9 for binding to RBD
Besides antibodies from clonotypes 1 and 2, other IGHV3-53/3-66 RBD antibodies with a range of CDR H3 lengths pair with different light chains (Figure 1a). We further aimed to expand our analysis on CDR H3 compatibility to include CDR H3 from IGHV3-53/3-66 RBD antibodies other than clonotypes 1 and 2. In particular, we focused on identifying CDR H3 sequences that are compatible with IGKV1-9, which is used by clonotype 1 antibodies for binding to RBD. We first compiled a list of 143 CDR H3 variants that were observed in IGHV3-53/3-66 RBD antibodies Table 3

We noticed that some CDR H3 sequences that come from IGKV1-9 RBD antibodies do not enrich in binding. One possibility is that they are still able to bind to RBD, but with a lower affinity than B38, which has a KD of 70 nM to the RBD 26. However, as shown by our yeast display screen, CDR H3 sequences from IGKV1-9 antibodies in general have a significantly stronger binding to RBD than those from non-IGKV1-9 antibodies (p-value = 0.002, Figure 4d), whereas their expression level is only marginally higher than that from non-IGKV1-9 antibodies (p-value = 0.06,

were then analyzed (Figure 5a). This analysis included 214 IGHV3-53/3-66 RBD antibodies that have sequence information available. One clear observation is that Y58F is highly common among IGHV3-53/3-66 RBD antibodies with a CDR H3 length of less than 15 amino acids, but completely absent when the CDR H3 length is 15 amino acids or above, suggesting that Y58F improves the binding affinity of IGHV3-53/3-66 antibodies to RBD only when they have a short CDR H3 loop (CDR H3 < 15 amino acids). To understand the effect of Y58F on the binding affinity of IGHV3-53/3-66 antibodies to the RBD, we compared the binding affinity of the same antibodies that carry either Y58 or F58 to the RBD. 
In particular, we focused on three IGHV3-53/3-66 RBD antibodies that have a CDR H3 length of 9 amino acids: one in clonotype 1 (COV107-23) and two in clonotype 2 (COVD21-C8 and CC12.3). Our BLI experiments showed that the Y58F mutation dramatically improved the affinity of the three antibodies (COV107-23, COVD21-C8 and

Although this study revealed that Y58F is a common SHM that improves the affinity of IGHV3-53/3-66 antibodies with a short CDR H3 to RBD, other common SHMs have also shown up in our sequence analysis (Figure 5a), albeit with a lower frequency. Most noticeably, a cluster of common SHMs is found in VH framework region 1 from residues 26 to 28. This cluster of SHMs is also likely to be important for affinity maturation to RBD. A recent study has indeed shown that SHMs VH F27V and T28I together increase the affinity of an IGHV3-53/3-66 antibody to the SARS-CoV-2 RBD by 100-fold 38. Additional common SHMs among IGHV3-53/3-66 RBD antibodies with a short CDR H3 include S31R in CDR H1 and V50L in CDR H2 (Figure 5a)

shown to also alter antigenicity of the spike protein 66,69-71. Consistently, we found that K417N dramatically decreased the binding of COV107-23 (clonotype 1) and COVD21-C8 (clonotype 2) to RBD (Supplementary Figures 11a-11b). In fact, K417 forms an electrostatic interaction with the signature residue VH D/E98 of CDR H3 in clonotype 2 antibodies (Figure 2b) and can also interact with CDR H3 of clonotype 1 antibodies (Supplementary Figure 11c), providing a structural explanation for its change in antigenicity. Constant antigenic drift of SARS-CoV-2 is unavoidable if it keeps circulating among humans. 
Thus, sustained efforts in characterizing the antibody response to SARS-CoV-2 as it evolves will not only benefit vaccine development and assessment, but also improve our fundamental understanding of the ability of the antibody repertoire to rapidly respond to viral infections.

Literature mining for antibodies to SARS-CoV-2 RBD
Sequences of anti-SARS-CoV-2 RBD antibodies from convalescent patients infected with SARS-CoV-2 were obtained from published articles 20-40 (Supplementary Table 1

100 µl of WT B38 yeast antibody display library glycerol stock was recovered in 50 ml SD-CAA medium (2% w/v D-glucose, 0.67% w/v yeast nitrogen base with ammonium sulfate, 0.5% w/v casamino acids, 0.54% w/v Na2HPO4, 0.86% w/v NaH2PO4•H2O, all dissolved in deionized water) by incubating at 27°C with shaking at 250 rpm until OD600 reached between 1.5 and 2.0. At this time, 15 ml of the yeast culture was harvested, and the yeast pellet was obtained via centrifugation at 4,000 × g at 4°C for 5 min. The supernatant was discarded, and SGR-CAA (2% w/v galactose, 2% w/v raffinose, 0.1% w/v D-glucose, 0.67% w/v yeast nitrogen base with ammonium sulfate, 0.5% w/v casamino acids, 0.54% w/v Na2HPO4, 0.86% w/v NaH2PO4•H2O, all dissolved in deionized water) was added to make up the volume to 50 ml. The yeast culture was then transferred to a baffled flask and incubated at 18°C with shaking at 250 rpm. Once OD600 had reached between 1.3 and 1.6, 1 ml of yeast culture was harvested, and the yeast pellet was obtained via centrifugation at 4,000 × g at 4°C for 5 min. The pellet was subsequently washed with 1 ml of 1x PBS twice. After the final wash, cells were resuspended in 1 ml of 1x PBS.
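As a convenience for preparing the media described above, the % w/v values can be converted to masses for a given volume (1% w/v corresponds to 1 g per 100 ml). The sketch below is only a worked arithmetic example; the helper name `grams_needed` and the dictionary layout are assumptions, and the percentages are copied from the SD-CAA recipe in the text.

```python
def grams_needed(percent_wv, volume_ml):
    """Mass in grams for a given % w/v (g per 100 ml) and final volume in ml."""
    return percent_wv / 100.0 * volume_ml


# SD-CAA components and their % w/v as listed in the text.
SD_CAA = {
    "D-glucose": 2.0,
    "yeast nitrogen base with ammonium sulfate": 0.67,
    "casamino acids": 0.5,
    "Na2HPO4": 0.54,
    "NaH2PO4.H2O": 0.86,
}

volume_ml = 50  # recovery volume used in the protocol
for component, percent in SD_CAA.items():
    print(f"{component}: {grams_needed(percent, volume_ml):.3f} g")
```

The same helper applies to SGR-CAA by swapping in its component list.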