A general model of multivalent binding with ligands of heterotypic subunits and multiple surface receptors

Zhixin Cyrillus Tan; Aaron S. Meyer

doi:10.1101/2021.03.10.434776

Abstract

Multivalent cell surface receptor binding is a ubiquitous biological phenomenon with functional and therapeutic significance. Predicting the amount of ligand binding for a cell remains an important question in computational biology as it can provide great insight into cell-to-cell communication and rational drug design toward specific targets. In this study, we extend a mechanistic, two-step multivalent binding model to account for multiple ligands and receptors, optionally allowing heterogeneous complexes. We derive the macroscopic predictions for both specifically arranged and randomly assorted complexes, and demonstrate how this model enables large-scale predictions on mixture binding and the binding space of a ligand. This model provides an elegant and computationally efficient framework for analyzing multivalent binding.

1 Introduction

Binding to extracellular ligands is among the most fundamental and universal activities of a cell. Many important biological activities, and cell-to-cell communication in particular, are based on recognizing extracellular molecules via specific surface receptors. For example, multivalent ligands are common extracellular factors in the immune system [8], and many computational models have been applied to study IgE-FcεRI [5], MHC-T cell receptor [7], and IgG-FcγR interaction [16].

In this study, we extend a simple two-step, multivalent binding model to cases involving multiple receptors and ligand subunits [1, 2, 3, 12, 7]. By harnessing the power of combinatorics via applying the multinomial theorem and focusing on macrostates, we can predict the amount of binding for each ligand and receptor at the equilibrium state. Our model provides both generality and computational efficiency, allowing large-scale predictions such as characterizing synergism of using a mixture of ligands and depicting the binding space of a compound. The compactness and elegance of the formulae enable both analytical and numerical analyses. We expect this binding model will be widely applicable to many biological contexts.

2 Preliminaries

2.1 Vector and matrix notation

In this work, we denote a vector by a boldface letter and its entries by the same letter with a subscript and not in boldface, e.g. C = [C₁, C₂, …, C_n]. The sum of the elements of a vector is denoted as |C| = C₁ + C₂ + … + C_n.

For any matrix (A_ij) of size m × n, we denote the vector formed by its i-th row as A_i• = [A_i1, A_i2, …, A_in], and the vector formed by its j-th column as A_•j= [A_1j, A_2j, …, A_mj]. The row sums of matrix (A_ij), therefore, can be written as |A_1•|, |A_2•|, …, |A_m•|, and column sums |A_•1|, |A_•2|, …, |A_•n|.

In this work, multinomial coefficients such as n choose k₁, k₂, …, k_n will be written as

$$\binom{n}{\mathbf{k}} = \binom{n}{k_1, k_2, \ldots, k_n} = \frac{n!}{k_1!\, k_2! \cdots k_n!}.$$

The implicit assumption here is that |k| = n, and each k_i ∈ ℕ.
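As a minimal sketch of this notation, the coefficient can be computed directly from the vector k. The function name `multinomial` is our own choice for illustration and does not come from the original work.

```python
from math import factorial


def multinomial(k):
    """Multinomial coefficient n! / (k_1! k_2! ...), where n = |k| is the sum of k."""
    if any(k_i < 0 for k_i in k):
        raise ValueError("entries of k must be nonnegative integers")
    coeff = factorial(sum(k))
    for k_i in k:
        # Each division is exact: the running value stays an integer.
        coeff //= factorial(k_i)
    return coeff


print(multinomial([2, 0, 1, 0]))  # 3
print(multinomial([1, 1]))        # 2
```

The total n is implied by the entries, which mirrors the assumption |k| = n stated above.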

2.2 Some useful theorems in combinatorics

From the binomial theorem, we know that

Differentiating both sides by Φ, we get

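We can derive a similar property from the multinomial theorem. Assume the elements of a nonnegative integer vector q add up to f, or |q| = f. Given another nonnegative vector φ with sum of elements |φ|, we have

$$\sum_{|\mathbf{q}|=f} \binom{f}{\mathbf{q}} \prod_{i} \varphi_i^{q_i} = |\boldsymbol{\varphi}|^{f}.$$

Differentiating both sides by φ_m, where φ_m can be any entry of φ, and rearranging, we have

$$\sum_{|\mathbf{q}|=f} q_m \binom{f}{\mathbf{q}} \prod_{i} \varphi_i^{q_i} = f\, \varphi_m\, |\boldsymbol{\varphi}|^{f-1}.$$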
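The multinomial theorem and its differentiated form can be checked numerically by brute-force enumeration. The sketch below sums over every nonnegative integer vector q with |q| = f and compares the results with |φ|^f and f φ_m |φ|^(f−1). The test vector `phi`, the valency `f`, and the helper names are arbitrary choices made for illustration.

```python
from itertools import product
from math import factorial, isclose, prod


def compositions(f, n):
    """Yield every nonnegative integer vector q of length n with |q| = f."""
    for q in product(range(f + 1), repeat=n):
        if sum(q) == f:
            yield q


def multinomial(q):
    coeff = factorial(sum(q))
    for q_i in q:
        coeff //= factorial(q_i)
    return coeff


def term(q, phi):
    """One term of the multinomial expansion: coefficient times prod_i phi_i^q_i."""
    return multinomial(q) * prod(p ** q_i for p, q_i in zip(phi, q))


f = 4
phi = [0.5, 1.2, 0.3]  # arbitrary nonnegative test vector
m = 1                  # 0-based index of the entry we differentiate by

# Multinomial theorem: the sum of all terms equals |phi|^f.
lhs_theorem = sum(term(q, phi) for q in compositions(f, len(phi)))
rhs_theorem = sum(phi) ** f

# Differentiated form: weighting each term by q_m gives f * phi_m * |phi|^(f-1).
lhs_derivative = sum(q[m] * term(q, phi) for q in compositions(f, len(phi)))
rhs_derivative = f * phi[m] * sum(phi) ** (f - 1)

print(isclose(lhs_theorem, rhs_theorem), isclose(lhs_derivative, rhs_derivative))
```

Enumeration grows quickly with f and the vector length, so it is only useful as a sanity check on small cases.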

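We can multiply two different multinomial theorem equations together, too. Let u and v be two nonnegative integer vectors and a and b be two nonnegative vectors, with |u| = m and |v| = n. Then we have

$$\Bigg(\sum_{|\mathbf{u}|=m} \binom{m}{\mathbf{u}} \prod_i a_i^{u_i}\Bigg)\Bigg(\sum_{|\mathbf{v}|=n} \binom{n}{\mathbf{v}} \prod_j b_j^{v_j}\Bigg) = \sum_{|\mathbf{u}|=m,\,|\mathbf{v}|=n} \binom{m}{\mathbf{u}}\binom{n}{\mathbf{v}} \prod_i a_i^{u_i} \prod_j b_j^{v_j} = |\mathbf{a}|^{m}\, |\mathbf{b}|^{n}.$$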

Throughout this paper, we consolidate multiple summation symbols into one. In this case, we use Σ_|u|=m,|v|=n as a shorthand for Σ_|u|=m Σ_|v|=n. From Eq.(2), we can derive the sum of a linear combination of two exponents from each multinomial term as where k₁ and k₂ are constants.
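For concreteness, one way to write the linear-combination identity for two multinomial factors is given below. It follows from applying the differentiated multinomial theorem to each factor separately. The index labels s and t, which pick one entry of u and one entry of v, are our own notation and may differ from the original.

```latex
\sum_{|\mathbf{u}|=m,\,|\mathbf{v}|=n}
  (k_1 u_s + k_2 v_t)
  \binom{m}{\mathbf{u}} \binom{n}{\mathbf{v}}
  \prod_i a_i^{u_i} \prod_j b_j^{v_j}
= k_1\, m\, a_s\, |\mathbf{a}|^{m-1} |\mathbf{b}|^{n}
+ k_2\, n\, b_t\, |\mathbf{a}|^{m} |\mathbf{b}|^{n-1}
```

Each term on the right comes from differentiating one factor while the other factor keeps its undifferentiated value.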

We can extend this to the product of N multinomial equations. Let q₁, …, q_N be N nonnegative integer vectors, each with |q_i | = θ_i, and Ψ₁,… , Ψ_N be N nonnegative vectors. Then, the sum of any linear combination of exponent terms , where k_r’s are constants and each is the t_r-th element of , can be calculated as

3 Model setup

3.1 Parameters and notations

In this study, we investigate the binding between multivalent ligand complexes and a cell expressing various surface receptors. As shown in Figure 1, we consider N_L types of distinct monomer ligands, namely L₁, L₂, …, L_{N_L}, and N_R types of distinct receptors expressed on a cell, namely R₁, R₂, …, R_{N_R}. The monovalent binding association constant between L_i and R_j is defined as K_a,ij. A ligand complex consists of one or several monomer ligands, and each of them can bind to a receptor independently. Its construction can be described by a vector θ = [θ₁, θ₂, …, θ_{N_L}], where each entry θ_i represents how many L_i this complex contains. The sum of elements of vector θ, |θ|, is f, the valency of this complex.

Figure 1:

General setup of the model. In this study, we investigate the binding behavior of complexes formed by monomer ligands in either specific arrangement or random assortment. We propose that the binding configuration between a complex and several receptors on a cell can be described as a matrix (q_ij). The construction of a complex can be written as a vector θ. The figure shows the dimensions of the model’s parameters: C_i, the monomer compositions, are in a vector of N_L; R_tot,j and R_eq,j, the receptor expression and equilibrium level are in vectors of N_R; the binding affinities, K_a,ij, are in a matrix of N_L × N_R; φ_ij and Ψ_ij are in the matrices of N_L × (N_R + 1). Θ is a set of all possible θ’s, with C_θ as their compositions. Each θ is a vector of N_L, and C_θ should be in a vector of the same size as Θ.

The binding configuration at equilibrium between an individual complex and a cell expressing various receptors can be described as a matrix (q_ij) with N_L rows and (N_R + 1) columns. For example, the complex bound as shown on the top left corner in Figure 1 can be described as the matrix below it. Entry q_ij represents the number of L_i to R_j binding, and q_i0, the entry on the 0-th column, is the number of unbound L_i on that complex in this configuration. This matrix can be unrolled into a vector form of length N_L(N_R + 1). Note that this binding configuration matrix (q_ij) only records how many L_i-to-R_j pairs are formed, regardless of which exact ligand on the complex binds. For example, in Figure 1, swapping the two L₂’s binding to R₂’s will give us the same configuration matrix. Therefore, we will need to account for this combinatorial factor when applying the law of mass action.

We know from the conservation of mass that for this complex, q_i0 + q_i1 + … + q_iN_R = θ_i must hold for all i. Mathematically, vector θ is the row sums of matrix (q_ij). The θ corresponding to a binding configuration q, written in function form as θ(q), is determined by this relationship. Also, the sum of all elements in q is the valency, |q| = f.
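The bookkeeping above can be sketched with a small, hypothetical configuration matrix. The matrix `q` and the helper names below are illustrative assumptions, not values read from Figure 1.

```python
# Hypothetical configuration: N_L = 2 ligand types, N_R = 3 receptor types.
# Column 0 counts unbound units; column j >= 1 counts units bound to R_j.
q = [
    [1, 1, 0, 0],  # L1: one unit unbound, one bound to R1
    [2, 0, 1, 0],  # L2: two units unbound, one bound to R2
]


def theta_of(q):
    """Complex composition theta(q): the row sums of the configuration matrix."""
    return [sum(row) for row in q]


def valency(q):
    """Valency f = |q|, the sum of all entries."""
    return sum(sum(row) for row in q)


def bound_units(q):
    """Number of ligand units engaged with a receptor (all columns except the 0-th)."""
    return sum(sum(row[1:]) for row in q)


def unroll(q):
    """Flatten the matrix row by row into a vector of length N_L * (N_R + 1)."""
    return [entry for row in q for entry in row]


print(theta_of(q), valency(q), bound_units(q), len(unroll(q)))
```

Storing q as a list of rows keeps the row-sum relationship θ(q) a one-line computation.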

The concentration of complexes in the solution is L₀ (not to be confused with L_i, the name of ligands, when i = 1, 2, …, N_L). The number of ligand complexes in the solution is usually much greater than that of the receptors and so it is a common practice to assume binding does not deplete the ligand concentration.

On the receptor side, R_tot,i is the total number of R_i expressed on the cell surface. This usually can be measured experimentally. R_eq,i is the number of unbound R_i on a cell at the equilibrium state during the ligand complex-receptor interaction, and it needs to be calculated from R_tot,i as we will explain later.

The binding of a ligand complex, a large molecule, is complicated. To simplify the matter, we will need to make some key thermodynamic assumptions. In this model, we make two assumptions on the binding dynamics:

1. The initial binding between a free (unbound) complex and a surface receptor R_j has the same affinity (association constant, K_a,ij) as the monomer ligand L_i;
2. In order for detailed balance to hold, the affinity constant of any subsequent binding event on the surface of a cell after the initial interaction must be proportional to the corresponding monovalent affinity. We assume the subsequent binding affinity in multivalent interactions between L_i and R_j to be K_x* K_a,ij.

K_x* is a term known as the crosslinking constant. It captures the difference between free and multivalent ligand-receptor binding, including but not limited to steric effects and local receptor clustering [4]. In practice, this term is often fit when applying this model to a specific biological context.

We create two more variables that will help to simplify our equations throughout this work. For all i in {1, 2, …, N_L}, we define Ψ_ij = R_eq,j K_a,ij K_x* and φ_ij = R_eq,j K_a,ij K_x* C_i where j = 1, 2, …, N_R, and we define Ψ_i0 = 1, φ_i0 = C_i. Therefore, φ_ij = Ψ_ij C_i holds for all i and j. Then we define the sum of this new matrix (φ_ij) as Φ = Σ_i Σ_j φ_ij (with i = 1, …, N_L and j = 0, …, N_R), and . The rationale of these definitions will become clear in future sections.
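A minimal sketch of how these two matrices might be assembled is shown below, assuming Ψ_ij = R_eq,j K_a,ij K_x* for j ≥ 1 and φ_ij = Ψ_ij C_i. The affinities are those quoted in Section 5.1. The free receptor counts `R_eq`, the crosslinking constant `Kx`, and the compositions `C` are hypothetical placeholders.

```python
def psi_matrix(R_eq, Ka, Kx):
    """Psi has N_L rows and N_R + 1 columns: Psi[i][0] = 1 and
    Psi[i][j] = R_eq[j-1] * Ka[i][j-1] * Kx for j >= 1."""
    return [[1.0] + [R_eq[j] * Ka_i[j] * Kx for j in range(len(R_eq))] for Ka_i in Ka]


def phi_matrix(Psi, C):
    """phi[i][j] = Psi[i][j] * C[i], so that phi[i][0] = C[i]."""
    return [[C_i * psi for psi in row] for row, C_i in zip(Psi, C)]


Ka = [[1e8, 1e5, 6e5],   # affinities of L1 for R1..R3, in 1/M (Section 5.1)
      [3e5, 1e7, 1e6]]   # affinities of L2 for R1..R3, in 1/M (Section 5.1)
R_eq = [1e4, 2e4, 1e3]   # hypothetical free receptor counts per cell
Kx = 1e-12               # hypothetical crosslinking constant
C = [0.4, 0.6]           # hypothetical monomer compositions, summing to 1

Psi = psi_matrix(R_eq, Ka, Kx)
phi = phi_matrix(Psi, C)
Phi = sum(sum(row) for row in phi)  # sum over the whole matrix, including column 0
print(Phi)
```

Because the 0-th column of φ holds the C_i, it contributes exactly 1 to Φ whenever the compositions sum to 1.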

3.2 The amount of a specific binding configuration

Now we will derive the amount of complexes bound with the configuration described as q on a cell at equilibrium, v_q.

Within the definitions of our model, we know that the composition of any complex can be described by a vector θ of length N_L, where each entry θ_i represents the number of monomer L_i this complex consists of. We can enumerate all possible binding configurations of a θ complex by filling the matrix (q_ij) with any nonnegative integer values so long as its row sums equal θ. Conversely, we can infer the complex composition from any binding configuration q by finding its row sums, θ(q). For a certain configuration q, its θ(q) is determined and has concentration L₀C_θ(q). If the corresponding complex θ(q) does not exist in the solution, C_θ(q) = 0. Since we assumed that binding will not deplete the ambient concentration of any θ(q), it will remain L₀C_θ(q) at equilibrium.

Initial binding

We start with the initial binding reaction of a complex, L_i-to-R_j. As shown in Figure 2, the reactants of this reaction are the free complexes and the free receptors R_j (in this case R₂), and the products are L_i-to-R_j (in this case L₂-R₂) monovalently bound complexes q₍₁₎. We denote the concentration of this new complex as v_q(1). The concentration of free complexes is L₀C_θ(q₍₁₎). By the assumption of the model, the equilibrium constant for the reaction is K_a,ij. Therefore, we have

$$v_{q_{(1)}} = L_0\, C_{\theta(q_{(1)})}\, R_{eq,j}\, K_{a,ij}.$$

Figure 2:

A scheme of cell-complex binding step by step. We assume the initial binding event has the same affinity as monomer binding, K_a,ij, while subsequent binding has an association constant scaled by K_x*, the crosslinking constant. Each binding configuration scheme above can be described by the q right below, if we ignore the statistical factors. θ(q) is the structure of the complex and can be derived from q.

While the binding configuration of q₍₁₎ can be described by q_a, the total amount of complexes bound as described by q_a may not be the same as v_q(1), since q_a does not consider the number of ways this binding L_i can be chosen. Equivalently, q₍₁₎ is only one possible microstate that achieves the q_a configuration, and we need to count how many microstates are possible for q_a. Accounting for this statistical factor, we have

$$v_{q_a} = \binom{\theta_i}{q_{a,i\bullet}} v_{q_{(1)}} = \binom{\theta_i}{q_{a,i\bullet}} L_0\, C_{\theta(q_a)}\, R_{eq,j}\, K_{a,ij},$$

since θ(q₍₁₎) = θ(q_a). Here q_a,i• is the vector formed by the i-th row of q_a. For example, in Figure 2, q_a,2• = [2, 0, 1, 0]. Conceptually, this multinomial coefficient can be understood as the number of ways to split θ_i L_i’s into q_i0 unbound units, q_i1 R₁-bound, q_i2 R₂-bound, …, and q_iN_R R_{N_R}-bound. Here, only q_i0 and q_ij will be nonzero, with q_i0 = θ_i − 1 and q_ij = 1, so it is effectively the same as θ_i. However, the multinomial coefficient expression can be generalized into more complicated cases.

Subsequent binding

For a subsequent binding between L_i and R_j (i and j are not necessarily the same as in initial binding), we have the reactants as a bound complex, q₍₁₎, and a free receptor R_j (in the case shown by Figure 2, R₂), while the product is another bound complex, q₍₂₎. The equilibrium constant is K_x* K_a,ij, so

$$v_{q_{(2)}} = v_{q_{(1)}}\, R_{eq,j}\, K_x^{*} K_{a,ij}.$$

To account for the statistical factors for , we have . For example, in Figure 2, q_b,2• = [1, 0, 2, 0]. Putting these together, we have

By recursion, we can solve v_q for any q from these equations. It is if we define for j = 1, 2, …, N_R and Ψ_i0 = 1 for all i. is a shorthand for . In the next section, we will use this formula repeatedly.
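The recursion can be sketched in code under the assumptions stated in the text. Each bound L_i-to-R_j pair contributes a factor Ψ_ij = R_eq,j K_a,ij K_x*. The first binding event lacks the crosslinking factor, which is accounted for by dividing once by K_x*. Each row of q contributes one multinomial coefficient as its statistical factor. The resulting closed form below is our reading of the derivation, and the function and argument names are illustrative.

```python
from math import factorial


def multinomial(k):
    coeff = factorial(sum(k))
    for k_i in k:
        coeff //= factorial(k_i)
    return coeff


def v_q(q, L0, C_theta, Psi, Kx):
    """Amount of complexes bound in configuration q at equilibrium.

    Assumed closed form:
        L0 * C_theta / Kx * prod_i [ multinomial(q_i.) * prod_j Psi_ij ** q_ij ]
    q and Psi are lists of rows with N_R + 1 columns; Psi[i][0] must be 1.
    """
    amount = L0 * C_theta / Kx
    for q_row, psi_row in zip(q, Psi):
        amount *= multinomial(q_row)          # statistical factor of row i
        for q_ij, psi_ij in zip(q_row, psi_row):
            amount *= psi_ij ** q_ij          # one factor per bound pair
    return amount


# One bivalent ligand type and one receptor type; Psi_11 = R_eq * Ka * Kx = 0.1.
Psi = [[1.0, 0.1]]
L0, Kx = 1e-9, 1e-9  # hypothetical values chosen so that L0 / Kx = 1
print(v_q([[1, 1]], L0, 1.0, Psi, Kx))  # monovalently bound
print(v_q([[0, 2]], L0, 1.0, Psi, Kx))  # bivalently bound
```

For the monovalent case the result reduces to θ_i L₀ C_θ R_eq,j K_a,ij, which matches the initial binding step with its statistical factor.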

Notice that this equation is not suitable for calculating the concentration of unbound q, whose nonzero values all lie in the 0-th column. The concentration of unbound ligands should always be L₀C_θ(q). However, for algebraic convenience, we allow such a definition and will name it v_0,eq which equals .

4 Macroscopic equilibrium predictions

From here we will investigate the macroscopic properties of binding, such as the total amount of ligand bound and receptor bound on a cell surface at equilibrium. We consider two different ways in which complexes in the solution can be formed. First, complexes can be formed in a specific arrangement. In this case, the structure and exact concentration for each kind of complex are designed and known. Alternatively, we can set a fixed valency f for all complexes given the known proportion of each ligand monomer. Through random assortment, any combination of f monomer ligands can form a complex, and their concentration will follow a multinomial distribution. We will explore these two cases separately.

4.1 Complexes formed in a specific arrangement

When complexes are specifically arranged, the structure and proportion of each kind are well-defined. To formulate this mathematically, we assume that we have various kinds of complexes, and each of them can be described by a vector θ of length N_L, with each entry θ_i as the number of L_i in this complex. The valency of each complex may be different, and for complex θ its valency is |θ|. The proportion of θ among all complexes is defined as C_θ, and the concentration of each θ complex will be L₀C_θ. For example, if we create a mixture of 20% of bivalent L₁ and 80% of bispecific L₁ − L₂, then θ₁ = [2, 0], θ₂ = [1, 1], C_θ₁ = 0.2, and C_θ₂ = 0.8. If the mixture solution has a total concentration of 10 nM, then the concentration of θ₁ is 2 nM, and the concentration of θ₂ is 8 nM.

We further define Θ as the set of all existing θ’s. With this setting, we should have Σ_θ∈Θ C_θ = 1. These complexes will bind in various configurations which can all be described as a q. We define Q as the set of all possible q’s, and we borrow the notation q ⊆ θ to indicate any binding configuration q that can be achieved by complex θ. This is equivalent to |q_i•| = θ_i for all i, that is, θ is the vector of row sums of (q_ij).

Solve the amount of free receptors

A remaining problem in the model setup is that in practice we can only experimentally measure the total amount of receptor of each kind expressed by a cell, R_tot,j, while the amount of free receptors at equilibrium, R_eq,j, though being used extensively in the model derivation, is unknown. To find R_eq,j, we first need to derive the amount of bound receptors of each kind, R_bound,j, then use conservation of mass to solve R_eq,j numerically.

To calculate the amount of bound receptors R_bound,n, we can simply add up all entries in the n-th column over every q: where , and .

By the conservation of mass, we have

In this equation, R_tot,n is known, and every Ψ_i• is a function of every R_eq,j, j = 1, 2, …, N_R, so all R_eq,j need to be solved together. This system of equations usually does not have a closed form and must be solved numerically. When implementing, we suggest taking the logarithm of both sides of these equations so the exponents can be eliminated and the range is restricted to positive numbers.
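The log-space formulation suggested above can be sketched as a residual function that any multidimensional root finder could drive to zero. The expression used for R_bound,n is our own application of the differentiated multinomial theorem to each row of q, assuming Ψ_ij = R_eq,j K_a,ij K_x*. All function names and the test parameters are illustrative.

```python
from math import exp, log, prod


def bound_receptors(R_eq, L0, Kx, Ka, thetas, C_thetas):
    """Bound receptors of each kind for specifically arranged complexes.

    Assumed closed form, per complex theta:
        L0 * C_theta / Kx * prod_i |Psi_i.|^theta_i * sum_i theta_i * Psi_in / |Psi_i.|
    """
    n_R = len(R_eq)
    # Psi for receptor columns only (j >= 1); the 0-th column is 1 by definition.
    Psi = [[R_eq[j] * Ka_i[j] * Kx for j in range(n_R)] for Ka_i in Ka]
    row_sum = [1.0 + sum(row) for row in Psi]  # |Psi_i.| including Psi_i0 = 1
    R_bound = [0.0] * n_R
    for theta, C_theta in zip(thetas, C_thetas):
        all_states = prod(s ** t for s, t in zip(row_sum, theta))
        for n in range(n_R):
            per_unit = sum(t * Psi[i][n] / row_sum[i] for i, t in enumerate(theta))
            R_bound[n] += L0 * C_theta / Kx * all_states * per_unit
    return R_bound


def log_residual(log_R_eq, R_tot, L0, Kx, Ka, thetas, C_thetas):
    """log(R_tot) - log(R_eq + R_bound) for every receptor; zero at equilibrium.

    Working with log(R_eq) keeps every trial value of R_eq positive.
    """
    R_eq = [exp(x) for x in log_R_eq]
    R_bound = bound_receptors(R_eq, L0, Kx, Ka, thetas, C_thetas)
    return [log(R_tot[n]) - log(R_eq[n] + R_bound[n]) for n in range(len(R_tot))]


# Monovalent check: a single ligand and receptor give R_eq = R_tot / (1 + L0 * Ka).
R_tot, L0, Kx, Ka = [1e4], 1e-9, 1e-12, [[1e8]]
exact = log(R_tot[0] / (1.0 + L0 * Ka[0][0]))
print(log_residual([exact], R_tot, L0, Kx, Ka, [[1]], [1.0]))
```

The residual takes log(R_eq) as its argument, so an unconstrained solver cannot step into negative receptor counts.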

As a side note, the total amount of bound receptors regardless of which kind is

The amount of bound ligand complexes

Our model makes many macroscopic predictions readily accessible. For example, the amount of ligand bound at equilibrium is a useful quantity when measuring the overall quantity of tagged ligand. To compute this number, we can add up all v_q except the q’s that only have nonzero values on the 0-th column, v_0,eq. Consequently, the model prediction of bound ligand at equilibrium is when , and the predicted amount of bound complex θ (complex of each kind) is
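The summation described above can be sketched by brute force and compared with a compact expression. The sketch assumes v_q = L₀C_θ/K_x* · Π_i [multinomial(q_i•) Π_j Ψ_ij^q_ij], so that summing every bound configuration of a complex θ gives L₀C_θ/K_x* · (Π_i |Ψ_i•|^θ_i − 1). Both the closed form and the hypothetical Ψ matrix are our own illustration.

```python
from itertools import product
from math import factorial, isclose, prod


def compositions(total, parts):
    """Every nonnegative integer vector of length `parts` summing to `total`."""
    return [c for c in product(range(total + 1), repeat=parts) if sum(c) == total]


def multinomial(k):
    coeff = factorial(sum(k))
    for k_i in k:
        coeff //= factorial(k_i)
    return coeff


def v_q(q, L0, C_theta, Psi, Kx):
    amount = L0 * C_theta / Kx
    for q_row, psi_row in zip(q, Psi):
        amount *= multinomial(q_row) * prod(p ** x for p, x in zip(psi_row, q_row))
    return amount


def ligand_bound_enumerated(theta, C_theta, L0, Psi, Kx):
    """Add up v_q over every configuration of theta with at least one bound unit."""
    n_cols = len(Psi[0])
    rows = [compositions(t, n_cols) for t in theta]
    total = 0.0
    for q in product(*rows):
        if any(x for row in q for x in row[1:]):  # skip the fully unbound q
            total += v_q(q, L0, C_theta, Psi, Kx)
    return total


def ligand_bound_closed(theta, C_theta, L0, Psi, Kx):
    """Closed form: all configurations minus the single unbound one."""
    return L0 * C_theta / Kx * (prod(sum(r) ** t for r, t in zip(Psi, theta)) - 1.0)


Psi = [[1.0, 0.5, 0.2, 0.1],    # hypothetical Psi rows, column 0 fixed at 1
       [1.0, 0.3, 1.5, 0.05]]
for theta in ([2, 0], [1, 1]):
    a = ligand_bound_enumerated(theta, 1.0, 1e-9, Psi, 1e-12)
    b = ligand_bound_closed(theta, 1.0, 1e-9, Psi, 1e-12)
    print(theta, isclose(a, b))
```

The closed form needs only the row sums of Ψ, which is what makes large mixtures cheap to evaluate.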

The amount of fully bound ligands

In multivalent complexes like bispecific antibodies, drug activity may require that all subunits be bound to their respective targets [13]. The predicted amount of ligand bound f-valently can be calculated as with , the q_i• vector without q_i0. In this equation, the multinomial coefficient describes the number of ways one can allocate θ_i receptors to any position in the i-th row of the (q_ij) matrix except the 0-th column, which stands for unbound.

In fact, the predicted amount of any specific-valently bound ligands can be derived in such manner. For example, the amount of ligands that bind monovalently can be calculated as

This can be used for estimating the amount of multimerized ligands, L_multi = L_bound − v_1,eq, and multimerized receptors, R_multi = R_bound − v_1,eq.

4.2 Complexes formed through random assortment

Another common mode of forming multivalent complexes in biology, such as in the formation of antibody-antigen complexes [16], is engagement of monomer units to a common scaffold. Instead of resulting in a specific arrangement, we provide binding compounds of a fixed valency f and a pool of monomer ligands, and complexes can form through random assortment. The concentration of these complexes, therefore, will follow a multinomial distribution.

To formulate this mathematically, we denote the proportion of L_i as C_i, with C₁ + C₂ + … + C_{N_L} = 1. For example, if we have 40% L₁ and 60% L₂ in the solution to form dimers (f = 2), then C₁ = 40%, C₂ = 60%. Assuming complex formation follows a binomial distribution, there will be 16% bivalent L₁, 36% bivalent L₂, and 48% L₁L₂ complex. When a complex is randomly assembled from the monomer ligands, the probability that it forms as described by θ is

$$C_{\theta} = \binom{f}{\theta} \prod_{i=1}^{N_L} C_i^{\theta_i}.$$

Since , we know that

Plugging this relationship between C_θ and C_i into the equation for the amount of a specific binding configuration derived in the previous section, we have where and φ_i0 = C_i.
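The multinomial proportions of randomly assorted complexes can be tabulated with a short script. The sketch below reproduces the dimer example with 40% L₁ and 60% L₂. The function name `complex_proportions` is our own.

```python
from itertools import product
from math import factorial, prod


def complex_proportions(C, f):
    """Map each composition theta with |theta| = f to its multinomial probability."""
    table = {}
    for theta in product(range(f + 1), repeat=len(C)):
        if sum(theta) != f:
            continue
        coeff = factorial(f)
        for t in theta:
            coeff //= factorial(t)
        table[theta] = coeff * prod(c ** t for c, t in zip(C, theta))
    return table


table = complex_proportions([0.4, 0.6], 2)
for theta, proportion in sorted(table.items(), reverse=True):
    # Approximately 0.16 for (2, 0), 0.48 for (1, 1), and 0.36 for (0, 2).
    print(theta, round(proportion, 4))
```

The proportions always sum to (C₁ + … + C_{N_L})^f, which is 1 when the monomer compositions are normalized.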

Solve the amount of free receptors

Like in the specific arrangement case, we still need to solve R_eq,n numerically from R_tot,n. We first derive the amount of bound receptors of each kind at equilibrium as

Then by the conservation of mass, we have the equation to numerically solve for R_eq,n:

Again, since Φ is a function of every R_eq,n, all R_eq,n need to be solved together.
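For randomly assorted complexes, one way to solve all R_eq,n together is to reduce the system to a single unknown. The sketch assumes Σ_i C_i = 1 and R_bound,n = L₀ f R_eq,n (Σ_i C_i K_a,in) (1 + S)^(f−1), where S = Σ_n R_eq,n K_x* Σ_i C_i K_a,in. These expressions are our own derivation from the multinomial identities. Under them, each R_eq,n is an explicit function of S, and S can be found by bisection. The crosslinking constant and compositions used below are hypothetical.

```python
def solve_free_receptors(R_tot, L0, Kx, Ka, C, f, iterations=200):
    """Free receptors R_eq for random assortment, found by bisection on the scalar S."""
    n_L, n_R = len(C), len(R_tot)
    # Composition-weighted affinity of each receptor: sum_i C_i * Ka_ij.
    Ka_mix = [sum(C[i] * Ka[i][j] for i in range(n_L)) for j in range(n_R)]

    def free(S):
        # Conservation of mass, R_tot = R_eq + R_bound, solved for R_eq at fixed S.
        return [R_tot[j] / (1.0 + L0 * f * Ka_mix[j] * (1.0 + S) ** (f - 1))
                for j in range(n_R)]

    def gap(S):
        # Zero when S agrees with the value implied by the free receptors it produces.
        return S - sum(Ka_mix[j] * Kx * r for j, r in enumerate(free(S)))

    # gap is increasing in S, nonpositive at 0 and nonnegative at the upper bound.
    lo, hi = 0.0, sum(Ka_mix[j] * Kx * R_tot[j] for j in range(n_R))
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return free(0.5 * (lo + hi))


R_tot = [2.5e4, 3e4, 2e3]                        # receptors per cell (Section 5.1)
Ka = [[1e8, 1e5, 6e5], [3e5, 1e7, 1e6]]          # affinities in 1/M (Section 5.1)
R_eq = solve_free_receptors(R_tot, L0=1e-9, Kx=1e-12, Ka=Ka, C=[0.4, 0.6], f=2)
print(R_eq)
```

Bisection is slower than Newton-type methods but cannot leave the bracket, which makes it a forgiving starting point.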

The amount of k-valently bound complexes

For randomly assorted complexes, we first derive the amount of ligands that bind k-valently. As we will show, it has a nice expression that can be used to calculate many other quantities conveniently. First, let’s break q into two separate vectors, q = (q_•0, q_•x). We define the vector formed by the 0-th column of q, which stands for unbound units, as q_•0, and the one formed by the other elements as q_•x. By the model setup, we know |q| = f, |q_•x| = k, and |q_•0| = f − k. We then have

The amount of total bound ligands and receptors

Many macroscopic properties can be derived from v_k,eq. For example, the amount of total bound ligands is simply the sum of ligands bound monovalently to fully, and can be simplified to

Similarly, the total bound receptors should be

As we show here, these quantities all have elegant closed form solutions, and they depend only on Φ, a single value that incorporates all information about receptor amounts, monomer ligand compositions, and binding affinities.
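The closed forms can be sanity-checked against brute-force enumeration. The sketch assumes v_q = L₀/K_x* · multinomial(q) · Π φ_ij^q_ij for random assortment, which gives v_k = L₀/K_x* · C(f, k) · S^k · U^(f−k), where S is the sum of φ over the receptor columns and U is the sum of its 0-th column (equal to 1 when the C_i sum to 1). These expressions and the hypothetical φ matrix are our own illustration.

```python
from itertools import product
from math import comb, factorial, isclose, prod


def multinomial(q):
    coeff = factorial(sum(q))
    for q_i in q:
        coeff //= factorial(q_i)
    return coeff


def v_k_enumerated(phi, f, L0, Kx):
    """Sum v_q over every q with |q| = f, grouped by the number of bound units k."""
    n_cols = len(phi[0])
    flat = [p for row in phi for p in row]
    bound_col = [idx % n_cols != 0 for idx in range(len(flat))]
    v_k = [0.0] * (f + 1)
    for q in product(range(f + 1), repeat=len(flat)):
        if sum(q) != f:
            continue
        k = sum(x for x, is_bound in zip(q, bound_col) if is_bound)
        v_k[k] += L0 / Kx * multinomial(q) * prod(p ** x for p, x in zip(flat, q))
    return v_k


def v_k_closed(phi, f, L0, Kx):
    unbound = sum(row[0] for row in phi)        # U, equals 1 when sum(C) = 1
    bound = sum(sum(row[1:]) for row in phi)    # S, sum over receptor columns
    return [L0 / Kx * comb(f, k) * bound ** k * unbound ** (f - k)
            for k in range(f + 1)]


phi = [[0.4, 0.2, 0.08],   # hypothetical phi matrix: column 0 holds C = [0.4, 0.6]
       [0.6, 0.18, 0.9]]
f, L0, Kx = 3, 1e-9, 1e-12
enumerated, closed = v_k_enumerated(phi, f, L0, Kx), v_k_closed(phi, f, L0, Kx)

S = sum(sum(row[1:]) for row in phi)
L_bound = sum(closed[1:])                            # ligands bound with k >= 1
R_bound = sum(k * v for k, v in enumerate(closed))   # each k-valent ligand holds k receptors
print(isclose(L_bound, L0 / Kx * ((1 + S) ** f - 1)),
      isclose(R_bound, L0 / Kx * f * S * (1 + S) ** (f - 1)))
```

Summing v_k over k ≥ 1 and weighting it by k recover the total bound ligands and total bound receptors, respectively.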

The number of cross-linked receptors

In some biological contexts such as T cell receptor-MHC [7] or antibody-Fc receptor [16] interactions, signal transduction is driven by receptor cross-linking due to multivalent binding. The amount of total cross-linked receptors can be derived from v_k,eq as

Finding the number of crosslinked receptors of a specific kind, R_n, requires extra consideration. Similar to how v_k,eq was found, we break q into three separate vectors, q = (q_•0, q_•n, q_•x). q_•0 is the vector formed by the 0-th column of q, q_•n is the vector formed by the n-th column of q, and q_•x contains all others. If we assume that a complex is s-valently bound, then |q_•0| = f−s. We further assume that |q_•n| = t, then |q_•x| = s−t. By this setup, we have

This formula can be useful when investigating the role of each receptor in a pathway that requires multimerized binding.

Of course, the macroscopic predictions provided in this section cannot exhaust many biological quantities one may wish to study, but with the ideas we have demonstrated here, the readers can derive their own formulae as needed.

5 Application examples

In previous sections, we have shown how all macroscopic predictions made in this work can be written in closed form formulae. Therefore, many computational methods such as auto-differentiation and sensitivity analysis can be easily applied. These analyses will bring great insights into the complex behavior of multivalent binding. Here, we provide two examples to demonstrate the advantage of large-scale predictions made possible by this model.

5.1 Mixture binding prediction

Leveraging the synergistic effect among two or more drugs is of great interest in pharmaceutical development. A challenge in investigating synergy is to identify its underlying source. Most biological pathways follow a similar pattern: when the drug binds to certain surface receptors of a cell, a downstream pathway in the cell is initiated, leading to some actions. Therefore in general, synergism can come from either the initial binding events themselves or downstream processes. Binding-level synergy means that merely using a combination of ligands boosts the amount of binding to the important receptors and thus intensifies the overall effect. Downstream effect synergy indicates that the benefit of using mixtures arises from other cellular regulatory mechanisms two ligands can bring about. The binding model we introduced can help to investigate this issue by offering accurate predictions for the binding of multivalent complex mixtures.

In Figure 3, we provide an example of mixture binding predictions. We investigate a mixture of two types of ligand complexes, bivalent L₁ (θ₁ = [2, 0]) and bispecific L₁−L₂ (θ₂ = [1, 1]). The crosslinking constant is set to be , similar to previous results [16]. We predict the amount of binding of this mixture to a cell expressing three types of receptors, with R_tot = [2.5 × 10⁴, 3 × 10⁴, 2 × 10³] cell⁻¹. The affinity constants of L₁ to these three receptors are K_a,1• = [1 × 10⁸, 1 × 10⁵, 6 × 10⁵] M⁻¹, and of L₂, K_a,2• = [3 × 10⁵, 1 × 10⁷, 1×10⁶] M⁻¹. Figure 3 shows the predicted ligand bound (left panel) and R₃ bound (right panel) for only θ₁ or θ₂ with L₀ from 0 to 1 nM, and their mixtures in every possible composition with total concentration L₀ = 1 nM (from and to and ).

Figure 3:

Prediction on mixture binding of θ₁ = [2, 0] and θ₂ = [1, 1]. The left panel shows the predicted total ligand binding, while the right shows the amount of bound R₃ at equilibrium. Shaded areas are simulated confidence intervals obtained by varying the receptor levels up and down by 10%. The red dots on the left panel are simulated experimental results. In case a (red circles), since most data points are inside the confidence interval, we can assume the measurement error can explain these variations. In case b (red squares), however, the synergism of these complexes is beyond the binding level.

Mixture binding prediction can help us identify the source of synergy. To connect model predictions to experimental measurements, ligand binding might be measured by fluorescently-tagged ligands, while the number of bound receptors of a specific type might be associated with an indirect measurement such as cellular response. After making a series of measurements for different compositions of mixtures, we can first fit the cases consisting of 100% of one complex (the values at the two ends of the plot) and then compare the mixture measurements to the predictions. Determining whether the downstream effect contributes to the observed synergy (or antagonism) can be framed as a hypothesis testing problem:

H₀: The synergism of the mixture can be explained solely by binding.

The uncertainty of mixture binding prediction comes from measurement errors of receptor abundance and binding affinities. Usually, the receptor expression of a cell population has an empirical distribution which can be measured. The confidence interval in Figure 3 is drawn with the assumption that receptor expression fluctuates up and down by 10%, similar to the confidence interval of a log-normal distribution. Also, due to the measurement technique, the binding affinities may be over- or underestimated [14]. The confidence interval of mixture prediction can be determined by the model with all these considered, and a p-value can even be derived.

If most mixture measurements fall within the confidence interval of the predictions (such as case a annotated by the red circles in Figure 3, left panel), the synergy will very likely come from binding only. However, if the measurements are obviously beyond the confidence interval (case b, the red squares), it is reasonable to suspect a synergistic (or antagonistic) effect beyond binding alone. Because of the binding model’s flexibility, this method can also be extended to a mixture of more than two compounds.

5.2 Binding space of a ligand

When a dose of ligands (drug, hormone, cytokine, etc.) is released into the circulation system of an individual due to either physiological responses or exogenous administration, the compounds will spread and bind to many cell populations to varying extents. An essential question in pharmacology is how much a compound will bind to its intended target populations compared to off-target ones. This question is important for understanding basic biology as well as developing new therapeutics. For example, hormones and cytokines are important signaling molecules, and having a quantitative prediction of on- and off-target binding can greatly help us understand their mechanisms. For drug development, binding prediction can guide optimization to improve specificity toward the intended targets [18]. A cell population can be defined by the proteins it expresses, especially its surface receptors. Therefore, given the parameters of the dose and the receptor profile of a cell population, our model can make all the predictions discussed previously.

From the perspective of this binding model, there is nothing special about one specific cell population. If the local concentration is constant everywhere, our model can map any cell with a certain receptor expression to the amount of binding induced by this dose. If the biological activity of this compound on a cell is related to the quantity of binding to a certain ligand or receptor, the effect of this dose can be written as a function f, with where R_tot is a vector of nonnegative entries that describes the cell’s expression of N_R receptors, and f (R_tot) is the amount of binding. Here, we define the binding behavior of this dose (or any compound) as its binding space.

In Figure 4, we plot the binding space of a bivalent L₁ ligand θ = [2, 0] with concentration 1 nM. The binding affinities are the same as described in the last subsection. In this binding space, we consider three receptors, R₁, R₂, and R₃. We plot how the amount of binding relates to the cell expression profile, R_tot. Here, the amounts of R₁ and R₂ vary along the two axes, while R₃ is held constant at 2.0 × 10³ cell⁻¹. Then we use colors and contour lines to show the amount of binding. From these two plots, we can see that although both ligand binding and R₂ binding increase with more receptors, ligand binding is more sensitive to R₁ amounts, and R₂ binding is more sensitive to R₂ amounts. To consider any specific cell population, we only need to determine where its expression profile falls on the plot and read the predictions from the contour line. For example, on the left panel, the red cell population will have about e^5.2 = 181 bound ligands per cell. The number of contour lines a population spans can also show intrapopulation variation. In this case, we expect the variation in ligand binding to fall between e^4.3 = 74 and e^6.0 = 403.
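A binding space of this kind can be sketched by evaluating the model on a grid of receptor expression levels. The sketch treats the bivalent L₁ ligand as a single ligand type with valency 2. It assumes R_bound,n = L₀ f R_eq,n K_a,n (1 + S)^(f−1) and L_bound = L₀/K_x* · ((1 + S)^f − 1), with S = Σ_n R_eq,n K_a,n K_x*, which is our own reduction of the model for this case. The crosslinking constant `KX` and the grid levels are hypothetical, so the numbers are not expected to match Figure 4.

```python
from math import log

KA = [1e8, 1e5, 6e5]  # affinities of L1 for R1..R3, in 1/M (Section 5.1)
L0, F = 1e-9, 2       # 1 nM of bivalent L1
KX = 1e-12            # hypothetical crosslinking constant


def ligand_bound(R_tot):
    """Ligand bound per cell for a given expression profile R_tot = [R1, R2, R3]."""
    def free(S):
        # Free receptors implied by conservation of mass at a trial value of S.
        return [r / (1.0 + L0 * F * k * (1.0 + S) ** (F - 1)) for r, k in zip(R_tot, KA)]

    lo, hi = 0.0, sum(k * KX * r for r, k in zip(R_tot, KA))
    for _ in range(200):  # bisection on the self-consistency condition for S
        mid = 0.5 * (lo + hi)
        implied = sum(k * KX * r for r, k in zip(free(mid), KA))
        if mid > implied:
            hi = mid
        else:
            lo = mid
    S = 0.5 * (lo + hi)
    return L0 / KX * ((1.0 + S) ** F - 1.0)


levels = [10.0 ** e for e in (2, 3, 4)]  # grid of R1 and R2 expression, per cell
R3 = 2.0e3                               # held constant, as in Figure 4
grid = [[log(ligand_bound([r1, r2, R3])) for r1 in levels] for r2 in levels]
for r2, row in zip(levels, grid):
    print(r2, [round(x, 2) for x in row])  # natural log of ligand bound
```

Each row of `grid` corresponds to one R₂ level. A contour plot of these log values over a finer grid would give a picture analogous to the left panel of Figure 4.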

Figure 4:

The binding space of 1 nM θ = [2, 0]. The left panel shows the amount of total ligand bound, while the right panel shows receptor R₂ bound predictions. The x- and y-axis show the expression of R₁ and R₂, while the expression of R₃ is a constant, 2.0 × 10³ cell⁻¹, and not shown. Any cell population can be drawn on the binding space. For example, the red ellipse on the left panel represents a cell population with receptor expression at about R_tot = [1.0, 10.0, 2.0] × 10³ cell⁻¹. We can alternatively project points of experimental single cell expression data onto a binding space, as shown on the right panel.

The binding space can provide ample information about the compound. It is an intrinsic property of a ligand given its concentration and the other ligands it is mixed with, independent of any specific cell. The biological process of drug diffusion to a certain cell is analogous to sampling a point from this binding space. Its gradient indicates in which direction the binding level increases the fastest, as well as to which receptor it is more sensitive. An inactive antagonist that introduces binding competition with the ligand can distort its binding space, and we can visualize it by the change of shape in the contour lines. This plot can also intuitively demonstrate intrapopulation binding variance and interpopulation cell specificity of the compound. With the development of high-throughput single-cell methods such as flow cytometry, the expression profiles of a collection of cells can be identified en masse, and we can overlay their results onto a binding space plot (as in Figure 4, right panel). This shows the promise of applying our model to single-cell data. Although we can only visualize two receptors in a plot, binding space applies to any N_R types of receptors. Theoretically, the concept of the binding space of a ligand is only complete when all relevant surface receptors are considered.

6 Discussion

In this work, we propose a mechanistic multivalent binding model that accounts for the interaction among multiple receptors and a mixture of ligand complexes formed by binding monomers. We first derive the amount of ligand of a specific binding configuration at equilibrium through the law of mass action. Using this formula, we make macroscopic predictions by applying the multinomial theorem strategically. Our predictions cover cases where complexes are formed by specific arrangement or random assortment. Finally, we provide two practical examples of how this model can help with biological research.

Compared with many previous approaches, this model has several clear advantages. First of all, it is computationally efficient, and it is capable of handling a large number of receptors, ligands, and complex types. This allows the model to make large-scale predictions easily, enabling mixture synergy analysis and binding space calculations. The mathematical elegance of the model invites analytical studies and incorporation into more complicated frameworks.

The assumptions made in this model may compromise its accuracy in some cases. For example, the steric effects of a multivalent ligand can be more complicated and context-dependent. Our setup has a single crosslinking constant, K_x*, to reflect the multivalency effect. In practice, this model works well in predicting experimental binding results [18, 15]. Some other computational approaches investigate the steric effect more meticulously, but inevitably introduce considerable added complexity [4]. When the actual situation is not known, our model can serve as an adequate starting point.

Although this model is general-purpose, it mainly focuses on binding dynamics at the cell surface, like the previous work on which it is based [1, 2, 3]. For intracellular ligands that do not match the multivalent velcro shape shown in Figure 2, this model may be less suitable. For example, some previous works focus on scaffold proteins in cell signaling systems for quantitative analysis [9], and various computational models different from ours have been developed [11, 6, 10].

Surface receptor binding is a universal event in biology, and such a prevalent question calls for a sufficiently general solution. The model we present in this work can be applied successfully in many contexts, including predicting Fc-FcγR interaction [16] and fitting epithelial cell adhesion molecule binding data [18, 15]. With the rise of multispecific drugs over the past decade [17], we expect this model to find even wider application and to facilitate both basic scientific research and the development of new therapies.

Declaration of interest

This work was supported by NIH U01-AI-148119 to A.S.M. The authors declare no competing financial interests.

Author contributions

Z.C.T.: Methodology, Writing – original draft; A.S.M.: Funding acquisition, Writing – review & editing.

References

[1] Alan S Perelson and Charles DeLisi. “Receptor clustering on a cell surface. I. Theory of receptor cross-linking by ligands bearing two chemically identical functional groups”. In: Mathematical Biosciences 48.1-2 (1980), pp. 71–110.
[2] Alan S Perelson. “Receptor clustering on a cell surface. II. Theory of receptor cross-linking by ligands bearing two chemically distinct functional groups”. In: Mathematical Biosciences 49.1 (1980), pp. 87–110.
[3] Alan S Perelson. “Receptor clustering on a cell surface. III. Theory of receptor cross-linking by multivalent ligands: description by ligand states”. In: Mathematical Biosciences 53.1-2 (1981), pp. 1–39.
[4] William S Hlavacek, Richard G Posner, and Alan S Perelson. “Steric effects on multivalent ligand-receptor binding: exclusion of ligand sites by bound cell surface receptors”. In: Biophysical Journal 76.6 (1999), pp. 3031–3043.
[5] William S Hlavacek et al. “Quantifying aggregation of IgE-FcεRI by multivalent antigen”. In: Biophysical Journal 76.5 (1999), pp. 2421–2431.
[6] Andre Levchenko, Jehoshua Bruck, and Paul W Sternberg. “Scaffold proteins may biphasically affect the levels of mitogen-activated protein kinase signaling and reduce its threshold properties”. In: Proceedings of the National Academy of Sciences 97.11 (2000), pp. 5818–5823.
[7] Jennifer D Stone, Jennifer R Cochran, and Lawrence J Stern. “T-cell activation by soluble MHC oligomers can be described by a two-parameter binding model”. In: Biophysical Journal 81.5 (2001), pp. 2547–2557.
[8] Jodi M Paar et al. “Bivalent ligands with rigid double-stranded DNA spacers reveal structural constraints on signaling by FcεRI”. In: The Journal of Immunology 169.2 (2002), pp. 856–864.
[9] Stephen A Chapman and Anand R Asthagiri. “Quantitative effect of scaffold abundance on signal propagation”. In: Molecular Systems Biology 5.1 (2009), p. 313.
[10] Yinghao Wu et al. “Transforming binding affinities from three dimensions to two with application to cadherin clustering”. In: Nature 475.7357 (2011), pp. 510–513.
[11] Jin Yang and William S Hlavacek. “Scaffold-mediated nucleation of protein signaling complexes: elementary principles”. In: Mathematical Biosciences 232.2 (2011), pp. 164–173.
[12] Catherine A Macken and Alan S Perelson. Branching processes applied to cell surface aggregation phenomena. Vol. 58. Springer Science & Business Media, 2013.
[13] Emily C Piccione et al. “A bispecific antibody targeting CD47 and CD20 selectively binds and eliminates dual antigen expressing lymphoma cells”. In: MAbs. Vol. 7. 5. Taylor & Francis. 2015, pp. 946–956.
[14] SA Hunter and JR Cochran. “Cell-binding assays for determining the affinity of protein–protein interactions: technologies and considerations”. In: Methods in Enzymology 580 (2016), pp. 21–44.
[15] Clifford M Csizmar et al. “Multivalent ligand binding to cell membrane antigens: defining the interplay of affinity, valency, and expression density”. In: Journal of the American Chemical Society 141.1 (2018), pp. 251–261.
[16] Ryan A Robinett et al. “Dissecting FcγR regulation through a multivalent binding model”. In: Cell Systems 7.1 (2018), pp. 41–48.
[17] Raymond J Deshaies. “Multispecific drugs herald a new era of biopharmaceutical innovation”. In: Nature 580.7803 (2020), pp. 329–338.
[18] Zhixin Cyrillus Tan, Brian Orcutt-Jahns, and Aaron S Meyer. “A quantitative view of strategies to engineer cell-selective ligand binding”. In: bioRxiv (2020).