The role of cell geometry and cell-cell communication in gradient sensing

Cells can measure shallow gradients of external signals to initiate and accomplish a migration or a morphogenetic process. Recently, starting from mathematical models like the local-excitation global-inhibition (LEGI) model and with the support of empirical evidence, it has been proposed that cellular communication improves the measurement of an external gradient. However, the mathematical models that have been used have over-simplified geometries (e.g., they are uni-dimensional) or assumptions about cellular communication, which limit the possibility to analyze the gradient sensing ability of more complex cellular systems. Here, we generalize the existing models to study the effects on gradient sensing of cell number, geometry and of long- versus short-range cellular communication in 2D systems representing epithelial tissues. We find that increasing the cell number can be detrimental for gradient sensing when the communication is weak and limited to nearest neighbour cells, while it is beneficial when there is long-range communication. We also find that, with long-range communication, the gradient sensing ability improves for tissues with more disordered geometries; on the other hand, an ordered structure with mostly hexagonal cells is advantageous with nearest neighbour communication. Our results considerably extend the current models of gradient sensing by epithelial tissues, making a step further toward predicting the mechanism of communication and its putative mediator in many biological processes.


Introduction
1 Collective migration of epithelial cells is central in plenty of biological processes, ranging 2 from development, to wound healing and cancer 3 [1], [2], [3], [4], [5], [6] [7], [8] [9], [10], [11]. The coherent movement of cells is often 4 accompanied by extensive changes in cell shape and geometry, with the formation of 5 higher-order vertices, as in multicellular rosettes [6], [12], [13], [14], [15]. In many cases, 6 cells follow the gradient of a guidance cue (e.g., chemoattractants), which induces the 7 initiation of a migration process [4], [16]. The gradients of the chemoattractant signals 8 are typically shallow [17], which raises the question of how cells can detect small 9 changes in chemical concentrations. 10 Based on experimental evidence, it has been proposed that cellular communication 11 can enhance the ability of groups of cells to measure shallow gradients [18]. The 12 local-excitation global-inhibition (LEGI) model [17], [18], [19] is one of the simplest 13 models of collective gradient sensing through intercellular communication. The LEGI 14 model posits that a local and a global reporter molecules are produced in response to 15 the external signal. The local reporter remains confined in the cell and represents a 16 local measurement of the signal; the global reporter is exchanged between neighbour 17 cells, allowing the system to perform a spatial average of the signal concentration. Each 18 cell can then estimate local deviations of the signal from the overall spatial average by 19 combining the information encoded in the levels of the two reporters [18]. This model 20 has been successfully used to describe branching morphogenesis of the epithelial tissue 21 in mammary glands [18], and its physical limits have been derived [19]. 22 The ability to measure gradients can improve even further when the local reporter is 23 also exchanged between neighbour cells, as shown with the regional-excitation 24 global-inhibition (REGI) model [19], or when the communication can involve 25 non-nearest neighbours cells [20]. 26 While the above mentioned studies provided experimental and theoretical 27 foundations for the role of cell communication in gradient sensing, they only considered 28 1-dimensional chains of cells or very simplified 2-dimensional geometries (e.g., made by 29 multiple 1-dimensional chains, coupled in the direction transverse to the gradient [21]). 30 However, complex and dynamically changing geometries are typically found in 31 2-dimensional epithelial sheets during tissue morphogenesis and organ formation. For 32 example, during the establishment of the anterior/posterior axis in mouse embryos, 33 changes in cell packing occur concurrently with the migration of the anterior visceral 34 endoderm, a subset of cells that specifies the anterior side of the embryo. This leads to 35 the formation of multi-cellular rosettes, i.e., groups of five or more cells that meet at a target receptors through the extracellular space. Other examples include signalling 47 through soluble ligands that bind the epidermal growth factor receptor (EGFR) in 48 many tissues and organs [30] and the long-range communication mediated by 49 extracellular vesicles in retinal pigment epithelial cells [31]. 50 Here, we investigate how the size and the geometry of a 2D system of epithelial cells 51 affect the gradient sensing ability. To this aim, we generalize the LEGI model and, via 52 numerical simulations and analytical calculations, we estimate how the signal-to-noise 53 ratio is affected by changes in the mean polygon number, which determines the presence 54 of higher order vertices. In addition to the nearest neighbour communication included 55 in the standard LEGI model, we also consider cellular communication mediated by a 56 global reporter that diffuses in the intercellular space and can thus mediate a long-range 57 communication. 58 Our work integrates in a single modelling framework the complex arrangement of  We start from the LEGI model without temporal integration [18], [21] and we consider 66 a molecular signal in a 2-dimensional space with a concentration c that varies with the 67 position r: where c 0 is a background concentration and the vector g indicates the direction and 69 strength of the gradient.

70
A group of N epithelial cells respond to this molecular signal by producing a global 71 and a local reporter molecules, in an amount that is proportional to the signal 72 concentration. The global reporter is exchanged between cells, while the local reporter 73 is confined within a cell. Hence, the geometric configuration of the N cells can only 74 influence the dynamics of the global reporter (see Section 2.2).

75
The stochastic dynamic equations in the linear response regime for the molecule 76 numbers of the local (u) and the global (v) reporters are [18]: where k = 1, . . . , N is the cell index; c is the concentration of the signal; r k is the 78 position of the centroid of cell k; a is the linear size of a cell; β and µ are the production 79 and degradation rates of the reporters, respectively; η k and ξ k are Langevin noise 80 terms [18].

81
The matrix M = {M nn } includes terms related to the degradation and the 82 exchange of the global reporter between neighbour cells. Its elements are: where γ nn is the exchange rate of the global reporter between cells n and n . In the 84 next sections, we will indicate the matrix of the exchange rates as Γ = {γ nn }.

85
In the original unidimensional formulation of the model, the exchange rate is the same 86 for every pair of nearest neighbour cells and the matrix M is tridiagonal, since it 87 includes the exchange rate of the global reporter between nearest neighbour cells in 88 1-dimension and its degradation rate. In Section 2.3 we will extend its definition to 89 nearest neighbour communication in more complex 2-dimensional systems and to 90 communication through a reporter diffusing in the intercellular space.

91
In the LEGI model, the cellular response to the signal is mediated by a molecule 92 that is activated by the local reporter and inhibited by the global reporter. Thus, in the 93 limit of a shallow gradient, the readout in the cell n is the difference between the 94 numbers of the local and global reporter molecules, ∆ n = u n − v n [18]. Given the 95 properties of the reporters, this quantity is equivalent to the difference between the 96 estimation of the local concentration of the signal (provided by the local reporter) and 97 the average concentration (provided by the global reporter) over a spatial region whose 98 extension depends on the range of cell communication [18].

99
To take into account the noise in the production, degradation and exchange of 100 molecules, the average∆ n and the variance (δ∆ n ) 2 of ∆ n over time are computed from 101 the steady state equations for u k and v k [18]: The quantity of interest is the square root of the Signal to Noise Ratio (SNR) computed 103 in cell n which quantifies the precision of gradient sensing achieved by cell n. To compute it, we 105 use the formula obtained in [18].

106
The values of the parameters were chosen as in previous studies [18], [19], [32], and 107 are reported in Table 1. We assume the limit of shallow gradient so we take c 0 a| g|. 108 Table 1. Parameter values. c 0 is the background concentration, a is the typical linear size of a cell, β and µ are the production and degradation rates of the reporters, | g| is the slope of the gradient [18], [19], [32]. Given the signal gradient g, we estimate the SNR in the cell that is exposed to the 110 highest concentration of the signal (as measured in its centroid). This is in line with 111 what has been done in previous studies [18], [19], [21] and presupposes that the cell 112 measuring the highest signal concentration is the first responding (e.g., by starting a 113 migration or branching process).

114
To account for the different orientations of the gradient, we parametrize it with the 115 angle θ that the gradient forms with the x-axis (see Fig. 1) and sampled 500 equally 116 spaced values of θ between 0 and 2π. Then, for each value of θ, we compute the SNR as 117 explained above. By doing so, each group of cells has a set of 500 SNR values that 118 represent how the SNR varies over all possible orientations of the gradient.   First, we generated a 2-dimensional configuration with a given number of hexagonal 124 cells and a Gaussian noise on the position of the cell centroids (with zero mean and 125 standard deviation 0.5; "noise" optional parameter in the function "planar sheet 2d" 126 from the "tyssue" package). By definition, such a configuration has mean polygon 127 number 6. Each cell configuration is defined by the coordinates of the cell centroids, the 128 set of edges forming the boundaries of the cells and the set of vertices in which two or 129 more edges meet. Next, we randomly collapse edges until the target mean polygon 130 number is reached. Using this procedure, we sample N C configurations with 131 approximately equally spaced values of the mean polygon number in a specified range. 132 As an example, we report in Fig. 2.A a set of configurations with number of cells 133 between 7 and 127 and mean polygon number between 5 and 6. All the configurations 134 used in the present study are available from the GitHub repo  In this Section, we introduce the two models of intercellular communication we   The first model of communication is the Nearest Neighbour Exchange (NNE), in which 141 the global reporter is exchanged between nearest neighbour cells. This is the same 142 model considered in [18], [19], [21], [32], although they included simpler system  Fig. 2.B). Diffusion in the lateral intercellular space is known to 157 occur in epithelia and has been studied, for example, in cultured renal cells 158 (Madin-Darby canine kidney) [34], [35]). In this model of cell communication that we 159 call Intercellular Space Diffusion (ISD), the exchange rates γ ij between any pair of cells 160 i and j can be non-zero and can be written as: where α is the secretion rate of the reporter and P ij is the probability that a reporter 162 secreted by cell i is internalized by cell j.

163
The probabilities P ij depend on the internalization rate λ, on how fast the reporter 164 can diffuse in the intercellular space and on the geometry of the epithelial tissue (see   For each cell in the configuration, we generate N T = 5000 trajectories using the 188 above algorithm, fixing λ = 1.0 s −1 and taking D = 10 µm 2 /s or D = 1000 µm 2 /s to 189 model slow or fast diffusion (see below and [34], [35]).

190
From the above simulations, we computed the probabilities P ij as where N  p-values to variations in the number of sampled configurations [36].

203
For two sets of observations A and B, the CLES is defined as We consider the outcome of the Wilcoxon rank sum test significant if the p-value is  While we used computer simulations to evaluate the SNR (see above), here we provide 212 an analytical estimation of the exchange rates of the global reporter in the ISD model, 213 to understand how the parameters affect cellular communication in this model.

214
Once a molecule is released on a cell boundary (i.e., an edge) in x = x 0 , under the 215 assumption that the internalization is a Poisson process with rate λ, we can write the 216 probability that the molecule is internalized at a time t < τ as: From this, using the First Passage Time Distribution for a free Brownian motion 218 (i.e., a Levy distribution), F P T (τ, d), we can obtain the probability that the molecule is 219 internalized by a cell before travelling a distance d: Based on P int (d), we can obtain an approximation for P ij , i.e., the probability that 221 a global reporter secreted by cell i is internalized by cell j: where i and j are neighbours of order m (i.e., m = 1 corresponds to nearest neighbours), 223 l E is the average edge length and N m is the number of neighbours of cell i of order m. 224 In this approximate calculation, we considered that a molecule typically has to travel at 225 least ∼ 2 edges to move from a neighbour cell of order m to one of order m + 1.     [6].

278
In this section, we compute how gradient sensing is affected by cell shape looking at 279 the SNR as a function of the mean polygon number in a system with a fixed size 280 (N = 127 cells).

281
In the strong-local and strong-global communication regimes (Figure 6.C-D and 282 Supp. Fig. S2.B), the mean polygon number has little effect on the SNR in both the 283 ISD and the NNE models, due to the very efficient exchange of the global reporter 284 between neighbour cells. On the other hand, in the case of weak-local communication, 285 the SNR tends to slowly increase with the mean polygon number, especially in the NNE 286 model ( Figure 6.A-B). This is because, with larger mean polygon numbers, the number 287 of nearest neighbours increases, which is advantageous when the communication is local 288 and not very efficient.

289
Interestingly, in the weak-global regime of the ISD model, we found the opposite 290 behaviour: the SNR is higher in configurations having lower mean polygon numbers 291 ( Figure 6.E). This implies that, in this regime, configurations with higher-order vertices, 292 characterized by the presence of heterogeneous cell shapes and of, e.g., multicellular 293 rosettes [6], confer a better ability to sense shallow gradients with respect to more 294 ordered configurations. Indeed, with more disordered configurations and higher-order 295 vertices, the global reporter can reach more efficiently cells that are located at larger 296 distances. To show this, we estimated from our simulations the exchange rates γ as 297 function of the distance d between cells in the weak-global regime of the ISD model, and 298 we found that γ(d) has a slower decay with a smaller mean polygon number ( Figure   299 6.F), which suggests a more efficient long-range communication.

300
The results in systems with fewer cells are consistent with those shown above, 301 although the trends are noisier and less clear (Supp. Fig. S4). 302

Comparison between the NNE and ISD models 303
Finally, we compared the gradient sensing performance of the NNE and ISD models on 304 the same cell configuration.

305
To make this comparison, for each of the four regimes of the ISD model (with the 306 parameters selected above; see Fig. 5), we computed the average exchange rate between 307 nearest neighbours γ nn ISD and in the NNE we fixed the exchange rate γ N N E = γ nn ISD .

308
The range of values obtained for γ N N E in each communication regime of the ISD model 309 is shown in Table 2. By doing this, in each comparison we imposed that the global 310 reporter is exchanged at roughly the same rate between nearest neighbours in the two 311 models.

312
The results with N = 127 cells and different values of the mean polygon number are 313 shown in Fig. 7, where on the x-axis we plotted the √ SN R for ISD and on the y-axis 314 the √ SN R obtained with NNE.

315
In general, the SNR tends to be larger in the ISD compared to the NNE case, due to 316 the possibility of a long-range communication in the ISD model. More specifically, when 317 the reporter is mostly only locally and very efficiently exchanged (strong-local regime, 318 Fig. 7.B), the two models are almost equivalent. ISD yields a greater SNR if a longer 319 range communication is enabled (strong-global regime, Fig. 7.D), or when the 320 communication is less efficient (weak-local and weak-global, Fig. 7.A,C). In particular, 321 the largest difference between the two models is observed in the weak-global regime  Fig. 7

332
In this paper, we have analyzed the ability to measure shallow gradients of signals by 333 2D epithelial tissues via cellular communication. We started from a previously proposed 334 model called LEGI, and extended it to study complex cell geometries in 2D and by 335 adding an alternative mechanism enabling long-range communication, based on the 336 diffusion of the global reporter in the intercellular space. Then, we computed the SNR 337 of different cell configurations varying the model parameters and the type of 338 communication. 339 We found that, with a number of cells 7 < N < 127, different behaviours of the SNR 340 are possible depending on the type of communication. Specifically, smaller systems tend 341 to have higher SNR when the communication is"weak" and "local", whereas larger 342 systems are better when the communication is "global" (with the ISD communication) 343 and/or "strong" (see Figure 5).

344
With our model, we generated different cell geometries that are characterized by 345 higher order vertices associated with, e.g., multi-cellular rosettes. This allowed us to 346 characterize how the geometry influences the SNR with different types of 347 communication. Interestingly, we observed that, in the ISD model, when the reporter is 348 exchanged with low efficiency but can diffuse fast in the intercellular space, more 349 irregular geometries have a higher SNR and, thus, are more sensitive to external 350 gradients. This result might explain the observed extensive cell re-arrangements and the 351 formation of multi-cellular rosettes occurring in the anterior visceral endoderm (AVE) of 352 the early mouse embryo concomitantly with its migration, which is essential for the 353 definition of the anterior/posterior axis [6]. Indeed, the potential role of cellular 354 communication in AVE migration has been recently hypothesized [38]. proposed [32], while a recent work focused on the role of cell-cell adhesion in the 364 migration of 2D groups of cells [40]. Our generalization of the LEGI model with 365 complex 2D geometries would improve the above mentioned cell motility models.

366
Several other models of collective cell migration including gradient sensing, not relying 367 on the LEGI paradigm, have been developed. For instance, Roy et al [41] have 368 analyzed the influence of connectedness and size of 2D cell clusters on the velocity of 369 migration, and Camley et al [42] studied the role of cell-cell variation in responding to 370 an external signal in collective cell migration. All these models study specific properties 371 of groups of cells that would be worth considering in an extension of our model to 372 include a cell motility mechanism.

373
In conclusion, here we provide a modelling framework that clarifies the role of the 374 size and the geometry of epithelial cells in gradient sensing, and can be easily extended 375 to include further molecular and geometrical details. We believe it will be particularly 376 interesting to combine this model with the large-scale sequencing and imaging data that 377 are becoming available, for example on AVE migration in mouse embryos [43], [44], [45], 378 with the goal of identifying the underlying molecular mechanisms. Many algorithms, in 379 fact, have been developed to extract from single-cell and spatial transcriptomic datasets 380 candidate molecules that could be mediating cellular communication 381 (e.g., [46], [47], [48]). Our model can complement these algorithms by providing a 382 quantitative framework that can be used to further refine the lists of candidate 383 molecules based on their properties, the size of the system and the cell geometry. To 384 facilitate the use of our modelling framework by the community, we made all the code 385 freely available on GitHub (https://github.com/ScialdoneLab/2DLEGI).