**Abstract**

Collective cell responses to exogenous cues depend on cell-cell interactions. In principle, these can result in enhanced sensitivity to weak and noisy stimuli. However, this has not yet been shown experimentally, and little is known about how multicellular signal processing modulates single-cell sensitivity to extracellular signaling inputs, including those guiding complex changes in tissue form and function. Here we explored whether cell-cell communication can enhance the ability of cell ensembles to sense and respond to weak gradients of chemotactic cues. Using a combination of experiments with mammary epithelial cells and mathematical modeling, we find that multicellular sensing enables detection of and response to shallow Epidermal Growth Factor (EGF) gradients that are undetectable by single cells. However, the advantage of this type of gradient sensing is limited by the noisiness of the signaling relay necessary to integrate spatially distributed ligand concentration information. We calculate the fundamental sensory limits imposed by this communication noise and combine them with the experimental data to estimate the effective size of multicellular sensory groups involved in gradient sensing. Functional experiments strongly implicated intercellular communication through gap junctions and calcium release from intracellular stores as mediators of collective gradient sensing. The resulting integrative analysis provides a framework for understanding the advantages and limitations of sensory information processing by relays of chemically coupled cells.

## Significance Statement

What new properties may result from collective cell behavior, and how may these emerging capabilities influence the shaping and function of tissues in health and disease? Here, we explored these questions in the context of epithelial branching morphogenesis. We show experimentally that, while individual mammary epithelial cells are incapable of sensing extremely weak gradients of a growth factor, cellular collectives in organotypic cultures exhibit reliable, gradient-driven, directional growth. This underscores the critical importance of collective cell-cell communication and computation in gradient sensing. We develop and verify a biophysical theory of such communication, and identify the mechanisms by which it is implemented in the mammary epithelium, quantitatively analyzing both advantages and limitations of biochemical cellular communication in collective decision making.

## Introduction

Responses of isogenic cells to identical cues can display considerable variability. For instance, a population of cells will typically exhibit substantial variation in gradient sensitivity and migration trajectories within the same gradient of a diffusible guidance signal (1). The variation in response could arise from the inherent diversity of cell responsiveness (2-5), but it can be further exacerbated if the gradients of extracellular signals are shallow and noisy (6-11). In fact, sensing shallow gradients can approach fundamental physical limits that define whether diffusive graded cues can bias cell migration (12, 13). However, the spatially biased response can improve and its uncertainty can be substantially reduced if individual cells are coupled while responding to molecular gradients (5, 14-21). Strong cell-cell coupling might reduce the response noise by averaging individual responses of multiple cells (22-27). It can also alleviate sensory noise by extending the spatial range of the sensing, thus increasing the potential for more precise detection of weak and noisy spatially graded inputs. Importantly, however, cell-cell communication involved in such collective sensing may be itself subject to noise, reducing the precision of the communicated signals and therefore the advantage gained from an augmented size of the sensory and the response units. The interplay between the increasing signal and accumulating communication noise associated with the multicellular sensing, and thus the limits of this multicellular sensing strategy, remain incompletely understood.

An example of collective cellular response is branching morphogenesis of the epithelial tissue in mammary glands (28-30). The dynamic processes, whose coordinate regulation leads to formation, growth, and overall organization of branched epithelial structures, are still actively investigated (29). Conveniently, the morphogenesis of mammary glands is recapitulated in organotypic mammary culture (organoids) (31-33), extensively used to model and explore various features of self-organization and development of epithelial tissues (34). Epidermal Growth Factor (EGF) is an essential regulator of branching morphogenesis in mammary glands (35, 36). It has also been identified as a critical chemo-attractant guiding the migration of breast epithelial cells in invasive cancer growth (37). This property of EGF raises the possibility that it can serve as an endogenous chemo-attractant guiding formation and extension of mammary epithelial branches, a possibility that has not yet been experimentally addressed.

Our data reveal that the capacity of mammary organoids embedded in collagen I to respond to shallow EGF gradients requires collective gradient sensing, mediated by intercellular chemical coupling through gap junctions. Surprisingly, the advantage of multicellular sensing is limited and is substantially lower than the theoretical predictions stemming from gradient sensing models that do not account for communication noise (6). We build a theory of the multicellular sensing process, equivalent to the information-theoretic relay channel, which correctly predicts the accuracy of sensing as a function of the gradient magnitude, organoid size, and the background ligand concentration. The theory and the corresponding stochastic computational model trace the reduced sensing improvement to the unavoidable noise in the information relay used by cells to transmit their local sensory measurements to each other. This analysis allowed us to determine the approximate size of a collective, multicellular sensing unit enabling chemotropic branch formation and growth.

## Results

To study the response of multicellular mammary organoids to defined growth factor gradients, we developed and used mesoscopic fluidic devices. These devices permitted generation of highly controlled gradients of EGF that were stable for a few days, within small slabs of collagen gels housing expanding organoids (see Fig. 1A, Methods, and Supplementary Information). We found that organoids of diverse sizes, ranging from 80 to 300 µm (or about 200 to 500 cells), developed normally within the device, forming multiple branches in the presence of spatially uniform 2.5 nM of EGF. When monitored over 3 days, the branch formation in such uniformly distributed EGF displayed no directional bias (Fig. 1C; Supplementary Fig. S2A). However, if EGF was added as a very shallow gradient of 0.5×10^{−3} nM/µm (equivalent to about 0.5×10^{−2} nM or as little as 0.2% concentration difference across a 10 µm cell), branch formation displayed a significant directional bias (Fig. 1D). The bias in formation of new branches remained the same when measured on each of the three consecutive days (Supp. Fig. S4), suggesting that EGF gradient sensing is not a transient response, and that its angular precision neither improves nor decreases with time. The bias was robust to the choice of the bias measure, as six different measures all yielded values at least four standard errors above their respective null values (Fig. 2; see also *SI* and Supp. Fig. S1).

In spite of the very shallow EGF gradient, it was still possible that the spatial bias in branching was a consequence of the gradient sensing by individual cells within the tips of the branches. To examine the sensitivity of single cells to these shallow gradients in the 3D geometry of collagen gels, we analyzed organoids derived from P-cadherin knockout mice (38). Consistent with our previous findings (39), the luminal epithelial core of the organoids derived from P-cadherin null mice remains intact within collagen I gels, but individual and small groups of myoepithelial cells disseminate into the surrounding gel, since P-cadherin is a specific mediator of myoepithelial cell-cell adhesion. These individual dissociated cells displayed extensive migration through the collagen matrix. Although in these experiments the organoids continued to display EGF gradient-guided directional branching responses similar to those of WT organoids, the dissociated cells migrated in a completely unbiased manner (Fig. 1E, F; Supp. Figs. S2B, S3; see also *SI*). Cell motility and the distance traveled by single cells within the gels generally were the same as those observed in similar experiments performed in spatially homogeneous 2.5 nM EGF distributions (data not shown). These results were corroborated by experiments in which dissociated single mammary epithelial cells isolated from WT mice or MTLn3-B1 cells were embedded in the same devices and subjected to the same experimental inputs (Supp. Fig. S4). The results of these experiments suggested that, in spite of considerable motility, there was no evidence of chemotaxis by these cells in response to EGF gradients that were capable of triggering a biased chemotropic response in organoids. Overall, our results reveal that cell-cell coupling within organoids permits sensing of EGF gradients not detectable by single cells.

Can enhanced collective gradient sensing by multiple cells be explained by a quantitative theory, permitting experimental validation? The classic Berg-Purcell (BP) theory of concentration (40) and gradient (6, 12) detection can explain why a larger detector (in this case, an organoid) has a better sensitivity than a smaller one (a cell). Briefly, the mean number of ligand molecules in the volume of a detector of linear size *A* is $\bar{\nu} \sim \bar{c}A^{3}$, where *c* is the concentration being determined, and the overbar represents averaging. This number is Poisson distributed, so that the relative error in counting is $\delta\nu/\bar{\nu} = \delta c/\bar{c} \sim (\bar{c}A^{3})^{-1/2}$. This bound can be modified to include temporal integration of the ligand diffusing in and out of the receptor vicinity (41). However, the organoids show steady branching and no improved directional sensitivity over the three days of experiments (Supp. Fig. S5), so the sensing can be assumed to occur on time scales much shorter than the overall branching response, without long-time integration. Estimation of spatial gradients by a cell or a multicellular ensemble involves inference of the difference between (or comparison of) concentrations measured by different compartments of the detector (6, 8, 12) (branches grow too slowly for a temporal comparison strategy to be useful (42)). For a detector consisting of two such compartments, each of size *a* ≪ *A*, the mean concentration in each compartment is $\bar{c}_{1,2} = \bar{c}_{1/2} \pm Ag/2$, where $\bar{c}_{1/2}$ is the concentration at the center of the detector, and *g* is the concentration gradient. For each of the compartments, the BP bound gives $(\delta c_{1,2})^{2} \sim \bar{c}_{1,2}/a^{3}$. Subtracting the two independently measured concentrations estimates the gradient, $\bar{g} = (\bar{c}_{1} - \bar{c}_{2})/A$, which results in the signal-to-noise ratio (SNR, or inverse of the error):

$$\left(\frac{\bar{g}}{\delta g}\right)^{2} \sim \frac{a^{3}(Ag)^{2}}{\bar{c}_{1/2}} \qquad [1]$$

Thus the sensing precision should improve without bound with the span of the gradient being measured (*A*), with the gradient strength (*g*), and with the volume over which molecules are counted (*a*^{3}). However, the precision should decrease with the background concentration ($\bar{c}_{1/2}$) because it is hard to measure small changes in a signaling molecule against a large background concentration of this molecule.^{1} Note that Eq. [1] seems to predict an infinitely precise measurement when $\bar{c}_{1/2} \to 0$ and there are no ligand molecules. This paradox is resolved by the simple observation that the background concentration of the signaling molecule and the organoid size are not independent: in a linear gradient, $\bar{c}_{1/2}$ is limited from below by $Ag/2$, and, generally, a small $\bar{c}_{1/2}$ is only possible for a small organoid if the gradient is nonzero. In this *low concentration* limit of the BP theory, which is often the subject of analysis (6, 12), $\bar{c}_{1/2} \sim Ag$. Then Eq. [1] transforms to $(\bar{g}/\delta g)^{2} \sim a^{3}Ag \sim a^{3}\bar{c}_{1/2}$, and the SNR *increases* with $\bar{c}_{1/2}$. Overall, this interplay between the size and the concentration depends on $\bar{c}_{1/2}(A)$, which may take different forms depending on where organoids of different sizes are in the experimental device. Typically, the SNR has an inverted U-shape: it first grows with *A* because the span of the organoid increases, and then it drops because small differences of large concentrations must be estimated by a cell or a cell ensemble (see *SI* and Supp. Fig. S6). Interestingly, this decrease in gradient sensitivity does not require receptor saturation, as is commonly assumed (44). Calculations that account for true receptor geometries of the sensor give results similar to Eq. [1] (6).

A critical prediction of this theory is that precision of gradient sensing (expressed as SNR) always increases with the organoid size *A*. We next contrasted this prediction with experimental data. To examine whether the precision of gradient sensing increases with the organoid size, we examined the bias of response of differently sized organoids naturally formed in our assays (Fig. 3). To enable the comparison, we computed the fraction of organoids with L_{U} > L_{D}, where L_{U} (L_{D}) is the sum of branch lengths (projected in the gradient direction) pointing up (down) the gradient (measure B in Fig. 2). The corresponding theoretical prediction can be inferred from the analysis of a one-dimensional array of *N* coupled cells subjected to a ligand gradient. In particular, the experimentally determined difference between ‘up’ and ‘down’ pointing branch numbers can be compared with the theoretically predicted probability that the measured number of ligand molecules in the *N*th cell is larger than in the first cell in the array, ν_{N} > ν_{1}. We take ν_{n} as Gaussian-distributed with mean $\bar{\nu}_{n} = a^{3}\bar{c}_{n}$ and variance $\bar{\nu}_{n} + \eta^{2}$, where the first term accounts for the Poisson nature of the molecular counts, and η^{2} represents the additional noise downstream of sensing, which can dominate the sensory noise, but is assumed to be unbiased (multiplicative noise was also considered, with similar effects, see *SI* and Supp. Fig. S7). We set the value of η^{2} by equating the experimental and theoretical bias probabilities averaged over all organoid sizes and background concentrations observed in the experiments. Figure 3A demonstrates that bias increases roughly linearly with the gradient strength in both the experiments and the BP model. However, Fig. 3B shows that the experimental bias saturates with organoid size, while the BP theory would predict an increase without bounds. Further, Fig. 3C shows that the experimental bias is generally weaker than that predicted by the BP theory.
These disagreements with experimental results suggest that a new theory of multicellular gradient detection is required.
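
The bias probability used in the BP comparison can be sketched numerically. The minimal Python example below assumes, as in the text, independent Gaussian counts in the first and last cells of the chain, each with variance equal to its mean plus η^{2}. The function name, the unit conversion, and the value of η^{2} are illustrative assumptions of this sketch, not the fitted values used for Fig. 3.

```python
import math

NM_TO_PER_UM3 = 0.6022  # 1 nM is about 0.6 molecules per cubic micron


def bp_bias_probability(n_cells, c_first_nm, grad_nm_per_um, cell_um, eta2):
    """Probability that nu_N > nu_1 for a chain of n_cells cells.

    Counts in the two edge cells are treated as independent Gaussians with
    mean a^3 * c_n and variance mean + eta2 (eta2 is a hypothetical
    downstream-noise term, not a fitted value).
    """
    volume = cell_um ** 3
    c_last_nm = c_first_nm + grad_nm_per_um * cell_um * (n_cells - 1)
    mean_first = volume * c_first_nm * NM_TO_PER_UM3
    mean_last = volume * c_last_nm * NM_TO_PER_UM3
    # The difference of two independent Gaussians is Gaussian, and the
    # variances add.
    diff_mean = mean_last - mean_first
    diff_sd = math.sqrt(mean_first + mean_last + 2.0 * eta2)
    z = diff_mean / diff_sd
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


if __name__ == "__main__":
    # Gradient and background values quoted in the text; eta2 = 0 is the
    # idealized case with no downstream noise.
    for n in (1, 5, 10, 30):
        print(n, bp_bias_probability(n, 2.5, 0.5e-3, 10.0, eta2=0.0))
```

In this sketch the probability is exactly 0.5 for a single cell and keeps rising toward 1 as *N* grows, which is the size dependence that the experimental bias in Fig. 3B does not follow.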

To develop the new theory, we note that, by assuming that information collected by different parts of a spatially extended detector can be integrated in an essentially error-free fashion, the BP approach neglects a major complication: the communication noise. Indeed, to contrast spatially distributed inputs, e.g., the local EGF concentration, the information collected in different parts of a coupled multi-cellular ensemble must be communicated over large distances by means of noisy molecular diffusion and transport processes. The unavoidable communication errors set new, unknown limits on the highest accuracy of sensing. From this perspective, the BP analysis accounts for the *extrinsic* noise of the ligand concentration, but not for the *intrinsic* noise (3, 45) of multi-cellular communication. To study the communication noise effects, we again approximated an organoid by a one-dimensional chain of *N* cells, each of size *a*, for a total length of *A* = *Na* parallel to the gradient direction. The observed independence of the response bias from the background EGF concentration (Fig. 3C) supports an adaptive model of sensing. We chose a minimal adaptive model allowing for chemical diffusive communication, based on the principle of local excitation and global inhibition (LEGI) (8, 46, 47). In the *n*th cell, both a local and a global molecular messenger species are assumed to be produced in proportion to the local external EGF concentration *c*_{n} at a rate *β*, and are degraded at a rate *μ*. Whereas the local messenger species is confined to each cell, the global messenger species is exchanged between neighboring cells at a rate *γ*, which provides an intrinsically noisy communication. The local messenger then excites a downstream species, while the global messenger inhibits it. In the limit of shallow gradients, the excitation level reports the difference Δ_{n} between local and global species concentrations (see *SI*).
The differences Δ_{N,1} in the edge cells provide the sensory readout: a positive/negative Δ shows that the local concentration at the edge is above/below the average, and hence the cell is up/down the gradient. Note that an individual cell within this multi-cellular version of the LEGI model cannot detect a gradient, as the readout will always be zero within statistical fluctuations.
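
The mean behavior of this LEGI chain can be illustrated with a short deterministic calculation. The Python sketch below assumes the simplest reading of the model described above: production at rate *β* times the local concentration, degradation at rate *μ*, and nearest-neighbor exchange of the global species at rate *γ*, with edge cells exchanging with their single neighbor only. The function name and default parameter values are hypothetical, and fluctuations are ignored entirely.

```python
import numpy as np


def legi_readout(conc, beta=1.0, mu=1.0, gamma=10.0):
    """Mean steady-state readout Delta_n = local_n - global_n in each cell."""
    conc = np.asarray(conc, dtype=float)
    n = conc.size
    # Nearest-neighbor exchange matrix (graph Laplacian of a chain); edge
    # cells have a single neighbor, so no messenger leaves the chain.
    lap = np.zeros((n, n))
    for i in range(n - 1):
        lap[i, i] += 1.0
        lap[i + 1, i + 1] += 1.0
        lap[i, i + 1] -= 1.0
        lap[i + 1, i] -= 1.0
    # Local species: production balanced by degradation in each cell.
    local = beta * conc / mu
    # Global species: steady state of
    # d(global)/dt = beta * c - mu * global - gamma * lap @ global.
    global_ = np.linalg.solve(mu * np.eye(n) + gamma * lap, beta * conc)
    return local - global_


if __name__ == "__main__":
    print(legi_readout(np.linspace(1.0, 2.0, 8)))  # graded input
    print(legi_readout([1.5]))                     # isolated cell
```

Under these assumptions the readout is positive at the up-gradient edge, negative at the down-gradient edge, and zero both for a uniform input (adaptation) and for an isolated cell, in line with the statements above.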

In our analysis, we again note the absence of temporal integration of EGF gradients (Supp. Fig. S5; see also the extension of our analysis to the temporal integration case in Ref. (41)). Further, since there is no evidence for receptor saturation at high concentration (Fig. 3C), we confine ourselves to the linear response regime for theoretical studies. These assumptions allow us to calculate the limit of the sensory precision of the gradient detection, as a function of organoid size *N* and the background concentration (see *SI* and Supp. Fig. S6). We find that precision initially grows with *N*, then saturates at a maximal value (Supp. Fig. S6C). This is in contrast to the BP estimate, Eq. [1], which predicts that precision grows indefinitely with *N*. In our model we expect precision to be the highest in the limits of a large organoid (*N* ≫ 1), fast cell-to-cell communication (*γ*/*μ* ≫ 1), and large local and global messenger species concentrations (*β*/*μ* ≫ 1). In these limits the saturating value of the sensory error takes the simplified form (Eq. [46] in *SI*):

$$\left(\frac{\delta\Delta_{N}}{\bar{\Delta}_{N}}\right)^{2} \sim \frac{\bar{c}_{N}}{a^{3}(n_{0}ag)^{2}} \qquad [2]$$

where $n_{0} \equiv \sqrt{\gamma/\mu}$ and $\bar{c}_{N}$ is the mean ligand concentration at the edge cell. Comparing Eq. [2] to the BP estimate, Eq. [1], we see that even when communication noise is accounted for, the organoid can achieve the noise-free bound, but with an *effective size* of *A* = *n*_{0}*a* instead of the actual size *A* = *Na*. Thus *n*_{0}, which grows with the communication rate *γ*, sets the length scale of the effective sensory unit within the organoid: it is the number of neighbors with which a cell can reliably communicate before the information becomes degraded by the noise. Beyond *N* ∼ *n*_{0}, a larger organoid is predicted to achieve no further benefit to its sensory precision. Additionally, because of this finite communication length scale, the sensory precision is predicted to depend on the concentration at the edge cell(s), rather than in the middle of the organoid. Thus the interplay between the concentration and the organoid size is also very different compared to the predictions of the standard BP theory.
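
The finite communication length can be made concrete with a small numerical scan. The Python sketch below computes only the mean steady-state readout in the up-gradient edge cell of a chain with nearest-neighbor exchange, as a function of chain length. It contains no noise, so it illustrates the saturation of the signal rather than the full sensory error. The function name, the parameter values, and the use of √(γ/μ) as a rough communication length are assumptions of this sketch.

```python
import numpy as np


def edge_readout(n_cells, grad_per_cell=0.01, c_first=1.0,
                 beta=1.0, mu=1.0, gamma=16.0):
    """Mean steady-state readout Delta_N in the up-gradient edge cell."""
    # Linear concentration profile along the chain (arbitrary units).
    conc = c_first + grad_per_cell * np.arange(n_cells)
    # Chain Laplacian: 2 on the diagonal (1 at the two ends), -1 next to it.
    lap = 2.0 * np.eye(n_cells) - np.eye(n_cells, k=1) - np.eye(n_cells, k=-1)
    lap[0, 0] = lap[-1, -1] = 1.0
    if n_cells == 1:
        lap[0, 0] = 0.0  # an isolated cell exchanges with no one
    global_ = np.linalg.solve(mu * np.eye(n_cells) + gamma * lap, beta * conc)
    return beta * conc[-1] / mu - global_[-1]


if __name__ == "__main__":
    n0 = np.sqrt(16.0 / 1.0)  # rough communication length, in cells
    print("n0 =", n0)
    for n in (1, 2, 4, 8, 16, 32, 64):
        print(n, edge_readout(n))
```

In this deterministic setting the edge readout grows with *N* for short chains and then levels off once the chain is several communication lengths long, so cells far from the edge stop contributing to the comparison.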

We first tested the new theory that accounts for communication by simulating the multi-cellular, LEGI-based sensing with a spatially extended Gillespie algorithm (see *SI* for details). This analysis allowed us to explore the non-linear (Michaelis-Menten type) biochemical reaction regime. We verified that our theoretical predictions were fully consistent with this stochastic model in the linear regime, and were still qualitatively valid when the dependence of the local and the global signaling reactions on the input was allowed to gradually saturate (Fig. 4). In particular, under all assumptions, the advantage of increasing detector size rapidly reached a maximum value. This maximum SNR value, however, gradually decreased with increasing saturation, suggesting predominant effects of decreasing sensitivity of saturating chemical reactions to the differences in the input values.
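
For orientation, a spatially extended Gillespie simulation of this kind can be sketched in a few lines. The Python example below is a simplified illustration, not the implementation described in the *SI*: it covers only the linear regime, holds the external concentration in each cell fixed (so the extrinsic ligand noise is omitted), and uses hypothetical function and parameter names.

```python
import numpy as np


def gillespie_legi(conc, beta, mu, gamma, t_end, rng):
    """Simulate local (x) and global (y) messenger counts in a cell chain."""
    conc = np.asarray(conc, dtype=float)
    n = conc.size
    x = np.zeros(n, dtype=np.int64)  # local messenger, confined to each cell
    y = np.zeros(n, dtype=np.int64)  # global messenger, hops between cells
    t = 0.0
    while True:
        hop_right = gamma * y.astype(float)
        hop_right[-1] = 0.0  # last cell has no right neighbor
        hop_left = gamma * y.astype(float)
        hop_left[0] = 0.0    # first cell has no left neighbor
        rates = np.concatenate(
            [beta * conc, beta * conc, mu * x, mu * y, hop_right, hop_left]
        )
        total = rates.sum()
        # Waiting time to the next reaction is exponentially distributed.
        t += rng.exponential(1.0 / total)
        if t > t_end:
            break
        # Pick one reaction with probability proportional to its rate.
        k = int(rng.choice(rates.size, p=rates / total))
        kind, cell = divmod(k, n)
        if kind == 0:
            x[cell] += 1          # production of local species
        elif kind == 1:
            y[cell] += 1          # production of global species
        elif kind == 2:
            x[cell] -= 1          # degradation of local species
        elif kind == 3:
            y[cell] -= 1          # degradation of global species
        elif kind == 4:
            y[cell] -= 1          # hop to the right neighbor
            y[cell + 1] += 1
        else:
            y[cell] -= 1          # hop to the left neighbor
            y[cell - 1] += 1
    return x, y


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x, y = gillespie_legi(np.linspace(18.0, 22.0, 5), 1.0, 1.0, 4.0, 8.0, rng)
    print("readout per cell:", x - y)
```

A single run gives one noisy sample of the readout in each cell; estimating the SNR would require repeating the simulation many times and collecting the statistics of the edge-cell readouts.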

We then compared the predictions of our new theory of multicellular gradient sensing to the experimental measurements. To do that, we calculated the probability that the gradient indeed biases the branching response, i.e., that Δ_{N} > Δ_{1}, where Δ_{n} was assumed in the theory to be a Gaussian-distributed variable with the mean $\bar{\Delta}_{n}$ and variance (δΔ_{n})^{2} + η^{2} (the case of the multiplicative noise is treated in the *SI* and Supp. Fig. S7). The first term in the variance is calculated in the *SI*, and the second reflects the added noise downstream of gradient sensing, set by the average organoid bias, identical to the one found in the BP theory above. Figure 3A-C demonstrates the excellent agreement between experiment and theory that accounts for the communication noise, suggesting that the new theory is a much better explanation of the data than the BP analysis.

The experimental data in Fig. 3B place constraints on the possible range of values of the size of the effective multi-cellular sensing unit, *n*_{0}. The requirement that η^{2} ≥ 0 (downstream processes only increase noise, they do not decrease it) places the lower bound *n*_{0} ≥ 2.9 (Fig. 3D). Roughly speaking, the edge cell must communicate with at least three neighbors if the inability of the observed bias to reach 1 was due *entirely* to sensory noise, with no additional noise downstream. Further, the requirement that the model agree with the data within error bars in Fig. 3B also places the upper bound *n*_{0} ≤ 4.2 (similar limits come from the multiplicative model, Supp. Fig. S7). That is, a functional sensing unit of four cells or fewer is required to explain why all organoids, which range in width from approximately 8 to 30 cells, display roughly the same bias, independent of their size. Thus Fig. 3 demonstrates that cells receive reliable information from only a few nearby cells, and this number is tightly bounded. The tightness of the bound implies that the noise downstream of the sensing process is relatively small. Crucially, in our theory, a cell not communicating with the neighbors cannot detect a gradient, and a nonzero value of *n*_{0} is *qualitatively* different from *n*_{0} = 0. We thus tested if gradient sensing would be altered if cell-cell communication was prevented in the organoids.

A central prediction of the theoretical analysis is that preventing cell-cell communication can lead to a complete loss of sensing of shallow gradients. One simple way cell-cell communication can occur in epithelial layers is by means of gap junctions. We therefore explored the effect of disrupting the gap junction communication using four distinct inhibitors: 50 nM Endothelin-1, 50 μM flufenamic acid, 0.5 mM octanol and μM carbenoxolone (48). Although the mode of inhibition was different for these distinct compounds, application of each one of them resulted in a complete loss of directional bias in response, while the branching itself was present, and was similar to that without gap junction perturbation in spatially uniform EGF concentrations (Fig. 5A, Supp. Fig. S8). Crucially, this result also confirms that communication over the effective sensory unit is due to intracellular chemical diffusion, rather than through the extracellular medium or due to a mechanical coupling. The likely candidates for gap junction mediated cell-cell coupling are calcium or inositol trisphosphate (IP3), both of which are second messengers that can control intracellular Ca release. EGF is known to stimulate Ca signaling (49) at least in part through stimulation of IP3 synthesis, thus providing a source of these intracellular messengers.

To examine Ca signaling more directly, we used organoids obtained from transgenic mice expressing the genetically encoded Ca reporter GCaMP4 under the control of the CAG promoter (50). We confirmed that addition of 2.5 nM EGF to the medium indeed triggered a pulse of calcium signaling in a typical organoid (Fig. 5C). Furthermore, the Ca activity throughout the branching processes was coordinated, releasing calcium nearly simultaneously in cells at the tips of growing branches, suggesting cell-cell communication leading to Ca releases (see *SI*, Supp. Fig. S9, and Supp. Movie 1). To deplete intracellular Ca stores and thus potentially disrupt the effect of chemical cell-cell coupling, we treated the organoids with the sarco/endoplasmic reticulum Ca2+-ATPase (SERCA) inhibitor thapsigargin. This treatment indeed was sufficient to disrupt EGF gradient sensing in the treated organoids (Fig. 5B and Supp. Fig. S2D). Surprisingly, SERCA inhibition also enhanced branch elongation: the average length of a branch increased from 74 ± 1 µm for WT organoids to 201 ± 3 µm with SERCA blocking; the organoids appear to be almost entirely composed of branches after 3 days under these conditions. This result suggested that gap junction-mediated exchange of a molecular regulator that can trigger intracellular calcium release may have a negative effect on the local branching response, consistent with the assumed negative role of the diffusive messenger postulated in the LEGI model. We finally note that small molecules exchanged through gap junctions (e.g., IP3 or calcium ions) would be a natural choice for the cell-cell coupling intercellular messenger, since their smaller size and larger diffusion coefficient (compared to peptides) allow for a larger γ, which, in turn, increases the size of the effective sensory unit *n*_{0} and improves the sensing accuracy.

## Discussion

Morphogenesis and growth of complex tissues is orchestrated by diverse chemical and mechanical cues. These cues not only specify patterning of developing tissues but also direct tissue growth and expansion. However, we still lack details of how these collective, multi-cellular processes are controlled by spatial gradients of extracellular ligand molecules. Here we used mathematical modeling, computational simulations, and experimentation in a novel gradient generating device to study the directional guidance of branch formation and extension in a model of mammary tissue morphogenesis. Our data revealed that multicellular constructs undergo directionally biased migration in shallow gradients of EGF that are undetectable to single cells. Further, our analysis suggests that cell-cell communication through gap junctions underlies the increased gradient sensitivity, allowing the cell ensembles to expand the range of EGF concentrations they can sense within the gradient, and thus enhance the overall guidance signal. Increasing evidence suggests that collective sensing of environmental signals, particularly if accompanied by secretion of a common signal that enables averaging of variable and noisy signaling in individual cells, can help improve reliability of signaling, cell fate choices, and behavioral actions. Examples are abundant in coordinated pathogen actions or immune responses [8, 37-43]. Similarly, individual sensing and collective decision-making in morphogenesis and animal group behaviors have been shown to amplify weak signals observed by individual agents and to develop coherent, long-range patterns (24, 25, 51, 52). In contrast to ‘all-to-all’ signaling or response communication cases, here we focused on the case of sequential communication of a signal between the sensing units, in a relay fashion, which can enhance the sensing precision by enhancing the effective input itself. 
Critically, this communication mechanism, mediated by diffusive coupling through gap junctions, can be seen as an information-theoretic relay channel (53, 54), see Fig. 6. The theoretical analysis we present here is thus one of the first departures from the simple point-to-point information-processing paradigm in systems biology. In fact, our calculations of reliability of multicellular signaling, presented in this paper and in (41), are equivalent to calculating channel capacities of various Gaussian relay channels.

The key consequence of the relay communication mechanism is that it is subject to a gradual buildup of communication noise, mitigating the gain from the signal increase, and providing a fundamental limit on effectiveness of such collective sensing responses. This result runs counter to the prevailing intuition that sensing accuracy should increase without bound with the system size (40), for multicellular systems in development (27) and also for other multi-agent sensory systems. These intuitive expectations are flawed precisely because they fail to take into account the importance of communication uncertainty, which provides fundamental limits on the gains resulting from multicellular sensing. Our integrated analysis reveals that this multicellular sensing strategy in growing mammary branches is indeed limited by the noisy cell-cell communication. Importantly, we were able to combine theory and experiments to estimate these limits for EGF gradient response of mammary branching and found them to be much tighter than those that assume that all of the spatially distributed information is immediately actionable: growth of the branch beyond the size of the maximum effective multicellular sensing unit does not improve the sensing accuracy. We estimate that the sensing unit is approximately 3-4 cell lengths, a size that is consistent with the number of cell layers in small end buds of a growing mammary duct (55) (see also Supp. Movie 1). Some large end buds in vivo contain significantly more cell layers and our analysis suggests that these additional cells may be primarily involved in other functions, such as proliferation and differentiation, and not gradient sensing. The narrow bounds on the number of interacting cells also suggest that the “actuation” noise downstream of sensing is minimal, paralleling related findings in the nervous system (56).
Interestingly, the theoretical analysis predicted that the sensory unit size is specified by a simple formula describing the typical distance traveled by a diffusing messenger molecule before it degrades or is inactivated, consistent with simpler estimates of the molecular communication reach (57). Our analysis also provided a new way to interpret the dependence between the background ligand concentration and gradient sensing: saturation of receptors is not needed to explain the often-observed decrease in the sensory precision at high concentration (12, 13, 44). Rather, the loss of precision is ascribed to an increasing noise-to-signal ratio, stemming from the need to compare large, noisy concentrations. Similar limits might exist in any biological system with spatially distributed sensing of spatially graded signals, including single cells or multi-nuclear syncytia.

Our results suggest that the intercellular communication underlying multicellular sensing in growing mammary tissue is mediated by calcium signaling events, as depletion of internal stores by a SERCA inhibitor both enhanced the branch formation and inhibited gradient detection. Thus release of calcium from internal stores is consistent with a negative or limiting effect on the local branch formation or extension. The release can be controlled by either IP3 or calcium itself, both of which can diffuse through gap junctions. Therefore, the inhibitory diffusive signal postulated by the LEGI models of gradient sensing may rely on the ultimate release of calcium from internal stores, as also suggested by our imaging of calcium with the genetically encoded probe. This role of calcium is consistent with its enhancement of retraction of the leading front in migrating cells (58). Consistent with the LEGI model, gradient sensing was persistent in time and exhibited very low sensitivity to the local background EGF concentration. The use of the LEGI model in our analysis, both in mathematical modeling and in spatially distributed Gillespie simulations, also showed results quantitatively consistent with the experiments, suggesting that this model was appropriate for describing the diffusively coupled collective EGF sensing.

Overall, we conclude that collective gradient sensing suggested for many natural developmental processes (59), as well as for pathological invasive tissue expansion (37), is an effective strategy, which, though subject to important limitations, can help explain the observed differences in the single cell and multicellular chemotactic responses. Importantly, the experimentally validated theory proposed in our analysis provides a way to assess the potential role of inter-cellular communication in other settings, including invasive tumor growth, pointing to the specific parameters that can be altered to disrupt this process or make it less efficient.

## Materials and Methods

*Experimental device.* Custom PDMS devices were developed using stereolithography, yielding a culture area approximately 5 mm wide, 10 mm long and 1 mm tall (see Fig. 1B). The sides of the device are open wells that allow the use of standard pipettes to change media, and six replicates of the entire device are contained within a standard six-well plate. Before use, the center cell culture area is filled. This action is assisted by the hexagonal pillars, which are used to trap the liquid 3D ECM and organoid mixture within the cell culture area before the 3D ECM can harden (60). Once the 3D ECM matrix of choice has hardened, the open wells can be filled as previously mentioned. Both *in silico* and *in vivo* (see *SI* and Supp. Fig. S10) tests demonstrate a stable linear EGF gradient across the cell culture area for approximately three days, after which the media can be replenished as needed. Various compounds were added to collagen gel at final concentrations, as indicated, along with the organoids.

*Stereolithography & PDMS Casting.* Using the 3D rendering software SolidWorks (Dassault Systems), we drew the final mold for the PDMS devices. The design was electronically transmitted to FineLine Prototyping (Raleigh, NC) where it was rendered using high-resolution ProtoTherm 12120 as the material with a natural finish. Proprietary settings were used to accurately render the pillars. The mold was shipped in two to three days, and after its arrival we mixed PDMS monomer and curing agent in a 10:1 ratio (Momentive RTV615). After mixing, the liquid PDMS was poured into the mold and a homemade press was used to keep the top surface flat. This press, from bottom to top, consisted of a steel plate, a paper towel, a piece of clear transparency film, the mold with PDMS, another piece of transparency film, a paper towel, a piece of rubber, and another steel plate. The entire assembly was placed in the oven at 80°C and baked overnight. The devices were then washed, cut, and placed on top of 22x22 mm coverslips (72204-01, Electron Microscopy Sciences). Six devices were then placed inside an autoclave bag and sterilized. When needed, the bag was opened in a sterile environment and the devices were filled and placed inside a 6-well plate.

*Device preparation for time-lapse imaging.* In order to allow for real-time imaging, the devices were fabricated as described above, and were then cut from the PDMS using a 16mm sharp leather punch to create a circular device. The device was sterilized with ethanol and then plasma treated before being bonded directly to the bottom of a glass-bottomed 6-well plate with a 20mm hole (LiveAssay).

*Collagen preparation.* Rat-tail Collagen (354236, BD Biosciences) was pH balanced using 1M NaOH (S2770, Sigma) and mixed to a final concentration of 3 mg/mL with 10x DMEM (D2429, Sigma). This mixture sat in an ice block in aliquots of no more than 1.5 mL until fibers formed, typically approximately 75 min (described in detail in Ref. (33)). Cells were then mixed in at 2.5 organoids/mL and a 100 μL pipette tip was used to draw 75 μL of the suspension. The pipette was inserted into the pre-punched hole and the suspension was gently injected into the device. The device was placed on a heat block for no more than 10 min before the side wells were filled with solution. The lid was replaced on the 6-well plate and the whole assembly was placed inside the incubator at 37°C with 5% CO2.

*Confocal Microscopy.* Confocal imaging was performed as previously reported (32, 61). Briefly, imaging was done with a Solamere Technology Group spinning disk confocal microscope, using a 40x C-Apochromat objective lens (Zeiss Microimaging). Both fixed and time-lapse images were acquired using a customized combination of µManager (https://www.micro-manager.org) and Piper (Stanford Photonics). Thereafter image stacking and adjustments were done with Imaris (Bitplane) in order to maximize clarity, but these adjustments were always done on the entire image.

*DIC Microscopy.* Phase contrast images were taken with an Axio Observer DIC inverted microscope (Carl Zeiss, Inc) using AxioVision Software (Carl Zeiss, Inc). All image processing was either done with Adobe Photoshop CS 6.0 (Adobe) or Fiji (GPL v.2) for clarity, but always done on the entire image.

*Image Quantification.* A custom Fiji program was written to measure the angle and length of the resulting branches. Additionally, this program allows the user to draw a freehand outline around the body and/or the branch and body of the whole organoid. From these outlines, the area, a fitted ellipse, and the Feret diameter were computed along with related statistics. After all measurements were made, a custom MATLAB (MathWorks, Inc.) program was used to create the graphs.

*Primary mammary organoid isolation.* Cultures are prepared as previously described (33). Mammary glands are minced and tissue is shaken for 30 min at 37°C in a 50 ml collagenase/trypsin solution in DMEM/F12 (GIBCO-BRL), 0.1 g trypsin (GIBCO-BRL), 0.1 g collagenase (Sigma C5138), 5 ml fetal calf serum, 250 µl of 1 µg/ml insulin, and 50 µl of 50 µg/ml gentamicin (all UCSF Cell Culture Facility). The collagenase solution is centrifuged at 1500 rpm for 10 min, dispersed through 10 ml DMEM/F12, centrifuged at 1500 rpm for 10 min, and then resuspended in 4 ml DMEM/F12 + 40 µl DNase (2U/µl) (Sigma). The DNase solution is shaken by hand for 2-5 min, then centrifuged at 1500 rpm for 10 min. Organoids are separated from single cells through four differential centrifugations (pulse to 1500 rpm in 10 ml DMEM/F12). The final pellet is resuspended in the desired amount of Growth Factor Reduced collagen.

*Multicellular gradient sensing model.* Theoretical results are derived using a stochastic dynamical model of multicellular sensing and communication. The model includes Langevin-type noise terms corresponding to ligand number fluctuations, stochastic production and degradation of internal messenger molecules, and exchange of messenger molecules between neighboring cells in a one-dimensional chain. The model is linearized around the steady state. The mean and instantaneous variance of the readout variable Δ_{*N*} are obtained by Fourier transforming and integrating the power spectra over all frequencies. This leads to an expression in terms of the matrix of exchange reactions, whose inverse (the “communication kernel”) we solve for analytically and approximate in the appropriate limits to obtain Eq. [2]. See *SI* for more information.

*Statistical Analysis.* Angular histograms (e.g., Fig. 1C-F) plot the distribution of branching directions over all organoids. For each organoid, the branching direction is defined as the angle of the vector sum of its branches. A branch vector extends from the organoid body (defined by the fitted ellipse) to the tip of the branch. For single cell movement (Fig. 1E), the definitions are the same, except that branch vector is replaced by the displacement vector, from where the cell broke away from the organoid, to where the cell is observed in the image. The breakaway point is taken to be the nearest branch tip. Data contained in the angular histograms are reduced to a single bias measure in one of six ways, as described in Fig. 2. Measure B is also shown in Fig. 1, 3 and 4. See *SI* for comparison of the bias measures.

## 1. Measuring bias in organoid branching

To ensure that our determination of response bias is robust to our analysis technique, we measured bias in several different ways (Figure S1), using data for wild-type organoids in the presence of an EGF gradient (Fig. 1D in the main text). Figure S1A shows a histogram of the angles of all branches, irrespective of which organoid the branch comes from. Figure S1B shows a histogram of all organoid angles, where organoid angle is defined as the angle of the vector sum of all branches coming from a given organoid. Thus Fig. S1A is a branch-based histogram, whereas Fig. S1B is an organoid-based histogram. Figures S1A and B demonstrate that both a branch-based and an organoid-based analysis indicate that the response of wild-type organoids is significantly biased in the gradient direction. Figure S1C shows the six different bias measures defined in Fig. 2 of the main text, applied to both the branch-based and the organoid-based data. In all cases, the response is significantly biased with respect to the null value. This demonstrates that the determination of bias is robust to the choice of bias measure.

In general, we find that it does not matter whether we use a branch- or organoid-based measure to determine bias. Therefore, we focus on organoid-based measures for most of the study, since this metric retains the information about the organoids producing the branches, rather than considering branches as completely independent entities. Moreover, in general we also find that the determination of bias is robust to the choice of bias measure (see Figs. S2 and S3 below). Therefore we focus on measure B for most of the study, since it is easy to interpret and to compare with the theory: it is the probability that the vector sum of an organoid’s branches points up the gradient, not down the gradient.
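As an illustration of measure B, the following minimal sketch computes the organoid direction as the angle of the vector sum of its branches and then the fraction of organoids pointing up the gradient. It assumes that the gradient points along +x, that branch angles are given in radians relative to that axis, and that the example organoids are purely hypothetical; it is not the analysis code used in the study.

```python
import numpy as np

def organoid_direction(branch_angles, branch_lengths):
    """Angle (radians) of the vector sum of one organoid's branches."""
    angles = np.asarray(branch_angles, dtype=float)
    lengths = np.asarray(branch_lengths, dtype=float)
    vx = np.sum(lengths * np.cos(angles))
    vy = np.sum(lengths * np.sin(angles))
    return np.arctan2(vy, vx)

def bias_measure_b(organoids):
    """Fraction of organoids whose branch vector sum points up the gradient.

    Assumes the gradient points along +x, so "up the gradient" means a
    positive x-component of the vector sum, i.e. cos(direction) > 0.
    """
    directions = np.array([organoid_direction(a, l) for a, l in organoids])
    return float(np.mean(np.cos(directions) > 0.0))

# Hypothetical data: each organoid is an (angles, lengths) pair.
organoids = [
    ([0.0, np.pi / 3], [50.0, 30.0]),        # both branches up-gradient
    ([np.pi, np.pi / 2], [40.0, 10.0]),      # dominated by a down-gradient branch
    ([np.pi / 4, -np.pi / 4], [20.0, 20.0]), # symmetric pair, net up-gradient
]
print(bias_measure_b(organoids))
```

With this definition, an unbiased population would give a value near the null value of 0.5.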

Figure S2 shows the six bias measures for each of the other experimental conditions considered in the main text. We see in all cases that the presence or absence of bias is robust to the choice of measure.

## 2. Measuring bias in single-cell movement

To ensure that our determination of bias in single-cell movement is also robust to the analysis technique, we subject the single-cell data to a similar multitude of bias measures. For single cells, the analog of a “branch” is the distance the cell migrates over time. Therefore, if more cells migrate to the right than to the left, then the cells exhibit a biased response. Fig. S3 shows the same bias measures computed for organoid branching, but now for single cell migration distances, for the experiment in which the P-cadherin mutation promotes shedding of single cells from the organoid. We compute the bias measures both (i) averaged over all cells, irrespective of the organoid from which cells are shed (Fig. S3A, analogous to the “branch-based” measures in Fig. S1C) and (ii) averaged per organoid, by accounting for the organoid from which the cells are shed (Fig. S3B, analogous to the “organoid-based” measures in Fig. S1C). In both cases, we see that the single-cell movement is not significantly biased, and that the absence of bias is robust to the choice of measure.

## 3. Mechanistic model of communicating cells

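Here we present the stochastic model of gradient sensing by communicating cells. We consider a one-dimensional chain of *N* cells parallel to the gradient direction. As in the experiments, the mean EGF signal concentration varies linearly along the direction of the chain as

$$\bar{c}_n = \bar{c}_N - a g\,(N - n), \tag{1}$$

where $\bar{c}_n$ is the local concentration near the *n*th cell, $a$ is the cell diameter, $g$ is the concentration gradient, and $\bar{c}_N$ is the maximal concentration at the *N*th cell. The observed independence of bias on background concentration (Fig. 3C of the main text) supports an adaptive model of sensing. We therefore choose a minimal adaptive model based on the principle of local excitation and global inhibition (LEGI) [5]. In the *n*th cell, both a local and a global molecular species are degraded at a rate µ and produced at a rate β in proportion to the number of signal molecules in the vicinity, which is roughly $c_n a^3$. Whereas the local species is confined to each cell, the global species is exchanged between neighboring cells at a rate γ, which provides the communication. Because there is no experimental evidence for receptor saturation (Fig. 3C of the main text), we confine ourselves to the linear response regime, in which the dynamics of the local and global species satisfy the stochastic equations

$$\frac{dx_n}{dt} = \beta c_n a^3 - \mu x_n + \eta_n, \tag{2}$$

$$\frac{dy_n}{dt} = \beta c_n a^3 - \mu \sum_{n'=1}^{N} M_{nn'}\, y_{n'} + \xi_n, \tag{3}$$

where

$$M_{nn'} = \left(1 + \frac{2\gamma}{\mu}\right)\delta_{nn'} - \frac{\gamma}{\mu}\left(\delta_{n',n-1} + \delta_{n',n+1}\right) \tag{4}$$

is the tridiagonal matrix governing degradation and exchange. Here *x*_{n} and *y*_{n} are the molecule numbers of the local and global species, respectively, and the terms *η*_{n} and *ξ*_{n} are the intrinsic Langevin noise terms with zero mean and covariances

$$\langle \eta_n(t)\,\eta_{n'}(t') \rangle = \left(\beta \bar{c}_n a^3 + \mu \bar{x}_n\right)\delta_{nn'}\,\delta(t - t'), \tag{5}$$

$$\begin{aligned}
\langle \xi_n(t)\,\xi_{n'}(t') \rangle = {} & \left[\beta \bar{c}_n a^3 + \mu \bar{y}_n + 2\gamma \bar{y}_n + \gamma\left(\bar{y}_{n-1} + \bar{y}_{n+1}\right)\right]\delta_{nn'}\,\delta(t - t') \\
& - \gamma\left[\left(\bar{y}_n + \bar{y}_{n-1}\right)\delta_{n',n-1} + \left(\bar{y}_n + \bar{y}_{n+1}\right)\delta_{n',n+1}\right]\delta(t - t').
\end{aligned} \tag{6}$$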

Equation 5 and the first line of Eq. 6 contain the Poisson noise corresponding to each reaction, while the second line of Eq. 6 contains the anti-correlations between neighboring cells introduced by the exchange. Equations 4 and 6 are modified at the edges *n = {1,N}* to include exchange with just one neighboring cell.
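The reactions described above (production of both species, degradation, and nearest-neighbor exchange of the global species) can be sketched as a small Gillespie simulation. This is a minimal illustration under stated assumptions, not the simulation code used in this work: the function name `simulate_chain`, the per-cell production rates (standing in for the production rate β times the local signal molecule number), and all parameter values are hypothetical.

```python
import numpy as np

def simulate_chain(production, mu, gamma, t_end, seed=0):
    """Gillespie simulation of local (x) and global (y) species in a 1D chain.

    production[n] is the production rate in cell n (assumed > 0), shared by
    both species. Returns the time-averaged x and y in each cell.
    """
    rng = np.random.default_rng(seed)
    production = np.asarray(production, dtype=float)
    n_cells = production.size
    # Start near the uncoupled steady state to shorten the transient.
    x = np.round(production / mu).astype(int)
    y = x.copy()
    t = 0.0
    x_time = np.zeros(n_cells)
    y_time = np.zeros(n_cells)
    while True:
        hop_right = gamma * y.astype(float)
        hop_right[-1] = 0.0            # last cell has no right neighbor
        hop_left = gamma * y.astype(float)
        hop_left[0] = 0.0              # first cell has no left neighbor
        rates = np.concatenate(
            [production, production, mu * x, mu * y, hop_right, hop_left]
        )
        cumulative = np.cumsum(rates)
        total = cumulative[-1]
        dt = rng.exponential(1.0 / total)
        if t + dt >= t_end:
            x_time += x * (t_end - t)
            y_time += y * (t_end - t)
            break
        x_time += x * dt
        y_time += y * dt
        t += dt
        # Pick the next reaction with probability proportional to its rate.
        event = np.searchsorted(cumulative, rng.random() * total, side="right")
        kind, cell = divmod(min(int(event), rates.size - 1), n_cells)
        if kind == 0:
            x[cell] += 1               # production of local species
        elif kind == 1:
            y[cell] += 1               # production of global species
        elif kind == 2:
            x[cell] -= 1               # degradation of local species
        elif kind == 3:
            y[cell] -= 1               # degradation of global species
        elif kind == 4:
            y[cell] -= 1               # exchange to the right neighbor
            y[cell + 1] += 1
        else:
            y[cell] -= 1               # exchange to the left neighbor
            y[cell - 1] += 1
    return x_time / t_end, y_time / t_end

# Hypothetical three-cell chain with a linear production profile.
x_avg, y_avg = simulate_chain([20.0, 25.0, 30.0], mu=1.0, gamma=4.0, t_end=100.0, seed=1)
print("local species :", x_avg)
print("global species:", y_avg)
```

In such a run the local species should track the production profile, while the exchange flattens the profile of the global species across the chain.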

In the LEGI framework, the local species excites a downstream species, while the global species inhibits it. In the limit of shallow gradients, the relative noise in the excitation level of this downstream species is equivalent to that in the difference $\Delta_n = x_n - y_n$ between local and global species’ molecule numbers. To see this, we recall from Ref. [5] that, in the LEGI model, the excitation level $r$ depends on the ratio of activator $x$ to inhibitor $y$ as

$$r = \frac{x/y}{x/y + z}, \tag{7}$$

where $z$ is a constant. At equal activation and inhibition, $x = y$, the excitation level is $r_0 = 1/(1 + z)$. Defining $s \equiv r - r_0$ as the deviation from this level, Eq. 7 can be written in terms of $\Delta = x - y$ and $y$ as

$$s = \frac{z\,\Delta/y}{(1 + z)(1 + z + \Delta/y)} \approx \frac{z}{(1 + z)^2}\,\frac{\Delta}{y}, \tag{8}$$

where, for shallow gradients, we have assumed that the quantity $\Delta/y$ is small. Small fluctuations among $s$, $\Delta$ and $y$ are therefore related as

$$\delta s = \frac{z}{(1 + z)^2}\left(\frac{\delta\Delta}{\bar{y}} - \frac{\bar{\Delta}}{\bar{y}^2}\,\delta y\right), \tag{9}$$

or equivalently,

$$\frac{\delta s}{\bar{s}} = \frac{\delta\Delta}{\bar{\Delta}} - \frac{\delta y}{\bar{y}} \approx \frac{\delta\Delta}{\bar{\Delta}}, \tag{10}$$

where the last step once again assumes $\bar{\Delta}/\bar{y}$ is small. Thus we see that relative fluctuations in $s$ are equivalent to those in $\Delta$. We therefore take $\Delta$ as our readout variable, focusing in particular on $\Delta_N$, the molecule number difference in the cell furthest up the gradient, since this cell initiates the morphological branching observed in the experiment.
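The shallow-gradient linearization can be checked numerically. The sketch below assumes that the LEGI excitation level has the form r = (x/y)/(x/y + z), which reproduces r_0 = 1/(1 + z) at x = y; this functional form, the function names, and the values of z, y, and Δ are illustrative assumptions.

```python
def excitation(x, y, z):
    """Assumed LEGI excitation level r = (x/y) / (x/y + z)."""
    ratio = x / y
    return ratio / (ratio + z)

def deviation_exact(delta, y, z):
    """s = r - r0, with x = y + delta and r0 = 1 / (1 + z)."""
    return excitation(y + delta, y, z) - 1.0 / (1.0 + z)

def deviation_linear(delta, y, z):
    """First-order expansion of s in the small quantity delta / y."""
    return z / (1.0 + z) ** 2 * delta / y

z, y = 1.0, 1000.0   # hypothetical values
for delta in (1.0, 10.0, 100.0):
    exact = deviation_exact(delta, y, z)
    approx = deviation_linear(delta, y, z)
    # Columns: delta/y, exact s, linearized s, relative error of the expansion
    print(delta / y, exact, approx, abs(approx - exact) / exact)
```

The relative error of the expansion grows in proportion to Δ/y, which is why the equivalence between fluctuations in the excitation level and in Δ holds only for shallow gradients.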

## 4. Absence of directional sensitivity in single cells

While Fig. 1E showed the absence of directional sensitivity in individual cells, it remains possible that this insensitivity is a result of the P-cadherin knockout. To rule out this possibility, we deposited individual cells from the MTLn3 mammary epithelial cell line [6], as well as individual cells from dispersed WT organoids, into the experimental device for 3 days. Over this time, the cells can move over distances comparable to those determined in organoid experiments. Directionally biased motility would result in enrichment of cells in different device zones (source of EGF, middle of the device, and sink of EGF). As seen in Supp. Fig. S5, no enrichment is observed, indicating that the absence of directional sensitivity in individual cells is not a byproduct of the P-cadherin knockout.

## 5. Instantaneous vs. temporally-integrated gradient sensing

Since the foundational publication of Berg and Purcell [1], most work on molecular sensing has considered the setup where a sensor integrates the signal over a certain time *t*, much larger than the typical turnover time of the ligand molecules, which is controlled by diffusion. As the diffusion brings new molecules to the vicinity of the sensor, fluctuations are averaged out, resulting in a typical decrease of the sensory error. Analyses of gradient sensing without [2] and with [7] communication among the neighboring cells have also revealed similar time dependence due to temporal integration. In contrast, Figure S4 shows that the organoids do not exhibit an increase in sensory precision with time between 1 and 3 days of the experiment duration. This suggests that the integration (or memory) time in this system is smaller than the typical diffusive turnover time. As a consistency check, we point out that the diffusion coefficient of EGF in extracellular space is about 50 µm^{2}/s [8]. Thus a typical diffusion time across a 300 µm organoid would be (300 µm)^{2}/(50 µm^{2}/s) = 1800 s = 30 min, so many biochemical signaling reactions, and the integration scales defined by them, are faster (see [7] for a more careful analysis of time scales relevant for collective gradient sensing). Therefore, in what follows, we consider that ∆, an instantaneous steady-state, rather than time-averaged, difference of the local and the global messenger species, is the readout of our model most relevant for the experiments. At the same time, we refer the reader to the companion article, Ref. [7], where a full analysis with temporal integration is presented. The integration does not change the qualitative picture developed here (existence of a finite gradient sensing unit), but provides somewhat different values for the dependence of the sensory limits on the system parameters.
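The consistency check above is a one-line order-of-magnitude estimate, L²/D. A minimal sketch, with a hypothetical helper name, makes the unit conversion explicit:

```python
def diffusion_time_minutes(length_um, diffusion_um2_per_s):
    """Order-of-magnitude diffusion time L**2 / D, converted to minutes."""
    return length_um ** 2 / diffusion_um2_per_s / 60.0

# 300 um organoid, D = 50 um^2/s for EGF in extracellular space
print(diffusion_time_minutes(300.0, 50.0))  # 30.0
```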

## 6. Mean and variance of the readout variable

The mean and variance of the readout variable are

$$\bar{\Delta}_N = \bar{x}_N - \bar{y}_N, \tag{11}$$

$$(\delta\Delta_N)^2 = (\delta x_N)^2 + (\delta y_N)^2 - 2\,\mathrm{cov}(x_N, y_N), \tag{12}$$

where $\mathrm{cov}(x_N, y_N)$ is the covariance. These expressions in turn depend on the mean and variance of $x_N$ and $y_N$, which we now calculate from Eqs. 2 and 3 in steady state. The mean of $x_N$,

$$\bar{x}_N = G\,\bar{c}_N a^3, \tag{13}$$

follows straightforwardly from Eq. 2, where the term $G \equiv \beta/\mu$ describes the factor by which the number of local species molecules is amplified beyond the number of detected signal molecules. Similarly, the mean of $y_N$,

$$\bar{y}_N = G \sum_{n=0}^{N-1} K_n\,\bar{c}_{N-n}\,a^3, \tag{14}$$

follows from Eq. 3, where $K_n \equiv (M^{-1})_{N,N-n}$. We see that, due to the communication, the global species number in the edge cell is a weighted sum of the signal measurements made by all the other cells. The weighting is determined by $K_n$, which we call the communication kernel and discuss in detail in the next section.

The variance of $x_N$ is easiest to derive in Fourier space. We first consider the fluctuations $\delta x_N = x_N - \bar{x}_N$ and $\delta c_N = c_N - \bar{c}_N$, in terms of which Eq. 2 reads

$$\frac{d\,\delta x_N}{dt} = \beta a^3\,\delta c_N - \mu\,\delta x_N + \eta_N. \tag{15}$$

Fourier transforming and rearranging obtains

$$\delta\tilde{x}_N(\omega) = \frac{\beta a^3\,\delta\tilde{c}_N(\omega) + \tilde{\eta}_N(\omega)}{\mu - i\omega}. \tag{16}$$

Since we are interested in the instantaneous readouts only, the variance is then the integral over all frequencies of the power spectrum,

$$(\delta x_N)^2 = \int \frac{d\omega}{2\pi}\,\frac{\beta^2 a^6\,S_c(\omega) + S_\eta(\omega)}{\mu^2 + \omega^2}, \tag{17}$$

where the cross terms vanish because signal fluctuations are not cross-correlated with local species fluctuations. The noise spectrum $S_\eta(\omega) = 2\mu\bar{x}_N$ follows from Eq. 5, upon which the second term in Eq. 17 integrates to $\bar{x}_N$. The first term in Eq. 17 depends on the power spectrum of signal fluctuations, which for a Poisson process with timescale τ reads $S_c(\omega) = (\bar{c}_N/a^3)\,2\tau/(1 + \omega^2\tau^2)$. We are considering instantaneous readouts, which is equivalent to the diffusion of EGF being slow, i.e., τ → ∞ and $S_c(\omega) \to 2\pi(\bar{c}_N/a^3)\,\delta(\omega)$. This is the same as assuming that the number of signal molecules is Poisson-distributed but fixed in time. Thus Eq. 17 becomes

$$(\delta x_N)^2 = G^2\,\bar{c}_N a^3 + \bar{x}_N. \tag{18}$$

The first term is the extrinsic noise. It arises from fluctuations in the signal molecule number. Since these fluctuations are Poissonian, the variance of the signal molecule number equals its mean $\bar{c}_N a^3$. Then, as these fluctuations are propagated to the local species, they are amplified by the gain $G^2$. The second term is the intrinsic noise. The intrinsic noise arises from fluctuations in the local species number itself. These fluctuations are also Poissonian, and thus the variance equals the mean $\bar{x}_N$.

We follow the same procedure to find the variance of $y_N$. The result is

$$(\delta y_N)^2 = G^2 \sum_{n=0}^{N-1} K_n^2\,\bar{c}_{N-n}\,a^3 + \bar{y}_N. \tag{19}$$

The extrinsic noise (first term) once again scales with the gain $G^2$. It depends on the same kernel $K_n$ that determines the mean, which reflects the fact that, as seen in Eq. 15, upstream fluctuations propagate through linear systems in the same way as the signals themselves [9]. The intrinsic noise (second term) is once again equal to the mean $\bar{y}_N$, which is a necessary consequence of the fact that Eq. 3 is an open system whose reaction rates are linear in the species numbers [10].

Finally, we apply the same technique to find the covariance, which is the integral over all frequencies of the cross-spectrum of $x_N$ and $y_N$. The result is

$$\mathrm{cov}(x_N, y_N) = G^2 K_0\,\bar{c}_N a^3. \tag{20}$$

This expression has a straightforward interpretation: it is the product of two extrinsic standard deviations. The first is the square root of the extrinsic noise in the local species, $G\sqrt{\bar{c}_N a^3}$. The second is the square root of the extrinsic noise in the global species, but only the component affecting the *N*th cell, $G K_0 \sqrt{\bar{c}_N a^3}$. Only extrinsic noise enters because $x_N$ and $y_N$ co-vary only through fluctuations in the extrinsic signal. Only the *N*th component of the global noise contributes because the local species is not communicated, and thus any effect on $y_N$ due to other cells cannot co-vary with $x_N$. Finally, the covariance takes the form of a product of standard deviations because $x_N$ and $y_N$ depend identically on the signal (Eqs. 2 and 3), and therefore the correlation coefficient corresponding to extrinsic fluctuations, $\mathrm{cov}_{\mathrm{extrinsic}}(x_N, y_N)/(\sigma_{x_N}\sigma_{y_N})$, is equal to one.

From the mean, variance, and covariance of *x*_{N} and *y*_{N}, the mean and variance of the readout variable follow via Eqs. 11 and 12. The only thing that remains is to solve for the communication kernel *K*_{n}, which we describe next.

## 7. Communication kernel

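The communication kernel is found by inverting the tridiagonal matrix $M_{nn'}$. First we derive the inverse, and then we present an approximation of $K_n$ in the limit of strong communication and many cells.

Defining $\rho \equiv \mu/\gamma$, the diagonal ($u$), superdiagonal ($v$), and subdiagonal ($w$) terms of $\rho M_{nn'}$ (Eq. 4) are

$$u_n = \begin{cases} 1 + \rho, & n = 1 \text{ or } n = N, \\ 2 + \rho, & 1 < n < N, \end{cases} \qquad v_n = w_n = -1. \tag{21}$$

The inverse of any tridiagonal matrix can be calculated by recursion [11, 12],

$$\left[(\rho M)^{-1}\right]_{nn'} = \begin{cases} (-1)^{n+n'}\,v_n \cdots v_{n'-1}\,\theta_{n-1}\,\phi_{n'+1}/\theta_N, & n < n', \\ \theta_{n-1}\,\phi_{n+1}/\theta_N, & n = n', \\ (-1)^{n+n'}\,w_{n'} \cdots w_{n-1}\,\theta_{n'-1}\,\phi_{n+1}/\theta_N, & n > n', \end{cases} \tag{22}$$

where $\theta_n$ and $\phi_n$ satisfy

$$\begin{aligned} \theta_n &= u_n\theta_{n-1} - v_{n-1}w_{n-1}\theta_{n-2}, & \theta_0 &= 1, & \theta_1 &= u_1, \\ \phi_n &= u_n\phi_{n+1} - v_n w_n\phi_{n+2}, & \phi_{N+1} &= 1, & \phi_N &= u_N. \end{aligned} \tag{23}$$

Since both $v_n$ and $w_n$ are constant and equal to −1, Eq. 22 simplifies to

$$\left[(\rho M)^{-1}\right]_{nn'} = \begin{cases} \theta_{n-1}\,\phi_{n'+1}/\theta_N, & n \le n', \\ \theta_{n'-1}\,\phi_{n+1}/\theta_N, & n > n'. \end{cases} \tag{24}$$

From Eq. 24 we can also deduce that the inverse is symmetric. We write the first few terms of $\theta_n$ and notice the pattern,

$$\theta_1 = 1 + \rho, \quad \theta_2 = 1 + 3\rho + \rho^2, \quad \theta_3 = 1 + 6\rho + 5\rho^2 + \rho^3, \quad \ldots, \quad \theta_n = \sum_{j=0}^{n} \binom{n+j}{2j}\rho^j. \tag{25}$$

The last term $\theta_N$ does not conform to the pattern because $u_N$ is different from its previous terms, so we calculate $\theta_N$ explicitly from $\theta_{N-1}$ and $\theta_{N-2}$ and simplify,

$$\theta_N = (1 + \rho)\,\theta_{N-1} - \theta_{N-2} = \rho\sum_{j=0}^{N-1}\binom{N+j}{2j+1}\rho^j. \tag{26}$$

Then, since $v_n$ and $w_n$ are constants and $u_n = u_{N-n+1}$, we notice from Eq. 23 that

$$\phi_n = \theta_{N-n+1}. \tag{27}$$

Inserting Eqs. 25-27 into Eq. 24 and simplifying, and recalling that the inverse is symmetric, we arrive at the expression for the inverse,

$$\left(M^{-1}\right)_{nn'} = \frac{\rho\,\theta_{n-1}\,\theta_{N-n'}}{\theta_N} = \frac{\left[\sum_{j=0}^{n-1}\binom{n-1+j}{2j}\rho^j\right]\left[\sum_{k=0}^{N-n'}\binom{N-n'+k}{2k}\rho^k\right]}{\sum_{\ell=0}^{N-1}\binom{N+\ell}{2\ell+1}\rho^\ell}, \qquad n \le n'. \tag{28}$$

The communication kernel is a particular case,

$$K_n = \left(M^{-1}\right)_{N-n,N} = \frac{\sum_{j=0}^{N-n-1}\binom{N-n-1+j}{2j}\rho^j}{\sum_{\ell=0}^{N-1}\binom{N+\ell}{2\ell+1}\rho^\ell}. \tag{29}$$

The communication kernel is normalized, $\sum_{n=0}^{N-1} K_n = 1$, which is consistent with its interpretation as a weighting function.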
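The inversion can be cross-checked numerically. The following sketch builds the tridiagonal matrix for degradation at rate µ and nearest-neighbor exchange at rate γ, inverts it with NumPy, and compares the edge-cell row with a binomial closed form. The closed form used here, K_n = Σ_j C(N−n−1+j, 2j) ρ^j / Σ_l C(N+l, 2l+1) ρ^l with ρ = µ/γ, is the expression assumed by this sketch; the function names and parameter values are hypothetical.

```python
import numpy as np
from math import comb

def exchange_matrix(n_cells, gamma, mu):
    """Tridiagonal matrix for degradation (mu) and neighbor exchange (gamma)."""
    r = gamma / mu
    M = np.zeros((n_cells, n_cells))
    for n in range(n_cells):
        neighbors = [m for m in (n - 1, n + 1) if 0 <= m < n_cells]
        M[n, n] = 1.0 + r * len(neighbors)   # edge cells have a single neighbor
        for m in neighbors:
            M[n, m] = -r
    return M

def kernel_numeric(n_cells, gamma, mu):
    """K_n from the last row of the inverse: weight of the cell n steps from the edge."""
    M_inv = np.linalg.inv(exchange_matrix(n_cells, gamma, mu))
    return M_inv[-1, ::-1].copy()

def kernel_binomial(n_cells, gamma, mu):
    """Assumed closed form in terms of binomial sums, with rho = mu / gamma."""
    rho = mu / gamma
    N = n_cells
    denom = sum(comb(N + l, 2 * l + 1) * rho**l for l in range(N))
    return np.array([
        sum(comb(N - n - 1 + j, 2 * j) * rho**j for j in range(N - n)) / denom
        for n in range(N)
    ])

K_num = kernel_numeric(10, gamma=16.0, mu=1.0)
K_bin = kernel_binomial(10, gamma=16.0, mu=1.0)
print("sum of kernel:", K_num.sum())
print("max difference:", np.max(np.abs(K_num - K_bin)))
```

Because each row of the matrix sums to one, the rows of its inverse do too, which is the normalization property of the kernel.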

Now we show that in the limit of strong communication and a large number of cells, the communication kernel can be approximated by an exponential distribution. Since all dependence on $n$ occurs in the numerator of Eq. 29, we approximate the numerator only, and then we set the denominator using the fact that $K_n$ is normalized. The approximation of the numerator follows two steps. First, the factorials in the choose function are written using the Stirling approximation. Second, the sum is simplified using the saddle point approximation.

We expect $K_n$ to have the strongest support at the edge cell and nearby cells, i.e. for small values of $n$. Therefore, applying the Stirling approximation to the numerator of Eq. 29 is valid in the limit

$$j^*/N \ll 1 \ll j^*, \tag{30}$$

where $j^*$ is the value at which the summand peaks. We will see below that this condition is satisfied in the limit of strong communication and many cells.

Ignoring the denominator, we write the exchange kernel as $K_n \propto \sum_{j=0}^{N-n-1} e^{-g_j}$, where

$$g_j = -\log\binom{N-n-1+j}{2j} - j\log\rho. \tag{31}$$

Applying the Stirling approximation $\log(x!) \approx (x + 1/2)\log x - x + (1/2)\log(2\pi)$ yields

$$\begin{aligned} g_j \approx {} & -\left(N-n-1+j+\tfrac{1}{2}\right)\log(N-n-1+j) + \left(2j+\tfrac{1}{2}\right)\log(2j) \\ & + \left(N-n-1-j+\tfrac{1}{2}\right)\log(N-n-1-j) - j\log\rho + \tfrac{1}{2}\log(2\pi). \end{aligned} \tag{32}$$

We now apply the saddle point approximation, which means we approximate $j$ as continuous and expand $g_j$ to second order around its minimum value, permitting the evaluation of a Gaussian integral,

$$K_n \propto \int dj\,\exp\left[-g^* - \tfrac{1}{2}\,g''^*\,(j - j^*)^2\right] = e^{-g^*}\sqrt{\frac{2\pi}{g''^*}}. \tag{33}$$

Here $j^*$ is the value at which the minimum $g^*$ occurs and at which the second derivative $g''^*$ is evaluated. It is found by setting to zero the first derivative of Eq. 32,

$$\begin{aligned} 0 = \frac{dg_j}{dj} = {} & -\log(N-n-1+j) + 2\log(2j) - \log(N-n-1-j) - \log\rho \\ & - \frac{1}{2(N-n-1+j)} + \frac{1}{2j} - \frac{1}{2(N-n-1-j)}. \end{aligned} \tag{34}$$

Ignoring the last three terms because their denominators are precisely the three quantities we have assumed are large, we solve to find

$$j^* = \psi\,(N-n-1), \tag{35}$$

where $\psi \equiv \sqrt{\rho/(4+\rho)}$. Eq. 35 shows that $j^* \sim \psi N$, which means Eq. 30 can be written

$$\psi \ll 1 \ll \psi N. \tag{36}$$

The left condition in Eq. 36 requires that $\psi$ is small. This is satisfied in the strong communication limit $\gamma \gg \mu$, since then $\psi \approx \sqrt{\rho}/2 \ll 1$. The right condition in Eq. 36 requires that $N$ is large (there are many cells), such that the kernel falls to nearly zero still within the organoid. Inserting Eq. 35 into Eq. 32 yields

$$g^* = -(N-n-1)\log\frac{1+\psi}{1-\psi} + \frac{1}{2}\log\left[\frac{4\pi\psi(1-\psi)(N-n-1)}{1+\psi}\right]. \tag{37}$$

Then differentiating Eq. 34, once again ignoring the last three terms, and inserting Eq. 35 yields

$$g''^* = \frac{2}{\psi\,(1-\psi^2)\,(N-n-1)}. \tag{38}$$

Now we evaluate the saddle point result (Eq. 33),

$$K_n \propto \frac{1+\psi}{2}\left(\frac{1+\psi}{1-\psi}\right)^{N-n-1} \tag{39}$$

$$\propto e^{-n/n_0}, \tag{40}$$

where in the second step we drop all $n$-independent prefactors and define $n_0 \equiv 1/\log[(1+\psi)/(1-\psi)]$. We recover the proper prefactor by enforcing normalization,

$$K_n \approx \frac{1}{n_0}\,e^{-n/n_0}, \tag{41}$$

and we see that the kernel falls off exponentially with the number of cells from the edge cell. The kernel length scale $n_0$ can be simplified in the strong communication limit, in which $\psi$ is small,

$$n_0 \approx \frac{1}{2\psi} \approx \frac{1}{\sqrt{\rho}} = \sqrt{\frac{\gamma}{\mu}}. \tag{42}$$

We see that the length scale is the square root of the ratio of a diffusion term (γ) to a degradation term (µ). This is the same form as the length scale of morphogen profiles that are set up by diffusion and degradation, which, like the communication kernel, are exponential in shape [13].
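The exponential shape and its length scale can be illustrated numerically. The sketch below inverts the exchange matrix for a long chain and compares the per-cell decay of the kernel with exp(−1/n_0), using n_0 = 1/log[(1 + ψ)/(1 − ψ)]. The definition ψ = sqrt(ρ/(4 + ρ)) with ρ = µ/γ is taken as an assumption of this sketch, and the function names and parameter values are hypothetical.

```python
import numpy as np

def edge_kernel(n_cells, gamma, mu):
    """Communication kernel seen by the edge cell, by direct matrix inversion."""
    r = gamma / mu
    M = np.diag(np.full(n_cells, 1.0 + 2.0 * r))
    M[0, 0] = M[-1, -1] = 1.0 + r               # edge cells have one neighbor
    idx = np.arange(n_cells - 1)
    M[idx, idx + 1] = M[idx + 1, idx] = -r      # nearest-neighbor exchange
    return np.linalg.inv(M)[-1, ::-1]

def kernel_length(gamma, mu):
    """n0 = 1 / log[(1 + psi) / (1 - psi)], with psi = sqrt(rho / (4 + rho))."""
    rho = mu / gamma
    psi = np.sqrt(rho / (4.0 + rho))
    return 1.0 / np.log((1.0 + psi) / (1.0 - psi))

gamma, mu, n_cells = 16.0, 1.0, 60
K = edge_kernel(n_cells, gamma, mu)
n0 = kernel_length(gamma, mu)
print("n0 =", n0, " sqrt(gamma/mu) =", np.sqrt(gamma / mu))
print("decay per cell:", K[1] / K[0], " exp(-1/n0):", np.exp(-1.0 / n0))
```

For a discrete chain the prefactor of the kernel differs from 1/n_0 by a correction of relative order 1/n_0, so the comparison here is made on the decay rate rather than on the normalization.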

## 8. Fundamental limit to the precision of instantaneous gradient sensing with communication

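We now complete our calculation of the relative noise in the readout variable $\Delta_N$. In the strong communication and many cells limit, the sums in Eqs. 14 and 19 can be approximated as integrals over all positive $n$ that are then easily evaluated using the exponential form of the kernel (Eq. 41) due to the linearity of $\bar{c}_n$ in $n$ (Eq. 1). We insert the results, along with Eqs. 13, 18, and 20, into Eqs. 11 and 12 to obtain

$$\bar{\Delta}_N = G a^3\left(\bar{c}_N - \bar{c}_{N-n_0}\right) = G\,n_0\,a^4 g, \tag{43}$$

$$(\delta\Delta_N)^2 = G^2 a^3\left(\bar{c}_N + \frac{\bar{c}_{N-n_0/2}}{2n_0} - \frac{2\bar{c}_N}{n_0}\right) + G a^3\left(\bar{c}_N + \bar{c}_{N-n_0}\right), \tag{44}$$

where concentrations with non-integer indices are defined by the linear profile of Eq. 1.

From Eqs. 43 and 44 we obtain the relative noise

$$\left(\frac{\delta\Delta_N}{\bar{\Delta}_N}\right)^2 = \frac{1}{a^3\,(n_0 a g)^2}\left[\left(\bar{c}_N + \frac{\bar{c}_{N-n_0/2}}{2n_0} - \frac{2\bar{c}_N}{n_0}\right) + \frac{1}{G}\left(\bar{c}_N + \bar{c}_{N-n_0}\right)\right]. \tag{45}$$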

Eq. 45 gives the relative uncertainty in the system’s estimate of the gradient via its readout *∆*_{N}, in the limit of many cells. In the brackets, the first term in parentheses arises due to the extrinsic noise. The second term in parentheses arises due to the intrinsic noise. The extrinsic and intrinsic terms have a similar structure, and in general as a function of *N* they will have a similar shape, because they both arise from the same kernel (Eq. 29). The intrinsic term reflects the counting noise from the finite number of internal communicating molecules. The extrinsic noise reflects the imperfect averaging performed by the global molecular species, since it has a finite communication length scale.

In principle, the intrinsic noise can be made arbitrarily small by producing more local and global species molecules, which is equivalent to increasing the gain $G$. Moreover, we observe that in the extrinsic noise, the second and third terms are smaller than the first term by a factor of $n_0$. This is because these terms, which involve the global species, benefit from measurements of the external signal across roughly $n_0$ cells due to the communication. These terms are therefore small relative to the first in the strong communication limit. We are then left with

$$\left(\frac{\delta\Delta_N}{\bar{\Delta}_N}\right)^2 \approx \frac{\bar{c}_N a^3}{(n_0\,a^4 g)^2}. \tag{46}$$

This is the central result of this section. Eq. 46 is the fundamental limit to the precision of instantaneous (not temporally averaged) gradient sensing via a LEGI-style adaptive, communicating system. Unlike for a system with temporal integration [7], Eq. 46 does not depend on the measurement time and depends on the spatial averaging scale as $n_0^{-2}$.
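To see how the limiting expression compares with the finite-chain result, the sketch below evaluates the relative noise of the edge-cell readout using a numerically inverted exchange matrix and sets it against the limiting form c̄_N a³/(n_0 a⁴ g)² with n_0 = sqrt(γ/µ). The expressions coded here for the means, variances, and covariance (Poissonian signal amplified by the gain, plus Poissonian intrinsic noise) are assumptions of this sketch, as are the function names and the parameter values, which are chosen only for illustration in units where a = 1.

```python
import numpy as np

def exchange_matrix(n_cells, gamma, mu):
    """Tridiagonal matrix for degradation (mu) and neighbor exchange (gamma)."""
    r = gamma / mu
    M = np.diag(np.full(n_cells, 1.0 + 2.0 * r))
    M[0, 0] = M[-1, -1] = 1.0 + r               # edge cells have one neighbor
    idx = np.arange(n_cells - 1)
    M[idx, idx + 1] = M[idx + 1, idx] = -r
    return M

def relative_noise_sq(n_cells, c_edge, g, gain, gamma, mu, a=1.0):
    """(delta Delta_N / mean Delta_N)**2 for the edge cell, instantaneous readout."""
    kernel = np.linalg.inv(exchange_matrix(n_cells, gamma, mu))[-1, ::-1]
    n = np.arange(n_cells)
    nu = (c_edge - a * g * n) * a**3            # mean ligand number near cell N - n
    mean_x = gain * nu[0]
    mean_y = gain * np.sum(kernel * nu)
    var_x = gain**2 * nu[0] + mean_x            # extrinsic + intrinsic
    var_y = gain**2 * np.sum(kernel**2 * nu) + mean_y
    cov_xy = gain**2 * kernel[0] * nu[0]
    return (var_x + var_y - 2.0 * cov_xy) / (mean_x - mean_y) ** 2

def limit_noise_sq(c_edge, g, gamma, mu, a=1.0):
    """Limiting relative noise for strong communication and many cells."""
    n0 = np.sqrt(gamma / mu)
    return c_edge * a**3 / (n0 * a**4 * g) ** 2

params = dict(c_edge=1000.0, g=5.0, gamma=16.0, mu=1.0)
exact = relative_noise_sq(50, gain=100.0, **params)
limit = limit_noise_sq(**params)
print("finite chain:", exact, " limiting form:", limit)
```

With a kernel length of only a few cells, the two values are expected to agree up to corrections of relative order 1/n_0, and increasing the gain reduces the intrinsic contribution.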

Figure S6 compares the values of the signal-to-noise ratio, $\mathrm{SNR}_N = (\bar{\Delta}_N/\delta\Delta_N)^2$, from Eqs. 11 and 12 with the limiting values, or the fundamental limits, given by Eq. 45. In particular, Fig. S6D and E are the analogs of Fig. 3B and C in the main text, except that Fig. 3 plots the estimate of the organoid bias, *P*(∆_{N} > ∆_{1}), which is easily obtained from SNR_{n}. Note that in Fig. S6D, SNR_{N} decreases at large *N* because large organoids push the *N*th cell to higher concentrations, where gradient sensing is less precise. In contrast, in Fig. 3B in the main text, the bias *P*(∆_{N} > ∆_{1}) saturates, for two reasons: (i) bias derives from both SNR_{1} and SNR_{N}, which are pushed to opposite concentration regimes for large organoids, and (ii) Fig. 3 also includes additive downstream noise, which is independent of both size and concentration, and thus tends to flatten out dependencies.

## 9. Spatially resolved Gillespie stochastic simulations to explore modification of fundamental limits to the precision of instantaneous gradient sensing under violation of linearity assumptions

Our theory above made two linearity assumptions. First, we assumed that receptors are not saturated at high ligand concentrations, allowing us to treat the production rate of messenger molecules as a linear function of the position. Second, we assumed that the readout is the difference of the local and the diffusive messenger. In more conventional analysis of LEGI models, the readout is the concentration of a response molecule *R*, positively modified by the activator *A* and negatively modified by the inhibitor *I* [5, 14, 15]. To verify how our findings for the fundamental limits of collective gradient sensing are affected by these assumptions, we set up numerical stochastic and spatially-extended simulations of the system. Organoids were simulated using the HSim rule-based modeling program [16], version released 4/27/2015. For parameter exploration, a Python script generated model files with appropriate parameters and called HSim with random seeds. Simulations were run on IBM NeXtScale nodes with Intel Xeon E5-2660 V2 and V3 processors.

Simulations were run for model organoids represented as coupled linear chains with the following numbers of cells: 3, 6, 10, 12, 15, 20, 25, and 50. For each simulated cell *n*, a set of molecules (*S*_{n}, *A*_{n}, *I*_{n}, *R*_{n}) was initiated which interacted only with each other (Table S1). In the LEGI model, *S*_{n} (the signal molecule) activates *A*_{n} and *I*_{n}. The activated *A*_{n} was allowed to activate *R*_{n}, and the activated *I*_{n} was allowed to deactivate it. *I*_{n} was also allowed to diffuse to become *I*_{n±1}. Each interaction was modeled as a Michaelis-Menten reaction. *A*_{n} and *I*_{n} were both allowed to deactivate with equal rates. Spherical cells with diameter 10 µm were initialized with *A*_{n} = 1000, *I*_{n} = 1000, *R*_{n} = 500, and molecules. *S*_{N} was initialized to 1000 in each simulation’s final cell, with the gradient of 5 molecules per cell. All kinetic parameters present in both the theory and the simulations were selected to match (see Fig. 4 of the main text and Table S1). To investigate the effects of saturation, deactivation rates of *A*_{n} and *I*_{n} were scaled by 1/4 and 1/10 for partial and full saturation, respectively. High saturation of *A*_{n} and *I*_{n} was confirmed by removing reactions with *R*_{n} and observing nonlinear response to varying *S*_{n}. The diffusion rate of *I*_{n} was scaled accordingly to maintain a communication strength cells. Supplementary Table S1 shows the values of all kinetic rates used in the low saturation simulations.

For each scenario (low, medium, and high saturation) and each number of cells, 16,384 simulations were run for a total of 393,216 runs. Simulations were run sufficiently long (10,000 sec) so that the SNR had reached the steady state. SNR is reported as the squared mean over the variance of the readout in the final cell at the end of simulations. Error bars are determined by bootstrap sampling, reporting variance of 100 re-samples of size 16,384 taken from the original data with replacement.
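The SNR estimate and its bootstrap error bar can be sketched as follows. This is an illustration of the resampling procedure described above rather than the analysis script itself; the function names are hypothetical and the Gaussian test data stand in for the simulated readout values.

```python
import numpy as np

def snr(samples):
    """Signal-to-noise ratio: squared mean over variance."""
    return np.mean(samples) ** 2 / np.var(samples)

def bootstrap_snr_variance(samples, n_resamples=100, rng=None):
    """Return the SNR point estimate and the variance of bootstrap re-estimates.

    Each re-sample has the same size as the original data and is drawn
    with replacement.
    """
    rng = np.random.default_rng() if rng is None else rng
    samples = np.asarray(samples)
    estimates = np.empty(n_resamples)
    for i in range(n_resamples):
        resample = rng.choice(samples, size=samples.size, replace=True)
        estimates[i] = snr(resample)
    return snr(samples), np.var(estimates)

# Synthetic stand-in for 16,384 simulated readout values.
rng = np.random.default_rng(0)
readout = rng.normal(loc=50.0, scale=10.0, size=16384)
point, spread = bootstrap_snr_variance(readout, n_resamples=100, rng=rng)
print("SNR:", point, " bootstrap variance:", spread)
```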

## 10. Limits on the size of the multicellular sensory unit with multiplicative downstream noise

In the main text, Fig. 3, we compared theoretical predictions of the BP model, as well as the model accounting for the communication noise, with the experimental data under the assumption that the noise in initiation of the phenotypic response, downstream of the gradient sensing, is additive. Here we consider a multiplicative noise model. For the BP theory, we again calculate the probability that the measured number of ligand molecules in the *N*th cell is larger than in the first, $\nu_N > \nu_1$. However, now we take $\nu_n$ as Gaussian-distributed with mean $\bar{c}_n a^3$ and variance $\bar{c}_n a^3 f^2$, where $f^2 \geq 1$ represents the multiplicative increase due to downstream noise. Similarly, for our theory with diffusive communication, we calculate the probability that $\Delta_N > \Delta_1$, where $\Delta_n$ is Gaussian-distributed with mean $\bar{\Delta}_n$ and variance $(\delta\Delta_n)^2 f^2$, where both $\bar{\Delta}_n$ and $(\delta\Delta_n)^2$ were calculated earlier in this Supplementary Information. Supplementary Figure S7 is the multiplicative noise analog of Fig. 3 in the main text. Importantly, Fig. S7 demonstrates that our results depend only weakly on the assumed properties of the downstream noise. In particular, with either additive or multiplicative noise, the data support our theory with communication over BP theory (Fig. 3B and C of the main text, and Fig. S7B and C here), and we obtain similar estimates of the multicellular sensory unit given by $n_0$ (Fig. 3D of the main text and Fig. S7D here).
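The difference between the two downstream noise models can be illustrated with a short calculation of the bias probability for Gaussian readouts. The sketch assumes, for simplicity, that the readouts in the first and last cells are independent; that assumption, the function names, and the numerical values are all hypothetical.

```python
from math import erf, sqrt

def prob_greater(mean_a, var_a, mean_b, var_b):
    """P(A > B) for independent Gaussian variables A and B."""
    z = (mean_a - mean_b) / sqrt(var_a + var_b)
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def bias_multiplicative(mean_n, var_n, mean_1, var_1, f):
    """Downstream noise scales each variance by f**2 (f**2 >= 1)."""
    return prob_greater(mean_n, f**2 * var_n, mean_1, f**2 * var_1)

def bias_additive(mean_n, var_n, mean_1, var_1, sigma_d):
    """Downstream noise adds a fixed variance sigma_d**2 to each readout."""
    return prob_greater(mean_n, var_n + sigma_d**2, mean_1, var_1 + sigma_d**2)

# Hypothetical readout statistics for the last and first cells.
for f in (1.0, 2.0, 3.0):
    print("f =", f, " bias =", bias_multiplicative(10.0, 25.0, -10.0, 25.0, f))
for sigma_d in (0.0, 5.0, 10.0):
    print("sigma_d =", sigma_d, " bias =", bias_additive(10.0, 25.0, -10.0, 25.0, sigma_d))
```

In both models, stronger downstream noise pulls the bias toward the null value of 0.5, which is why the inferred sensory unit size is only weakly sensitive to the choice between them.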

## 11. Treatments with gap junction-blocking drugs remove organoid response to EGF gradients

In addition to Endothelin-1, Fig. S8 confirms that other gap-junction blocking drugs also remove the directional response of the organoids.

## 12. Calcium signaling is coordinated in nearby cells

To test the hypothesis that the global, diffusive inhibitory messenger in the organoids is related to calcium signaling (such as IP3 or calcium itself), we manually tracked 5 cells in the area at the front of a growing branch (see Supp. Movie 1) in an organoid derived from a transgenic mouse expressing the genetically encoded Ca reporter GCaMP4 under the control of the CAG promoter [17] (see Fig. S9). Calcium spikes in these cells are highly synchronized, indicating communication by messengers that induce calcium spikes. Note also that the size of the tip is consistent with our estimate of the gradient sensing unit (about 4 cells).

## 13. Gradient establishment in the device

Numerical simulations show that a linear gradient of EGF, a 6.4 kDa protein, is established in our device in less than 24 hrs. We verify this by flowing an easily observable 10 kDa fluorescent protein (Dextran, Cascade Blue, Life Technologies) through the system and imaging it a day after the initiation of the experiment. Supplementary Fig. S10, indeed, shows a nearly linear gradient. EGF is smaller, has a higher diffusion coefficient, and will establish a stable gradient even faster.

## Acknowledgements

We thank Peng Huang for useful discussions. This work was supported in part by James S. McDonnell Foundation grant No. 220020321 (AM and IN), by NSF grants No. 1410978 (IN), 1410593 (AE), and 1410545 (AL), and NIH grant GM072024 (AL).

## Footnotes

^{1}This is similar to the observation that a small difference of two large numbers always has a larger relative error than either of the two numbers, and so one is frequently cautioned against making such subtractions in scientific computing (43).