Stochastic Voronoi Tessellations as Models for Cellular Neighborhoods in Simple Multicellular Organisms

. Recent work on distinct multicellular organisms has revealed a hitherto unknown type of biological noise; rather than a regular arrangement, cellular neighborhood volumes, obtained by Voronoi tessellations of the cell locations, are broadly distributed and consistent with gamma distributions. We propose an explanation for those observations in the case of the alga Volvox , whose somatic cells are embedded in an extracellular matrix (ECM) they export. Both a solvable one-dimensional model of ECM growth derived from bursty transcriptional activity and a two-dimensional “Voronoi liquid” model are shown to provide one-parameter families that smoothly interpolate between the empirically-observed near-maximum-entropy gamma distributions and the crystalline limit of Gaussian distributions governed by the central limit theorem. These results highlight a universal consequence of intrinsic biological noise on the architecture of certain tissues.

Some of the simplest multicellular organisms consist of tens, hundreds, or thousands of cells arranged in an extracellular matrix (ECM), a network of proteins and biopolymers secreted by the cells.They often have a simple geometry: linear chains and rosettes of choanoflagellates [1,2], sheets and spheres of algae [3], and cylinders of sponges [4].While at first glance the arrangement of cells in the ECM appears regular, recent work [5] revealed a hitherto undocumented disorder found by assigning neighborhoods to cells through a Voronoi tessellation based on cell centers.Strikingly, both the labevolved "snowflake yeast" [6] (a ramified form found after rounds of selection for sedimentation speed) and the alga Volvox carteri have broad distributions of Voronoi volumes v accurately fit by k-gamma distributions where v is the mean volume and v c is the cell size.Particularly for Volvox, these observations are central to a general question in developmental biology: How do cells produce structures external to themselves in an accurate and robust manner?Volvox is one of the simplest multicellular systems with which to study statistical fluctuations in ECM generation.The adult (Fig. 1(a)) consists of ∼ 10 3 somatic cells embedded at the surface of a transparent ECM, the uppermost layer of which is a thin elastic shell ∼ 500 µm in diameter and ∼ 30 µm thick, with a more gelatinous interior below; the organism is > 98% ECM.Daughter colonies develop from germ cells below the outer layer through rounds of binary division that yield a raft of cells held together by cytoplasmic bridges remaining after incomplete cytokinesis.Following "embryonic inversion" that turns the raft inside-out [7], daughters enlarge by export of ECM proteins, expanding the colony to its final size over the course of a day, during which the widely-distributed neighborhood volumes appear.Fig-ure 1(a) shows a a section of the Voronoi tessellation obtained by light-sheet imaging [5].The area distribution of Voronoi partitions across 6 organisms is shown in Fig. 1(b) along with a fit of the gamma distribution (1) that yields k ≈ 2.35 ± 0.04 (95% CI).
The general issue above becomes the question of how cells generate ECM so that the spheroidal form is maintained during the dramatic colony growth despite the strong right-skew of the neighborhood volume distribution (1).A biological answer might invoke cell signaling in response to mechanical forces as a mechanism to coordinate growth and would ascribe the distribution (1) to imperfections in that process.Surprisingly, the novel problem of neighborhood distributions is so littlestudied that we do not even understand quantitatively the feedback-free case, surely a benchmark for any analysis of correlations.Work in granular physics [8] has shown that (1) arises from maximizing entropy of partitions subdividing a volume subject to a fixed mean partition size.But this begs the question of why biological systems should follow a maximum-entropy principle.
Here we study perhaps the simplest models for cellular positioning within a thin ECM, where noisy matrix production by statistically identical cells causes them to space apart randomly during growth.We formulate the resulting stochastic cellular configuration as a point process [9] whose Voronoi tessellation is a well-studied topic in stochastic geometry [10] and serves as a model for cellular neighborhoods.A class of analytically solvable 1D models is used to illustrate how gamma distributions (1) may arise from feedback-free growth processes, and a oneparameter family of 2D stochastic Voronoi tessellations is introduced as a prototype of systems with interactions between polygons.The following is a non-technical summary; details are in Supplementary Material [11].In the following, capital letters X i;θ1,... denote random variables (r.v.s) i with parameters θ 1 , . .., and W, X, Y, Z denote Gaussian, exponential, gamma, and beta r.v.s.In an act of extreme reductionism, consider a linear Volvox (termed model I.0 [11]).The gamma distribution has the property of being self-divisible, that is, by the convolution rule for sums of random variables [11].In a 1D ECM (Fig. 1c), if L i are the cellular spacings then V i = (L i + L i+1 )/2 are its Voronoi segments, which suggests the decomposition of V i into spacings L i which are themselves independent and identically distributed (i.i.d.) gamma r.v.s.Similarly, L i is itself interpretable as arising from i.i.d.gamma-distributed mass increments during growth.This leads to two classes (i, ii) of configurations which one may expect to observe experimentally.The first (i) is that where L i is formed by a large number k of small random mass increments, where k-gamma converges to a Gaussian by the central limit theorem, and at fixed mean to a delta distribution [11].
The second (ii) is the case in which L i is formed by a small number of large mass increments, suggesting some intermediate piece comprising the ECM is produced at low copy number.A plausible precedent for fluctuations possessing this particular distribution is the bursting protein transcriptional activity observed in simple unicellular organisms such as E. coli [29].There, it is known that mRNA transcription occurs at some rate when a gene is turned on, individual mRNA molecules are transcribed at some rate into proteins before degrading (e.g. by RNases) with an exponentially-distributed lifetime, and individual protein bursts exported into the extracellular environment correspond 1-1 with individual mRNA transcripts within the cell.Translating this phenomenology to Volvox, we hypothesize that L i ∝ P i where P i is the steady-state extracellular concentration of a protein P governing ECM assembly.Then the time-dependent concentration P i (t) is a pure-jump process with some total number b of exponentially-distributed bursts, resulting in the b-gamma-distributed spacings L i = Y i;b,λ .(Fractional values of b, representing cross-cell averages, yield the same stationary b-gamma distribution, as shown from the master equation for P i (t) with exponential kernel [30].We take b ≥ 1 as every cellular neighborhood grows in Volvox without cell division.)Then, V i is 2bgamma distributed, which, apart from the offset v c , is (1) with k = 2b.That k ≈ 2.4 in Volvox, approaching the 1D lower bound of k = 2 and apparently falling in class (ii), is consistent with observations that ISG, a glycoprotein critical to the ECM organization, is transcribed over a period of 10 minutes, quite short compared to its accumulation in the extracellular matrix over timescales comparable to the 48h life cycle [31].
In the low-copy number limit b → 1, cellular positions R i = j≤i L j occur as a Poisson process.This is the maximum-entropy configuration, in which cell positions are uncorrelated in the sense that for any fixed number of cells N occurring within a fixed segment of size L, {R i } N i=1 are i.i.d.uniform random variables [11].Of course, the gamma distribution is supported on [0, ∞) and one must consider finite-size effects.It can be shown [11] that on a circular ECM of fixed circumference C with fixed or variable cell count N (termed models I.1-I.3), the marginal distribution of Voronoi lengths given the above converges in the large-C, N limits at fixed cell number density to the same 2b-gamma distribution-analogous to the convergence of ensembles in statistical physics in the thermodynamic limit.
To complete the 1D analysis, we show that the offset v c in (1) from finite cell sizes.Suppose cells with centers of mass at {R i } n i=1 on a circle of circumference c have uniform size v c with nv c ≤ c, so that L i ≥ v c for all i.As we are in 1D, this is expressible as L i = v c + Li , where Li are the random spacings of a smaller circle of circumference c − nv c .This reduces to the fixed-N, C case of model I.2, hence we have the marginal Beta distribution for Voronoi lengths Defining the cell number density within the remainder as ρ = (n − 2)/(c − nv c ) and taking the thermodynamic limit n, c → ∞ with ρ and v c fixed, we have by (S46) Thus, the lengths V i with 2ρ := λ in (4) have precisely the distribution (1) under the substitution λ = k/(v−v c ). Unlike in 1D where any sequence of partitions forms the real line, cells in 2D are nearly always interacting since their neighborhoods are mutually constrained to be a subdivision of the ECM.Their positions are derived from the neighborhood configurations, as ECM is secreted during growth, and we should expect that within a Voronoi description the cell locations will depend on those partitions.These geometric constraints co-exist with the possibility of maximum-entropy (Poisson) and minimum-entropy (crystalline) configurations.The family of point processes we introduce below models cellular interactions based on their Voronoi tessellations, interpolates between these phases, and can be interpreted as arising from a strain energy in each neighborhood.Our focus on geometry is complementary to recent work on topological properties of tessellations [32].
Let the ECM now be a bounded domain Ω ⊂ R 2 , with area |Ω| > 0 and a fixed number n of cells.We assume that cells are scattered at positions Each summand of ( 5) is the trace of D i 's second area moment about x i , which for polygonal D i is the small-strain limit of the bulk energy of a deformation from a regular n-gon centered at x i [33].Alternatively, minimizing E has the interpretation of an optimal cell-placement problem with a cost for transporting resources produced by cells at x i to other points x in the neighborhoods D i .For fixed cell positions X , the set D minimizing E is precisely the Voronoi tessellation of X .To see this, note that for Voronoi D, any other D ′ , and a point y falling in D i ∈ D and also in Rescaling the coordinates x → x √ ρ to achieve unit number density ρ = n/ |Ω| → 1, this motivates the study of the positional energy where Vor(X ) is the Voronoi tessellation.For fixed neighborhoods D, calculating ∂E/∂x i = 0 shows that the minimizing positions X are the D-centroids Di xdx; minimizers of V are centroidal Voronoi tessellations (CVTs), ubiquitous in meshing problems, clustering, and models of animal behavior [34].
Define a family of Gibbs point processes [10] whose joint positional distributions conditional on fixed N are indexed by a temperature-like quantity β −1 .Following others who have investigated phase transitions of this system [35], we refer to it as the Voronoi liquid, which differs from classical pair-potential fluids due to manybody interactions between Voronoi-incident particles.
The maximum-entropy case (ii) is realized in the infinite-temperature limit β → 0 with equiprobable configurations.This defines the "Poisson-Voronoi tessellation" (PVT) [10], which reduces to the exponentiallydistributed spacings discussed in the 1D models above.The areas |D i | of 2D PVTs, a realization of which is shown in Fig. 2(d), have been shown in numerical studies to conform to k-gamma distributions [36][37][38].A minimum-entropy configuration arises in the zerotemperature limit β → ∞, where (7) becomes degenerate and the configuration freezes to a hexagonal lattice as in Fig. 2(d), which is the globally optimal CVT and densest sphere-packing in 2D [39].Prior approaches using a structure factor analysis [40] found that, by contrast, Lloyd iterations (corresponding to a "fast quench" at zero temperature [41]) suppress crystalline configurations and adopt amorphous "hyperuniform" states.We investigate now the finite-temperature range β ∈ (0, ∞), and show that areas are accurately described by k-gamma distributions with k an "order parameter" following a monotone relationship with β, analogous to the burst-count-driven spacing distributions of the 1D case.
As a generalization of our previous comment on nonuniqueness, in 2D the entropies of the Voronoi size distribution and of the positional distribution do not necessarily follow a monotone relation.Volvox itself (Fig. 1b) provides an example; its scaled area distribution (1) has k ≈ 2.3, while Poisson-Voronoi tessellations of the flat torus and sphere have k ≈ 3.5 [11], yet their positional distribution is the maximum-entropy one.Hence, "entropy" could refer to the differential entropy of its Voronoi size distribution or to that of its joint distribution over positions at fixed N .
Since V is C 2 [34], (7) is the stationary solution of a Langevin equation with W i (t) i.i.d.Brownian motions, and time has been rescaled to β −1 t to allow integration in the limit β → ∞.
Since ∂V /∂x i ∝ |D i | (x i − µ i ) [11], (8) may be interpreted as a neighborhood-centroid-seeking model of cellular dynamics during noisy growth or a Markov Chain Monte Carlo (MCMC) method to sample the stationary distribution (7).An Euler-Maruyama discretization of (8) does not satisfy detailed balance, but this can be β > 1 (g) Voronoi tessellation of V. carteri [11].(h): Minimum spacing r enforced by the Matérn thinning rule [11], with r equal to the minimum spacing in the frozen limit β → ∞.Voronoi polygons are colored from black to white based on relative area in each panel.
Using this method, we investigated the statistical properties of the Voronoi liquid by numerically solving (8) for n = 10 3 [11], the somatic cell count of Volvox, in the simplified geometry of a unit square with periodic boundary conditions to remove curvature and topology effects.
Figures 2(a-d) show samples of the Voronoi liquid at varying temperatures with evident differences and similarities to Volvox.As seen in Fig. 2(e,f), area distributions sampled at 13 logarithmically spaced values from β = 10 −3 − 1 and 21 linearly spaced values from 1 − 40, are well-described by k-gamma distributions with k increasing monotonically with β.This is consistent with the transition of p(|D i | / D i ) to a parabola on the logscale in Fig. 2e, the limit in which k-gamma approaches a Gaussian.It is in this sense that the control parameter β is analogous to the protein burst count b in 1D.A similar monotone relationship between the "granular temperature" β −1 gr of a packing and k, in which partition size instead played the role of energy, has been noted previously in granular physics [8,43].
The importance of intermediate-entropy configurations is perhaps more readily seen in 2D.Studies of confluent tissue [44] found that k-gamma distributions also arise in the aspect ratios (defined from the eigenvalues of the second area moment).Poisson-Voronoi tessellations, notably, do not possess gamma-distributed aspect ratios.They instead follow an approximate beta-prime distribution, perhaps as a consequence of the gamma-distributed principal stretches [11].This is seen in Fig. 2(a), where high-aspect ratio "shards" occur at β = 0, yet disappear at low temperature.This raises questions of the underlying physics responsible for aspects of stochastic geometry than size.As a simple extension, the Voronoi liquid with hard-sphere thinning [11] (one way to produce the offset v c (1) in 2D), a realization of which is shown in Fig. 2(h), does not exhibit these artifacts and more closely resembles the regular arrangement observed in Volvox, with both gamma-distributed areas and aspect ratios [11].
A biological interpretation of the Voronoi assumption is that the polygonal boundaries of each cellular neighborhood are the colliding fronts of isotropically produced ECM material exported from cells.Inverting the typical modelling procedure by assuming that the Voronoi rule holds at some temperature β −1 , one can infer the distributional parameters using standard maximum-likelihood estimators for Gibbs point processes [45].From the estimated temperature, for example, one can invert the k-β relationship by monotonicity to deduce the copy number of bursty rate-limiting steps in growth.Such estima-tors are critical for elastic models of tissues, where noise in individual cellular configurations co-exists with stable geometric properties of the population.The stochastic Voronoi models we have presented here reproduce aspects of configurational noise, such as the empirically observed gamma distributions, and simultaneously provide a formal framework-an ensemble-within which to infer features of random finite configurations of cells such as interaction strength and preferential geometry.
As a purely mathematical construct, a Voronoi tessellation makes no reference to microstructure around cells, and it thus plays a role for tissues analogous to the random walk model of polymers and the hard sphere model of fluids.Yet, each Volvox somatic cell sits within a polygonal "compartment" whose boundaries are composed of denser material within the larger ECM [46].Dimly visible in brightfield microscopy, these compartments have recently been labelled fluorescently [47], enabling the simultaneous motion tracking of cells and growth of compartments during development.The strong correlation observed [47] between the location of these compartment boundaries and the the associated Voronoi partitions will enable tests of the connection hypothesized here between properties of stochastic ECM generation at the single cell level and population-level statistics.
We are grateful to T. Day, P. Yunker, and W. Ratcliff for numerous discussions.This work was supported in part by the Cambridge Trust (AS), The John Templeton Foundation and Wellcome Trust Investigator Grant 207510/Z/17/Z (SSMHH, REG).

Supplemental Material
This file discusses analytical and computational details pertaining to the main text.

BACKGROUND ON RANDOM VARIABLES
Many of the following are standard facts, recalled for a self-contained reference.

Transforms of random variables and convergence in distribution
Definition S1 (Pushforward measure).Let g : R n → R n be a diffeomorphism and Y = g(X).The pushforward probability measure µ Y is, for all measurable Transforms of random variables.When µ X has a Radon-Nikodym derivative f X with respect to the Lebesgue measure -i.e. is expressible by the probability density µ X (U ) = U f X (x)dx-then by the rule for integration under a diffeomorphic change of coordinates y = g(x), with ∂g −1 /∂y denoting the Jacobian of the inverse.As this is true for all U we conclude When preferable to work with g rather than g −1 , we may apply the chain rule to g −1 • g = 1 to convert (S3) to Due to this fact, we will abbreviate the scaling factor as J −1 , denoting the inverse Jacobian determinant.For affine transforms Y = cX + b, (S3) becomes Sums of random variables.Let X, Y be independent random variables taking values in U X , U Y ⊆ R. Then their sum is distributed as the convolution This can alternatively be deduced by applying the transform rule (S3) to the map (X, Y ) → (X, X + Y ).

Definition S2 (Convergence in distribution
Several properties of the exponential and gamma random variables Let us recall with proof some properties of the gamma random variable (and the exponential, which is a special case).
Definition S3 (Gamma random variable.).The gamma random variable Y k,λ with shape parameter k > 0 and rate λ > 0 is the continuous random variable with probability density function Its namesake is the normalizing constant, the gamma function Lemma S1 (Gamma random variables are closed under addition).The sum of two independent gamma random variables Taking the Beta function with the change of variable t = y 1 /y 2 yields Substituting into (S10) and letting x = y 2 , we obtain Corollary.The sum of k iid exponential random variables of rate λ are gamma-distributed with shape parameter k and rate λ.
Corollary.The gamma random variable is infinitely divisible.
Corollary.By the central limit theorem, (λY Lemma S2 (Beta-gamma convergence).Let Z 2,m be a beta random variable.Then m α Z 2,m converges in distribution as m → ∞ to the gamma random variable Y 2,α of rate α.

Proof. By direct calculation,
with the change of variables s = 1 − t, Lemma S3 (Memoryless characterization of the exponential).The only continuous random variable X which (i) possesses a cumulative distribution function F X such that F ′ X (0) exists and (ii) satisfies the memoryless property is the exponential.
Proof.By Bayes' theorem, (S21) is equivalent to Since by hypothesis (i) F ′ (0) exists, (S23) implies F ′ exists everywhere.Let u be such that Integrating, we obtain G(x) = b exp(cx) for some b.By the conditions , and X is an exponential random variable of rate λ.
Lemma S4 (Maximum-entropy characterization of the exponential).The only nonnegative continuous random variable X with density f X which maximizes the entropy with fixed mean µ > 0 is the exponential.
Proof.By hypothesis, f X is a critical point of the functional with Lagrange multipliers λ 0 and λ 1 constraining the 0th and 1st moments in the Lagrangian density Since for all test functions φ, the Fréchet derivative vanishes, by the fundamental lemma of the calculus of variations, . Applying the total probability constraint, where λ 1 > 0 necessarily.Then, f X (x) = λ 1 exp(−λ 1 x) is the density of an exponential random variable with rate λ 1 .
Theorem S5 (Characterization of gamma random variables, Lukacs 1955 [12]).Let Y 1 , Y 2 be independent random variables.Then are independent if and only if Y 1 , Y 2 are gamma random variables of the same rate λ.
Proof ( =⇒ ).Consider the map Then for a ̸ = 0, By (S4), the pushforward density is The total A and fraction B are therefore independent.Substituting the Beta function (S11), Then f A (a) is a gamma distribution (as expected, S1) and f B (b) is a beta distribution.Therefore, Note that (S38) is independent of the rate λ of Y 1 , Y 2 .The converse ( ⇐= ), that gamma distributions are unique in possessing this independence property, is not proven here, but we refer the reader to a proof [13] using the fact that the gamma distribution is uniquely determined by its moments (3.3.25, [14]).
Corollary (Beta-thinned gamma random variable).If Y 1;k1+k2,λ and Z 1;k1,k2 are independent gamma and beta random variables, respectively, then is an independent gamma random variable of the same rate λ and lower shape parameter k 1 .
The thinning here refers to the fact that Z 1 is supported on [0, 1], hence decreasing the number of events occurring in a section of a 1D Poisson process.
Lemma S6 (Differential entropy of a fixed-mean gamma random variable is strictly decreasing in k ∈ (1, ∞)).
Proof.Let Y k,λ be a gamma random variable of shape parameter k and rate λ.Its differential entropy is where ψ(k) = d dk log Γ(k) is the digamma function.Hence, at fixed mean µ = k/λ, using the hypothesis k > 1 and the trigamma inequality ψ ′ (k) > 1/k [14], we have the k-monotonicity Note furthermore that H(Y ) ≥ 0 for k ≤ 1, so one recovers the maximum-entropy property of the exponential random variable S4.
A note on maximum likelihood estimation.Throughout, we use standard methods for maximum-likelihood estimation of the shape and scale parameters (k, λ) of the gamma distribution at fixed offset 0, as implemented in Python's scipy.stats.gamma.fitfunction.As the maximum likelihood estimate of λ is given by the k/x where k is given and x is the empirical mean, we estimate 95% confidence intervals for the shape parameter k at fixed scale λ using N = 1000 parametric bootstraps as follows.Each sample of size n, where n is the size of the original dataset to be fit, is produced from a gamma distribution with ( k, λ), the estimated parameters from the original empirical distribution.Then, at fixed λ, a new k is maximum-likelihood estimated from these samples.From this set { ki } N i=1 , the values corresponding to the (2.5, 97.5) percentiles are reported as the 95% CIs.

1-DIMENSIONAL MODELS OF CELLS WITHIN AN ECM
Recall that the configuration of a sequence of cellular positions {R i } in a one-dimensional ECM (e.g.[0, ∞) or the circle S 1 ) is uniquely defined, up to relabeling, by the intercellular spacings L i = R i+1 − R i .In the following models, we let L i be given (up to appropriate scaling) by the steady-state protein concentration P * i referred to in the main text.Realizing random 1D configurations in different ways by applying constraints analogous to the ensembles of statistical physics, we obtain the same k = 2b-gamma distribution governing the Voronoi lengths in a regime analogous to the thermodynamic limit.These models are named by the ECM dimension d (I or II) and described by the particular ensemble considered.as summarized in Table I.Notation.We denote random variables by capital letters X i;α,β,... accompanied by indices i and parameters α, β, • • • .Variables i ̸ = j are independent unless otherwise noted.The letters W i;µ,σ 2 , X i;λ , Y i;k,λ , Z i;α,β , and N i;λ , and are reserved for Gaussian, exponential, gamma, beta, and Poisson random variables respectively.X pdf ∼ f X (x) indicates that X has the probability density function f X (x), with X cdf ∼ F X (x) indicating the same for the cumulative density.
where p(v) is the distribution (1).
As in the Gibbs, microcanonical, and grand canonical ensembles, we consider three types of random configurations of cells in the ECM: (i) fixed cell counts N = n, with random intercell spacings and circumferences; (ii) fixed circumferences C = c and counts N = n, with random positions {R i } N i=1 ; and (iii) fixed circumferences C = c, with random cell counts N and positions.Like the convergence of the ensembles in the thermodynamic limit, we show that the same gamma distribution arises in the large-n, c limits of these cases.
Model I.0 (on the half-line) -Consider a semi-infinite Volvox modelled as a sequence of cellular positions {R i } ∞ i=1 on the half-line [0, ∞) as in Fig. 1(c).In the special case of b = 1 protein bursts, R i are the cumulative sums of i i.i.d.exponential random variables: and therefore occur on the half-line [0, ∞) as a Poisson process [9].This is the configuration of the ideal gas-or the maximum-entropy configuration-in which the number of cells observed in any interval of length ℓ is a Poisson random variable N λℓ whose positions are i.i.d.uniform random variables [9].This is the case of class (i) configurations.
The Voronoi lengths V i = (L i + L i+1 )/2 = (X i;λ + X i+1;λ )/2 are therefore gamma-distributed with k = 2, This result already shows explicitly the deep link between the Voronoi construction and gamma distributions.More generally, for burst count b, the resulting Voronoi lengths, by the same argument, are k = 2b-gamma random variables.
In the opposite limit, holding the mean spacing E[L i ] = b/λ fixed while taking the burst count b → ∞, the variance σ 2 = b/λ 2 = E[L i ] 2 /b vanishes while the central limit theorem ensures the convergence of the shifted and rescaled lengths √ b Li E[Li] − 1 → W i;0,1 to a Gaussian.This is the perfectly spaced lattice of cellular positions occurring as the natural numbers N on [0, ∞)-the "crystalline," or class (i) configuration.
Model I.1 (circular Gibbs) -Consider a circular Volvox, constructed by selecting a fixed number N ;λ = n + 1 of successive points {R i } n+1 i=0 from the half-line in I.0 and identifying the first and last points, as in Fig. 1(c).This circle has a random circumference C ("Gibbs ensemble") which is nb-gamma distributed, with resulting Voronoi lengths V i = Y i;2b,2λ governed by k = 2b-gamma distributions by an identical argument as in I.0.Fig. 1(d) shows an empirical distribution of V i from 10 4 samples.
Model I.2 (circular canonical)-Consider a fixed-N , fixed-C configuration as follows.Let us take the intercellular spacings L i to be b-gamma random variables Y i;b,λ as in I.1, conditional on their sum which one shows by the fraction-sum independence property of gamma random variables S5, where B(α, β) is the Beta function.Then, by inspection of (S44) and scaling laws for r.v.s, L i is the beta random variable A|C = cZ i;b,b(n−1) with C = c now the fixed circumference.Once more, in the special case of protein burst count b = 1, L i = cZ i;1,n−1 , which is simply the first order statistic of (n + 1) i.i.d.uniform random variables on the interval [0, c] (with n + 1 arising from identifying the ends of the interval to form a circle).Then, this is the distribution of uniform spacings of the interval [0, c], which is precisely the distribution of waiting times for n events in a Poisson process conditional on a total wait time C = c (see §4.1, [15]), as shown in Fig. 1(c).
Since Z i;b,b(n−1) can be expressed as the fraction Y i;b,λ / n j=1 Y j;b,λ S5, this allows us to compute the corresponding Voronoi lengths V i as with subscript Z j indicating that Z j is a distinct (though not independent) random variable from the earlier Z i .In the Poisson case b = 1, this is now simply one-half the second order statistic of i.i.d.uniform r.v.s. on [0, c].
Taking the large-cell count and ECM circumference limits n, c → ∞ limit at fixed cell number density ρ = m/c = (n − 2)/c (analogous to the thermodynamic limit of statistical mechanics) and per-cell burst count b, one obtains S2 the convergence in distribution of the Voronoi segments, As in I.2, let us consider the kth order statistic of N uniform random variables (representing the fixed-circumference constraint), with N λ now a Poisson-distributed random variable conditioned to be minimum k.The marginal distribution of the order statistic given λ is (a particular) compound beta-Poisson distribution, given by (S47) To see the parametrization more clearly, let m = n − k + 1; then, Recognizing the second factor as the gamma distribution, the first factor Z is simply a normalizing constant restricting the support to [0, 1].Here we see the emergence of gamma distributions from the order statistics of a Poissondistributed number of uniform random variables.This may be viewed as a roundabout method of constructing the 1D Poisson process.As we take the support of this distribution (the circumference C) to infinity, we have P (N Cλ < k) → 0 for any fixed k, hence Z → 1 and we recover the true gamma distribution.
For the general b case, we note that this construction is equivalent to placing an observation window [x, x + c] at random on the half-line process I.0, in which case the spacings L i follow a truncated b-gamma distribution: with γ the lower incomplete gamma function.Taking c → ∞ with rate λ fixed (thermodynamic limit), we have , giving the same k = 2b-gamma distribution for Voronoi lengths.Model I.4 (circular canonical, finite size)-This refers to the fixed-cell size model discussed and analyzed in the main text.with d the L 2 metric and ν the uniform (Haar) measure on the unitary group U(n).Therefore, if r(t) satisfies: the RHS in (S59) vanishes in large t, and global exponential stability is assured.Then, as in ID.2, the positions t approach a random uniform spacing of a circle of radius r(t) in large n, resulting in gamma-distributed Voronoi segments V i .Rapid convergence in t to a gamma law (with D = 0.1, n = 1000, ṙ = 1) is shown in Fig. S1(c2).Formally, (S60) is not satisfied by such a linear growth law, yet we observe empirical convergence to a gamma law on the order of a unit of (scaled) time.
Condition (S60) can be understood by a Péclet number relating drift and diffusion timescales in growth.Considering the radial drift velocity v r in (S58), let with L a test length section.In the limit Pe → 0, condition (S60) is satisfied; when Pe → ∞, the exponential multiplier in the bound (S59) approaches 1, "freezing" the angles θ t to initial conditions.Finally, conditions Ḋ = 0, Ṗe = 0, and (S60) cannot simultaneously be satisfied.
Model I.D.4 (maximum-entropy growth rates)-Let the segments grow linearly in time as Li = G i;µ , with i.i.d.growth rates G i which are positive continuous random variables with some common mean growth rate µ.If the distribution of G i maximizes entropy subject to the mean and nonnegativity constraint-perhaps more interpretable as uncertainty about cellular behavior than global maximum-entropy assumptions [8]-then G i;µ = X i;1/µ is an exponential random variable of rate λ = 1/µ.Then, for any time t, the normalized configuration converges in large t to a uniform spacing as in I.2, and therefore has beta-distributed Voronoi lengths converging to gamma-distributed lengths in large n (S46).

DEFINING POINT PROCESSES IN d DIMENSIONS
Many of the following definitions are standard and recalled here for a self-contained reference.

Definitions in R d
The familiar Poisson process of rate λ on the half-line [0, ∞) is constructible as the cumulative sum of i.i.d.exponential random variables X i;λ .The construction of general point processes on a domain K ⊆ R d , however, is more technical, formalizing the notion of a "random almost-surely finite subset," for which we recall several standard definitions [9].Throughout, we assume K is a complete separable metric space (c.s.m.s.), e.g. a closed subset of R n .Definition S4 (Finite point process -2.2-2.4,[9]).A finite point process N on a complete separable metric space K is a family of random variables N (E) for each Borel set E ∈ F K , such that, for every bounded E, Formally, N is a random measure (see e.g.5.1 [9] or Chapter 9 [9]).Without delving too deeply into this formalism, let us introduce the following definition based upon samples of N .Definition S5 (Simple point process).A simple point process N is one whose points are non-overlapping.In other words, every sample of N can be written as the counting measure where I is an index set, δ denotes the Dirac measure, and P[x i = x j ] = 0 for all i ̸ = j.
isoperimetric deficit D = L/ √ 4πa − 1 with L the perimeter, conforms after an appropriate rescaling to a log-normal random variable with σ ≈ 0.6, shown in Fig. S2(c2).The aspect ratio AR of a Poisson-Voronoi tessellation does not conform to a gamma distribution (Fig. S2(b2)), in contrast to confluent tissue [44] in which gamma-distributed aspect ratios appear in a diverse range of densely-packed cell types and inert matter.Instead, AR conforms approximately to a beta-prime distribution, which naturally arises as the ratio of independent gamma random variables.Figure S2(d2-e2) shows that the major and minor axis lengths are approximate gamma-distributed; while we do not in general expect the major and minor axis to be independent random variables, the beta-prime distribution governs the aspect ratio in the event that they are, with shape parameter k 1 /k 2 .
Validating Poisson-Voronoi simulations Let V i be the measure (area, volume, etc.) of the typical Poisson-Voronoi cell.While the distribution of V i is presently unknown, numerically integrated second moments of V i in R 2 and R 3 have been reported [23], facilitating comparison with numerical study.Large simulations [36,37] with n > 10 6 cells have found that gamma distributions, and in particular a 3-parameter generalization [36] achieve good maximum-likelihood fit to data with < 1% error relative to the analytical second moment.In Fig. S3, the estimated second moment ⟨V 2 i ⟩, shape parameter k, and CDF root mean square error are displayed for 500 trials of N pdf ∼ Poisson(10 3 ) points.The average empirical, gamma, and beta second moments show good agreement with Gilbert's [23] numerically integrated value of 1.280 and are within 1% relative error, validating the numerical method.The estimated value of k ≈ 3.7 for gamma on the torus T is consistent with prior results finding k ≈ 3.6 [38] in the plane R 2 .On the other hand, the estimated value of k ≈ 3.2 for beta is lower than gamma and closer to Tanemura's [36] generalized-gamma (S69) fit finding k = 3.315, suggesting that a beta hypothesis is a good substitute for the generalized gamma distribution.Lastly, we observe that the beta RMSE is slightly decreased compared to gamma.

A conjecture for the distribution of Poisson-Voronoi areas
In dimension d ≥ 2, exact distributions for the cell measures (areas, volumes, and so on) of Delaunay triangulations (denoted D i ) of a Poisson point process are known [24], with the distribution in R 2 given by a modified Bessel function of the second kind, with parameters c i , α j , n.On the other hand, exact distributions for their vertex-cell duals-the Voronoi tessellations-are presently unknown.We conjecture, however, that (S70) should also govern the Voronoi areas V i based on the following heuristic argument.Conditional independence of the volume of the fundamental region (a particular set containing the vertices of a Voronoi cell and the origin) and its shape (see [21]) suggests that one may assume a particular approximate shape for the typical Voronoi cell, say, an approximately elliptical region.If, as seen in Fig. S2, the principal axes are gamma-distributed, and additionally they are independent, then V i is proportional to their product and has a distribution precisely of the form (S70) (see [25]).A possible route to proving this claim is to show that the desired independence holds in a thermodynamic limit.

THE VORONOI LIQUID
Let Ω ⊂ R d be a compact connected measurable set with positive d-dimensional Lebesgue measure |Ω| > 0 as in the main text.For a finite set of points {x i } n i=1 = X ⊂ Ω d and associated partitions which is the approximation error of points y ∈ Ω d , and therefore the domain Ω d itself, by the partitions {x i , D i }.This illustrates the origin of (S71) in meshing problems [34].i ] = 1.280 found by [23] is plotted as a dashed line in the left plot, indicating good agreement with our numerical simulations.The experimentally determined shape parameter k = 3.315 of the generalizedgamma fit found by [36] is plotted as a dashed line in the middle plot.

The intensive energy
Define U to be the energy per unit measure We argue (S72) is an intensive quantity as follows.First, consider the "tiling" (frozen) phase in which X , D is a tiling of Ω with x i the centroids of congruent cells which is simply the cell's second moment S i , nondimensionalized by the cell measure.Recall that in 2D, for example, tr(S i ) has units length 4 .More generally, if ⟨E i ⟩ is the average partition energy, then U is expressible as Now we note that U is scale-invariant as follows.Let x → αx with α > 0, so that |Ω| → α d |Ω| and Lastly, let us make an informal argument for U being a dimensionless quantity independent of n in the general-X , D case.Let a ∼ b indicate that a is of the same units and scale as b, such that |Ω| ∼ V is the scale of the system measure.Then, Let ∥x i − y∥ ∼ ℓ be the length scale of the typical partition and note that E i ∼ ℓ 2+d .Assuming further the decomposition ℓ d ∼ V /n, the generalization from the tiling case, we have

Stochastic gradient flow on the torus
Fix now Ω = T d .It is shown in [34] that E is C 2 and where µ i is the centroid of D i .This suggests the continuous-time gradient flow ẋi Consider now the Itô process where W i (t) are independent standard Brownian motions.In the limit σ → ∞, (X (t), D(t)) converges in distribution (in large t) to a Poisson-Voronoi tessellation, while in the limit σ → 0 it converges to a centroidal Voronoi tessellation [34,39].The corresponding Fokker-Planck equation for the joint probability density p(R 1 , . . ., R n ) =: p(R) of (S90) is Since U is C 2 and now compactly supported, obeying growth conditions required [26], a Gibbs measure µ β at temperature β −1 exists and is the stationary solution of (S91), given by with Z β the normalizing constant and β −1 = σ 2 /2 the diffusion constant.Then (S92) minimizes the Gibbs free energy functional with E f β denoting the expectation with respect to f β and H the differential entropy of f β .The previous limits-Poisson-Voronoi and centroidal Voronoi tessellations-correspond to the infinite-(β → 0) and zero-(β → ∞) temperature limits respectively.

Specifics of the Langevin simulation
Let us now sample from the Gibbs measure (S92) using Markov Chain Monte Carlo (MCMC).We take Ω = T d to be the unit d-cube with periodic boundary conditions.Euler-Maruyama integration of (8) with discrete steps ε > 0 is performed by As (S94) no longer necessarily satisfies detailed balance with respect to (7), we add a Metropolis-Hastings step, a procedure known as the Metropolis-adjusted Langevin algorithm (MALA) [42].That is, the transition R i (t) → R i (t + ε) occurs with probability min(1, α), where We iterate (S94), (S95) for n = 1000 particles in the unit square with periodic boundary conditions.For non-Poisson configurations (β > 0, Fig. 2(b)-(d)), we use the "frozen" initial condition (Fig. 2(d)), computed using a gradient descent of ( 5), which has a faster mixing time than starting with the infinite-temperature (Poisson) initial condition.
We use the following simulation conditions to produce Fig. 2 in the main text.
• n = 1000 particles The sufficiency of the mixing time n mix is validated by running the same procedure again with double the time 2n mix , which produced no measurable differences in the resulting distributions except at high β = 100, k ≈ 2000.
Enforcing hard-sphere constraints We now apply a hard-sphere constraint to samples of the fixed-N , V ensemble (7).One of the simplest methods is the Matérn thinning rule (S68), which filters points from a particular configuration X to X ′ such that a minimum user-specified spacing d min is satisfied.Fig. S5 shows several such samples at fixed β = 10 4 and d min a multiple α ∈ {0, 0.25, 0.5, 1} of the minimum spacing in the frozen phase (Fig 2(d) in the main text).As seen in Fig. S5(e), the hard-sphere thinning shows a transition to gamma-distributed aspect ratios.
A second method is to add a divergent term to E (5) for invalid configurations for which f β (7) is vanishing.The Langevin diffusion ( 8) is of course no longer interpretable due to the loss of derivatives.However, we can interpret its discretization (S94) at valid configurations as an arbitrary sequence of proposals (still outperforming the random walk) for which the Metropolis-Hastings step (S95) rejects invalid configurations (V ∆ = ∞) and ensures detailed balance with respect to f β .We mention this approach for completeness; however, additional modifications to MALA need to be made to improve the sampling efficiency, as we find this procedure to be computationally intractable at even moderate hard-sphere radii.One possible approach is to use Hamiltonian Monte Carlo [42] with constraints.

APPROXIMATE VORONOI TESSELLATION OF THE SURFACE OF V. CARTERI
Below are standard data processing procedures recalled for a self-contained reference, which we compose in a pipeline to produce a simplicial Voronoi tessellation about the somatic cells on the surface of Volvox.This establishes the approximate anterior-posterior axis of the spheroid.(c) The ℓ 2 -optimal projection of the somatic cell positions X to this ellipsoid are computed, and the Delaunay tetrahedralization of this projected point cloud X is computed.As X is now its own convex hull, the index triplets corresponding to the triangular faces lying on the convex hull of the tetrahedral complex are taken to be an approximate Delaunay triangulation of the original point cloud X.(d) A quasi-2D Voronoi tessellation of the surface is computed by taking the ℓ 2 -optimal planar embedding of the simplicial ring around any vertex, and constructing a planar Voronoi face in that embedding using the usual circumcenter rule.If invalid (self-intersecting) polygons are produced by the planar embedding, they are corrected by taking the 2D convex hull, which in the limit of zero curvature is a no-op.This procedure introduces overlapping artifacts near gonidia (as seen) but is otherwise unaffected by the global radius of curvature which is large compared to the size of individual polygons.
The cost is bi-convex in the parameters v and n.For fixed n, the critical point of the cost in v is given by A common method to estimate n in this LSQ problem is the singular vector of X corresponding to the smallest singular value, which of course is 0 if {x i } are coplanar.By (S112) it follows that the ℓ 2 -orthogonal projection of X onto H is Let u ∈ R d be a random vector such that u × n ̸ = 0 (e.g. a Gaussian vector, for which this is almost surely true); then an invertible planar embedding Y H of a point Y lying in H is defined by the random orthonormal plane basis B: 3D reconstruction & analysis of Volvox First, 3D meshes and an ellipsoidal approximation of the organism's surface are constructed using the procedure detailed in Figure S6.Then, all Voronoi polygon areas (including those near gonidia, which typically introduce high-aspect-ratio polygons) are converted to solid-angles (4π times the area fraction of total) and are filtered by the cutoff v c = 0.007 specified in [5].The empirical area distribution for each organism is shifted by this cutoff and nondimensionalized to empirical mean 1, at which point they are combined across all 6 organisms and shown as a combined histogram in Figure 1.

FIG. 1 :
FIG. 1: The green alga Volvox and 1D models for cell positions.(a) Adult with cell types labelled, and section of Voronoi tessellation around somatic cells (dots within each Voronoi polygon), adapted from [5].(b) Voronoi areas x follow the translated gamma distribution (1), computed by a fit of k.(c) Model I.1 [11], the Poisson process on [0, ∞), where cells are indicated by dots, Voronoi boundaries by vertical line segments and intercalating ECM is colored.Model I.4, the circular Poisson process with minimum spacing.(d) Numerical distribution for model I.4, with no cell overlaps, compared to the analytical gamma distribution for large n.

FIG. 2 :
FIG.2: Voronoi liquid interpolates between maximum-and minimum-entropy point configurations in 2D.(a)-(d): Monte Carlo simulations of the Voronoi liquid at varying temperatures, from the infinite-temperature (Poisson) limit to the zero-temperature (sphere-packing) limit.(e): Area distributions at all temperatures are approximately k-gamma distributed, with k monotone increasing with β. (f): k is constant for β < 1 and begins to grow superlinearly for β > 1. (g) Voronoi tessellation of V. carteri[11].(h): Minimum spacing r enforced by the Matérn thinning rule[11], with r equal to the minimum spacing in the frozen limit β → ∞.Voronoi polygons are colored from black to white based on relative area in each panel.
S46) whose limit is once again the k = 2b gamma random variable (with rate ρ) in I.0 and the Gibbs ensemble I.1.Model I.3 (circular grand canonical)-Let the circumference C = c now be fixed with the count N a random variable.In the case of b = 1 bursts, N = N ;cλ is Poisson-distributed.Using a similar argument to I.2, the Voronoi length distribution p Vi can be determined by marginalizing the joint distribution p Z;2,n−1,N ;λ over n, giving the compound beta-Poisson distribution which is once again a k = 2 gamma distribution, shown as follows.

1 FIG
FIG. S1: Dynamic models of cellular positions during growth converge to gamma distributions in various limits.(a1,a2) Standard Brownian motion in angular coordinates converges to beta-distributed segments in large t, and gamma-distributed segments in large n.(b1,b2) Dyson Brownian motion in angular coordinates converges to gamma-distributed segments at large t and n.Non-conformance to gamma distributions is observed at low n due to pair-repulsion.(c1,c2) Dyson Brownian motion on a growing domain satisfying particular growth constraints converges in large-t and n to gamma-distributed segments.
FIG.S3: Gamma-and Beta-maximum likelihood fits for Poisson-Voronoi tessellations of the flat torus T 2 .The numerically integrated value of the second moment E[V2  i ] = 1.280 found by[23] is plotted as a dashed line in the left plot, indicating good agreement with our numerical simulations.The experimentally determined shape parameter k = 3.315 of the generalizedgamma fit found by[36] is plotted as a dashed line in the middle plot.

•
k = 1000 samples at each temperature β −1 in distinct Markov chains • β ∈ [10 −3 , 1] at 13 logarithmically spaced intervals and β ∈ [1, 40] at 21 linearly spaced values • n mix = 1000 mixing steps for each chain prior to sampling • ε = 1.0 time step size • n freeze = 2000 gradient steps of size ε freeze = 0.1 to produce the frozen initial condition FIG. S6: Computational pipeline for tessellating the surface of V. carteri.(a) The somatic cell positions, available from[5], displayed in three-dimensional space.(b) The minimum-volume bounding ellipsoid containing the somatic cells is computed.This establishes the approximate anterior-posterior axis of the spheroid.(c) The ℓ 2 -optimal projection of the somatic cell positions X to this ellipsoid are computed, and the Delaunay tetrahedralization of this projected point cloud X is computed.As X is now its own convex hull, the index triplets corresponding to the triangular faces lying on the convex hull of the tetrahedral complex are taken to be an approximate Delaunay triangulation of the original point cloud X.(d) A quasi-2D Voronoi tessellation of the surface is computed by taking the ℓ 2 -optimal planar embedding of the simplicial ring around any vertex, and constructing a planar Voronoi face in that embedding using the usual circumcenter rule.If invalid (self-intersecting) polygons are produced by the planar embedding, they are corrected by taking the 2D convex hull, which in the limit of zero curvature is a no-op.This procedure introduces overlapping artifacts near gonidia (as seen) but is otherwise unaffected by the global radius of curvature which is large compared to the size of individual polygons.
is the centroid.Letting X = X − v, the critical point of the cost in n

TABLE I
Gradient flow of the quantization energy from Poisson initial conditions on the unit circle have Voronoi lengths which are well-approximated by beta and gamma distributions with increasing shape parameter k.