Continuous Rate Modelling of bacterial stochastic size dynamics

Bacterial division is an inherently stochastic process. However, theoretical tools to simulate and study the stochastic transient dynamics of cell-size are scarce. Here, we present a general theoretical approach based on the Chapman-Kolmogorov formalism to describe these stochastic dynamics including continuous growth and division events as jump processes. Using this approach, we analyze the effect of different sources of noise on the dynamics of the size distribution. Oscillations in the distribution central moments were found as consequence of the discrete translation invariance of the system with period of one doubling time, these oscillations are found in both the central moments of the size distribution and the auto-correlation function and do not disappear including stochasticity on division times or size heterogeneity on the population but only after include noise in either growth rate or septum position.


Introduction
Recent experiments involving time-lapse microscopy [1], singlecell tracking [2,3], and gene tagging [4] have revealed how cell size stochasticity, and division events play an important role in the random fluctuations of bio-molecular concentrations [5][6][7][8]. This with important consequences for phenotype variability and cell heterogeneity over a clonal population of microorganisms [9].
Methods approaching the bacterial size control include the discrete stochastic maps (DSMs) [10]. These models define the known as division strategy as a map that takes cell size at birth s b to a targeted cell size at division s d trough a deterministic function s d = f (s b ) plus stochastic fluctuations that have to be fitted from experiment. DSMs, however, are unable to reproduce cell size transient dynamics at arbitrary infinitesimal time intervals without further extension.
To solve this continuous dynamics of the distributions describing stochastic processes usually the Chapman-Kolgomorov formalism (CK) is used [11]. CK solutions corresponds to the distributions of all the possible stochastic hybrid trajectories at a given time. Among the processes that can be modeled by CK it can be included continuous size growth and division as a jump process requiring the definition of a continuous rate of division. In fact, these models are also known as Continuous Rate Models (CRM) [10]. Recently, [12] proposed a power-law function to explain observations in E. coli bacteria cells. Instead, [13] suggested a convoluted function of size and cycle progression is required. [14] proposed a deconvoluted version by introducing division as a multistep process where the occurrence rate of these steps is a function of the size. Despite these recent attempts, there is a lack of a complete formalism describing the phenomena related to bacterial division.
Here, we propose the CK formalism as a framework for studying single-cell transient dynamics. We present how to overcome the non-locality of the division jumps and how to model the division steps. These steps being analogous to the experimental accumulation of FtsZ to trigger the division [15,16]. These equations are solved using both simulations and numerical methods. Analytical expressions are also presented in the cases where it was possible. We also present modifications to the division rates to define multiple division strategies. Then, we will show how stochasticity on division influence the size distribution dynamics and how this dynamics changes considering additional sources of noise like a distribution of initial sizes with finite variance, the noise in septum position and in cell-to-cell growth rate. We discuss how this approach could be coupled to simulations of gene expression.

Theoretical details The population balance Equation
Consider the distribution p(s; t) of sizes s at a given time t solving the Chapman-Kolmogorov equations (CKE). This distribution is related to the histograms of bacterial populations. Some attempts have described the dynamics of this distributions including effects due to the increasing of population number and the mother-daughter correlations [17][18][19]. In our framework, we consider a cell population with an asymptotically large number. After each division, only one of the descendants is tracked such as the number of bacteria in the population remains constant in similar way observed in experiments such as those using Mother Machine micro-fluids [3]. Then, ρ(s; t) can be normalized: To describe the distribution dynamics, let an individual cell grow in size s as per where g(s, t) is the size change per unit time. Some studies have considered the constant growth [17,20,21] we will focus con exponential growth rate (g(s) = µs) with µ being the growth rate and s 0 is the initial cell size. Since (2) defines a deterministic process, the change in the size distribution p(s, t|s 0 ) conditioned on initial size s 0 can be obtained by solving Expression (3) is known as the CKE in its differential version (dCKE), and its solution for this deterministic process is given by where s(t) is the solution of equation (2).
Considering division as a jump process switching, at a time t, from cell size s to size s at rates W (s|s , t), equation (3) can be written as ∂ ∂t p(s, t|s 0 ) = ∂ ∂s g(s, t)p(s, t|s 0 ) + ds W (s|s , t)p(s , t|s 0 ) − W (s |s, t)p(s, t|s 0 ) (5) If perfect symmetric splitting is considered through the condition (δ(s − 2s)) and |W (s |s, t)| = h(s, s ), the division rates W (s|s , t) can be written as: Some studies have explored the particular case where h(s) = ks with k being a constant and discarding the dependence upon s [17,22]. This, resulting on: Which is also known as the Population Balance Equation for a fixed population number [17]. Some theoretical methods like moment closure were used in past studies to solve (7) but stable solutions were not found already [17].

The Chapman-Kolmogorov equation including division events.
To overcome these instabilities, other studies reparametrize the size distribution adding an additional variable: the number of divisions n ∈ {0, 1, 2, · · · } [22]. In this case, the probability distribution is now p(s, n; t) with transition rates satisfying: This is, after division, not only the size is halved but the number of divisions n, increase by one unit. In the particular case where h(s, s ) = ks, the associated CKE is: where x 0 = (s 0 , n 0 ).
The inclusion of the new variable n breaks the non-locality of the operator W (s|s , t) that makes (5) hard to solve. Instead W (s, n|s , n − 1, t) performs jumps between independent subspaces that can be merged together later with marginal sums.
Using this new variable n, exponential growth and s(0) = s 0 equation (9) has closed solutions p(s, n; t) = δ(s − s(n, t))P n (t), (10) where P n (t) is the probability to get divided n times at time t. We will present its associated equation later. s(n, t) corresponds to the bacterial size after n divisions at a time t. To explain how to solve this size, let us consider the size s(1, t) at time t after one division. If t 1 < t is the time when division occurs, this size is: So, having the general sequence of division times 0 < t 1 < t 2 < · · · < t n−1 < t n < t, s(n, t) satisfies: where we used t 0 = 0 and the telescopic properties of the sum. Using this result (12) and ( defining the dynamics of every P n (t).

The division strategy
Focusing on the jump process between the state n − 1 to n, or simply, one division, we can modify the space n ∈ 0, 1, 2, · · · to n ∈ {0, 1} and truncate (13) to: Where a general size dependent splitting rate function h(s) can be used. This system can be integrated under the initial conditions P 0 (0) = 1 and P 1 (0) = 0. Thus, the probability P 1 (t) -to simplify notation, P(t)-that the cell divides in the time interval (0, t) evolves according to: In the particular case of h(s), proportional to the size, this is, h(s) = ks, assuming exponential growth as well, the integration on time has to be done using the implicit formula: Once P(t) is obtained, the probability density function ρ(t) for the time of division can be obtained as: A transformation of variables, allows us to get the distribution of sizes at division ρ(s d ): where, if we assume exponential growth t(s d ) = 1 µ ln sd s0 , then, dt dsd = 1 µsd . Using this ρ(s d ), one can calculate the mean size at division by integrating: Hence, we can calculate the mean added size per cell cycle ∆ = s d − s b as a function of the size at birth s b . This relationship defines the division strategy.

The multi-step Single division
In the general case, division does not correspond to a single jump process, instead, division occurs once bacteria have reached some goal steps M. If the occurrence rate of these steps is proportional to the cell size s by the constant k d , the probability P m (t) of having done m < M steps at time t can be modelled following: P M is the probability of reaching the target steps M or equivalently, the probability of a division event to occur. Once the division event happens, the process is reset to zero steps and size is halved. Using this P M (t) and the growth regime (4), if the procedure (18) is followed, the probability density ρ(s d |s b ) of size at division s d given the size at birth s b in a cell cycle, satisfies: Defining the added size before division ∆ = s d − s b , from , we observe that ∆ = s d − s b is independent on the size at birth s b and is related to the growth rate µ, the objective steps M and the step occurrence rate k d per size unit, satisfying: .

The solution of the CKE including multiple divisions
Using a similar procedure as (13) now with the additional variable m, the probability of have a size s, have done m division steps and n divisions up to time t can be written now as: Cell size s(n, t) follows, again, the equation (12). While, the probability P m,n (t) of have done m division steps and n division events up to time t, can be estimated through the master equation system: where we showed the selection rule defining the divisions as jumps between states (M, n − 1) to (0, n) and the division steps as jumps between states (m, n) to (m + 1, n). These jumps happening at rate h = k d s(n, t) with s(n, t) following the equation (12).

Numerical estimation of size dynamics
Solution of (24) can be obtained using different methods. Analytically, one can start from the initial conditions: with δ i,j , the Kronecker delta. Hence, P m,n (t) can be obtained knowing P m−1,n (t) with m ∈ 1, ..., M and P 0,n (t), can be estimated from P M,n−1 (t). Both, using, using the closed recurrence expression:

The Finite State Projection Algorithm
In general, in (24), while the number of possible steps m ∈ {0, 1, · · · , M} has finite cardinality, the number of possible divisions n ∈ {0, 1, · · · } is infinite. Thus, making impossible the complete solution of (24) using methods like Matrix exponential.
As we explained before [22], this infinite set can be projected into a finite set using the known Finite State Projection (FSP) algorithm [23]. Using this approach, the number of equations in (24) are truncated up to a maximum divisions N and the number of possible division states is now finite. Hence, known methods for solving these finite systems can be used to estimate size dynamics during infinitesimal periods of time.
From P m,n (t), the size distribution ρ(s|s 0 ) given the starting size with P n = M m=0 P m,n and δ(x), the Dirac-delta distribution.
The computing of these moments was done considered that all cells began at initial size s(0) = s 0 , this is, ρ(s 0 ) = δ(s 0 − s(0)). However, if a general density function ρ(s 0 ) is considered, the size distribution ρ(s) is a convolution of solutions of (27):

Stochastic simulation of size dynamics
Let the single step process shown in (14). While (14) was presented to modeling the division as a single step process, in general, these equations are also valid for a division step. Setting explicitly the dependence h = k d s, the system describing the single step process is now: If P 0 (0) = 1 and P 1 (0) = 0, P 1 (t), or simply P(t), has solution: while the associated density function is: The main idea behind the stochastic simulation algorithm is to generate random time events τ s distributed as (31). Following the Gillespie's method [24], we generate a random number r uniformly distributed in the interval (0, 1) and from the cumulative function (30), τ s is obtained matching P(t) and r thus, solving for t: where we take advantage of the fact that 1 − r is distributed as r . This τ s is the time to the occurrence of the next division step.

Additional details to model the cell division
The asymmetric splitting The main assumption proposing (12) is that after each division, the cell size is perfectly halved. However, this is not the case in a realistic situation. Experimentally, some stochastic fluctuations in the septum position are found. In some growth conditions, this noise can be as high as 5% [25,26]. Considering again the size at time t after one division, if the division occurred at time t 1 < t. If the size is not perfectly halved but multiplied by a random variable b 1 , centered on 0.5 and also known as division ratio, the size at time t is now: If the sequence of division ratios {b 1 , b 2 , · · · , b n } is known, the size after n division at time t is given by: Theoretically, b k can be approximated to a beta distributed variable centered on 0.5 with variance fitted from experiments.

Cell-to-cell noise in growth rate
Other important stochastic variable is the cell-to-cell growth rate [27,28]. This fluctuation can be as high as 10% [3]. We can assume that after a division k , a new growth rate µ k is chosen randomly from a distribution centered on µ and cell grows during that cycle with rate µ k . If symmetric splitting is considered again, the size at time t after n divisions is now: Numerically, we modeled these µ k as a gamma distributed variable centered on the mean growth rate µ, with the experimental variance and no correlation with past cycles. This last assumption might be incorrect and these correlation between cycles can be studied in deeper studies [26].

A general CKE incluing additional souces of noise
A general CKE, assuming exponential growth, can be modeled with growth rate distributed with given distribution ρ(µ): The division rate h(s|s ) in general depends on hidden variables like the number steps but can estimated, at least numerically [29][30][31]. This In the case of single step division, this rate can be written as h(s|s ) = ksρ(s, s ) with ρ(s, s )ds = ρ(s) and ρ(s) being the distribution of size at birth. In the therm ρ(s, s ) non-symmetric division can also considered. h(s), in the last therm, is obtained from h(s) = h(s |s)ds . We already do not have a closed solution to (36) and numerical methods could be hard to implement.

Different division strategies
Depending on the mapping s d = f (s b ), or traditionally, between the relationship between added size ∆ = s d − s b and s b , three main division strategies have been defined for exponentially growing bacteria: the timer, adder and sizer strategies [32]. Differing from the slope of ∆ vs s b for timer this slope is +1; -1 in sizer strategy and for null-valued for adder.
The adder strategy, observed for instance, in E. coli and B. subtilis [3], is considered the most common strategy in bacteria. In some bacterial populations, however, division strategies with intermediate slopes for ∆ vs. s b have also been observed [14,32].
This has led to the definition of the timer-like strategy, for slopes between 0 and 1, and of the sizer-like strategy, for slopes between −1 and 0.
These deviations from the adder can be obtained if the step rate (h) is not proportional to the size but proportional to a power (λ) of the size [14]. This is: A multi-step process similar to that described by (24) can be proposed. As was explained in past studies [14], if division is triggered by the occurrence of M steps happening at rate (37), the distribution of size at division s d given the size at birth s b is [14]: Timer strategy (Added size has a slope of -1 with size at birth) is obtained if λ → 0 and sizer strategy (Added size is proportional to the size at birth) is obtained if λ → ∞. The adder is obtained when λ = 1.
Considering the non-linear step rate given by (37) and a similar method like (31), the general stochastic time is given by:

The mean cell size at birth
The main variables defining the mean cell size are the growth rate µ, the number of division steps M and the division steps occurrence rate k d . If adder strategy is considered (λ = 1), the mean added size ∆ follows the relationship (22).
Since there is already a discussion about the nature of k d , the inference of its actual value is not straightforward. For the adder, this k d can be inferred from their mean added size using (22) and by observing that this ∆ is independent on the size at birth s b . In different division strategies (λ = 1), ∆ is now function of s b . Now, the typical size as explained in past studies [14,17], s b , is the size at birth that is perfectly doubled by the division strategy: This, since after division, the cell, with s d = 2s b , splits on a half and the size of its offspring is s b again.
In general, s b , is dependent on k d , µ and λ and is one of the variables that can be measured most easily if we assume that this s b is actually the mean size at birth in a steady growing cell population. Hence, other variables like k d can be estimated from this s b using (40) and root-finding algorithms.

Illustrating examples
To observe the dynamics of the probability of get divided n times at time t, we present, in Figure 1, time trends of some P n (t)s for different λs and Ms with initial condition P n,m (0) = δ n,0 δ m,0 . . Figure 1 allows us to find that, in the limit of t → ∞, the distribution of P n s satisfies lim i→∞ P n (t) − P n+1 (t + τ ) = 0.

A numerical analysis of the behavior in
(41) Which implies asymptotic invariance of the system under translation on, simultaneously, n → n+1 and t → t +τ over the P n . Since sbe µt 2 n also satisfies this invariance, we expect ρ(s|t, s b ) to show periodic properties in the limit t → ∞. This periodicity was already discussed in some theoretical papers [33][34][35]. This convergence is shown in Appendix A.
Three different scenarios in size dynamics can be explored using our formalism. Figure 2 a. shows the mean size dynamics obtained from a simulation of 5000 cells, all of them with the same starting size and beginning with zero division steps or equivalently, starting from their most recent division. We assumed that they have the same growth rate and get split perfectly symmetric. Simulation was done using the stochastic times (32) and numerical estimation was done solving the master equation ( In Figure 2. b., the mean dynamics corresponds to cells with an initial distribution with finite variance (C 2 v (s, t = 0) = 0.02). This distribution was assumed to be Gamma distribution since it is well defined from its mean size and the C 2 v (s). Simulations, on the other hand, were modified by using random initial sizes, numerical estimation was done by performing the convolution (28) and approaching the integral to a numerical Riemann sum. Dynamics on cell size variability are also presented in Figure 2. f. Similar oscillations were found in both the mean and the variability but with less amplitude than the first case. The dynamics of this distribution appears in supplemental video 2.
A third scenario consists on the assumption that bacteria do not split perfectly on a half but in a beta distributed independent stochastic variable centered on 0.5 and with a given variability C 2 v = 0.002. Growth rate could be considered stochastic as well with variability set to C 2 v = 0.02. The dynamics of the mean size (Fig 1.c.) and its variability (Fig 1.g.) are presented. We plotted ten cycles in the background of Figure 2.c. to show some examples of typical single-cell size dynamics. Since this noise is not considered in equation (24), the numerical approach is hard to compute and thus not presented in Figure 2. The distribution dynamics can be seen in supplementary video 3.
We also estimate the division strategy using both simulations and numerical estimations. Data from stochastic simulations can be obtained using the stochastic division times (39) and exponential growth (4) while the trends in added size and its variability can be obtained from the distribution of size at division (38) being both dependent on the exponent λ. In Figure 2 d. we present the mean added size δ as function of the mean size at birth s b for three different λ. These λ were chosen to represent three of the most important division strategies: timer-like (0 < λ < 1, where we choose λ = 0.5) with its characteristic positive slope on ∆ vs s)b, Adder (λ = 1) with no correlation between ∆ and s b and sizer-like (1 < λ < ∞, where we choose λ = 2) with a negative slope in ∆ vs s b . Fluctuations over these trends are also shown in Figure 2 with σ(t) being the standard deviation of the size at time t and x is the mean value of the random variable x How the auto-correlation dynamics changes along the time is presented on Figure 3. a. where we present four different cases: a single division step, which size dynamics have been presented already [22]. This auto-correlation decays exponentially to zero. By increasing the division steps, for instance to 10 steps, oscillations appear around a decaying trend. For a division almost deterministic, for instance 50 steps, these oscillations have higher amplitude around zero. When noise on both growth rate and septum position is considered, these oscillations are damped in the same way found in size dynamics, converging to zero. This asymptotic decorrelation let the distribution reach a stationary distribution (at t = 10τ ) with fixed moments which is presented in Figure 3. b. for M = 10 and the noises explained above.

Discussion
In this article, we present a theoretical scheme to estimate the stochastic dynamics of the cell size for a population of constant number. This approach, based on the Chapman-Kolmogorov formalism, assumed that the population number remains constant along the time. Although the framework can be extended to a exponentially growing population, simulation could be unstable since the number of cells in these populations grows exponentially on time. We think to consider this case of CKE in future studies. Additionally to the estimation of size dynamics, our framework can be also used for: Simulate most of the division strategies found in E. coli: timer-like, adder and sizer-like [10,32]. Estimate the distribution of division times, size at division and size at birth. Noise in both septal ring placing and cell-to-cell growth rate can be considered as well [26].
As is shown in Figure 2, oscillations in both, the mean size s and C 2 v (s), were found. When only stochasticity on division times is considered, these oscillations are maintained over an arbitrary long period of time having lower amplitude when an initial size distribution with finite variance is considered. Some damping occurs if other sources of noise like the cell-to-cell growth rate variability and septum position are added.
This robustness on the oscillations can be understood as result of the asymptotic periodic properties of the probability P n of have n divisions at time n. These probabilities are invariant under the simultaneous transformation n → n + 1 and t → t + τ originating the oscillations. This invariance under this transformation, is broken when other noise sources are considered.
These properties found in size dynamics can be also obtained using the classical Discrete Stochastic Maps. A clear correspondence between DSM and our model can be found when the adder strategy is considered. In this case, the stochastic map between size at birth s b and size at division is: with being an independent random variable with zero mean and a distribution fitted from experiments. If exponential growth is considered, the cell-cycle duration τ d , as random variable, can be obtained from (43): where, using (22), some analogy to (32) can be found. Using these times with parameters fitted from the data, similar oscillations in both size trends and auto-correlation can be obtained.
The main difference with DSMs is found where deviations from the adder are considered. Using the CRMs, the fluctuations ( ) in the division strategy (43) will depend on the size at birth unlike the DSM where these fluctuations are not related to any other variable. Some preliminary observations on the dependence of the fluctuations on added size with the size at birth in sizer-like division in E coli have been reported [14,36] but further observations are needed.
Including cell size stochasticity to gene expression can be an important tool to understand the origin of the fluctuations in molecule concentration. Some efforts have already been done to understand these effects in simple regulatory networks [37][38][39] but our formalism can improve this study to more complex gene regulatory architectures. Other effects such as the division strategy, the noise in growth rate and the asymmetric cell splitting can be also studied using our framework.

A The asymptotic periodicity of the system
To check the property (41), we calculate numerically the distance between the probabilities P n (t) and P n−1 (t − τ ) using the expression: the results are presented in Figure A1 where we can check how this distance decays asymptotically to zero as n increases.

B Size dynamics for different growth conditions and division rates
Past studies suggested that the periodicity under translations in τ and n is an exclusive property of the exponential growth [33].
Thus, if other growth conditions are considered, the symmetry under these translations is broken and the robust oscillations are no expected. We simulated the size dynamics of different possible growth and division rates in similar way explained in past studies [17]. Thus, we define the growth law as (2), where exponential growth is defined as g(s, t) = µs and the linear growth is given by g(s, t) = µ with µ being a constant.
On the other hand, the division rate is defined by the function h as (6). In figure A2, we compare two division rates for the linear growth, one of these are h = k and the other is h = ks with k a constant. These rates define the occurrence of a given division steps. In figure A2 we considered M = 10 steps to trigger the division.
In Figure A2.a. we present the dynamics of the mean cell size s along the time with some single trajectories in the background (gray lines) presented to compare them to the mean trend. While in Figure A2.c. we present its variability C 2 v (s) along the time for for a linear growth g = µ and constant division rate h = k . b) s vs t and d) C 2 v (s) vs t both for a linear growth g = µ and a division rate por portional to the cell size h = ks. Perfectly symmetric splitting and no-noise in growth rate is considered.