Transient Compartmentalization dynamics in the presence of mutations and noise

We extend a recently introduced framework for transient compartmentalization of replicators with selection dynamics, by including the effect of mutations and noise in such systems. In the presence of mutations, functional replicators (ribozymes) are turned into non-functional ones (parasites). We evaluate the phase diagram of a system undergoing transient compartmentalization with selection. The system can exhibit either coexistence of ribozymes and parasites, or a pure parasite phase. If the mutation rate exceeds a certain level called the error threshold, the only stable phase is the pure parasite one. Transient compartmentalization with selection can relax this error threshold with respect to a bulk quasispecies case, and even allow ribozymes to coexist with faster growing parasites. In order to analyze the role of noise, we also introduce a model for the replication of a template by an enzyme. This model admits two regimes: a diffusion-limited regime, which generates high noise, and a replication-limited regime, which generates low noise at the population level. Based on this model, we find that, since the ribozyme dynamics belongs to the replication-limited regime, the effects of noise on the phase diagram of the system are mostly negligible. Our results underline the importance of transient compartmentalization for prebiotic scenarios, and may have implications for directed evolution experiments.


Introduction
Compartments play a central role in many biological processes of cells, in particular in organelles such as the ER or the Golgi apparatus [1]. Cells use compartments to organize chemical reactions in space: compartments eliminate the risk of losing costly catalysts which are essential for biochemical reactions; they also accelerate chemical reactions, while reducing the risk of cross-talk due to side reactions.
In the early 20th century, Oparin suggested that membrane-less compartments, which he called coacervates, could have played a central role in the origin of life [2]. Many compartments in cells are bounded by membranes, but following the recent discovery of the so-called P-granules in C. elegans embryos [3], biologists noticed that membrane-less compartments abound in living organisms.
These membrane-less compartments are of interest for physicists who want to understand better the active non-equilibrium phase separation which creates them [4], but also for chemists who are trying to synthesize and control them in vitro [5,6].
After the discovery of the structure of DNA, the coacervates scenario for the origin of life got less popular, and was replaced by replication scenarios [7,8]. In the sixties, Spiegelman showed that RNA could be replicated by an enzyme called Qβ RNA replicase, in the presence of free nucleotides and salt [9]. After a series of serial transfers, he observed the appearance of shorter RNA polymers, which he called parasites. Typically, these parasites are non-functional molecules which replicate faster than the RNA polymers introduced at the beginning of the experiment. In 1971, Eigen conceptualized this observation by proving theoretically that for a given accuracy of replication and a relative fitness of parasites, there is a maximal genome length that can be maintained without errors [10]. This result led to the following paradox: to be a functional replicator, a molecule must be long enough. However, if it is long, it cannot be maintained since it will quickly be overtaken by parasites. This puzzle eventually played a central role in origin of life studies [11,12].
In the eighties, a theoretical solution emerged, the Stochastic corrector model [13] inspired by ideas of group selection [14]. The idea is to use compartmentalization and selection to maintain functional replicators (ribozymes). The compartments in that model undergo cell division, which is a sophisticated feature that strongly constrains the allowed prebiotic scenarios.
In order to address this point, and also to assess the role of transient compartmentalization using a quantitative theoretical model, we introduced a general class of multilevel selection with transient compartmentalization [15]. This class includes several scenarios for the origin of life based on various types of compartments (lipid vesicles [16], pores [17,18], inorganic compartments [19], ...) or various protocols of transient compartmentalization [20,21], as well as a recent experiment in which small droplets containing RNA in a microfluidic device [22] were used as compartments.
The related issue of cooperation between producers and non-producers has been discussed before [23]. When compartments are not well defined, their role can be played by spatial clustering, which can favor the survival of cooperating replicators [24,25]. These ideas were combined in a recent study of a population of individuals growing in a large number of compartmentalized habitats, called demes [26]. Another recent related study on transient compartmentalization quantifies co-encapsulation effects in the context of directed evolution experiments [27].
In this paper, we go beyond the analysis carried out in our previous work [15] by including the effect of mutations and noise. The motivation for including mutations comes from experiments, since mutations play a role in the RNA droplet experiment [22] which inspired us. The motivation for discussing noise in more detail comes from the realization that replication is inherently stochastic when a small number of replicators are present in compartments. Therefore, the deterministic approach used in our previous work [15] may seem like a severe approximation. In fact, it is important to appreciate that although our deterministic model neglects some sources of noise such as fluctuations in the growth rates, it includes fluctuations due to the smallness of the number of replicators present in the initial condition. This is a major source of noise because at the end of the exponential phase, the compartments no longer contain small numbers of molecules and therefore fluctuations become negligible again. A similar effect occurs when considering protein aggregation kinetics in small volumes: the fluctuations in the mass concentration of proteins at some later time are mainly due to the smallness of the number of proteins present in the initial condition [28].
In Sec. 2, we recapitulate the essence of the mutation-free model which we have introduced in [15], then in Sec. 3, we introduce an extension of that model to include deterministic mutations.
The effect of noise is then covered in Sec. 4, which contains, in particular, in Sec. 4.1 a simple model for the replication of a single template by an enzyme, and in Sec. 4.4 an analysis of the growth noise in a population of such replicating molecules. The latter model is finally used to analyze the effect of noise on the transient compartmentalization dynamics introduced in the first section.

Definition of the model
We start from a large pool of molecules, which contains functional molecules called ribozymes and non-functional ones called parasites. Let the fraction of ribozymes in this pool be $x$. These molecules then seed a large number of compartments, which is treated in the infinite limit. A given compartment will contain $n$ molecules, out of which $m$ will be ribozymes and the remaining ones parasites. It follows that $n$ is a random variable drawn from a Poisson distribution of parameter $\lambda$, while the number $m$ follows a binomial distribution $B_m(n, x)$. The resulting probability distribution for seeded compartments is then
$$P_\lambda(n, x, m) = \frac{\lambda^n e^{-\lambda}}{n!}\, B_m(n, x), \qquad B_m(n, x) = \binom{n}{m}\, x^m (1-x)^{n-m}. \qquad (1)$$
In addition to the replicating molecules, a large amount of replication enzymes $n_{Q\beta}$ and activated nucleotides $n_u$ is supplied. We assume that all compartments contain the same amount of replication enzymes and activated nucleotides.
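As a quick numerical check of this seeding step, the sketch below evaluates the joint probability of finding $n$ molecules, $m$ of them ribozymes, in one compartment, under the Poisson and binomial assumptions stated above. The function name `seeding_probability`, the parameter values and the truncation `n_max` are illustrative choices, not taken from the original work.

```python
from math import comb, exp, factorial

def seeding_probability(n, m, lam, x):
    """Joint probability of seeding n molecules, m of which are ribozymes."""
    if m < 0 or m > n:
        return 0.0
    poisson = exp(-lam) * lam**n / factorial(n)          # Poisson weight for n
    binomial = comb(n, m) * x**m * (1.0 - x)**(n - m)    # B_m(n, x)
    return poisson * binomial

lam, x = 2.0, 0.3   # illustrative values (assumption)
n_max = 60          # truncation of the Poisson sum (assumption)

pairs = [(n, m) for n in range(n_max + 1) for m in range(n + 1)]
total = sum(seeding_probability(n, m, lam, x) for n, m in pairs)
mean_ribozymes = sum(m * seeding_probability(n, m, lam, x) for n, m in pairs)
print(total, mean_ribozymes)
```

The two printed numbers should be close to 1 (normalization) and to $\lambda x$ (mean number of ribozymes per compartment), which is a convenient sanity check before using the distribution in a full selection round.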
After seeding, the numbers of ribozymes $m$ and parasites $y$ grow exponentially, which in a deterministic model leads to
$$\bar m = m\, e^{\alpha T}, \qquad (2)$$
$$\bar y = y\, e^{\gamma T}, \qquad (3)$$
with $\alpha$ and $\gamma$ the growth rates of ribozymes and parasites, $T$ the time at the end of the exponential growth phase, $\bar m$ the number of ribozymes and $\bar y$ the number of parasites at time $T$. At the end of this growth phase, we have $n_{Q\beta} \approx N = \bar m + \bar y$, at which point further growth will be limited by the number of replication enzymes. As a result, after time $T$, the growth will be linear instead of exponential, but in any case the system composition, defined here by the relative fraction of ribozymes, will not change. For this reason, we focus on the final composition at time $T$, which is controlled by the ratio $\Lambda = e^{(\gamma-\alpha)T}$. Here, we do not describe precisely the crossover between the exponential and the linear regime, which could be done using the notion of carrying capacity [29]. In that case, the growth would be described by logistic equations and the carrying capacity would be comparable to $n_{Q\beta}$. Note also that the exact time $T$ may depend on $m, n$, but since in practice $n_{Q\beta} \gg m, n$, this dependence has a small effect on the results of the model, as we have checked in the Suppl. Mat. of Ref. [15].
In any case, the ribozyme fraction at the end of the growth phase can be well approximated as
$$\bar x(n, m) = \frac{\bar m}{N} = \frac{m}{n\Lambda - (\Lambda - 1)\, m}.$$
If parasites grow faster, we have $\gamma > \alpha$, and thus $\Lambda > 1$, which is the regime considered in Ref. [15].
In Sec 3, we also consider regimes in which γ < α.
We now implement selection at the compartment level. In practice, selection could be autonomous or non-autonomous. For instance, in the experiment of Ref. [22], the selection was non-autonomous: a measurement of the synthesis of a dye molecule by photodetection was used to promote or reject compartments according to the outcome of that measurement. Selection can in general be described by a selection function $f(\bar x) \geq 0$. In our work, we have assumed that the selection function only depends on the final composition $\bar x$ of the compartment.
For the ribozyme-parasite scenario, a natural choice for $f$ is a monotonically increasing function of $\bar x$. As an example, we will use the sigmoidal function
$$f(\bar x) = \frac{1}{2}\left[1 + \tanh\left(\frac{\bar x - x_{th}}{x_w}\right)\right],$$
where $x_{th}$ and $x_w$ are dimensionless parameters, which describe respectively a threshold in the composition and the steepness of the function.
The compartments which have passed the selection step are then pooled together, forming a new pool of molecules from which future compartments can be seeded. The ribozyme fraction $x'$ of this new ensemble is the average of $\bar x$ among the selected compartments,
$$x' = \frac{\sum_{n,m} \bar x(n, m)\, f(\bar x(n, m))\, P_\lambda(n, x, m)}{\sum_{n,m} f(\bar x(n, m))\, P_\lambda(n, x, m)}.$$
The transient compartmentalization cycle is then repeated, starting with the seeding of new compartments from that pool of composition $x'$.
Upon repetition of this protocol, the pool composition typically converges to a fixed point $x^*$, which is a solution of
$$x^* = x'(x^*).$$
The stability of the fixed point $x^*$ changes when
$$\left.\frac{dx'}{dx}\right|_{x = x^*} = 1.$$
It is implicitly assumed that $x'(x)$ is a sufficiently smooth function of $x$ for this derivative to be defined.
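The full cycle (seeding, growth, selection, pooling) can be sketched numerically as follows. This is a minimal illustration under stated assumptions, not the procedure used in the original work: the selection function is taken to be a tanh sigmoid with illustrative parameters `x_th = 0.25` and `x_w = 0.1`, empty compartments ($n = 0$) are skipped because they contribute no molecules, and the Poisson sum is truncated at a hypothetical `n_max`.

```python
from math import comb, exp, factorial, tanh

def final_fraction(n, m, Lam):
    """Ribozyme fraction after growth: m e^{aT} / (m e^{aT} + (n - m) e^{gT})."""
    return m / (m + Lam * (n - m))

def selection(xbar, x_th=0.25, x_w=0.1):
    """Sigmoidal selection function (assumed form and parameter values)."""
    return 0.5 * (1.0 + tanh((xbar - x_th) / x_w))

def next_fraction(x, lam, Lam, n_max=80):
    """One round of compartmentalization: returns x' given pool composition x."""
    num = den = 0.0
    for n in range(1, n_max + 1):                 # n = 0 skipped (assumption)
        pois = exp(-lam) * lam**n / factorial(n)
        for m in range(n + 1):
            weight = pois * comb(n, m) * x**m * (1.0 - x)**(n - m)
            xbar = final_fraction(n, m, Lam)
            f = selection(xbar)
            num += weight * xbar * f
            den += weight * f
    return num / den

def iterate(x0, lam, Lam, rounds=200):
    """Repeat the protocol and return the pool composition after `rounds` cycles."""
    x = x0
    for _ in range(rounds):
        x = next_fraction(x, lam, Lam)
    return x

print(iterate(0.5, lam=1.0, Lam=5.0), iterate(0.5, lam=10.0, Lam=5.0))
```

Scanning `lam` and `Lam` with `iterate` and recording where the long-time composition vanishes is one simple way to map out the phases; a fixed number of rounds is used here for brevity and a convergence test would be more careful.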

Main dynamical regimes
Although finding a fixed point $x^*$ is generally difficult, our ribozyme-parasite model contains [...] of completed cycles of compartmentalization. The model correctly reproduces that this fraction quickly goes to zero as a function of the round number in bulk, less quickly with compartmentalization and no selection, and even less quickly in the case of compartmentalization with selection. In the latter case, a finite fraction can be maintained for an infinite number of rounds provided $\lambda$ is sufficiently small, corresponding to the coexistence region of the phase diagram.
In order to compare precisely the predictions of the model to the experiments of Ref. [22], it is important to know the value of key parameters such as $\Lambda$. Table 1 reports the experimental parameters measured in Ref. [22] for the ribozyme and three different parasites: the nucleotide length, the doubling time ($T_d$), and the relative replication rate ($r$), from which we infer $\Lambda$ in the final column. The doubling time $T_d$ of the ribozyme is related to the growth rate $\alpha$ by $T_d = \ln(2)/\alpha$, and similarly the doubling time of a parasite is $T_d = \ln(2)/\gamma$.
In the experiment, a typical compartment contains $\lambda$ RNA molecules that can be ribozymes or parasites, $2.6 \times 10^6$ molecules of Qβ replicase, and $1.0 \times 10^{10}$ molecules of each NTP. Replication takes place by complexation of RNA with Qβ replicase, which uses NTPs to make a complementary copy.
This copy is then itself replicated to reproduce the original. There is a large amount of nucleotides, so that exponential growth of the target RNA proceeds until $N \approx n_{Q\beta}$. This large quantity of enzymes also means that in practice, the noise due to fluctuations in the number of enzymes should be very small. Starting from a single molecule, it takes $n_D = \log_2 n_{Q\beta} = 21.4$ doubling times to reach this regime. In a parasite-ribozyme mixture, we can estimate $\Lambda$ using the relative replication rate $r$:

A modified model with deterministic mutations
In the deterministic model, we assume that a fraction $\mu$ of replicated ribozyme strands mutate into parasites. Thus, the equations describing the evolution of $m$ and $y$ in the growth phase assume the form
$$\dot m = (\alpha - \mu)\, m, \qquad \dot y = \gamma\, y + \mu\, m,$$
which yields for the first equation
$$\bar m = m\, e^{(\alpha - \mu) T}, \qquad (12)$$
where $\bar m$ is again the number of ribozymes at the end of the growth phase and $m$ the value at the initial time. Now substituting Eq. (12) into the equation for $y$, one finds
$$\bar y = y\, e^{\gamma T} + \frac{\mu\, m}{\alpha - \mu - \gamma}\left(e^{(\alpha - \mu) T} - e^{\gamma T}\right).$$
The ratio between the number of daughters of one parasite molecule and the number of daughters of a ribozyme molecule is now renormalized by the rate $\mu$: $\tilde\Lambda = e^{(\gamma + \mu - \alpha) T} = e^{\mu T} \Lambda$, where $\Lambda$ is the relative growth of parasites introduced previously in the mutation-free model.
The fraction of ribozymes at the end of the exponential phase is now given by
$$\bar x(n, m) = \frac{m}{n \tilde\Lambda - (1 + \delta)(\tilde\Lambda - 1)\, m},$$
where $\delta = \mu/(\alpha - \mu - \gamma)$. We call $\delta$ the mutation ratio, which is a dimensionless measure of mutation versus relative growth (competition). When $\delta \to 0$, we recover the mutation-free model; when $|\delta| \gg 0$, mutations become dominant.
Selected compartments are then pooled together, and the new average fraction of ribozymes becomes $x'(x, \lambda, \delta, \tilde\Lambda)$. Note that for nonzero mutation rate ($\mu > 0$), $x = 1$ ceases to be a fixed point in this deterministic approach, since parasites will always appear at sufficiently long times.
Therefore, the pure ribozyme (R) phase is no longer present in the phase diagram of fig. 2.
The fixed point $x = 0$, however, is still present. If this fixed point is stable, we have a pure parasite phase. If it is unstable, there is stable coexistence at a fixed composition. If more fixed points appear, multiple stable compositions are in principle possible.
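The growth step with mutations can be checked numerically. The sketch below assumes the linear growth equations $\dot m = (\alpha - \mu) m$ and $\dot y = \gamma y + \mu m$, compares their closed-form solution with a plain fourth-order Runge-Kutta integration, and verifies that the resulting composition matches the compact expression in terms of $\tilde\Lambda$ and $\delta$. All function names and parameter values are illustrative assumptions.

```python
from math import exp

def closed_form(m0, y0, alpha, gamma, mu, T):
    """Closed-form solution of dm/dt = (alpha - mu) m, dy/dt = gamma y + mu m."""
    delta = mu / (alpha - mu - gamma)   # mutation ratio; singular if alpha - mu == gamma
    m_T = m0 * exp((alpha - mu) * T)
    y_T = y0 * exp(gamma * T) + delta * m0 * (exp((alpha - mu) * T) - exp(gamma * T))
    return m_T, y_T

def integrate_rk4(m0, y0, alpha, gamma, mu, T, steps=20000):
    """Independent check: fixed-step RK4 integration of the same equations."""
    def rhs(m, y):
        return (alpha - mu) * m, gamma * y + mu * m
    h = T / steps
    m, y = float(m0), float(y0)
    for _ in range(steps):
        k1 = rhs(m, y)
        k2 = rhs(m + 0.5 * h * k1[0], y + 0.5 * h * k1[1])
        k3 = rhs(m + 0.5 * h * k2[0], y + 0.5 * h * k2[1])
        k4 = rhs(m + h * k3[0], y + h * k3[1])
        m += h / 6.0 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        y += h / 6.0 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return m, y

def final_fraction(n, m, lam_tilde, delta):
    """Composition after growth, written with the renormalized ratio and delta."""
    return m / (n * lam_tilde - (1.0 + delta) * (lam_tilde - 1.0) * m)

alpha, gamma, mu, T = 1.0, 1.2, 0.1, 5.0    # illustrative rates and duration
m0, y0 = 2, 3                               # illustrative initial composition
m_T, y_T = closed_form(m0, y0, alpha, gamma, mu, T)
m_num, y_num = integrate_rk4(m0, y0, alpha, gamma, mu, T)
x_closed = m_T / (m_T + y_T)
x_formula = final_fraction(m0 + y0, m0,
                           exp((gamma + mu - alpha) * T),
                           mu / (alpha - mu - gamma))
print(x_closed, x_formula, m_num / (m_num + y_num))
```

The three printed values should agree to high precision, which confirms that the compact composition formula is just a rewriting of the solution of the two growth equations.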

The prolific parasites regime ($\tilde\Lambda \geq 1$)
Prolific parasites have a better bulk reproductive success than ribozymes when $\tilde\Lambda \geq 1$, which is equivalent to $\alpha \leq \mu + \gamma$ and $\delta < 0$. In a mutation-free model, this would necessarily imply a faster growth of parasites ($\alpha < \gamma$), but in the present case, we could also allow for slower parasites as compared to ribozymes (i.e. $\alpha > \gamma$), provided parasites are aided by a sufficiently high mutation rate $\mu$.
The phase diagram is evaluated by testing the stability of the fixed point $x = 0$. We find an asymptote behaving like $1/\lambda$ for large $\lambda$, and plateaus for small $\lambda$. The ends of these plateaus are located, in the limit $\delta \to 0$, at the position of the vertical line separating the ribozyme and bistable phases in the original phase diagram.
Let us first derive the right asymptote in the $\lambda \gg 1$ limit. In this limit, we evaluate $x'$ by considering compartments of size $\lambda$. The fixed-point stability condition $dx'/dx|_{x=0} = 1$ then gives the equation of the asymptote.
Since we consider monotonically increasing selection functions, $f'(0) > 0$. For $\lambda \gg -\delta$, we find the same expression as the one found in the mutation-free phase diagram [15]. This explains why there is a single asymptote as $\mu$ is varied in the $\lambda \gg 1$ limit.
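The plateaus extend to very low values of $\lambda$. We can find their location by considering only compartments of size $n = 1$. In that case, the final compositions can be $\bar x(1, 0) = 0$ or
$$\bar x(1, 1) = \frac{1}{1 + \delta(1 - \tilde\Lambda)}. \qquad (19)$$
We then have for the composition recursion
$$x' = \frac{x\, \bar x(1,1)\, f(\bar x(1,1))}{(1 - x)\, f(0) + x\, f(\bar x(1,1))}.$$
Evaluating the derivative of $x'(x)$, we find
$$\left.\frac{dx'}{dx}\right|_{x=0} = \frac{\bar x(1,1)\, f(\bar x(1,1))}{f(0)}.$$
Substituting (19), we find that the location of plateaus obeys the implicit equation
$$f\!\left(\frac{1}{1 + \delta(1 - \tilde\Lambda)}\right) = \left[1 + \delta(1 - \tilde\Lambda)\right] f(0). \qquad (22)$$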
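The stability test of $x = 0$ can also be carried out numerically for any $\lambda$. The sketch below is a hedged illustration: it assumes the composition after growth is $\bar x(n, m) = m/[n\tilde\Lambda - (1+\delta)(\tilde\Lambda - 1)m]$, uses a tanh selection function with illustrative parameters, and discards empty compartments. Near $x = 0$ only compartments seeded with exactly one ribozyme contribute to the numerator, so the slope $dx'/dx|_{x=0}$ reduces to a single Poisson sum. The names `slope_at_zero` and `n_max` are hypothetical.

```python
from math import exp, factorial, tanh

def selection(xbar, x_th=0.25, x_w=0.1):
    """Sigmoidal selection function (assumed form and parameter values)."""
    return 0.5 * (1.0 + tanh((xbar - x_th) / x_w))

def final_fraction(n, m, lam_tilde, delta):
    """Composition after growth with mutations."""
    return m / (n * lam_tilde - (1.0 + delta) * (lam_tilde - 1.0) * m)

def slope_at_zero(lam, lam_tilde, delta, n_max=100):
    """dx'/dx at x = 0; a value above 1 means the parasite fixed point is unstable."""
    num = den = 0.0
    for n in range(1, n_max + 1):               # empty compartments skipped (assumption)
        pois = exp(-lam) * lam**n / factorial(n)
        xbar1 = final_fraction(n, 1, lam_tilde, delta)
        num += pois * n * xbar1 * selection(xbar1)   # derivative of the m = 1 weight is n
        den += pois * selection(0.0)                 # at x = 0 every compartment has m = 0
    return num / den

lam_tilde, delta = 2.0, -0.5                    # illustrative point with delta < 0
for lam in (0.1, 1.0, 10.0, 50.0):
    s = slope_at_zero(lam, lam_tilde, delta)
    print(lam, s, "coexistence" if s > 1.0 else "parasite phase")
```

In the limit of small `lam` the slope tends to $\bar x(1,1) f(\bar x(1,1))/f(0)$, which is the quantity that sets the plateau; locating the value of `lam` at which the slope crosses 1 traces the phase boundary for a given $(\tilde\Lambda, \delta)$.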

The prolific ribozymes regime ($\tilde\Lambda \leq 1$)
We now consider the opposite case where parasites are less prolific than ribozymes. This means $\alpha \geq \mu + \gamma$ and is equivalent to $\tilde\Lambda \leq 1$, $\delta > 0$. This implies that $\alpha > \gamma$ (less aggressive parasites) and is reminiscent of a quasispecies scenario in which a fit ribozyme successfully outcompetes its parasites in bulk [10]. Since this can already happen in the absence of selection, we consider here the case where there is no selection, i.e. $f(\bar x) = 1$.
To analyze this regime we again assess the stability of the fixed point $x = 0$. We locate numerically the separatrix as shown in Fig 3. We obtain separatrices that for $\tilde\Lambda \to 0$ tend to a fixed value of $\lambda$.
Let us start by observing that when $\tilde\Lambda \to 0$, there are only two final compartment compositions for nonempty compartments: $\bar x(n, 0) = 0$ or $\bar x(n, m) = 1/(1 + \delta)$ for $m > 0$. We can now distinguish between three initial compartment compositions: (i) only parasites, (ii) no parasites, no ribozymes, and (iii) containing at least one ribozyme. Their associated seeding probabilities are:
$$P_{(i)} = e^{-\lambda x} - e^{-\lambda}, \qquad P_{(ii)} = e^{-\lambda}, \qquad P_{(iii)} = 1 - e^{-\lambda x}.$$
In that case, we can write the composition recursion equation as
$$x' = \frac{1}{1 + \delta}\, \frac{P_{(iii)}}{P_{(i)} + P_{(iii)}} = \frac{1}{1 + \delta}\, \frac{1 - e^{-\lambda x}}{1 - e^{-\lambda}}.$$
The condition $dx'/dx|_{x=0} = 1$ yields the expression for the asymptote
$$\frac{\lambda}{1 - e^{-\lambda}} = 1 + \delta. \qquad (25)$$
For $\lambda \gg 1$, we obtain using (25)
$$\delta \simeq \lambda - 1,$$
which agrees very well with Fig 3. Notice that here the coexistence phase is located to the right of the asymptotes, and the parasite phase to the left, whereas in Fig 2 it is the other way around. An intuitive way to understand this is to consider the limit $\lambda \to 0$. In this limit, nonempty compartments start with either a parasite or a ribozyme. The former will grow to a fully parasitic compartment, whereas the latter will contain ribozymes plus some parasites acquired by mutations. Therefore, at low $\lambda$, the ribozyme's capacity to outgrow parasites (competition) cannot be exploited, leading to ribozyme extinction. It is only when ribozymes and parasites are seeded together that the differential growth rate becomes important, which becomes increasingly likely for higher $\lambda$. The phase boundaries in
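The no-selection limit lends itself to a compact numerical illustration. The sketch below rests on assumptions that should be read as such: compartments with at least one ribozyme end at composition $1/(1+\delta)$, parasite-only compartments end at 0, empty compartments are discarded, and seeding is Poissonian, so that the probability of at least one ribozyme is $1 - e^{-\lambda x}$. Under these assumptions the boundary solves $\lambda/(1 - e^{-\lambda}) = 1 + \delta$, which is found here by bisection. The function names are hypothetical.

```python
from math import expm1

def next_fraction_no_selection(x, lam, delta):
    """x' when f = 1 and the renormalized growth ratio tends to zero (assumed limit)."""
    p_ribozyme = -expm1(-lam * x)     # P(at least one ribozyme) = 1 - exp(-lam x)
    p_nonempty = -expm1(-lam)         # P(nonempty compartment)  = 1 - exp(-lam)
    return p_ribozyme / ((1.0 + delta) * p_nonempty)

def critical_lambda(delta, lo=1e-9, hi=1e3, iterations=200):
    """Bisection for lam / (1 - exp(-lam)) = 1 + delta, valid for 0 < delta < hi - 1."""
    def g(lam):
        return lam / (-expm1(-lam)) - (1.0 + delta)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def iterate(x0, lam, delta, rounds=200):
    """Repeat the recursion to approach the long-time pool composition."""
    x = x0
    for _ in range(rounds):
        x = next_fraction_no_selection(x, lam, delta)
    return x

delta = 9.0                                     # illustrative mutation ratio
print(critical_lambda(delta))                   # boundary, close to 1 + delta
print(iterate(0.5, 2.0, delta), iterate(0.5, 20.0, delta))
```

With these illustrative numbers the pool loses its ribozymes when `lam` lies below the boundary and settles at a finite fraction above it, in line with the coexistence phase lying to the right of the asymptote.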

Error catastrophe
An error catastrophe corresponds to a situation where the accumulation of replication errors eventually causes the disappearance of ribozymes. Since there are only a parasite (P) and a coexistence phase (C) in the model with mutations, the error catastrophe means that the coexistence region shrinks to the benefit of the parasite phase as the mutation rate increases. One sees this effect in Fig. 2, which corresponds to the prolific parasites regime ($\tilde\Lambda \geq 1$) discussed above. In this figure, we see a larger coexistence region in the small $\lambda$ region, because there the compartmentalization is efficient at purging parasites. As the mutation rate increases, however, this region shrinks because the compartmentalization fails to purge the more numerous parasites.
In Fig. 4, a particular example is provided where $\alpha$ and $\gamma$ are fixed, such that $\tilde\Lambda$ is fixed, and $\mu$ is varied. Since competition is fixed, we have $\mu \propto \delta$. The resulting steady-state value $x = x^*$ then decreases monotonically with $\mu$, and reaches $x = 0$ when crossing the phase boundary. The error catastrophe was also studied in the absence of selection, and was shown to occur in the prolific ribozymes regime ($\tilde\Lambda \leq 1$). In Fig. 6, an example of this case is shown, and there too, we see that the steady-state value of the ribozyme fraction $x^*$ decreases as $\mu$ is increased, until it reaches the phase boundary in Fig 7. In contrast to Fig. 4, where the error threshold decreases as the size of compartments increases, the trend is just the opposite in Fig. 6, which is expected since the roles of ribozymes and parasites are exchanged here as compared to the prolific parasites regime.
In the prolific parasites regime, $\tilde\Lambda \geq 1$, with selection, it is interesting to recast the error threshold as a constraint on the length of a polymer to be copied accurately, as done in the original formulation of the error threshold [10]. Let us introduce the error rate per nucleotide, $\epsilon$. Then, for a sequence of length $L$, we have $\alpha - \mu = \alpha(1 - \epsilon)^L$. Since $\epsilon \ll 1$, it follows from this that $\mu \simeq \alpha \epsilon L$.
When $\alpha \simeq \gamma$, we have $\ln \tilde\Lambda = \alpha \epsilon L T$. Using Eq. (22), we find that the condition to copy the polymer accurately is
$$L < \frac{\ln \tilde s}{\epsilon\, \alpha T},$$
where $\tilde s = f(\bar x)/f(0)$ and $\alpha T/\ln 2$ is the number of generations. This criterion has a form similar to the original error threshold [10], namely
$$L < \frac{\ln s}{\epsilon},$$
where $s = \alpha/\gamma$ represents the selective superiority of the ribozyme. In our model, the equivalent of $s$ is $\tilde s$, which characterizes the compartment selection.
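To get a feeling for the orders of magnitude involved, the following sketch evaluates the classical bulk bound $L < \ln(s)/\epsilon$ next to a compartment-based analogue of the form $L < \ln(\tilde s)/(\epsilon\, \alpha T)$, with $\alpha T$ expressed through the number of generations. The second formula is an assumed reading of the criterion discussed above, and all numerical values are purely illustrative.

```python
from math import log

def max_length_bulk(s, eps):
    """Classical error threshold: longest sequence maintained for superiority s."""
    return log(s) / eps

def max_length_compartments(s_tilde, eps, generations):
    """Assumed compartment analogue, with alpha*T = generations * ln 2."""
    return log(s_tilde) / (eps * generations * log(2))

eps = 1e-3            # illustrative error rate per nucleotide
print(max_length_bulk(s=2.0, eps=eps))
print(max_length_compartments(s_tilde=100.0, eps=eps, generations=20))
```

The comparison makes explicit that in the compartment version the tolerated length depends on the selection strength and on how many generations take place within one round, rather than on the growth-rate advantage of the ribozyme.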

Noise in growth
For deterministic growth, given by Eqs. [...] This model includes noise due to the stochastic binding of the replicating enzyme to template molecules and the noise due to the stochasticity of monomer addition once the enzyme is bound to a template. Importantly, this model assumes that the replicase, once bound, always stays active until completion of the copy of the template; the possibility that the replicase falls off the template before completion of the copy is therefore neglected. Similarly, any effects associated with the interaction of multiple replicases on the same template are neglected. In fact, when the replicase falls off its template, the copying process is aborted and the shorter chain which has been produced in this way becomes a parasite. We can therefore describe such a process as a mutation using the framework of the previous section. To separate the effects due to mutations and noise clearly, we disregard from now on the possibility of mutations, and we focus in the following on the description of the noise associated with replication. Such a noise can stabilize the ribozyme phase at the expense of coexistence, and the coexistence phase at the expense of the parasite phase. The noise of replication becomes very small when the rate-limiting step is nucleotide incorporation, in which case one can use a deterministic approach.

A minimal model for the replication process
The replication of an RNA strand A by a replicase can be considered to proceed through two stages. In the first stage, an RNA molecule A binds to a polymerase E to form a complex $X_0$ with the rate $\kappa_C$.
Subsequently, activated nucleotides X are incorporated in a stepwise fashion to the complementary strand. A complex of E and A with a complementary strand of length $n$ will be denoted by $X_n$, and the strand grows until the final length $L$ is achieved, such that [...] where for simplicity we have assumed the same rate $k$ for both reactions. Let us denote by $t$ the total time to yield 2A from A, which is the sum of the time associated with the step of complex formation, $t_C$, and with the step of $L$ nucleotide incorporations, $t_L$. We thus have
$$t = t_C + t_L,$$
with $t_L = \sum_{i=1}^{L} t_i$ and $t_i$ the time for adding one monomer, which we assume is distributed according to
$$f(t_i) = \kappa\, e^{-\kappa t_i}.$$
For simplicity, we choose a single value $\kappa$ for all monomer additions. The time for the formation of the complex, $t_C$, is similarly distributed according to
$$f(t_C) = \kappa_C\, e^{-\kappa_C t_C},$$
where $\kappa_C = 1/\langle t_C \rangle$.
Let us denote the moment generating function of $t_C$ by $M_C(s)$ and similarly for $t_L$ by $M_L(s)$, with:
$$M_C(s) = \langle e^{-s t_C} \rangle = \frac{\kappa_C}{\kappa_C + s}, \qquad M_L(s) = \langle e^{-s t_L} \rangle = \left(\frac{\kappa}{\kappa + s}\right)^{L}. \qquad (36)$$
From $M_L$ one obtains the distribution of replication time $f(t_L)$ by performing an inverse Laplace transform:
$$f(t_L) = \mathcal{L}^{-1}\left[M_L(s)\right] = \frac{\kappa^L\, t_L^{L-1}}{(L-1)!}\, e^{-\kappa t_L}, \qquad (37)$$
where $\mathcal{L}^{-1}$ represents the inverse Laplace transform. This equation shows that the replication time distribution of one strand of length $L$ follows a Gamma distribution [30]. For $L = 1$, Eq. (37) becomes a simple exponential distribution, which is a memoryless distribution. For $L > 1$, this distribution has memory and the growth in the number of RNA strands can no longer be described as a simple Markov process. Note that the Gamma distribution is peaked around the mean value of $t_L$, namely $L/\kappa$, for $L \gg 1$. In this limit, the replication time has very small fluctuations. This feature has recently been exploited to construct a single-molecule clock, in which the dissociation of a molecular complex occurs after a well-controlled replication time [31]. The cumulant-generating function of the total time $t$, defined as $K(s) = \ln M(s)$ with $M(s) = M_C(s) M_L(s)$, yields the two moments of the distribution of $t$, namely the mean $\langle t \rangle$ and the variance $\sigma_t^2$. We have
$$\langle t \rangle = \frac{1}{\kappa_C} + \frac{L}{\kappa}, \qquad \sigma_t^2 = \frac{1}{\kappa_C^2} + \frac{L}{\kappa^2}.$$

Coefficient of variation of the replication time
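Thus the coefficient of variation of the replication time, namely $\sigma_t/\langle t \rangle$, is given by
$$\frac{\sigma_t}{\langle t \rangle} = \frac{\sqrt{1/\kappa_C^2 + L/\kappa^2}}{1/\kappa_C + L/\kappa}.$$
When complex formation is rate-limiting ($1/\kappa_C \gg L/\kappa$, the diffusion-limited regime), this ratio approaches 1; when nucleotide incorporation is rate-limiting ($L/\kappa \gg 1/\kappa_C$, the replication-limited regime), it approaches $1/\sqrt{L}$.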
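The two regimes can be illustrated by sampling replication times directly. The sketch below assumes that the total time is the sum of one exponential binding time (rate $\kappa_C$) and $L$ independent exponential incorporation times (rate $\kappa$), so that the incorporation part is Gamma distributed. The function names and the parameter values are illustrative.

```python
import random
import statistics
from math import sqrt

def cv_theory(L, kappa, kappa_c):
    """Coefficient of variation of t = t_C + t_L under the assumptions above."""
    mean = 1.0 / kappa_c + L / kappa
    variance = 1.0 / kappa_c**2 + L / kappa**2
    return sqrt(variance) / mean

def sample_replication_time(L, kappa, kappa_c, rng):
    """One replication time: exponential binding plus Gamma(L, 1/kappa) copying."""
    return rng.expovariate(kappa_c) + rng.gammavariate(L, 1.0 / kappa)

def cv_sampled(L, kappa, kappa_c, samples=20000, seed=0):
    rng = random.Random(seed)
    times = [sample_replication_time(L, kappa, kappa_c, rng) for _ in range(samples)]
    return statistics.pstdev(times) / statistics.mean(times)

L = 100
print("replication-limited:", cv_theory(L, kappa=1.0, kappa_c=1e6), 1 / sqrt(L))
print("diffusion-limited:  ", cv_theory(L, kappa=1e6, kappa_c=1.0))
print("intermediate, sampled vs theory:",
      cv_sampled(L, 1.0, 10.0), cv_theory(L, 1.0, 10.0))
```

The first line should be close to $1/\sqrt{L}$ and the second close to 1, while the sampled value in the last line is only expected to match the formula up to statistical error.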

The generations representation
Let us look at these two growth regimes in a generations representation, where by generation we mean a copying event of the template by the replicase. The diffusion-limited regime corresponds to Fig. 9, while the replication-limited regime corresponds to Fig. 10. In this representation, the differences between the two growth regimes become very clear. In the replication-limited regime, generations remain synchronized until enough noise has accumulated over multiple generations.
For two independent strains, generations become desynchronized after about $\sqrt{L}$ generations. In contrast, in the diffusion-limited regime, fluctuations are very large due to the lack of synchronicity in the growth.
These figures have been obtained by simulating the growth of a replicating mixture starting from a single strand. The simulation follows $k$ RNA-enzyme complexes, and for each complex $i$ the variable $n_i$ measures the length of the growing complementary strand. For every nucleotide incorporation event, a strand $i$ is chosen with probability $1/k$, after which its number of nucleotides is updated from $n_i$ to $n_i + 1$. When $n_i + 1 = L$, we set $n_i = 0$, we update $k$ to $k + 1$, and then we introduce an extra strand variable $n_{k+1}$ for the new strand. Both the replication-limited regime and the diffusion-limited regime can be modeled using this simulation. In the latter case, we need to choose $L = 1$, which corresponds to exponentially distributed replication times.
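A minimal version of this simulation could look as follows. The strand-selection and update rules follow the description above; the treatment of time is an added assumption (each of the $k$ complexes incorporates nucleotides at rate `kappa`, so the waiting time to the next event is exponential with rate `k * kappa`), and the function name and return values are hypothetical.

```python
import random

def simulate_growth(L, n_final, kappa=1.0, seed=None):
    """Grow a population of RNA-enzyme complexes from one strand to n_final strands.

    Returns the elapsed time, the number of incorporation events, and the list
    of current complementary-strand lengths n_i.
    """
    rng = random.Random(seed)
    lengths = [0]                 # n_i for each complex; start from a single strand
    t = 0.0
    events = 0
    while len(lengths) < n_final:
        k = len(lengths)
        t += rng.expovariate(k * kappa)   # assumption: Gillespie-type waiting time
        i = rng.randrange(k)              # each strand is chosen with probability 1/k
        lengths[i] += 1                   # n_i -> n_i + 1
        events += 1
        if lengths[i] == L:               # copy completed
            lengths[i] = 0                # template restarts from length 0
            lengths.append(0)             # new strand variable n_{k+1}
    return t, events, lengths

# Replication-limited (large L) versus diffusion-limited (L = 1) growth
for L in (1, 50):
    times = [simulate_growth(L, 200, seed=s)[0] for s in range(20)]
    mean_t = sum(times) / len(times)
    spread = (sum((x - mean_t) ** 2 for x in times) / len(times)) ** 0.5
    print(L, mean_t, spread / mean_t)
```

The relative spread of the time needed to reach a given population size gives a rough impression of how much noisier the $L = 1$ case is; recording the birth time of every strand instead would allow the generations representation to be reconstructed.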

Population-level noise
In Sec. 4.2, we have analyzed the noise associated with the replication of a single strand.
Ultimately, we wish to quantify the compositional variation of the final population. In order to do so, we turn to the theory of branching processes with variable lifetimes taken randomly from a fixed distribution [32]. As explained in Appendix A, this framework describes theoretically a population that grows exponentially starting from a single individual. In our molecular system, this single individual plays the role of the single molecule present in the initial condition before the replication starts, while the distribution of the lifetimes is the replication time distribution $f(t_L)$ obtained in Eq. (37).
For $t_L \simeq L/\kappa$, we find that the average population (starting from a single individual) $\mu^{(1)}$ scales as $\mu^{(1)}(t) = \mu^* e^{\alpha t}$, with a growth rate $\alpha \simeq \kappa \ln(2)/L$. The coefficient of variation of the population size is $\sigma^{(1)}/\mu^{(1)} \approx 1/\sqrt{L}$. The renewal theory on which these results are based can be generalized to the case that there are $n$ individuals in the initial condition, as shown in Appendix B. The full solution is rather complicated due to correlations between the subpopulations generated by the different molecules present in the initial condition. In the following, we neglect these correlations: the $n$ initial molecules therefore generate $n$ independent subpopulations, which all start at size 1 and follow the branching process described above and in Appendix A. In that case, each subpopulation has a mean $\mu^{(1)} = \mu^{(n)}/n$ and a standard deviation $\sigma^{(1)} \approx \mu^{(1)}/\sqrt{L}$. This then allows us to write
$$\sigma^{(n)} = \sqrt{n}\, \sigma^{(1)} \approx \frac{\mu^{(n)}}{\sqrt{nL}}. \qquad (42)$$
We show in Fig. 11 that the corresponding coefficient of variation, $\sigma^{(n)}/\mu^{(n)}$, agrees well with simulations of the branching process. The 2000 simulation runs were stopped after a time $t^*$ such that $N(t^*) \simeq 5000$.
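A branching process of this kind can be simulated with a priority queue of division times. The sketch below is an illustration under assumptions, not the code behind Fig. 11: each individual lives for a Gamma-distributed time with shape `L` and scale `1/kappa`, then is replaced by two individuals with fresh lifetimes, and the `n0` founders are treated as independent. The function names and the run parameters are hypothetical and much smaller than the 2000 runs quoted above.

```python
import heapq
import random
import statistics

def population_at(t_stop, L, kappa=1.0, n0=1, rng=None):
    """Population size at time t_stop for an age-dependent binary branching process."""
    rng = rng or random.Random()
    division_times = [rng.gammavariate(L, 1.0 / kappa) for _ in range(n0)]
    heapq.heapify(division_times)
    while division_times[0] <= t_stop:
        t = heapq.heappop(division_times)          # next individual to divide
        for _ in range(2):                         # two daughters, fresh lifetimes
            heapq.heappush(division_times, t + rng.gammavariate(L, 1.0 / kappa))
    return len(division_times)

def coefficient_of_variation(t_stop, L, n0, runs, seed=0):
    """Empirical sigma/mu of the population size over independent runs."""
    rng = random.Random(seed)
    sizes = [population_at(t_stop, L, n0=n0, rng=rng) for _ in range(runs)]
    return statistics.pstdev(sizes) / statistics.mean(sizes)

L = 25
for n0 in (1, 4):
    cv = coefficient_of_variation(t_stop=8.5 * L, L=L, n0=n0, runs=200)
    print(n0, cv, (n0 * L) ** -0.5)
```

With a few hundred runs the empirical coefficient of variation should be of the same order as $1/\sqrt{nL}$, though the agreement is only approximate at finite times and finite sample size.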

Fluctuations in logistic growth
The problem of two species competing for the same resources has been studied in the literature and offers a complementary perspective on the role of noise in a growing population, which has been studied in the previous section. Let us consider two such species, which typically start with a few individuals and then grow according to logistic growth. As shown in Ref. [29], when the carrying capacity is reached, the number of each species is subject to giant fluctuations (the coefficient of variation is of the order of unity) when the two species have similar growth rates. In the terminology introduced in the previous section, this model applies to the diffusion-limited regime ($L \to 1$), where a Markov description of the population dynamics is applicable.
Keeping the notations of the first section, we denote by $n$ the initial number of molecules, which splits into $m$ ribozymes and $y$ parasites, and by $N$ the final number of molecules in the compartment. In the neutral case ($\alpha = \gamma$), the moments of the number of ribozymes $\bar m$ are found to be [29]: [...] In Ref. [29], it was shown that [...] In general, the dynamics of the composition has a large variability for: (i) small compartments formed in small volumes [28].

Noise in transient compartmentalization
[...] Since the ribozyme fraction $\bar x$ at the end of the exponential phase is given by $\bar x(n, m) = \bar m/N$ and $N \simeq n_{Q\beta}$, the noise on $\bar x(n, m)$ takes the following form: [...] which is consistent with Eq. (45), which was found using a different formalism [29].
Using the parameters of Table 1 and Eq. (41), we can quantify the level of noise in the number of ribozymes or parasites. We find from this table that the ribozyme size was $L = 362$, and that the experiment should be in the replication-limited regime because the diffusion time scale should be more than $2 \times 10^4$ times smaller than replication times, which are of the order of 10 s. The noise in composition should be maximal when we start with one ribozyme and one parasite of equal length, and with $\alpha = \gamma$, which on average gives $\bar x = 1/2$. Consequently, the noise in composition is at most $\sigma_{\bar x} \approx 0.02$.
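One way to recover this order of magnitude is a simple error-propagation estimate. The sketch below assumes that the ribozyme and parasite subpopulations fluctuate independently, each with a coefficient of variation $1/\sqrt{mL}$ and $1/\sqrt{yL}$ respectively, and that the mean composition is $m/n$ (neutral case). This propagation formula is an assumption made for illustration and is not claimed to be the exact expression used in the text.

```python
from math import sqrt

def composition_noise(m, y, L):
    """First-order estimate of the standard deviation of the ribozyme fraction.

    Assumes independent subpopulations with relative noise 1/sqrt(m*L) and
    1/sqrt(y*L), and equal growth rates so that the mean fraction is m/(m+y).
    """
    x = m / (m + y)
    relative_variance = 1.0 / (m * L) + 1.0 / (y * L)
    return x * (1.0 - x) * sqrt(relative_variance)

sigma = composition_noise(m=1, y=1, L=362)   # one ribozyme, one parasite, L from the text
print(sigma)
```

For one ribozyme and one parasite of length 362 this gives a value slightly below 0.02, compatible with the bound quoted above, and the estimate decreases further when more molecules are seeded.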

A weak noise approach
The growth equations given by Eqs. (2) and (3) are deterministic in nature, which means that a given initial condition $(n, m)$ yields a unique final composition $\bar x(n, m)$. In contrast, in a stochastic approach, a given $n$ and $m$ lead to many different trajectories, which means that $\bar x(n, m)$ is a random variable with a probability distribution $p(\bar x(n, m))$. Consequently, the ribozyme fraction after one round is
$$x' = \frac{\sum_{n,m} P_\lambda(n, x, m) \int d\bar x\; p(\bar x(n, m))\, \bar x\, f(\bar x)}{\sum_{n,m} P_\lambda(n, x, m) \int d\bar x\; p(\bar x(n, m))\, f(\bar x)}. \qquad (50)$$
This expression is computationally demanding to evaluate for $\lambda \gg 1$, but it can be simplified significantly in the weak noise limit.
In order to construct a phase diagram in this limit, we simplify Eq. (50) by considering [...] Given that the amplitude of this type of noise should rapidly diminish for larger $L$, and that $L \sim O(100)$ in the experiment, we expect our ribozyme-parasite scenario to be well described by deterministic dynamics. We also see that the noise stabilizes the pure ribozyme phase (R) with respect to the coexistence phase (C) because in the presence of noise, the R region has grown at the expense of the C region. Similarly, the noise stabilizes the coexistence region (C) against the parasite region (P).

Conclusion
In this paper, we have carried out two important extensions of our previous work on transient compartmentalization [15], by including the effect of mutations and noise in such systems. This new study confirms one result of our previous work, namely that transient compartmentalization alone can stabilize functional replicators in the absence of a division of compartments of the kind considered in the Stochastic corrector model [13]. We can now add to that, that this property is robust with respect to mutations and noise, an important aspect for Origin of life studies.
In the presence of mutations, we have found that the phase diagram of long-time composition of this system only contains the parasite and the coexistence phases. The case where ribozymes grow faster than the parasites can be analyzed in terms of a modified error threshold, which interestingly now depends on the dynamics of compartmentalization and selection.
In order to analyze the role of noise in this system, we have introduced a simple model for the replication of a template by an enzyme. In the replication limited regime of that model, which should correspond to our experimental conditions, a low noise should be present at the population level, which we have quantified using tools borrowed from the theory of branching processes. In the end, we have studied the modified phase diagram of our model in this weak noise limit.
Of course, the two effects that we have studied here separately, namely mutations and noise, could be present simultaneously. We also cannot exclude that a more detailed modeling of the molecular replication or a different form of compartmentalization dynamics could lead to features not captured by the present treatment. Nevertheless, we think that the present framework represents a basis on which further studies could be built. We hope that our work may not only contribute to studies on the Origin of life but also to future developments on related important experimental techniques such as digital quantitative PCR [33] or Directed Evolution [34].
We would like to thank Y. Rondelez for many important and insightful suggestions. We acknowledge stimulating discussions with B. Houchmandzadeh.

Appendix A. Population-level noise generated by a single individual in the initial condition
Let us consider an age-dependent renewal process, in which the probability density of branching at age $t$ is given by $f(t)$, and upon branching, the probability of having $k$ offspring is given by $\varphi_k$ (assumed to be age-independent for simplicity). We would like to evaluate the behavior of the number $N(t)$ of individuals at time $t$. Let us define the function $h(s)$ by
$$h(s) = \sum_k \varphi_k\, s^k.$$
We also define the generating function for the process $N(t)$ by
$$G(s, t) = \sum_k p_k(t)\, s^k,$$
where $p_k(t)$ is the probability that $N(t) = k$. We assume that $p_k(0) = \delta_{k1} = p_k^0$, i.e., that we start from a single object. We can then evaluate $p_k(t)$ by adding the probability that no branching has occurred between 0 and $t$, which is given by $1 - F(t)$ with $F(t) = \int_0^t f(t')\, dt'$, [...] where $m = h'(1) = \sum_k k\, \varphi_k$ is the average number of daughters upon branching.
To solve this equation in the limit $t \to \infty$, let us multiply both sides by $e^{-\alpha t}$ and take the limit. Since $\lim_{t\to\infty} F(t) = 1$, we obtain $\mu^* = \lim$ [...] We can use this framework to also evaluate higher moments of the population size, and from that obtain the coefficient of variation of the population size, which characterizes the amplitude of the noise. Let us denote the second derivative of the generating function with respect to $s$ by $\zeta$ [...] At large times, $\zeta(t) \approx \zeta^* e^{2\alpha t}$. The variance of the population size $\sigma^2$ follows from the standard relation:
$$\sigma^2 = \zeta + \mu - \mu^2. \qquad (A.14)$$
For the specific case we are considering, we find [...] After extracting the leading contribution in the large $L$ limit, we find:
$$\frac{\sigma}{\mu} \simeq \frac{\sqrt{2}\, \ln 2}{\sqrt{L}}, \qquad (A.16)$$
which is numerically close to $1/\sqrt{L}$ since $\sqrt{2} \ln 2 = 0.980... \approx 1$.

Appendix B. Population-level noise generated from n individuals in the initial condition
If we start from $n$ individuals rather than just one, we can write the probability to have $k$ individuals at time $t$, $p$ [...] which expresses the average with $n$ initial strands in terms of the average with one initial strand.

Together with Eq. (B.3), this leads to
$$\frac{\sigma^{(n)}}{\mu^{(n)}} = \frac{1}{\sqrt{n}}\, \frac{\sigma^{(1)}}{\mu^{(1)}},$$
which is the coefficient of variation found previously for a single individual in the initial condition, divided by $\sqrt{n}$ as expected for the growth from independent individuals. This confirms the scaling found in Eq. (42).