## Abstract

Stem cell heterogeneity is essential for homeostasis in tissue development. This paper establishes a general formulation for understanding the dynamics of stem cell regeneration with cell heterogeneity and random transitions of epigenetic states. The model generalizes the classical G0 cell cycle model and incorporates the epigenetic states of stem cells, which are represented by a continuous multidimensional variable, and the kinetic rates of cell behaviors, including proliferation, differentiation, and apoptosis, which depend on these epigenetic states. Moreover, the random transition of epigenetic states is represented by an inheritance probability that can be described as a conditional beta distribution. This model can be extended to investigate gene mutation-induced tumor development. The proposed formulation helps us to understand various dynamic processes of stem cell regeneration, including tissue development, degeneration, and abnormal growth.

## 1 Introduction

Stem cell regeneration is an essential biological process in most self-renewing tissues during development and the maintenance of tissue homeostasis. Stem cells multiply by cell division, during which DNA is replicated and assigned to the two daughter cells along with the inheritance of epigenetic information and the partition of molecules. Unlike the accurate process of DNA replication, the inheritance of epigenetic information is often subject to random perturbations; for example, the reconstruction of histone modifications and of DNA methylation are intrinsically random processes of writing and erasing the modified markers [71, 91]. The stochastic inheritance of epigenetic changes during cell division can lead to stem cell heterogeneity, which is important for the dynamic equilibrium of various phenotypic cells during tissue development. The accumulation of undesirable epigenetic changes may promote or cause diseases [17, 18, 36, 43, 59, 60, 70, 72, 81, 92].

The heterogeneity of stem cells has been highlighted in recent years due to new technologies with single-cell resolution, which have led to the discovery of new cell types and changes in the understanding of differentiation landscapes [6, 9, 29, 50, 51, 69]. In early embryonic development, heterogeneous expression and histone modifications are correlated with cell fate and the dynamic equilibrium of pluripotent stem cells [33, 34, 68, 84]. Chromatin modifications in the human primary hematopoietic stem cell/progenitor cell (HSC/HPC) stage can lead to the dynamic equilibrium of heterogeneous and interconvertible HSCs [10, 89], as well as gene expression changes during differentiation [11]. Moreover, applications of single-cell RNA sequencing have revealed the continuous spectrum of differentiation in zebrafish [54], mice [65], and human HSCs [85]. These findings have challenged the demarcation between stem cells and progenitor cells and have led to the evolving understanding of the complex hematopoietic differentiation landscape [47, 66].

Heterogeneity plays an important role in the development of drug resistance. Cancer development is driven by evolutionary selection on somatic genetic alterations and epigenetic alterations, which result in multistage tumorigenesis and heterogeneous cancer cell phenotypes [22, 23, 39, 51, 53, 64, 90]. Tumors of different subtypes often differ in treatment response and patient survival [13, 23, 73], and treatment stress can also induce cancer cell plasticity and drug resistance [24, 48, 57, 80, 82]. Cell plasticity is often associated with epigenetic modifications, and targeting epigenetic regulators, such as the polycomb group protein EZH2, has been an attractive strategy in cancer treatment [20, 77, 83]. To better understand the progress of tumorigenesis and drug resistance, we need to develop predictive models of the evolutionary dynamics of cancer [3, 31].

Despite the central role of stem cell regeneration in tissue development, a quantitative investigation of the process is well beyond the ability of current technologies. Furthermore, in many fields of biological science, mathematical modeling tools have aided in improving the understanding of the principles of related processes [2, 45, 61, 62]. In 2007, Weinberg posed the following question [86]: can algebraic formulae tell us more than reasoning about the behavior of complex biological systems? Various computational models have been established in studies of tissue development and cancer systems biology under different circumstances [3, 4, 16, 25, 26, 27, 87, 88]. Nevertheless, a unified formulation that bypasses detailed assumptions is required to provide more basic logic of the biological behaviors of these complex systems. In this study, based on the general process of the cell cycle and heterogeneous stem cell regeneration, we established a general mathematical framework to formulate the dynamics of heterogeneous stem cell regeneration. The model framework includes essential cellular behaviors, including proliferation, apoptosis and differentiation/senescence; however, it bypasses the biological details of signaling pathways. The heterogeneity of stem cells and epigenetic inheritance during the cell cycle are key points in model development. Various formulas can be applied to different processes, such as embryonic development, tissue disease and degeneration, and tumor development.

The aim of this paper is to introduce a new general formula for the dynamics of stem cell regeneration with an emphasis on the effects of cell heterogeneity; therefore, a discussion of concrete conclusions based on the formula is not included. The simulation results below are included to demonstrate potential applications of the model and are not related to any actual biological processes.

## 2 Results

### 2.1 The G0 cell cycle model for homogeneous stem cell regeneration

A classical model that is used to describe the dynamics of stem cell regeneration is the G0 cell cycle model proposed in the 1970s [7, 55]. In this model, homogeneous stem cells are classified into resting (G0) or proliferating (G1, S, and G2 phases and mitosis) phases (Figure 1A). During each cell cycle, a cell in the proliferating phase either undergoes apoptosis or divides into two daughter cells, whereas a cell in the resting phase either irreversibly differentiates into a terminally differentiated cell or returns to the proliferating phase. This can be modeled by an age-structured model for cell numbers in the resting phase and proliferating phase. Integrating the age-structured model through the characteristic line method provides the following delay differential equation (Material and methods):

$$
\frac{dQ}{dt} = -\left(\beta(Q) + \kappa\right) Q + 2 e^{-\mu\tau} \beta(Q_{\tau}) Q_{\tau}. \tag{1}
$$

Here, *β*(*Q*) is the proliferation rate, *µ* is the apoptosis rate of cells in the proliferating phase, *τ* is the duration of the proliferating phase, and *κ* is the rate at which cells are removed from the resting phase, which includes terminal differentiation, cell death, and senescence (hereafter, we call *κ* the differentiation rate for simplicity). The subscript indicates a time delay, *i.e.*, *Q*_{τ} indicates *Q*(*t − τ*). The proliferation rate *β*(*Q*) describes how cells regulate the self-renewal of stem cells through secreted cytokines and is often given by a decreasing function with *β*_{∞} < *β*(*Q*) < *β*_{0} (Material and methods). For normal individuals, we usually have *β*_{∞} = lim_{Q→+∞} *β*(*Q*) = 0 because of the inhibition of the cell cycle pathway.

The G0 cell cycle model and its extensions are widely used to investigate hematopoietic stem cell dynamics [1, 19, 49, 56]; dysregulation of the apoptosis rate or differentiation rate of hematopoietic stem cells can result in serious periodic hematopoietic diseases [12]. Moreover, from (1), the stem cell dynamics are mainly determined by pathways related to stem cell proliferation, apoptosis, differentiation, senescence, and growth. Major oncogenic signaling pathways obtained from an integrated analysis of genetic alterations in The Cancer Genome Atlas (TCGA) [74] show direct connections to the coefficients *β*(*Q*), *µ*, and *κ* in (1) (Figure 1B) (Material and methods). Equation (1) is capable of describing the population dynamics of stem cell regeneration. Nevertheless, the model does not include cell heterogeneity, which has been highlighted in recent years as important for understanding cancer development and drug resistance in cancer therapy.
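The homogeneous G0 dynamics can be explored numerically with a few lines of code. The following is a minimal sketch, not the authors' MATLAB implementation: it assumes the delay equation has the standard form dQ/dt = −(β(Q) + κ)Q + 2e^{−µτ}β(Q_τ)Q_τ, uses a Hill-type decreasing proliferation rate, integrates with a plain forward-Euler step, and takes a constant history on [−τ, 0]. The function names and all parameter values are illustrative assumptions.

```python
import math


def hill_beta(q, beta0=1.0, theta=1.0, n=1):
    """Decreasing Hill-type proliferation rate (assumed form, illustrative values)."""
    return beta0 * theta ** n / (theta ** n + q ** n)


def simulate_g0(q0=0.1, kappa=0.1, mu=0.05, tau=1.0, t_end=200.0, dt=0.01,
                beta=hill_beta):
    """Forward-Euler integration of
    dQ/dt = -(beta(Q) + kappa) Q + 2 exp(-mu tau) beta(Q_tau) Q_tau
    with the constant history Q(t) = q0 for t <= 0."""
    lag = int(round(tau / dt))            # delay expressed in time steps
    steps = int(round(t_end / dt))
    history = [q0] * (lag + 1)            # samples of Q on [-tau, 0]
    survival = 2.0 * math.exp(-mu * tau)  # two daughters, each surviving apoptosis
    for _ in range(steps):
        q = history[-1]
        q_tau = history[-1 - lag]
        dq = -(beta(q) + kappa) * q + survival * beta(q_tau) * q_tau
        history.append(q + dt * dq)
    return history[lag:]                  # trajectory from t = 0 to t = t_end


if __name__ == "__main__":
    trajectory = simulate_g0()
    print("Q at the end of the simulation:", trajectory[-1])
```

With these illustrative parameters the trajectory settles at the positive steady state defined by β(Q*) = κ/(2e^{−µτ} − 1); other parameter choices can produce oscillations, for which a dedicated delay-equation solver would be more appropriate than the Euler step used here.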

### 2.2 The general framework of heterogeneous stem cell regeneration

To extend the abovementioned G0 cell cycle model to include cell heterogeneity, we introduce a quantity **x** (scalar or vector) for the epigenetic state of a cell and denote *Q*(*t*, **x**) as the cell number at time *t* with state **x** (Figure 2A). In general, **x** can refer to the expression levels of marker genes, histone modifications in nucleosomes, or DNA methylations associated with DNA segments and can be measured by single-cell sequencing techniques. Specifically, we often refer to **x** as quantities that affect signaling pathways that control cell cycle progression, apoptosis, and cell growth, so that the coefficients *β, µ*, and *κ* and the duration of the proliferating phase *τ* in (1) are cell specific and dependent on the state **x** of the cell. Moreover, cells in the niche can interfere with stem cell self-renewal through released cytokines. Let *ξ*(**x**) denote the effective cytokine signal produced by a cell with state **x**, and let *c*(*t*) = ∫*Q*(*t*, **x**)*ξ*(**x**)*d***x** denote the total concentration of effective cytokines that regulate cell proliferation. The proliferation rate in (1) then becomes *β*(*c*, **x**).

When cell-to-cell variability is considered, the inheritance of epigenetic states during cell division is essential in shaping the distribution of cell heterogeneity. Many biological processes, such as the random partition of molecules [38], the random inheritance of nucleosome modifications [14, 71], and DNA methylation [36, 91], can affect the inheritance of epigenetic states from mother to daughter cells after cell division. Many efforts have been made to model epigenetic cell memory [14, 30, 37, 38, 79]; however, it remains challenging to develop precise models for this process, which is not yet fully understood. Nevertheless, if we set aside the biological details and focus on the changes of epigenetic states, we can introduce the inheritance probability *p*(**x, y**), which represents the probability that a daughter cell of state **x** comes from a mother cell of state **y** after cell division. Therefore, ∫*p*(**x, y**)*d***x** = 1 for any **y**. Based on the abovementioned assumptions and an argument similar to that of the homogeneous model, the dynamical equation for *Q*(*t*, **x**) is as follows, where *c*_{τ(**y**)} denotes *c*(*t* − *τ*(**y**)) (Material and methods):

$$
\frac{\partial Q(t,\mathbf{x})}{\partial t} = -Q(t,\mathbf{x})\left(\beta(c,\mathbf{x}) + \kappa(\mathbf{x})\right) + 2\int \beta(c_{\tau(\mathbf{y})},\mathbf{y})\, Q(t-\tau(\mathbf{y}),\mathbf{y})\, e^{-\mu(\mathbf{y})\tau(\mathbf{y})}\, p(\mathbf{x},\mathbf{y})\, d\mathbf{y}. \tag{2}
$$
Here, the integrals are taken over all possible epigenetic states. Moreover, if we consider discrete states, such as gene mutations, we can extend the integrals to include the summation over all discrete states. Equation (2) extends the previous G0 cell cycle model and provides a general framework for heterogeneous stem cell regeneration.

Equation (2) is an autonomous system in which the rate functions and inheritance probability are not explicitly dependent on the time t. Nevertheless, while we apply the equation to situations with time dependence, such as embryo development, environmental changes, injury, and external stimuli, the time-dependent rate functions and the inheritance probability can be included in a straightforward manner. An example is shown in Figure 5 below.

Based on equation (2), once we introduce appropriate definitions for the dependence of kinetic rates on epigenetic states and for the inheritance function, we are able to model various processes of stem cell regeneration, such as tissue growth, degeneration, and abnormal growth (Figures 2B-G). For simplicity, in all simulations shown in this paper, we consider one epigenetic state *x* (0 ≤ *x* ≤ 1) that affects only cell proliferation and differentiation in a manner similar to the stemness so that a larger value of *x* indicates higher stemness (Material and methods). Figures 2B-C show the dynamics of tissue growth starting from a small population of cells with high levels of stemness toward a steady state. There is a temporary transition at the early stage characterized by a rapid increase in the cell number and a subpopulation of cells with low-level stemness (Figures 2B-C, red arrows). Figures 2D-E show the dynamics of degeneration with alterations to the inheritance function, and Figures 2F-G show the abnormal growth due to a decreasing differentiation rate and an alteration to the inheritance function. Both processes include a short-term stage of biphenotypic cell populations with both high and low stemness cells (Figures 2E and G, red curve). Moreover, the simulations show that the steady state heterogeneity can be restored from cell subpopulation fractions (Figure 3), which is in agreement with experiments that were previously explained by transcriptome-wide noise [35, 52, 89].
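To make the structure of such a simulation concrete, the sketch below discretizes a single epigenetic state 0 < *x* < 1 into bins and advances the binned cell counts with a forward-Euler step. It is an illustration under stated assumptions rather than the simulation used for the figures: the delay *τ* and apoptosis rate *µ* are taken as constants, *ξ*(*x*) = 1 so that the cytokine signal equals the total cell number, the inheritance probability is a beta density with mean `phi(y)` and concentration `eta`, and the rate functions `beta`, `kappa`, and `phi` as well as every numerical value are hypothetical choices.

```python
import math


def beta_pdf(x, a, b):
    """Beta(a, b) density at 0 < x < 1, evaluated through log-gamma for stability."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1.0) * math.log(x) + (b - 1.0) * math.log(1.0 - x))


def transition_matrix(grid, phi, eta):
    """T[i][j]: probability that a daughter falls in bin i given a mother in bin j."""
    m = len(grid)
    T = [[0.0] * m for _ in range(m)]
    for j, y in enumerate(grid):
        a, b = eta * phi(y), eta * (1.0 - phi(y))
        column = [beta_pdf(x, a, b) for x in grid]
        total = sum(column)               # renormalise so each column sums to 1
        for i in range(m):
            T[i][j] = column[i] / total
    return T


def simulate(m=20, t_end=100.0, dt=0.05, tau=1.0, mu=0.05, theta=100.0, eta=20.0):
    grid = [(i + 0.5) / m for i in range(m)]            # bin midpoints in (0, 1)
    phi = lambda y: 0.1 + 0.8 * y                       # hypothetical mean daughter state
    beta = lambda c, x: x / (1.0 + c / theta)           # hypothetical proliferation rate
    kappa = lambda x: 0.1 + 0.2 * (1.0 - x)             # hypothetical differentiation rate
    T = transition_matrix(grid, phi, eta)
    lag = int(round(tau / dt))
    start = [0.0] * m
    start[-1] = 10.0                                    # a few cells with high stemness
    history = [start] * (lag + 1)                       # constant history on [-tau, 0]
    survival = 2.0 * math.exp(-mu * tau)
    for _ in range(int(round(t_end / dt))):
        now, past = history[-1], history[0]             # states at t and t - tau
        c_now, c_past = sum(now), sum(past)
        source = [survival * beta(c_past, y) * past[j] for j, y in enumerate(grid)]
        nxt = []
        for i, x in enumerate(grid):
            birth = sum(T[i][j] * source[j] for j in range(m))
            nxt.append(now[i] + dt * (birth - (beta(c_now, x) + kappa(x)) * now[i]))
        history.append(nxt)
        history.pop(0)
    return grid, history[-1], T


if __name__ == "__main__":
    grid, counts, _ = simulate()
    total = sum(counts)
    print("total cell number:", total)
    print("fraction per bin:", [round(v / total, 3) for v in counts])
```

Working with per-bin counts rather than point densities keeps the bookkeeping simple: each column of the transition matrix sums to one, so the birth term redistributes daughters across bins without creating or losing cells.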

Equation (2) describes the evolution of the cell numbers with various epigenetic states; however, the total cell number *Q*(*t*) = ∫*Q*(*t*, **x**)*d***x** and the density of cells with different epigenetic states *f* (*t*, **x**) = *Q*(*t*, **x**)*/Q*(*t*) are relevant to the data.

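From equation (2), it is easy to obtain the equations for *Q*(*t*) and *f*(*t*, **x**) as follows (see Material and methods):

$$
\frac{dQ}{dt} = -Q(t)\int f(t,\mathbf{x})\left(\beta(c,\mathbf{x}) + \kappa(\mathbf{x})\right) d\mathbf{x} + 2\int \beta(c_{\tau(\mathbf{x})},\mathbf{x})\, Q(t-\tau(\mathbf{x}),\mathbf{x})\, e^{-\mu(\mathbf{x})\tau(\mathbf{x})}\, d\mathbf{x} \tag{3}
$$

and

$$
\frac{\partial f(t,\mathbf{x})}{\partial t} = -f(t,\mathbf{x})\int f(t,\mathbf{y})\left[\left(\beta(c,\mathbf{x}) + \kappa(\mathbf{x})\right) - \left(\beta(c,\mathbf{y}) + \kappa(\mathbf{y})\right)\right] d\mathbf{y} + \frac{2}{Q(t)}\int \beta(c_{\tau(\mathbf{y})},\mathbf{y})\, Q(t-\tau(\mathbf{y}),\mathbf{y})\, e^{-\mu(\mathbf{y})\tau(\mathbf{y})}\left(p(\mathbf{x},\mathbf{y}) - f(t,\mathbf{x})\right) d\mathbf{y}. \tag{4}
$$

Here, *Q*(*t* − *τ*(**y**), **y**) = *Q*(*t* − *τ*(**y**)) *f*(*t* − *τ*(**y**), **y**), and *c*(*t*) = ∫*Q*(*t*, **x**)*ξ*(**x**)*d***x** is the total concentration of effective cytokines, with *c*_{τ(**y**)} = *c*(*t* − *τ*(**y**)). Equations (3) and (4) provide the evolution dynamics of relative cell numbers that can be obtained from experiments by single-cell sequencing or flow cytometry. We note that when *ξ*(**x**) ≡ 1, we have *c*(*t*) = *Q*(*t*), and hence, (3)-(4) provide a closed system of equations for *Q*(*t*) and *f*(*t*, **x**).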

Here, the state variable **x** represents the epigenetic state, and *p*(**x, y**) represents the inheritance function; hence, (2) describes the dynamics with epigenetic state transitions. Nevertheless, this equation can also describe the changes in genetic alterations if we consider **x** as the genetic state and *p*(**x, y**) as the probability of point mutations. This is often the situation of genomic instability associated with cancer development [8, 32, 78, 93], and hence, the model can be used to study genetic heterogeneity in cancer development. In this paper, we focus on the equation with epigenetic state transitions and assume that **x** always represents the epigenetic state.

### 2.3 Stochastic epigenetic state inheritance in the cell cycle

In Equations (2)-(4), the mathematical form of the epigenetic state-dependent coefficients should be expressed based on how the epigenetic states (or genes) affect the relevant biological processes. However, the inheritance function *p*(**x, y**) cannot be determined from the biological process of cell division. Here, we derive a phenomenological inheritance function to represent the stochastic inheritance of the epigenetic states. More specifically, let **x** = (*x*_{1}, *x*_{2}, …, *x*_{n}) represent the expression levels of *n* marker genes; we derive the inheritance function *p*_{i}(*x*_{i}, **y**) for each gene and set *p*(**x, y**) = ∏_{i=1}^{n} *p*_{i}(*x*_{i}, **y**).

We assume that the epigenetic state of a daughter cell is a random number with a distribution that depends on the state of the mother cell. In previous studies based on stochastic simulations of gene expression coupled with nucleosome modifications over multiple cell cycles [37, 40], we found that the nucleosome modification level of daughter cells (normalized to the interval [0, 1]), conditional on the nucleosome modification level of the mother cell, can be well described by a beta-distributed random number. Therefore, we generalized these findings and defined the inheritance function *p*_{i}(*x*_{i}, **y**) through the beta distribution density function as follows:

$$
p_i(x_i,\mathbf{y}) = \frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)}\, x_i^{a-1}(1-x_i)^{b-1}, \tag{5}
$$

where Γ(*z*) is the gamma function and *a* = *a*_{i}(**y**) and *b* = *b*_{i}(**y**) are shape parameters depending on the state of the mother cell. We assumed that the mean and variance of *x*_{i}, conditional on **y**, are as follows:

$$
\mathrm{E}(x_i \mid \mathbf{y}) = \phi_i(\mathbf{y}), \qquad \mathrm{Var}(x_i \mid \mathbf{y}) = \frac{1}{1+\eta_i(\mathbf{y})}\,\phi_i(\mathbf{y})\left(1-\phi_i(\mathbf{y})\right),
$$

and the shape parameters are (Material and methods)

$$
a_i(\mathbf{y}) = \eta_i(\mathbf{y})\,\phi_i(\mathbf{y}), \qquad b_i(\mathbf{y}) = \eta_i(\mathbf{y})\left(1-\phi_i(\mathbf{y})\right). \tag{6}
$$

Here, we note that *ϕ*_{i}(**y**) and *η*_{i}(**y**) always satisfy

$$
0 < \phi_i(\mathbf{y}) < 1, \qquad \eta_i(\mathbf{y}) > 0. \tag{7}
$$

Hence, the inheritance function can be determined through the predefined functions *ϕ*_{i}(**y**) and *η*_{i}(**y**), often obtained through data-driven modeling or assumptions, that satisfy (7).
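As a small illustration of how the inheritance function can be used in practice, the sketch below converts a conditional mean *ϕ* and a concentration parameter *η* into beta shape parameters and draws a daughter state. It assumes the parameterization *a* = *ηϕ*, *b* = *η*(1 − *ϕ*), which gives mean *ϕ* and variance *ϕ*(1 − *ϕ*)/(1 + *η*); the function names are illustrative, and sampling relies on `random.betavariate` from the Python standard library.

```python
import random


def shape_parameters(phi, eta):
    """Beta shape parameters (a, b) for conditional mean phi and concentration eta."""
    if not 0.0 < phi < 1.0 or eta <= 0.0:
        raise ValueError("require 0 < phi < 1 and eta > 0")
    return eta * phi, eta * (1.0 - phi)


def beta_mean_variance(a, b):
    """Mean and variance of a Beta(a, b) random variable."""
    mean = a / (a + b)
    variance = a * b / ((a + b) ** 2 * (a + b + 1.0))
    return mean, variance


def sample_daughter_state(phi, eta, rng=random):
    """Draw one daughter state in [0, 1] from the conditional beta distribution."""
    a, b = shape_parameters(phi, eta)
    return rng.betavariate(a, b)


if __name__ == "__main__":
    a, b = shape_parameters(phi=0.3, eta=20.0)
    print("shape parameters:", a, b)
    print("mean and variance:", beta_mean_variance(a, b))
    print("one sampled daughter state:", sample_daughter_state(0.3, 20.0))
```

A larger *η* concentrates daughter states around *ϕ*, so *η* acts as an inverse measure of inheritance noise, while the dependence of *ϕ* on the mother state encodes the systematic drift.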

### 2.4 Modeling tumor development with cell-to-cell variance

As shown in Figures 2F-G, to mimic the process of abnormal cell growth, we varied the differentiation rate and the inheritance probability. These variations in the model parameters can be a consequence of changes in the microenvironment that may affect all stem cells in the niche. Nevertheless, to model tumor development driven by gene mutations in individual cells, we need to modify the model equations to include the mutants.

To show the framework for modeling tumor development induced by driver gene mutations, we consider the process with two types of mutations that increase the proliferation rate and decrease the differentiation rate (Figure 4A). Hence, let *Q*_{i}(*t*, **x**) (*i* = 0, 1, 2, 3) represent the cell counts of the wild-type (*i* = 0) and the three mutant subpopulations (*i* = 1, 2, 3), and let *p*_{i,j} represent the mutation rates. For simplicity, we assume that gene mutations occur during cell division and that the two daughter cells have the same mutant type. Therefore, equation (2) can be extended as follows:
and
Here, we consider only the driver mutation types, and at most one mutation occurs in each cell cycle, so that only the mutation rates *p*_{0,1}, *p*_{0,2}, *p*_{1,3}, and *p*_{2,3} are nonzero; all other mutation rates *p*_{i,j} are zero (Figure 4A).

Figures 4B-C show the simulated dynamics. Single mutant cells occur prior to the obvious increase in the cell number, and the mutant cells eventually acquire double mutations and dominate the cell population (Figure 4B). Moreover, our simulation suggests that stemness increases during the evolutionary process when we limit the mutations to those affecting proliferation and differentiation (Figure 4C). Here, we consider only two types of mutations that often occur in the precancerous stage [15, 27]. To simulate a more complicated process of cancer development, we must extend the simulation to include more mutations, such as those affecting apoptosis, DNA damage repair, and immune response pathways.

### 2.5 Modeling tissue growth with cell lineage dynamics

In the abovementioned models, we considered only cells capable of self-renewal, e.g., stem cells and progenitor cells. Nevertheless, to model tissue growth, we must include terminally differentiated cells that have lost the ability to progress through the cell cycle. Therefore, let *Q*(*t*, **x**) represent the number of cells with self-renewal ability as previously mentioned, and let *P*(*t*, **x**) represent the number of terminally differentiated cells. The terminally differentiated cells are produced from the stem and progenitor cells at the rate *κ*(**x**) and cleared at the rate *v*(**x**). Hence, equation (2) can be reformulated as follows:
In the simulations shown in Figure 2 by considering the epigenetic state 0 ≤ *x* ≤ 1 as a stemness index and by distinguishing the stem cells from progenitor cells with the boundary *x* = *x*_{0} (Figure S1), the numbers of stem cells, progenitor cells, and terminally differentiated cells can be determined as follows:
This equation provides a model of multistage cell lineages shown in previous studies [1, 21, 46]. The tissue size is given by the total cell number as follows:
and the distribution of stemness among all tissue cells is given by
Figure 5 shows the simulated dynamics beginning with 100 stem cells, which reveal the transition between stem and progenitor cells and the differentiation to terminally differentiated cells. Figure 5B shows the density of phenotypically different cell populations among the stem cells, progenitor cells and the terminally differentiated cells.
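The bookkeeping described in this section can be sketched for binned simulation output as follows. The snippet assumes that cells with stemness index *x* ≥ *x*_{0} are counted as stem cells and those with *x* < *x*_{0} as progenitor cells (consistent with larger *x* indicating higher stemness), and that the cell counts are stored per bin; the function name, the dictionary keys, and the example numbers are hypothetical.

```python
def lineage_counts(grid, q_bins, p_bins, x0):
    """Split binned cell counts into stem, progenitor, and terminally differentiated cells.

    grid   : bin midpoints of the stemness index x in [0, 1]
    q_bins : self-renewing cell count in each bin
    p_bins : terminally differentiated cell count in each bin
    x0     : assumed boundary between progenitor (x < x0) and stem (x >= x0) cells
    """
    stem = sum(q for x, q in zip(grid, q_bins) if x >= x0)
    progenitor = sum(q for x, q in zip(grid, q_bins) if x < x0)
    terminal = sum(p_bins)
    tissue_size = stem + progenitor + terminal
    # Fraction of all tissue cells falling in each stemness bin.
    stemness_fraction = [(q + p) / tissue_size for q, p in zip(q_bins, p_bins)]
    return {
        "stem": stem,
        "progenitor": progenitor,
        "terminal": terminal,
        "tissue_size": tissue_size,
        "stemness_fraction": stemness_fraction,
    }


if __name__ == "__main__":
    grid = [0.125, 0.375, 0.625, 0.875]
    summary = lineage_counts(grid, q_bins=[1.0, 2.0, 3.0, 4.0],
                             p_bins=[10.0, 0.0, 0.0, 0.0], x0=0.5)
    print(summary)
```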

## 3 Discussion

Stem cell and progenitor cell regeneration is a basic cellular behavior associated with development, aging, and many complex diseases in multicellular organisms. In this study, setting aside the genetic details, we established a general mathematical framework to describe the process of stem cell and progenitor cell regeneration. This framework highlights cell heterogeneity and connects heterogeneity with cellular behaviors, e.g., proliferation, apoptosis, and differentiation/senescence. Cell heterogeneity is often associated with epigenomic markers that are subject to stochastic inheritance during cell division, which is described by an inheritance probability function. Hence, the framework is a multiscale model that connects microscopic epigenetic states and gene expression with macroscopic tissue growth through mesoscopic cell behaviors. We believe that this formula is helpful in answering the Weinberg question [86]. Despite the generality of this formula, different assumptions regarding the kinetic rate functions and the inheritance probability can be applied to describe various biological processes related to stem cell regeneration (Figure 2).

In our framework, all stem and progenitor cells are described with a single compartment model, and different phenotypic cells are not distinguished explicitly. This approach differs from differentiation tree models that are widely used to describe the maintenance of hierarchically organized tissues. Recent experimental results have challenged the discrimination between stem and progenitor cell populations and have shown a continuous spectrum of cell differentiation [47, 65, 85]. Stochastic state transitions between different phenotypic cells lead to a dynamic equilibrium among a population of self-renewing cells [28, 52, 75]. Our model suggests that discrimination between cell types may not be necessary to describe tissue homeostasis. Different subtypes of cells can be characterized by their kinetic rates of proliferation, apoptosis, differentiation, senescence, *etc*. For convenience, these dynamic features are referred to as the *kinotype*, which is analogous to the genotype, epigenotype, and phenotype of a cell. The kinotype of cells is often associated with specific genes enriched in the related pathways. If the relationship between kinetic rates and the expression of these genes is known, we can extend the proposed framework to include the roles of specific genes. Therefore, in the future, we aim to develop a predictive model to investigate how variations in specific genes alter the long-term dynamics of tissue growth.

Although probabilistic epigenetic inheritance is considered, equation (2) is a deterministic equation that describes the dynamics of cell densities with different epigenetic states. This model provides information regarding the average behavior of many cells. To model individual cells, we must perform stochastic simulations that explicitly account for random events. Equation (2) suggests a numerical scheme of multiscale modeling for tissue growth in which a multiple-cell system is represented by the collection of the epigenetic states of all cells, Ω_{t} = {**x**_{i} : *i* = 1, …, *Q*(*t*)}. In each cell cycle, each cell undergoes proliferation, apoptosis, or terminal differentiation with a probability given by the corresponding kinetic rate so that both the system state Ω_{t} and the cell count *Q*(*t*) change, and the epigenetic state of each cell undergoing cell division changes according to the predefined inheritance probability *p*(**x, y**). In our previous study, this computational model was applied to model the process of inflammation-induced tumorigenesis; it reproduced the two-stage tumorigenesis dynamics and revealed the competing oncogenic and onco-protective roles of inflammation. Based on the simulation results, which include the evolution of single-cell states, we were able to uncover the detailed process of cancer development.
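The individual-cell scheme outlined above can be sketched as a simple stochastic simulation. This is a deliberately simplified illustration, not the published computational model: it ignores the explicit delay of the proliferating phase and instead treats each step of length `dt` as one decision point, assumes a single epigenetic state per cell with *ξ*(*x*) = 1, and uses hypothetical rate functions, inheritance mean, and parameter values.

```python
import math
import random


def simulate_cells(n0=50, steps=200, dt=0.25, tau=1.0, mu=0.05,
                   theta=200.0, eta=20.0, seed=1):
    """Track the epigenetic state of every cell; all rates below are hypothetical."""
    rng = random.Random(seed)
    cells = [0.9] * n0                         # initial cells with high stemness
    p_survive = math.exp(-mu * tau)            # chance to survive the proliferating phase
    for _ in range(steps):
        c = len(cells)                         # cytokine signal with xi(x) = 1
        next_cells = []
        for x in cells:
            beta = x / (1.0 + c / theta)       # proliferation rate, decreasing in c
            kappa = 0.1 + 0.2 * (1.0 - x)      # differentiation rate
            u = rng.random()
            if u < kappa * dt:
                continue                       # cell leaves the self-renewing pool
            if u < (kappa + beta) * dt:
                if rng.random() < p_survive:   # otherwise lost to apoptosis
                    phi = 0.1 + 0.8 * x        # mean daughter state given mother x
                    a, b = eta * phi, eta * (1.0 - phi)
                    next_cells.append(rng.betavariate(a, b))
                    next_cells.append(rng.betavariate(a, b))
                continue
            next_cells.append(x)               # cell stays in the resting phase
        cells = next_cells
    return cells


if __name__ == "__main__":
    final_cells = simulate_cells()
    print("cell count:", len(final_cells))
    print("mean stemness:", sum(final_cells) / len(final_cells))
```

Averaging many such runs should approach the behavior of the deterministic density equation, whereas a single run retains the fluctuations of individual cell fates.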

## Material and methods

### Resource

Source MATLAB code for the study is available from https://github.com/jzlei/StemCell.

### Age-structured model and delay differential equation models

In the G0 cell cycle model, *Q*(*t*) is the number of resting-phase stem cells, *s*(*t, a*) is an age-structured quantity that represents the population of proliferating stem cells, and the age *a* = 0 is their time of entry into the proliferative state. The resting-phase cells can either reenter the proliferative phase at a rate *β*(*Q*) or differentiate into downstream cell lines at a rate *κ*. The proliferating stem cells are assumed to undergo mitosis at a fixed time *τ* after entry into the proliferating compartment and to be lost randomly at a rate *µ* during the proliferating phase. Each normal cell generates two resting-phase cells at the end of mitosis. Here, cell populations are often measured as the number of cells per unit body weight, *e.g.*, cells*/*kg, and the rates of proliferation, differentiation, and apoptosis are given in units of day^{-1}.

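The above assumptions yield the following partial differential equations [49]:

$$
\begin{aligned}
\nabla s(t,a) &= -\mu\, s(t,a), && (t > 0,\ 0 < a < \tau),\\
\frac{dQ}{dt} &= 2 s(t,\tau) - \left(\beta(Q) + \kappa\right) Q, && (t > 0).
\end{aligned} \tag{9}
$$

Here, ∇ = *∂/∂t* + *∂/∂a*. The boundary condition at *a* = 0 is as follows:

$$
s(t,0) = \beta(Q(t))\, Q(t), \tag{10}
$$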
and the initial conditions are
Equations (9)-(11) provide a general age-structured model of homogeneous stem cell regeneration.

By integrating (9)-(10) with the characteristic line method, we obtain the following closed-form differential equation [49]:
where *Q*_{τ} = *Q*(*t − τ*). When we consider only the long-term behavior (*t* > *τ*) and shift the original time point to *τ*, the delay differential equation model is as follows:

$$
\frac{dQ}{dt} = -\left(\beta(Q) + \kappa\right) Q + 2 e^{-\mu\tau} \beta(Q_{\tau}) Q_{\tau}. \tag{13}
$$
This equation describes the general population dynamics of homogeneous stem cell regeneration.

### Formulation of the proliferation rate

The effect of feedback regulation from the cell population to the proliferation rate is given by the function *β*(*Q*). Biologically, the self-renewal ability of a cell is determined by both microenvironmental conditions, *e.g.*, growth factors and various types of cytokines, and intracellular signaling pathways, e.g., growth factor receptors and cell cycle checkpoints, such as fibroblast growth factors (FGFs) and the transforming growth factor beta (TGF-*β*) family [58, 63, 67]. The exact activation pathways that regulate the self-renewal of stem cells are poorly understood. Here, we derived a phenomenological formulation based on simple but general assumptions.

There are positive and negative signals for stem cell proliferation. We assume that positive growth factors are secreted by the niche, and growth factor inhibitors are released by the cells. Different types of cytokines bind to the cell surface receptors to regulate cell behavior. Let [L] denote the concentration of ligands for the growth factor inhibitor; [R], the density of free receptors; [R*], the density of activated receptors; and *Q*, the stem cell number. The total number of receptors is

$$
[R] + [R^{*}] = mQ,
$$

where *m* is the average number of receptors per cell. If *n* ligands are required to activate one receptor, we assume that ligands bind to the receptor following the law of mass action as follows:

$$
R + nL \rightleftharpoons R^{*}.
$$
At equilibrium, we have the following equation:
where *K* is the equilibrium constant. We assume that the activated receptors inhibit cell proliferation so that the proliferation rate is proportional to the fraction of free receptors on a cell as follows:
From (14)-(15), we obtain the following expression:
When ligands are secreted from stem cells and are cleared at a constant rate, the ligand concentration is proportional to the cell number, which gives [L] = *σQ*. Thus, we obtain the final form of the proliferation rate as follows:

$$
\beta(Q) = \beta_0 \frac{\theta^{n}}{\theta^{n} + Q^{n}}, \tag{16}
$$
where *θ* = (1*/σ*)^{1/n} is the 50% effective coefficient (EC50).

From (16), the proliferation rate, which is important for tissue homeostasis, approaches 0 due to the antiproliferation signals when the cell number *Q* is sufficiently large. However, self-sufficiency in growth signals and insensitivity to antigrowth signals are two hallmarks of cancer that enable malignant tumor cells to escape antigrowth signals [31]. Hence, to model tumor development, the proliferation rate can be modified to include a nonzero constant *β*_{1} for self-sustained growth signals as follows:

$$
\beta(Q) = \beta_0 \frac{\theta^{n}}{\theta^{n} + Q^{n}} + \beta_1. \tag{17}
$$
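The qualitative behavior of the proliferation rate is easy to check numerically. The sketch below assumes the Hill-type form *β*(*Q*) = *β*_{0}*θ*^{n}/(*θ*^{n} + *Q*^{n}) + *β*_{1}, with *β*_{1} = 0 for the normal case; the function name and the parameter values in the example are illustrative.

```python
def proliferation_rate(q, beta0, theta, n, beta1=0.0):
    """Hill-type feedback: beta0 * theta^n / (theta^n + q^n), plus a constant beta1."""
    return beta0 * theta ** n / (theta ** n + q ** n) + beta1


if __name__ == "__main__":
    beta0, theta, n = 8.0, 100.0, 2
    for q in (0.0, 50.0, 100.0, 200.0, 1000.0):
        normal = proliferation_rate(q, beta0, theta, n)
        tumor = proliferation_rate(q, beta0, theta, n, beta1=0.5)
        print(f"Q = {q:7.1f}  beta = {normal:.4f}  beta with beta1 = {tumor:.4f}")
```

At *Q* = *θ* the feedback term equals *β*_{0}/2, which is why *θ* plays the role of a half-maximal (EC50) cell number; with *β*_{1} > 0 the rate no longer decays to zero for large *Q*.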

### Steady state of the G0 cell cycle model and oncogenic signaling pathways

From (13), the steady state *Q*(*t*) ≡ *Q** is given by the equation

$$
-\left(\beta(Q^{*}) + \kappa\right) Q^{*} + 2 e^{-\mu\tau} \beta(Q^{*}) Q^{*} = 0,
$$

which yields either *Q** = 0, or

$$
\beta(Q^{*}) = \frac{\kappa}{2 e^{-\mu\tau} - 1}.
$$

When *β*(*Q*) is given by (17), equation (13) has a unique positive steady state if and only if

$$
\beta_1 < \frac{\kappa}{2 e^{-\mu\tau} - 1} < \beta_0 + \beta_1. \tag{18}
$$

In particular, when

$$
\beta_1 > \frac{\kappa}{2 e^{-\mu\tau} - 1} > 0, \tag{19}
$$

(13) has only a zero steady state, and the zero solution is unstable. Hence, all positive solutions approach infinity, which corresponds to uncontrolled growth. Therefore, the inequality (19) summarizes a general condition for uncontrolled growth, *i.e.*, malignant tumors. Biologically, inequality (19) is satisfied if there is self-sufficiency in growth signals and/or insensitivity to antigrowth signals (increasing *β*_{1}), evasion of apoptosis (decreasing *µ*), and dysregulation in the differentiation and/or senescence pathways (decreasing *κ*), which are well-known hallmarks of cancer [31].
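The steady-state analysis can be turned into a small classification routine. The sketch assumes the Hill-type rate *β*(*Q*) = *β*_{0}*θ*^{n}/(*θ*^{n} + *Q*^{n}) + *β*_{1} and the steady-state condition *β*(*Q**) = *κ*/(2e^{−µτ} − 1), which can then be solved for *Q** in closed form; the function name, the returned labels, and the parameter values are illustrative.

```python
import math


def steady_state(beta0, beta1, theta, n, kappa, mu, tau):
    """Classify the long-term behaviour and return (label, Q*), with Q* = None if absent."""
    gain = 2.0 * math.exp(-mu * tau) - 1.0     # net gain per division after apoptosis
    if gain <= 0.0:
        return "no positive steady state", None
    required = kappa / gain                    # value beta(Q*) must take
    if required <= beta1:
        # beta(Q) exceeds the required value for every Q: growth never stops
        # (this branch also covers the boundary case required == beta1).
        return "uncontrolled growth", None
    if required >= beta0 + beta1:
        # even the maximal proliferation rate cannot balance the loss of cells
        return "no positive steady state", None
    q_star = theta * (beta0 / (required - beta1) - 1.0) ** (1.0 / n)
    return "unique positive steady state", q_star


if __name__ == "__main__":
    print(steady_state(beta0=1.0, beta1=0.0, theta=1.0, n=1, kappa=0.1, mu=0.05, tau=1.0))
    print(steady_state(beta0=1.0, beta1=0.2, theta=1.0, n=1, kappa=0.1, mu=0.05, tau=1.0))
```

In the second call the constant *β*_{1} alone exceeds the rate needed to balance differentiation, so no finite cell number can halt growth.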

### Age-structured model of heterogeneous stem cell regeneration

When heterogeneity in stem cells is considered, and the apoptosis rate of cells during cell division is assumed to depend on the epigenetic state of the cell before entering the proliferating phase, the age-structured model (9) becomes
and
While we considered the epigenetic state **x** in the first equation as a parameter, the characteristic line method remains valid, which gives the following equation (here we show only the result of long-term behavior):
Substituting *s*(*t, τ* (**x**), **x**) into the second equation, we obtained the following equation:
which gives the equation (2) for heterogeneous stem cell regeneration.

Let *Q*(*t*) = ∫*Q*(*t*, **x**)*d***x**, which is the total cell number, and integrate (20) with respect to **x**. Notably, since ∫*p*(**x, y**)*d***x** = 1, we obtain the following equation:
Define *f*(*t*, **x**) = *Q*(*t*, **x**)*/Q*(*t*) as the density of cells with a given epigenetic state **x**; then
Hence, we obtained the equation

### The inheritance probability *p*(x, y)

In (2), the inheritance probability *p*(**x, y**) is essential to describe the heterogeneity of cells. The function *p*(**x, y**) is associated with the process of cell division, during which the epigenetic code and molecules are distributed to daughter cells through complex regulation mechanisms that are not well understood. Hence, the general mathematical formula of the function *p*(**x, y**) remains unknown. Here, we attempt to define the function based on the random inheritance of histone modifications.

In eukaryotic cells, most DNA sequences are enclosed in nucleosomes in which DNA sequences wrap around a histone octamer that is composed of one (H3-H4)_{2} tetramer capped by two H2A-H2B dimers. These histones can undergo diverse posttranslational covalent modifications that lead to either active or repressive gene expression activities [5, 41, 44]. The patterns of histone modification change dynamically over time and hence define a dynamic histone code for the transcription activity. The dynamics of histone modifications consist of complex processes, including nucleosome assembly, writing and erasing of the modification markers, and random inheritance during DNA replication [71, 76]. Detailed computational modeling of histone modification and random inheritance over the cell cycle remains a challenging issue in computational biology. If we consider only the main processes of writing and erasing the modification markers, which are modulated by the related enzymes, the kinetics of histone modification can be modeled through stochastic simulations [37, 42].

In a proposed dynamic model of histone modification [37, 42], bivalent modifications of the histone H3, the trimethylation of H3 lysine 4 (H3K4me3) and the trimethylation of H3 lysine 27 (H3K27me3), were considered. Each H3 histone can be in one of the following states: unmodified (U), modified by the activating marker H3K4me3 (A), or modified by the repressing marker H3K27me3 (R). Each nucleosome can be in one of six physically distinct nucleosome states: UU, AU, UR, AA, AR, and RR. The nucleosome states change dynamically according to methylation/demethylation, which are regulated by the corresponding enzymes. During DNA replication, parental histones and newly synthesized histones are randomly distributed on daughter strands. To avoid the dilution of histone markers, maintenance modifications in the new histones can be achieved by using a neighboring histone as a template [71]. Hence, writing enzyme activities are dependent on the states of neighboring nucleosomes. Thus, changes in the nucleosome state over the cell cycle due to the random distribution of histone markers during DNA replication and kinetic methylation/demethylation can be tracked with a stochastic simulation [37].

Based on the simulation of the abovementioned model, we can study how the nucleosome state of daughter cells depends on that of the mother cell. For example, for a DNA segment with *N* nucleosomes, we counted the number (*N*_{A}) of nucleosomes with active markers (either AA or AU) at each cycle. The simulation results suggested that, given the state of the mother cell, the number of active nucleosomes is a binomially distributed random number whose parameter (success probability *p*) depends on the state of the mother cell as follows [37]:

Describing the nucleosome state through the fraction (*f*_{A} = *N*_{A}*/N*) of active nucleosomes, we can extend the binomial distribution to a continuous probability distribution defined on the interval [0, 1], which is given by the following beta distribution:

Hence, for the specific situation of the random inheritance of histone modifications during the cell cycle, we can use the beta distribution density as the inheritance probability function *p*(*x, y*). Here, we extend this formulation to general cases and propose equation (5) for the inheritance functions.
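
The passage from the binomial count *N*_{A} to a beta-distributed fraction *f*_{A} can be checked numerically. The sketch below is one possible illustration and not the fit reported in [37]: the values of `N` and `p` are hypothetical, and the shape parameters are chosen by moment matching (same mean *p* and same variance *p*(1 − *p*)/*N* as the binomial fraction), which is an assumption made here.

```python
import math


def binomial_pmf(n_total, k, p):
    """P(N_A = k) for a binomial distribution with n_total trials."""
    return math.comb(n_total, k) * p**k * (1.0 - p) ** (n_total - k)


def beta_pdf(x, a, b):
    """Beta density on (0, 1), evaluated through log-gamma for stability."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1.0) * math.log(x) + (b - 1.0) * math.log(1.0 - x))


N = 60    # hypothetical number of nucleosomes in the DNA segment
p = 0.4   # hypothetical success probability set by the mother cell

# Moment matching (assumption): the fraction N_A / N has mean p and variance
# p(1 - p)/N; a beta distribution with a + b = N - 1 has the same two moments.
a = (N - 1) * p
b = (N - 1) * (1.0 - p)

# Compare the density of f_A = k/N (binomial pmf divided by bin width 1/N)
# with the matched beta density at a few values of k.
for k in (12, 18, 24, 30, 36):
    f = k / N
    print(f"f_A = {f:.2f}  binomial: {binomial_pmf(N, k, p) * N:.3f}  beta: {beta_pdf(f, a, b):.3f}")
```

The two columns agree closely near the mean and differ more in the tails, which is what one expects from a continuous approximation of a discrete distribution.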

### Beta distribution

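The beta distribution is a family of continuous probability distributions defined on the interval [0, 1] and parameterized by two positive shape parameters, which appear as exponents of the random variable and control the shape of the distribution. The probability density function (PDF) of the beta distribution, for 0 ≤ *x* ≤ 1 and shape parameters *a, b* > 0, is a power function of the variable *x* and of its reflection (1 − *x*) as follows:

$$
f(x; a, b) = \frac{\Gamma(a+b)}{\Gamma(a)\,\Gamma(b)}\, x^{a-1} (1-x)^{b-1},
$$

where Γ(*z*) is the gamma function.

For a random variable *X* beta-distributed with parameters *a* and *b*, which is denoted by *X* ∼ beta(*a, b*), the mean and variance are as follows:

$$
\mathrm{E}(X) = \frac{a}{a+b}, \qquad \mathrm{Var}(X) = \frac{ab}{(a+b)^{2}(a+b+1)}.
$$

Then, it is easy to obtain

$$
\mathrm{Var}(X) = \frac{1}{1+a+b}\, \mathrm{E}(X)\bigl(1 - \mathrm{E}(X)\bigr).
$$

Hence, if we assume

$$
\mathrm{E}(X) = \phi, \qquad \mathrm{Var}(X) = \frac{1}{1+\eta}\, \phi (1-\phi),
$$

then

$$
\frac{a}{a+b} = \phi, \qquad a + b = \eta,
$$

which gives

$$
a = \eta \phi, \qquad b = \eta (1-\phi).
$$

Applied to each component with *ϕ* = *ϕ*_{i}(**y**) and *η* = *η*_{i}(**y**), this gives equation (6), which determines the shape parameters from the functions *ϕ*_{i}(**y**) and *η*_{i}(**y**).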
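
As a quick numerical check of this parameterization, the following sketch maps a prescribed mean *ϕ* and a parameter *η* to beta shape parameters and compares the analytical moments with sample moments. It assumes the mean-concentration form *a* = *ηϕ*, *b* = *η*(1 − *ϕ*); the helper names `beta_shape` and `beta_mean_var` and the value *ϕ* = 0.3 are illustrative choices, while *η* = 60 is the value used in the simulations of this paper.

```python
import random
import statistics


def beta_shape(phi, eta):
    """Shape parameters (a, b) for mean phi in (0, 1) and eta > 0,
    assuming a = eta * phi and b = eta * (1 - phi)."""
    if not 0.0 < phi < 1.0 or eta <= 0.0:
        raise ValueError("need 0 < phi < 1 and eta > 0")
    return eta * phi, eta * (1.0 - phi)


def beta_mean_var(a, b):
    """Analytical mean and variance of beta(a, b)."""
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1.0))
    return mean, var


phi, eta = 0.3, 60.0                 # phi is illustrative; eta = 60 as in the text
a, b = beta_shape(phi, eta)
mean, var = beta_mean_var(a, b)

# Draw samples with the standard-library beta sampler and compare moments.
rng = random.Random(0)
samples = [rng.betavariate(a, b) for _ in range(20000)]
print("shape parameters:", a, b)
print("mean:     analytical", mean, " sample", statistics.fmean(samples))
print("variance: analytical", var, " sample", statistics.variance(samples))
```

With this form, *ϕ* sets where the daughter-cell state is centred and a larger *η* narrows the distribution around it, since the variance equals *ϕ*(1 − *ϕ*)/(1 + *η*).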

### Simulations for stem cell regeneration

Here, we present a simple example to illustrate the numerical scheme for simulating stem cell regeneration based on the proposed model equations.

We consider a situation with one epigenetic state *x* (0 ≤ *x* ≤ 1) that affects only cell proliferation and differentiation so that only the rates *β* and *κ* are dependent on the epigenetic state *x*. Therefore, we have the following model equation:
Here, *ξ*(*x*) = 1 so that
To specify the rate functions *β* and *κ*, we assume that the state *x* affects the proliferation and differentiation rates in a manner similar to stemness: a large value of *x* indicates stem cells with a low proliferation rate, an intermediate value of *x* indicates progenitor cells with a high proliferation rate, and the terminal differentiation rate *κ* is a decreasing function of *x* that approaches zero when *x* is large. This is mathematically expressed as follows:
and
The inheritance probability function *p*(*x, y*) is defined through the beta distribution density function with predefined functions *ϕ*(*y*) and *η*(*y*) as follows:
and
Figure 7 shows the functions *β*_{0}(*x*), *κ*(*x*), and *p*(*x, y*).
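
For a numerical scheme, the inheritance function *p*(*x, y*) has to be evaluated on a grid in the epigenetic state. The sketch below shows one way to tabulate it; it is not the scheme used for the figures. The linear `phi` is a hypothetical placeholder and not the function used in this paper, the midpoint grid and the renormalization of each column are choices made here, and the shape parameters again assume *a* = *ηϕ*, *b* = *η*(1 − *ϕ*) with *η* = 60.

```python
import math

ETA = 60.0  # eta(y) = 60, as in the simulations of this paper


def phi(y):
    """Hypothetical placeholder for the conditional mean of x given y."""
    return 0.1 + 0.8 * y


def beta_pdf(x, a, b):
    """Beta density on (0, 1), evaluated through log-gamma for stability."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1.0) * math.log(x) + (b - 1.0) * math.log(1.0 - x))


def inheritance_kernel(n_grid):
    """Tabulate P[i][j] ~ p(x_i, y_j) on cell midpoints of [0, 1].

    Midpoints avoid the endpoints x = 0 and x = 1; each column is rescaled
    so that the midpoint rule integrates it to exactly one over x."""
    h = 1.0 / n_grid
    xs = [(i + 0.5) * h for i in range(n_grid)]
    P = [[0.0] * n_grid for _ in range(n_grid)]
    for j, y in enumerate(xs):
        a = ETA * phi(y)
        b = ETA * (1.0 - phi(y))
        column = [beta_pdf(x, a, b) for x in xs]
        total = sum(column) * h
        for i in range(n_grid):
            P[i][j] = column[i] / total
    return xs, P


n = 100
xs, P = inheritance_kernel(n)
j = n // 2
mean_x = sum(xs[i] * P[i][j] for i in range(n)) / n
print("y =", xs[j], " phi(y) =", phi(xs[j]), " grid mean of x =", mean_x)
```

The integral over *y* that appears in the model equation can then be approximated by a matrix-vector product with `P` scaled by the grid spacing.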

In the simulations shown in Figure 2, the parameters are as follows: *µ* = 2.0 × 10^{-4} h^{-1}, *τ* = 20 h, *θ* = 10^{3} cells, *a*_{1} = 5.8, *a*_{2} = 2.2, *a*_{3} = 3.75, *b*_{1} = 4.0, *η*(*x*) = 60, and

in (B)-(C): ;

in (D)-(E): ;

in (F)-(G):.

In the simulation shown in Figure 3, *κ*_{0} = 0.4 h^{-1}, and the other parameters are the same as those shown in Figure 2B.

In the simulation shown in Figure 4, we set the wild-type cell parameters according to those shown in Figure 2B and *p*_{0,1} = *p*_{0,2} = 0.5 × 10^{-4}, *p*_{1,3} = *p*_{2,3} = 1 × 10^{-4}. For mutation 1, the proliferation rate is twice that of wild-type cells, and for mutation 2, the differentiation rate is 1/10 of that of wild-type cells.

In the simulation shown in Figure 5, we set *x*_{0} = 0.7, the differentiation rate increases from 0 to the normal level following , and the other parameters are the same as those shown in Figure 2B.

## Acknowledgement

This work was supported by the National Natural Science Foundation of China (NSFC 91730301 and 11831015).

## References

- [1].
- [2].
- [3].
- [4].
- [5].
- [6].
- [7].
- [8].
- [9].
- [10].
- [11].
- [12].
- [13].
- [14].
- [15].
- [16].
- [17].
- [18].
- [19].
- [20].
- [21].
- [22].
- [23].
- [24].
- [25].
- [26].
- [27].
- [28].
- [29].
- [30].
- [31].
- [32].
- [33].
- [34].
- [35].
- [36].
- [37].
- [38].
- [39].
- [40].
- [41].
- [42].
- [43].
- [44].
- [45].
- [46].
- [47].
- [48].
- [49].
- [50].
- [51].
- [52].
- [53].
- [54].
- [55].
- [56].
- [57].
- [58].
- [59].
- [60].
- [61].
- [62].
- [63].
- [64].
- [65].
- [66].
- [67].
- [68].
- [69].
- [70].
- [71].
- [72].
- [73].
- [74].
- [75].
- [76].
- [77].
- [78].
- [79].
- [80].
- [81].
- [82].
- [83].
- [84].
- [85].
- [86].
- [87].
- [88].
- [89].
- [90].
- [91].
- [92].
- [93].