Cryptsim: Modeling the evolutionary dynamics of the progression of Barrett’s esophagus to esophageal adenocarcinoma

To alleviate the over-diagnosis and overtreatment of premalignant conditions we need to predict their progression to cancer, and therefore, the dynamics of an evolutionary process. However, monitoring evolutionary processes in vivo is extremely challenging. Computer simulations constitute an attractive alternative, allowing us to study these dynamics based on a set of evolutionary parameters. We introduce CryptSim, a simulator of crypt evolution inspired by Barrett’s esophagus. We detail the most relevant computational strategies it implements, and perform a simulation study showing that the interaction between neighboring crypts may play a crucial role in carcinogenesis.


I. INTRODUCTION TO THE TYPE OF PROBLEM IN CANCER
Pre-malignant conditions are changes in the normal tissue structure-usually characterized by dysplasia or benign neoplasia-that increase the risk of developing a specific kind of cancer. These conditions are generally detected via screening, and their association with cancer fosters their surveillance and aggressive treatment. However, their probability of progression to invasive cancer is often low (e.g., 0.12% of patients with Barrett's esophagus (BE) without dysplasia progress to esophageal adenocarcinoma every year [1]), and therefore most actions carried out on patients with these conditions constitute over-treatment. To reduce over-treatment we need to improve our risk stratification procedures, which means we must improve our ability to predict the progression of pre-malignant conditions to invasive cancers. Cancer is a disease of the somatic genome in which somatic cells mutate and evolve. Therefore, predicting cancer progression implies predicting evolution. While rough estimates of somatic evolutionary parameters are available (see [3] for a summary), these are not enough to fully understand or predict cancer progression. Studying evolutionary processes in vivo is extremely difficult due to economic, ethical, temporal and sampling-related limitations. Moreover, because somatic evolution is a stochastic process, we may require impractically large sample sizes to detect regularities. Computer simulations are an attractive alternative and complementary approach to surveillance of Research  patient cohorts. Simulations allow us to perform as many replicates as are needed and eliminate sampling limitations. There is, however, a combinatorial explosion in the number and combinations of parameters, and so we restrict the investigation of the model to a set of a priori most relevant parameters.
Barrett's esophagus (BE) is a pre-malignant condition that increases the risk of progression to esophageal adenocarcinoma [4]. The BE epithelium is divided into crypts, well-like structures with a few stem cells near the bottom. Crypts are relatively homogeneous units of selection that accumulate mutations, divide and die. High genetic diversity among crypts at initial endoscopy is associated with progression [5,6], and clonal expansions via crypt divisions have been observed [7,8]. However, little is known about the dynamics of this process or the relative importance and role of the different evolutionary parameters that drive progression.
In this highlight, we introduce CryptSim [9]-a simulator of crypt evolution inspired by BE. Using it, we show that the interaction between neighboring crypts plays a key role in carcinogenesis of neoplasms that occur in crypt-based twodimensional tissues, constituting a promising target for cancer prevention.

II. ILLUSTRATIVE RESULTS OF APPLICATION OF METHODS
We performed a series of simulation studies to assess the effect of different evolutionary and crypt-biology parameters in carcinogenesis (see Table 1 for a list of simulation parameters). It is unclear if a dividing crypt can replace a neighbor or if it must wait for a neighbor to die before it can divide. We model this with a parameter for the probability of replacing a neighbor crypt. The value of this parameter dramatically affected progression to cancer. Reducing the probability that a crypt could replace a neighbor reduced the probability of progression, and zero probability of replacing a neighbor completely stopped progression (Fig. 1).
The implications of this finding are twofold. On the one hand, it shows the need for a better understanding of basic crypt biology to better model and predict cancer progression. Specifically, little is known about the process and dynamics of tissue homeostasis at the crypt level (e.g., control of replication, replication rate, and death rate). On the other hand, it highlights the key role of spatial competition in cancer progression. Progression to cancer requires the acquisition of a few advantageous mutations (i.e., driver mutations) in (at least) one clone in order to acquire the hallmarks of cancer [10]. Large population sizes increase the rate of fitness increases by mutations and natural selection. However, in most tissues population sizes are limited by spatial competition. The replacement of neighboring crypts included in our model facilitates somatic evolution and constitutes a promising target for cancer prevention. Although high tissue renewal rates (i.e., high replication and death rates) and reductions in crypt size are additional mechanisms that could provide additional space for evolution to act, they would require unrealistic parameter values to allow cancer progression.

III. QUICK GUIDE TO THE METHODS
Cryptsim's model is composed of three main constituents: an agent-based model, a two-dimensional spatial model, and a clone tree. The agent-based model describes the evolution of crypts through time. Using crypts rather than cells as our basic unit allows us to keep computational requirements of realistic tissue sizes within reason and is based on the fact that the small number of stem cells in a crypt are thought to be rapidly homogenized by genetic drift. Each crypt i has associated a genome Xi composed of n neutral and m selective loci l=m+n. Crypts can undergo four different events, each controlled by a rate: replication (b), death (d), mutation of neutral loci (v), and mutation of selective loci (µ). All events can act freely except replication, which requires free space in the neighborhood of the dividing crypt. If there is none, the new crypt can replace a neighbor at random with probability p. While most simulation parameters are constant and shared for all crypts for each simulation run, Xi, bi, and di are private to the crypts and change with the accumulation of mutations modifying genotype and crypt fitness. Cryptsim assumes an irreversible binary model for selective mutations (each Xl Î {0 (normal), 1 (mutated)}), and an additive model for the neutral mutations. Each selective locus Xl, with l Î {1…m}, has associated selective coefficients rl and sl, which modify replication rate b and death rate d, respectively (1) and (2).
Mutation rates (µ and v) per locus can be relative to time or replication events, which is determined by the a parameter. The first option captures the effect of external mutagens (like the acidic environment in BE) while the second models errors in the DNA replication and DNA repair machinery. Equations (3) and (4) show the effect of a in the calculation of crypt selective (µi) and neutral (vi) mutation rates per time unit for a given crypt i.
The two-dimensional spatial model represents the BE segment and consists of a hw (height x width) hexagonal lattice wrapped along the h dimension. This component allows us to consider neighbor interactions and limit the carrying capacity (i.e., maximum population size) of the simulated neoplasm; and can be either empty or full of normal crypts when the simulation starts.
Finally, a clone tree keeps track of the ancestral relationships of the different clones, their genotypes and divergence times.

A. Strategies to overcome computational challenges
Forward simulation of a BE segment using biologicallyreasonable parameters requires on the order of one billion crypt divisions, which makes this problem computationally intensive. Cryptsim implements a series of strategies to reduce its computational burden, among which we would like to highlight the usage of the Gillespie algorithm [11] and the calculation of genotypes on-the-fly using the clone tree. Rather than checking every crypt every time step for a rare event, we calculate the time until the next event and keep an ordered queue of events. This algorithm can be summarized in the following steps: a) sample the waiting time to the next event from an exponential distribution with rate equal to the sum of rates of all events, b) sample the event with probability proportional to its relative rate, c) choose the crypt that carries out the event at random, d) advance the sampled waiting time and carry out the event, e) update rates and iterate. The clone tree reduces memory requirements and operations by not storing genomes of each crypt (Xi). Thus, crypts pertain to the spatial model and are linked to clones in the clone tree and rate groups in an associative array (determined by their selective genome). Mutations generate new clones in the clone tree, which also tracks their new mutation and rate group. Crypt replacements only change crypt information and their linkage with the clone tree, avoiding copying Xi for every replication. Clone genotypes are obtained with a simple tree traversal when needed. This clone tree is regularly pruned to remove extinct subtrees, minimizing memory requirements.

B. Settings in which these methods are useful
As-is, this model can be applied to study evolutionary dynamics of somatic evolution of two-dimensional neoplasms organized in crypts, such as BE or the colon. However, most of the computational techniques detailed here can be applied to the study of somatic evolution in general.

C. Availability
Cryptsim is implemented in Java and will be distributed under the GPL v3 license upon publication. In the meantime, contact the authors to get access to the source code.

ACKNOWLEDGMENT
We thank Todd Parsons for suggesting using a Gillespie algorithm for efficient simulation.