Abstract
During development, gene regulatory networks allocate cell fates by partitioning tissues into spatially organised domains of gene expression. How the sharp boundaries that delineate these gene expression patterns arise, despite the stochasticity associated with gene regulation, is poorly understood. We show, in the vertebrate neural tube, using perturbations of coding and regulatory regions, that the structure of the regulatory network contributes to boundary precision. This is achieved, not by reducing noise in individual genes, but by the configuration of the network modulating the ability of stochastic fluctuations to initiate gene expression changes. We use a computational screen to identify the properties of a network that influence boundary precision, revealing two dynamical mechanisms by which small gene circuits attenuate the effect of noise to increase patterning precision. These results establish design principles of gene regulatory networks that produce precise patterns of gene expression.
Introduction
Embryos are characterised by remarkably organised and reproducible patterns of cellular differentiation. This accuracy is illustrated by the sharp boundaries of gene expression observed in many developing tissues. These patterns are determined by gene regulatory networks (GRNs), governed by secreted developmental signals [Davidson, 2010], raising the question of how precision is achieved in spite of the biological noise and inherent stochastic fluctuations associated with the regulation of gene expression [Raser and O’Shea, 2005]. For individual genes, the activity of redundant regulatory elements (so-called shadow enhancers), the 3D architecture of the genome, the presence of multiple alleles, and the effect of RNA processing have all been proposed to buffer fluctuations and increase the robustness of gene expression [Perry et al., 2010, Frankel et al., 2010, Lagha et al., 2012, Battich et al., 2015, Dickel et al., 2018, Osterwalder et al., 2018, Paliou et al., 2019]. At the level of the tissue, mechanisms that regulate the shape, steepness or variance of gradients have been explored and their effect on the precision of gene expression detailed [Bollenbach et al., 2008, Sokolowski et al., 2012, Tkačik et al., 2015, Zagorski et al., 2017, Lucas et al., 2018]. Moreover, several mechanisms, including differential adhesion, mechanical barriers and juxtacrine signalling, have been proposed to correct anomalies and enhance precision, once cellular patterning has been initiated [Xu et al., 1999, Standley et al., 2001, Rudolf et al., 2015, Dahmann et al., 2011, Addison et al., 2018]. Finally, theoretical studies have suggested that the structure and activity of GRNs can also affect precision [Lo et al., 2015, Perez-Carrasco et al., 2016]. However, experimental evidence to support this remains elusive.
The vertebrate neural tube provides a well-characterised system in which to explore the role of GRNs in the precision of patterning. Here, a GRN partitions neural progenitors into discrete domains of gene expression arrayed along the dorsal-ventral axis [Sagner and Briscoe, 2017]. The boundaries between these domains are clearly delineated and accurately positioned [Kicheva et al., 2014], resulting in sharp spatial transitions in gene expression that produce characteristic stripes of molecularly distinct cells. In the ventral neural tube, the secreted ligand Sonic Hedgehog (Shh), emanating from the notochord and floor plate, located at the ventral pole, controls the pattern forming GRN (Fig. 1A). This network includes the transcription factors (TFs) Pax6, Olig2, Irx3 and Nkx2.2. Irx3 represses Olig2 [Novitch et al., 2001], while Nkx2.2 is repressed by Pax6, Olig2 and Irx3 [Briscoe and Ericson, 1999, Briscoe et al., 2000, Novitch et al., 2001, Balaskas et al., 2012]. In the absence of Shh signalling, progenitors express Pax6 and Irx3. Moderate levels of Shh signalling are sufficient to induce Olig2 expression and repress Irx3 to specify motor neuron progenitors (pMN) [Ericson et al., 1997, Briscoe et al., 2000, Novitch et al., 2001, Balaskas et al., 2012]. In response to high and sustained levels of Shh signalling, Nkx2.2 is induced and inhibits the expression of Pax6 and Olig2; this generates p3 progenitors and delineates the p3/pMN boundary (Fig. 1B). Thus the regulatory interactions between the TFs controlled by Shh signalling explain the dynamics of gene expression in the ventral neural tube and produce the genetic toggle switches that result in discrete transitions in gene expression in individual cells [Balaskas et al., 2012].
However, stochastic fluctuations in gene expression in individual cells would be expected to generate variations in the position and precision at which cells switch from pMN to p3 identity [Lv et al., 2014, Perez-Carrasco et al., 2016] and would erode the sharpness of the domain boundary. We therefore asked whether there were other features of the GRN that counteract the effect of intrinsic noise in gene expression to enhance boundary precision.
Results and Discussion
In mouse embryos lacking Pax6, the precision of the boundary between the p3 and pMN domains appears decreased, resulting in more intermixing of cells expressing Olig2 or Nkx2.2 (Fig. 1C) [Ericson et al., 1997, Briscoe et al., 2000, Balaskas et al., 2012]. To quantify this change in boundary precision, we compared the dorsal-ventral width of the region that contains both Nkx2.2 and Olig2 expressing cells in WT and Pax6 mutant embryos (Supplemental Section F.7). In Pax6−/− embryos, the p3-pMN boundary is shifted dorsally, expanding the Nkx2.2 expressing p3 domain and shrinking the Olig2 domain (Fig. 1D) [Ericson et al., 1997]. In addition, between e9.0 and e10.5, the pMN-p3 boundary becomes progressively wider in Pax6−/− embryos, indicating a loss of precision (Fig. 1E & S1). These observations show that as well as determining the dorsal limit of Nkx2.2 expression, Pax6 contributes to the sharpness of the p3-pMN boundary.
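The boundary-width measure described above can be sketched in a few lines. This is only an illustration of the idea: the actual procedure is given in Supplemental Section F.7, and the function name, the input layout (dorsal-ventral position measured from the ventral pole, plus a marker label per cell) and the example values are assumptions made here.

```python
def boundary_width(cells):
    """Width of the dorsal-ventral region containing both cell types.

    `cells` is a list of (position, label) pairs, with position measured
    from the ventral pole and label either "Nkx2.2" or "Olig2".
    (Hypothetical input format, chosen for illustration.)
    """
    nkx_positions = [pos for pos, label in cells if label == "Nkx2.2"]
    olig_positions = [pos for pos, label in cells if label == "Olig2"]
    if not nkx_positions or not olig_positions:
        return 0.0
    # The mixed region runs from the ventral-most Olig2 cell
    # to the dorsal-most Nkx2.2 cell; no overlap gives width 0.
    dorsal_most_nkx = max(nkx_positions)
    ventral_most_olig = min(olig_positions)
    return max(0.0, dorsal_most_nkx - ventral_most_olig)


# Illustrative (made-up) cell positions in arbitrary units.
sharp = [(1, "Nkx2.2"), (2, "Nkx2.2"), (3, "Olig2"), (4, "Olig2")]
mixed = [(1, "Nkx2.2"), (3, "Nkx2.2"), (2, "Olig2"), (4, "Olig2")]
print(boundary_width(sharp), boundary_width(mixed))
```

A perfectly sorted boundary gives a width of zero, while intermixed cells give a positive width that grows with the extent of intermixing.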
We hypothesised that the decreased precision of the Nkx2.2 boundary observed in the Pax6−/− could be explained by intrinsic noise in gene expression in the GRN. Previously, we established a model of the GRN, based on coupled Ordinary Differential Equations (ODEs), that replicated the response of the network to Shh signalling and the shifts in boundary position in mutant embryos, including Pax6−/− [Balaskas et al., 2012, Cohen et al., 2014]. However, the deterministic description of gene expression in this model meant that it did not capture any stochastic effects. In order to explore the effect of gene expression noise, we constructed a stochastic differential equation (SDE) model that retained the parameters of the ODE model but incorporated a description of intrinsic gene expression fluctuations, based on experimental measurements (Supplemental Section B). In simulations of the Pax6−/−, not only was the limit of Nkx2.2 displaced dorsally, as in the ODE simulations, but the precision of the boundary was also decreased (Fig. 1F,G,H). Encouraged by these results, we tested the effect of eliminating other components of the network. These simulations suggested that removal of Nkx2.2 or Olig2 had the expected effect on the positions of gene expression boundaries without a pronounced effect on boundary precision (Supplemental Section B). This agrees with experimental observations [Briscoe and Ericson, 1999, Novitch et al., 2001, Balaskas et al., 2012]. Thus, inclusion of intrinsic noise in the GRN dynamics accurately predicted the known alterations in the precision of gene expression boundaries.
To understand the mechanism by which Pax6 contributes to the precision of the p3-pMN boundary we explored the dynamical properties of the SDE model. The model did not predict any difference in the magnitude of the fluctuations in the expression of individual genes between the WT and the Pax6 mutant (Supplemental Section B). Consistent with this, experimental measurements of the coefficient of variation (CV) of Olig2 from WT (CV: 0.42) and Pax6−/− (CV: 0.44) embryos did not reveal significant differences (Mann-Whitney p=0.422). This raised the possibility that, rather than the size of fluctuations in individual genes, the change in precision was a consequence of the dynamical landscape specified by the regulatory interactions of the network. The cross-repressive interactions between Nkx2.2, Pax6 and Olig2 predict a bistable regime between the two steady states of Nkx2.2 (p3) and Olig2/Pax6 (pMN) (Fig. 1I). In the absence of noise, the transition between the two steady states is determined solely by the level of Shh signalling, and the system remains in the pMN state until the level of signalling increases above the bistable region. However, in the presence of intrinsic noise, fluctuations in gene expression can result in spontaneous transitions between pMN and p3 identity within the bistable region [Perez-Carrasco et al., 2016]. Below a threshold of Shh signalling, the rate of transitions is very low and cells remain in the pMN state. Conversely, above a certain level of Shh signalling, transitions from the pMN to the p3 steady state take place so rapidly that essentially all cells undergo this transition and become Nkx2.2 positive. In between these two regimes, a region of heterogeneity is observed in which there is an intermediate probability for each cell of a spontaneous transition on a developmentally relevant timescale (≤50 hours).
The size of this region of heterogeneity depends on how the probability of a noise induced transition changes in response to alterations in the level of Shh signalling. We calculated the characteristic time it would take for transitions between the pMN and p3 states at different dorsal-ventral positions of the neural tube. We termed this “fate jump time”. For WT, fate jump time changes rapidly in response to Shh signalling, implying that there is only a limited region where the effective probability of transitions is not 0 or 1 (Fig. 1J; black line). By contrast, the larger region of heterogeneity observed in the Pax6−/− mutant is due to the weaker dependence of fate jump time on levels of Shh signalling (Fig. 1J; blue line). There is a larger range of Shh levels for which noise driven transitions are possible and therefore a larger boundary region where cells in both the Nkx2.2 and Olig2/Pax6 states coexist.
To investigate why there are differences in the rate at which fate jump time changes with position, we analysed the gene expression dynamics during a transition between the two steady states. Transitions between states involve the system passing through, or very close to, a point in gene expression space - the saddle point in the dynamical landscape - that is characterised by specific levels of the transcription factors (TFs); we refer to this as the “transition point” (Fig. 1K; purple point). Simulations of the SDE model indicated that intrinsic fluctuations around the pMN state are initially directed away from the transition point in WT. By contrast, in the Pax6 mutant fluctuations are oriented directly towards the transition point. To characterise this rigorously, we calculated the minimum action path (MAP) between the pMN and p3 steady states: this predicts the most likely gene expression trajectory that a stochastic transition resulting from a small fluctuation in gene expression will take, thereby providing a portrait of the dynamical landscape that leads to a noise induced transition (Fig. 1K & Supplemental Section B) [Perez-Carrasco et al., 2016, Kleinert, 1990, Bunin et al., 2012]. Consistent with the SDE simulations, in WT, the MAP from the pMN (Olig2/Pax6) to p3 steady state does not follow the shortest route leading to the transition point; instead, the levels of Pax6 drop rapidly and pitch away from the transition point, resulting in a curvature of the gene expression path between steady states (Fig. 1K). By contrast, in the absence of Pax6, the MAP is directly oriented towards the transition point (Fig. 1K). Taken together, the analysis suggests that the GRN affects the precision of a domain boundary by determining the dynamical landscape, without changing the level of noise in overall gene expression.
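The link between the steepness of the fate jump time profile and the width of the heterogeneous region can be illustrated with a toy calculation. The sketch below assumes the standard large-deviation scaling, in which the fate jump time grows as exp(Ω × barrier), together with a hypothetical barrier that increases linearly with distance from the Shh source. The function names, the 5% and 95% thresholds, the prefactor `tau0` and all numerical values are illustrative assumptions, not quantities taken from the model in this study.

```python
import math


def transition_probability(barrier, omega, horizon=50.0, tau0=1.0):
    """Probability of at least one noise-induced jump within `horizon` hours."""
    # Assumed large-deviation scaling of the fate jump time.
    fate_jump_time = tau0 * math.exp(omega * barrier)
    return 1.0 - math.exp(-horizon / fate_jump_time)


def heterogeneous_width(barrier_slope, omega, lo=0.05, hi=0.95, n=2001):
    """Extent of positions in [0, 1] where the jump probability is intermediate.

    Position 0 is closest to the Shh source; the barrier is assumed to grow
    linearly with position (hypothetical profile).
    """
    positions = [i / (n - 1) for i in range(n)]
    mixed = [p for p in positions
             if lo < transition_probability(barrier_slope * p, omega) < hi]
    return max(mixed) - min(mixed) if mixed else 0.0


steep = heterogeneous_width(barrier_slope=0.10, omega=250)
shallow = heterogeneous_width(barrier_slope=0.05, omega=250)
print(f"steep barrier profile:   width = {steep:.3f}")
print(f"shallow barrier profile: width = {shallow:.3f}")
```

In this toy setting, halving the rate at which the barrier changes with position roughly doubles the width of the region in which both fates occur, which mirrors the qualitative difference between the WT and mutant profiles described above.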
To identify alternative genetic perturbations that might affect the precision of patterning, we turned our attention to the cis-regulatory elements controlling the TFs. Several predicted regulatory regions are located in the vicinity of Olig2; these include a prominent candidate region 33kb upstream of the Olig2 gene [Oosterveen et al., 2012, Peterson et al., 2012], which we termed O2e33. This region binds (i) the repressor Nkx2.2, (ii) Sox2, which activates Olig2, and (iii) the Gli proteins, the transcriptional effectors of the Shh pathway (Fig. 2A) [Oosterveen et al., 2012, Peterson et al., 2012, Nishi et al., 2015, Kutejova et al., 2016]. To test the role of O2e33 in the network we first analysed its function in vitro in neural progenitors differentiated from mouse embryonic stem (ES) cells [Gouti et al., 2014]. Using CRISPR/Cas9, the ~1kb enhancer region was excised in ES cells harbouring an Olig2 fluorescent reporter [Sagner et al., 2018]. Unlike WT cells, which express high levels of Olig2 at Day 5 of differentiation [Gouti et al., 2014, Sagner et al., 2018], cells lacking the O2e33 enhancer had a marked reduction in levels of Olig2. By Day 6, Olig2 expression had increased in O2e33 mutant cells, but the percentage of cells and the level of expression never reached that of WT (Fig. 2B,C).
We used the observed decrease in the levels of Olig2 and the delay in its induction to identify changes in model parameters that mimic the effect of deleting the O2e33 enhancer (Supplemental Section D). Of the parameter sets that gave reduced and delayed Olig2 induction in silico, most predicted the generation of a smaller pMN domain, resulting from a ventral shift in the dorsal boundary. Strikingly, many of the parameter sets also predicted a loss of boundary sharpness (Supplemental Section D). To test whether deletion of the O2e33 region resulted in these changes in vivo, we generated a mouse line lacking the Olig2 enhancer (see Methods). Assaying the neural tube of embryos from these mice revealed lower Olig2 expression levels in pMN cells and a delay in the induction of Olig2 in O2e33−/− embryos compared to WT, in agreement with the in vitro results (Fig. S2). Moreover, as predicted by the in silico analysis, the pMN domain was decreased in size in O2e33−/− embryos, with its dorsal limit of expression noticeably more ventrally positioned, compared to WT. In addition, the boundary between the pMN and p3 domain was less precise than in WT (Fig. 2D). Quantification confirmed the decreased size of the pMN domain and loss of precision of the p3-pMN boundary (Fig. 2E,F). The decrease in the precision of the boundary, despite continued expression of Olig2 and Pax6 in pMN cells, suggests that secondary correction mechanisms do not play a major role in determining the precision of the boundary between these two domains.
Using the in vivo observations we further limited the parameter space of the dynamical model by restricting our analysis to those parameter sets that generate an imprecise boundary and alter the position of the pMN-p2 boundary (Supplemental Section D). This produced simulations in which the loss of boundary precision in the O2e33−/− is not as severe as the Pax6−/− phenotype, consistent with the experimental data (Fig. 2G). Boundary width and pMN domain size from simulations were consistent with the in vivo analysis (Fig. 2H & I). Calculating the pMN to p3 fate jump times revealed that for the O2e33−/−, fate jump times changed more slowly than for WT (Fig. 1J), in line with the decreased boundary precision of O2e33−/−. Analysis in vivo of the magnitude of the combined fluctuations in Pax6 and Olig2 indicated that it was similar in WT and O2e33−/− (Fig. 2J; Supp. F.7). Consistent with this, the combined magnitude of fluctuations of Pax6 and Olig2 in simulations was similar in WT and O2e33 mutants, suggesting the decreased precision was not the result of an increase in the overall size of the noise driven fluctuations (Fig. 2K & S5). However, the simulations predicted that variability in Olig2 should increase while the variability of Pax6 should decrease. In agreement with this prediction, the coefficients of variation of Olig2 and Pax6 gene expression in O2e33 mutants in vivo were increased and decreased, respectively, relative to WT (Fig. 2L).
To investigate why this led to a decrease in boundary precision, we analysed the MAP of the O2e33−/− at a fixed neural tube position. The model indicated that the transition path from pMN to p3 curved away from the shortest path to a lesser extent than for the WT; stochastic simulations further confirmed this behaviour (Fig. 3A,B). Thus, in the absence of the O2e33 enhancer, stochastic fluctuations around the Olig2/Pax6 steady state tended to take the system closer to the transition point than similar magnitude fluctuations in WT, making a noise driven switch in fate more likely in the mutant. Nevertheless, the curvature in the path in O2e33 was greater than in the Pax6−/−, providing an explanation for the greater imprecision in Pax6−/− compared to the O2e33 mutant (Fig. 3B cf. Fig. 1J & K). We calculated the action along the path for each genotype [de la Cruz et al., 2018] (Fig. 3C & Supplemental Section B). This represents the effective energy required to reach a point along the MAP and is a measure of the extent of the barrier that has to be overcome for a fate transition. Consistent with the results of the simulations, the effective energy necessary for a noise induced transition was greatest for WT, less for O2e33, and lowest for the Pax6 mutant. Moreover, the analysis indicated that the initial part of the trajectory presented a more significant barrier to noise induced transitions in the WT than in the O2e33 and Pax6 mutants (Fig. S3A), corresponding to the relative divergences of their transition trajectories from the shortest route to the transition point.
The model predicted that the deletion of O2e33 alters the relative expression levels of Olig2 and Pax6 in individual cells, resulting in cells close to the pMN-p3 boundary expressing higher levels of Pax6 and lower levels of Olig2 in O2e33 mutants than in WT (Fig. 3D, E). We used single cell immunofluorescence quantifications to compare cells in the boundary region of WT and O2e33 embryos (Fig. 3F, G & Supplemental Section F.7). Consistent with the predictions, this revealed higher levels of Pax6 and lower levels of Olig2 in O2e33 mutants compared to WT. Thus the experimental evidence supports the idea that the strength of regulatory interactions encoded in the GRN contributes to the precision of domain boundaries by configuring the dynamics of stochastic fluctuations to reduce the probability of a noise driven change in cell identity.
The finding that the dynamics produced by the regulatory interactions between Pax6-Olig2-Nkx2.2 influence the pMN-p3 boundary precision prompted us to ask whether this is the only way in which precision can be enhanced by the GRN or whether other mechanisms can increase the fidelity of gene expression boundaries. We performed a computational screen to identify three node networks capable of generating a sharp boundary in response to a graded input (Fig. 4A & Supplemental Section E). For each network recovered from the screen, we compared the boundary precision with the extent to which its MAP deviates from the shortest path to the transition point (we informally refer to this quantity as “curvature”) (Supplemental Section E). This showed a positive correlation between curvature and boundary precision (high curvature, low boundary width), consistent with our observations in the WT network. This supported the idea that the shape of the transition pathway contributes to boundary precision (Fig. 4C). Nevertheless, within the screen, for any given level of boundary sharpness, there was a range of MAP curvature values. We therefore investigated additional features that might affect boundary precision. We found that a subset of the networks did not rely on path curvature to achieve precision and instead functioned effectively as two node networks (Fig. 4D). For these networks, the major contributor to boundary precision was the rate at which the steady state and transition point separated in response to changes in the level of the input signal: the higher the rate of separation, the sharper the boundary (Fig. 4B). We termed this “separation speed”. Plotting both curvature and separation speed for the networks recovered from the screen indicated that both features contribute to precision (Fig. 4E). Moreover, the most precise boundaries were generated by networks that exploited both separation speed and curvature, which includes the Pax6-Olig2-Nkx2.2 network (Fig. 4E-F).
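The two quantities introduced above can be made concrete with a small sketch. The precise definitions used in the screen are given in Supplemental Section E and may differ; here "curvature" is taken, as an assumption, to be the largest perpendicular deviation of the path from the straight line joining the steady state to the transition point, normalised by the length of that line, and "separation speed" is taken to be a finite-difference rate of change of the distance between steady state and transition point with respect to the input signal. All function names and example values are hypothetical.

```python
import math


def path_curvature(path):
    """Largest deviation of `path` from its chord, relative to chord length.

    `path` is a list of points (tuples of TF levels) running from the
    steady state (first point) to the transition point (last point).
    """
    start, end = path[0], path[-1]
    chord = [e - s for s, e in zip(start, end)]
    chord_len = math.sqrt(sum(c * c for c in chord))
    if chord_len == 0.0:
        raise ValueError("steady state and transition point coincide")
    worst = 0.0
    for point in path:
        rel = [p - s for s, p in zip(start, point)]
        # Projection coefficient of the point onto the chord.
        t = sum(r * c for r, c in zip(rel, chord)) / chord_len ** 2
        perp = [r - t * c for r, c in zip(rel, chord)]
        worst = max(worst, math.sqrt(sum(q * q for q in perp)))
    return worst / chord_len


def separation_speed(state_a, saddle_a, state_b, saddle_b, signal_a, signal_b):
    """Finite-difference change in state-to-saddle distance per unit signal."""
    d_a = math.dist(state_a, saddle_a)
    d_b = math.dist(state_b, saddle_b)
    return abs(d_b - d_a) / abs(signal_b - signal_a)


straight = [(0.0, 0.0), (0.5, 0.5), (1.0, 1.0)]
bent = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
print(path_curvature(straight), path_curvature(bent))
```

A path that heads directly for the transition point scores zero, whereas a path that first moves away along one axis scores higher, in the spirit of the contrast drawn between the Pax6 mutant and WT paths.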
An important corollary, for networks in which separation speed dominates, is that within the region in which gene 2 (x2, analogous to Olig2 in WT) is expressed, the level of its expression changes markedly. This produces a graded expression domain rather than a uniform domain and precludes the generation of blocks of progenitors with constant levels of gene expression (Fig. S10). By contrast, networks that rely on curvature to achieve a sharp boundary allow uniform gene expression levels within the gene expression domain (Fig. S10).
Finally, we assessed whether particular network topologies favoured boundary sharpness. Many topologies were able to generate sharp boundaries (Fig. 4G,H & Supplemental Section E), consistent with the expression dynamics produced by the network being key to determining behaviour. Nevertheless, four topologies appeared to be most effective at preventing very imprecise boundaries (numbered 1-4 in Fig. 4H). These tended to have similar separation speeds but much higher curvature than the networks with other topologies (Fig. S12). Crucial for this behaviour was the inhibition of gene 3 by gene 2, and the absence of repression of gene 2 by gene 3 (Fig. 4G & S11). This regulatory configuration generates curvature by allowing a steep decrease in x3, the concentration of gene 3, while sustaining high levels of x2 prior to the transition. Notably, the WT neural tube network conforms to this configuration, raising the possibility that it was adopted in the developing vertebrate neural tube for its capacity to generate precise patterns in the presence of intrinsic noise. Moreover, the graded expression within the domain associated with the use of separation speed (Fig. S12) is represented by Pax6 (x3) and this allows constant levels of Olig2 (x2; the gene necessary for defining neuronal subtype identity) within the progenitor domain. Hence, an understanding of the dynamical properties of the GRN offers an explanation for its structure and the resulting gene expression behaviour.
Taken together, our studies provide insight into the collective cell decision making processes that result in the generation of precise domains of gene expression in developing tissues. The data reveal that the effects of stochastic gene expression on spatial heterogeneity can be attenuated by the dynamics of the GRN. These mechanisms do not rely on suppressing the stochasticity of individual genes, or on cell-to-cell communication, but instead take advantage of dynamical properties of regulatory networks to increase the fidelity of decision making. This “precision by design” highlights the capacity of transcriptional circuits to contribute to robust tissue patterning and identifies a mechanism that might be exploited in other biological settings requiring precise responses from groups of cells.
Competing Interests
The authors declare no competing or financial interests.
Author Contributions
KE, EHD, PS & JB conceived the project, interpreted the data and wrote the manuscript with input from all authors. KE performed all experiments except those listed under other authors. EHD performed theoretical modelling and data analysis. LGP performed the protein copy number quantifications. RPC contributed to building the mathematical models and provided advice. AS generated the Olig2-T2A-mKate2 ES cell derived neural progenitors. VM analysed the ATAC-seq data and supervised experimental work. PS & JB supervised the project.
B Formulation and analysis of stochastic GRN dynamics
Formulation of stochastic dynamics
In order to investigate heterogeneity of gene expression in the neural tube we make use of stochastic differential equations that describe the GRN and the concentration xj of each TF j. We start with a thermodynamic-like model as detailed in [Cohen et al., 2014], which captures the macroscopic behaviour by a system of ODEs; these contain terms for production and decay of each TF. The ODE description corresponds to the limit of a reaction volume Ω that is large enough for the copy numbers Ωxj of all protein species to be large, allowing fluctuations to be neglected; formally one takes Ω → ∞. When Ω is finite, stochastic effects occur; these can be described by the chemical Langevin equation, a system of SDEs, see e.g. [Van Kampen, 1992, Gillespie, 2000]. The drift, i.e. the systematic variation with time in the SDEs, coincides directly with the deterministic limit. The diffusion (stochastic) term arises from the stochastic nature of the individual protein production and decay reactions; it is a Gaussian white noise [Gillespie, 2000] whose covariance structure is determined by the mean reaction rates. In our case the chemical Langevin equation for the protein levels xj within the GRN takes the form:
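As a hedged reconstruction rather than the original typesetting, a chemical Langevin system consistent with the symbol definitions in this section, and with the standard form for independent production and decay reactions, would read as follows. The normalised conformation probability p built from the weights w is an assumption based on the thermodynamic-like description, and the equation labels (B.1a) to (B.1c) are deliberately not assigned here.

```latex
\begin{align}
\frac{dx_j}{dt} &= \sum_{n} p_{(j,n)}\,\alpha_{(j,n)} - \beta_j x_j + \epsilon_j(t), \\
\langle \epsilon_j(t)\,\epsilon_k(t') \rangle &=
  \frac{1}{\Omega}\,\delta_{jk}\,\delta(t-t')
  \Big( \sum_{n} p_{(j,n)}\,\alpha_{(j,n)} + \beta_j x_j \Big), \\
p_{(j,n)} &= \frac{w_{(j,n)} \prod_i x_i^{n_i}}
                 {\sum_{n'} w_{(j,n')} \prod_i x_i^{n'_i}} .
\end{align}
```

The first line is the deterministic drift plus noise, the second states that the noise on different proteins is uncorrelated and white in time with a strength set by the sum of production and decay rates divided by Ω, and the third expresses each conformation probability as its weight times the bound TF concentrations, normalised over all conformations of gene j.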
The deterministic part of these equations is equivalent to those used in [Cohen et al., 2014]. The covariance (B.1b,B.1c) of the zero mean Gaussian white noise ϵj(t) arises from the decay and production of each protein being independent and random, given the concentration of the regulators of the relevant gene. In the equations above, α represents protein production rate and β degradation rate, while the w provide the weights of the respective DNA conformations (j, n) when multiplied by the respective concentration. The conformations are labelled by the protein j being produced and the numbers n = {ni} of TF molecules bound. The δ in (B.1b) and (B.1c) are the Kronecker and Dirac delta respectively. As explained above, Ω is the volume of the system in which all reactions take place.
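The structure of the chemical Langevin equation can be illustrated with a minimal Euler-Maruyama sketch for a single protein. It assumes a constant production rate `alpha` and first-order decay `beta * x` in place of the full regulated network, and the function name, step size and parameter values are illustrative choices rather than those of the study. The noise variance per unit time is (production + decay)/Ω, as expected when production and decay are independent random events.

```python
import math
import random


def simulate_cle(alpha, beta, omega, x0=0.0, dt=0.01, steps=2000, seed=0):
    """Euler-Maruyama integration of a one-protein chemical Langevin equation.

    dx = (alpha - beta*x) dt + sqrt((alpha + beta*x) / omega) dW
    (Toy model: constant production, linear decay; not the full GRN.)
    """
    rng = random.Random(seed)
    x = x0
    trajectory = [x]
    for _ in range(steps):
        production = alpha
        decay = beta * x
        drift = production - decay
        noise_variance = (production + decay) / omega
        x += drift * dt + math.sqrt(noise_variance * dt) * rng.gauss(0.0, 1.0)
        x = max(x, 0.0)  # concentrations cannot be negative
        trajectory.append(x)
    return trajectory


# Larger omega gives trajectories closer to the deterministic limit alpha/beta.
for omega in (20, 250, 10000):
    final = simulate_cle(alpha=1.0, beta=1.0, omega=omega)[-1]
    print(f"omega = {omega:>5}: x(t_end) = {final:.3f}")
```

As Ω increases, the trajectory approaches the deterministic steady state alpha/beta, which is the Ω → ∞ limit described in the text.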
When looking at the chemical Langevin equation (B.1a), one notices that the rate ∑np(j,n)α(j,n) for producing protein j, has a nonlinear dependence on the TF concentrations xi. One might be concerned that with such a nonlinear dependence, modelling production of protein j as a single reaction is too simplistic. However, (B.1a) can be obtained from a larger system of simple unary and binary mass action reactions, in which the concentration of each DNA conformation is tracked individually. We only sketch this construction here and explain its implications for the stochastic terms in (B.1a); for further details see [Herrera-Delgado et al., 2018]. The deterministic part of the time evolution of the DNA concentrations is given as follows:
Here the DNA variable tracks the concentration of each DNA conformation and is scaled down by a large factor γ′ to account for the low quantity of binding sites in relation to protein numbers. Correspondingly, the protein production rate constants have to be large in order to nonetheless give an appreciable overall rate of protein production.
To derive the correct stochastic equations for the protein species, the large γ-limit of (B.2) is taken: the concentration of each DNA conformation then changes sufficiently quickly that it constantly tracks the instantaneous protein concentrations. For appropriately chosen binding and unbinding rate constants, this leads back to the thermodynamic-like form of the deterministic part of the protein dynamics in (B.1a) [Herrera-Delgado et al., 2018]. As shown in [Thomas et al., 2012], the existence of fast species (in our case, DNA conformations) can lead to additional terms arising in the noise acting on the slow species (protein production), as a consequence of reactions between slow and fast species. In our case it turns out that these extra noise terms scale with γ′/γ. We then make use of the biological meaning of the terms: 1/γ represents the timescale of reaction rates for TF binding to DNA and 1/γ′ represents the characteristic time for the process of going from active DNA to producing a protein. We find it biologically reasonable to choose a 1/γ that is substantially smaller than 1/γ′, given the many biological processes necessary for the production of a fully functional protein. The ratio γ′/γ is hence small, so that the additional noise terms that arise from the general calculation in [Thomas et al., 2012] become negligible, leaving exactly the noise terms in (B.1c). The intuition is that because protein production is slow compared to binding and unbinding of factors to DNA, noise from the many binding and unbinding events during production averages out; the overall noise then arises only from the stochasticity of the production processes, at the relevant average DNA concentrations. We note that, in accordance with this conclusion, explicit calculations show that when γ′ is of the order of γ, or larger, additional noise terms from the stochasticity in DNA concentrations do enter the dynamics of the protein concentrations.
Moreover, these additional terms are dependent on the precise choices of binding and unbinding rates, which are only partially constrained by the requirement that the thermodynamic-like deterministic equations (B.1a) are retrieved for large γ [Herrera-Delgado et al., 2018].
Amount of noise
The above model is a coarse-grained description that does not explicitly describe the many possible sources of noise within a living cell. These include spatial heterogeneity and effects from the bursty, multi-step nature of protein production, which includes processes such as transcription, translation, post-translational modification, protein folding and protein shuttling [McAdams and Arkin, 1997].
The noise level in our model is set by Ω−1, the inverse reaction volume. This determines the scale of the stochastic fluctuations in protein production and decay, both of which the model represents as single step processes. A larger Ω thus leads to smaller stochastic effects. In equation (B.1a), multiplying Ω by the concentration of a protein species gives the total number of molecules for that protein. In our calculations we measure volumes in units that make typical protein concentrations of order unity, so that Ω can be directly interpreted as a copy number. In accordance with our observations in (Supp. C), a value for Ω can be read as a copy number for Pax6, Nkx2.2 and Irx3; the corresponding typical copy numbers for Olig2 are ten times higher (Supp. C).
We estimate a lower bound on the noise level Ω−1, i.e. the lowest amount of noise that makes sense within our description. It is given by the typical number of proteins of each species in a cell: these numbers determine the minimum amount of noise that must arise from the stochastic nature of protein production and decay. From protein quantifications (Supp. C) we obtain Ωmax ~ 10, 000 for the protein counts of Nkx2.2 and Pax6 per cell at saturation levels (which in our model correspond to concentrations close to unity). Olig2 has a higher estimated count of ~100,000 and in accordance a 10 times higher concentration in the model (the maximum concentration for Olig2 is 10, and 1 for the other TFs). Because of the many neglected sources of additional noise, we expect 1/Ωmax to be a considerable underestimate; indeed, simulations with this noise level show almost deterministic behaviour. However for a slightly increased noise level (Ω = 2000), we find that the relationships between jump-rate differences across WT and mutant phenotypes discussed in the main text hold already (see Fig. S3). This means that the WT presents a small amount of heterogeneity (as observed in vivo) and the mutants have a more heterogeneous boundary than the WT.
To obtain a lower bound for Ω, we measure the coefficient of variation at steady state for all 3 TF values across embryos, to estimate the total amount of noise in the system (Fig. 1A). We then decrease Ω in our numerical simulations until we see coefficients of variation similar to those observed in vivo, giving Ωmin = 20. This assumes that all observed differences in protein levels arise solely from the stochasticity in our model. We reason that there are other sources of noise that make the coefficients of variation higher in vivo, such as protein transport within the cell, antibody specificity and measurement error, so that the amount of noise contributed by the stochasticity in our dynamical model will be smaller than 1/Ωmin = 1/20. On that basis we find a reasonable smallest value of Ω of ~100. The value we use for all results throughout this study is Ω = 250, which is within the broad bounds of Ωmin = 20 and Ωmax = 20,000. Importantly, the results we observe remain qualitatively unchanged across the entire range of Ω that we assess as reasonable, 100 ≤ Ω ≤ 2000 (Fig. S3).
Minimum action path
Much of the theoretical analysis in the main text concentrates on the stochastic transitions between fixed points of the deterministic GRN dynamics, which are long-lived metastable states of the stochastic dynamics. The minimum action path (MAP) is the most likely path the system takes in such a transition (for large enough values of Ω), from a steady state to a transition point (which is the saddle point of the dynamical system) and then onwards to a new steady state. The second piece of the path always follows the deterministic dynamics and has a negligible effect on the transition times, so we focus on the first part of the path.
The negative log probability for any path is proportional to what is called the action, which for our Langevin dynamics is of so-called Onsager-Machlup form [Kleinert, 1990]. The action is an integral over time of the Lagrangian, which in turn depends only on the current state (vector of concentrations) and velocity of the system. The time integral can be discretised and the action then minimised as described in e.g. [Bunin et al., 2012]. We analyse the resulting MAP in gene expression space in order to understand how its shape affects the jump times between steady states and thus eventually the boundary width.
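A minimal sketch of the time-discretised action is given here, assuming the leading-order form in which the Lagrangian is half the squared deviation of the velocity from the deterministic drift, weighted by the inverse noise covariance. The diagonal noise covariance, the left-point evaluation of the drift and all function names are assumptions made for illustration; prefactor conventions and any correction terms are not taken from the source.

```python
def discretised_action(path, dt, drift, noise_var):
    """Sum over path segments of 0.5 * sum_i (dx_i/dt - f_i)^2 / D_ii * dt.

    path: list of states (lists of floats) at equally spaced times.
    drift(x): deterministic velocity f(x) as a list.
    noise_var(x): diagonal entries D_ii(x) of the noise covariance (assumed diagonal).
    """
    action = 0.0
    for x0, x1 in zip(path[:-1], path[1:]):
        f = drift(x0)        # drift evaluated at the start of the segment
        d = noise_var(x0)
        for i in range(len(x0)):
            velocity = (x1[i] - x0[i]) / dt
            action += 0.5 * (velocity - f[i]) ** 2 / d[i] * dt
    return action

# Toy one-dimensional example (hypothetical): decay towards zero with unit noise.
dt = 0.1
drift = lambda x: [-x[0]]
noise_var = lambda x: [1.0]
downhill = [[1.0]]
for _ in range(10):
    downhill.append([downhill[-1][0] * (1.0 - dt)])  # follows the deterministic dynamics
uphill = downhill[::-1]                               # same points, traversed against the drift
```

A path that follows the deterministic dynamics has (numerically) zero action, which mirrors the statement that the second piece of the MAP does not contribute to the transition time; the reversed path has a strictly positive action. In a MAP calculation this function would be the objective handed to a numerical minimiser over the interior points of the path.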
The typical time the system takes to reach any point on the MAP scales exponentially with the action up to that point, hence this quantity can be interpreted as an effective energy, within the analogy of a particle making a transition from one local minimum in an energy landscape across a barrier to another minimum. In Fig. 3C we plot this effective energy along the (relative) length of the MAP, describing the effective energy landscape governing the transition. Fig. S4 shows an alternative representation that gives further insight: we plot the derivative of the action along the path, which is the effective force pushing the system back towards the initial steady state.
Calculating magnitude of fluctuations
To compare the magnitude of fluctuations between WT and mutants in silico we take two separate approaches. The first is to consider fluctuations in expression levels around a steady state, before any transition to a new state occurs. For moderate noise levels such fluctuations can be analysed using a linear expansion of the dynamics around the steady state (here: pMN), leading to a local Gaussian distribution of expression levels. The corresponding covariance matrix C can be calculated from the Jacobian matrix J of the linearized dynamics and the noise covariance D as defined in (B.1b), both evaluated at the steady state. The required link between the three matrices is the Lyapunov equation, which determines C via

$$J C + C J^{\mathsf{T}} + D = 0.$$
Once C has been found we normalise it by the corresponding pMN steady state values (X), to obtain a normalised covariance matrix. We finally take the trace of this normalised matrix and take the square root. The end result is the typical standard deviation (root-mean-square fluctuation) of the expression levels, relative to the mean expression levels, which is shown in Fig. S5A as a function of neural tube position.
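The calculation described above can be sketched numerically as follows. The sketch assumes the standard continuous-time Lyapunov form J C + C Jᵀ + D = 0, with any 1/Ω scaling absorbed into D, and it normalises each entry C_ij by X_i X_j so that the trace is the sum of C_ii / X_i². The function names and the toy matrices are hypothetical and are not the neural tube parameters.

```python
import numpy as np

def stationary_covariance(J, D):
    """Solve the Lyapunov equation J C + C J^T + D = 0 for C."""
    n = J.shape[0]
    identity = np.eye(n)
    # For row-major flattening: vec(J C) = (J kron I) vec(C) and
    # vec(C J^T) = (I kron J) vec(C).
    operator = np.kron(J, identity) + np.kron(identity, J)
    return np.linalg.solve(operator, -D.reshape(-1)).reshape(n, n)

def relative_rms_fluctuation(J, D, X):
    """Square root of the trace of the covariance normalised by steady-state levels X."""
    C = stationary_covariance(J, D)
    return float(np.sqrt(np.sum(np.diag(C) / X**2)))

# Toy two-gene example (hypothetical values).
J = np.array([[-1.0, 0.5], [0.0, -2.0]])  # stable Jacobian at the steady state
D = np.diag([0.02, 0.08])                  # noise covariance at the steady state
X = np.array([1.0, 2.0])                   # steady-state expression levels
C = stationary_covariance(J, D)
```

The Kronecker-product solve is adequate for the small matrices of a three- or four-gene network; a dedicated Lyapunov solver would be the usual choice for larger systems.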
The second approach to quantifying noise levels is to use the noise variance, which is the trace of the noise covariance matrix given in (B.1b). This noise variance depends on the expression levels so we measure it at equidistant points along the MAP and take the square root of this value to obtain the root-mean-square noise level. Example results at a specific position along the neural tube are shown in Fig. S5B; results at other positions were qualitatively the same (data not shown). Both approaches to quantifying noise show comparable total variance across the different genotypes, with slightly lower noise in Pax6−/− than in WT and O2e33. To make the comparison to in vivo observations we accounted for the fact that experimentally, noise levels are averaged across several neural tube positions throughout the pMN domain. We therefore also performed an average in silico of neural tube positions to obtain comparable data for Fig. 2K.
D Simulating WT and mutant GRNs
We used the equations and parameters described in [Cohen et al., 2014] for the GRN that patterns the neural tube; this parameter set was optimised to replicate the boundary positions in wild-type and mutant embryos. Following the inclusion of the noise term as explained in Supp. B we explored the effect of the initial conditions for the TFs (i.e. their initial expression levels xj). The aim was to find a consistent set of initial conditions that sustain the boundary positions but also recapitulate the boundary sharpness of each mutant. The initial conditions that satisfied these conditions were identified in a systematic scan as xPax6 = 0.1, xOlig2 = 0, xNkx2.2 = 0, xIrx3 = 0.1. The p3-pMN boundaries in WT, Irx3−/−, Nkx2.2−/− and Olig2−/− simulations remained sharp, as is the case in vivo (Fig. S7). Only the loss of Pax6 resulted in decreased boundary sharpness. Boundary positions remained consistent with in vivo observations, as was the case in the original deterministic model (Fig. S7) [Cohen et al., 2014].
Model parameters
We detail the parameters used throughout the paper to model neural tube development for equation (B.1a), and adapted for the computational screen as explained in Supp. E.
Where factors of 10 have been written in the table, these arise because we have modified the model of [Cohen et al., 2014] to represent explicitly the experimental observation that Olig2 has a concentration 10 times higher than the other TFs. While this difference is immaterial for a deterministic description of the GRN dynamics, it affects the stochastic representation because larger copy numbers have smaller relative fluctuations.
The above parameters are used in the general model (B.1a) for the dynamics of the TFs j = P (Pax6), O (Olig2), N (Nkx2.2) and I (Irx3). DNA conformations are defined by the numbers n = (np, nin, nP, nO, nN, nI) of bound molecules of polymerase, Gli signal input, Pax6, Olig2, Nkx2.2, Irx3 in that order. The only allowed conformations are the empty conformation, the conformations with polymerase and nin = 0 or 1 signal molecule bound; and conformations with at least one molecule of the other TFs bound, with maximally two molecules from each other TF. All other conformations are assigned affinity zero. The weights for the allowed conformations are multiplicative, with bound polymerase contributing a factor wj,p (see below), bound signal a factor kj,inxin and each TF i bound to DNA producing TF j a factor kjixi. Examples of the corresponding affinities are kO,(0,0,0,0,1,0) = kON and kO,(0,0,0,0,0,2) = kOI2. The polymerase binding parameters are directly stated as the weights wj,p = kj,pxp including polymerase concentration (which is assumed constant). As detailed in [Cohen et al., 2014], this weight describes all basal production inputs for each TF and thus represents input from TFs such as Sox2. Finally, the protein production rates αj,n in the general model (B.1a) are set to the value given in the table for the DNA conformations with bound polymerase, and zero otherwise.
Explicitly, the production rate for Olig2 is then written as:
The signal input concentration xin is the gradient exp(−s/0.15), which depends on the dorsal-ventral neural tube position s ranging from 0 to 1 as in [Cohen et al., 2014].
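As a small illustration, the exponential signal gradient can be written as a function of position. The function and constant names are hypothetical; only the decay length 0.15 and the range of s come from the text.

```python
import math

DECAY_LENGTH = 0.15  # decay length of the gradient, in fractional neural tube units

def signal_input(s):
    """Signal input x_in at dorsal-ventral position s in [0, 1]."""
    if not 0.0 <= s <= 1.0:
        raise ValueError("position s must lie between 0 and 1")
    return math.exp(-s / DECAY_LENGTH)

# Signal at the WT boundary positions quoted in this supplement (0.17 and 0.55).
ventral_signal = signal_input(0.17)
dorsal_signal = signal_input(0.55)
```

The signal falls by a factor of e over every 0.15 units of position, so it is considerably weaker at the more dorsal boundary than at the ventral one.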
O2e33 mutant
To find parameter sets that describe the behaviour of the O2e33 enhancer mutation, we first identified those parameters that are related directly to the deletion of the respective enhancer. Analysis of the sequence of the enhancer together with ChIP-seq and ATAC-seq [Oosterveen et al., 2012, Peterson et al., 2012, Kutejova et al., 2016, Metzis et al., 2018] suggested that Gli proteins, Nkx2.2, Irx3, and Sox2 all have a direct effect on this enhancer (Fig. 2A). We therefore considered variations in the parameters that specify Nkx2.2 binding, Irx3 binding, Gli binding and basal production (corresponding to Sox2 binding). We systematically explored how reducing the parameters for each of these interactions, to a fraction f of their original value, could explain the observed phenotype. We used a uniform distribution to perform this search and represent the respective parameter reductions directly in terms of the ratio f between new and original (WT) parameter values.
We first identified parameter sets that could replicate the observed in vitro delay in the onset of Olig2 expression in the mutant, leading to a reduced parameter space (Fig. S8). The delay in Olig2 activation was determined for networks positioned a fraction 0.3 along the neural tube, and we retained those networks that took twice as long to express Olig2 as the WT.
We next investigated what further phenotypical behaviour the retained parameter sets predict, focussing on the domain size and boundary precision generated in response to a graded Shh signal. We found that 68% of the parameter combinations reduced boundary precision, 80% reduced the size of the pMN domain, with 83% presenting one or other of the phenotypes (data not shown). Here, the pMN domain size was calculated with respect to the Shh gradient and we considered it reduced if it was below 70% of the WT size. For determining boundary sharpness, we regarded as imprecise those systems that had a boundary width at least twice the size of the WT; this width is calculated using the SDE system with the thresholds described in Fig. 1J. The fact that a majority of the parameter sets identified affected domain size and boundary precision encouraged us to generate the mouse lines.
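The two phenotype criteria used in this step can be written as a simple classifier. This is a sketch with hypothetical function and argument names; the thresholds (below 70% of the WT domain size, at least twice the WT boundary width) are the ones stated above.

```python
def classify_phenotype(domain_size, boundary_width, wt_domain_size, wt_boundary_width):
    """Flag reduced pMN domain size and reduced boundary precision relative to WT."""
    reduced_domain = domain_size < 0.7 * wt_domain_size
    imprecise_boundary = boundary_width >= 2.0 * wt_boundary_width
    return {
        "reduced_domain": reduced_domain,
        "imprecise_boundary": imprecise_boundary,
        "either": reduced_domain or imprecise_boundary,
    }

# Hypothetical example values, in fractional neural tube units.
example = classify_phenotype(domain_size=0.25, boundary_width=0.05,
                             wt_domain_size=0.38, wt_boundary_width=0.02)
```

Counting the fraction of retained parameter sets for which each flag is true gives percentages of the kind quoted above.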
Once the mouse lines were generated we noted two additional phenotypes to the delay in onset of Olig2, as expected from the initial parameter screen: a loss of precision at the p3-pMN boundary and a ventral shift of the pMN-p2 boundary. We made use of these two additional observations to constrain our parameter space further, thus leading to the parameter distributions shown in Fig. S9. We quantified sharpness as explained above. The targets set for the boundary position were extracted from in vivo data, and were set as: pMN-p3 boundary position in the range [0.17, 0.25] (as the WT boundary position is at 0.17 and the in vivo data show a small dorsal shift in the mutant) and p2-pMN position lower than or equal to 0.5 (the WT boundary is at 0.55, so this means a reduction of the domain size of at least 15% with respect to WT) but higher than the pMN-p3 boundary position, such that the pMN domain does not disappear. From the parameter sets that met both criteria, we finally took a representative point as our model for the O2e33 mutant; as expected this replicates the observed experimental phenotypes.
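The boundary position targets can be expressed as a single acceptance test. The function and argument names are hypothetical; the numerical bounds are those quoted in the text.

```python
def meets_position_targets(p3_pmn_position, pmn_p2_position):
    """Check the in vivo derived targets for the two pMN boundaries.

    p3_pmn_position: ventral (p3-pMN) boundary, required to lie in [0.17, 0.25].
    pmn_p2_position: dorsal (pMN-p2) boundary, required to be at most 0.5 and
    above the ventral boundary so that the pMN domain does not disappear.
    """
    ventral_ok = 0.17 <= p3_pmn_position <= 0.25
    dorsal_ok = p3_pmn_position < pmn_p2_position <= 0.5
    return ventral_ok and dorsal_ok

accepted = meets_position_targets(0.20, 0.45)      # shifted ventral boundary, smaller domain
wt_positions = meets_position_targets(0.17, 0.55)  # WT dorsal boundary lies above 0.5
```

With these targets the WT boundary positions themselves are rejected, because the dorsal boundary at 0.55 exceeds the upper limit of 0.5.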
E Screening three node networks for precision
Defining a functional form
To perform a parameter screen we explored three node networks with all possible interactions between the nodes, as this has provided useful insights in other systems (Fig. 1A) [Cotterell and Sharpe, 2010, Leon et al., 2016]. For the purpose of exploring different dynamics, we enumerated the different possible transcriptional/occupancy states of the promoter to model the production rates of a given protein. These rates depend on polymerase availability, signal input (morphogen) and regulating transcription factors, with concentrations xp, xin and xi respectively. The transcription factors i can be activating or repressing, and we distinguish the set of activating transcription factors from the set of repressing ones. While in the previous model, in its most general form (B.1a), different protein production rates can be used for different DNA conformations, in the neural tube network we used the same production rate for all protein-producing input conformations (see Supp. D). We adopt the same approach here and set the production rate to unity in appropriate units of time; thus the model is specified only by the binding affinities of the various DNA conformations. Without loss of generality we fixed the affinity (and hence the weight) of the unbound conformation to 1 as explained in [Sherman and Cohen, 2012]. We assign the weights of conformations with only one bound molecule as kpxp, kinxin and kixi. In accordance with our previous model (B.1a), we set the following constraints:
All conformations with polymerase and without any repressor produce protein; it does not matter whether signal or any activator are bound.
Conformations that have one or more repressor bound together with either signal, polymerase or any activator are excluded, based on the assumption that these molecules compete for the same binding site.
Binding of signal or any activator enhances binding of polymerase.
No other cooperativity effects are present.
Expressions for conformation states
The only states that can produce protein are those with polymerase bound. For brevity we follow the convention in Supp. D and abbreviate wp = kpxp in the following, taking polymerase levels as constant throughout our dynamics. As specified above, the only states that can bind polymerase are those that have no repressors bound. We assume no cooperativity between signal xin and activators, hence the total weight of states that can potentially bind polymerase (assuming two binding sites per activator but only one for the signal) is:
Given that repressors can only bind by themselves, and that there is no other cooperativity between the inputs, the total weight for conformations with at least one repressor bound while assuming two binding sites per repressor is:
In accordance with biological intuition, polymerase is recruited by activators or signal. The simplest way to implement this is to increase the weight of conformations having both polymerase and at least one activator or signal by a cooperativity factor c, giving a total weight of:
Finally, the weight for the unbound (empty) conformation is taken as 1, as explained above, and for the conformation with one polymerase bound it is wp as defined in (E.1). The total weight, i.e. the denominator of the protein production rate, is then while the numerator is the total weight of conformations with polymerase, either on its own (E.1) or together with activators or signal (E.4), giving overall for the production rate (which with protein production set to unity is also the probability of being in a DNA conformation that produces protein) with the abbreviations
General strong cooperativity limit
It will be convenient in the following to write the effective affinities of signal and activating TFs in combination with polymerase in a form that includes the cooperativity effect from the factor c, i.e. in terms of and for . The protein production rate is then expressed as with now
We can now compare with the analogous expression (D.1) in the neural tube network. There all interactions are repressive so that the set of activating transcription factors is empty and hence ϕ = 1, which simplifies (E.8) to
This agrees with (D.1) except for the middle term in the denominator, which represents the weight of DNA conformations with only signal but no polymerase bound. Its absence in the neural tube network formally corresponds to the strong cooperativity limit c → ∞. In our screen we use a finite cooperativity c = 100 to avoid the extreme case of excluding conformations with only signal bound completely; this value of c is still large enough, however, to replicate the dynamics of the neural tube network. We thus take (E.8) with c = 100 as the form of protein production rates in our screen; compared to the neural tube case this allows us to include both activating and repressive interactions.
Adding a protein decay term (with unit decay rate) and stochastic fluctuations, the dynamics of the three-node networks in our screen, with protein levels x1, x2 and x3, is thus described by (E.11) for j = 1, 2, 3; compared to (E.8) we have dropped all tildes to unclutter the notation. We have also allowed the sets of activating and repressing transcription factors to be determined implicitly by the system parameters. This is done by generalizing the affinities kji so that a positive sign indicates an activation of j by i and a negative sign a repression. The corresponding switching of species i between the products over activators and repressors is achieved mathematically by setting [k]+ = max(k, 0) and [k]− = max(−k, 0).
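The sign convention for the generalised affinities can be illustrated as follows. This sketch only shows how a signed affinity is split into its activating and repressing parts; the function names and the example values are hypothetical.

```python
def positive_part(k):
    """[k]+ = max(k, 0): non-zero only for an activating interaction."""
    return max(k, 0.0)

def negative_part(k):
    """[k]- = max(-k, 0): non-zero only for a repressing interaction."""
    return max(-k, 0.0)

def split_regulators(signed_affinities):
    """Split {regulator: signed k_ji} into activating and repressing affinities."""
    activating = {i: positive_part(k) for i, k in signed_affinities.items()}
    repressing = {i: negative_part(k) for i, k in signed_affinities.items()}
    return activating, repressing

# Hypothetical inputs to node 1: activated by node 2, repressed by node 3.
activating, repressing = split_regulators({2: 5.0, 3: -40.0})
```

Each regulator contributes a non-zero affinity to exactly one of the two products, so a single signed parameter per interaction determines whether species i acts as an activator or a repressor of j.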
To mimic the structure of the neural tube network, we assume that only proteins 1 and 2 have direct signal inputs, while 3 does not, so that k3,in = 0. This leaves 11 network parameters: 2 for the signal (gradient) inputs (k1,in into node 1 and k2,in into node 2), 6 for the interactions between TFs (k12, k13, k21, k23, k31 and k32) and 3 for polymerase binding weights (w1,p, w2,p and w3,p).
Parameter exploration
We explored the 11 dimensional parameter space specified above by sampling each parameter from a log-uniform distribution (uniform in log10), with ranges that depend on the parameter. Specifically we chose the ranges as: range(kin) = [10 : 400], range(wp) = [0.1 : 10], range(kji) = [−100 : −1] ∪ [1 : 100], with the sign of each regulation kji being chosen randomly.
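A minimal sketch of drawing one parameter set from these ranges is shown here. The dictionary keys and function names are hypothetical, and the equal probability of activation and repression is an assumption, since the text only states that the sign is chosen randomly.

```python
import math
import random

def log_uniform(rng, low, high):
    """Draw a value whose log10 is uniformly distributed between log10(low) and log10(high)."""
    return 10.0 ** rng.uniform(math.log10(low), math.log10(high))

def sample_network(rng):
    """Draw the 11 parameters of a three-node network."""
    params = {}
    for name in ("k1_in", "k2_in"):                 # signal inputs into nodes 1 and 2
        params[name] = log_uniform(rng, 10.0, 400.0)
    for name in ("w1_p", "w2_p", "w3_p"):           # polymerase binding weights
        params[name] = log_uniform(rng, 0.1, 10.0)
    for name in ("k12", "k13", "k21", "k23", "k31", "k32"):
        magnitude = log_uniform(rng, 1.0, 100.0)
        sign = rng.choice((-1.0, 1.0))              # assumed: activation and repression equally likely
        params[name] = sign * magnitude
    return params

rng = random.Random(0)  # fixed seed so that the draw is reproducible
network = sample_network(rng)
```

Sampling in log10 gives each decade of a parameter range equal weight, which is the usual reason for preferring it over a linear uniform draw when ranges span orders of magnitude.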
We explored parameter combinations for a three node network defined in the form (E.11). The main criterion for choosing a viable set of parameters was that they must produce a patterned steady state, i.e. a saddle-node bifurcation on the same gradient as in the neural tube, defined as xin = exp(−s/0.15), where s is the dorsal-ventral neural tube position and ranges from 0 to 1. To avoid trivial effects from shifts in the boundary position we set a further constraint that the bifurcation must occur at a position s in the same range as in the neural tube network, 0.165 ≤ s ≤ 0.17. More specifically, networks were required to be monostable below s = 0.165, with high levels of x1; and bistable beyond s = 0.17, with one state having high x2 and the other high x1 (with “high” being a concentration value above 0.6). For each network meeting these criteria, we then proceeded to calculate the MAPs, as for the neural tube network (as explained in Supp. B), and the jump time. We selected networks that have boundaries sharper than a certain threshold, set by requiring the boundary to be no wider than 0.2 fractional neural tube units; boundaries were calculated based on their transition time obtained from simulating the SDEs. To simulate the neural tube network from (D.1) in the screen we used the standard parameters from that network, reverting to the original version [Cohen et al., 2014] with maximal concentrations of unity for all TFs in order to ensure comparability with the networks produced by the screen. We removed all terms relating to Irx3, as these do not contribute substantially to the dynamics of transitioning from a pMN to a p3 steady state. We further set production and degradation rates to be equal to unity in the screen as these simply scale the jump time and do not affect the results.
In analysing the results of the network screen we quantified the curvature of the MAP as the largest perpendicular distance of any point on the MAP from the straight line between steady state and transition point, normalised by the total length of this line. We refer to this value throughout the text by the shorthand “curvature” as it gives a quantitative indication of how much the MAP deviates from the shortest path. The curvature was measured at s = 0.25 and the robustness of the results with respect to this choice of neural tube position was tested by comparing with multiple other locations, with qualitatively similar results in all cases (data not shown).
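The curvature measure defined above can be sketched for a MAP given as a list of points in expression space. The function name is hypothetical, and the path is assumed to start at the steady state and end at the transition point.

```python
import math

def map_curvature(path):
    """Largest perpendicular distance of any path point from the straight line
    joining the first point (steady state) and the last point (transition point),
    normalised by the length of that line."""
    start, end = path[0], path[-1]
    chord = [e - s for s, e in zip(start, end)]
    chord_length = math.sqrt(sum(c * c for c in chord))
    if chord_length == 0.0:
        raise ValueError("steady state and transition point coincide")
    largest = 0.0
    for point in path:
        rel = [p - s for s, p in zip(start, point)]
        # Projection coefficient of the point onto the chord direction.
        t = sum(r * c for r, c in zip(rel, chord)) / chord_length ** 2
        perpendicular = [r - t * c for r, c in zip(rel, chord)]
        largest = max(largest, math.sqrt(sum(q * q for q in perpendicular)))
    return largest / chord_length

straight = map_curvature([(0.0, 0.0, 0.0), (0.5, 0.5, 0.5), (1.0, 1.0, 1.0)])
bent = map_curvature([(0.0, 0.0), (0.5, 0.5), (1.0, 0.0)])
```

A straight path gives a value of zero, and the value grows as the path bows away from the shortest route between the two points.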
In the analysis we also characterised networks by the strength of the contribution of the third node, which does not receive direct signal input. We quantified this by taking the value of x3 at the steady state and transition point (saddle point) and multiplying each by parameters for the repression or activation of nodes 1 and 2 by node 3, taking the maximum value. The multiplication by representative concentration levels of the third node was motivated by the fact that when those concentrations are small, even large interaction parameter values have small net effects.
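One reading of this measure is sketched here: the level of x3 at each of the two points is multiplied by the magnitude of each outgoing interaction of node 3, and the largest product is kept. The use of absolute values for the signed parameters k13 and k23 is an assumption made for illustration, and the function name is hypothetical.

```python
def third_node_contribution(k13, k23, x3_steady, x3_saddle):
    """Largest product of |k_j3| (effect of node 3 on node j) and the level of x3,
    taken over both target nodes and over the steady state and transition point."""
    return max(abs(k) * x3
               for k in (k13, k23)
               for x3 in (x3_steady, x3_saddle))

# Hypothetical example: strong repression of node 1 by node 3, but low x3 levels.
weak = third_node_contribution(k13=-50.0, k23=2.0, x3_steady=0.001, x3_saddle=0.002)
strong = third_node_contribution(k13=-50.0, k23=2.0, x3_steady=0.4, x3_saddle=0.6)
```

The first example shows the motivation stated above: a large interaction parameter yields a small contribution when the concentration of the third node stays low.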
Networks with a low third node contribution are effectively two node networks, and turned out to have low MAP curvature. This led us to explore other mechanisms for generating sharp boundaries. Geometrically, in the space of expression levels, the speed at which the steady state and saddle point separate as a function of neural tube position s is a plausible contributor to boundary sharpness because even if the fluctuations around the initial steady state favour a jump, such a jump will be inhibited by a large separation between steady state and transition point. High separation speed should thus lead to rapidly increasing jump times and hence to sharp boundaries. To measure separation speed we focussed on a fixed position (chosen as s = 0.25) along the neural tube, beyond the saddle-node bifurcation, and calculated the Euclidean distance between steady state and transition point. We then used this as a simple quantitative indication of separation speed. We checked the robustness of this measure by performing the measurement for different fixed positions along the neural tube, and also at variable locations chosen as the centre of the boundary region for each network; we found qualitatively similar results in every case (data not shown).
When a network had a high separation speed, this typically resulted in the steady state (the expression profile) of x2 varying, i.e. changing within a domain of the steady state pattern. We quantified this heterogeneity by the standard deviation of x2 within the region of high x2 expression. This confirmed (see Fig. S10) that sharp 2D networks have a higher level of heterogeneity than 3D networks, which use the curvature of the MAP to generate sharpness.
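The two quantities used here, the separation between steady state and transition point and the heterogeneity of x2, can be sketched as follows. The function names are hypothetical; the threshold of 0.6 for high expression is borrowed from the screening criteria, and the use of the population standard deviation is an assumption.

```python
import math
import statistics

def separation_distance(steady_state, transition_point):
    """Euclidean distance in expression space, used as a proxy for separation speed."""
    return math.dist(steady_state, transition_point)  # requires Python 3.8 or later

def x2_heterogeneity(x2_profile, high_threshold=0.6):
    """Standard deviation of x2 over the positions where x2 is highly expressed."""
    high = [x for x in x2_profile if x > high_threshold]
    if len(high) < 2:
        return 0.0
    return statistics.pstdev(high)

distance = separation_distance((0.0, 0.0, 0.0), (3.0, 4.0, 0.0))
flat_domain = x2_heterogeneity([0.1, 0.8, 0.8, 0.8])
graded_domain = x2_heterogeneity([0.1, 0.7, 0.9])
```

A domain with uniform x2 expression gives zero heterogeneity, whereas a graded profile within the domain gives a positive value.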
Characterisation of topologies
Finally we analysed the topologies of the networks resulting from the screen. To sort networks into topologies we used thresholds to identify whether nodes 1 and 2 receive significant signal input, and for each of the TF nodes whether it significantly activates or represses the other TFs. Starting with the former, within the input parameter range [10 : 400] for nodes 1 and 2, we took any parameter 30 < kin to be a positive input; lower values were classified as lack of input. This cutoff was chosen by testing a range of different values and imposing the constraints that we want to neither classify the majority of networks as having two inputs (which would provide no information on the input topology, as could happen if the cutoff was too low) nor assign any network to a topology with no inputs (which would not make biological sense and would occur when the cutoff is too high). For interactions between nodes we took into account not only the parameters kji but whether each parameter in conjunction with the actual states of the system would have a noticeable effect. We evaluated interactions by considering the contribution of an interaction given the highest level that the effector node can take. Accordingly, we consider an interaction with 0.3 < |kji| max(xi) to be significant, otherwise we classify it as negligible. The maximum was taken over all steady states for all neural tube positions. The cutoff value of 0.3 was chosen by systematic inspection of a representative number of networks, for which we compared the dynamics with and without individual interactions and assessed whether these were qualitatively identical or not. To assess the robustness of the cutoff value, we varied it within a range up to an order of magnitude larger and found that the results of our characterisation of network topologies remained qualitatively the same (data not shown).
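The thresholding described above can be sketched as a classifier that turns a parameter set into a topology. The data layout (dictionaries keyed by node, and by (target, source) pairs) and the function name are hypothetical; the cutoffs of 30 and 0.3 are those given in the text.

```python
SIGNAL_CUTOFF = 30.0       # k_in above this value counts as a signal input
INTERACTION_CUTOFF = 0.3   # |k_ji| * max(x_i) above this value counts as an interaction

def classify_topology(k_in, k, max_x):
    """k_in: {node: signal affinity}; k: {(j, i): signed affinity of i acting on j};
    max_x: {i: highest steady-state level of node i over all neural tube positions}."""
    inputs = {node: value > SIGNAL_CUTOFF for node, value in k_in.items()}
    edges = {}
    for (j, i), k_ji in k.items():
        if abs(k_ji) * max_x[i] > INTERACTION_CUTOFF:
            edges[(j, i)] = "activation" if k_ji > 0 else "repression"
        else:
            edges[(j, i)] = "none"
    return inputs, edges

# Hypothetical parameter set.
inputs, edges = classify_topology(
    k_in={1: 20.0, 2: 150.0},
    k={(1, 2): -60.0, (2, 1): -10.0, (1, 3): 5.0, (3, 1): 2.0},
    max_x={1: 0.9, 2: 0.8, 3: 0.01},
)
```

In this example the interaction from node 3 onto node 1 is classified as negligible despite a non-zero parameter, because node 3 never reaches a level at which it has a noticeable effect.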
With this approach we classified all the 3D network parameter sets into topologies, determined those that occurred most often (Fig. S11) and plotted the boundary precisions they generate (Fig. 4H). The results indicated that although some topologies are more frequently represented amongst networks producing a sharp boundary, there is no single topology that ensures sharpness. Some topologies (such as 1-4 in Fig. S11) prevented the boundary from becoming very imprecise, but even within these network topologies the range of sharpness was large (Fig. 4G,H & Fig. S11 & Fig. S12). This leads to the conclusion that the dynamical properties generated by the network, rather than the structure of the network, determine boundary precision. Indeed, we confirmed by analysing each topology separately that the main indicators of sharpness are the two mechanisms identified in the main text: curvature of transition path and separation speed (Fig. S12). Thus a network’s topology can substantially bias the dynamics towards high MAP curvature, and hence towards sharpness.
F Materials and methods
F.1 Mouse Strains
Mouse strains containing the following alleles were used: Pax6(Sey) [Ericson et al., 1997] and O2e33 in strain backgrounds C57BL/6Jax and F1(B6xCBA) respectively. The O2e33 allele was derived using zygote injection of CRISPR gRNA and Cas9 plasmids (see below). Embryos were transferred to pseudopregnant females and subsequent pups were genotyped. O2e33 mice were maintained as a heterozygous population; the line was sub-viable with fewer than 2/40 homozygous offspring surviving. Embryos for analyses were collected at the indicated time points following a mating, with the day of plug detection designated e0.5. All animal procedures were carried out in accordance with the Animals (Scientific Procedures) Act 1986 under the Home Office project licences PPL80/2528 and PD415DD17.
F.2 Embryonic Stem Cell Culture
For the enhancer deletion in vitro, mouse ES cells containing a fluorescent reporter cotranslated with Olig2 (Olig2::T2A-mKate2) [Sagner et al., 2018] were used. Mouse embryonic stem cells were maintained on mitotically inactivated fibroblasts (feeder cells) in ES medium with 1,000 U/ml LIF. Cells were differentiated to spinal cord neural progenitors as previously described [Gouti et al., 2014]. To initiate differentiation, ES cells were dissociated using 0.05% Trypsin (Gibco) and panned in ES medium on culture plates for 2 x 15 minutes to remove feeder cells. ES cells were collected, spun down and re-suspended in N2B27 medium. 50,000 cells were plated on 35 mm CellBIND dishes (Corning). Dishes had been coated with 0.1% gelatine in PBS before addition of 1.5 ml of N2B27 with 10 ng/ml bFGF. After 48 hours medium was replaced with N2B27 + 10 ng/ml bFGF + 5 μM CHIR99021 (Axon). 24 hours later, at D3, medium was replaced with N2B27 + 100 nM RA (Sigma) and 500 nM SAG (Calbiochem); this was repeated every 24 hours.
F.3 CRISPR/Cas9 targeting
For CRISPR/Cas9-mediated excision of the −33 kb enhancer, two pairs of short guide RNA (sgRNA) sequences were designed to target either side of the enhancer region. The ZiFit online tool (http://zifit.partners.org/) was used to select guides that had the lowest number of potential off target sites. sgRNA sequences (ACTTTGTAAGCCGAGCC and GATAATCGCCTCCCTCC) were cloned into pX459 v2.0 (Addgene, [Ran et al., 2013]) and transfected into ES cells via nucleofection. This generated a cell line with a 995 bp deletion (chr16: 91192464-91193458). Two separate clones were analysed to determine whether there was substantial clonal variation. A second line was generated with a larger deletion of approximately 3.3 kb using sgRNA sequences (GTTTATGGCTCATCCCC and TCCAGGCTCCCATATCC). Cell lines with this larger deletion yielded the same results as the smaller deletion (data not shown). To generate the mouse line, plasmids encoding the sgRNAs for the 3.3 kb deletion were injected into zygotes before being transferred to pseudopregnant females. The mouse line generated had a 3259 bp deletion (chr16: 91191295-91194570).
To assess Olig2 protein copy number, a transgenic cell line was constructed, Olig2-HA-SnapTag. Sequence encoding an HA-tagged SnapTag was placed at the C-terminus of the endogenous coding sequence for Olig2 via homologous recombination using CRISPR. The SnapTag sequence was extracted from the pSNAPf vector (N9183S, NEB) and inserted into a plasmid containing Olig2 [Sagner et al., 2018] and targeted as previously described.
F.4 Protein Copy Number Quantification
The concentration of recombinant proteins (used as standards) was calculated from Coomassie staining (GelCode Blue Stain Reagent, Thermo Scientific). Recombinant proteins used were Pax6 (Bioclone, PI-0099), Nkx2.2 (MyBioSource, MBS717917) and SnapTag (NEB, P9312S). A solution of 5 μM SNAP-tag was labelled with Janelia Fluor JF549 (TOCRIS, 6147) SnapTag Ligand at 10 μM (assembled in house) for 30 mins at 37 °C.
To determine Pax6 and Nkx2.2 average molecule number per cell, a WT HM1 mouse embryonic stem cell line was used [Doetschman et al., 1987]. Cells were lysed in RIPA buffer supplemented with protease inhibitors. The cell lysates were analysed by Western blot, with lysate from a known number of cells loaded per lane. The following antibodies were used: rabbit anti-Pax6 (Millipore AB2237, 1:2000), mouse anti-Nkx2.2 (DSHB 745A5, 1:50), donkey anti-mouse IRDye 800CW (Licor) and donkey anti-rabbit IRDye 680RD (Licor). Blots were scanned using an Odyssey Scanner (Licor).
We used the cell line Olig2-HA-SnapTag to determine protein copy number for Olig2. Cells for Olig2 and Nkx2.2 copy numbers were differentiated as described. For Pax6, cells were exposed to 100 nM RA only from day 4 to induce a more dorsal spinal cord cell fate. One day prior to sample collection, the cells were incubated with Janelia Fluor JF549 SnapTag Ligand (assembled in house) directly in the media at 1 μM overnight. Cells were lysed in RIPA buffer supplemented with protease inhibitors. A known number of cells were loaded per lane. Gels were scanned using a Typhoon FLA 9500.
To determine the percentage of expressing cells, flow cytometry was carried out as described in the Flow Cytometry section.
F.5 Flow Cytometry Analysis
Cells were dissociated using 0.05% Trypsin and collected in ES media. Cells were then washed in PBS and resuspended in PBS containing live-cell Calcein Violet dye (Life Technologies). Control and O2e33 cells were differentiated in parallel and analysed together. Control cells differentiated without SAG from day 4 were used to set population gates for mKate positive cells.
For protein quantifications, flow cytometry was used to determine percentage of cells expressing Olig2, Pax6 and Nkx2.2. Cells were labelled with either PE Mouse anti-Nkx2.2 (BD Pharmingen 564730, 1:20); AlexaFluor 647 mouse anti-Human Pax6 (BD Pharmingen 562249, 1:50); goat anti-Olig2 (R&D Systems AF2418, 1:800) then donkey anti-goat 405 (Biotium 20398, 1:500). Flow analysis was performed using a Becton Dickinson LSRII flow cytometer.
F.6 Immunohistochemistry and Microscopy
Embryos were collected at defined timepoints and fixed in 4% paraformaldehyde in PBS for 30 minutes (e8.5), 1 hour (e9.5) or 2 hours (e10.5). Embryos for wholemount imaging were washed in PBS containing 0.1% Triton X-100 (PBST) before addition of primary antibodies. Embryos for sectioning were placed in 30% sucrose for cryopreservation overnight at 4°C then dissected into forelimb neural tube fragments. These were mounted in gelatine then frozen. 12 μm sections were collected on glass slides using a Zeiss Hyrax C 60R cryostat. Gelatine was removed from the slides by 4 x 5 min washes in PBS at 42°C and sections were washed with PBST. For in vitro stainings, cells were washed in PBS and fixed in 4% paraformaldehyde for 15 min at 4°C then washed in PBS then PBST. For whole embryos, embryo sections and cells, primary antibodies diluted in blocking solution (1% BSA in PBST) were applied overnight at 4°C. Samples were then washed 3 x in PBST before secondary antibodies diluted in PBST were added for 1 hour at room temperature. Secondary antibodies were removed with 3 x washes in PBST and one wash in PBST containing DAPI. Sections and cells were mounted using Prolong Gold (Invitrogen). Embryos for wholemount were mounted using glycerol. Primary antibodies used were guinea pig anti-Olig2 (gift from Bennett Novitch, 1:8000 [Novitch et al., 2001]); mouse anti-Nkx2.2 (BD Pharmingen 564731, 1:500); rabbit anti-Pax6 (Millipore AB2237, 1:1000); goat anti-Sox2 (R&D Systems AF2018, 1:200). All secondary antibodies were raised in donkey and conjugated to Alexa488, Alexa568 or Alexa647 (Abcam).
Cells were imaged on a Zeiss Imager.Z2 microscope using 20x objective. Z-stacks were taken and presented as a maximum projection using FiJi imaging software. A Leica SP5 upright confocal microscope was used to image embryo sections (40x oil objective) and whole embryos (20x dry objective). For whole embryos, z-stacks were taken across a tile-scan then assembled and maximally projected using FiJi imaging software.
F.7 Image quantification
Fluorescent intensity measurements
Single optical planes from confocal z-stack images were used for analysis. Each nucleus was identified individually using the FiJi point tool. The DAPI channel was used as reference for the position of the nuclei regardless of TF expression. A circle of 2 μm radius was taken around each point, and the x and y position and mean fluorescence intensity values for Nkx2.2, Olig2 and Pax6 were recorded. Reference points at the ventral and dorsal pole of the neural tube in each section were recorded in order to align all embryos along the dorso-ventral axis.
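As an illustration of the per-nucleus measurement, the following minimal sketch averages the pixels of one channel that fall inside a circle of fixed physical radius around a marked point. The measurements in this study were made with the FiJi point tool; the function name `mean_in_circle`, the nested-list image representation and the pixel size `um_per_px` are assumptions introduced here for illustration only.

```python
import math

def mean_in_circle(image, cx, cy, radius_um, um_per_px):
    """Mean intensity of pixels whose centres lie within a circle.

    image      : 2D list of intensities, indexed as image[row][col]
    cx, cy     : centre of the circle in pixel coordinates (col, row)
    radius_um  : circle radius in micrometres (2 in the text)
    um_per_px  : pixel size in micrometres (hypothetical parameter)
    """
    radius_px = radius_um / um_per_px
    rows, cols = len(image), len(image[0])
    # Restrict the scan to the bounding box of the circle, clipped to the image.
    r0 = max(0, math.floor(cy - radius_px))
    r1 = min(rows - 1, math.ceil(cy + radius_px))
    c0 = max(0, math.floor(cx - radius_px))
    c1 = min(cols - 1, math.ceil(cx + radius_px))
    total, count = 0.0, 0
    for r in range(r0, r1 + 1):
        for c in range(c0, c1 + 1):
            if (r - cy) ** 2 + (c - cx) ** 2 <= radius_px ** 2:
                total += image[r][c]
                count += 1
    if count == 0:
        raise ValueError("circle does not overlap the image")
    return total / count
```

Calling the function once per channel at each marked nucleus would give the per-cell intensity values used in the subsequent steps.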
Pre-processing
We performed a set of normalisation steps in order to compare embryos from different batches and across phenotypes:
The datasets were realigned vertically with respect to the reference points and the ventral-most point was set to (0,0) in axes coordinates.
Cells with DAPI levels more than two SDs below the mean were removed to eliminate falsely identified nuclei. This cutoff was determined individually for each sample to account for different background levels resulting from technical noise.
Points that were very low in intensity (below two SDs) were set to a minimum threshold in each individual channel.
For Nkx2.2 and Olig2, the intensity values were re-scaled such that the minimum value is at 0 and the 40% quantile is at the arbitrary value of 0.08. This was done individually for each embryo with the assumption that most nuclei in a full neural tube cross-section will not express these proteins.
For Pax6, most nuclei in the image express some level of Pax6; accordingly we set the 60% quantile at 0.6 across all datasets.
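The realignment, DAPI filtering and quantile rescaling steps above can be sketched as follows. This is a minimal illustration rather than the code used in the study: the function names, the use of the population standard deviation and the linear-interpolation quantile are assumptions, and the low-intensity thresholding step is omitted.

```python
import math
import statistics

def realign(points, ventral, dorsal):
    """Rotate and translate (x, y) points so that the ventral reference
    point maps to (0, 0) and the dorsal one lies on the positive y axis."""
    dx, dy = dorsal[0] - ventral[0], dorsal[1] - ventral[1]
    length = math.hypot(dx, dy)
    ux, uy = dx / length, dy / length  # unit vector, ventral to dorsal
    aligned = []
    for x, y in points:
        px, py = x - ventral[0], y - ventral[1]
        # New y is the projection on the axis, new x the perpendicular part.
        aligned.append((px * uy - py * ux, px * ux + py * uy))
    return aligned

def dapi_mask(dapi, n_sd=2.0):
    """True for nuclei whose DAPI level is not more than n_sd SDs below
    the sample mean (computed per sample)."""
    cutoff = statistics.fmean(dapi) - n_sd * statistics.pstdev(dapi)
    return [v >= cutoff for v in dapi]

def quantile(values, q):
    """Quantile with linear interpolation between order statistics."""
    s = sorted(values)
    pos = q * (len(s) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (pos - lo) * (s[hi] - s[lo])

def rescale(values, q, target):
    """Shift the minimum to 0 and scale so the q-quantile equals target."""
    vmin = min(values)
    shifted = [v - vmin for v in values]
    ref = quantile(shifted, q)
    if ref == 0:
        raise ValueError("reference quantile equals the minimum")
    return [v * target / ref for v in shifted]
```

Under these assumptions, `rescale(values, 0.40, 0.08)` would correspond to the Nkx2.2 and Olig2 normalisation applied per embryo. For Pax6 the text specifies only the 60% quantile target of 0.6, so whether the minimum is also shifted to 0, as `rescale(values, 0.60, 0.6)` would do, is an assumption of the sketch.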
Staging embryos with size
We used the dorsal-ventral length of the neural tube as a proxy for embryo age [Cohen et al., 2015]. For e9.5 embryos, the measured neural tube size was between 250 μm and 350 μm; for e10.5 embryos it was larger than 350 μm. Neural tube size was also used to subgroup e9.5 embryos. In total we have 46 WT, 29 O2e33 and 16 Pax6−/− embryos. By sizes they are distributed as:
Classification into cell types
In order to analyse the heterogeneity at the boundary between domains, we classified all cells into one of five cell types: floor plate, p3, pMN, Irx3 positive, other; this was done based on the position and expression profile of each cell. We refrained from using the Pax6 channel in our classifier to avoid any bias in the classification of Pax6−/− embryos. We therefore classified based on three parameters: Nkx2.2 intensity, Olig2 intensity and dorsal-ventral position. The thresholds we employed for Nkx2.2 and Olig2 concentrations are shown in Fig. S13A-B. There was a further constraint on the dorsal-ventral position for each cell type, in order to avoid anomalies from blood vessels and imaging artefacts and to be able to separate floor plate cells from Irx3 positive cells, both of which lack expression of Nkx2.2 and Olig2 (Fig. S13B-C). Manual benchmarking of this method indicated that we were able to classify most cells accurately for all three phenotypes. The classifier becomes less accurate for cells in dorsal regions, but this is of no concern as our subsequent analysis did not involve these cells. For the specific task of quantifying the Olig2-Irx3 boundary position we employed the Pax6 channel as a further parameter to aid classification. This was only performed for WT and O2e33 (data not shown).
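The structure of such a rule-based classifier can be sketched as below. The thresholds actually used are those shown in Fig. S13; every numerical value here (`nkx22_thr`, `olig2_thr` and the positional limits, expressed as a fraction of dorsal-ventral length) is a hypothetical placeholder, and the order in which the rules are applied is likewise an assumption.

```python
def classify_cell(nkx22, olig2, dv,
                  nkx22_thr=0.2, olig2_thr=0.2,
                  fp_max_dv=0.1, ventral_max_dv=0.5):
    """Assign one of five labels from Nkx2.2 level, Olig2 level and
    dorsal-ventral position (0 = ventral pole, 1 = dorsal pole).
    All default thresholds are illustrative placeholders."""
    in_ventral_window = dv <= ventral_max_dv
    if nkx22 >= nkx22_thr and in_ventral_window:
        return "p3"
    if olig2 >= olig2_thr and in_ventral_window:
        return "pMN"
    if nkx22 < nkx22_thr and olig2 < olig2_thr:
        # Both markers absent: position separates the two negative types.
        if dv <= fp_max_dv:
            return "floor plate"
        if in_ventral_window:
            return "Irx3 positive"
    # Marker-positive cells outside the positional window (for example
    # blood vessels or imaging artefacts) and far-dorsal cells.
    return "other"
```

The positional constraint plays two roles in this sketch, mirroring the description above: it rejects marker-positive signal at implausible positions, and it separates the two marker-negative populations.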
Defining boundary position and width
Once the cell types had been classified we assigned a quantitative measure of the width of gene expression boundaries. For this we fit to the cell position data, for each embryo, a smooth function indicating the probability of finding a cell of one type (the prevalent type on one side of the boundary) at each location of the image. We focused on the boundary between p3 and pMN domains. The classifier is then binary and gives the probability of finding a p3 cell at each image location. We used a Gaussian process approach to fit this classifier as detailed in [Rasmussen and Williams, 2004], using public MATLAB code (MATLAB version r2018b). The Gaussian process was chosen to have a constant mean function and a squared exponential covariance function. This choice of covariance function is relatively standard and allows us in particular to assign separate covariance function lengthscales in the x and y image directions by automatic relevance determination [Rasmussen and Williams, 2004]. We used a logistic transfer function to convert Gaussian process values to probabilities, again a standard choice. Once the classification probabilities have been obtained in this way, we define the boundary as the region where the probability of p3 cells lies in the range 11% to 89%, i.e. where there is significant mixing of cell types. We then determine the width of this region geometrically. This method allowed us to calculate the boundary widths for all embryos in a consistent manner, and to compare WT with mutants. The boundary region is determined from the trained classifier for each embryo as explained above; the position where the classification probability is 50% for either cell type is used to define the position of the boundary (an average position of the boundary along the left-right axis) (Fig. S14).
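To make the definition of boundary position and width concrete, the following sketch applies the 11% to 89% criterion to a one-dimensional probability profile along the dorsal-ventral axis. It does not reproduce the Gaussian process fit, which was carried out in MATLAB on two-dimensional position data; it assumes that a fitted, monotonic profile is already available, and the function names and the logistic test profile are illustrative assumptions.

```python
import math

def crossing(positions, probs, level):
    """Position at which the profile first crosses `level`,
    found by linear interpolation between neighbouring samples."""
    for i in range(len(positions) - 1):
        p0, p1 = probs[i], probs[i + 1]
        if p0 != p1 and (p0 - level) * (p1 - level) <= 0:
            y0, y1 = positions[i], positions[i + 1]
            return y0 + (level - p0) * (y1 - y0) / (p1 - p0)
    raise ValueError("profile never crosses the requested level")

def boundary_from_profile(positions, probs, lo=0.11, hi=0.89):
    """Boundary position (50% crossing) and width (distance between the
    hi and lo crossings) of a monotonic p3 probability profile."""
    position = crossing(positions, probs, 0.5)
    width = abs(crossing(positions, probs, lo) - crossing(positions, probs, hi))
    return position, width

# Illustrative profile: p3 probability falls from 1 (ventral) to 0 (dorsal)
# as a logistic curve centred at y0 with lengthscale s (arbitrary units).
y0, s = 40.0, 5.0
positions = [0.1 * i for i in range(1001)]
probs = [1.0 / (1.0 + math.exp((y - y0) / s)) for y in positions]
position, width = boundary_from_profile(positions, probs)
```

For a logistic profile of this form the width has the closed form 2 s ln(0.89/0.11), approximately 4.18 s, so the measured width scales directly with the lengthscale of the transition.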
Quantifying TF levels
We extracted Olig2 positive cells that were classified as being within the boundary region. The model predicted that these cells were the most likely to transition to an Nkx2.2 positive state, given sufficient time. We quantified the levels of Pax6 and Olig2 for these cells in WT and O2e33 mutants. The resulting measurements do not provide absolute numbers, but because all samples are normalised in the same way, as described (Sec. F.7), they are comparable relative to each other. We use these measurements as equivalents to observing fluctuations around a steady state over a series of dorso-ventral positions. Correspondingly, in the simulations we also average fluctuations across several neural tube positions (Supp. B).
Calculating variance levels
In order to calculate the total variance of Olig2 and Pax6 levels within the pMN domain we extracted all Olig2 expressing cells, for both WT and O2e33, outside the boundary region. The variances and covariances of the normalised fluorescence intensity values were calculated, analogous to the theoretical approach (Supp. B). The square root of the trace of each resulting covariance matrix was then used to obtain the typical root-mean-square relative variance.
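The calculation can be illustrated with a short sketch that builds the 2 x 2 covariance matrix of Olig2 and Pax6 intensities and takes the square root of its trace. It assumes the inputs are the already normalised intensities of the selected cells and uses the unbiased (n − 1) estimator; the function names are hypothetical, and any additional normalisation described in Supp. B is not reproduced.

```python
import math

def covariance_matrix(olig2, pax6):
    """Sample covariance matrix [[var_O, cov], [cov, var_P]] of two
    equally long lists of normalised intensities."""
    n = len(olig2)
    if n != len(pax6) or n < 2:
        raise ValueError("need two lists of equal length >= 2")
    mo, mp = sum(olig2) / n, sum(pax6) / n
    var_o = sum((o - mo) ** 2 for o in olig2) / (n - 1)
    var_p = sum((p - mp) ** 2 for p in pax6) / (n - 1)
    cov = sum((o - mo) * (p - mp) for o, p in zip(olig2, pax6)) / (n - 1)
    return [[var_o, cov], [cov, var_p]]

def rms_fluctuation(cov):
    """Square root of the trace of a 2 x 2 covariance matrix."""
    return math.sqrt(cov[0][0] + cov[1][1])
```

Because the trace is the sum of the two variances, this summary is insensitive to the covariance term itself and measures the overall magnitude of fluctuations in the two-dimensional Olig2-Pax6 space.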
Acknowledgments
We thank JP Vincent and members of the lab for constructive comments. We are grateful to the Flow Cytometry, Biological Resource and HPC Facilities of the Francis Crick Institute. This work was supported by: the Francis Crick Institute, which receives its core funding from Cancer Research UK (FC001051), the UK Medical Research Council (FC001051), and Wellcome (FC001051); funding from Wellcome [WT098325MA and WT098326MA]; and the European Research Council under the European Union (EU) Horizon 2020 research and innovation program grant 742138.