Numerical parameter space compression and its application to microtubule dynamic instability

Physical models of biological systems can become difficult to interpret when they have a large number of parameters. But the models themselves actually depend on (i.e. are sensitive to) only a subset of those parameters. Rigorously identifying this subset of “stiff” parameters has been made possible by the development of parameter space compression (PSC). However, PSC has only been applied to analytically-solvable physical models. We have generalized this powerful method by developing a numerical approach to PSC that can be applied to any computational model. We validated our method against analytically-solvable models of random walk with drift and protein production and degradation. We then applied our method to an active area of biophysics research, namely to a simple computational model of microtubule dynamic instability. Such models have become increasingly complex, perhaps unnecessarily. By adding two new parameters that account for prominent structural features of microtubules, we identify one that can be “compressed away” (the “seam” in the microtubule) and another that is essential to model performance (the “tapering” of microtubule ends). Furthermore, we show that the microtubule model has an underlying, low-dimensional structure that explains the vast majority of our experimental data. We argue that numerical PSC can identify the low-dimensional structure of any computational model in biophysics. The low-dimensional structure of a model is easier to interpret and identifies the mechanisms and experiments that best characterize the system.


INTRODUCTION
A central goal of biophysics is to develop mathematical and computational models that describe biological systems.These models can operate at different temporal and spatial scales.In the case of the microtubule cytoskeleton, models range from molecular dynamics simulations of αβ-tubulin heterodimers [1], to Monte Carlo simulations of microtubule dynamic instability [2][3][4], to analytical theories that treat the mitotic spindle as a nematic liquid crystal [5].These models vary in their degree of complexity, e.g., in the number of parameters they use.
A central problem in biophysical modeling is defining the "right" number of parameters to explain and predict experimental data, or observables.We prefer simple models; in the well-known quip from Von Neumann, four parameters are sufficient to fit an elephant, and five can make its trunk wiggle [6], as was indeed later demonstrated [7].More parameters can sometimes improve a model's performance, but too many can be a problem.Unnecessary parameters obfuscate those that determine the model's output, namely its reproduction of observables, and render the model less interpretable-all with-out any gain in predictive power [8].A recent computational model of microtubule dynamic instability has 22 parameters and reproduces an impressive range of experimental data [9].But complex models can be black boxes; we need a rigorous way to define which parameters determine, for example, the distribution of microtubule lifetimes.
The behavior of a model can be described within a socalled "parameter space", which has as many dimensions as there are parameters.Moving within this parameter space (by changing the values of parameters) should change a model's output.But usually a given observable significantly changes along only a few directions in parameter space [10].In other words, most directions in parameter space are irrelevant.In order to make sense of complex models, an important scientific problem is to reliably extract relevant directions in parameter space, defining the true, lower-order "dimensionality" of the model.There are several methods to solve this problem.In the 1980's, classical Principal Component Analysis was proposed as a method to reduce ODE-based models of biochemical systems [11].More recently, the Manifold Boundary Approximation Method has been developed to fit data while minimizing dimensionality [12]; similarly, Fitness Based Asymptotic Parameter Reduction can extract the "core working module" of a model [13].Other machine learning approaches can develop realistic models with a minimal number of parameters, e.g., using Bayesian Information Criterion [14].These methods are focused on ODE-based models, so there is an acute need for universal methods that are applicable to stochastic computational models as well.
Our method to reduce the complexity of stochastic computational models is parameter space compression (PSC) [15].The PSC method uses the Fisher Information Matrix (FIM) in order to determine the relative significance of a model's parameters.More specifically, the PSC method tracks the eigenvalues of the FIM over time to identify combinations of parameters that are "stiff" (viz., those with strong effects on model outputs) and "sloppy" (those with very weak effects) (Fig. 1A) [16].The sloppy parameters or parameter combinations are "compressed away" to reveal the simpler dimensionality that underlies a model's performance [17].That most parameters are sloppy, and thus irrelevant, is why coarse-grained models in physics provide such satisfying descriptions of the natural world [18].
PSC is incredibly powerful at identifying the lowdimensional structure of a model [15], but it has currently been applied to a limited number of analyticallysolvable physics models.Those derivations are not easy to generalize to other contexts.In order to apply PSC to any computational model, we developed a numerical PSC method, significantly expanding the applicability of PSC.To validate our method, we recovered the analytical results of a simple one-dimensional random walk model [15] and its perturbations.We further tested our method on an analytically-solvable model of protein production and degradation.Finally, we applied numerical PSC to a well-known Monte Carlo model of microtubule dynamic instability.For the microtubule case, we identify a low-dimensional description of the system that accounts for all of our observables.By adding new parameters to our base model, we demonstrate that a feature of microtubules of current interest [19], namely the "seam" in the lattice, is "compressed away", while a feature neglected in some models, namely the fine structure of microtubule ends, is critical to model performance.In all three test cases, we show that the eigenvalues of the FIM provide critical insights into the behavior of a model and the importance of its parameters.Thus, our numerical PSC method opens the door to an analysis of computational models in biophysics that reveals the minimal yet predictive descriptions of living systems.

NUMERICAL PARAMETER SPACE COM-PRESSION Mathematical formulation
Our numerical approach is derived from the theoretical work of Machta and colleagues and relies on the computation of the Fisher Information Matrix (FIM) [15][16][17].We study the distribution of an observable x and its sensitivity to a vector of parameters θ = {θ µ }.The central idea is that changes in an important parameter will result in bigger changes in the probability distributions y( θ, x) compared to changes in a less important one (Fig. 1A).The important parameters will dominate the eigenvalues and eigenvectors of the FIM.Dominating eigenvalues define directions in parameter space where observables vary significantly.We call these directions "effective parameters."In general, the effective parameters of a model are not the original parameters but rather combinations of them (see below).Thus, the goal of parameter space compression (PSC) is to identify these dominating eigenvalues and eigenvectors, which will define the most important directions in parameter space and the effective parameters defining the distribution y( θ, x) of observable x [16,17].In particular, for a dynamical system, we expect that a hierarchy of eigenvalues will appear for the FIM of y( θ, x, t) as time t progresses, such that only a few effective parameters define the observed dynamics at any given time.These few effective parameters are sufficient to completely describe the system.
The FIM can be difficult to compute, but a simplification arises if we assume that the deviations of the noisy function y are Gaussian distributed to the true model.As shown in [15], the FIM at any given time t can then be rewritten as a simple deterministic "metric": where y can be evaluated as a function of time t.Notice that g µ,ν (t) then becomes a simple function of the Jacobian with respect to its parameters.A detailed derivation is provided in Appendix S1.We independently consider several observables x, and for each x compute the FIM of distribution y( θ, x, t) and its eigenvalues as a function of time t.

Scaling and Algorithm
A challenge in analyzing computational models, especially in biology, is that the parameters have different units and scales.Some parameters are energies (e.g., the ∆G o of bond formation) and some are kinetic rate constants (e.g., the rate constant of a GTP hydrolysis reaction).Since rate constants are exponentially distributed to thermal energy k B T , we choose to rescale the parameters to express all of them in terms of energies when cal- Track eigenvalues over time 3. culating the FIM.Energies are more fundamental quantities and their variations are easier to interpret physically.We thus define newly rescaled parameters, θµ , so that θµ = θ µ for parameters that are energies already, and θµ = log θ µ for rate constants (with an implicit conversion factor to remove units), as done previously [10].Therefore, equation 1 becomes:

Eigenvalue Fisher Information Matrix
where α µ = 0 if θ µ is an energy and α µ = 1 if θ µ is a rate constant.
To calculate the FIM numerically using equation 2, we developed a three-step algorithm shown in Fig. 1B.First, we generate the probability distributions y( θ, x, t) of each observable x at any time t for incremental variations of parameters θ µ and evaluate the finite derivatives corresponding to the Jacobian: First, we generate 2N + 1 probability distributions y( θ ± ∆ θ, x, t) for each observable x, for a model with N parameters.All the probabilities used in equation 3 need to be normalized to the number of points that generated the probability distribution.Second, we calculate the eigenvalues of the FIM by summing over the entire observable landscape.Third, we track the eigenvalues of the FIM over time.In general, the eigenvalues of the FIM are logarithmically-distributed [16].The important feature of the eigenvalues is not their absolute values but rather their relative values, which is to say that the largest eigenvalue points to the most important direction in parameter space.
When evaluating the finite derivatives in equation 3, The production rate ρ dominates at early time points but at stationarity, the production rate and degradation rate contribute equally.Note: The eigenvector % is the absolute value of the parameter component.
the choice of ∆θ is arbitrary.In our experience, the most robust choice to avoid numerical instability and artifacts while keeping significant changes is to incrementally vary energies by 0.05 k B T (leading to a change of 5% for corresponding rate constants, see Appendix S2).

Test Case: One-dimensional random walk
To test our numerical PSC method, we benchmarked our algorithm by simulating a model for which an analytical solution is available.We chose the one-dimensional random walk model introduced in Machta et al. [15], which is the model used to develop the concept of PSC.The parameters of the model are the probabilities of a particle jumping to one of six neighboring sites (Fig. 1B); the observable x of the model is the position of a particle and y( θ, x, t) is the distribution of particle positions as a function of time (viz., the particle density in a meanfield approximation when there are many particles).We simulated the random walk and plotted the eigenvalues of the FIM over time (Fig. 1C), and our results precisely match those derived from the analytical expression (see Appendix S3).In particular, the eigenvalues start at unity; as time progresses, the distribution of eigenvalues expands, establishing a clear hierarchy of eigenvalues at later times.As pointed out in Machta et al. [15], the first two eigenvalues can be interpreted as a drift term and a diffusion coefficient, respectively; the spreads of the eigenvalues are enough to reproduce most of the data in an effective theory (as discussed in [15]).We further tested the correspondence between our numerical results and the analytical theory by introducing drift into the random walk, which was not done previously.The particle density over time is shown in Fig. S2A.The eigenvalues of the FIM over time for the perturbed random walk are shown in Fig. S2B.The result is similar to uniform dif-fusion and we are able to show that the eigenvalues are defined by the probabilities of particles jumping to neighboring sites.Importantly, our numerical results precisely match the analytical solutions we derived for a random walk with drift and a uniform random walk with different numbers of parameters (see Fig. S2C, D respectively).Thus, our numerical PSC method successfully compressed this classic physical system.

Test case: A simple protein production and degradation system
Having benchmarked our algorithm against the random walk model, we next wondered how our numerical PSC method would handle a model where the distribution of an observable is determined by a combination of the initial parameters.Such situations will arise in most if not all real computational models.Therefore, we applied our numerical PSC method to a textbook biophysical model of protein production and degradation.The model has only two parameters, the production rate ρ and the degradation rate δ (see Fig. 2A).The observable x of the model is the number of proteins in the system at any given time.Importantly, The stationary distribution y( θ, x) of protein number is a Poisson distribution of the parameter combination ρ/δ (representing the expectation value for the number of proteins) [20].Using this stationary distribution, we can analytically solve for the dominating eigenvalue of the corresponding FIM in the continuous limit: The derivation of the eigenvalues and the expression for equation 4 can be found in Appendix S4.In the continuous limit, the second eigenvalue of the system goes to 0, but not in a discrete simulation (see Appendix S4 and Fig. S3B).
Starting from an initial condition with no proteins, we simulated this process using the Gillespie algorithm [21] and computed the eigenvalues of the system over time (Fig. 2B).One eigenvalue is always over 2 orders of magnitude larger than the other, indicating that the system is governed by one effective parameter, which is to say that there is only one relevant direction in parameter space that determines the model's output.Looking at the relative contribution of the eigenvector components of the dominating eigenvalue in Fig. 2C, we can see that during the early stage of the system, the production rate ρ dominates, corresponding to the net production of proteins from the initial condition.The system then reaches stationarity, at which point the eigenvector components of the dominating eigenvalue are a mix of the the production rate ρ and degradation rate δ, with equal contributions as expected from our derivation (Fig. 2C).We checked that our method recovers the analytical result of equation 4 (in the asymptotic limit) for different ratios of production rate over degradation rate (see Fig. 2B, S3A).Thus, our numerical PSC method is able to compress out irrelevant directions and extract the effective parameter defining the distribution of protein number (here, a Poisson distribution).

Microtubule dynamics: a complex biological system
Having fully characterized our method, next, we applied it to a biological system that cannot be solved analytically, namely the dynamic instability of microtubules [22].Microtubules are polymers of αβ-tubulin and dynamic instability is the non-equilibrium behavior in which the polymers stochastically switch between periods of growth and shrinkage.This complex, nonequilibrium phenomenon was first simulated numerically in the 1980s [23,24] and has remained a subject of considerable interest for computational biologists, who have developed increasingly sophisticated models [3,4,9,25].The long-term goal of these collective efforts is to develop a powerfully-predictive yet minimal model that can be used to explain microtubule physiology.Our numerical PSC method has the power to determine whether existing models have an underlying low-dimensional structure.
Our starting model is based on VanBuren et al. 2002 (see Fig. 3A) [2]; a similar model is used by Ayaz et al. 2014 [26].We chose this model because it's a classic and because understanding its underlying dimensionality will inform ongoing modeling work on microtubules.Briefly, tubulin subunits associate head-to-tail to create protofilaments (pfs), forming longitudinal bonds described by an energy parameter ∆G o long .In our model, 13 pfs are connected by lateral bonds between adjacent subunits with an energy parameter ∆G o lat [27].β-tubulin forms lateral bonds with other β-tubulins (and αwith α) except at a single discontinuity in the lattice known as the "seam", where β-tubulin binds to an α-tubulin (see below).The rate at which tubulin binds to the microtubule ends is described by an association rate constant, k + .Because tubulin is a GTPase, these incoming tubulin subunits contain GTP in the β-tubulin nucleotide pocket.This GTP becomes hydrolyzed after (1) the subunit incorporates into the polymer and (2) another GTP-tubulin binds on top of it, contributing catalytic residues that complete the nucleotide pocket [28].The rate of GTP hydrolysis is described by a rate constant parameter k H . GTP hydrolysis and phosphate release converts GTPtubulin to GDP-tubulin and weakens the bonds between tubulin subunits in the polymer [29,30].Following Van-Buren, this weakening of energies is described by an energy parameter, ∆∆G o lat , which is assigned to the lateral bonds of the new GDP-tubulin subunit.Using parameter values similar to Castle et al. 2017 (see Fig. 3B), our simulation produces microtubule growth curves that correspond reasonably with in-house experimental data generated by in vitro reconstitution experiments at 8 nm tubulin [31] (see Fig. 3C).More specifically, microtubules grow as long as their ends are protected by a "cap" of GTP-tubulin [32].If this GTP cap is "lost", the polymer switches to rapid shrinkage in an event known as a "catastrophe", the hallmark of dynamic instability [22].
There are many subtleties and caveats to models of dynamic instability.For example, which bonds are weakened by GTP hydrolysis is not well established [33,34], and the transition from GTP-tubulin to GDP-tubulin may have substeps [34].These subtleties are discussed in Appendix S5.We used the direct method of the Gillespie algorithm [21], which is a different implementation than the one in VanBuren et al. [2] and Ayaz et al. [26].In order to validate our Gillespie algorithm, we used the parameters found in Ayaz et al. and confirmed that our simulation produces identical results.The details of our simulation method and the benchmarking of our algorithm against published data can be found in Appendix S5 and Fig. S4.
To our minimal model, we sequentially added parameters that are motivated by recent discoveries and controversies concerning the structure of the microtubule lattice and the microtubule end.We then "compressed" this complex model to see whether the new parameters are essential (viz., become components of the dominating eigenvalues of the FIM) and whether new parameters increase the underlying dimensionality of the model.Following VanBuren et al., we varied 3 core parameters (∆G o long , ∆G o lat , and k H ) and treated the others as nonadjustable (k + , ∆∆G o lat ).In order to compress our microtubule model, we measured four independent observables of the simulations that correspond to experimental data [31].Two observables can be tracked continuously over the time course of the simulation: (1) the length of microtubule (Fig. 4A) and ( 2  describes the conversion of GTP-tubulin into GDP-tubulin ("GTP cap size", Fig. 4B) [35].The second column of Fig. 4A and B shows the eigenvalues over time for these observables.The other two observables are the distribution of microtubule lifetimes (Fig. 4C) and the postcatastrophe shrinkage rate (Fig. 4D).These observables are not tracked continuously because they require postsimulation analysis to determine when catastrophes occurred (see Appendix S6).The second column of Fig. 4C and D shows the eigenvalues for these observables at the conclusion of the simulation, when the distributions have reached stationarity.This framework allowed us to apply our numerical PSC method to this model of microtubule dynamic instability and test the importance of new parameters.
The "seam" parameter can be compressed away We first added a parameter to account for the atypical lateral bonds that form at the "seam" in the lattice, ∆G o seam .These atypical bonds are presumably weaker than other lateral bonds (∆G o seam < ∆G o lat ), a presumption that is supported by molecular dynamics simulations [36] and recent atomic resolution cryo-electron microscopy, in which the lateral bonds at the seam are slightly "open" relative to other lateral bonds [19,33].The apparent weakness of the seam has lead to the proposal that microtubules first crack open at the seam when a catastrophe begins.We wanted to test this proposal by explicitly including the seam in our simple model.
We generated the FIM for a simulation of the Van-Buren et al. model with an explicit seam: ∆G o seam = 0.9 • ∆G o lat based on [36].For all four observables, one eigenvalue strongly dominates (note the log-scale for eigenvalues).This dominance implies that the distribution of each observable is determined by a single effective parameter.This result is not obvious: one expects that the mean and the variance of any given distribution are described by independent parameters, as was the case for the random walk [15].Rather, microtubule observables are similar to the number of proteins in the protein production/degradation model, where both the mean and variance of the distributions are determined by a single effective parameter.
As for the protein production/degradation case, the single effective parameter determining the distribution of each observable is a priori a complex function of the initial parameters.The relative influence of each initial parameter is given by the eigenvector components of the dominant eigenvalue (see column three of Fig. 4A-D).Importantly, we also can see which parameters are not important for a given observable, because these initial parameters will be insignificant components of the dominating eigenvalue.The important components for microtubule length are the longitudinal bond, ∆G is not surprising, considering that these bond energies are what drive polymerization.The important components for the decay constant are more interesting: in addition to the obvious parameter of the GTP hydrolysis rate constant, k H , the decay constant is also determined by ∆G o long and ∆G o lat .A simple interpretation of this result is that a microtubule that forms stronger bonds (and hence grows faster) will have a larger GTP cap.Consistently, microtubules that grow faster have larger GTP caps when End Binding proteins are used as reporters of GTP cap size [35].The lifetime distribution and the post-catastrophe shrinkage rates are similarly complex, depending on both bond energies and k H . Interestingly, k H makes a relatively minor contribution to the microtubule lifetime distribution.This result implies that the best way to avoid a catastrophe might not be to hydrolyze GTP slightly slower but rather to form slightly stronger bonds and hence grow faster (see Discussion).seam did not impact the model's outputs.Of course, we cannot rule out that ∆G o seam will have a significant impact on an observable not measured here.Nevertheless, while the seam may have interesting structural characteristics [19,33], it can be safely ignored in simple models of microtubule dynamics.
Accounting for end structure is essential Next, we added a parameter to account for the tapering of pfs found at microtubule ends [37].Because of tapering, some incoming dimers will bind to sites with lateral neighbors and some will not.Recent work from Castle et al. [38] used Brownian dynamics simulations to show that lateral neighbors block off some of the diffusional paths that lead a tubulin dimer to its binding site.The blocking of these paths creates a "penalty" for the association rate constant k + .As such, we introduce k s , a parameter that reduces the association rate constant k + based on the number of lateral neighbors at each binding site [4,39].Our value for k s is roughly equivalent to the penalties produced by Castle et al.'s Brownian dynamics simulations.We simulated microtubules and calculated the FIM for the system, as above.Fig. 5A-D follow Fig. 4, showing our four observables, the eigenvalues, and eigenvector components.
Despite the introduction of a new parameter, one eigenvalue strongly dominates each observable (Fig. 5).Thus, the introduction of a new parameter did not add an additional dominating eigenvalue, meaning that the underlying dimensionality of the system did not change.This result is also not obvious.Adding a new parameter like k s might have added a new relevant direction in parameter space, but here it did not.Each observable is still governed by a single effective parameter.
Although the addition of k s did not add a new effective parameter, k s does appear in the eigenvector components of the dominant eigenvalues for all 4 observables (see column 3 of Fig. 5).Indeed, k s makes a contribution equal to or greater than k H .This result indicates that accounting for k s , and more generally for the complexity of the structure of microtubule ends, is essential even in simple models of microtubule dynamics that track only a few observables.

Estimating the dimensionality of the system
As shown above, the distribution of each observable (e.g., microtubule lifetime) can be described by a single effective parameter, given by the eigenvector components of the dominating eigenvalue.But are these effective parameters the same for all observables or are there four orthogonal effective parameters?Looking at the eigenvector components for the length, the microtubule lifetime distribution, and the post-catastrophe shrinkage rate, it's clear that ∆G o long , ∆G o lat , and k s are important components of the effective parameter for all three of these observables.So perhaps the effective parameter here is the same.In contrast, the decay constant has a significant eigenvector component from the hydrolysis rate constant k H , indicating that the effective parameter determining the decay constant might differ from the others.So what is the true dimensionality of our microtubule model?In order to estimate the dimensionality of the model, we need to consider all observables as a whole.Intuitively, the dimensionality of the system is defined by the dominating eigenvalues.To perform a more rigorous estimation of dimensionality for each model, we computed the singular value decomposition (SVD) on the eigenvectors of the four dominating eigenvalues corresponding to the four observables [40].We performed the SVD analysis for both our base model with ∆G o seam (Fig. 6, red ) and our base model with ∆G o seam and k s (Fig. 6, blue).Interestingly, in both cases SVD revealed a sloppy distribution of singular values, with one dominating singular value, a second singular value roughly one order of magnitude smaller, and two even smaller singular values one order of magnitude below.The presence of two large singular values means that two effective parameters are enough to fit the data with close to a 99% precision.The vector components for the base model with ∆G o seam and base model with ∆G o seam and k s are shown in Fig. S5A and Fig. S5B respectively.This analysis demonstrates rigorously that the full dimensionality of our model is roughly equal to two in both cases.The first effective parameter drives the length, the microtubule lifetime distribution, and the post catastrophe shrinkage rate, with extended contributions from lattice energies and a smaller contribution from the hydrolysis rate constant.We call this parameter the "effective polymerization" parameter.The second effective parameter drives the decay constant and is a combination of the k H and ∆G o lat .We call this parameter the "effective hydrolysis" parameter.It is remarkable that dynamic instability can be compressed into a two parameter system.Interestingly, the addition of k s did not change the underlying dimensionality of the microtubule model.Rather, k s simply became a significant component of the effective polymerization parameter.

DISCUSSION
As biophysicists, we want to capture the complexity of biology in the simplest possible terms, even if those terms are themselves quite complex.Our work has demonstrated the power of numerical PSC as a method for identifying the essential parameters and low-dimensional structure of complex models.We first validated our method against two analytically-solvable models and then applied it to a well-known computational model of microtubule dynamic instability.Thus, our method opens the door to the simplification of any computational model in biology.
Our application of numerical PSC to a simple model of microtubule dynamic instability provided four direct insights into microtubule growth and catastrophe.First, we were surprised to find that k H , the hydrolysis rate constant, had a lesser influence on the lifetime distribution relative to ∆G o long and ∆G o lat .This suggests that catastrophe might be more efficiently prevented by making stronger bonds rather than by slowing down hydrolysis.Our interpretation is that stronger bonds help prevent pfs from losing their terminal GTP-tubulin dimers, which would cause the pf to become "fully uncapped".Bowne-Anderson et al. argued that uncapping of pfs are the irreversible events that lead to catastrophe [41].Similarly, poisoning of pf ends with the drug eribulin has a very strong effect on catastrophe frequency [42].Therefore, our results showing the importance of ∆G o long and ∆G o lat are consistent with the emerging concept that "pf destabilization" is a root cause of catastrophe.
The second conclusion is that k s contributes significantly to the effective parameters controlling all four observables.k s is important because it significantly changes the rate at which longitudinal bonds are formed.Parameters like k s were absent from early models of microtubule dynamics [2] before the importance of tapering of microtubule ends was clear [43].Additional parameters related to end structure, such as the lateral and longitudinal curvature of tubulin dimers as they enter the lattice [44], are sure to determine microtubule behavior.
Third, the insignificance of ∆G o seam demonstrates that the seam's slight weakness relative to other lateral bonds is unlikely to determine the fate of a microtubule in terms of its growth rate or lifetime.Interestingly, increasing the number of seams in a microtubule experimentally does not change its measured mechanical properties [45].Therefore, although the seam is an interesting structural feature of the microtubule [19], accounting for it is not necessary in simple computational models, at least as far as our observables are concerned.We look forward to expanding our analysis to more complex models and larger sets of observables.
Lastly, our analysis of dimensionality revealed that al-most all data simulated here can be described with only two parameters, corresponding to an effective polymerization parameter and an effective hydrolysis parameter.Computation of such effective parameters is made rigorous and possible by the use of our approach.The addition of new parameters (such as k s ) might contribute to the effective parameters of a model without changing dimensionality.Interestingly, our approach could help experimentalists identify the types of data that are necessary to define the effective parameters.In the microtubule case, we see that the decay constant directly identifies the effective hydrolysis parameter.
Our ability to distinguish between models in science is always limited by the availability of hard data (except in string theory).Which parameters of a model are "stiff" and which are "sloppy" depends critically on the observables that the model attempts to reproduce.In biophysics, the rigor of physical modeling collides against the complexity of biological interactions.A coupling of theory and experiment is necessary to disentangle this complexity.PSC tightens this coupling by improving the interpretability of models, which in turn identifies the key experiments that drive theory forward.

Appendix S1: Derivation of the Fisher Information Matrix (FIM) expression following Machta et al. 2011 [1]
The FIM expression presented in the main text is a way to estimate how model parameters can be fitted to data, and by extension how models with different parameters are distinguishable.We will use the FIM in the latter sense, and in the following recall its connection to data fitting.
Assume we have a mathematical model of a biological systems, with parameters θ, giving the probability y θ, x i of observable x to take the value x = x i .We call d i the experimentally measured probability of value x i and assume that: where we assume r i to be a random Gaussian noise of mean 0 and variance 1, and σ i a local variance so that: Assuming all r i s are independent, the total probability P ( r, θ) of experimentally observing the values d = {d i } given the residuals r = {r i },is thus : where M is the number of points where we try to fit data.
The Fisher Information Matrix (FIM) defines the amount of information that the residuals r i contain on parameters.Intuitively, the FIM tells us about the distinguishability of two parameter sets given the data.It is given by: Since r i are Gaussian distributed, one can explicitly perform the computation of the FIM, which can then be interpreted as a metric g µ,ν quantifying the ability to distinguish between different parameter sets.Following Machta et al. we have: We substitute equation 2 into equation 5.The first sum cancels out because r i is independent of y θ, x i , d i is independent of θ and the expectation value of the residue r i itself is zero.The second term becomes fully deterministic since with the same assumptions ∂ri ∂θν = ∂y( θ,xi) ∂θν . We arrive at the final expression for the Fisher Information Matrix of the model: where we assume all σ i to be equal (and rescaled to 1).The remarkable result is that this metric, while initially computed by averaging over experiments, is a pure function of the model y, and as a consequence, can be used independently of actual experiments to estimate in a deterministic way models distinguishability.Biological models mix rates and energies, so the parameters are potentially of very different nature.Mixing units might yield purely dimensional effects in the analysis of important directions in parameter space.Energy is the most fundamental quantity and kinetic rates are exponentially related to the rescaled energy (in unit of k B T ).We take the derivatives with respect to the log of the parameter (i.e.

∂y( θ,xi)
∂ log θmu = ∂y( θ,xi) ∂θmu θ mu ) to express every parameters in the Fisher Information Matrix with the same effective unit.Therefore, the final expression becomes: where α µ = 0 if θ µ is an energy and α µ = 1 if θ µ is a kinetic rate.

Appendix S2: Conversion for energy k B T to rates s −1 in the microtubule model
As shown in Appendix S1, biological models mix energies and rate constants as parameters.While energies are the most fundamental quantities for our Fisher Information Matrix computation, computing derivatives by incremen-Hsu et al., PREPRINT | 13 tal changes of energies might be challenging since small changes of energies could potentially give big changes of rates and thus of probability distributions.For numerical computations, we vary rate constants by a small increment α, and shift energies such as ∆G o long , ∆G o lat and ∆G o seam logarithmically.In order to ensure a smoother change of probability distribution, we need to come up with a conversion between changing bond energy and the change in rate.
As an example, the off rate of a dimer for the microtubule model is given as: where k + is the apparent on rate and ∆G o tot is the total bond energy associated with the particular dimer.
Let us define the original off rate k − ori and the new off rate after changing the bond strength k − new .. Assume the difference factor is defined as the following: which implies that Equation 10 gives us the change in energy corresponding to a relative change α.Practically, when computing derivatives such as the ones in equation for the Fisher Information Matrix, we thus use a relative rate of αfor kinetic rates, and of ln (1 − α) for energies.Notice that as αgoes to 0 the relative change of energy also becomes zero.However, we can not pick an αthat is too small due to numerical imprecision.Figure 1 shows the relationship between the change in energy and the ratio of change in rates.For example, a 5% change in rate constant is around 0.05 k B T change in energy.

Appendix S3: Analytic calculation of eigenvalues for a one dimensional diffusion system at the first time step
For the one dimensional random walk where the particles diffuse uniformly, the probability density after one time step is proportional to the number of particles at each possible lattice site.Therefore, P (N µ ) = Nµ N total = κ where N µ is the number of particles that are in each possible lattice sites and total number of particles N total = µ N µ .Using equation 7 from Appendix S1 and taking into account that after the first time step only the diagonal elements of the Fisher Information Matrix are nonzero.Therefore, equation 7 simplifies to: where δ µν is the Kronecker delta.Therefore, the corresponding eigenvalues are: . This shows that the eigenvalue of a one dimensional random walk at the first time step is equal to the probability at a given lattice sites squared.The result for

Appendix S4: Analytic calculation of the dominant eigenvalue for the protein production and degradation system
The stationary distribution of a simple production and degradation system is given by the Poisson distribution [2]: therefore, the Fisher Information Matrix for this system from equation 7 becomes: where α n = ∂Pn ∂ρ ρ and β n = ∂Pn ∂δ δ.To find the eigenvalues λ of this matrix, we subtract the identity matrix with diagonal value λ and set the determinant equals to zero: We can calculate α n and β n analytically and show that the magnitudes of the two terms are equal to each other: Thus the determinant matrix becomes the following form, where A = −α n β n : The eigenvalues for this matrix are easy to compute with only one nonzero eigenvalue: To calculate analytically the nonzero eigenvalue: λ 1 , we approximate the Poisson distribution with a Gaussian distribution and change the integral into a summation with γ = ρ δ : The dominating dominating eigenvalue for the protein production degradation system is shown to be a ratio of the production rate over the degradation rate times some constant.The simulations of different ratios matches the analytic solution.
(B) The second eigenvalues for the protein production and degradation rate is non zero during simulation due to the limitation of the physical system itself.At steady state the average number of protein is one hundred protein which means that the smallest shift for probability to calculate the finite derivatives is one protein which is one percent which has a nonzero eigenvalue.
If the value of γ 0, the second term becomes negligible due to the exponential.Thus, the final expression for the eigenvalue is: It is important to note that even though the second eigenvalue is zero from the analytic calculation, due to the limitation of the numerical precision, it is impossible to reach the zero value given the physical definition of the system.For example if production rate ρ = 1 and the degradation rate δ = 0.01, we know that the steady state solution will have a peak at number of protein of N = 100.However, this means that when calculating the finite derivatives, using any shift smaller than 1% will result in the shift being less than one protein which is nonphysical.We show that even using complete analytic values of Poisson distribution to calculate the finite derivatives, we will not be able to reach zero when using the finite derivatives.
The figure is shown in Appendix figure S3B.

Appendix S5: Microtubule simulation
We use VanBuren et al. 2002 as our inspiration for our simulation of microtubule dynamic instability [3].

Parameters
The microtubule has 13 protofilaments with a three monomer offset at the seam.The base parameters in the model are: (1) k + , the association rate constant for tubulin subunts to associate with the end of a protofilament; (2) ∆G o long , the longitudinal bond energy between dimers; (3) ∆G o lat , the lateral bond energy between tubulin subunits in a B-lattice configuration (α−α and β −β); (4) k H , the hydrolysis rate constant for the conversion of GTP-tubulin to GDP-tubulin; (5) ∆∆G o lat , the change in free energy associated with GTP hydrolysis, which is assigned to each lateral bond; and (6) [T ubulin], the concentration of tubulin.The additional parameters added to the model are: (7) ∆G o seam , the lateral bond energy between tubulin subunits in an A-lattice configuration, namely at the seam (α − β and β − α); and (8) k s , the dimensionless factor that reduces k + in the presence of lateral neighbors.See Fig. 3A and B for schematics.

Coupled Random Hydrolysis
An α-tubulin contributes catalytic residues to the GTP pocket of the β-tubulin that sits below it.Therefore, a tubulin subunit cannot hydrolyze its GTP unless another tubulin subunit is above it; in other words, only non-terminal subunits can hydrolyze GTP.The GTP hydrolysis reaction for all non-terminal subunits occurs at random time intervals (see Gillespie algorithm below).This implementation of GTP hydrolysis is known as coupled random hydrolysis [4].We do note, however, that earlier models of dynamic instability used other implementations of GTP hydrolysis, e.g., where the hydrolysis reaction was obligate after association of a new terminal subunit, known as vectorial hydrolysis.But coupled random hydrolysis is the standard among contemporary models.
Our model assumes the transition from GTP-tubulin to GDP-tubulin occurs in a single step, with no intermediates in the hydrolysis pathway.A single step is surely a simplification, as GTPases often have important GDP-Pi intermediate states.Recently, Manka et al. solved a cryo-EM structure of the putative GDP-Pi state [5], and very recent models have incorporated a GDP-Pi state explicitly [6].Testing the relevance of a GDP-Pi state will be the subject of future studies.

Lateral Weakening
The effect of GTP hydrolysis in the VanBuren model is a weakening of lateral bonds.This choice is based on the observation that protofilaments peel outward after a catastrophe [7], which indicates that the lateral bonds rupture first.More recent observations suggest that longitudinal bonds may also be affected by GTP hydrolysis; more specifically, the N-domain of α-tubulin appears to compact down into the β-tubulin below it.

Gillespie algorithm
We use the direct method of the Gillespie algorithm to simulate all the possible events for the microtubule at a given time [8].The possible events for the microtubule simulation are: association of dimers, dissociation of dimers, and hydrolysis of GTP-tubulin to GDPtubulin.The association and dissociation events only happen at the ends of protofilaments.The association rate is equal on the top of each protofilament and is defined as the association constant k + multiplies by the concentration of tubulin: The dissociation rate depends on ∆G o total : the total bond energies the tubulin dimer have with its neighboring dimers: The total number of hydrolysis events depend on the number of GTP tubulin that are present in the lattice and a nonterminal GTP can be hydrolyzed into a GDP at a fixed first order rate k H which leads to a total rate of hydrolysis at any given time as: The effect of hydrolysis is a decrease in the bond strength laterally by ∆∆G o .This is commonly known as the coupled random hydrolysis model.We sum up all the possible rates r i for all the pos-sible events αand generate two random number R 1 , R 2 between zero and one.We choose the events that satisfy the following condition: and we increment the simulation time t by τ , where τ is defined as

Benchmarking
Because we use the direct method of the Gillespie algorithm, we benchmarked our Gillespie simulation against published results of Ayaz et al. [9] and VanBuren et al. [3] to ensure that our simulation was properly implemented.We used their precise parameter values and simulated microtubule growth in the absence of GTP hydrolysis (which is handled differently in the two models).Our simulation recovered their results exactly (see Fig. 4A  and 4B).Therefore our simulation is well-executed.

Reproduction of Experimental Data
Simple models like those described here are best at reproducing microtubule growth rates and post-catastrophe shrinkage rates across a range of tubulin concentrations.The models can reproduce the mean lifetime at a single tubulin concentration, but then the trouble begins.These models cannot reproduce the mean lifetime across a range of tubulin concentrations.The models are much too sensitive to [T ubulin-at high [T ubulin, catastrophes become exceedingly rare, in contrast with experimental observations.Similarly, the distribution of microtubule lifetimes is best described by a Gamma distribution because catastrophes are a result of microtubule "aging".Simple models do not reproduce this Gamma distribu-

Appendix S6: Microtubule simulation analysis
We use an in house Matlab code to analyze the microtubule trajectories over time.The algorithm smooth the trajectories and identify local maxima and minima.The maxima correspond to potential catastrophe positions and minima correspond to potential rescue positions.lifetime and post-catastrophe shrinkage rate is then determined.Note: to avoid fluctuation of the stochastic simulation, a threshold of 250 nm is used (since this is the point spread function of our microscope) such that any lifetime that is counted towards the probability distribution starts from a minima to a maxima while passing 250 nm and vice versa for the post-catastrophe shrinkage rate but instead it goes from a maxima to a minima.

Figure 1 .
Figure1.Numerical PSC.(A) Schematic of a ("stiff") parameter versus a ("sloppy") parameter in parameter space.The "stiff" parameter changes the observable more significantly than a "sloppy" one.Note: the red and blue curves are probability distributions when shifting different parameters by the same amount.(B) Schematic of one-dimensional random walk with parameters θi, the probability of jumping to neighboring sites.(C) The three steps of our numerical PSC method: 1. Generate the probability distributions of the observables needed (particle density at different time steps), 2. calculate the finite derivatives for the Fisher Information Matrix and its corresponding eigenvalues, and 3. repeat at each time step of the simulation and track the eigenvalues over time.Note: our numerical PSC is able to reproduce the analytic result from[15].

Figure 2 .
Figure 2. Protein production and degradation.(A) Schematic of a simple protein production-degradation system with production rate ρ and degradation rate δ.(B) Plot of eigenvalues over time for the protein production-degradation system.There is one dominating eigenvalue and it matches the analytical result.(C) Plot of eigenvector % from the dominating eigenvalue of panel B.The production rate ρ dominates at early time points but at stationarity, the production rate and degradation rate contribute equally.Note: The eigenvector % is the absolute value of the parameter component.

5 A
) the decay constant that Hsu et al., PREPRINT | Microtubule model and parameters C Simulated growth curve B New parameters

Figure 3 .
Figure 3. Microtubule dynamics.(A) The base model of our microtubule simulation following VaBurren et al. (B) Two new adjustable parameters were introduced to the base microtubule model, the weaker lateral bond at the seam ∆G o seam and the neighboring penalty ks, which decreases the on rate for tip positions that have neighboring tubulins.(C) Plot of length versus time from our microtubule simulation.(D) (Left) Plot of the decay constant distributions of two ∆G o long .(Right) Plot of the decay constant distributions of two lateral bonds at the seam ∆G o seam .

Figure 4 .
Figure 4. Numerical PSC for the base microtubule model plus ∆G o seam .Eigenvalues and eigenvector components (%) for four observables: (A) Length (the exact number of dimers in each microtubule), (B) Decay constant of GTP from the tip, (C) Average lifetime of the microtubule, (D) Postcatastrophe shrinkage rate.Note: The eigenvector % is the absolute value of parameter component.

Figure 5 .
Figure 5. Numerical PSC for the base microtubule model plus ∆G o seam and ks.Eigenvalues and eigenvector components (%) for four observables: (A) Length (the exact number of dimers in each microtubule), (B) Decay constant of GTP from the tip, (C) Average lifetime of the microtubule, (D) Post-catastrophe shrinkage rate.Note: The eigenvector % is the absolute value of parameter component.

Figure 6 .
Figure 6.Identifying the dimensionality of the models.Plot of the singular values for the base model plus ∆G o seam and the base model plus ∆G o seam and ks.Both systems are two-dimensional.

Figure S1 .
Figure S1.The conversion plot for various percentage change for parameters that have a unit of kBT .The conversion rate is universal and it does not depend on the value of the parameter itself.Note: a 5% change in parameter value is about 0.05 kBT .

Figure
Figure S2.(A)The particle density for a non uniform one dimensional random walk over time (with probability higher to the right side of the space).(B) The eigenvalues for the non uniform one dimensional random walk.The eigenvalues at the first step of the simulation is not unity compare to the uniform random walk.(C) At the first step of the simulation for the one dimensional random walk, the eigenvalue is equal to the squared rate given for the non uniform simulation.This result is universal for any rates given.(D) The eigenvalues at the first step of a uniform simulation is also equal to the squared rate of the simulation i.e. squared of one over the number of parameters.

Hsu et al., PREPRINT | 15
Figure S3.(A)The dominating dominating eigenvalue for the protein production degradation system is shown to be a ratio of the production rate over the degradation rate times some constant.The simulations of different ratios matches the analytic solution.(B) The second eigenvalues for the protein production and degradation rate is non zero during simulation due to the limitation of the physical system itself.At steady state the average number of protein is one hundred protein which means that the smallest shift for probability to calculate the finite derivatives is one protein which is one percent which has a nonzero eigenvalue.

Figure S4 .
Figure S4.Microtubule simulation bench mark against computer simulation of Ayaz et al. and experimental data of Walker et al. (A) GMPCPP tubulin (B) GTP tubulin.

Figure S5 .
Figure S5.The first column is the leading vector for the highest singular value for the microtubule system.The longitudinal bond and lateral bond are the controlling parameters.The second column is the leading vector for the second highest singular value for the microtubule system.The lateral bond and the hydrolysis rate is the controlling parameters: (A) Microtubule simulation without neighboring penalty.(B) Microtubule simulation with neighboring penalty Note: The SVD % plotted are the absolute value of parameter component.
Hsu et al., PREPRINT | 17 tion but rather given an exponential distribution of lifetimes.More complex models do better.At present, no published model is able to reproduce data on templated nucleation.