Identifying mechanisms and therapeutic targets in muscle using Bayesian parameter estimation with conditional variational autoencoders

Cardiomyopathies, often caused by mutations in genes encoding muscle proteins, are traditionally treated by phenotyping hearts and addressing symptoms post irreversible damage. With advancements in genotyping, early diagnosis is now possible, potentially preventing such damage. However, the intricate structure of muscle and its myriad proteins make treatment predictions challenging. Here we approach the problem of estimating therapeutic targets for a mutation in mouse muscle using a spatially explicit half sarcomere muscle model. We selected 9 rate parameters in our model linked to both small molecules and cardiomyopathy-causing mutations. We then randomly varied these rate parameters and simulated an isometric twitch for each combination to generate a large training dataset. We used this dataset to train a Conditional Variational Autoencoder (CVAE), a technique used in Bayesian parameter estimation. Given simulated or experimental isometric twitches, this machine learning model is able to then predict the set of rate parameters which are most likely to yield that result. We then predict the set of rate parameters associated with both control and the cardiac Troponin C (cTnC) I61Q variant in mouse trabeculae and and model parameters that recover the abnormal 61Q cTnC twitches.


Identifying mechanisms and therapeutic targets in muscle using Bayesian parameter estimation with conditional variational autoencoders
Travis Tune a,b , Kristina B Kooiker b,c , Jennifer Davis b,d,e , Thomas Daniel a,b , and Farid Moussavi-Harami b,c,e   This manuscript was compiled on May 7, 2024 Cardiomyopathies, often caused by mutations in genes encoding muscle proteins, are traditionally treated by phenotyping hearts and addressing symptoms post irreversible damage.With advancements in genotyping, early diagnosis is now possible, potentially preventing such damage.However, the intricate structure of muscle and its myriad proteins make treatment predictions challenging.Here we approach the problem of estimating therapeutic targets for a mutation in mouse muscle using a spatially explicit half sarcomere muscle model.We selected 9 rate parameters in our model linked to both small molecules and cardiomyopathy-causing mutations.We then randomly varied these rate parameters and simulated an isometric twitch for each combination to generate a large training dataset.We used this dataset to train a Conditional Variational Autoencoder (CVAE), a technique used in Bayesian parameter estimation.Given simulated or experimental isometric twitches, this machine learning model is able to then predict the set of rate parameters which are most likely to yield that result.We then predict the set of rate parameters associated with both control and the cardiac Troponin C (cTnC) I61Q variant in mouse trabeculae and and model parameters that recover the abnormal 61Q cTnC twitches.

Machine Learning | Cardiac Muscle | Myopathies | Sarcomere Models | Inverse Problems
In muscle, force is generated by the interaction of two overlapping filaments, the myosin containing thick filaments and the actin containing thin filaments (Figure 1).The contractile process is regulated by troponin-tropomyosin complexes spaced along each thin filament.In the presence of calcium ions, these complexes undergo a conformational change, allowing myosin motors to attach to actin binding sites on the thin filament.Upon binding, myosin motors generate force (1).This complex series of chemo-mechanical events can be modeled as state transitions in the regulatory proteins in the thin filament and energy (ATP) transformations of the myosin cross bridges extending from the thick filament (Figure 1) (2)(3)(4)(5)(6).Both genetic and acquired muscle disorders can interrupt the chemo-mechanical state transitions of contractions.And several of the above studies have used computational simulations to connect estimates of state transition rates to observed mechanical behaviors of abnormal muscle contraction.These approaches generally fall into the category of forward models -that is they generate predictions of muscle contractile behavior based on the best possible estimates of the kinetics of state transitions as well as the mechanics and geometry of the contractile lattice.
Recently, advances in machine learning and optimization methods allow us to address the connection between the underlying mechanochemistry of muscle and the emergent behavior as an inverse problem.That is, given the observed contractile behavior, what underlying kinetics, mechanics, and geometry best explain the observation?Answering this question through experimental efforts alone is both time consuming and challenging.Simulations, on the other hand, can allow us to explore a vast parameter space and allow us to estimate the effect of different underlying parameters.However, even then, there are significant challenges regarding selection of rate constants or other model parameters at baseline or with particular diseased states.For one thing, because we are exploring a high dimensional parameter space, our system is under-determined, and there are likely many different combinations of rate constants which correspond to indistinguishable outcomes.Also, because many current simulations are based on Monte-Carlo methods (3), random variation in both the simulation and experimental data can make predictions difficult.Solving the inverse problem requires a very large data set that spans many combinations of rate constants.

Significance Statement
Machine learning techniques have potential to accelerate discoveries in biologically complex systems.However, they require large data sets and can be challenging in high dimensional systems such as cardiac muscle.In this study, we combined experimental measures of cardiac muscle twitch forces with mechanistic simulations and a newly developed mixture of Bayesian inference with neural networks (in autoencoders) to solve the inverse problem of determining the underlying kinetics for observed force generation by cardiac muscle.The autoencoders are trained on millions of simulations spanning parameter spaces that correspond to the mechanochemistry of cardiac sarcomeres.We apply the trained model to experimental data in order to infer parameters that can explain a diseased twitch and ways to recover it.A common way to approach inverse problems is by reformulating them as a probabilistic inference problem using Bayes' Theorem.Rather than seeking a particular solution to a problem, we treat the solution as a random variable, and attempt to quantify our uncertainty in its value.This allows for incorporating uncertainty, whether from noise or the under-determined nature of the model in question, into our predictions.A challenge of Bayesian inference in high dimensional data is slow convergence, which can be improved using machine learning techniques.
Previously, a machine learning architecture called Conditional Variational Autoencoders (CVAE) have been used in other contexts in order to solve high dimensional noisy inverse inference problems (8).Auto encoders work by learning a distribution over an abstract latent space of an input, and then learning to reconstruct the input from the reduced representation.In order to solve the inverse problem of rate parameters that describe a particular twitch, we first generated a very large set of twitches using our spatially explicit half sarcomere muscle model (2,7,9).In our spatially explicit muscle model, the thick and thin filaments are comprised of a series of springs with stiffnesses determined experimentally.Our model allows us to specify the geometric configuration of filaments, myosin motors, and actin binding sites, and allows us to alter the rate kinetics of motors or binding sites (6,7,10), or even a subset of the population according to any spatial distribution.We first trained the model to reduce and reconstruct the distribution describing the vector of rate parameters ⃗ r using both the vector (in time) of a twitch,f , and ⃗ r.We then tested the model by giving it only information on f alone.The posterior distribution p(⃗ r|f ) represents our belief that a set of rates ⃗ r will result in an observation f , conditioned on our specific dataset, generated using our spatially explicit model.We are able to successfully train that model to predict probability of a given rate constant resulting in a particular twitch.We then apply the model to a set of experimental mouse cardiac twitches and predict possible combination of rate constants that can produce a mouse genetic dilated cardiomyopathy (DCM) twitch.Lastly, we predict a combination of parameters than can best recover the abnormal DCM twitch.We show feasibility and validation of using machine learning techniques to infer underlying mechanism of diseased muscle and best ways to correct them.

Results
A. Parameter Inference of a Simulated Twitch.After training the CVAE network on a simulated training dataset, as described in the Methods section, we can illustrate the probability density function for all of the nine parameters that underlie a specified twitch.By examining only the time series of the twitch, and comparing to the known rate factors which led to the simulated twitch, we can estimate the accuracy of its parameter inference on unseen data.
In order to visualize the posterior distribution p(⃗ r|f ), we show the resulting multidimensional distribution as a corner plot.In the corner plot, every 2-D projection of the high dimensional distribution is shown.This allows us to see every 2-D marginalized probability distribution.Also, along the diagonal, we can show the 1-D marginalized probability distribution of every rate factor.The axes indicate the rate factors that were multiplied by the base rates, and are shown on a log scale from 10 −1 to 102 .An example simulated twitch and inferred distributions are shown in Fig. 2.
Corner plots allow us to visualize the covariance between each 2-D pairwise parameter combination.For example, we can see in Fig. 2 the covariance between the rate factors rt,12 and rt,41, indicating that changes in one parameter can potentially be partially compensated for by changes in the other.It also can be interpreted as a kind of parameter sensitivity analysis, indicating, for example, that the rate rx,34 must be tightly constrained for this particular simulated twitch.
B. Measuring the credibility of the CVAE.We used the probability-probability plot to ensure that the probability density estimate we find is representative of the actual probability density.On the x-axis is shown the credible interval and on the y axis is shown the fraction of times the true rate was found within the credible region for our validation dataset.For an ideal estimator, we should expect that when we estimate the probability that the true parameter lies with a certain volume of the rate space to be X%, we should expect the true parameter to be in that volume X% of the time.Values above the diagonal indicate underconfidence, and values below indicate over-confidence in the estimation.We show this probability-probability plot in Fig. 3, with each colored line representing one of the nine rates and the black dashed line indicating the ideal case.These were constructed by finding the cumulative probability density for each 1-D marginal probability estimate, for each data point in the validation dataset, and then plotting the fraction of posterior samples less than the true rate parameter.Our testing set contained 15050 samples, or 1% of our total simulated dataset.

C. Control and I61Q cTnC twitches from mouse trabeculae.
We wanted to see how our CVAE would perform on predicting rates in real muscle, so we collected twitch data from mouse trabeculae.We used control and I61Q cTnC mice, which we have previously shown to model genetic cardiomyopathy with hypocontractility at cellular and organ level (11,12).The trabeculae were held isometrically and stimulated at 1 Hz while stress was recorded (see Methods for further details).We recorded 10 twitches from 8 different muscles for both control and I61Q cTnC transgenic mice (11).We averaged the twitches together for each type and show the mean ± standard deviation in Fig. 4. Consistent with previous findings (11,12), we found that the I61Q cTnC variant resulted in reduced stress relative to the control twitch by approximately half.Table 1 shows mean and standard deviation of the peak stress, as well as peak, rise, and fall timings of stress.

D. Parameter Inference of Mouse Cardiac Twitches.
Using the mean experimental twitches for control and I61Q cTnC across individuals (Fig. 4), we next used the trained CVAE to infer the set of rates which were most likely to lead to  a simulated twitch which was similar to the experimental twitch.Fig 5 shows the predicted probability distributions for the mean experimental twitches of both the control (blue) and I61Q cTnC (orange), with the most probable values indicated.One notable result from the probability distribution plots is that rate constants for the calcium thin filament on and off rates (rt,12 and rt,41) are among the most divergent between the two twitches, which is consistent with the known experimental data of the I61Q cTnC variant (13).The results demonstrate the strength of the CVAE as it was not provided any information about the underlying mechanism of the abnormal twitches.
Next, we wanted to use the probability distributions plots to determine how well we can recapture the experimental twitches.Using the median value of the estimated distribution, we simulated a twitch using our spatially explicit model and plotted the resulting simulations along with the experimental twitches (Fig 5, upper right).In general, the simulated twitches using the rate parameters estimated by the CVAE follow the rising and falling characteristics of the simulated twitches, but do not achieve the same peak force.This is particularly noticeable in the control mice. ) Fig. 5. Upper right: Experimental and simulated isometric twitch stress for control and I61Q cTnC are plotted against time (solid lines).Simulations (mean solid black lines, standard deviation shaded) used the most probable rate parameters estimated from the CVAE model.Lower left: Estimated probability distributions for the control and I61Q cTnC experimental twitches, with the most probable rate for each indicated.Orange corresponds to I61Q cTnC.Blue corresponds to control E. Inferring rates associated with I61Q cTnC variant.In the parameter inference above, we allowed the CVAE to vary all nine rates in order to best fit the experimental data in both control and I61Q cTnC variants.However, we know from experimental work that the likely mechanism by which the I61Q cTnC variant decreases muscle performance is by reduced Ca 2+ binding affinity (14).We decided therefore to start from the control parameters inferred in Fig. 5 (blue) and then selectively alter the two parameters affecting Ca 2+ affinity, rt,12 and rt,41, to match that of the inferred I61Q cTnC twitch (orange).This hybrid twitch, shown in Fig. 6, therefore combines the inference of the CVAE with experimental knowledge to provide an estimate of an I61Q cTnC twitch.The I61Q cTnC simulation with this hybrid rate factor combination does not match the experimental data as well as when all nine rates were allowed to vary, which is expected.F. Thick filament intervention to correct thin filament Ca 2+ deficiency.Starting from the hybrid twitch we simulated in Fig. 6, we wanted to know what intervention we could perform (in simulation) which could potentially correct for the alteration in thin filament Ca 2+ sensitivity we introduced.

Wild
To accomplish this, we generated a second data set in which we fixed the rate factors in the thin filament that were used in the hybrid twitch (Fig. 6) and allowed the rates in the thick filament to vary as before, from .1 to 100 over a log uniform scale.We then trained a second CVAE network to predict the probability distribution of rates over this reduced space.
Once the CVAE had been trained sufficiently (as determined by the training curve in conjunction with a prob-prob, as in Fig. 3), we used the original control twitch as input and generated a probability distribution over the thick filament rates.We then chose the rate factors corresponding to the median value of the probability distribution as input to the spatially explicit model.The probability distribution, median rate factor, and resulting twitch are shown in Fig. 7.We found that the estimated intervention approximately restored the peak twitch force, but that the relaxation time had been increased in comparison to the original control simulation.These findings are similar to our published work using the myosin activator Danicamtiv to recover the abnormal twitch of the I61Q cTnC mice (12).

Discussion
In this study, we wanted to combine Bayesian inference methods with a spatially explicit muscle model in order to predict combination of transition rates that can reporduce the isometric twitches from healthy and diseases mouse cardiac trabeculae.By re-framing this inverse problem using Bayes' law, we can infer a probability distribution over the rate space -rather than single rate vales -which may generate simulated twitches matching experimental data.By finding such distributions for both control as well as different genetic variants in mouse cardiac trabeculae, we hope to further be able to develop a method to infer how changes in some rate factors can 'undo' changes made in another.We began by updating our previously published computational model of the sarcomere to include additional states in order to more easily compare the spatially explicit muscle model to experimental results.The new model was used to generate a large data set of cardiac twitches in which 9 of the model parameters were randomly varied.We then used Bayesian parameter estimation with a Conditional Variational Autoencoder (CVAE) to generate probability density plots of parameter(s) that result in a particular cardiac twitch.
Four key results emerge from our study.First, changes to probability calculations allow for more accurate simulations with fewer times-steps and less variations.This will have implications for others who use similar stochastic models of muscle contraction.Second, variational inference enables accurate parameter estimation for high dimensional data such as muscle twitch.Third, machine learning tools such as conditional variational autoencoders that are trained on simulated data, can be applied to experimental twitch data.We are able to predict the parameter space that can produce particular normal or diseased twitches.Fourth, we are able to use our newly trained tools to predict possible ways to recover abnormal cardiac twitches.
Before we expand on the results, it is important to highlight the limitations of this study.First, the spatially explicit model only captures multiscale interactions at the half-sarcomere level while in whole muscle, interactions between heterogeneous collections of half sarcomeres as well as structures outside the sarcomere may have significant effects on whole muscle function (5,(15)(16)(17).A whole muscle is not simply a half sarcomere scaled up as we have done here.Despite this, is has been shown that isometric twitch stresssimulated and experimental -can be predictive of underlying biophysical perturbations (18).
Another challenge is the vast number of parameters which could potentially be explored.Our spatially explicit model contains parameters that kinetic, mechanical, and geometric aspects of the sarcomere.Kinetic parameters include the transition between states in the actin binding states and myosin crossbridges, such as the nine rate factors we modified in this paper.However, those nine are only a subset of the overall rate space that could be affected by genetic variants.Mechanical parameters represent the elastic coupling between neighboring myosin heads and binding sites, as well as the mechanical properties of the myosin heads themselves, and changes in that coupling can lead to very different sarcomeric function (2).Geometric parameters represent the spatial distribution of myosin heads including their orientation, number relative to thin filaments and radial spacing, all of which can depend on species (19,20).Because the model is spatially explicit, any sub-population, or even multiple subpopulations, can be given different characteristics, leading to an exceedingly large parameter space.And, because of the spatially explicit nature of the underlying model, each value can be varied.Spatial variation in parameters can be useful, since the penetrance of a genetic variant may not be 100%, but rather only a sub-set of proteins may be affected, and their distribution may be uniformly random or spatially heterogeneous (18).
Changes to probability calculations allow for more accurate calculations.One significant modification to the spatially explicit muscle model is the implementation of the matrix exponential in obtaining the per time step probability of transition for states.Previous work with our model used fewer states and were isometric, in which the approximate probability calculations were sufficient.Thus our model, and other similar Monte-Carlo simulations, simply used the rate and simulation time step to compute a transition probability (7,21).However, with the addition of states this approximation should become less accurate.For example an unbound crossbridge in state 1 could transition into one of three states.Using our previous probability calculation, each competing state would require a smaller and smaller time step for an accurate approximation of the probability of transition, increasing computational cost.Using Eq. 2, since the probability is guaranteed to be conserved regardless of time-step size, we can use coarser time steps.This speeds calculations not only by having to simulate fewer time-steps per twitch, but because variation is reduced, we can average fewer twitches together to form each data point in the training dataset.Due to these changes in the probability estimation, we expect that techniques with higher frequency oscillations and amplitudes, such as Nyquist measurements in (22), should be feasible.

Variational inference allows for approximation of high dimensional inverse problems.
In general, high dimensional inverse problems are difficult for a number of reasons, namely that there is often no unique solution for a particular observation, but rather many solutions which lie on a manifold in a high dimensional space.In other words if we observe some data ⃗ f , we can expect for a theoretical model, there will be many parameters ⃗ r which can generate that data (23).A common way to deal with this under-determined nature of the problem is by adding regularization terms to a cost function.By adding additional constraints, one can limit the number of potential solutions (24).By reformulating the problem in terms of Bayes Theorem, we can model the solution as a random variable and guess its probability distribution instead of treating it as a single value.

Combination of Simulations and ML methods can illuminate underlying mechanisms of muscle diseases and ways to recover function.
Machine learning methods such as variational autoencoders are increasingly applied in a variety of biological settings to identify cell types that contribute to disease states (25), design novel protein variants (26) or recombinases (27).Recently, autoencoders have been applied to multiplexed immunofluorescence images to understand how alterations in different processes can alter subcellular organization (28).For the first time, we apply CVAE to cardiac twitch simulated and experimental data.As seen in Figure 3, CVAE can predict the actual rate of a simulated twitch.The method can inform about underlying cardiac sarcomere biology.For example, in Figure 2, there is very little variation in the predicted rx,34 for the single twitch shown.This is consistent with the fact that ADP release is the rate-limiting step in the cross-bridge cycle and cardiac muscle relaxation kinetics.
The traditional way of studying sarcomeric pathology is to start with biochemical or biophysical assays often from recombinant proteins and then scale up to cellular, tissue or animal models.Based on the mechanism of action, potential treatment strategies can be tested in vitro and then in vivo.This approach is informative for single variants but lacks the resolution to screen many variants at once.Here we apply novel ML tools to muscle twitch data to infer underlying mechanism of action.As seen in Figure 5, there are many possible set of parameter combinations that can perform reasonably well in generating a twitch that resembles an average control or I61Q cTnC.For many of the thick filament parameters, there is significant overlap between the rates for these two conditions.However, for most of the thin filament parameters, there is very little overlap between the possible rates.As seen in Figure 5, there is minimal overlap between the rt,41 parameter space between control and I61Q cTnC.These findings are consistent with the known mechanism of I61Q cTnC variant, which reduces Ca(2+) binding affinity and cTnC-cTnI interaction (13).This demonstrates the power of this technique as the trained model did not know anything about the underlying mechanism of the diseased twitch.
Another use of of ML methods in cardiac biology is in predicting potential therapeutic targets.While there are no examples of this approach in sarcomere biology, a recent study used ML methods to identify a small molecule that corrected gene network dysregulation in NOTCH1haploinsufficient hiPSC-derived endothelial cells (29).As seen in Figure 7, we generated a twitch with the best predicted thin filament parameters that fit the I61Q cTnC data (orange trace).We then, generated a new smaller data set where only the thick filament parameters are varied in order to tell us about potential thick filament-based interventions to correct the abnormal twitch.Two effective strategies seem to be reducing ADP release rate (decreasing rx,34) and increasing the DRX myosin population (decreasing rx,61).These two rate constants are affected by the myosin activator Danicamtiv, which we have previously shown to correct the hypocontractility seen in the I61Q cTnC mouse model (12).While the peak force is clearly recovered by using this type of approach, the relaxation is prolonged, which we also saw in our experimental treatments with Danicamtiv.

Animals.
All experiments followed protocols approved by the University of Washington Institutional Animal Care and Use Committee according to the "Guide for the Care and Use of Laboratory Animals" (National Research Council, 2011).We used a previously published transgenic mouse model of genetic cardiomyopathy that over expresses the calcium desensitizing I61 cTnC (cardiac troponin C) variant.The transgenic mice have cardiac chamber dilation and reduced contractility compared to wild type mice (11,12).Control and I61Q cTnC mice between the ages of 4-6 months were used to collect experimental cardiac twitch data.
Experimental mouse cardiac twitch force measurements.Intact papillary muscles were dissected from the right ventricle of mouse hearts, secured between two aluminum t-clips (Aurora Scientific, Ontario, Canada) and mounted into the IonOptix Intact Muscle Chamber (IonOptix, Westwood, MA) between a force transducer and a length controller.The papillary was then submerged in an experimental chamber and was continuously perfused with modified Krebs buffer (1.8 mM CaCl) at 33 °C aerated with a 95/5 percent oxygen/carbon dioxide gas mixture to maintain physiological pH.Twitches were generated after field stimulation at 1 Hz with oscillating polarity.The initial length was set to just above slack.After pacing for about 20 minutes at 1 Hz, papillary muscles were stretched to an optimal length where peak twitch no longer increases for data acquisition.Width and thickness were measured for each muscle preparation to determine cross-sectional area, and all twitches are reported as stress ( mN mm 2 ).Data was analyzed using the IonOptix IonWizard software.Twitches were averaged over a 30 second recording with ten individual twitches for each preparation exported.
Spatially explicit half-sarcomere model.We began with the spatially explicit model as described in our previous publications (2,4,6,7,9,10,30,31).In this model, we define a 3D lattice of thick and thin filaments, each composed of a series of elastic elements in which the nodes represent myosin cross bridges and actin binding sites respectively.The stiffness of k thick (2020 pn/nm) and k thin (1743 pn/nm) are determined by measuring the stiffness of entire thick and thin filaments by a combination of microscopy and x-ray diffraction and using the repeat distances of 38.7 and 43 nm to scale the stiffness of each segment of the two filament types (2,32,33).Titin is included, and is modeled as a passive exponential spring with F T itin = ae b∆L element with a=220 pN and b = 0.0045 nm −1 , as previously described (7,10).
Crossbridges themselves are composed of a torsional and linear spring, situated at the nodes of the thick filament.The three state model described in previous articles consisted of a single free state, a loosely bound state (pre-powerstroke), and a tightly bound state (post-powerstroke).The stiffness of the crossbridge states, which are kr = 5 KT nm −2 and k θ = 40 KT rad −2 , and the zero points for each state for each state, which are r weak = 19.93 nm, rstrong = 16.4 nm, θ weak = 47.16 rad, and θstrong = 73.2rad, are determined by electron tomography and x-ray diffraction (6,34,35).
In order to more easily connect to experimental measurements of state transitions and other models, we have expanded our spatially explicit model to incorporate additional states in the crossbridge chemo-mechanical cycle.In the myosin crossbridge, we split the unbound state into two states corresponding to myosin with ATP and ADP (states 1 and 5, respectively); we split the tightly bound state into two states corresponding to a post power stroke and a rigor like state (states 3 and 4, respectively).Transitions between states are based on the free energy differences between states, with free energy given by the free energy changes at each step in the ATP cycle plus strain potential energy in the myosin head.Rate constants and functions given in other models (3,36) were used as a basis for the transition rates for the new states.
In the thick filament, forward rates are dependent on the free energy and the distance between the crossbridge head and the actin binding site.We define free energy as the potential energy of the two-spring crossbridge system, minus the free energy drop of each step of the ATP cycle, as measured in solution (36).
Here, kr and k θ refer to the stiffness of the radial spring element (the globular domain) and radial stiffness (the converter domain), and the subscripts (W,S) refer to the weak and strong states, with the change in the spring equilibrium point being what generates force.Energies are given in units of KT.The rate equations are then calculated as: We also included transitions between the discorded relaxed state (DRX, state 1) and the super relaxed state (SRX, state 6).The SRX state, also called the parked or OFF state, is a state with very low ATP turnover which allows for energy conservation in resting muscle (37,38).The rate r SRX gives the rate from the DRX to the SRX state and was set at a baseline of 50s −1 , which was chosen so that during tetanus roughly 10% of cross bridges were in a bound state.The reverse rate, r DRX of SRX to DRX uses the same functional form as that of (3): with the baseline rate k 0 ps set to 50s −1 , which was chosen so that in inactive muscle, approximately 50% of myosin heads would be in the SRX state, as found in (38,39).Another significant modification to the model involves the way we calculate the probability of state transitions.In each time step, we determine the probability of a transition for each individual myosin head and its closest neighbor actin binding site.Subsequently, we must compute the probability of transitioning from one state to another within the time-step dt.Previously, the probability of a transition from state i to state j was calculated simply as P = e −r ij * dt .In a multi-state Markov model, however, when r ij is large compared to dt, this leads to the sum of all probabilities being greater than one.This also does not account for the possibility of multiple transitions within a single timestep.For example, even though certain transitions may be directly impossible, we need to account for the possibility that multiple state transitions occur within a single time step.When dt is small relative to the rate r −1 ij the equation is approximately correct.However, because the rate functions used have infinite walls at some strain values, indicating rapid detachment at high strains, it can become impossible to have a small enough time step.
Here, we instead rely on the matrix exponential of the rate matrix Q, which is the matrix which has elements r ij giving the transition rate from state i to state j, with r ij calculated for each possible transition based on each crossbridge's configuration, at each time step (40,41).The diagonal entries of Q are chosen so that each row sums to 0, guaranteeing conservation of state transitions.Then the probability matrix P can be found by taking the matrix exponential: P = e Q * dt .The matrix exponential is defined by e X = ∞ k=0 1 k! X k , but can be efficiently approximated using Pade's approximation (42).
Each element p ij of the matrix P gives the probability that if the system starts in state i, it will end up in state j after time dt.It also guarantees that the sum of probabilities sums to 1. So, for each crossbridge in each time step, we draw a random number from a uniform distribution from 0 to 1 and compare it with the row in P which corresponds to the current state of the crossbridge in order to determine the new state.
This also allows us to account for multiple transitions in a single time step as well.So for example, even though it's impossible to go directly from state 1 to 5 in the model (r 15 = 0), we can still calculate the probability of a crossbridge passing through multiple states to in one time step, p 15 ̸ = 0.This allows us to use coarser time steps than previous instances of the model while still gaining better signal to noise.As dt goes to zero, P becomes the identity matrix and as dt goes to infinity, P will indicate the steady state distribution for the current geometric configuration and calcium concentration for each crossbridge.Because it gives the steady state only for the current geometric configuration, it cannot take into account implicit cooperative effects like compliant realignment.However, it can be used to initialize the state of the model for a given calcium concentration.
Training data.Our training data is generated by randomly choosing multiplicative factors over the range (.1, 100) from a log uniform distribution, and every corner plot shown here shows this factor on a log scale.Each of the 9 rate functions or constants shown in 8 is multiplied by this factor, changing it from its default value that was chosen based on literature values to generate mouse cardiac twitch (43).Each simulated twitch was 1 second in duration, as in the experimental protocol, and consisted of 1ms time steps.The calcium transient used in our simulations was obtained from isolated mouse adult cardiomyocytes incubated in Tyrode's Buffer with 1 µM Fura-2 at room temperature in the dark for 13 minutes.After incubation, cells were moved into fresh Tyrode's Buffer and plated on a glass coverslide at 37 °C.Cells were paced at 1 Hz and calcium transients were measured using the IonOptix Calcium and Contractility System.Because the calcium transient recorded is ratiometric, we used minimum (-7) and maximum (-6) pCa values previously reported in mouse trabeculae at 35 °C to set the absolute magnitude of our calcium kinetics (44).This calcium transient was used in all simulations We simulated each randomly selected set of rates 50 times and average them to form one data point, and we generated approximately 1 million rate combinations and corresponding twitches.We calculate the cross-sectional area of the spatially explicit model by measuring the area of the unit cell which is defined by a rhombus with vertices centered on the 4 thick filaments.This unit cell contains one thick filament (since it contains one-third of two separate thick filaments and one-sixth of two other thick filaments) and 2 thin filaments.Since the model contains 4 thick filaments and 8 thin filaments in total, we consider the cross-sectional area of the model to be 4 times the area of the unit cell.This lets us scale the output of the model, which is in pN , to stress, which can then be compared to experimental data, which is also reported in stress ( mN mm 2 ).We set 1% of the data aside as validation data, and standardized the other 99% by subtracting the global mean and dividing by the global standard deviation.These same values were used to scale validation data and experimental data for inference.
Conditional Variational Auto-encoder (CVAE).Auto-encoders are a type of machine learning tool which can be used for both dimensionality reduction and generative purposes.In general, they consist of encoder and decoder layers, where the dimensionality of the encoding layer's output (often referred to as the latent space, which we denote as z) is much smaller than the input, which allows them to be used for non-linear dimensionality reduction.Variational auto-encoders are a particular type of auto-encoder in which the output of the encoder is designed to describe a probability distribution within the latent space rather than a single point.A single point drawn from that space is then used as the input to decoder.We also can condition the input to the auto-encoder on an observation, in our case, the twitch force ⃗ f .The design of this specific Conditional Variational Auto-encoder (CVAE) follows that described in (8).It consists of three subnetworks: two encoders, Q 1 and R 1 , and the decoder R 2 (Fig. 9).Both encoders and the decoder have the same general architecture, consisting of a convolutional ResNet followed by a series of fully connected layers with only the width of the first fully connected layer and size of the output differing between each of the subnetworks.The convolutional layers are shared by each encoder and decoder, meaning they use the same weights.
In the encoder Q 1 , after the convolutional layers, the convolved twitch is concatenated with the true simulation rate parameters r before being passed to the fully connected layers which outputs the means and log-variances describing a unimodal nz-dimensional Gaussian distribution Q 1 .In the encoder Q 2 only the convolved twitch is passed to the fully connected layers, which outputs the means, (log) variances, and (log) weights describing an nzdimensional Gaussian mixture model with m components.
In the decoder R 2 , after performing the same convolutional operation on the force time-series, a point is drawn from either the distribution Q1 or R1, depending on if we are in training or inference mode.During training, the point is drawn from the unimodal Q1, which was given the true rate parameters as part of its input, while during inference the mixture distribution R1 is sampled.In either case, this nz-dimensional point is concatenated to the convolved time-series and passed to the decoder R 2 , which will output the parameters of an nr-dimensional, m-component Gaussian mixture model.
The full derivation of the loss function is given in (8,45).In short, we start with the cross-entropy loss between the 'true' distribution p(r|f ), and our estimate of the distribution generated by the CVAE, R(r|f ): Since the the distribution r(r|f ) is actually the product of the distributions of the encoder/decoder pair R 1 and R 2 marginalized over the latent variable z, we can write it as During training, however, we use the simpler encoder/decoder pair Q 1 and R 2 , also called the recognition function By finding the Kullback-Liebler divergence between these two networks, and substituting the difference into the equation for H(p|r), and minimizing the expectation of H over our observed data  f, we can derive that the difference between the 'true' distribution p(r|f ) and the CVAE distribution r(r|f ) can be written as We approximate this integral as The loss function consists of two values, the log-probability L of the distribution R 2 at the location of the true rate parameters r, and the Kullback-Leibler (KL) divergence between the distributions Q 1 and R1.The value −L + KL is minimized.Similar to (8), in practice we minimize the value −L + α * KL, where α is 0 for the first 30 epochs, and then linearly increases for the next 60 epochs to one.This is done because the CVAE network can become stuck in local minima where the KL term is at its minimum of 0, which prevents optimization of the L term.
When training is finished, as determined by both convergence of the loss function and the Probability-Probability plot, we can begin inference of rate factors corresponding to simulated or experimental twitches from cardiac or skeletal muscle for both control and diseased muscles.During inference of a simulated or experimental twitch, we sample a point, z R 1 , from the distribution R 1 over the latent space, which is used to generate a different distribution from the decoder R 2 over the rate space, from which is also drawn a single point.We repeat this process 5000 times to generate the estimated parameter distribution for a given twitch.The probability-probability plot in Fig. 3 is calculated by inferring the rate factors for our validation dataset.Once the model can accurately reproduce the distribution of the model output f over ⃗ r for our validation set, we can input simulated or experimental twitches from cardiac or skeletal muscle for both control and diseased muscles.One advantage of this method is that the network only needs to be trained once, and can then be used to predict the set of rates which correspond to any muscle twitch in the range of parameters simulated in the training dataset, as opposed to other methods which might require more computational cost for new conditions.
Data Availability.The spatially explicit model used in this article can be found at https://github.com/travistune3/multifilfive state.
The training and testing datasets are available from Dryad at DOI: 10.5061/dryad.d51c5b0bj.

Fig. 1 .
Fig. 1.Left: 3d rendering of a sarcomere's lattice showing the thick (red) and thin filaments (blue) as well as a superimposed spring network to give a sense of the model's geometry.Following the geometry specified in Ascensio et al. (7) we model the filaments as a network of springs for the thick and thin filaments, with crossbridges, each consisting of a torsional and linear spring.Right: The rate transition diagram for the thick and thin filaments, indicating the four thin filament rates (labeled rt) and the six myosin motor rates (labeled rx), as well as a table indicating the key biological elements.Crossbridges are modeled as a torsional and linear spring.

Fig. 2 .
Fig.2.The 2-D joint probability distributions for the 9 rates we chose to vary in our simulations are shown as contour plots that indicate the probability density over the rate space.The rates factors are plotted on a log scale from 10 −1 to 10 2 .The base rates correspond to a log multiplication factor of 10 0 1-D marginalized probability distribution of every rate factor is plotted above each factor.Additionally, in the upper right, the simulated twitch stress (force/area) used as the target prediction is plotted as a function of time.

Fig. 3 .Fig. 4 .
Fig. 3.The Probability-Probability plot shows how our estimated probability distribution describes the actual probability distribution across our validation dataset.The x-axis shows the estimated probability the rate is contained in a certain volume of the rate space, and the y-axis indicates the actual fraction of times the rate was contained in that volume.For an ideal predictor, the true parameter should be in the X% credible region X% of the time, indicated by the black dashed line.Values above the diagonal indicate under-confidence, and values below indicate over-confidence.

2 )Fig. 6 .
Fig.6.Isometric twitch stress is plotted against time for the experimental data (wild type, blue solid line; I61Q cTnC orange solid line) and data simulated from CVAE predictions for the wild type (blue with solid black line average), hybrid prediction (green with black solid line average) and I61Q cTnC (orange with black solid line average).

Fig. 8 .
Fig.8.The forward and backward rate functions for the thick and thin filaments at a pCa of −4 are plotted as a function of the axial separation between a cross-bridge and a binding site.Not shown are the thick filaments rates rx,15 and rx,54 as well as the thin filament rates rt,14 which are all 0.

Fig. 9 .
Fig. 9.The encoding and decoding networks Q1, R1, and R2 are each composed of the same shared deep convolutional layers.The loss function -L+KL is composed of the log probability of the true rate at for the estimated probability distribution function (L), and the Kullback-Liebler divergence between the probability distributions from the Q1 and R1 networks (KL).

Table 1 . Stress and timing differences -Control and I61Q cTnC mice.
Tune et al.PNAS -May 7, 2024 -vol.XXX -no.XX -3 Upper right: Simulated control and hybrid twitch stress is plotted against time as in Fig.6, as well as the twitch corresponding to our best estimate thick filament intervention.Lower left: the probability distribution over rate factors in the thick filament which informed twitch rate factor changes.