Abstract
Legged locomotion is controlled by two neural circuits: central pattern generators (CPGs) that produce rhythmic motor commands (even in the absence of feedback, termed “fictive locomotion”), and reflex circuits driven by sensory feedback. Each circuit alone serves a clear purpose, and both are understood to be active normally. The difficulty lies in how and why they work together, because there has been no objective and operational criterion for combining the two. Here we propose that optimization in the presence of uncertainty can explain how feedback should be incorporated for locomotion. The key is to re-interpret the CPG as a state estimator: an internal model of the limbs that predicts their state, using sensory feedback to optimally balance competing effects of environmental and sensory uncertainties. We demonstrate use of the optimally predicted state to drive a simple model of bipedal, dynamic walking, yielding minimal energetic cost of transport and best stability. The internal model may be implemented with classic neural half-center circuitry, except with neural parameters determined by optimal estimation principles. Fictive locomotion also emerges, but as a side effect of estimator dynamics rather than an explicit internal rhythm. Uncertainty could be key to shaping CPG behavior and governing optimal use of feedback.
Introduction
Animal locomotion appears to be controlled by two main types of neural circuitry. One type is the central pattern generator (CPG; Figure 1A), which generates pre-programmed, rhythmically timed, motor commands [1–3]. The other is the reflex circuit, which produces motor patterns triggered by sensory feedback (Figure 1C). A hierarchy of reflex loops act together, some integrating multiple sensory modalities for complex behaviors such as stepping and standing control [4,5]. Although reflexes alone seem sufficient to control locomotion, animal CPGs have also demonstrated fictive locomotion, in which rhythmic patterns are sustained even in the absence of sensory feedback [6,7]. In fact, within the intact animal, both types of circuitry work together for normal locomotion (Figure 1B; [8]). But this cooperation also presents a dilemma, of how and even why authority is shared between the two [9].
The combination of central pattern generators with sensory feedback has been explored in computational models. For example, the biologically-inspired Matsuoka oscillator [10] employs a network of mutually inhibiting neurons to intrinsically produce alternating bursts of activity. Sensory input to the neurons can change network behavior based on system state, such as foot contact and limb or body orientation, to help respond to disturbances. The gain or weight of sensory input determines whether it slowly entrains the CPG [11], or whether it resets the phase entirely [12,13]. Controllers of this type have demonstrated legged locomotion in bipedal [14] and quadrupedal robots [15,16], and even swimming and other behaviors in others [17]. Feedback improves robustness, such as ability to traverse different terrains [18]. These models suggest how sensory feedback could improve the robustness of locomotion for animals.
This raises the question of whether it might be optimal to use sensory feedback alone. Human-like models can learn reflexive control and produce quite complex and robust walking motions based on state feedback alone [19–21]. The highest-performing and most robust bipedal (Atlas [22]) and quadrupedal (BigDog [23]) robots are typically driven by feedback of body and environment state. In fact, reinforcement learning and other optimization approaches (e.g., dynamic programming [24,25]) are typically expressed in terms of state, rather than time. They have no need for, nor even benefit from, an internally generated rhythm. Thus, although CPGs may be modeled with feedback, operational performance seems to favor feedback alone.
There may nevertheless be a principled reason for a controller to have its own internal rhythm or dynamics. State feedback requires knowledge of state, which cannot be known perfectly but may be estimated from noisy and imperfect sensors. The state estimator [25] uses an internal model of the body to predict expected state and sensory information, despite two types of noise. One is due to uncertainty in environment and internal model, termed process noise, and the other to imperfect sensors, termed sensor noise. Error in predicted vs. actual sensory feedback is used to correct the state estimate. We previously proposed that these internal model dynamics can function like a CPG [26], albeit with its output interpreted not as the motor command per se, but as a state estimate that drives the motor command. A simple model of rhythmic leg motions demonstrates how such a scheme could produce the equivalent of fictive locomotion [26]. But walking, as suggested by a preliminary model [27], is considerably more complex, with continuous-time dynamics, discrete and changing ground contact conditions, and risk of falling. Perhaps state estimation could apply to walking as well.
The purpose of the present study was to test an estimator-based CPG controller with a dynamic walking model. We devised a simple state feedback control scheme, producing stance and swing leg torques as a function of the leg states. Assuming noise acts on both the sensors and as disturbances to the system, we devised a state estimator for the linear, continuous-time dynamics of the legs, with a discrete switch between stance and swing dynamics. The combination of control and estimation thus define our version of a CPG controller that incorporates sensory feedback. In fact, this same controller may be realized in the form of a Matsuoka oscillator [10], with similar neuron-like dynamics. We expected that minimum state estimation error would allow this model to walk with optimal performance, in terms of measures such as mechanical cost of transport. Scaling the sensory feedback either higher or lower than optimal would be expected to yield poorer performance. Such a model may conceptually explain how CPGs could incorporate sensory feedback based on optimal estimation principles.
Results
Central pattern generator controls a dynamic walking model
The CPG controller produced a periodic gait with a model of human-like dynamic walking (Figure 2A). Much of the walking motion was due to the passive dynamics of pendulum-like legs, which can swing back and forth on their own. The legs were also influenced by active torque commands (T1 and T2, Figure 2B) from the CPG, which in turn could be influenced by sensory signals. The passive leg dynamics were sufficient to yield a periodic gait, if it were not for energy dissipation in each step’s ground contact collision [28,29]. Active control was therefore necessary to restore that energy through the torque commands. The result was an alternating motion of the two legs (Figure 2C), offset in phase by half a stride period. These leg angles and the ground contact condition (“GC”, 1 for contact, 0 otherwise; Figure 2C) were treated as measurements to be fed back to the CPG. Each leg’s states described a periodic orbit or limit cycle (Figure 2D), which could be perturbed and made to fall (Figure 2E).
The resulting gait had approximately human-like parameters when walking without noisy disturbances. The nominal walking speed was equivalent to 1.25 m/s and step length 0.55 m (or normalized 0.4 (gl)^0.5 and 0.55 l, respectively; g is gravitational acceleration, l is leg length). The corresponding mechanical cost of transport was 0.053, comparable to other passive and active dynamic walking models (e.g., [30–32]).
Pure feedforward and pure feedback both susceptible to noise
The critical importance of sensory feedback in the presence of noise was demonstrated with the extremes of pure feedforward and pure feedback (Figure 3A). For both of these cases, we applied a process noise disturbance consisting of a single impulsive force acting on the body. The pure feedforward controller failed to recover (Figure 3A left), and would fall within about two steps. Its perturbed leg and ground contact states became mismatched to the nominal rhythm, which in pure feedforward does not respond to state deviations. In contrast, the feedback controller could recover from the perturbation (Figure 3A right) and return to the nominal gait. Feedback control is driven by system state, and therefore automatically alters the motor command in response to perturbations.
We also applied an analogous demonstration with sensor noise. Adding continuous noise to sensory measurements had no effect on pure feedforward control (Figure 3B left), which ignores sensory signals entirely. But pure feedback was found to be sensitive to noise-corrupted measurements, and would fall within a few steps (Figure 3B right), because erroneous feedback triggers erroneous motor commands not in accordance with actual limb state. Thus pure feedforward and pure feedback control have complementary weaknesses: they perform identically without noise, but each is vulnerable to one type of noise, process noise for feedforward and sensor noise for feedback.
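These complementary weaknesses can be sketched with a toy scalar system, not the walking model itself. In the Python sketch below (all noise levels and the feedback gain are hypothetical choices), a pure feedforward controller tracks a reference open-loop, while a pure feedback controller also corrects from a noisy position measurement.

```python
import math
import random

def simulate(controller, sigma_w, sigma_v, steps=5000, dt=0.01, k=2.0, seed=0):
    """Track a sinusoidal reference with a scalar integrator 'limb'.

    controller: 'ff' uses only the reference rate (pure feedforward);
                'fb' also corrects from a noisy position measurement.
    sigma_w: process-noise std per step; sigma_v: sensor-noise std.
    Returns mean absolute tracking error over the run."""
    rng = random.Random(seed)
    x, err_sum = 0.0, 0.0
    for i in range(steps):
        t = i * dt
        r, rdot = math.sin(t), math.cos(t)        # reference and its rate
        y = x + rng.gauss(0.0, sigma_v)           # noisy measurement
        u = rdot + (k * (r - y) if controller == 'fb' else 0.0)
        x += u * dt + rng.gauss(0.0, sigma_w)     # process noise disturbs state
        err_sum += abs(r - x)
    return err_sum / steps

# Process noise only: feedforward drifts like a random walk, feedback recovers.
e_ff_p = simulate('ff', sigma_w=0.05, sigma_v=0.0)
e_fb_p = simulate('fb', sigma_w=0.05, sigma_v=0.0)

# Sensor noise only: feedforward is untouched, feedback injects the noise.
e_ff_s = simulate('ff', sigma_w=0.0, sigma_v=0.5)
e_fb_s = simulate('fb', sigma_w=0.0, sigma_v=0.5)
```

Under process noise alone the feedback controller has the smaller error, while under sensor noise alone the feedforward controller does, mirroring the walking results.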
Equivalence between Matsuoka neural oscillator and state estimator
The same CPG model could be represented in two ways. The Matsuoka neural oscillator (Figure 4A) representation had two mutually inhibiting half-center oscillators, one driving each leg (i = 1 for left leg, i = 2 for right leg). Each half-center had a total of three neurons, one a primary Matsuoka neuron with standard second-order dynamics (states u and v). Its output drove the second neuron (α) producing the motor command to the ipsilateral leg. The third neuron was responsible for relaying ground contact (“c”) sensory information, to both excite the ipsilateral Matsuoka neuron and inhibit the contralateral one.
The same CPG architecture was then re-interpreted in a control systems framework (Figure 4B), while changing none of the neural circuitry. Here, the structure was not treated as half-center oscillators, but rather as three neural stages from afferent to efferent. The first stage receiving sensory feedback signal was interpreted as a feedback gain L (upper rectangular block, Figure 4B), modulating the behavior of the second stage, interpreted as a state estimator (middle rectangular block, Figure 4B) acting as an internal model of leg dynamics. Its output was interpreted as the state estimate, which was fed into the third, state-based motor command stage (lower rectangular block, Figure 4B). In this interpretation, the three stages correspond with a standard control systems architecture for a state estimator driving state feedback control. In fact, the neural connection weights of the Matsuoka oscillator were determined by, and are therefore specifically equivalent to, a state estimator driving motor commands to the legs.
Sensory feedback gain L optimized by state estimation
We next examined walking performance in the presence of noise, while varying sensory gain L above and below optimal (Figure 5). We applied a combination of both process and sensor noise (with fixed covariances), which made sensory gain L critical to walking performance, unlike the noiseless case. Both pure feedforward and pure feedback control yielded poor performance, as quantified by mechanical cost of transport, step variability, mean time between falls, and state estimator error (Figure 5). Better performance was achieved by varying sensory feedback gain L continuously between these extremes. The combination of feedforward and feedback, where the CPG rhythm was modulated by sensory information, performed better than either extreme alone.
Best performance was found for the gain L equal to that predicted theoretically by linear quadratic estimation (LQE) principles (Figure 5), which we had designed based on the covariances of process and sensor noise. Using that gain in the nonlinear simulation, the mechanical cost of transport was 0.077, somewhat higher than the nominal 0.053 without noise. Step length variability was 0.046 l, and the model experienced occasional falls, with a mean time between falls (MTBF) of about 9.61 g^−0.5 l^0.5 (or about 7.1 steps). This optimal case served as a basis for comparisons with other values of gain L.
Other values for sensory gain generally resulted in poorer performance (Figure 5). Over the range of gains examined (ranging 0.82–1.44), the performance measures worsened by roughly 10%. This suggests that, in a noisy environment, a combination of feedforward and feedback is important for achieving precise and economical walking, and for avoiding falls. Moreover, the optimal combination for performance can be designed using control and estimation principles.
Fictive locomotion emerges
Although the CPG model normally interacts with the body, it was also found to produce fictive locomotion even with peripheral feedback removed (Figure 6). Here we considered two types of biological sensors, referred to as “error feedback” and “measurement feedback” sensors. Error feedback refers to sensors that can distinguish unexpected perturbations from intended movements [33]. For example, some muscle spindles and fish lateral lines [34] receive corollary efferent signals (e.g. gamma motor neurons in mammals, alpha in invertebrates [35]) that signify intended movements, and could be interpreted as effectively computing an error signal within the sensor itself [34]. Measurement feedback refers to sensors without efferent inputs, such as nociceptors, Golgi tendon organs, cutaneous skin receptors, and other muscle spindles [36], that feed back information more directly related to body movement. Both types of sensors are considered important for locomotion, and so we examined the consequences of removing either type.
These cases were modeled by disconnecting different components of the closed-loop system. This is best illustrated by redrawing the CPG (Figure 4) more explicitly as a traditional state estimator block diagram (Figure 6A). The case of removing error feedback (Figure 6B) was modeled by disconnecting error signal e, so that the estimator would run in an open-loop fashion, as if the state estimate were always correct. Despite this disconnection, there remained an internal loop between the estimator internal model and the state-based command generator, that could potentially sustain rhythmic oscillations. The case of removing measurement feedback (Figure 6C) was modeled by disconnecting afferent signal y, and reducing estimator gain by about half, as a crude representation of highly disturbed conditions. There remained an internal loop, also potentially capable of sustained oscillations. We tested whether either case would yield a sustained fictive rhythm, illustrated by transforming the motor command T into neural firing rates using a Poisson process.
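The persistence of rhythm with error feedback disconnected can be illustrated in miniature. In the hedged Python sketch below, the estimator's internal model is an undamped harmonic oscillator (a stand-in for pendulum-like leg dynamics), and the estimator gains are arbitrary placeholders; zeroing the innovation e leaves the internal loop running on its own.

```python
import math

def estimator_step(xh, e, L=(0.5, 0.5), omega=2.0, dt=0.001):
    """One Euler step of a two-state internal model (harmonic 'leg'
    oscillator) with estimator correction gain L on the innovation e.
    The model and gains are illustrative, not the paper's."""
    x1, x2 = xh
    dx1 = x2 + L[0] * e                # position estimate, corrected by e
    dx2 = -omega**2 * x1 + L[1] * e    # velocity estimate, corrected by e
    return (x1 + dx1 * dt, x2 + dx2 * dt)

# Remove error feedback: e = 0 forever. The internal loop keeps oscillating,
# analogous to the fictive case of Figure 6B.
xh = (1.0, 0.0)
for _ in range(int(10.0 / 0.001)):
    xh = estimator_step(xh, e=0.0)
```

After 10 time units with no sensory correction, the oscillation amplitude remains near its initial value (Euler integration inflates it only slightly), so the rhythm persists purely from the internal loop.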
We found that removal of either type of sensory feedback still yielded sustained neural oscillations (Figure 6), equivalent to fictive locomotion. With error feedback removed, the motor commands from the isolated CPG were equivalent to the intact case without noise, in terms of frequency and amplitude. With measurement feedback removed, simulations still produced periodic oscillations, albeit with slower frequency and reduced amplitude compared to intact. In that case, the state estimator tended to drive the estimate toward zero, and how this altered estimate affected the final motor commands depended strongly on the specifics of the state-based motor command.
Discussion
We have examined how central pattern generators could optimally integrate sensory information to control locomotion. Our CPG model offers an adjustable gain on sensory feedback, to allow for continuous adjustment from pure feedback control to pure feedforward control, all with the same nominal gait. The model is compatible with previous neural oscillator models, while also being designed through optimal state estimation. Simulations reveal how sensory feedback becomes critical under noisy conditions, although not to the exclusion of intrinsic, neural dynamics. In fact, a combination of feedforward and feedback is generally favorable, and the optimal combination can be designed through standard estimation principles. Estimation principles apply quite broadly, and could be readily applied to other models, including ones far more complex than examined here. The state estimation approach also suggests new interpretations for the role of CPGs in animal or robot locomotion.
One of our most basic findings was that the extremes of pure feedforward or pure feedback control each performed relatively poorly in the presence of noise (Figure 3). Pure feedforward control, driven solely by an open-loop rhythm, was highly susceptible to falling as a result of process noise. The general problem with feedforward or time-based control is that a noisy environment can disturb the legs from their nominal motion, so that the nominal command pattern is mismatched for the perturbed state. Under noisy conditions, it is better to trigger motor commands based on feedback of actual limb state, rather than time. But feedback also has its weaknesses, in that noisy sensory information can lead to noisy commands. The solution is to combine both feedforward and feedback together, modulated by sensory feedback gain L. A more uncertain environment (higher process noise) would favor higher gain, and noisier sensors would favor reduced gain. And for a given combination of noise, the theoretically optimal gain would be expected to minimize estimation error, and in turn yield best gait performance (Figure 5). This is expected because theoretically, optimal control also typically calls for optimal state estimation [25,37]. And empirically, imprecise visual information can induce variability in foot placement [38] and poorer walking economy [39]. As expected, the present model walks best with the optimal trade-off between noise effects.
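The gain trade-off admits a closed-form scalar illustration. For a toy system with pure process noise, x_dot = w (intensity Nx), measurement y = x + v (sensor noise intensity Ny), and estimator xh_dot = L(y − xh), the steady-state error variance is P(L) = (Nx + L²Ny)/(2L), minimized at L* = √(Nx/Ny). The sketch below (noise intensities are arbitrary) confirms the optimum numerically.

```python
import math

def error_variance(L, Nx, Ny):
    """Steady-state estimation error variance for x_dot = w, y = x + v,
    estimator xh_dot = L*(y - xh). The error e = x - xh obeys
    e_dot = -L*e + w - L*v, whose Lyapunov equation
    0 = -2*L*P + Nx + L**2 * Ny gives P in closed form."""
    return (Nx + L**2 * Ny) / (2.0 * L)

Nx, Ny = 0.04, 0.01                     # process and sensor noise intensities
L_star = math.sqrt(Nx / Ny)             # analytic optimum, here 2.0
grid = [0.1 * k for k in range(1, 81)]  # candidate gains 0.1 .. 8.0
L_best = min(grid, key=lambda L: error_variance(L, Nx, Ny))
```

Higher process noise raises the optimal gain, and higher sensor noise lowers it, matching the qualitative argument above; the minimum variance itself is √(Nx·Ny).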
Our model also explains how neural oscillators can be interpreted as state estimators (Figure 4). Previous oscillator models (e.g., [10]) have demonstrated how neural half-centers could be modulated by sensory feedback, but not how the feedback gain should operationally be determined, nor how mechanistic principles can be used to combine feedforward and feedback. We have re-interpreted neural oscillator circuits in terms of state estimation (Figure 4), and shown how the gain can be determined in a principled manner, to minimize estimation error (Figure 5). The nervous system has long been interpreted in terms of internal models, for example in central motor planning and control [40–42] and in peripheral sensors [33]. Here we apply internal model concepts to CPGs, for better locomotion performance.
This interpretation also explains fictive locomotion as an emergent behavior. We observed persistent CPG activity despite removal of sensors (and either error or measurement feedback; Figure 6), but this was not because the CPG was in any way intended to produce rhythmic timing. Rather, fictive locomotion was a side effect of a state-based motor command, in an internal feedback loop with a state estimator, resulting in an apparently time-based rhythm (Figure 6B). Others have cautioned that CPGs should not be interpreted as generating decisive timing cues [43–45], especially given the critical role of peripheral feedback in timing [9,46,47]. In normal locomotion, central circuits and periphery act together in a feedback loop, and so neither can be assigned primacy. The present model operationalizes this interaction, demonstrates its optimality for performance, and shows how it can yield both normal and fictive locomotion.
This study argues that it is better to control with state rather than time. The kinematics and muscle forces of locomotion might appear to be time-based trajectories, and therefore require an internal clock to drive them. But another view is that the body and legs comprise a dynamical system dependent on state (described e.g. by phase-plane diagrams, Figure 2), such that the motor command should also be a function of state. Indeed, this is generally the case in optimal control [25], dynamic programming [24], and related methods (e.g., iterative linear quadratic regulators [48] and deep reinforcement learning [21]). As also demonstrated in robots [49,50], state-based control typically also calls for state estimation in realistic conditions [22,51]. Thus, robots with state-driven optimal control and estimation might also exhibit CPG-like fictive behavior, despite having no explicit time-dependent controls.
State estimation may also be applicable to movements other than locomotion. The same circuitry employed here (Figure 4) could easily contribute a state estimate for any state-dependent movements, whether rhythmic [26], non-rhythmic, or discrete. In our view, persistent oscillations were the outcome of state estimation with an appropriate state-based command for the α motoneuron (see Methods). But the same half-center circuitry could be active and contribute to other movements that use non-locomotory, state-based commands. It is certainly possible that biological CPGs are indeed specialized purely for locomotion alone, but the state estimation interpretation suggests the possibility of a more general, and perhaps previously unrecognized, role in other movements.
The present optimization approach may offer insight on neural adaptation. Although we have explicitly designed a state estimator here, we would also expect a generic neural network, given an appropriate objective function, to be able to learn the equivalent of state estimation. That objective could be to minimize error of predicted sensory information, or simply locomotion performance such as cost of transport. Moreover, our results suggest that the eventual performance and control behavior should ultimately depend on body dynamics and noise. A neural system adapting to relatively low process noise (and high sensor noise) would be expected to learn and rely heavily on an internal model. Conversely, relatively high process noise (and low sensor noise) would rely more heavily on sensory feedback. A limitation of our model is that it places few constraints on neural representation, because there are many ways (or “state realizations” [37]) to achieve the same input-output function for estimation. But the importance and effects of noise on adaptation are hypotheses that might be testable with artificial neural networks or animal preparations.
There are, however, cases where state estimation may not apply. State estimation applies best to systems with inertial dynamics or momentum. Examples include inverted pendulum gaits with limited inherent (or passive dynamic; [31]) stability and pendulum-like leg motions [26]. The perturbation sensitivity of such dynamics makes state estimation more critical. But other organisms may have well-damped limb dynamics and inherently stable body postures, and thus benefit less from state estimation. There may also be task requirements that call for fast reactions with short synaptic delays, or organismal, energetic, or developmental considerations that limit the complexity of neural circuitry. Such concerns might call for reduced-order internal models [37], or even their elimination altogether, in favor of faster and simpler pure feedforward or feedback. A more holistic view would balance the principled benefits of internal models and state estimation against the practicality, complexity, and organismal costs.
There are a number of limitations to this study. The “Anthropomorphic” walking model does not capture three-dimensional motion and multiple degrees of freedom in real animals. We used such a simple model because it is unlikely to have hidden features that could produce the same results for unexpected reasons. We also modeled extremely simple sensors, without representing the complexities of actual biological sensors. The estimator also used a constant, linear gain, and could be improved with nonlinear estimator variants. We also used a particularly simple, state-based command law, which was designed more for robustness than for economy. Better economy could be achieved by powering gait with precisely-triggered, trailing-leg push-off [26], rather than the simple hip torque applied here. However, the timing is so critical that feedforward conditions (L < 1) would fall too frequently to yield meaningful economy or step variability measures. We therefore elected for more robust control to allow a range of feedforward through feedback to be compared (Figure 5). But even with more economical control, we would still expect optimal performance to correspond with optimal gain.
Our principal contribution has been to reconcile the biological evidence for CPGs with the principles of feedback control and state estimation. The evidence of fictive locomotion has long been suggestive that neural oscillators produce the definitive timing and amplitude cues for locomotion. But pre-determined timing is also problematic for control in unpredictable situations [44], making it questionable whether CPG oscillators should dictate timing [43]. To our knowledge, previous CPG models have not included process or sensor noise in control design. Such noise is simply a reality of non-uniform environments and imperfect sensors. But it also yields an objective criterion for uniquely defining control and estimation parameters. The resulting neural circuits resemble previous oscillator models and can produce and explain nominal, noisy, or fictive locomotion. In our interpretation, there is no issue of primacy between CPG oscillators and sensory feedback, because they interact optimally to deal with a noisy world.
Method
Details of the model and testing are as follows. The CPG model is first described in terms of neural, half-center circuitry, which is then paired with a walking model with pendulum-like leg dynamics. The walking gait is produced by a state-based command generator, which governs how state information is used to drive motor neurons. The model is subjected to process and sensor noise, which tend to cause the gait to be imprecise and subject to falling. The CPG is then re-interpreted as an optimal state estimator, for which sensory feedback gain and internal model parameters may be designed, as a function of noise characteristics. The model is then simulated over multiple trials to computationally evaluate its walking performance as a function of sensory gain. It is also simulated without sensory feedback, to test whether it produces fictive locomotion.
CPG architecture based on Matsuoka oscillator
The CPG consists of two, mutually-inhibiting half-center oscillators, receiving a tonic descending input (Figure 4A). Each half-center has a primary Matsuoka neuron with second-order dynamics, described by states ui for membrane potential and vi for adaptation or fatigue. The membrane potential also produces an output qi that can be fed to other neurons. In addition, we included two types of auxiliary neurons (for a total of three neurons per half-center): one for accepting the ground contact input (ci, with value 1 when in ground contact and 0 otherwise for leg i), and the other to act as an alpha (αi) motoneuron to drive the leg. We used a single motoneuron to generate both positive and negative (extensor and flexor) hip torques, as a simplifying alternative to including separate rectifying motoneurons.
Each half-center receives a descending command and two types of sensory feedback. The descending command is a tonic input s, which determines the walking speed. Sensory input from the corresponding leg includes continuous and discrete information. The continuous feedback contains information about leg angle from muscle spindles and other proprioceptors [52], which could be modeled as leg angle yi for measurement feedback, or error ei for error feedback sensors. The discrete information is about ground contact ci sent from cutaneous afferents [53].
The primary neuron’s dynamics are as follows. The membrane potential ui has first-order dynamics, and is mainly affected by its own adaptation, a mutually inhibiting connection from other neurons, sensory input, and efference copy of the motor commands. Adaptation or fatigue vi decays with first-order dynamics, driven by the same neuron’s membrane potential as well as sensory input. This is described by equations inspired by [10] and by previous robot controllers designed for rhythmic arm movements (e.g., [54]) and walking [18], with several synaptic weightings: membrane potential decay ai, adaptation gain bi, mutual inhibition strength wij (weighting of neuron i’s input from neuron j’s output, where wii = 0), sensory input gains hij, and efference copy strength rij. The neuron also receives efference copy of its associated motor command αj(s, vj, cj), which depends on neuron state, descending drive, and ground contact. There are also secondary, higher-order influences summarized by the function fi(u, v, c), which have a relatively small effect on membrane potential but are part of the state estimator. The network parameters for Matsuoka oscillators are traditionally set through a combination of design rules of thumb and hand-tuning, but here nearly all of the parameters will be determined from an optimal state estimator, as described below.
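A minimal numerical sketch of the classic hand-tuned Matsuoka half-center may clarify the baseline dynamics before the estimator-derived weights are introduced. All parameter values below are conventional illustrative choices, not the parameters of this paper's model, and sensory and efference-copy terms are omitted.

```python
def matsuoka(T=20.0, dt=0.002, tau_u=0.25, tau_v=0.5, b=2.5, w=2.5, s=1.0):
    """Simulate a two-neuron, mutually inhibiting Matsuoka half-center.

    States per neuron: u (membrane potential) and v (adaptation/fatigue);
    rectified output q = max(u, 0). Tonic drive s excites both neurons.
    Parameters are illustrative hand-picked values known to oscillate,
    not the estimator-derived weights described in the text."""
    g = lambda x: max(x, 0.0)
    u = [0.1, 0.0]          # small asymmetry breaks the symmetric equilibrium
    v = [0.0, 0.0]
    q1_hist, q2_hist = [], []
    for _ in range(int(T / dt)):
        q = [g(u[0]), g(u[1])]
        # Membrane potential: leak, adaptation, contralateral inhibition, drive
        du = [(-u[i] - b * v[i] - w * q[1 - i] + s) / tau_u for i in range(2)]
        # Adaptation: first-order decay driven by the neuron's own output
        dv = [(-v[i] + q[i]) / tau_v for i in range(2)]
        for i in range(2):
            u[i] += du[i] * dt
            v[i] += dv[i] * dt
        q1_hist.append(g(u[0]))
        q2_hist.append(g(u[1]))
    return q1_hist, q2_hist

q1, q2 = matsuoka()
```

With these parameters the symmetric equilibrium is unstable, so the rectification and adaptation produce sustained, alternating bursts, the standard half-center behavior.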
Walking model with pendulum dynamics
The system being controlled is a simple bipedal model walking in the sagittal plane (Fig. 3A). The passive dynamics of pendulum-like legs [31] are actively actuated by added torque inputs (the “Anthropomorphic Model,” [30]), and energy is dissipated mainly with the collision of leg with ground in the step-to-step transition. The dissipation determines the amount of positive work required each step. In humans, muscles perform much of that work, which in turn accounts for much of the energetic cost of walking [28].
The walking model is described mathematically as follows. The equations of motion may be written in terms of the vector θ ≜ [θ1, θ2]^T as

M(θ, GC) θ̈ + C(θ, θ̇) θ̇ + G(θ, GC) = T,

where M is the mass matrix, C describes centripetal and Coriolis effects, G contains position-dependent moments such as from gravity, GC ≜ [GC1, GC2]^T contains the ground contact states, and T ≜ [T1, T2]^T contains hip torques exerted on the legs (State-based control, below). The equations of motion depend on ground contact because each leg alternates between stance and swing leg behaviors, inverted pendulum and hanging pendulum, respectively. We define each matrix to switch the order of its elements at heel-strike, so that the equations of motion can be expressed in the same form throughout.
At heelstrike, the model experiences a collision with the ground affecting the angular velocities. This is modeled as a perfectly inelastic collision. Using impulse-momentum, the effect may be summarized as a linear transformation of the angular velocities, where the plus and minus signs (‘+’ and ‘–’) denote just after and before impact, respectively. The ground contact states are switched such that the previous stance leg becomes the swing leg, and vice versa.
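For the closely related “simplest walker” special case, the heel-strike velocity map has a well-known closed form, θ̇⁺ = cos(2α)·θ̇⁻, where α is half the inter-leg angle at contact. The sketch below uses that special case to illustrate the inelastic collision and its energy loss; it is not the Anthropomorphic model's exact transformation matrices.

```python
import math

def heelstrike(omega_minus, alpha):
    """Simplest-walker heel-strike map (a special case, not the paper's
    Anthropomorphic-model matrices): a perfectly inelastic impact.

    omega_minus: stance-leg angular velocity just before impact
    alpha: half the inter-leg angle at contact
    Returns post-impact stance-leg velocity and fraction of that leg's
    kinetic energy dissipated."""
    c = math.cos(2.0 * alpha)
    omega_plus = c * omega_minus       # velocity projected through the impact
    ke_lost = 1.0 - c * c              # fractional kinetic energy dissipated
    return omega_plus, ke_lost
```

With legs together (alpha = 0) nothing is lost; with splayed legs, a nonzero fraction of kinetic energy is dissipated each step, which is the loss the active torques must restore.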
State-based motor command generator
The model produces state-dependent hip torque commands to the legs. Of the many ways to power a dynamic walking model (e.g., [30,50,55,56]), we apply a constant extensor hip torque against the stance leg, for its parametric simplicity and robustness to perturbations. The torque normally performs positive work (Fig. 3B) to make up for collision losses, and could be produced in reaction to a torso leaned forward (not modeled explicitly here; [31]). The swing leg experiences a hip torque proportional to swing leg angle (Fig. 3B, 3C), with the effect of tuning the swing frequency [26].
The overall torque command Ti for leg i is used as the motor command αi. The stance phase torque is increased from the initial value kst by an amount proportional to the descending command s, with gain μst. The swing phase torque is proportional to the leg angle θi, with gain ksw.
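The command law can be summarized as a small piecewise function. The numerical gains and the sign convention of the swing torque below are illustrative assumptions, since the text does not fix them.

```python
def motor_command(in_stance, theta, s, kst=0.1, mu_st=0.5, ksw=0.2):
    """State-based hip torque, per the description in the text.

    Stance: constant extensor torque kst plus descending drive mu_st * s.
    Swing: torque proportional to the swing leg angle theta (gain ksw tunes
    swing frequency). Gain values and the swing-torque sign are
    illustrative placeholders, not the paper's tuned parameters."""
    if in_stance:
        return kst + mu_st * s
    return -ksw * theta   # assumed sign: acts with gravity to speed the swing
```

The stance branch depends only on the descending drive, while the swing branch is a pure function of leg state, so both branches remain state-based rather than time-based.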
There are also two higher-level types of control acting on the system. One is to regulate walking speed by slowly modulating the tonic, descending command s (Eqn. 6). Integral control is applied to s, so as to attain the same average walking speed despite noise, which would otherwise reduce average speed. The second type of high-level control is to restart the simulation after falling. When falling is detected (as a horizontal stance leg angle), the walking model is reset to its nominal initial condition, except advanced one nominal step length forward from the previous footfall location. No penalty is assessed for this reset process, other than the additional energy and time wasted in the fall itself. We quantify the susceptibility to falling with a mean time between failures (MTBF), and report overall energetic cost in two ways, including and excluding failed steps. The wasted energy of failed steps is ignored in the latter case, resulting in a lower reported energy cost.
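The speed regulation can be sketched as a discrete integral update applied once per step; the gain and target speed below are illustrative assumptions:

```python
def update_descending_command(s, v_avg, v_target, k_i=0.05):
    """Integral control on the tonic descending command s:
    the command is nudged up when average walking speed falls below
    target (noise tends to slow the model) and down when above, so
    the same average speed is attained despite noise.
    k_i is an illustrative integral gain."""
    return s + k_i * (v_target - v_avg)

s = 0.2
s = update_descending_command(s, v_avg=0.35, v_target=0.40)
```
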
Noise model with process and sensor noise
The walking dynamics are subject to two types of noisy disturbances, process and sensor noise (Figure 3). Both are modeled as zero-mean, Gaussian white noise. Process noise nx (with covariance Nx) acts as an unpredictable disturbance to the states, due to external perturbations or noisy motor commands. Sensor or measurement noise ny (with covariance Ny) models imperfect sensors, and acts additively to the sensory measurements y. The errors induced by both types of noise are unknown to the CNS controller, and so both tend to reduce performance.
The noise covariances were set so that the model would be significantly affected by both types of noise. We sought levels sufficient to cause significant risk of falling, so that good control would be necessary to avoid falling while also achieving good economy. Process noise was described by covariance matrix Nx, with diagonals filled with the variances of noisy accelerations, which had standard deviations of 0.015 (g/l) for the stance leg and 0.16 (g/l) for the swing leg. Sensor noise covariance Ny was also set as a diagonal matrix, with both entries having standard deviation 0.1. Noise was implemented as a spline interpolation of discrete white noise, sampled at a frequency of 16 (g/l)^0.5 (well above pendulum bandwidth) and truncated to no more than ±3 standard deviations.
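The noise construction can be sketched as follows; `CubicSpline` stands in for the spline interpolation, and any details beyond sampling rate, truncation, and standard deviation are assumptions.

```python
import numpy as np
from scipy.interpolate import CubicSpline

def make_noise(duration, sigma, fs=16.0, seed=0):
    """Zero-mean Gaussian white noise: sampled discretely at rate fs,
    truncated to +/- 3 standard deviations, then spline-interpolated
    so a continuous-time simulation can evaluate it at any time t."""
    rng = np.random.default_rng(seed)
    t = np.arange(0.0, duration + 1.0 / fs, 1.0 / fs)
    samples = np.clip(rng.normal(0.0, sigma, t.shape),
                      -3.0 * sigma, 3.0 * sigma)
    return CubicSpline(t, samples)

noise = make_noise(duration=10.0, sigma=0.015)  # stance-leg process noise level
```

The returned object is callable, e.g. `noise(1.234)` gives the disturbance at that instant; note that a cubic spline can overshoot the ±3σ bounds slightly between knots.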
State estimator with internal model of dynamics
A state estimator is formed from an internal model of the leg dynamics being controlled (see block diagram in Fig. 4), to produce a prediction of the expected state x̂ and sensory measurements ŷ (with the hat symbol ‘̂’ denoting an internal model estimate). Although the actual state is unknown, the actual sensory feedback y is known, and the expectation error e = y − ŷ may be fed back to the internal model with negative feedback (gain L) to correct the state estimate. Estimation theory shows that regulating error e toward zero also tends to drive the state estimate toward the actual state (assuming system observability, as is the case here; e.g., [37]). This may be formulated as an optimization problem, in which gain L is selected to minimize the mean-square estimation error. Here we interpret the Matsuoka oscillator network as such an optimal state estimator, the design of which determines the network parameters.
The estimator equations may be described in state space. The estimator states are governed by the same equations of motion as the walking model (Eqns. 4, 5), with the addition of the feedback correction. Again using hat notation for state estimates, the nonlinear estimator integrates the internal model dynamics evaluated at the estimated state, plus the correction term L(y − ŷ) (Eqn. 7).
We used standard state estimator equations to determine a constant sensory feedback gain L. This was done by linearizing the dynamics about a nominal state, and then designing an optimal estimator based on the process and sensor noise covariances (Nx and Ny) using standard procedures (“lqe” command in Matlab, The MathWorks, Natick, MA). This yields a set of gains that minimizes the mean-square estimation error, for an infinite horizon and linear dynamics. The constant gain was then applied to the nonlinear system in simulation, under the assumption that the resulting estimator would still be nearly optimal in behavior. Another sensory input to the system is ground contact GCi, a boolean variable. The state estimator ignores measured GCi for pure feedforward control (zero feedback gain L), but for all other conditions (non-zero L), any sensed change in ground contact overrides the estimated ground contact ĜCi. When the estimated ground contact state changes, the estimated angular velocities are updated according to the same collision dynamics as the walking model (Eqn. 5, except with estimated variables).
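The gain design can be sketched with SciPy's Riccati solver in place of Matlab's `lqe`, via the standard duality between estimation and control; the matrices below are illustrative stand-ins, not the paper's linearized walking dynamics.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

def lqe_gain(A, C, Nx, Ny):
    """Continuous-time optimal (Kalman-Bucy) estimator gain,
    analogous to Matlab's lqe: solve the filter algebraic Riccati
    equation by duality with LQR, then L = P C' inv(Ny)."""
    P = solve_continuous_are(A.T, C.T, Nx, Ny)
    return P @ C.T @ np.linalg.inv(Ny)

# Illustrative inverted-pendulum-like linearization (NOT the paper's):
A = np.array([[0.0, 1.0], [1.0, 0.0]])  # unstable upright dynamics
C = np.array([[1.0, 0.0]])              # angle is measured
Nx = np.diag([1e-6, 0.015**2])          # process noise covariance
Ny = np.array([[0.1**2]])               # sensor noise covariance
L = lqe_gain(A, C, Nx, Ny)
```

The resulting L stabilizes the estimation error dynamics A − LC, trading off the two noise covariances exactly as described above.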
The state estimate is applied to the state-based motor command (Eqn. 6). Although the walking control was designed for actual state information (θi, GCi), in walking simulations it uses the state estimates (θ̂i, ĜCi) instead.
As with the estimator gain, this also requires an assumption. In the present nonlinear system, we assume that the state estimate may replace the actual state without ill effect, a property proven only for linear systems (the certainty-equivalence principle; [25,57]). Both assumptions, regarding gain L and use of the state estimate, are tested in simulation below.
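The estimator update that both assumptions refer to can be sketched generically; the toy pendulum model and gain below are illustrative, not the walking model's.

```python
import numpy as np

def estimator_step(x_hat, u, y, f, h, L, dt):
    """One Euler step of the state estimator: predict with the
    internal model f, compare the predicted measurement h(x_hat)
    against actual sensory feedback y, and correct with gain L."""
    e = y - h(x_hat)                          # sensory prediction error
    return x_hat + dt * (f(x_hat, u) + L @ e)

# Toy internal model: hanging pendulum with its angle sensed.
f = lambda x, u: np.array([x[1], -np.sin(x[0]) + u])
h = lambda x: x[:1]
L = np.array([[2.0], [1.0]])
x_hat = estimator_step(np.array([0.1, 0.0]), 0.0, np.array([0.12]), f, h, L, 0.01)
```

Under certainty equivalence, `x_hat` is then passed to the motor command in place of the true state.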
Theoretical equivalence between neural oscillator and state estimator
Having fully described the walking model in terms of control systems principles, the equivalent Matsuoka oscillator may be determined (Figure 4B). Identical behavior is obtained by re-interpreting the neural states in terms of the dynamic walking model states, along with the neural output function defined as the identity.
In addition, the motor command and ground contact state are defined to match the corresponding state-based variables (Eqn. 6).
The synaptic weights and higher-order functions (Eqns. 1 – 3) are defined according to the internal model equations of motion (Eqn. 7).
Because the mass matrix and other variables are state dependent, the weightings above are state dependent as well. The functions f1 and f2 are higher-order terms, which could be considered optional; omitting them would effectively yield a reduced-order estimator.
The result of these definitions is that the Matsuoka neuron equations (Eqns. 1 – 3) may be rewritten in terms of the estimated leg states, illustrating how the network models the leg dynamics and receives inputs from sensory feedback and efference copy.
The above may be interpreted as an internal model of the stance and swing leg as pendulums, with pendulum phasing modulated by error feedback ej and efference copy of the motor command (plus small nonlinearities due to inertial coupling of the two pendulums).
Parametric effect of varying sensory feedback gain L
The sensory feedback gain is selected using state estimation theory, according to the amount of process noise and sensor noise. High process noise, or uncertainty about the dynamics and environment, favors a higher feedback gain, whereas high sensor noise favors a lower feedback gain. The ratio between the noise levels determines the optimal linear quadratic estimator gain (Matlab function “lqe”). A constant gain was determined based on a linear approximation for the leg dynamics, an infinite horizon for estimation, and a stationarity assumption for noise [37]. In simulation, the state estimator was implemented with nonlinear dynamics, assuming this would yield near-optimal performance.
It is thus instructive to evaluate walking performance for a range of feedback gains. Setting L too low or too high would be expected to yield poor performance. Setting L equal to the optimal LQE gain would be expected to yield approximately the least estimation error, and therefore the most precise control (e.g., [58]). In terms of gait, more precise control would be expected to reduce step variability and mechanical work, both of which are related to metabolic energy expenditure in humans (e.g., [39]). The walking model is also prone to falling when disturbed by noise, and optimal state estimation would be expected to reduce the frequency of falling.
We performed a series of walking simulations to test the effect of varying the feedback gain. The model was tested with 20 trials of 100 steps each, subjected to pseudorandom process and sensor noise of fixed covariance (Nx and Ny, respectively). In each trial, walking performance was assessed with mechanical cost of transport (mCOT, defined as positive mechanical work per body weight and distance travelled; e.g., [32]), step length variability, and mean time between falls (MTBF) as a measure of walking robustness (also referred to as mean first passage time [59]). The sensory feedback gain was first designed in accordance with the experimental noise parameters, and then the corresponding walking performance was evaluated. Additional trials were performed, varying sensory feedback gain L with lower and higher than optimal values to test for a possible performance penalty. These sub-optimal gains were determined by re-designing the estimator with sensor noise ρNy (ρ between 10^-4 and 10^0.8, with smaller values tending toward pure feedback and larger toward pure feedforward). This procedure guarantees stable closed-loop estimator dynamics, which would not be the case if the gain matrix were simply scaled higher or lower. For all trials, the redesigned L was tested in simulations using the fixed process and sensor noise levels. The overall sensory gain was quantified with a scalar, defined as the L2 norm (largest singular value) of matrix L, normalized by the L2 norm of the optimal gain.
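The gain-sweep procedure can be sketched as below, reusing a duality-based gain design; the system matrices are illustrative stand-ins, and only the ρ-scaling of sensor noise and the normalized-norm metric follow the text.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

def estimator_gain(A, C, Nx, Ny):
    """Optimal estimator gain via the filter Riccati equation."""
    P = solve_continuous_are(A.T, C.T, Nx, Ny)
    return P @ C.T @ np.linalg.inv(Ny)

def gain_sweep(A, C, Nx, Ny, rhos):
    """Redesign L assuming sensor noise rho*Ny (each redesign yields
    stable estimator dynamics), and report each gain's L2 norm
    (largest singular value) normalized by that of the optimal gain
    designed with the true Ny."""
    norm_opt = np.linalg.norm(estimator_gain(A, C, Nx, Ny), 2)
    return [np.linalg.norm(estimator_gain(A, C, Nx, rho * Ny), 2) / norm_opt
            for rho in rhos]

A = np.array([[0.0, 1.0], [1.0, 0.0]])   # illustrative dynamics
C = np.eye(2)                            # both states measured
Nx = 0.015**2 * np.eye(2)
Ny = 0.1**2 * np.eye(2)
ratios = gain_sweep(A, C, Nx, Ny, np.logspace(-4, 0.8, 5))
```

Small ρ (trusting the sensors) drives the normalized gain above 1, toward pure feedback; large ρ drives it below 1, toward pure feedforward.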
We expected that optimal performance in simulation would be achieved with gain L close to the theoretically optimal LQE gain. With too low a gain (L = 0, pure feedforward; Figure 1A), the model would perform poorly due to sensitivity to process noise, and with too high a gain (L → ∞, pure feedback; Figure 1C), it would perform poorly due to sensor noise. For intermediate gains, we expected performance to have an approximately convex, bowl-like shape, centered about a minimum at or near the optimal gain.
These differences were expected from noise alone, as the model was designed to yield the same nominal gait regardless of gain L. Simulations were necessary to test the model, because its nonlinearities do not admit analytical calculation of performance statistics.
Evaluation of fictive locomotion
We tested whether the model would produce fictive locomotion upon removal of sensory feedback. Disconnecting feedback in a closed-loop control system would normally be expected to eliminate any persistent oscillations. But estimator-based control actually contains two types of inner loops (Figure 6A), both of which could potentially sustain oscillations in the absence of sensory feedback. However, the emergence of fictive locomotion and its characteristics depend on what kind of sensory signal is removed. We considered two broad classes of sensors, referred to as producing error feedback and measurement feedback, with different expectations for the effects of their removal.
Some proprioceptors relevant to locomotion, including some muscle spindles and lateral lines [34], could be regarded as producing error feedback. They receive corollary discharge of motor commands, and appear to predict intended movements, so that the afferents are most sensitive to unexpected perturbations. The comparison between expected and actual sensory output largely occurs within the sensor itself, yielding error signal e (Figure 6B). Disconnecting the sensor would therefore disconnect error signal e, and would isolate an inner loop between state-based command and internal model. The motor command normally sustains rhythmic movement of the legs for locomotion, and would also be expected to sustain rhythmic oscillations within the internal model. Fictive locomotion in this case would be expected to resemble the nominal motor pattern.
Sensors that do not receive corollary discharge could be regarded as direct sensors, in that they relay measurement feedback related to state. In this case, disconnecting the sensor would be equivalent to removing measurement y. This isolates two inner loops, both the command-and-internal-model loop above, as well as a sensory prediction loop between sensor model and internal model. The interaction of these loops would be expected to yield a more complex response, highly dependent on parameter values. Nonetheless, we would expect that removal of y would substantially weaken the sensory input to the internal model, and generally result in a weaker or slower fictive rhythm.
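These two deafferentation conditions can be expressed as a small sketch on top of a generic estimator loop; the toy pendulum model and gain are illustrative:

```python
import numpy as np

def internal_model_rhs(x_hat, u, y, f, h, L, condition="intact"):
    """Internal-model dynamics under three sensory conditions:
      'intact'      : normal operation, e = y - h(x_hat)
      'error_cut'   : error-feedback sensor removed, so e = 0; only
                      the command/internal-model loop remains, and
                      rhythms resemble the nominal motor pattern
      'measure_cut' : direct measurement removed (y = 0), so
                      e = -h(x_hat); the sensory-prediction loop also
                      remains, giving more parameter-dependent rhythms"""
    if condition == "error_cut":
        e = np.zeros(L.shape[1])
    elif condition == "measure_cut":
        e = -h(x_hat)
    else:
        e = y - h(x_hat)
    return f(x_hat, u) + L @ e

# Toy pendulum internal model with angle sensing (illustrative).
f = lambda x, u: np.array([x[1], -x[0] + u])
h = lambda x: x[:1]
L = np.array([[2.0], [1.0]])
x0, y0 = np.array([0.1, 0.0]), np.array([0.3])
```

Note that under 'error_cut' the internal model runs purely on efference copy, exactly matching the feedforward prediction.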
We tested for the existence of sustained rhythms for both extremes of error feedback and measurement feedback. Of course, actual biological sensors within animals are vastly more diverse and complex than this model. But the existence of sustained oscillations in extreme cases would also indicate whether fictive locomotion would be possible with some combination of different sensors within these extremes.
Acknowledgements
This work was supported in part by NSERC (Discovery Award and CRC Tier I) and Dr. Benno Nigg Research Chair.