## Abstract

Animal behavior is often quantified through subjective, incomplete variables that may mask essential dynamics. Here, we develop a behavioral state space in which the full instantaneous state is smoothly unfolded as a combination of short-time posture dynamics. Our technique is tailored to multivariate observations and extends previous reconstructions through the use of maximal prediction. Applied to high-resolution video recordings of the roundworm *C. elegans*, we discover a low-dimensional state space dominated by three sets of cyclic trajectories corresponding to the worm’s basic stereotyped motifs: forward, backward, and turning locomotion. In contrast to this broad stereotypy, we find variability in the presence of locally-unstable dynamics, and this unpredictability shows signatures of deterministic chaos: a collection of unstable periodic orbits together with a positive maximal Lyapunov exponent. The full Lyapunov spectrum is symmetric with positive, chaotic exponents driving variability balanced by negative, dissipative exponents driving stereotypy. The symmetry is indicative of damped, driven Hamiltonian dynamics underlying the worm’s movement control.

## INTRODUCTION

Animals move in a wide variety of ways; the complex posture dynamics generating these behaviors span multiple spatiotemporal scales, and exhibit both regularity and variability [1, 2]. At large scales, behavior is structured, organized into stereotyped motifs such as walking or running, but the dynamics within each motif can be highly irregular [3, 4]. This complexity is apparent in spontaneous behaviors [5, 6], but also in highly stereotyped sequences such as an “escape response” [7], which must also be unpredictable for successful avoidance of motile predators [8]. Despite the importance of behavior in fields ranging from neuroscience [9, 10], ethology [11, 12], control theory [13], robotics and artificial intelligence [14] to the physics of living systems [15], the complexity of movement presents unique challenges in quantification, analysis, and understanding.

Technological advances, including recent progress in machine vision [16, 17], now make it possible to gather high-resolution movement data, even in complex, naturalistic settings and for animals with intricate body plans [18–20]. But how do we map high-resolution recordings of animal behavior into a compressed set of interpretable numbers while retaining maximal information about the dynamics? Indeed, among biological signals, behavior exhibits a remarkable divergence of descriptions, from representations based on pixels and wavelets (see e.g. [21]), to postures (see e.g. [22]) to more abstract states (see e.g. [23, 24]). Certainly, a good representation should capture the difference between distinct movement patterns. An ideal representation will also allow near-future predictions and be interpretable so as to provide insight into movement control principles. Finally, we seek to reveal rather than impose on the structure of the behavioral signal, letting the representation and analysis guide important characteristics such as continuous vs. discrete, variable vs. stereotyped and spontaneous vs. controlled.

We detail the construction and application of a *behavioral state space* inspired by the similar approach of dynamical systems (also known as a phase space [25, 26], not to be confused with “state-space models” in statistics [27]). A point in our generally multidimensional behavioral state space represents the complete, near-instantaneous movements of an animal: posture and short-time posture changes. As time evolves, the state-space point follows a smooth trajectory, thus providing a geometrical encoding of behavior. Combining dynamical systems theory with high-resolution posture time series of the nematode *C. elegans*, we exploit the detailed structure of these trajectory encodings to seek a new quantitative perspective on ethological analysis [28].

## STATE SPACE RECONSTRUCTION BY MAXIMIZING PREDICTABILITY

We consider a *d*-dimensional time series of duration *T* collected in a *T* × *d* matrix *Y*, which represents noisy, incomplete measurements of an underlying dynamical system, Fig. 1. With a state space reconstruction, we seek a coordinate transformation Ψ that maps *Y* into a space *X* that is topologically equivalent [29] to the state space of the underlying dynamical system, a process known as time series embedding [30, 31]. Dynamical embeddings have been used to model complex phenomena such as ecological and neural dynamics [32, 33], and to characterize the stability and symmetry of their reconstructed attractors [34]. Although early approaches primarily used single-variable measurements, multivariate embeddings provide better reconstructions [35] and can improve prediction [36].

In our approach, we first lift the *d*-dimensional measurements into a *Kd*-dimensional space of *K* contiguous delays and then project to a smaller *m*-dimensional subspace. Formally, we decompose the embedding Ψ = *P*_{m} ◦ Φ_{K} into a delay map Φ_{K}, in which we iteratively stack (*K* − 1) time-delayed copies of *Y* into a (*T* − *K* + 1) × *Kd* delay matrix, followed by a dimensionality reduction transformation *P*_{m}, which projects onto an *m* < *Kd* dimensional space. *P*_{m} can in principle be any transformation and examples include numerical derivatives [37], delay coordinates [38] and random projections [33]. Here, we use singular value decomposition (SVD) [31, 35] followed by independent component analysis (ICA) [39], which results in a state space with independent components spanning the dimensions of the *m* singular vectors. In matrix notation, the delay matrix is projected onto Γ_{m}, the *Kd* × *m* matrix of basis vectors spanning the *m*-dimensional state space, to give *X*_{m}, which contains the state space trajectories. This space of transformations allows for both derivative and more general linear filters [40] and the resulting coordinates reflect the most significant linear modes of the dynamics [31, 40].
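The delay map Φ_{K} can be illustrated with a minimal sketch in plain Python. The function name `delay_embed` and the ordering of delays within each row (oldest frame first) are assumptions made for illustration; the projection step *P*_{m} (SVD followed by ICA) is deliberately left out.

```python
def delay_embed(Y, K):
    """Stack K contiguous frames of a T x d series into a (T-K+1) x (K*d) matrix.

    Y is a list of T rows, each a list of d measurements.
    Row t of the result is [Y[t], Y[t+1], ..., Y[t+K-1]] flattened
    (oldest frame first, which is an illustrative convention).
    """
    T = len(Y)
    if not 1 <= K <= T:
        raise ValueError("K must satisfy 1 <= K <= T")
    rows = []
    for t in range(T - K + 1):
        window = []
        for k in range(K):
            window.extend(Y[t + k])
        rows.append(window)
    return rows


# Toy series with T = 6 frames and d = 2 measurements per frame.
Y = [[float(t), float(10 * t)] for t in range(6)]
Y_K = delay_embed(Y, K=3)
print(len(Y_K), len(Y_K[0]))  # number of rows and columns of the delay matrix
```

Each row of the result is one candidate state: a short posture sequence rather than a single posture, which is what allows short-time dynamics to enter the state variables.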

The reconstruction is parameterized by the window length *K* and the state space dimension *m*, and we describe a new, principled procedure for determining (*K*, *m*) based on optimal prediction. Notably, embedding parameters have often been chosen heuristically (see e.g. [31, 40]). To predict future observations we use *N*_{b} nearest neighbors in the reconstructed state space, Fig. 2A(left). To compute the *τ*-step prediction of an observation, we average the *τ*-step futures of the nearest neighbors of the corresponding state space point and then map the result back to observation space. This is known as the nearest neighbor predictor and also as Lorenz’s “method of analogs” [41]. The nearest neighbor predictor provides a lower bound to the predictability of a state space reconstruction as it is equivalent to a zeroth order Taylor approximation of the dynamics in a local neighborhood.
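A minimal sketch of the nearest neighbor predictor follows, written in plain Python. It stays in state space and omits the final map back to observation space. The function name `analog_forecast`, the default `n_neighbors`, and the `exclude` window that skips temporally adjacent points are illustrative assumptions, not details taken from the analysis described here.

```python
import math


def analog_forecast(X, i, tau, n_neighbors=5, exclude=10):
    """Predict X[i + tau] as the mean tau-step future of the nearest neighbors of X[i].

    X is a list of state-space points (each a list or tuple of floats).
    Points within `exclude` frames of i are skipped so that the trajectory
    does not simply predict itself (an illustrative choice).
    """
    candidates = [
        j for j in range(len(X) - tau)   # neighbors must have a known future
        if abs(j - i) > exclude
    ]
    if not candidates:
        raise ValueError("no admissible neighbors for this point")
    candidates.sort(key=lambda j: math.dist(X[j], X[i]))
    nearest = candidates[:n_neighbors]
    dim = len(X[i])
    return [sum(X[j + tau][c] for j in nearest) / len(nearest) for c in range(dim)]


# Toy trajectory: uniform rotation on the unit circle.
X = [(math.cos(0.1 * t), math.sin(0.1 * t)) for t in range(1000)]
prediction = analog_forecast(X, i=500, tau=5)
error = math.dist(prediction, X[505])
print(error)
```

Averaging the neighbors' futures is the zeroth order local model mentioned above; a local linear fit would be the natural next refinement.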

We quantify the prediction quality after *τ* steps using the error *E*(*τ*), as shown in Fig. 2A(middle). Although *E*(*τ*) is a *function*, we seek a single scalar that captures overall predictability. For a completely predictable system *E*(*τ*) is constant with a value corresponding to the noise level in the observations. On the other hand, for systems where predictions get worse over time, *E*(*τ*) grows according to a non-trivial process, possibly involving multiple timescales [41–43], shown schematically in Fig. 2A(right).

As long as the system is stationary, the error is bounded by the maximum distance within the state space, denoted *e*_{s}; *E*(*τ*) grows until it saturates to *e*_{s} as *τ* → ∞, at which time the predictions are as good as choosing randomly from the sampled state space. We use the cumulative difference between the early-time and asymptotic error to define *T*_{pred} as a new measure of predictability,
where Δ is the area between the curve *E*(*τ*) and the asymptote *e*_{s}. A state space reconstruction with a large value of *T*_{pred} is good in the sense that it allows us to predict future observations for as long as possible. Although several previous studies about state space reconstruction are based on prediction as a guiding principle, they have used the predictive error in a more *ad hoc* manner, either by setting *τ* to a specific value [30, 32, 44, 45], or by integrating *E*(*τ*) to a chosen time *τ*_{0} [46].

The average prediction error for an arbitrary time *τ*′ is ⟨*E*(*τ*)⟩ = (1/*τ*′) ∫_{0}^{*τ*′} *E*(*τ*) d*τ*. At large enough *τ*′, the error *E*(*τ*) approaches *e*_{s} and we can write ⟨*E*(*τ*)⟩ = *e*_{s} − Δ/*τ*′. Thus, the average prediction error is reduced from its asymptotic limit by an amount given by Δ/*τ*′. *T*_{pred} is also the characteristic timescale for a ball of points to randomize according to the state space density.
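As a hedged illustration, the area Δ can be estimated from a sampled error curve with the trapezoid rule. The normalization *T*_{pred} = Δ/*e*_{s} used below is an assumption: it is the simplest choice that turns the area into a timescale, but the defining equation is not reproduced in this section. The function name `t_pred` and the saturating toy curve are likewise illustrative.

```python
import math


def t_pred(E, dt, e_s):
    """Area between a sampled error curve E and its asymptote e_s, divided by e_s.

    ASSUMPTION: T_pred = Delta / e_s. The division by e_s is an illustrative
    normalization that gives the result units of time.
    """
    gap = [e_s - e for e in E]
    # Trapezoid rule for Delta = integral of (e_s - E(tau)) d tau.
    delta = sum(0.5 * (gap[n] + gap[n + 1]) * dt for n in range(len(gap) - 1))
    return delta / e_s


# Toy error curve that saturates exponentially with timescale T0.
T0, e_s, dt = 2.0, 1.5, 0.01
E = [e_s * (1.0 - math.exp(-n * dt / T0)) for n in range(5000)]
estimate = t_pred(E, dt, e_s)
print(estimate)
```

For this toy curve Δ = *e*_{s}*T*_{0}, so the estimate recovers the saturation timescale *T*_{0}; a curve that saturates more slowly yields a larger value, matching the intended reading of *T*_{pred} as a prediction horizon.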

We demonstrate our embedding approach on a noisy measurement of a single coordinate of the Lorenz system (Methods: Lorenz System) and display the results in Fig. 2B-E. We find that *T*_{pred} increases with *K* for *K* < 25 frames after which it decreases gradually and we choose *K** = 25. We project on the first *m* singular vectors and find that *T*_{pred} decreases after *m** = 3.

## THE LOW-DIMENSIONAL STATE SPACE OF *C. ELEGANS* LOCOMOTION

We leverage our state space reconstruction to elucidate the behavior of the nematode *C. elegans* freely foraging on a flat agar plate [47, 48]. In 2D, worms move by making dorsoventral sinusoidal bends along their body [49, 50], which can be captured through high-resolution tracking microscopy to give a multidimensional time series of posture changes [51]. Despite the variety of visible postures, most of the shape variation is captured by a linear combination of a small number of primitive shape dimensions (eigenworms) [22, 48], Fig. 3A.

Projections along the eigenworm dimensions describe the worm’s instantaneous shape and are not a direct indication of behavior, which arises from posture changes. Dynamical representations based on derivatives [22, 47, 52], and on sequences of postures [9, 53, 54] have been used to quantitatively explore the worm’s behavior. Importantly, the low-dimensionality of the worm’s shape space doesn’t imply simplicity and low-dimensionality of the behavioral dynamics, and there are several signs of complexity in *C. elegans* behavior, such as heavy-tailed distributions [53], hierarchical structure in posture sequences [55, 56], indications of dynamical criticality in local linear approximation of the dynamics [52], as well as simultaneous presence of stereotypy and variability in posture sequences [22, 47, 53].

To reconstruct the state space of the worm’s posture dynamics, we start with a *T* × 5 measurement matrix *Y* consisting of 5 eigenworm coefficients for a recording of duration *T* = 33600 frames (sampled at 16 Hz), Fig. 3A. We stack (*K* − 1) time-shifted copies of *Y* to give the (*T* − *K* + 1) × 5*K* state matrix. To estimate the optimal window size, we compute *T*_{pred} for each choice of *K*, as shown in Fig. 3B for a single representative worm, and choose *K** = 12. Within this window, we find that predictability saturates with *m* = 7 singular vectors, Fig. 3C. Analysis of each worm in the foraging dataset reveals a similar simplicity, Fig. S1. Despite its observed complexity, worm behavior is characterized by a low-dimensional state space.

We increase the interpretability of the worm’s state space reconstruction through a final transformation to independent components. We use the FastICA algorithm [39] on the *m* = 7 projections of the delay matrix to obtain independent coordinate directions, and we denote these directions behavioral modes Γ. We show the seven behavioral modes in Fig. 3D as curvature kymographs and note that they come in three groups, broadly corresponding to the three coarse categories of worm movement: forward, backward and turning locomotion. Specifically, the Γ_{f1} and Γ_{f2} modes correspond to the ventrally and dorsally initiated anterior-posterior body waves that worms make during forward locomotion. The reversal modes Γ_{r1} and Γ_{r2} capture the posterior-anterior body waves worms make during backward locomotion. Finally, {Γ_{t1}, Γ_{t2}, Γ_{t3}} correspond to longer-ranged body bends. Large projections along Γ_{t1} and Γ_{t2} correspond to bends directed towards the ventral or dorsal direction, respectively, during a delta-turn-like bend [48], while Γ_{t3} corresponds to an Omega-turn-like bend. In this representation, worm locomotion is approximated by linearly combining these modes with time-varying amplitudes. We find similar modes for different choices of *m** (Fig. S2) and also for an ensemble embedding constructed by concatenating all *N* = 12 foraging organisms, Figs. S3-S4. We note that the behavioral modes emerge in an unsupervised manner, with no prior information about the worm’s movement.

The topology and geometry of trajectories in the behavioral state space contain important qualitative and quantitative information about worm behavior. A 10-min trajectory is visualized in Fig. 3E as projections onto the three mode combinations described above. In the (*X*_{f1}, *X*_{f2}) and (*X*_{r1}, *X*_{r2}) planes, trajectories are colored by the centroid velocity of the worm, negative for backward locomotion and positive for forward locomotion. Trajectories in the (*X*_{t1}, *X*_{t2}, *X*_{t3}) space are colored by the mean body curvature. Large excitations in each of the three projections correspond to forward, backward and turning locomotion, respectively. Specifically, trajectories in the (*X*_{f1}, *X*_{f2}) plane form a prominent circular band indicating nearly constant amplitude body waves during forward locomotion. Reversals emerge as trajectories spiraling from the center to a maximum radius in the (*X*_{r1}, *X*_{r2}) plane, and then spiraling in as a reversal ends. Finally, deep body bends are represented as large transient orbits, with ventral turns and dorsal turns on opposite sides. Wild type worms have a ventral bias in their deep body bends, which is visible in the state space as a greater density of orbits on one side of the 3D projection.

The state space also captures relationships between different body wave patterns. For example, we find that most reversals transition to forward by way of a deep ventral bend (Fig. S5), an observation that was previously reported in the context of the escape response and pirouette reorientation sequence [57, 58]. To quantify the relative activity of each set of body waves and the phase relationships between them, we define normalized mode amplitudes *A*_{i}, where *i* ∈ {*f, r, t*}. The *A*_{i} range from 0 to 1 and measure the relative activity of different body wave patterns. We use these amplitudes to examine the behavior of *N* = 92 on-food worms where a brief laser impulse is applied to the head, resulting in a localized thermal stimulus provoking an escape response [48, 59], shown schematically in Fig. 3F. We project the posture dynamics of each stimulated worm onto the ensemble modes (Fig. S3) and show the normalized mode amplitudes averaged across all worms, Fig. 3G. The amplitudes capture the timescales and phase relationships between different body wave patterns during an escape sequence. In particular, the turning modes are strongly suppressed after the initiation of the reversal, increasing gradually as the reversal ends and worms transition into a turn. The turning amplitude then decreases, while the forward amplitude increases as worms resume forward movement in the opposite direction.

## UNSTABLE PERIODIC ORBITS AND DETERMINISTIC BEHAVIORAL VARIABILITY

The state space of worm locomotion is organized such that neighboring points correspond to similar behavioral sequences of length *K*. However, these neighboring sequences diverge with time, resulting in longer-time unpredictability, shown as an example in Fig. 4A. To understand this variability we note the strong cyclic appearance of trajectories within the projections, Fig. 3E, suggesting that cycles play an important role. We search for periodic orbits by identifying the first recurrence times in a neighborhood [60–62]. Briefly, given the state-space point at time index *i*, we find the smallest *k* > *i* such that the point at index *k* lies in the neighborhood of the point at index *i*. The intervening sequence of points is then detected as a periodic orbit of period *p* = *k* − *i* (Methods: Periodic Orbits). Across all foraging worms, the distribution of the number of periodic orbits exhibits peaks at approximately integer multiples of a minimum period *p*_{min} corresponding to the frequency of each worm’s body wave during forward locomotion, Fig. 4B (inset). We quantify the stability of each periodic trajectory by computing its maximal Floquet exponent (Methods: Floquet Exponents). The distribution of Floquet exponents is largely positive, indicating that the worm’s periodic orbits are mostly unstable, Fig. 4B. The unstable periodic orbits (UPOs) of worm behavior provide a longer timescale description of the movement and also a quantitative characterization of the trajectory divergence in Fig. 4A. We estimate the maximal Lyapunov exponent *λ*_{max} by a weighted average of the Floquet exponents of periodic orbits of increasing length, with weights that depend on *µ*_{1}, the maximal Floquet exponent of the orbit, and on *p*, its period [63]. Including orbits of duration up to *p* = 8, Fig. 4C (blue), provides an approximation of *λ*_{max}, which agrees with direct trajectory divergence estimates averaged across all worms (gray bar, see also Fig. S6 and Methods: Maximal Lyapunov Exponent). The average across random segments of the same length converges more slowly, Fig. 4C (red).
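The first-recurrence search can be sketched in plain Python as follows. This is an illustrative reading of the procedure rather than the implementation behind the reported results: the function name `first_recurrences`, the fixed radius `eps`, and the requirement that the trajectory leave the neighborhood before a return is counted are all assumptions.

```python
import math


def first_recurrences(X, eps):
    """Return (i, p) pairs where X[i + p] is the first return to within eps of X[i].

    ASSUMPTION: a return only counts after the trajectory has left the
    eps-ball around X[i]; otherwise consecutive frames would trivially match.
    """
    orbits = []
    for i in range(len(X)):
        has_left = False
        for k in range(i + 1, len(X)):
            if math.dist(X[i], X[k]) > eps:
                has_left = True
            elif has_left:
                orbits.append((i, k - i))   # candidate orbit of period p = k - i
                break
    return orbits


# Toy trajectory: exact rotation with a period of 20 frames.
X = [(math.cos(2 * math.pi * t / 20), math.sin(2 * math.pi * t / 20)) for t in range(100)]
orbits = first_recurrences(X, eps=0.1)
periods = sorted({p for _, p in orbits})
print(len(orbits), periods)
```

On real data the recovered periods would cluster around multiples of the basic body-wave period rather than coincide exactly, and each candidate segment would still need its stability assessed.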

The detected periodic orbits are interpretable in terms of commonly observed *C. elegans* behaviors. Orbits with the minimum period *p*_{min} correspond to forward and backward crawling, including orbits with a dorsal or ventral bias, Fig. S7(B-C). More surprisingly, longer periodic orbits are composites, corresponding to longer-time reorientation behaviors of the worm’s navigation and escape strategies [57, 58, 64]. In Fig. 4D (blue) we show state space trajectories of one such period-4 orbit. This orbit is composed of a reversal followed by a deep body bend and subsequent forward movement, a posture sequence previously reported in pirouette reorientation and escape behaviors [57, 58]. Though this periodic orbit is several body waves long, it is repeated almost exactly at different times during the recording. We show one such recurrence in Fig. 4D (orange), along with the corresponding posture sequences. The presence of such UPOs suggests an intriguing view of the worm’s foraging dynamics as following a complex landscape composed of unstable orbits, a picture that is rigorously correct for chaotic systems [65, 66]. Periodic orbits have also been investigated in a number of biological systems, including neuronal activities [67], human electroencephalograms [67], crayfish photoreceptors [68, 69], as well as cardiac arrhythmias and seizures [70–72].

## SYMMETRIC LYAPUNOV SPECTRUM AND DAMPED-DRIVEN HAMILTONIAN DYNAMICS

While the behavior of *C. elegans* is simpler than most animals, the quantitative dynamics of worm posture defy a straightforward interpretation or even, as yet, a model (see e.g. [73, 74] for reviews). There is rough stereotypy in the orbits corresponding to each behavior, but also large cycle-to-cycle variation. Such variability is linked to a positive maximal Lyapunov exponent and unstable periodic orbits, Fig. 4(B,C), Fig. S6, so that even within a “single” behavior such as forward crawling, each cycle is deterministically different. To more fully illuminate this variability, we examine the dynamics along all dimensions within the state space.

In an *m*-dimensional state space, local neighborhoods are sheared by the flow and are simultaneously stretched and squeezed along different directions, dynamics which are invariantly characterized by the Lyapunov exponents, *λ*_{i=1…m}. Formally, such stretching and squeezing is described by the Jacobian of the flow, which maps an *m*-dimensional spherical neighborhood to an *m*-dimensional ellipsoid. The spectrum of Lyapunov exponents is given by the infinite time average of the logarithms of the principal axes of the ellipsoid, as illustrated in Fig. 5A. Positive Lyapunov exponents reflect directions along which trajectory bundles expand, while negative exponents reflect shrinking directions.
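The stretching-and-squeezing picture can be made concrete with a standard textbook procedure: propagate an orthonormal frame with the Jacobian, re-orthonormalize at every step, and average the logarithms of the stretching factors. The sketch below applies it to the Hénon map, chosen only because its Jacobian is known in closed form; it is a stand-in, not the data-driven Jacobian estimation used for the worm. The function name and all parameter values are illustrative assumptions.

```python
import math


def henon_lyapunov(n_steps=20000, a=1.4, b=0.3, burn_in=1000):
    """Estimate both Lyapunov exponents (per iteration) of the Henon map."""
    x, y = 0.0, 0.0
    for _ in range(burn_in):                  # relax onto the attractor
        x, y = 1.0 - a * x * x + y, b * x
    u, v = (1.0, 0.0), (0.0, 1.0)             # orthonormal tangent vectors
    log_sums = [0.0, 0.0]
    for _ in range(n_steps):
        j11, j12, j21, j22 = -2.0 * a * x, 1.0, b, 0.0   # Jacobian at (x, y)
        u = (j11 * u[0] + j12 * u[1], j21 * u[0] + j22 * u[1])
        v = (j11 * v[0] + j12 * v[1], j21 * v[0] + j22 * v[1])
        # Gram-Schmidt: normalize u, then remove its component from v.
        norm_u = math.hypot(u[0], u[1])
        u = (u[0] / norm_u, u[1] / norm_u)
        overlap = v[0] * u[0] + v[1] * u[1]
        v = (v[0] - overlap * u[0], v[1] - overlap * u[1])
        norm_v = math.hypot(v[0], v[1])
        v = (v[0] / norm_v, v[1] / norm_v)
        log_sums[0] += math.log(norm_u)       # stretching along the leading axis
        log_sums[1] += math.log(norm_v)       # squeezing along the second axis
        x, y = 1.0 - a * x * x + y, b * x
    return [s / n_steps for s in log_sums]


exponents = henon_lyapunov()
print(exponents, sum(exponents))
```

One exponent comes out positive and one negative, and their sum equals ln *b* because the map contracts areas by a constant factor at every step, a simple instance of the sum of exponents measuring dissipation.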

The Lyapunov exponents reveal important information about the dynamics of a system (see e.g. [75]). The sum of the exponents is the average dissipation rate: zero for conservative systems and negative for those with dissipation. The sum of the positive exponents bounds the metric or Kolmogorov-Sinai (KS) entropy rate [76, 77], providing a principled measure of the unpredictability. In addition, the spectrum of Lyapunov exponents can reveal underlying symmetries and conservation laws. For example, continuous dynamical systems exhibit at least one zero exponent corresponding to time-translation invariance along the direction of the flow.

We compute the Lyapunov spectrum for the state space of *C. elegans* (Methods: Lyapunov Spectrum and Jacobian Estimation), and show bootstrapped density estimates of the *m* = 7 exponents across different worms, Fig. 5B. We find two positive exponents, *λ*_{1} = 0.66 (0.62, 0.69) *s*^{−1}, *λ*_{2} = 0.29 (0.26, 0.32) *s*^{−1}, and a third, near-zero exponent *λ*_{3} = 0.056 (−0.02, 0.11) *s*^{−1}. The KS entropy rate is thus bounded by the sum of positive exponents as *h*_{KS} ≤ 1 (0.93, 1.09) nats/s (note that we have restored the units of nats for ease of comparison with other entropy measures). The sum of all of the Lyapunov exponents is negative, indicating that the system is dissipative with a dissipation rate of Σ_{i} *λ*_{i} = −0.94 (−1.15, −0.78) *s*^{−1}. Although trajectory bundles expand locally, dissipation causes them to contract as a whole and relax to an attracting manifold. We estimate the dimension of the attractor as the Kaplan-Yorke dimension *D*_{KY} = 5.93 (5.75, 6.08) [78]. The combination of local expansion generating variability and local contraction generating stereotypy is an essential complexity of the worm’s posture dynamics.

The Lyapunov spectrum also exhibits a striking symmetry; exponents come in conjugate pairs that sum to the same number *α* = −0.27 (−0.3, −0.24) *s*^{−1}, Fig. 5B (inset). The entire spectrum is thus symmetric about *α*/2 (dotted line). The symmetry is also present in 6- and 8-dimensional embeddings, Fig. S8. Symmetric Lyapunov spectra have been previously observed in at least two kinds of damped-driven Hamiltonian systems: coupled oscillators with viscous damping, where *α* is the dissipation per degree of freedom [79], and thermostatted molecular dynamics simulations, where *α* is a feedback friction force that acts to maintain a dynamic equilibrium by keeping the kinetic energy of the particles constant [80–83]. Interestingly, in a biomechanical model of larval *Drosophila* locomotion, damped-driven Hamiltonian chaotic dynamics were sufficient to generate realistic forward and backward crawling, as well as more complex reorientation behaviors [84].
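The reported numbers can be cross-checked against one another with a short calculation. The sketch below builds an illustrative seven-exponent spectrum from the three leading point estimates and the pairing rule, then computes the dissipation rate, the KS bound, and the Kaplan-Yorke dimension. The four negative exponents are an assumption obtained from exact pairing (*λ*_{i} + *λ*_{8−i} = *α*, with the unpaired middle exponent at *α*/2); they are not values reported in the text.

```python
def kaplan_yorke_dimension(spectrum):
    """Kaplan-Yorke dimension: k + (sum of the k largest exponents) / |lambda_(k+1)|,
    where k is the largest count for which the partial sum stays non-negative."""
    ordered = sorted(spectrum, reverse=True)
    partial = 0.0
    for k, lam in enumerate(ordered):
        if partial + lam < 0.0:
            return k + partial / abs(lam)
        partial += lam
    return float(len(ordered))   # partial sums never turn negative


alpha = -0.27                      # reported pair sum, in 1/s
leading = [0.66, 0.29, 0.056]      # reported point estimates, in 1/s
# ASSUMPTION: exact conjugate pairing fills in the remaining exponents.
spectrum = leading + [alpha / 2.0] + [alpha - lam for lam in reversed(leading)]

dissipation = sum(spectrum)                        # equals 7 * alpha / 2
ks_bound = sum(lam for lam in spectrum if lam > 0)
d_ky = kaplan_yorke_dimension(spectrum)
print(dissipation, ks_bound, d_ky)
```

Under exact pairing the sum of the spectrum is 7*α*/2 ≈ −0.945 s^{−1}, the KS bound is about 1.006 nats/s and the Kaplan-Yorke dimension is about 5.97, all within the bootstrapped intervals quoted above, so the symmetry and the individual estimates are mutually consistent.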

## DISCUSSION

We use sequences of multidimensional data to reconstruct a maximally predictive state space (Fig. 1, Fig. 2). Conceptually, our approach is a timescale separation; short-time sequences define the reconstructed state variables while longer-time dynamics are encoded as state space trajectories. Our reconstruction *explicitly* seeks the full state information available in short-time dynamics, analogous to discovering the additional variable of velocity from the displacement time series of a simple oscillator. Such information is often added implicitly, for example through the choice of derivative filters in neural imaging [85, 86]. Both the resulting state variables and the geometry and topology of their trajectories offer important, coordinate-invariant understanding of the processes generating the dynamics.

We applied our approach to the posture time series of the locomotor behavior of the roundworm *C. elegans* and found that the state space is spanned by a 7D basis of interpretable modes Γ (Fig. 3), and their coefficients *X*, which are qualitatively similar for all worms in our foraging dataset. The basis is divided into three groups closely corresponding to forward, backward and turning locomotion. Biologically, these behaviors are linked to three classes of motor neurons: A and B ventral cord neurons, which drive backward and forward locomotion, respectively, and sublateral motor neurons such as SMB and SMD, which control deep body bends [87]. Furthermore, excitatory classes of ventral cord motor neurons were recently reported to be capable of spontaneous rhythm generation and proposed to be central pattern generators for forward and backward locomotion [88, 89]. We expect worms defective in different motor neurons to display a smaller projection on the behavioral modes in an interpretable manner. Although we have described the results for a 7D embedding, similar results are also found in 6 and 8 dimensions (Figs. S2-S4). Indeed, it is likely that higher modes capturing head movement and other subtle motions exist but carry little predictive structure in our analyzed conditions. Mutant worms defective in motor control, different sensorimotor contexts, or even faster sampling rates could reveal the presence of subtle, additional dynamics.

In our embedding, the state space trajectories retained significant variability, occupying much of the volume in the reconstructed space. A measure of this volume is the Kaplan-Yorke dimension and we find *D*_{KY} ∼ 6, not substantially smaller than the embedding dimension. One hypothesis for this variability is that worm behavior is stochastic and results from noise-induced transitions between a finite number of elements such as stable limit cycles representing individual stereotyped motifs [21, 47, 90, 91]. However, the exponential divergence of nearby state-space trajectories (Fig. S6) and the consistency of this divergence with the spectrum of unstable periodic orbits (Fig. 4), as well as the symmetric Lyapunov spectrum (Fig. 5), provide evidence for important, deterministic variation. From the perspective of deterministic chaos, behavioral dynamics are an aperiodic wandering among an infinite number of unstable periodic orbits, allowing an animal to generate an infinite number of behavioral sequences. Indeed, this agrees with the finding that the number of novel sequences in *C. elegans* behavior grows with the observation time [53]. On the other hand, stereotyped trajectories can emerge naturally as orbits with low values of the maximal Floquet exponent. Such trajectories can also be generated by stabilizing periodic orbits with control, e.g. a simple linear controller of the form *K*(**g**(*t*) − **x**(*t*)), where **g**(*t*) are the desired goal dynamics, **x**(*t*) is the current state and *K* is a control gain matrix [92, 93].

The symmetric form of the Lyapunov spectrum suggests that the worm’s behavioral dynamics can be interpreted as normal modes of a system of coupled, damped and driven, Hamiltonian oscillators,
where (*Q*_{i}, *P*_{i}) are the generalized position and momentum coordinates for the *i*^{th} normal mode. The Hamiltonian is a scalar function governing the time-independent dynamics resulting from the mechanics of the worm’s body, while *C*(*Q*_{i}, *P*_{i}, *ψ*(*t*)) encapsulates the time-dependent neuromuscular control forces due to interaction of the worm’s body with the environment, proprioceptive feedback and neural processing of various sensory stimuli *ψ*(*t*). Dynamics based on Hamiltonian structure are often associated with optimality and conservation laws, and multiple efforts have reported quantities that remain roughly constant across a range of external loads during *C. elegans* locomotion, such as the normalized wavelength of the body wave, angle of attack, bending power, and the phase relationship between the muscle activity and body curvature [94–98]. Following the example of thermostatted dynamics (designed to capture constant temperature dynamics, see e.g. [83, 99]), such emergent constants could be explained through feedback control arising from proprioceptive feedback, which is thought to underlie gait modulation in *C. elegans* [100, 101]. Our work also allows for connections between non-equilibrium thermodynamics and worm behavior. For example, worm dynamics breaks the Hamiltonian time-reversible symmetry in a continuous fashion via the dissipation rate *α*, which sets the characteristic time-scale at which dynamics can be considered time-reversible symmetric. In addition, the sum of Lyapunov exponents reported here is an estimate of the entropy production rate [102].

The dynamical invariants such as Lyapunov exponents, dimensions and entropies made accessible by our embedding approach provide important constraints and new understanding for short-time behavior consisting of neuromuscular control along with the biomechanics of the body and its environmental interaction. However, longer timescales are also present in the short periodic orbits, which are interpretable as forward/backward locomotion, and also longer time reorientation sequences such as pirouettes. Longer timescales can also be addressed through a systematic coarse-graining of the continuous state space dynamics which results in a transfer operator, see e.g. [103]. In this approach the partition itself subsumes much of the nonlinearity so that the eigenvalues of the transfer operator can provide a systematic and useful timescale separation. In contrast, linear measures like the power spectrum are often not informative on the original dynamics of complex systems [104].

While we expect a dynamical systems perspective to be generally useful in understanding natural behavior, the analysis here benefits from the relative simplicity of the worm’s foraging dynamics and the resulting interpretability of the modes. Though other settings and organisms may generate more complex embeddings, important dynamical information such as trajectory stability and dynamical invariants can still be extracted from the state space reconstruction. Embedding ideas have also been recently used to understand the global brain dynamics of *C. elegans* [105] and to identify metastable sets and slow order parameters from molecular dynamics simulations using Markov operators [106, 107].

Across wide areas of science there has been a remarkable increase in the availability of precise, multidimensional and dynamical data and new analysis ideas are emerging (see e.g. [52, 108–110]). Here, we improve on the prior work on state space reconstruction [38, 111–116], much of which was in the context of either univariate measurements or known dynamical systems and included a heuristic search of reconstruction parameters. However, challenges associated with high-dimensionality, data sampling and nonstationarity remain. For example, the one-step error for *N* samples from a *D*-dimensional dynamical system is *E*(1)/*e*_{s} ≈ *N*^{−1/*D*} [117, 118], so higher-dimensional systems require exponentially more data to keep *E*(1)/*e*_{s} ≪ 1. A related difficulty is the Euclidean metric used to find nearest neighbor distances, which is invalid even in lower-dimensional spaces with large curvature fluctuations. In this setting, it might be possible to use metric learning algorithms [119] to recover a suitable metric from data. Finally, recent progress in leveraging artificial neural networks to recover dynamical invariants [120] and to seek state-space representations with parsimonious dynamics [121] offers promising directions for combining a principled dynamical perspective with high-dimensional, real-world systems.
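The scaling *E*(1)/*e*_{s} ≈ *N*^{−1/*D*} can be inverted to give a rough sense of the data requirement. The sketch below ignores any order-one prefactor, and the function name `samples_needed` is an illustrative assumption.

```python
def samples_needed(dimension, target_ratio):
    """Invert E(1)/e_s ~ N**(-1/D) to get N ~ target_ratio**(-D).

    Order-one prefactors are ignored, so the result is only a scale estimate.
    """
    return target_ratio ** (-dimension)


# Samples needed to reach a one-step error of 10% of the saturation error.
estimates = {D: samples_needed(D, 0.1) for D in (3, 6, 9)}
for D, N in estimates.items():
    print(D, f"{N:.0e}")
```

Each additional three dimensions multiplies the required sample count by a factor of a thousand at this error level, which is why a dimension near six already calls for long recordings.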

## METHODS

### Software

Code for all analysis reported here was written in MATLAB [122] and is publicly available: https://bitbucket.org/tosifahamed/behavioral-state-space.

### Experimental Details

A brief description of the foraging and escape response datasets is given below. For more details, please see the original manuscripts [47, 48].

### Foraging Dataset

*N* = 12 L4-stage N2 worms were recorded at 32 Hz with high-resolution tracking microscopy. For the analysis here, the data were downsampled to 16 Hz. Worms were cultivated under standard conditions at 20 °C [123]. Before the assay, worms were cleaned of *E. coli* bacteria by a 1-minute immersion in NGM buffer. Worms were then placed on a 9.1 cm assay plate (Petri dish) with a 5 cm radius copper ring pressed into the agar surface for confinement. The assay started 5 minutes after the transfer and lasted 35 minutes.

### Escape Response Dataset

*N* = 92 mid- to late-L4-stage N2 worms were targeted on the head with a 100 ms, 75 mA IR laser pulse from a diode laser (*λ* = 1440 nm), resulting in a localized temperature change of approximately 0.5 °C. Images were recorded at 20 Hz for 30 s (10 s before stimulation and 20 s after stimulation). To prevent adaptation, each worm was assayed only once. To match the sampling rate of the foraging dataset, the posture time series was interpolated and downsampled to 16 Hz using the MATLAB [122] `resample` command.

### State Space Reconstruction for the Lorenz system

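We simulated the Lorenz system [124],

$$
\dot{s}_1 = \sigma (s_2 - s_1), \qquad \dot{s}_2 = s_1 (\rho - s_3) - s_2, \qquad \dot{s}_3 = s_1 s_2 - \beta s_3,
$$
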
using MATLAB’s `ode45` Runge-Kutta ODE solver [122] with a time-step *dt* = 0.01 s and error tolerances of 10^{−8}. We take the variable *s*_{1} as the observation time series *y*(*t*). To simulate a noisy observation process we add to *y*(*t*) a uniform white noise with standard deviation of % the standard deviation of *s*_{1}.
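The simulation and noisy observation process can be sketched as follows. This is a minimal illustration, not the code used for the analysis: it swaps the adaptive `ode45` solver for a fixed-step fourth-order Runge-Kutta integrator, assumes the commonly used parameter values *σ* = 10, *ρ* = 28, *β* = 8/3 and the initial condition (1, 1, 1), and treats the noise level as a free parameter `noise_fraction`, since its value is not recoverable from the text above.

```python
import random


def lorenz(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz equations (parameter values assumed)."""
    s1, s2, s3 = state
    return (sigma * (s2 - s1), s1 * (rho - s3) - s2, s1 * s2 - beta * s3)


def rk4_step(f, state, dt):
    """One fixed-step fourth-order Runge-Kutta update."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(
        s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate(n_steps, dt=0.01, state=(1.0, 1.0, 1.0)):
    """Integrate the Lorenz system; returns n_steps + 1 states."""
    trajectory = [state]
    for _ in range(n_steps):
        state = rk4_step(lorenz, state, dt)
        trajectory.append(state)
    return trajectory


def std(values):
    mean = sum(values) / len(values)
    return (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5


def noisy_observation(trajectory, noise_fraction, seed=0):
    """Observe s_1 and add uniform white noise.

    A uniform variable on [-a, a] has standard deviation a / sqrt(3), so the
    half-width is chosen to give a noise standard deviation equal to
    noise_fraction times the standard deviation of s_1.
    """
    rng = random.Random(seed)
    y = [s[0] for s in trajectory]
    half_width = noise_fraction * std(y) * 3.0 ** 0.5
    return [v + rng.uniform(-half_width, half_width) for v in y]
```

For example, `noisy_observation(simulate(10000), 0.1)` would return a noisy scalar time series *y*(*t*) of 10001 samples with a hypothetical 10% noise level.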

### Image Analysis and Posture Space Estimation

The tracking and posture space estimation follows [48]. Briefly, we parameterize the shape of a worm by tangent angles calculated at 100 points along the body image skeleton. For a recording session of *T* frames, this results in a *T* × 100 matrix **Θ**, containing the shape information for each uncrossed frame, in which the worm’s body does not intersect itself. Next, a 5-dimensional approximation of the 100-dimensional posture space is calculated by projecting the elements of **Θ** onto the basis given by the first 5 singular vectors (eigenworms) of **Θ**. For frames with a body crossing, an inverse tracking algorithm is used to identify the eigenworm projections [48].

### Worm State Space Reconstruction

Given a *d*-dimensional measurement time series of length *T*, along with an estimate of the optimal embedding window *K** and the minimum embedding dimension *m**, the state space reconstruction proceeds as follows. First, we create the *L* × *K***d* delay matrix containing delayed copies of the mean-subtracted measurements, where *L* = *T* − *K** + 1. For the postures of *C. elegans*, the measurements are composed of *d* = 5 eigenworm coefficients. Next, we perform ICA on the space formed by the first *m** singular vectors of the delay matrix using the FastICA algorithm [39] to obtain an *m**-dimensional state space spanned by the independent basis vectors, which we call behavioral modes and denote **Γ**. Projections of the delay matrix onto the state space are contained in the *L* × *m** state space matrix *X*. Each row of *X* is the behavioral state encoding the instantaneous behavior of the worm at time *t*, while the temporal sequence of rows forms a continuous trajectory in state space, which encodes the shape change dynamics of a behavioral sequence.
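The first step of this procedure, building the matrix of delayed copies, can be sketched as below. The function name `delay_matrix` and the ordering of the delays within each row (oldest frame first) are assumptions made for illustration; the subsequent SVD and ICA steps are omitted.

```python
def delay_matrix(measurements, K):
    """Stack K consecutive mean-subtracted frames into each row.

    measurements: list of T frames, each a list of d values.
    Returns a list of L = T - K + 1 rows, each of length K * d.
    """
    T = len(measurements)
    d = len(measurements[0])
    if not 1 <= K <= T:
        raise ValueError("K must satisfy 1 <= K <= T")
    # Subtract the temporal mean of each measurement channel.
    means = [sum(frame[j] for frame in measurements) / T for j in range(d)]
    centered = [[frame[j] - means[j] for j in range(d)] for frame in measurements]
    rows = []
    for t in range(T - K + 1):
        window = centered[t:t + K]
        rows.append([value for frame in window for value in frame])
    return rows
```

With *d* = 5 eigenworm coefficients and a window of *K* frames, each row would therefore hold 5*K* numbers describing a short posture sequence.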

### Choosing Reconstruction Parameters by Maximizing Predictability

To choose the reconstruction parameters (*K*, *m*) we first vary *K* in the range 1 ≤ *K* ≤ *K*_{max} and estimate *T*_{pred} in the candidate state space formed by the delayed observations. We set *K** as the minimum *K* where *T*_{pred} as a function of *K* begins to decrease. In cases where *T*_{pred} saturates but does not decrease, we choose *K** as the *K* at which *T*_{pred} saturates. As a guide, we choose *K*_{max} such that for any delay larger than *K*_{max} the autocorrelation function of the observations is close to zero. If *K*_{max} appears too short, then it can be increased step-wise until *T*_{pred}(*K*) starts decreasing. For the Lorenz system we have *K*_{max} = 100 frames, while for the worm data we have *K*_{max} = 30 frames.

Intuitively, *K** should allow the reconstruction to capture the fastest time scale of the system, which for chaotic systems is set by the period of the smallest UPO, *p*_{min}. Increasing *K** further filters across longer periods and in the limit *K* → ∞, the SVD filter becomes a discrete Fourier transform [125]. At the other end, *K** should be large enough to embed the dynamics completely. Using the bound given by Takens' embedding theorem [31, 38], we get (2*m** + 1)/*d* ≤ *K** < *p*_{min}. Empirically, we find that *p*_{min}/4 ≤ *K** ≤ *p*_{min}/2.
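The selection rule described above can be sketched as a scan over a precomputed *T*_{pred} curve. The function names and the optional `tolerance` (which decides when an increase is small enough to count as saturation) are hypothetical; in practice the confidence intervals of *T*_{pred} would inform that choice.

```python
def choose_parameter(candidates, t_pred, tolerance=0.0):
    """Return the smallest candidate after which T_pred stops increasing.

    candidates: parameter values in ascending order (e.g. K = 1..K_max).
    t_pred:     T_pred estimated for each candidate.
    Returns None if T_pred is still increasing at the end of the range,
    which signals that the range (e.g. K_max) should be enlarged.
    """
    for i in range(len(candidates) - 1):
        if t_pred[i + 1] - t_pred[i] <= tolerance:
            return candidates[i]
    return None


def within_embedding_bounds(K, m, d, p_min):
    """Check (2m + 1)/d <= K < p_min for a candidate window K."""
    return (2 * m + 1) / d <= K < p_min
```

The same scan applies to the embedding dimension: with *K** fixed, `choose_parameter` can be called on *T*_{pred} as a function of *m*.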

Once the embedding window is set to *K**, we next perform the singular value decomposition of the delay matrix. The first *m* columns of *U*, the matrix of left singular vectors, contain the normalized projections of the delay matrix onto its first *m* singular vectors. To find the embedding dimension, we vary *m* and compute *T*_{pred} as above. We set the embedding dimension *m** as the minimum *m* where *T*_{pred} as a function of *m* saturates or begins to decrease.

### Nearest Neighbor Prediction

We estimate the *τ*-step future of an observation *y*(*t*), denoted *ŷ*(*t* + *τ*), from an average of the *τ*-step futures of the *N*_{b} nearest neighbors of the corresponding state space point. Specifically, we find the *N*_{b} nearest neighbors of that point in state space and average their values after *τ* steps over all *N*_{b} neighbors. Finally, we project back to the observation space to obtain *ŷ*(*t* + *τ*). We take only the transverse nearest neighbors (i.e. neighbors that are not in succession). The transverse nearest neighbors of the point at time *t′* are identified by the local minima of *R*_{t′}(*t*), estimated using the `findpeaks` function in MATLAB [122], where *R*_{t′}(*t*) is the distance between the point at time *t′* and all other points in state space. We quantify the *τ*-step prediction accuracy by the root mean squared error

$$
E(\tau) = \sqrt{\frac{1}{N} \sum_{t} \left\lVert \hat{y}(t+\tau) - y(t+\tau) \right\rVert^{2}}
$$

for *N* = 10^{4} different test points in the measurement time series. The predictions are made up to a maximum prediction time *τ*_{max}, which is long enough for *E*(*τ*) to saturate to *e*_{s}. The root mean squared error is also a function of *N*_{b}, the total number of nearest neighbors used for prediction. Making this dependence explicit, we write *E*(*τ*, *N*_{b}) when *N*_{b} is considered a variable. We set the number of neighbors by minimizing the bias and variance of the one-step prediction error.
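A minimal sketch of this predictor is given below. It works directly on a list of state space points and returns the predicted state, so the final projection back to observation space is omitted. The function names are illustrative, and the transverse neighbors are found with a simple local-minimum scan rather than with `findpeaks`.

```python
import math


def distance(a, b):
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def transverse_neighbors(states, t_ref, n_neighbors, horizon):
    """Times of the closest local minima of the distance curve R(t).

    Using local minima keeps one point per passage near the reference,
    so temporally successive points are not counted as separate neighbors.
    """
    last = len(states) - horizon  # neighbors need a future `horizon` steps ahead
    R = [distance(states[t], states[t_ref]) for t in range(last)]
    minima = [
        t for t in range(1, last - 1)
        if t != t_ref and R[t] < R[t - 1] and R[t] <= R[t + 1]
    ]
    minima.sort(key=lambda t: R[t])
    return minima[:n_neighbors]


def predict(states, t_ref, tau, n_neighbors):
    """Average the tau-step futures of the transverse nearest neighbors."""
    neighbors = transverse_neighbors(states, t_ref, n_neighbors, tau)
    if not neighbors:
        raise ValueError("no transverse neighbors found")
    dim = len(states[0])
    return [
        sum(states[t + tau][j] for t in neighbors) / len(neighbors)
        for j in range(dim)
    ]


def rmse(states, test_points, tau, n_neighbors):
    """Root mean squared tau-step prediction error over the test points."""
    squared = [
        distance(predict(states, t, tau, n_neighbors), states[t + tau]) ** 2
        for t in test_points
    ]
    return math.sqrt(sum(squared) / len(squared))
```

Evaluating `rmse` over a range of `tau` values gives an error growth curve of the kind used to estimate *T*_{pred}.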

### Prediction Timescale *T*_{pred}

In cases where the error growth *E*(*τ*) is well approximated by a sigmoid (conjectured by Lorenz for chaotic systems with a single Lyapunov exponent *λ* [41]), one can show that *T*_{pred} depends on *λ* through a 1/*λ* term and on the one-time-step error *e*_{1} = *E*(1). Based on information-theoretic considerations, Farmer derived an upper bound for the predictability time scale in terms of the information dimension *D*_{I} and the Kolmogorov-Sinai entropy rate *h*_{KS}, which is consistent with our calculation for the sigmoid assumption. Importantly, these estimates shed light on the asymptotic behavior of *T*_{pred}. For small values of *K* and *m*, the error is affected by some fraction of false nearest neighbors due to underembedding [44], leading to an overestimate of the local expansion rate and consequently of the positive Lyapunov exponents. This causes a drop in *T*_{pred} via the 1/*λ* term. On the other hand, as we increase *K*, the average Euclidean distance between nearest neighbors, *e*_{1}, steadily increases, leading to a decrease in *T*_{pred} for high dimensions. Between these two extremes we find a range of suitable values for the embedding window *K*.

The SVD coordinates are weighted by decreasing singular values, which correspond to the variance of the data projected along the different singular vectors. In the noiseless case, the singular values decay towards zero, while in the presence of noise they decay before saturating at the standard deviation of the noise (termed the noise floor in Ref. [31]); higher dimensions are thus generically dominated by noise in this case. Consequently, *e*_{1} does not increase as a function of *m* in the noiseless case, leading *T*_{pred} to saturate after successful embedding. However, in the presence of noise *e*_{1} increases, causing *T*_{pred} to decrease.

### Calculation of *T*_{pred}

We developed a fixed-point algorithm to estimate *T*_{pred}. We begin with an initial guess of *e*_{s}, labeled *e*_{s}^{0}, and a time beyond which *E*(*τ*) is approximately equal to *e*_{s}^{0}. Next, noting that for large times ∫*E*(*τ*) *dτ* = *e*_{s}*τ* − Δ, we fit a line to a numerical estimate of ∫*E*(*τ*) *dτ* from this time to *τ*_{max}. The slope of this line provides the next estimate of *e*_{s}, labeled *e*_{s}^{1}, and the intercept provides the next estimate of the area Δ, labeled Δ^{1}. We use *e*_{s}^{1} to again estimate the saturation time and fit a line to ∫*E*(*τ*) *dτ* from that time to *τ*_{max}, repeating the process until the estimates *e*_{s}^{j} and Δ^{j} converge. Using the final estimates of Δ and *e*_{s}, we obtain a robust estimate of *T*_{pred}. A schematic of this iterative process is shown in Fig. S9. In our experience it takes only 3-4 iterations for the estimates to converge. To obtain the error bars we bootstrap across the prediction test points, generating 100 bootstrapped *E*(*τ*) curves along with *T*_{pred} estimates for each. These are then used to estimate the 95% confidence intervals of *T*_{pred}.
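The iteration can be sketched as follows, under two assumptions that are not taken from the text: the saturation time is chosen as the first time at which *E*(*τ*) reaches a fixed fraction (here 0.95) of the current *e*_{s} estimate, and the final time scale is computed as *T*_{pred} = Δ/*e*_{s}. All function names are illustrative, and the bootstrap step is omitted.

```python
def cumulative_integral(E, dt):
    """Trapezoidal estimate of the running integral of E(tau)."""
    total = [0.0]
    for i in range(1, len(E)):
        total.append(total[-1] + 0.5 * (E[i - 1] + E[i]) * dt)
    return total


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return slope, my - slope * mx


def saturation_index(E, e_s, fraction=0.95):
    """First index at which E reaches `fraction` of e_s (assumed criterion)."""
    for i, value in enumerate(E):
        if value >= fraction * e_s:
            return i
    return len(E) - 1


def estimate_t_pred(E, dt, n_iter=20, tol=1e-6):
    """Fixed-point estimate of (T_pred, e_s, Delta) from an error curve."""
    taus = [i * dt for i in range(len(E))]
    integral = cumulative_integral(E, dt)
    e_s, delta = E[-1], None  # initial guess: last value of the curve
    for _ in range(n_iter):
        start = min(saturation_index(E, e_s), len(E) - 2)
        slope, intercept = fit_line(taus[start:], integral[start:])
        new_e_s, new_delta = slope, -intercept
        converged = (
            delta is not None
            and abs(new_e_s - e_s) < tol
            and abs(new_delta - delta) < tol
        )
        e_s, delta = new_e_s, new_delta
        if converged:
            break
    return delta / e_s, e_s, delta  # T_pred = Delta / e_s is an assumption
```

For a saturating exponential error curve with time constant *T*, this definition of *T*_{pred} recovers approximately *T*, which provides a simple check of the sketch.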

### Periodic Orbits

To detect periodic orbits of length *p* we identify close recurrences in the state space after approximately *p* time steps [60–62]. Specifically, we start at a point **x**(*i*) in state space and find the smallest *k* > *i* such that **x**(*k*) lies within a distance *ϵ* of **x**(*i*). The sequence of states from **x**(*i*) to **x**(*k*) is then stored as a periodic orbit of period *p* = *k* − *i*. Again, we only consider transverse recurrences to avoid sequential points. If no such *k* can be found, then no periodic orbits exist at the scale of *ϵ*. The distance scale of the recurrence *ϵ* is calculated through the function *ϵ*(*r*, *t*) as defined in Ref. [61], which gives the *r*^{th} smallest distance between state space points separated by time *t*. An example of this function is shown in Fig. S7A. Small values of *ϵ*(*r*, *t*), identified by local minima of *ϵ*(*r*, *t*), indicate close recurrences after times *t**, and in consequence reveal the existence of periodic orbits of length *t**. If *ϵ*(*r*, *t*) does not show any local minima, then periodic orbits cannot be detected from the data. In this manner *ϵ*(*r*, *t**) gives the minimum distance at which we must look to find a periodic orbit of length *t**. The smallest recurrence time, corresponding to the first local minimum of *ϵ*(*r*, *t*), equals the smallest period *p*_{min} detected in the data. In the Jacobian and maximal exponent calculations described below, *ϵ** is the distance corresponding to *p*_{min}. Finally, we set *r* = *m**, the dimension of the reconstructed state space.
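The recurrence search for a single starting point can be sketched as below. To restrict the search to transverse recurrences, this illustration requires the trajectory to first leave the *ϵ*-ball around the starting point before a return is accepted; that criterion, like the function name, is an assumption made for the sketch.

```python
import math


def distance(a, b):
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def find_periodic_orbit(states, i, epsilon):
    """Find the first return of the trajectory to within epsilon of states[i].

    Returns (orbit, period), where orbit is the list of states from time i
    to the recurrence time k and period = k - i, or None if the trajectory
    never returns at this distance scale.
    """
    has_left = False
    for k in range(i + 1, len(states)):
        d = distance(states[k], states[i])
        if not has_left:
            # Skip successive points that are still inside the epsilon-ball.
            has_left = d >= epsilon
            continue
        if d < epsilon:
            return states[i:k + 1], k - i
    return None
```

Running this search from every starting point and collecting the returned periods would give a histogram of recurrence times at the chosen distance scale.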

### Maximal Lyapunov Exponent *λ*_{max}

Our test for the exponential divergence of neighboring trajectories follows standard approaches [126]. Specifically, we consider a reference trajectory and its nearest neighbors within a distance *ϵ** (see Methods: Periodic Orbits). We then track the average distance between the reference and neighboring trajectories over time to obtain the curve *δ*_{t′}(*τ*). A significant linear region in the ⟨log *δ*_{t′}(*τ*)⟩_{t′} curve indicates an exponential divergence of neighboring trajectories, while the slope of the linear region provides an estimate of the maximal Lyapunov exponent *λ*_{max}. There is typically a transient before the exponential growth, during which the perturbation vector aligns itself with the Lyapunov vector corresponding to the maximal exponent. In the Lorenz system, this transient arises from the finite size of the perturbation and vanishes in the infinitesimal limit. To avoid effects due to non-stationarities we perform this calculation on the final two minutes of the recording.
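The divergence test can be illustrated on a toy system. The sketch below uses a scalar time series from the fully chaotic logistic map rather than a reconstructed state space, averages log distances over all pairs of points that start closer than a hypothetical scale `epsilon`, and fits a line to the whole curve instead of selecting a linear region by inspection. These simplifications, and all names, are assumptions of the example.

```python
import math


def logistic_series(n, x0=0.3):
    """Toy chaotic time series: x -> 4 x (1 - x)."""
    xs = [x0]
    for _ in range(n - 1):
        xs.append(4.0 * xs[-1] * (1.0 - xs[-1]))
    return xs


def mean_log_divergence(series, epsilon, max_tau, min_separation=10):
    """<log delta(tau)> averaged over pairs initially closer than epsilon.

    Pairs closer in time than min_separation are skipped so that
    successive points are not treated as neighbors.
    """
    n = len(series) - max_tau
    sums = [0.0] * (max_tau + 1)
    n_pairs = 0
    for i in range(n):
        for j in range(i + min_separation, n):
            if abs(series[i] - series[j]) >= epsilon:
                continue
            dists = [abs(series[i + tau] - series[j + tau]) for tau in range(max_tau + 1)]
            if min(dists) == 0.0:
                continue  # log of zero distance is undefined
            for tau, dist in enumerate(dists):
                sums[tau] += math.log(dist)
            n_pairs += 1
    if n_pairs == 0:
        raise ValueError("no neighbor pairs found at this epsilon")
    return [s / n_pairs for s in sums]


def slope(ys):
    """Least-squares slope of ys against 0, 1, 2, ..."""
    n = len(ys)
    mx, my = (n - 1) / 2.0, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in enumerate(ys))
    return num / sum((x - mx) ** 2 for x in range(n))


curve = mean_log_divergence(logistic_series(1500), epsilon=1e-3, max_tau=4)
lambda_max = slope(curve)  # in units of inverse time steps
```

A positive slope indicates exponential divergence; for real data the fit would be restricted to the linear region of the curve and converted to physical units using the sampling rate.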

### Jacobian Estimation

The Jacobian at the point **x**(*i*) in state space is the derivative of the dynamics at **x**(*i*), forming the local linear approximation of the dynamics at that point. We use a modified version of the Jacobian estimation algorithm described in Ref. [127], which solves a weighted regression problem in which points are assigned weights according to their distance from **x**(*i*), as set by a weighting function. **B**_{x(i)} is a (*T* − *K**) × (*m* + 1) matrix containing all weighted state space points concatenated with a column of ones, while the weighted successors of these points form a (*T* − *K**) × *m* matrix. Each row of both matrices is weighted by the value of the weighting function for the corresponding point. The estimated local Jacobian matrix is then obtained from the product of the pseudoinverse of **B**_{x(i)}, which we compute using the `pinv` function in MATLAB [122], and the matrix of weighted successors. Note that *ϵ** is the distance scale corresponding to the minimum period recurrence (see Methods: Periodic Orbits).
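The idea of a distance-weighted local linear fit can be illustrated in one dimension, where the Jacobian reduces to a single slope and the regression has a closed form. The Gaussian weighting function used here is a hypothetical stand-in, the weights are applied to the squared residuals, and the logistic map serves only as a toy system with a known derivative.

```python
import math


def logistic_series(n, x0=0.3):
    """Toy chaotic time series: x -> 4 x (1 - x)."""
    xs = [x0]
    for _ in range(n - 1):
        xs.append(4.0 * xs[-1] * (1.0 - xs[-1]))
    return xs


def local_jacobian_1d(series, x_ref, epsilon):
    """Weighted least-squares slope of x(t+1) against x(t) near x_ref.

    Each point is weighted by exp(-(distance / epsilon)**2); this kernel
    is an assumption of the sketch, not the weighting used in the paper.
    """
    points, successors = series[:-1], series[1:]
    weights = [math.exp(-((x - x_ref) / epsilon) ** 2) for x in points]
    total = sum(weights)
    mean_x = sum(w * x for w, x in zip(weights, points)) / total
    mean_y = sum(w * y for w, y in zip(weights, successors)) / total
    cov = sum(
        w * (x - mean_x) * (y - mean_y)
        for w, x, y in zip(weights, points, successors)
    )
    var = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, points))
    return cov / var


series = logistic_series(3000)
estimate = local_jacobian_1d(series, x_ref=0.3, epsilon=0.02)
exact = 4.0 * (1.0 - 2.0 * 0.3)  # derivative of 4 x (1 - x) at x = 0.3
```

In *m* dimensions the same fit becomes a matrix regression with an added column of ones for the offset, solved with a pseudoinverse as described above.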

### Floquet Exponents

The real parts of the Floquet exponents of a periodic orbit, which measure its stability, are equal to the Lyapunov exponents of the orbit [128]. To estimate the Floquet exponents of a periodic orbit, we estimate the maximal local Lyapunov exponent along the orbit using established algorithms [129, 130]. We use a recursive QR iteration to obtain the eigenvalues of the product of Jacobian matrices along a periodic orbit. The average of the logarithms of the eigenvalues gives the *m* local Lyapunov exponents of the orbit in an *m*-dimensional state space. The maximal exponent is then the Floquet exponent of the periodic orbit.
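A standard form of the QR iteration can be sketched for a two-dimensional state space. At each step the Jacobian is applied to the current orthonormal frame, the result is re-orthonormalized, and the logarithms of the diagonal entries of the triangular factor are accumulated. The hand-written 2 × 2 Gram-Schmidt routine and the function names are choices made to keep the example self-contained; they are not the referenced algorithms [129, 130].

```python
import math


def matmul2(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def qr2(m):
    """Gram-Schmidt QR factorization of an invertible 2x2 matrix."""
    c0, c1 = (m[0][0], m[1][0]), (m[0][1], m[1][1])
    r00 = math.hypot(c0[0], c0[1])
    q0 = (c0[0] / r00, c0[1] / r00)
    r01 = q0[0] * c1[0] + q0[1] * c1[1]
    u = (c1[0] - r01 * q0[0], c1[1] - r01 * q0[1])
    r11 = math.hypot(u[0], u[1])
    q1 = (u[0] / r11, u[1] / r11)
    Q = [[q0[0], q1[0]], [q0[1], q1[1]]]
    R = [[r00, r01], [0.0, r11]]
    return Q, R


def orbit_exponents(jacobians, n_cycles=1):
    """Lyapunov exponents (per time step) from Jacobians along an orbit.

    The orbit is traversed n_cycles times; log|R_ii| is averaged over steps.
    """
    Q = [[1.0, 0.0], [0.0, 1.0]]
    sums, steps = [0.0, 0.0], 0
    for _ in range(n_cycles):
        for J in jacobians:
            Q, R = qr2(matmul2(J, Q))
            sums[0] += math.log(R[0][0])
            sums[1] += math.log(R[1][1])
            steps += 1
    return [s / steps for s in sums]
```

A useful consistency check is that the exponents sum to the average of log |det **J**| along the orbit, since each QR step preserves the determinant up to sign.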

### Lyapunov Exponents of Random Sequences

To calculate the exponents for short random sequences in Fig. 4C, we proceed as above, but instead of the sequence being a periodic orbit, it is formed by starting at a random point in state space and following it for the duration of a periodic orbit. As a result, these sequences are not necessarily recurrent.

## ACKNOWLEDGMENTS

We thank David Jordan, Ian Ehteridge and Antonio Celani for comments. This work was supported by OIST Graduate University (TA, GJS), a program grant from the Netherlands Organization for Scientific Research (AC, GJS), and by Vrije Universiteit Amsterdam (GJS).

## References

- [1].
- [2].
- [3].
- [4].
- [5].
- [6].
- [7].
- [8].
- [9].
- [10].
- [11].
- [12].
- [13].
- [14].
- [15].
- [16].
- [17].
- [18].
- [19].
- [20].
- [21].
- [22].
- [23].
- [24].
- [25].
- [26].
- [27].
- [28].
- [29].
- [30].
- [31].
- [32].
- [33].
- [34].
- [35].
- [36].
- [37].
- [38].
- [39].
- [40].
- [41].
- [42].
- [43].
- [44].
- [45].
- [46].
- [47].
- [48].
- [49].
- [50].
- [51].
- [52].
- [53].
- [54].
- [55].
- [56].
- [57].
- [58].
- [59].
- [60].
- [61].
- [62].
- [63].
- [64].
- [65].
- [66].
- [67].
- [68].
- [69].
- [70].
- [71].
- [72].
- [73].
- [74].
- [75].
- [76].
- [77].
- [78].
- [79].
- [80].
- [81].
- [82].
- [83].
- [84].
- [85].
- [86].
- [87].
- [88].
- [89].
- [90].
- [91].
- [92].
- [93].
- [94].
- [95].
- [96].
- [97].
- [98].
- [99].
- [100].
- [101].
- [102].
- [103].
- [104].
- [105].
- [106].
- [107].
- [108].
- [109].
- [110].
- [111].
- [112].
- [113].
- [114].
- [115].
- [116].
- [117].
- [118].
- [119].
- [120].
- [121].
- [122].
- [123].
- [124].
- [125].
- [126].
- [127].
- [128].
- [129].
- [130].