## Abstract

Classical models of perceptual decision-making assume that animals use a single, consistent strategy to form decisions, or that decision-making strategies evolve slowly over time. Here we present new analyses suggesting that this common view is incorrect. We analyzed data from two mouse decision-making experiments and found that choice behavior relies on an interplay between multiple interleaved strategies. These strategies, characterized by states in a hidden Markov model, persist for tens to hundreds of trials before switching, and may alternate multiple times within a session. The identified strategies were highly consistent across animals, consisting of a single “engaged” state, in which decisions relied heavily on the sensory stimulus, and several biased or disengaged states in which errors frequently occurred. These results provide a powerful alternate explanation for “lapses” often observed in psychophysical experiments, and suggest that standard measures of performance mask the presence of dramatic changes in strategy across trials.

## 1 Introduction

Understanding the computations performed in the brain will require a comprehensive characterization of behavior [6, 29, 39]. This realization has fueled a recent surge in methods devoted to the measurement, quantification, and modeling of natural behaviors [2, 12, 37, 59, 67]. Historically, studies of perceptual decision-making behavior have tended to rely on models derived from signal detection theory (SDT) [30, 38] or evidence accumulation [8, 27, 52]. More recently, approaches based on reinforcement learning have also been used to model the effects of context, reward, and trial-history on perceptual decision-making behavior [34, 40, 50, 63]. In all cases, however, these approaches describe decision-making in terms of a single strategy that does not change abruptly across trials or sessions.

One puzzling aspect of sensory decision-making behavior is the presence of so-called “lapses”, in which an observer makes an error despite the availability of strong sensory evidence. The term itself suggests an error that arises from a momentary lapse in attention or memory, as opposed to an inability to perceive the sensory stimulus. Lapses arise in all species, but are surprisingly frequent in rodent experiments, where lapses can comprise up to 10-20% of all trials [35, 48, 49].

The standard approach for modeling lapses involves augmenting the classic psychometric curve with a “lapse parameter”, which characterizes the probability that the observer simply ignores the stimulus on any given trial [13, 51, 66]. This model can be conceived as a mixture model [20, 43, 46] in which, on every trial, the animal flips a biased coin to determine whether or not to pay attention to the stimulus when making its choice. Previous literature has offered a variety of explanations for lapses, including inattention, motor error, and incomplete knowledge of the task [13, 41, 66], and recent work has argued that they reflect an active process of uncertainty-guided exploration [50]. However, a common thread to these explanations is that lapses arise independently across trials, in a way that does not depend on the time course of other lapses.

Here we show that lapses do not arise independently, but depend heavily on latent states that underlie decision-making behavior. We use model-based analyses to show that mice rely on discrete decision-making strategies that persist for tens to hundreds of trials. One of these states corresponds to an “engaged” strategy, in which the animal’s choices are strongly influenced by the sensory stimulus, while other states correspond to biased or weakly stimulus-dependent strategies. These analyses show that lapses arise primarily during long sequences of trials when the animal is in a biased or disengaged state. Conversely, we find that animals with high apparent lapse rates may nevertheless be capable of high-accuracy performance for extended blocks of trials.

Our modeling framework consists of a hidden Markov Model (HMM) with states corresponding to different decision-making strategies. Within each state, the animal’s strategy is parameterized by a Bernoulli generalized linear model (GLM). The resulting “GLM-HMM” framework [5, 12, 22] includes the classic lapse model as a special case, where the probability of entering the stimulus-independent “lapse state” is the same on every trial. However, the framework allows for a variety of more complex behaviors in which multiple decision-making strategies trade off in a state-dependent manner over longer timescales.

We used the GLM-HMM to analyze choice data from two large cohorts of mice performing different visual detection tasks [35, 48]. We showed that lapses in these experiments were the result of mice switching between relatively engaged and disengaged decision-making strategies *within* a session. While lapse events are typically expected to last for a single trial, these strategies persisted for tens to hundreds of trials. We studied the statistics of data simulated from the best-fitting GLM-HMM, and found that the long choice run-lengths (a run refers to a sequence of consecutive trials in which a mouse makes the same repeated choice) observed in the real data exist in data simulated from a GLM-HMM but not in data simulated from the classic lapse model, indicating that the GLM-HMM better captures the true behavior of real mice. Finally, we validated our interpretation of the GLM-HMM’s latent states by comparing the response times and violation rates (quantities not used to fit our model) associated with engaged and disengaged states. We found that the most extreme response times were typically associated with the disengaged states, which is consistent with previous findings linking accuracy and response times [32, 53, 61]. Taken together, these results shed substantial new light on the factors governing sensory decision-making in rodents, and provide a powerful set of tools for identifying hidden states in behavioral data.

## 2 Results

### 2.1 The classic lapse model for sensory decision-making

A common approach for analyzing data from two-choice perceptual decision-making experiments involves the psychometric curve, which describes the probability that the animal chooses one option (e.g., “rightward”) as a function of the stimulus value [13, 51, 66]. The psychometric curve is commonly parameterized as a sigmoidal function that depends on a linear function of the stimulus plus an offset or bias. This sigmoid rises from a minimum value of *γ_{r}* to a maximal value of 1 – *γ_{l}*, where *γ_{r}* and *γ_{l}* denote “lapse” parameters, which describe the probability of making a rightward or leftward choice independent of the stimulus value. Thus, the probability of a rightward choice is always at least *γ_{r}*, and it cannot be greater than 1 – *γ_{l}*. In what follows, we will refer to this as the “classic lapse model of choice behavior”, which can be written:

$$p(y_t = 1 \mid \mathbf{x}_t) = \gamma_r + (1 - \gamma_r - \gamma_l)\,\frac{1}{1 + e^{-\mathbf{x}_t \cdot \mathbf{w}}}, \tag{1}$$

where *y_{t}* ∈ {0, 1} represents the choice (left or right) that an animal makes at trial *t*, **x**_{t} is a vector of covariates, and **w** is a vector of weights that describes how much each covariate influences the animal’s choice. Note that **x**_{t} includes both the stimulus and a constant ‘1’ element to capture the bias or offset, but it may also include other covariates that empirically influence choice, such as previous choices, stimuli, and rewards [11, 25, 44].
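For concreteness, the lapse-model choice probability can be sketched in a few lines of Python (an illustrative implementation with made-up weights and lapse rates, not the authors’ fitting code):

```python
import numpy as np

def lapse_psychometric(x, w, gamma_r, gamma_l):
    """P(rightward choice) under the classic lapse model:
    gamma_r + (1 - gamma_r - gamma_l) * sigmoid(x . w)."""
    p_engaged = 1.0 / (1.0 + np.exp(-np.dot(x, w)))
    return gamma_r + (1.0 - gamma_r - gamma_l) * p_engaged

# x contains the signed stimulus and a constant '1' bias element
w = np.array([5.0, 0.2])        # illustrative stimulus and bias weights
gamma_r, gamma_l = 0.1, 0.1
p_left_stim  = lapse_psychometric(np.array([-1.0, 1.0]), w, gamma_r, gamma_l)
p_right_stim = lapse_psychometric(np.array([+1.0, 1.0]), w, gamma_r, gamma_l)
```

Because the lapse terms bound the curve away from 0 and 1, even a very strong stimulus cannot push the choice probability outside [*γ_{r}*, 1 – *γ_{l}*].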

Although the classic lapse model can be viewed as defining a particular sigmoid-shaped curve relating the stimulus strength to behavior (Fig. 1c), it can equally be viewed as a mixture model [20, 43, 46]. In this interpretation, we regard the animal as having an internal state *z _{t}* that takes on one of two different values on each trial, namely “engaged” or “lapse”. If the animal is engaged, it makes its choice according to the classic sigmoid curve (which saturates at 0 and 1). If lapsing, it ignores the stimulus and makes its choice based only on the relative probabilities of a left and right lapse. Mathematically, this can be written:
$$p(y_t = 1 \mid \mathbf{x}_t, z_t) = \begin{cases} \dfrac{1}{1 + e^{-\mathbf{x}_t \cdot \mathbf{w}}} & \text{if } z_t = \text{“engaged”} \\[4pt] \dfrac{\gamma_r}{\gamma_r + \gamma_l} & \text{if } z_t = \text{“lapse”,} \end{cases} \tag{2}$$

where *p*(*z_{t}* = “lapse”) = (*γ_{r}* + *γ_{l}*) and *p*(*z_{t}* = “engaged”) = 1 – (*γ_{r}* + *γ_{l}*). In this interpretation, the animal flips a biased coin, with fixed probability (*γ_{r}* + *γ_{l}*), on each trial and then adopts one of two strategies based on the outcome—a strategy that depends on the stimulus vs. one that ignores it. Note that the animal can make a correct choice in the lapse state through pure luck; we use “lapse” here simply to indicate that the animal is not relying on the stimulus when making its decision.

Viewing the classic lapse model as a mixture model highlights some of its limitations. First, it assumes that animals switch between only two decision-making strategies. Second, it assumes that lapses occur independently in time, according to an independent Bernoulli random variable on each trial. Finally, the model assumes that choices in the “lapse” state are fully independent of the stimulus, neglecting the possibility that they are still weakly stimulus dependent [13], or are influenced by other covariates such as reward or choice history [11, 25]. These limitations motivate us to consider a more general family of models, which includes the classic lapse model as a special case.

### 2.2 A model for decision-making with multiple strategies

Recognizing the limitations of the classic lapse model, we propose to analyze perceptual decision-making behavior using a framework based on Hidden Markov models (HMMs) with Bernoulli generalized linear model (GLM) observations [12, 22]. The resulting “GLM-HMM” framework, also known as an input-output HMM [5], allows for an arbitrary number of states, which can persist for an extended number of trials and exhibit different dependencies on the stimulus and other covariates.

A GLM-HMM has two basic pieces: an HMM governing the distribution over latent states, and a set of state-specific GLMs, specifying the decision-making strategy employed in each state (see Fig. 1). For a GLM-HMM with *K* latent states, the HMM has a *K* × *K* transition matrix *A* specifying the probability of transitioning from any state to any other,
$$A_{jk} = p(z_t = k \mid z_{t-1} = j),$$

where *z*_{t–1} and *z_{t}* indicate the latent state at trials *t* – 1 and *t*, respectively. The “Markov” property of the HMM is that the state on any trial depends only on the state from the previous trial, and the “hidden” property refers to the fact that states are latent or hidden from external observers. For completeness, the HMM also has a distribution over initial states, given by a *K*-element vector **π** whose elements sum to 1, giving *p*(*z*_{1} = *k*) = **π**_{k}.

To describe the state-dependent mapping from inputs to decisions, the GLM-HMM contains *K* independent Bernoulli GLMs, each defined by a weight vector specifying how inputs are integrated in that particular state. The probability of a rightward choice (*y_{t}* = 1) given the input vector **x**_{t} and the latent state *z_{t}* is given by

$$p(y_t = 1 \mid \mathbf{x}_t, z_t = k) = \frac{1}{1 + e^{-\mathbf{x}_t \cdot \mathbf{w}_k}},$$

where **w**_{k} denotes the GLM weights for latent state *k* ∈ {1, …, *K*}. The full set of parameters for a GLM-HMM can therefore be denoted Θ = {**π**, *A*, {**w**_{k}}}, consisting of an initial probability vector **π**, a transition matrix *A*, and a set of state-specific GLM weights {**w**_{k}}.
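As a sketch of this generative process, the following simulation draws latent states from a sticky transition matrix and choices from the corresponding per-state Bernoulli GLM (all parameter values here are illustrative assumptions, not fitted values):

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed 3-state GLM-HMM parameters (illustrative only)
pi = np.array([0.8, 0.1, 0.1])           # initial state distribution
A = np.array([[0.96, 0.02, 0.02],        # "sticky" transition matrix:
              [0.03, 0.94, 0.03],        # large diagonal entries make
              [0.03, 0.03, 0.94]])       # states persist for many trials
W = np.array([[6.0,  0.0],               # engaged: strong stimulus weight
              [0.5,  2.0],               # biased right: large bias weight
              [0.5, -2.0]])              # biased left

def simulate(n_trials):
    stim = rng.uniform(-1, 1, size=n_trials)
    X = np.column_stack([stim, np.ones(n_trials)])   # stimulus + bias column
    z = np.empty(n_trials, dtype=int)
    y = np.empty(n_trials, dtype=int)
    z[0] = rng.choice(3, p=pi)
    for t in range(n_trials):
        if t > 0:
            z[t] = rng.choice(3, p=A[z[t - 1]])      # Markov state transition
        p_right = 1.0 / (1.0 + np.exp(-X[t] @ W[z[t]]))
        y[t] = rng.random() < p_right                # Bernoulli-GLM choice
    return X, z, y

X, z, y = simulate(1000)
```

With self-transition probabilities near 0.95, a 1000-trial simulation typically contains only a few dozen state switches, so each strategy persists for runs of many consecutive trials.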

It is worth noting that the classic lapse model described in Eq. 1 and Eq. 2 corresponds to a restricted 2-state GLM-HMM. If we consider state 1 to be “engaged” and state 2 to be the “lapse” state, then the state-1 GLM has weights **w**_{1} = **w**, and the state-2 GLM has all weights set to 0 except the bias weight, which is equal to – log(*γ_{l}*/*γ_{r}*). The transition matrix has identical rows, with probability 1 – (*γ_{r}* + *γ_{l}*) of going into state 1 and probability (*γ_{r}* + *γ_{l}*) of going into state 2 at the next trial, regardless of the current state. This ensures that the probability of a lapse on any given trial is stimulus-independent and does not depend on the previous trial’s state. Fig. 1a-c shows an illustration of the classic lapse model formulated as a 2-state GLM-HMM.
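This equivalence is easy to check numerically; the sketch below (with arbitrary illustrative lapse parameters) verifies that the stated bias weight reproduces the lapse-state choice probability *γ_{r}*/(*γ_{r}* + *γ_{l}*), and that identical transition-matrix rows make the lapse probability independent of the previous state:

```python
import numpy as np

gamma_r, gamma_l = 0.1, 0.05   # illustrative lapse parameters

# Lapse-state GLM: all weights zero except a bias of -log(gamma_l / gamma_r)
bias = -np.log(gamma_l / gamma_r)
p_right_lapse = 1.0 / (1.0 + np.exp(-bias))
# sigmoid(-log(gamma_l / gamma_r)) == gamma_r / (gamma_r + gamma_l)
assert np.isclose(p_right_lapse, gamma_r / (gamma_r + gamma_l))

# Transition matrix with identical rows: the lapse probability is the same
# on every trial, regardless of the previous state
p_lapse = gamma_r + gamma_l
A = np.array([[1 - p_lapse, p_lapse],
              [1 - p_lapse, p_lapse]])
assert np.allclose(A[0], A[1])
```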

However, there is no general reason to limit our analyses to this restricted form of the GLM-HMM. By allowing the model to have more than two states, multiple states with non-zero stimulus weights, and transition probabilities that depend on the current state, we obtain a model family with a far richer set of dynamic decision-making behaviors. Fig. 1d shows an example GLM-HMM with three latent states, all of which have high probability of persisting for multiple trials. Intriguingly, the psychometric curve arising from this model (Fig. 1f) is indistinguishable from that of the classic lapse model. Thus, the psychometric curve, which is simply the average probability of a choice given the stimulus, cannot provide insight into the dynamics of decision-making across trials.

### 2.3 Mice switch between multiple strategies during visual decision-making

To examine whether animals employ multiple strategies during decision-making, we fit the GLM-HMM to behavioral data from two binary perceptual decision-making tasks (see Methods 4.1). First, we fit the GLM-HMM to choice data from 37 mice performing a visual detection decision-making task developed in [10] and adopted by the International Brain Laboratory (IBL) [35]. During the task, a sinusoidal grating with contrast between 0 and 100% appeared either on the left or right side of the screen (Fig. 2a). The mouse had to indicate this side by turning a wheel. If the mouse turned the wheel in the correct direction, it received a water reward; if incorrect, it received a noise burst and an additional 1-second timeout. We analyzed choice data from animals with at least 3000 trials of data (across multiple sessions) after they had successfully learned the task (see Methods 4.4).

We modeled the animals’ decision-making strategies using a GLM-HMM with four inputs: (1) the (signed) stimulus contrast, where positive values indicate a right-side grating and negative values indicate a left-side grating; (2) a constant offset or bias; (3) the animal’s choice on the previous trial; and (4) the stimulus side on the previous trial. A large weight on the animal’s previous choice gives rise to a strategy known as “perseveration”, in which the animal makes the same choice many times in a row, regardless of whether it receives a reward. A large weight on the previous stimulus side, which we refer to as the “win-stay, lose-switch” regressor, gives rise to the well-known strategy in which the animal repeats a choice if it was rewarded, and switches choices if it was not. Note that for the IBL task in question, bias and trial-history dependencies were sub-optimal, meaning that the maximal-reward strategy was to have a large weight on the stimulus and zero weights on the other three inputs.
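A design matrix with these four inputs might be assembled as follows (a sketch under our own coding conventions, not the authors’ preprocessing code; in a 2AFC task where reward is given only for correct choices, the previous stimulus side equals the previous choice signed by whether it was rewarded):

```python
import numpy as np

def build_design_matrix(signed_contrast, choices, rewards):
    """One row per trial: signed stimulus, bias, previous choice, and a
    win-stay/lose-switch regressor (previous choice signed by previous
    reward, i.e. the previous stimulus side).
    choices are coded -1 (left) / +1 (right); rewards are 0 / 1."""
    n = len(signed_contrast)
    prev_choice = np.r_[0, choices[:-1]]           # 0 on the first trial
    prev_reward = np.r_[0, 2 * rewards[:-1] - 1]   # map {0,1} -> {-1,+1}
    win_stay = prev_choice * prev_reward           # previous rewarded side
    return np.column_stack([signed_contrast,
                            np.ones(n),            # constant bias column
                            prev_choice,
                            win_stay])

contrast = np.array([0.5, -1.0, 0.25, 0.0])
choices  = np.array([1, -1, 1, 1])
rewards  = np.array([1,  1, 0, 1])
X = build_design_matrix(contrast, choices, rewards)
```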

To determine the number of different strategies underlying decision-making behavior, we fit GLM-HMMs with varying numbers of latent states. Note that the 1-state model is simply a standard Bernoulli GLM, while the classic lapse model (Eq. 1, Eq. 2) is a constrained version of the 2-state model. We found that a 3-state GLM-HMM substantially outperformed models with fewer states, including the classic lapse model. The states of the fitted model were readily interpretable, and tended to persist for many trials in a row.

Figures 2 and 3 show results for an example mouse. For this animal, the multi-state GLM-HMM outperformed both the standard (1-state) GLM and the classic lapse model, both in test log-likelihood and percent correct, with the improvement approximately leveling off at 3 latent states (Fig. 2b-c). Note that the test set for this mouse contained 900 trials. Log-likelihood increases of 0.13 bits/trial and 0.09 bits/trial for the 3-state model over the 1-state model and classic lapse models, respectively, meant that the data were (2^{0.13})^{900} ≈ 1.7 × 10^{35} and (2^{0.09})^{900} ≈ 2.4 × 10^{24} times more likely under the 3-state model.
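The conversion from a log-likelihood gain in bits/trial to a likelihood ratio over the test set is a one-liner:

```python
# A log-likelihood gain of b bits/trial over n test trials means the data
# are 2**(b * n) times more likely under the better model.
n_trials = 900
ratio_vs_glm   = 2.0 ** (0.13 * n_trials)   # vs. the 1-state GLM, ~1.7e35
ratio_vs_lapse = 2.0 ** (0.09 * n_trials)   # vs. the classic lapse model, ~2.4e24
```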

The transition matrix for the fitted 3-state model describes the transition probabilities between three different states, each of which corresponds to a different decision-making strategy (Fig. 2d). Large entries along the diagonal of this matrix, ranging between 0.94 and 0.98, indicate a high probability of remaining in the same state for multiple trials. The other set of inferred parameters were the GLM weights, which define how the animal makes decisions in each state (Fig. 2e). One of these GLMs (“state 1”) had a large weight on the stimulus and negligible weights on other inputs, giving rise to high-accuracy performance on the task (Fig. 2f). The other two GLMs (“state 2” and “state 3”), by comparison, had smaller weights on the stimulus, and relatively large bias weights.

We can visualize the decision-making strategies associated with these states by plotting the corresponding psychometric curves (Fig. 2g), which show the probability of a rightward choice as a function of the stimulus, conditioned on both previous choice and reward. The steep curve observed in state 1, which exhibited near-perfect performance on high-contrast stimuli and little dependence on previous choice or reward, led us to call it the ‘engaged’ state. By comparison, the psychometric curves for states 2 and 3 reflected large leftward and rightward biases, respectively. They also had relatively large dependence on previous choice and reward, as indicated by the gap between solid and dashed lines. While this mouse had an overall accuracy of 80%, it achieved 90% accuracy in the engaged state, compared to only 60% and 58% accuracy in the two biased states (Fig. 2f).

To gain insight into the temporal fluctuations in decision-making behavior, we used the fitted 3-state model to compute the posterior probability over the mouse’s latent state across trials (Fig. 3). These probabilities, formally given by *p*(*z_{t}* = *k* | *y*_{1:T}, **x**_{1:T}), reflect the experimenter’s ability to infer the animal’s state on any given trial from the entire sequence of choices and inputs during a session. Fig. 3a shows state probabilities for three example sessions (note that we examined only the first 90 trials of each session, when the stimulus statistics were stationary; see Methods Sec. 4.4 for details). Contrary to the predictions of the classic lapse model, strategies persisted for many trials at a time. Remarkably, the most probable state often had probability close to 1, indicating that we can be highly confident about the mouse’s internal state given the observed data.
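The posterior state probabilities can be computed with the standard scaled forward-backward recursions; below is a compact sketch for a Bernoulli-GLM HMM with known parameters (a simplified 2-state example with assumed parameter values, not the fitted 3-state model):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def posterior_states(X, y, pi, A, W):
    """Posterior p(z_t = k | all choices and inputs) for a Bernoulli-GLM
    HMM, computed with the scaled forward-backward algorithm."""
    T, K = len(y), len(pi)
    p_right = sigmoid(X @ W.T)                            # (T, K)
    lik = np.where(y[:, None] == 1, p_right, 1.0 - p_right)

    alpha = np.empty((T, K))
    beta = np.empty((T, K))
    c = np.empty(T)                                       # per-trial scaling
    alpha[0] = pi * lik[0]
    c[0] = alpha[0].sum()
    alpha[0] /= c[0]
    for t in range(1, T):                                 # forward pass
        alpha[t] = (alpha[t - 1] @ A) * lik[t]
        c[t] = alpha[t].sum()
        alpha[t] /= c[t]
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):                        # backward pass
        beta[t] = A @ (lik[t + 1] * beta[t + 1]) / c[t + 1]
    post = alpha * beta
    return post / post.sum(axis=1, keepdims=True)

# Illustrative 2-state example: choices generated from the "engaged" GLM
rng = np.random.default_rng(1)
pi = np.array([0.5, 0.5])
A = np.array([[0.95, 0.05],
              [0.05, 0.95]])
W = np.array([[5.0, 0.0],      # engaged: stimulus-driven
              [0.0, 2.0]])     # biased: ignores the stimulus
X = np.column_stack([rng.uniform(-1, 1, 200), np.ones(200)])
y = (rng.random(200) < sigmoid(X @ W[0])).astype(int)
post = posterior_states(X, y, pi, A, W)
```

Because the choices here are genuinely stimulus-driven, the posterior places most of its mass on the engaged state on most trials.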

If we assign each trial to its most probable state, this mouse spent approximately 69% of all trials (out of 5040 total trials over 56 sessions) in the engaged state, compared to 15% and 16% of trials in the biased left and rightward states (Fig. 3c). Moreover, the mouse changed state at least once within a session in roughly 71% of all 90-trial sessions, and changed multiple times in 59% of sessions (Fig. 3d). This rules out the possibility that the states merely reflect the use of different strategies on different days. Rather, the mouse tended to remain in an engaged, high-performance state for tens of trials at a time, with lapses arising predominantly during interludes when it adopted a left-biased or right-biased strategy for multiple trials in a row. The multi-state GLM-HMM thus provides a very different portrait of mouse decision-making behavior than the basic GLM or lapse model.

### 2.4 State-based strategies are consistent across mice

To assess the generality of these findings, we fit the GLM-HMM to the choice data from 37 mice in the IBL dataset [35] (Fig. 4). The results for the example animal considered above were broadly consistent across animals. Specifically, we found that the 3-state GLM-HMM strongly outperformed the basic GLM and classic lapse model in cross-validation for all 37 mice (Fig. 4a and Fig. S5). On average, it predicted mouse choices with 4.2% higher accuracy than the basic GLM (which had an average prediction accuracy of 78%), and 2.8% higher accuracy than the classic lapse model (Fig. 4b). Furthermore, for one animal the improvement in prediction accuracy for the 3-state GLM-HMM was as high as 12% relative to the basic GLM, and 7% relative to the classic lapse model.

Although performance continued to improve slightly with four and even five latent states, we will focus our analyses on the 3-state model for reasons of simplicity and interpretability. Supplementary figure S8 provides a full analysis of the 4-state model, showing that it tended to divide the engaged state from the 3-state model into two “sub-states” that differed slightly in accuracy.

Fits of the 3-state GLM-HMM exhibited remarkable consistency across mice, with a majority exhibiting states that could be classified as “engaged”, “biased-left”, and “biased-right” (Fig. 4d). (See Methods section 4.1.5 for details about the alignment of states across mice). While we plot inferred transition matrices for all 37 mice in supplementary figure S6, here we used the diagonal elements of each matrix to compute an expected dwell time for each animal in each state (Fig. 4e). This revealed a median dwell time, across animals, of 24 trials for the engaged state, versus 13 and 12 trials for the biased-left and biased-right states, respectively. Thus, mice tended to remain engaged for longer periods than they remained in either biased state, though the average duration of biased states still departed dramatically from the assumptions of the classic lapse model. For context, for the classic lapse model and a lapse rate of 20%, the expected dwell time in the lapse state is just 1/(1 – 0.2) = 1.25 trials. We analyzed the distribution of state dwell times inferred from data and found they were well approximated by a geometric distribution, matching the theoretical distribution of data sampled from a Hidden Markov Model (Fig. S9 and Fig. S10).
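Since HMM dwell times are geometrically distributed, the expected dwell time in state *k* is 1/(1 – *A_{kk}*), where *A_{kk}* is the state’s self-transition probability. A quick numerical check (with illustrative self-transition probabilities):

```python
# Mean of a geometric dwell-time distribution with stay probability p_stay
def expected_dwell(p_stay):
    return 1.0 / (1.0 - p_stay)

# A sticky state with self-transition probability 0.96 persists ~25 trials
assert abs(expected_dwell(0.96) - 25.0) < 1e-9
# Classic lapse model with a 20% lapse rate: identical transition-matrix
# rows put probability 0.2 on the lapse state every trial, so the expected
# dwell time in the lapse state is only 1 / 0.8 = 1.25 trials
assert abs(expected_dwell(0.2) - 1.25) < 1e-9
```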

Finally, we examined the fraction of trials per session that mice spent in each of the three states (Fig. 4c). To do so, we used the fitted model parameters to compute the posterior probabilities over state, and assigned each trial to its most likely state. The resulting “fractional occupancies” revealed that the median mouse spent 69% of its time in the engaged state, with the best mice exceeding 90% engagement. Moreover, the majority of sessions (83% of 2017 sessions across all animals) involved a switch between two or more states; in only 17% of sessions did a mouse remain in the same state for an entire 90-trial session.

### 2.5 The GLM-HMM captures the statistics of real data

In order to understand why the GLM-HMM outperformed the classic lapse model, we compared the statistics of real data with data simulated from the two models. The GLM-HMM incorporates long temporal dependencies in strategy, which may induce particular temporal patterns in the observed choice data. Hence, a natural statistic to examine is choice run-length, where a run refers to a sequence of consecutive trials in which the mouse makes the same repeated choice (Fig. 5a). Figure 5b shows the distribution of choice run-lengths observed in the 181,530 trials of real mouse data (red), as well as the choice run-lengths that would arise if the mice performed the task perfectly (grey). Under perfect performance, only 2% of trials would occur in runs of length 10 or greater, whereas 19% of trials occur in such runs in the real data. A good model of mouse decision-making behavior should be able to capture the heavy tail of this run-length distribution.
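Choice run-lengths can be computed from a binary choice sequence as follows (a sketch with toy data, not the published analysis code):

```python
import numpy as np

def run_lengths(choices):
    """Lengths of maximal runs of identical consecutive choices."""
    choices = np.asarray(choices)
    # indices where the choice changes, bracketed by the sequence ends
    change = np.flatnonzero(choices[1:] != choices[:-1]) + 1
    edges = np.r_[0, change, len(choices)]
    return np.diff(edges)

lengths = run_lengths([1, 1, 1, 0, 1, 1, 0, 0, 0, 0])   # runs: 3, 1, 2, 4
# fraction of trials occurring in runs of length >= 3
frac_in_long_runs = lengths[lengths >= 3].sum() / lengths.sum()
```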

We examined the distribution of simulated run-lengths generated by three different fitted models: (1) the classic lapse model with only a bias and stimulus weight; (2) the classic lapse model with the usual 4 inputs (stimulus, bias, previous choice and win-stay-lose-switch); (3) the full 3-state GLM-HMM. The stimulus-only lapse model had the largest discrepancy with real data, with a pronounced under-prediction of long runs (Fig. 5c). The full lapse model performed better, confirming that the trial-history regressors were instrumental in producing longer run-lengths. However, the full 3-state GLM-HMM performed far better than either lapse model, generating a run-length distribution that was closest to the real data.

We also examined a related statistic, given by the number of runs lasting more than 5 trials (Fig. 5d). The real data contained 6,111 runs longer than 5 trials. This was consistent with datasets simulated from the fitted GLM-HMM, but far greater than the number produced by either of the two lapse models. Thus, compared to the classic lapse model, the GLM-HMM was far better able to account for the temporal distribution of mouse choices, in particular the tendency to produce long sequences of repeated choices.

### 2.6 Data provide evidence for discrete, not continuous, states

The GLM-HMM describes perceptual decision-making in terms of discrete states that persist for many trials in a row before switching. However, the model’s dramatic improvement over classic models does not guarantee that the states underlying decision-making are best described as discrete. One could imagine, for example, that a continuous state governing the animal’s degree of engagement drifts gradually over time, and that the GLM-HMM simply divides these continuous changes into discrete clusters. To address this possibility, we fit the data with PsyTrack, a psychophysical model with continuous latent states [55, 56]. The PsyTrack model describes sensory decision-making using an identical Bernoulli GLM, but with dynamic weights that drift according to a Gaussian random walk (see Methods sec. 4.3). Although PsyTrack has previously been used to characterize slow changes that arise over the course of learning, here we used it to assess whether the decision-making in well-trained mice is better described using weights that drift continuously or that switch between three discrete states. For all 37 mice in our dataset, the 3-state GLM-HMM achieved substantially higher test log-likelihood than the PsyTrack model (Fig. 6a). Model selection also correctly identified simulated data from the GLM-HMM, whereas datasets simulated from a matched first-order autoregressive model had roughly equal log-likelihood under the two models (Fig. 6b-c).

We also used the GLM-HMM to look for evidence of discrete state changes in the raw behavioral time series data itself. Fig. S11 shows the resulting plots, revealing that an identified state change from engaged to one of the two biased states corresponded to a drop of approximately 20% in accuracy, with accuracy approximately constant during the 5 trials before and after the switch. This suggests that, at least for the state changes inferred using the fitted model, performance did not decline gradually over the 10 trials spanning the state change, but instead appeared consistent with a discrete drop in accuracy. We performed a similar analysis for datasets simulated from a GLM-HMM and from a continuous autoregressive model (see Section 4.3). Both models exhibited step-like changes in accuracy after an inferred state change. However, only the GLM-HMM produced the large jump in accuracy found in real data.

### 2.7 Mice switch between multiple strategies in a second task

To ensure that our findings were not specific to the IBL task or training protocol, we examined a second mouse dataset with a different sensory decision-making task. Odoemene et al. [48] trained mice to report whether the flash rate of an LED was above or below 12 Hz by making a right or left nose poke (Fig. 7a). Once again, we found that the multi-state GLM-HMM provided a far more accurate description of mouse decision-making than a basic GLM or the classic lapse model (Fig. 7b). Although the performance of the 3-state and 4-state models was similar, we focused on the 4-state model because, in addition to having slightly higher test log-likelihood for a majority of animals, it balanced simplicity and interpretability, with each state corresponding to a distinct behavioral strategy. (See supplementary figures S17 and S18 for a comparison to 3-state and 5-state fits). The 4-state model exhibited an average improvement of 0.025 bits/trial over the classic lapse model, making the test dataset approximately 1 × 10^{18} times more likely under the GLM-HMM than under the lapse model.

Fig. 7d shows the inferred GLM weights associated with each state in the 4-state GLM-HMM, while Fig. 7e shows the associated psychometric curves, conditioned on previous choice and previous reward. Based on these curves, we labeled the four states as (from left to right): “engaged”, “biased left”, “biased right” and “win-stay”. The combination of stimulus and choice history weights for this fourth state gave rise to a large separation between psychometric curves conditioned on previous reward; the resulting strategy could be described as “win-stay” because the animal tended to repeat a choice if it was rewarded. (However, it did *not* tend to switch if a choice was unrewarded). Accuracy was highest in the engaged state (92%), lowest in the biased-left (67%) and biased-right (77%) states, and took an intermediate value in the win-stay state (83%).

Similar to the IBL dataset, the identified states tended to persist for many trials in a row before switching. As before, we used the diagonal entries of the inferred transition matrices to compute the expected dwell time for each animal in each state, producing median expected dwell times of 63, 34, 49 and 13 trials for the engaged, biased-left, biased-right and win-stay states, respectively (Fig. 7f). These dwell times were nevertheless much shorter than the length of a session, which lasted 650 trials on average in this experiment. This indicates that the mice in the Odoemene et al. [48] task, like the IBL mice, typically switched strategies multiple times per session.

Finally, we used the fitted model to examine the temporal evolution of latent states within a session. Figure 7g shows the average posterior state probabilities over the first 200 trials in a session for two example mice and the average over all mice (Fig. S19 shows average posterior state probabilities for each individual mouse in the cohort separately). These trajectories reveal that mice typically began a session in one of the two biased states, and had a low probability of entering the engaged state within the first 50 trials: mice used these initial trials of a session to “warm-up” and gradually improve their performance [34]. This represents a substantial departure from the IBL mice, the majority of which had a high probability of being engaged from the very first trial, and had relatively flat average trajectories over the first 90 trials of a session (Fig. 3 and Fig. S7). We also examined whether the effects of fatigue or satiety could be observed in the average state probabilities at the end of sessions, but did not find consistent patterns across animals (Fig. S20).

### 2.8 External correlates of engaged and disengaged states

One powerful feature of the GLM-HMM is the fact that it can be used to identify internal states from binary decision-making data alone. However, it is natural to ask whether these states manifest themselves in other observable aspects of mouse behavior. In other words, does a mouse behave differently when it is in the engaged state than in a biased state, above and beyond its increased probability of making a correct choice? To address this question, we examined response times and violation rates, two observable features that we did not incorporate into the models.

Previous literature has revealed that humans and monkeys commit more errors on long-duration trials than on short-duration trials [32, 53, 61]. We thus compared the distributions of response times in the engaged state with those in the disengaged states (the biased-left and biased-right states) for mice performing the IBL task. Within the IBL task, response time is the time from when the stimulus appears on the screen to when the animal receives feedback on the outcome of its decision (when it receives a reward, or hears an auditory cue to indicate an error trial). The median response time across trials and across IBL mice was just 0.34 s, but it was not uncommon for trials to last much longer (up to tens of seconds). We show response time Q-Q plots for each of the 37 IBL mice in Fig. 8a. The engaged and disengaged response time distributions were statistically different for all 37 mice (Kolmogorov-Smirnov tests rejected the null hypothesis with p < 0.05 for all 37 mice). Examining the Q-Q plots, it is clear that the most extreme response times for each animal were typically associated with the lower-accuracy disengaged states.
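A comparison of this kind can be sketched with `scipy.stats.ks_2samp` on two response-time samples (the data below are purely synthetic, standing in for the engaged and disengaged trials):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Synthetic response times (seconds): disengaged trials drawn with a
# heavier right tail than engaged trials
rt_engaged    = rng.lognormal(mean=-1.0, sigma=0.5, size=2000)
rt_disengaged = rng.lognormal(mean=-0.7, sigma=0.9, size=800)

# Two-sample Kolmogorov-Smirnov test comparing the two distributions
stat, pval = stats.ks_2samp(rt_engaged, rt_disengaged)

# Difference in 90th-quantile response times, as in Fig. 8b
q90_diff = np.quantile(rt_disengaged, 0.9) - np.quantile(rt_engaged, 0.9)
```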

Figure 8b shows the difference in the 90th-quantile response times between the disengaged and engaged states for each mouse. For the majority of mice, 90th-quantile response times were longer in the disengaged states than in the engaged state. The median difference in the 90th-quantile response time across all mice was 0.95s (shown in blue), and this was statistically different from 0s (as assessed with 95% bootstrap confidence intervals).

We also examined the difference in violation rates between the disengaged states (states 2, 3 and 4) and the engaged state (state 1) for each animal in the Odoemene et al. [48] dataset, and plot these in Fig. 8c. Violation trials are those in which the mouse did not make a decision within the task-specific response period after the appearance of the stimulus. The mean violation rate across all mice and all trials in this dataset was 21% (much higher than in the IBL dataset, where the violation rate was less than 1%), and we found that, across all mice, the violation rate was 3.2% higher in the disengaged states than in the engaged state (shown in blue). Given that no information about response times or violation rates was used to train the GLM-HMM, these analyses provide a useful external validation of our interpretation of the GLM-HMM’s latent states as corresponding to ‘engaged’ and ‘disengaged’ behavior.

## 3 Discussion

In this work, we used the GLM-HMM framework to identify hidden states from perceptual decision-making data. Mouse behavior in two different perceptual decision-making tasks [35, 48] was far better described by a GLM-HMM with sustained engaged and biased or disengaged states than by models assuming a single strategy. Unlike the lapses of the classic lapse model, these states alternated on the timescale of tens to hundreds of trials. Additionally, we found that, for most of the IBL mice, the slowest response times were associated with the disengaged states, and that mice performing the Odoemene et al. [48] task had higher violation rates in the less engaged states. These behavioral correlates of the GLM-HMM’s latent states provide independent validation of our interpretation of the retrieved states as corresponding to ‘engaged’ and ‘disengaged’ behavior.

While we found similarities in the strategies pursued by mice performing the different visual detection tasks, we also found some differences. In particular, we found that mice performing the Odoemene et al. [48] task often “warmed up” to the task [34], only reaching their high-accuracy engaged state after one hundred or so trials. In comparison, mice performing the IBL task often started the session in their engaged states. This ability to infer the state or strategy employed by an animal at different times during a session will be useful for characterizing differences in performance across sessions and across animals. It will also provide a powerful tool for neuroscientists studying the neural mechanisms that support decision-making, as different strategies may well rely on different circuits or different patterns of neural activity [18, 19, 21, 33, 62, 69].

Although we found evidence of warm-up behavior at the start of sessions in the Odoemene et al. [48] mice, we were somewhat surprised to find few signatures of satiation or fatigue toward the end of a session. This might be due to the fact that sessions were typically of fixed duration, and may have ended before mice had a chance to grow satiated or fatigued. One future direction will be to apply the GLM-HMM to experiments with longer sessions, where it may be useful for detecting changes in behavior reflecting satiety or fatigue.

Another exciting future direction will be to explore how the GLM-HMM framework contributes to the wider discussion on the observed differences in lapse rates across species [60, 61]. While the lapse rates of rodents performing perceptual decision-making tasks are often as high as 10-20% [35, 48, 49], lapse rates for non-human primates and humans performing these tasks are often much lower [60, 61]. Could these differences be due to different species using different strategies during these tasks? We look forward to applying the model to data from rats, non-human primates, and humans to explore this problem further.

In future work, we will aim to make explicit the connection between the ‘engaged’ and ‘disengaged’ strategies identified by the model and existing measures of arousal and engagement in the literature. Identifying the relationship between the GLM-HMM’s hidden states and pupil diameter, low-frequency LFP oscillations, spontaneous neuronal firing, noise correlations and the action of neuromodulators [14, 31, 54, 64] will be a priority.

The finding that discrete states underpin mouse choice behavior may also call for new normative models to explain why mice develop these states in the first place [1]. The existence of disengaged and engaged states could reflect explore-exploit behavior [15, 16, 50] or optimal learning (e.g., [4, 23, 26, 47]), or could simply indicate incomplete learning of the task. Another promising direction will be to replace the model’s fixed transition matrix with a generalized linear model allowing external covariates to modulate the probability of state changes [12]. This would allow us to identify the factors that influence state changes (e.g., a preponderance of unrewarded choices) and, potentially, to seek to control such transitions.

While there are many avenues for future research, we believe that these results call for a significant rethinking of rodent perceptual decision-making behavior and the methods for analyzing it. Indeed, standard analysis methods do not take account of the possibility that an animal makes abrupt changes in decision-making strategy multiple times per session. We feel that the ability to infer internal states from choice behavior will open up new directions for data analysis and provide new insights into a previously inaccessible dimension of perceptual decision-making behavior.

## 4 Methods

### 4.1 Inference of GLM-HMM parameters

#### 4.1.1 GLM-HMM objective function

We fit the parameters of the GLM-HMM, Θ = {*W*, *A*, **π**}, to choice data using Maximum A Posteriori (MAP) estimation via the Expectation Maximization (EM) algorithm [17]. This algorithm iteratively maximizes the log-posterior of the parameters given the data, given (up to an unknown constant) by:

$$ \log p(\Theta \mid \mathcal{D}) = \log p(\Theta) + \sum_{s=1}^{S} \log \sum_{\mathbf{z}_s} p(\mathbf{y}_s, \mathbf{z}_s \mid \mathbf{x}_s, \Theta) \tag{5} $$

where *s* indexes the session in which the data was collected (out of *S* sessions), and the sum over **z**_{s} is over all possible state allocations for the trials in that session. The prior distribution over the model parameters, *p*(Θ), was given by:

$$ p(\Theta) = p(W)\, p(A)\, p(\boldsymbol{\pi}) = \prod_{j=1}^{K} \mathcal{N}(\mathbf{w}_j;\, 0,\, \sigma^2 I_M) \;\prod_{i=1}^{K} \mathrm{Dir}(A_i;\, \alpha) \;\mathrm{Dir}(\boldsymbol{\pi};\, 1) $$

Here, *W* ≡ [**w**_{1} … **w**_{K}] represents the matrix formed by concatenating the vectors of per-state GLM weights, and **w**_{j} ∈ ℝ^{M} denotes the GLM weights for state *j*, where *M* = 4 was the number of inputs or covariates (including the bias). For each weight vector, we used an independent zero-mean Gaussian prior with variance *σ*^{2}. For the (*K* × *K*) transition matrix *A*, we placed an independent Dirichlet prior over each row *A*_{i}. The Dirichlet is controlled by a shape parameter *α*, giving:

$$ p(A_i) = \mathrm{Dir}(\alpha, \ldots, \alpha) \propto \prod_{j=1}^{K} A_{ij}^{\alpha - 1} $$

We also placed a Dirichlet prior over the initial state distribution **π** with *α* = 1, which corresponds to a flat prior. To select the prior hyperparameters for the transition matrix and the weights for a given dataset, we performed a grid search over *σ* ∈ {0.5, 0.75, 1, 2, 3} and *α* ∈ {1, 2} and selected the set of hyperparameters that resulted in the best performance on a held-out validation set. For IBL mice, the prior hyperparameters selected were *σ* = 2 and *α* = 2, while for mice in the Odoemene et al. [48] dataset, the best hyperparameters were *σ* = 0.75 and *α* = 2.

#### 4.1.2 Expectation Maximization Algorithm

We used the Expectation Maximization (EM) algorithm [17, 45] to maximize the objective function given in (Eq. 5) with respect to the GLM-HMM parameters, Θ. The E-step of the EM algorithm involves forming the Expected Complete log-likelihood (ECLL) using the forward-backward algorithm [3]. The ECLL is a lower bound on the log-posterior [7, 17, 45], which is then maximized with respect to Θ during the M-step. Concretely, the Expected Complete log-likelihood is:

$$ \mathrm{ECLL} = \sum_{s=1}^{S} \mathbb{E}_{\mathbf{z}_s \mid \mathbf{y}_s, \Theta^{\text{old}}}\!\left[ \log p(\mathbf{y}_s, \mathbf{z}_s \mid \mathbf{x}_s, \Theta) \right] = \sum_{s=1}^{S} \left[ \sum_{k=1}^{K} \gamma_{s,1,k} \log \pi_k + \sum_{t=2}^{T_s} \sum_{j=1}^{K} \sum_{k=1}^{K} \xi_{s,t-1,j,k} \log A_{jk} + \sum_{t=1}^{T_s} \sum_{k=1}^{K} \gamma_{s,t,k} \log p(y_{s,t} \mid z_{s,t} = k, \mathbf{x}_{s,t}, \mathbf{w}_k) \right] \tag{7} $$

where we substituted the definition of the joint distribution, $p(\mathbf{y}_s, \mathbf{z}_s \mid \mathbf{x}_s, \Theta) = p(z_{s,1}) \prod_{t=2}^{T_s} p(z_{s,t} \mid z_{s,t-1}) \prod_{t=1}^{T_s} p(y_{s,t} \mid z_{s,t}, \mathbf{x}_{s,t})$, in order to get to the second expression (for MAP estimation, the log-prior log *p*(Θ) of Eq. 5 is added to the ECLL before maximization). Here, *γ*_{s,t,k} and *ξ*_{s,t,j,k} are the quantities estimated through the forward-backward algorithm, as we will describe below; *p*(*y*_{s,t}|*z*_{s,t} = *k*, **x**_{s,t}, **w**_{k}) is obtained from Eq. 4.

#### 4.1.3 Expectation Step

During the E-step of the EM algorithm, the quantities {*γ*_{s,t,k}} and {*ξ*_{s,t,j,k}} were computed using the forward-backward algorithm [3] at the current setting of the GLM-HMM parameters, Θ^{old}. In particular, the forward-backward algorithm involves calculating the forward probabilities, *a*_{s,t,k}, and backward probabilities, *b*_{s,t,k}, so as to be able to form *γ*_{s,t,k} and *ξ*_{s,t,j,k} as follows:

$$ \gamma_{s,t,k} = \frac{a_{s,t,k}\, b_{s,t,k}}{\sum_{k'=1}^{K} a_{s,t,k'}\, b_{s,t,k'}} \tag{9} $$

Similarly,

$$ \xi_{s,t,j,k} = \frac{a_{s,t,j}\, A_{jk}\; p(y_{s,t+1} \mid z_{s,t+1} = k, \mathbf{x}_{s,t+1}, \mathbf{w}_k)\; b_{s,t+1,k}}{\sum_{j'=1}^{K} \sum_{k'=1}^{K} a_{s,t,j'}\, A_{j'k'}\; p(y_{s,t+1} \mid z_{s,t+1} = k', \mathbf{x}_{s,t+1}, \mathbf{w}_{k'})\; b_{s,t+1,k'}} \tag{10} $$

where, once again, *p*(*y*_{s,t+1}|*z*_{s,t+1} = *k*, **x**_{s,t+1}, **w**_{k}) comes from (Eq. 4).

The forward-backward algorithm is a recursive algorithm and calculates the forward probabilities, {*a*_{s,t,k}}, as follows:

$$ a_{s,1,k} = \pi_k\; p(y_{s,1} \mid z_{s,1} = k, \mathbf{x}_{s,1}, \mathbf{w}_k) \tag{11} $$

and, for 1 < *t* ≤ *T*:

$$ a_{s,t,k} = p(y_{s,t} \mid z_{s,t} = k, \mathbf{x}_{s,t}, \mathbf{w}_k) \sum_{j=1}^{K} a_{s,t-1,j}\, A_{jk} \tag{12} $$

Similarly, the backward probabilities are calculated recursively as follows:

$$ b_{s,T,k} = 1 \tag{13} $$

and, for *t* ∈ {*T* – 1, …, 1}:

$$ b_{s,t,k} = \sum_{j=1}^{K} A_{kj}\; p(y_{s,t+1} \mid z_{s,t+1} = j, \mathbf{x}_{s,t+1}, \mathbf{w}_j)\; b_{s,t+1,j} \tag{14} $$

The forward-backward algorithm enables evaluation of the sum over all possible state allocations in Eq. 5 and Eq. 7 in linear time in the number of trials, rather than in exponential time.
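The forward-backward recursions above can be sketched in a few lines of NumPy. This is an illustrative implementation, not the paper's code; it adds per-trial rescaling (a standard numerical-stability trick not discussed in the text) and assumes the per-trial Bernoulli-GLM emission likelihoods have already been evaluated into an array `lik`.

```python
import numpy as np

def forward_backward(pi, A, lik):
    """pi: (K,) initial state distribution; A: (K, K) transition matrix;
    lik: (T, K) with lik[t, k] = p(y_t | z_t = k, x_t, w_k).
    Returns gamma (T, K), xi (T-1, K, K) and the session log-likelihood."""
    T, K = lik.shape
    a = np.zeros((T, K))      # scaled forward probabilities
    b = np.zeros((T, K))      # scaled backward probabilities
    c = np.zeros(T)           # per-trial normalizers (numerical stability)

    a[0] = pi * lik[0]
    c[0] = a[0].sum()
    a[0] /= c[0]
    for t in range(1, T):
        a[t] = lik[t] * (a[t - 1] @ A)
        c[t] = a[t].sum()
        a[t] /= c[t]

    b[-1] = 1.0
    for t in range(T - 2, -1, -1):
        b[t] = (A @ (lik[t + 1] * b[t + 1])) / c[t + 1]

    gamma = a * b                                     # posterior state probabilities
    xi = (a[:-1, :, None] * A[None] * (lik[1:] * b[1:])[:, None, :]
          / c[1:, None, None])                        # joint posteriors over (z_t, z_{t+1})
    return gamma, xi, np.log(c).sum()
```

With this scaling, the sum of the log normalizers also yields the session log-likelihood needed for held-out evaluation.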

#### 4.1.4 Maximization Step

After running the forward-backward algorithm, we formed the ECLL as in Eq. 7 and then maximized it with respect to the GLM-HMM parameters, Θ. For the initial state distribution **π** and the transition matrix *A*, this resulted in the closed-form updates:

$$ \pi_k = \frac{\sum_{s=1}^{S} \gamma_{s,1,k}}{\sum_{k'=1}^{K} \sum_{s=1}^{S} \gamma_{s,1,k'}}, \qquad A_{jk} = \frac{(\alpha - 1) + \sum_{s=1}^{S} \sum_{t=1}^{T_s - 1} \xi_{s,t,j,k}}{K(\alpha - 1) + \sum_{k'=1}^{K} \sum_{s=1}^{S} \sum_{t=1}^{T_s - 1} \xi_{s,t,j,k'}} $$

where the (*α* – 1) terms arise from the Dirichlet prior on the rows of *A* (and vanish for the flat prior on **π**).

For the GLM weights, there was no such closed form update, but a Bernoulli GLM falls into the class of functions mapping external inputs to HMM emission probabilities considered in [22], so we know that the ECLL is concave in the GLM weights. As such, we were able to numerically find the GLM weights that maximize (not just locally but globally) the ECLL using the BFGS algorithm [9, 24, 28, 58] as implemented by the scipy optimize function in python [65].
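As a hedged sketch of this numerical M-step (names and details here are illustrative, not the paper's implementation), the per-state weight update maximizes the *γ*-weighted Bernoulli log-likelihood plus the Gaussian log-prior using `scipy.optimize.minimize` with BFGS; because the penalized ECLL term is concave in the weights, its negative is convex and BFGS reaches the global optimum:

```python
import numpy as np
from scipy.optimize import minimize

def update_state_weights(X, y, gamma_k, sigma=2.0):
    """X: (T, M) design matrix; y: (T,) choices in {0, 1};
    gamma_k: (T,) posterior probability of state k on each trial;
    sigma: s.d. of the zero-mean Gaussian prior on the weights."""
    def neg_obj(w):
        p = 1.0 / (1.0 + np.exp(-X @ w))           # P(y_t = 1 | x_t, w)
        p = np.clip(p, 1e-10, 1 - 1e-10)           # guard the logs
        ll = gamma_k @ (y * np.log(p) + (1 - y) * np.log(1 - p))
        return -(ll - 0.5 * (w @ w) / sigma**2)    # negative penalized ECLL term
    return minimize(neg_obj, np.zeros(X.shape[1]), method="BFGS").x
```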

#### 4.1.5 Comparing states across animals and GLM-HMM parameter initialization

In Fig. 4 and Fig. 7, we show the results from fitting a single GLM-HMM to each animal; however, it is nontrivial to map the retrieved states across animals to one another. As such, we employed a multistage fitting procedure that allowed us to make this comparison, and we detail this procedure in Algorithm 1. In the first stage, we concatenated the data for all animals into a single dataset (for example, in the case of the IBL dataset, this was the data for all 37 animals). We then fit a GLM (a 1-state GLM-HMM) to the concatenated data using Maximum Likelihood estimation. We used the fit GLM weights to initialize the GLM weights of a *K*-state GLM-HMM that we again fit to the concatenated dataset from all animals (to obtain a “global fit”). We added Gaussian noise with *σ*_{init} = 0.2 to the GLM weights, so that the initialized states were distinct, and we initialized the transition matrix of the *K*-state GLM-HMM with large diagonal entries and small off-diagonal entries, which we then normalized so that the rows of the transition matrix summed to 1 and represented probabilities. While the EM algorithm is guaranteed to converge to a local optimum in the log-probability landscape of Eq. 5, there is no guarantee that it will converge to the global optimum [57]. Correspondingly, for each value of *K*, we fit the model 20 times using 20 different initializations.
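The exact initialization constants did not survive into this text, but a sticky transition-matrix initialization of the kind described might look as follows (the constants `diag=0.95` and the noise scale are illustrative assumptions, not values from the paper):

```python
import numpy as np

def init_transitions(K, diag=0.95, seed=0):
    """Initialize a K-state transition matrix that favors self-transitions."""
    rng = np.random.default_rng(seed)
    A = np.full((K, K), (1 - diag) / (K - 1))   # small off-diagonal mass
    np.fill_diagonal(A, diag)                   # sticky self-transitions
    A += rng.uniform(0, 0.01, size=(K, K))      # break symmetry between states
    return A / A.sum(axis=1, keepdims=True)     # rows are probability vectors
```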

In the next stage of the fitting procedure, we wanted to obtain a separate GLM-HMM fit for each animal, so we initialized a model for each animal with the GLM-HMM global fit parameters from all animals together (out of the 20 initializations, we chose the model that resulted in the best training set log-likelihood). We then ran the EM algorithm to convergence; it is these recovered parameters that are shown in Fig. 4 and Fig. 7. By initializing each individual animal’s model with the parameters from the fit to all animals together, it was no longer necessary for us to permute the retrieved states from each animal so as to map semantically similar states to one another.

We note that the initialization scheme detailed above is sufficiently robust so as to allow recovery of GLM-HMM parameters in various parameter regimes of interest. In particular, we simulated datasets from a GLM-HMM with the global fit parameters for both the IBL and Odoemene et al. [48] datasets, as well as a global fit lapse model. We show the results of these recovery analyses in Fig. S3 and Fig. S4.

### 4.2 Assessing Model Performance

#### 4.2.1 Cross Validation

There are two ways to perform cross-validation when working with Hidden Markov Models. Firstly, it is possible to hold out entire sessions of choice data for assessing test-set performance. That is, when fitting the model, the objective function in Eq. 5 and the ECLL in Eq. 7 are modified to include only 80% of sessions (since we use 5-fold cross-validation throughout this work); the log-likelihood of the held-out 20% of sessions is then calculated using the fit parameters and a single run of the forward pass on the held-out sessions:

$$ \mathrm{LL}_{\text{test}} = \sum_{s \in S \setminus S'} \log p(\mathbf{y}_s \mid \mathbf{x}_s, \Theta') = \sum_{s \in S \setminus S'} \log \sum_{k=1}^{K} a_{s,T_s,k} \tag{17} $$

where *S* \ *S*′ is the set of held-out sessions, and Θ′ is the set of GLM-HMM parameters obtained by fitting the model using the trials from *S*′.

The second method of performing cross-validation involves holding out 20% of trials within each session. When fitting the model, the third term in the ECLL is modified so as to exclude these trials, becoming

$$ \sum_{s=1}^{S} \sum_{t \in T'} \sum_{k=1}^{K} \gamma_{s,t,k} \log p(y_{s,t} \mid z_{s,t} = k, \mathbf{x}_{s,t}, \mathbf{w}_k) $$

where *T*′ is the set of trials used to fit the model. Furthermore, the calculation of the posterior state probabilities, *γ*_{s,t,k} and *ξ*_{s,t,j,k}, is also modified so as to exclude the test-set choice data. In particular, *γ*_{s,t,k} is now $p(z_{s,t} = k \mid \{y_{s,t}\}_{t \in T'}, \mathbf{x}_s, \Theta)$ and, similarly, *ξ*_{s,t,j,k} is now $p(z_{s,t} = j, z_{s,t+1} = k \mid \{y_{s,t}\}_{t \in T'}, \mathbf{x}_s, \Theta)$. The method of calculating these modified posterior probabilities is as detailed in Eq. 9 and Eq. 10, but now the calculation of the forward and backward probabilities, *a*_{s,t,k} and *b*_{s,t,k}, in Eq. 11, Eq. 12, Eq. 13 and Eq. 14 is modified so that, on trials identified as test trials, the *p*(*y*_{s,t}|*z*_{s,t} = *k*, **x**_{s,t}, **w**_{k}) term in these equations is replaced with 1.

In Fig. 2, Fig. 4 and Fig. 7, we perform cross-validation by holding out entire sessions. We believed it would be harder to make good predictions on entire held-out sessions than on single trials within a session, as we expected mice to exhibit more variability in behavior across sessions than within sessions. When we compare the performance of the GLM-HMM against the PsyTrack model of [55] in Fig. 6, we use the second method of cross-validation so as to use the same train and test sets as PsyTrack (PsyTrack cannot make predictions on entire held-out sessions).

#### 4.2.2 Normalized Log-likelihood

In Fig. 2, Fig. 4 and Fig. 7 we report the normalized log-likelihood of different models on held-out sessions. This is calculated as follows:

$$ \mathrm{NLL} = \frac{\mathrm{LL}_{\text{test}} - \mathrm{LL}_0}{n_{\text{test}} \log 2} $$

where, for the GLM-HMM, LL_{test} is the test-set log-likelihood as calculated in Eq. 17, and LL_{0} is the log-likelihood of the same test set under a Bernoulli model of animal choice behavior. Specifically, this baseline model assumes that the animal flips a coin on each trial to decide whether to go Right, with the probability of going Right equal to the fraction of trials in the training set on which the animal chose Right. *n*_{test} is the number of trials in the test set, and is important to include since LL_{test} scales with the size of the test set. Dividing by log 2 gives the normalized log-likelihood units of bits per trial. Clearly, larger values of the normalized log-likelihood are better, with a value of 0 indicating that a model offers no improvement in prediction over the crude baseline model described above. However, even small values of normalized log-likelihood can indicate a large improvement in predictive power. For a test set of size *n*_{test} = 500, a normalized log-likelihood of 0.01 bits per trial indicates that the test data is roughly 32 times (2^{0.01 × 500}) more likely to have been generated by the GLM-HMM than by the baseline model. For a test set of *n*_{test} = 5000 and the same value of NLL, the test set becomes approximately 1 × 10^{15} times more likely under the GLM-HMM than under the baseline model!
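As a concrete numerical illustration (a sketch, not the paper's code), the normalized log-likelihood, (LL_test − LL_0)/(n_test · log 2), and the likelihood ratios it implies can be computed as:

```python
import numpy as np

def normalized_ll(ll_test, ll_baseline, n_test):
    """Normalized test-set log-likelihood, in bits per trial."""
    return (ll_test - ll_baseline) / (n_test * np.log(2))

def likelihood_ratio(nll_bits, n_test):
    """An advantage of nll_bits bits/trial over n_test trials corresponds to
    a likelihood ratio of 2 ** (nll_bits * n_test) in favor of the model."""
    return 2.0 ** (nll_bits * n_test)
```

For example, `likelihood_ratio(0.01, 500)` gives 2^5 = 32, and `likelihood_ratio(0.01, 5000)` gives 2^50 ≈ 1.1 × 10^15, matching the magnitudes quoted above.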

#### 4.2.3 Predictive Accuracy

In Fig. 2, Fig. 4 and Fig. 7, we also report the predictive accuracy of the GLM-HMM. When calculating the predictive accuracy, we employ a method similar to the second method described above in section 4.2.1. In particular, we hold out 20% of trials and then obtain the posterior state probabilities for these trials, *t*″ ∈ {*T* \ *T*′}, as $\gamma_{s,t'',k} = p(z_{s,t''} = k \mid \{y_{s,t}\}_{t \in T'}, \mathbf{x}_s, \Theta)$, using the other 80% of trials (this latter set of trials being labeled *T*′). We then calculate the probability of the held-out choices being to go Right as:

$$ p(y_{s,t''} = \text{R}) = \sum_{k=1}^{K} \gamma_{s,t'',k}\; p(y_{s,t''} = \text{R} \mid z_{s,t''} = k, \mathbf{x}_{s,t''}, \mathbf{w}_k) $$

We then calculate the predictive accuracy as the fraction of held-out trials on which thresholding this probability at 0.5 correctly predicts the animal's choice:

$$ \mathrm{acc} = \frac{1}{|T \setminus T'|} \sum_{t'' \in T \setminus T'} \mathbb{1}\!\left[ \mathbb{1}\!\left[ p(y_{s,t''} = \text{R}) > 0.5 \right] = y_{s,t''} \right] $$
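A minimal sketch of this computation (names here are illustrative; `gamma_test` would come from a forward-backward pass conditioned only on the training trials):

```python
import numpy as np

def predictive_accuracy(X_test, y_test, W, gamma_test):
    """X_test: (n, M) held-out design rows; y_test: (n,) choices in {0, 1};
    W: (K, M) per-state GLM weights; gamma_test: (n, K) posterior state
    probabilities on the held-out trials."""
    p_state = 1.0 / (1.0 + np.exp(-X_test @ W.T))   # (n, K) per-state P(Right)
    p_right = (gamma_test * p_state).sum(axis=1)    # mixture prediction
    return np.mean((p_right > 0.5) == (y_test == 1))
```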

### 4.3 Comparison with PsyTrack Model of Roy et al

The PsyTrack model of Roy et al. [56] assumes an animal makes its choice at trial *t* according to

$$ p(y_t = \text{R} \mid \mathbf{x}_t, \mathbf{w}_t) = \frac{1}{1 + e^{-\mathbf{w}_t \cdot \mathbf{x}_t}} \tag{21} $$

where **w**_{t} evolves according to

$$ \mathbf{w}_{t+1} = \mathbf{w}_t + \boldsymbol{\eta}_t, \qquad \boldsymbol{\eta}_t \sim \mathcal{N}\!\left(0,\, \mathrm{diag}(\sigma_1^2, \ldots, \sigma_M^2)\right) \tag{22} $$

where $\sigma_m^2$ is the random-walk variance for covariate *m*. Specifically, the animal is assumed to use a set of slowly changing weights to make its decision on each trial.

In order to perform model comparison with the PsyTrack model of [55, 56], we utilized the code provided at https://github.com/nicholas-roy/psytrack.

#### 4.3.1 Simulating Choice Data with AR(1) Model for Weights

While the GLM-HMM is a generative model that can readily be used to simulate choice data that resembles, in accuracy and in the resulting psychometric curves, the choice data of real animals, this is not true of the PsyTrack model of [55, 56]. Indeed, specifying only the hyperparameters of that model and then generating weights according to Eq. 22 and choice data according to Eq. 21 will likely result in choice behavior that is vastly different from that of real animals (the PsyTrack model is underconstrained as a generative model). As such, to produce Fig. S11 we simulated smoothly evolving weights from an AR(1) model whose parameters were obtained from the PsyTrack fits to real data. Specifically, we assumed that the probability of going rightward at trial *t* was given by:

$$ p(y_t = \text{R} \mid \mathbf{x}_t, \mathbf{w}_t) = \frac{1}{1 + e^{-\mathbf{w}_t \cdot \mathbf{x}_t}} \tag{23} $$

and we assumed that the weights evolved according to an AR(1) process as follows:

$$ w_{m,t} = \bar{w}_m + \alpha_m \left( w_{m,t-1} - \bar{w}_m \right) + \epsilon_{m,t}, \qquad \epsilon_{m,t} \sim \mathcal{N}(0, \sigma_m^2) $$

where *w*_{m,t} is the *m*th element of **w**_{t} in Eq. 23 and $\bar{w}_m$ is the average weight that the animal places on covariate *m* across all trials when fit with the PsyTrack model. We obtained *α*_{m} by regressing $(w_{m,t} - \bar{w}_m)$ against $(w_{m,t-1} - \bar{w}_m)$ and taking the retrieved slope (after confirming that the retrieved slope had a magnitude of less than 1, so as to ensure that the weights did not diverge as *t* → ∞). The innovation variance $\sigma_m^2$ was obtained from the PsyTrack fit to the animal's real choice data. Finally, we set **w**_{0} = 0.

We simulated weight trajectories for thousands of trials for each animal, so that the AR(1) process reached the stationary regime for each covariate, and so that the mean, variance and autocovariance of each weight for each covariate were close to those returned by the PsyTrack fits to the real choice data.
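The AR(1) simulation above can be sketched as follows, under placeholder parameters (in the actual analysis, `w_bar`, `alpha` and `sigma` would come from the PsyTrack fits to each animal):

```python
import numpy as np

def simulate_ar1_weights(w_bar, alpha, sigma, T, seed=0):
    """w_bar, alpha, sigma: (M,) per-covariate stationary mean, AR coefficient
    (|alpha| < 1) and innovation s.d.; returns a (T, M) weight trajectory."""
    rng = np.random.default_rng(seed)
    M = len(w_bar)
    w = np.zeros((T, M))                       # w_0 = 0, as in the text
    for t in range(1, T):
        w[t] = w_bar + alpha * (w[t - 1] - w_bar) + sigma * rng.normal(size=M)
    return w
```

Run for enough trials, each weight relaxes from 0 toward its stationary distribution around `w_bar`, matching the procedure described above.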

#### 4.3.2 Additional details about Figure S11

In order to produce supplementary Fig. S11, we aligned sets of 10 trials for which there was a state transition after trial 5, either out of the engaged state or into the engaged state. Animals were in the same state for all 5 trials prior to the transition and in the same state for all 5 trials after the transition (we excluded sequences in which state switches occurred during the first 5 or last 5 trials). Each grey line in Fig. S11 is the average accuracy across all eligible 10-trial sequences for a particular animal; we required that an animal have at least 30 such sequences to be included in Fig. S11. As a result, 27 (out of 37) animals are shown in Fig. S11 and are used to compute the average curve across all animals.

### 4.4 Datasets studied

In this paper, we applied the GLM-HMM to two publicly available behavioral datasets associated with recent publications. Firstly, we studied the data associated with [35] that is made available via figshare at https://doi.org/10.6084/m9.figshare.11636748. We used the framework developed in [36] to access the data. We modeled the choice data for the 37 animals in this dataset which had more than 30 sessions of data during the ‘bias block’ regime. We focused on this regime because mice that have reached it understand the rules of the task and exhibit stationary behavior (see Fig. S1 for plots of accuracy against session identity for each animal, and Fig. S2 for the psychometric curves of these animals on the trials studied). For each session, we subset to the first 90 trials of data because, during these trials, the stimulus was equally likely to appear on the left or right of the screen. After the first 90 trials, the structure of the task changed: for a block of trials, the stimulus appeared on the left with a probability of either 80% or 20%, and the block identity switched multiple times throughout a session, so that 80% and 20% blocks were interleaved. We subset to the animals with more than 30 sessions of data because we were able to confidently recover GLM-HMM and lapse model generative parameters when we simulated datasets with this number of trials: see Fig. S3 and Fig. S4. As a sanity check that the recovered states and transitions were not a consequence of the animals we study having been exposed to bias blocks in earlier sessions, we obtained data for 4 animals that were never exposed to bias blocks (not included in the publicly released dataset) and fit the GLM-HMM to the choice data for these animals. In Fig. S12, we show that the retrieved states, dwell times and model comparison results for these animals look very similar to those shown in Fig. 4.

The second dataset that we studied was that associated with [48], with the data made available at https://doi.org/10.14224/1.38944. Once again, we studied sessions after animals had learned the task (see Fig. S13 and Fig. S14). For this dataset, the retrieved states were less distinct than those for the IBL dataset, and as such, we required more trials to be able to recover the generative parameters in simulated data: see Fig. S3. We thus subset to the 15 animals with more than 20 sessions of data and 12,000 trials of data. Compared to the IBL dataset, where the violation rate across all animals’ data was less than 1% of trials (a violation being a trial on which the animal chose not to respond), the violation rate across the 15 animals that we studied from this second dataset was 21%. Thus, it was important to develop a principled method for dealing with violation trials. We treated violation trials as trials with missing choice data, and we handled these trials in a similar way to how we handled test data when performing the second type of cross-validation described in section 4.2.1 above. That is, we modified the third term of the ECLL given in Eq. 7 to exclude violation trials, and we modified the definition of the posterior state probabilities for these trials to be $\gamma_{s,t,k} = p(z_{s,t} = k \mid \{y_{s,t}\}_{t \in T'}, \mathbf{x}_s, \Theta)$ and $\xi_{s,t,j,k} = p(z_{s,t} = j, z_{s,t+1} = k \mid \{y_{s,t}\}_{t \in T'}, \mathbf{x}_s, \Theta)$, where *T*′, rather than representing the training-set data, is now the set of non-violation trials. The calculation of the forward and backward probabilities, *a*_{s,t,k} and *b*_{s,t,k}, was modified so that, on violation trials, the *p*(*y*_{s,t}|*z*_{s,t} = *k*, **x**_{s,t}, **w**_{k}) term in Eq. 11, Eq. 12, Eq. 13 and Eq. 14 was replaced with 1.

#### 4.4.1 Forming the Design Matrix

Each of the models discussed in this paper (the GLM-HMM, the classic lapse model, and the PsyTrack model of Roy et al. [56]) was fit using a design matrix of covariates, *X* ∈ ℝ^{T×M}, where *T* was the number of trials of choice data for a particular animal. A single row of this matrix was the vector of covariates, **x**_{t} ∈ ℝ^{M}, influencing the animal’s choice at trial *t*. For all analyses presented in the text, unless specified otherwise, *M* = 4.

For both tasks, the first column in the design matrix was the z-scored stimulus intensity. For the IBL task, we calculated the stimulus intensity as the difference in the value of the visual contrast on the right side of the screen minus the visual contrast on the left of the screen. This resulted in 9 different values for the ‘signed contrast’: {−100, −25, −12.5, −6.25, 0, 6.25, 12.5, 25, 100}. We then z-scored this difference quantity across all trials. For the Odoemene et al. [48] task, we subtracted the 12Hz threshold from the flash rate presented on each trial and then z-scored the resulting quantity.

For all trials, all animals and both tasks, the second column of the design matrix was set to 1, so as to enable us to capture the animal’s innate bias for going rightward or leftward. The third column in the design matrix was the animal’s choice on the previous trial, *X*_{t,3} ≡ 2*y*_{t–1} – 1, so that, whereas *y*_{t–1} ∈ {0,1}, *X*_{t,3} ∈ {−1,1}. It is not strictly necessary to perform this scaling, but we did so to ensure that the ranges of values for *X*_{:,1} and *X*_{:,3} were more similar (which can be useful when performing parameter optimization). Finally, the fourth column in the design matrix was the win-stay-lose-switch covariate, which was calculated as *X*_{t,4} ≡ *r*_{t–1} × (2*y*_{t–1} – 1), where *r*_{t–1} ∈ {−1,1} indicates whether or not the animal was rewarded on the previous trial. Again, *X*_{:,4} ∈ {−1,1}.
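The four columns described above can be assembled as follows (a sketch with illustrative names; the treatment of the first trial, which has no history, is an assumption here, as the text does not specify it):

```python
import numpy as np

def build_design_matrix(stim_z, y, r):
    """stim_z: (T,) z-scored stimulus; y: (T,) choices in {0, 1};
    r: (T,) rewards in {-1, 1}. Returns the (T, 4) design matrix."""
    T = len(stim_z)
    X = np.ones((T, 4))
    X[:, 0] = stim_z                       # z-scored stimulus
    # column 1 stays at 1: the bias term
    prev_choice = 2 * y - 1                # map {0, 1} -> {-1, 1}
    X[1:, 2] = prev_choice[:-1]            # previous choice
    X[1:, 3] = r[:-1] * prev_choice[:-1]   # win-stay-lose-switch
    X[0, 2] = X[0, 3] = 0.0                # no history on the first trial (assumption)
    return X
```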

### 4.5 Code availability

We contributed code to the Bayesian State Space Modeling framework of [42] and we use this code base to perform GLM-HMM inference. The code to analyze the resulting model fits and to produce the figures in this paper will be made available at https://github.com/zashwood/glm-hmm.


## 5 Acknowledgments

We are grateful to Miles Wells, Rebecca Terry and the Cortexlab at University College London for providing us with the data for the 4 mice plotted in Fig. S12. We are grateful to Scott Linderman for developing the beautiful Bayesian State Space Modeling framework of [42]: as described in our Methods section, we built our code on top of this framework. We thank members of the Pillow Lab, the International Brain Laboratory (IBL), and specifically the Behavior Analysis Working Group within the IBL for helpful feedback throughout the project. We thank Peter Dayan, Sebastian Bruijns and Liam Paninski for acting as the IBL Review Board for this paper. We thank Anne Urai and Emily Dennis for their feedback at various points during the project. We thank Abigail Russo and Matthew Whiteway for providing feedback on drafts of this manuscript. We thank Hannah Bayer for her help and advice as we were preparing to submit this paper. This work was supported by grants from the Simons Collaboration on the Global Brain (SCGB AWD543027; JWP), the NIH BRAIN initiative (NS104899 and R01EB026946; JWP), and a U19 NIH-NINDS BRAIN Initiative Award (5U19NS104648; JWP).
