A recurrent cortical model can parsimoniously explain the effect of expectations on sensory processes

Buse M. Urgen; Huseyin Boyaci

doi:10.1101/2021.02.05.429913

Abstract

The effects of prior knowledge and expectations on perceptual and decision-making processes have been extensively studied. Yet, the computational mechanisms underlying those effects remain controversial. Recently, using a recursive Bayesian updating scheme, unmet expectations have been shown to entail further computations, and consequently delay perceptual processes. Here we take a step further and model these empirical findings with a recurrent cortical model, which was previously suggested to approximate Bayesian inference (Heeger, 2017). Our model fitting results show that the cortical model can successfully predict the behavioral effects of expectation. That is, when the actual sensory input does not match the expectations, the sensory process needs to be completed with additional, and consequently longer, computations. We suggest that this process underlies the longer perceptual thresholds observed under unmet expectations. Overall, our findings demonstrate that a parsimonious recurrent cortical model can explain the effects of expectation on sensory processes.

Introduction

A growing body of work in the last two decades has examined whether and how prior knowledge and expectations affect perceptual processes. These studies have consistently shown that expected stimuli are detected more rapidly and accurately than unexpected stimuli. While these effects are well established at the behavioral level, the computational mechanisms that may underlie them remain relatively unclear.

The Bayesian modeling framework has been successful in accounting for a wide range of empirical data in visual perception. Numerous studies have provided evidence that perception can be modeled as an inference process in which noisy or ambiguous sensory stimuli are combined with the prior (Chalk, Seitz, & Seriès, 2010; de Lange, Heilbron, & Kok, 2018; Ernst & Banks, 2002; Kersten, Mamassian, & Yuille, 2004; Knill & Pouget, 2004; Maloney & Mamassian, 2009; Mamassian, Landy, & Maloney, 2002; Summerfield & De Lange, 2014; Teufel, Subramaniam, & Fletcher, 2013; Weiss, Simoncelli, & Adelson, 2002; Yuille & Kersten, 2006). In line with this, we recently used a behavioral paradigm and Bayesian modeling to examine whether and how prior knowledge and expectations affect perceptual processes (Urgen & Boyaci, 2021). Unlike what is commonly suggested in the field, we found that valid expectations do not speed up perceptual processes. On the contrary, our findings indicate that unmet expectations lead to a delay, and consequently longer processing times. Moreover, we showed that a recursive Bayesian updating scheme can successfully capture that behavior (Urgen & Boyaci, 2021).

The Bayesian framework is not limited to modeling human behavior. It has also attracted considerable interest as a way to model brain function (e.g., Friston, 2005; Heeger, 2017; Rao & Ballard, 1999). These models are generally known as predictive processing models. Despite some architectural differences, at a common and fundamental level all these mechanistic cortical models suggest that information processing in the brain can be implemented via dynamic interaction between bottom-up sensory input and top-down prior knowledge. There is currently a considerable amount of evidence in line with the main assumptions of these models (e.g., Shipp, 2016). For example, an increased neural response to unexpected events is consistently shown in several neuroimaging studies, and is interpreted as reflecting the prediction error signal. However, even though there is an extensive effort to link the empirical findings with the predictive processing models, a crucial step has been overlooked in many studies. Unlike Bayesian models of behavior, the proposed mechanistic cortical models have not usually been directly tested against behavioral and neuronal data. Given the power of computational models not only in interpreting empirical data but also in providing a mechanistic understanding of information processing and even in making predictions, it is important to directly test these predictive processing models against empirical data.

Here, we propose a recurrent cortical model (Heeger, 2017) to examine the effects of expectation on perceptual processes. To this end, we modeled the behavioral findings of Urgen and Boyaci (2021), where we found that unmet expectations lead to higher temporal thresholds. As mentioned above, in this earlier work we showed, using a recursive Bayesian model, that the system needs a longer time to complete the sensory process when sensory input and expectations disagree. Here, we take a step further and examine how these effects can be modeled in the predictive processing framework. The cortical model we present here is a parsimonious one and simply assumes that the activity of neural units is an interplay between the weighted responses to bottom-up sensory input and the top-down prior. There is no specified sub-population of neural units, e.g. for prediction or error computation. Notably, within each trial of the behavioral experiment the prior is updated recursively to capture the temporal dynamics of the sensory process. Modeling the behavioral data allowed us to test whether the proposed cortical model can explain the behavioral effects of expectation. This approach also made it possible to test whether the cortical model predictions approximate the Bayesian model implemented in Urgen and Boyaci (2021).

Methods

The cortical model we propose here is used to model the behavioral findings of Urgen and Boyaci (2021). See Figure 1 for the experimental paradigm and behavioral results. Briefly, in the behavioral experiment each trial started with a foveally presented cue, either a house or a face symbol. This predictive cue was informative about the upcoming target image category. Subsequently, an intact (target) image and its scrambled version were briefly shown on either side of the central fixation point, followed by new scrambled images as masks. Participants’ task was to indicate the location, left or right, of the intact (target) image. The validity of the cue was set at 100%, 75%, 50%, and neutral (no expectation) in different experimental sessions. We computed temporal thresholds in neutral, congruent (expected) and incongruent (unexpected) trials under different validity conditions. We found that incongruent trials lead to longer thresholds than congruent trials in the 75%-validity condition (Urgen & Boyaci, 2021). We also found that thresholds under the 100%-validity condition are lower neither than those under the neutral condition nor than those of the valid trials of the other validity conditions. Hence, we concluded that valid expectations do not speed up sensory processes; instead, violation of expectations slows them down.

Figure 1: Behavioral experiment in Urgen and Boyaci (2021).

a. Experimental design. On each trial, task-irrelevant predictive cues (face or house) provided prior information about the probability of the upcoming target image category (face or house). Next, an intact target image and a scrambled version of it were presented separately in the left and right periphery. The presentation duration of the target images varied from trial to trial and was determined by an adaptive staircase procedure. Participants’ task was to detect the spatial location of the target image (left or right). The validity of the cue was set at 100%, 75%, and 50% in separate sessions. b. Results. We measured temporal thresholds, defined as the shortest duration at which participants can successfully detect the spatial location of the target image. We examined how the thresholds of congruent (expected) and incongruent (unexpected) trials differ. We found that incongruent trials lead to higher temporal thresholds than congruent trials only in the 75%-validity condition. This figure is reproduced, with permission, from the results and figures presented in Urgen and Boyaci (2021). Copyright 2021, Elsevier.

Cortical Model

For a biologically plausible mechanistic model to explain these findings, we adapted a recently proposed cortical model (Heeger, 2017; Heeger & Mackey, 2019). Figure 2 outlines the model, which was composed of one input, one decision and three intermediate layers, and three category-specific feature units (representing populations of neurons) for face, house and scrambled images. We first define an energy function that the system tries to minimize:

Figure 2: A schematic illustration of the cortical model.

The model has one input, one decision and three intermediate layers, and three category-specific feature units for house (H), face (F) and scrambled (S) images. Weights of the connections are depicted by the thickness of the arrows. Priors, Ŷ_i, are initialized at the beginning of the trial based on the cue and its validity. Later they are updated with past values of layer 3 unit responses. All unit responses are updated until the end of the presentation. The number of iterations in a trial is determined by the stimulus presentation duration, τ, divided by Δt, where Δt defines how long each iteration lasts in the system. A final decision is made by the model based on the sum of layer 3 house and face unit responses on the left and right side of the visual field.

Table 1:

Notations for cortical model.

where the indices i and j run over units and layers, respectively, Ŷ_i are the priors, and the remaining variables are the unit responses and the weights of the connections between units of different layers. The parameters γ^(j) take values between 0 and 1 and determine the relative weights of the feedforward and prior drives, whereas α^(j) determine the relative contributions of the layers. Unit responses are updated by minimizing the energy function (Eq. 1) with respect to the unit responses using gradient descent, where a is the inverse of a time constant and is set to 1/5. Note that the feedback and “horizontal” interactions between different units in the same layer emerge in the equations after taking the derivative of the energy function. The number of iterations, N, is determined by N = τ/Δt, where τ is the duration of presentation of the images in the trial, and Δt determines how long each iteration lasts in the system. At the beginning of a trial (t = 0), intermediate layer unit responses are randomly drawn from a normal distribution with mean 0, where σ_u defines the noise in unit responses (Ma, Beck, Latham, & Pouget, 2006).
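The update scheme described above can be illustrated with a minimal sketch. It is not the fitted model: it assumes a quadratic energy in the spirit of Heeger (2017), identity connection weights between layers, the same prior at every layer, and a rounded-down iteration count. The function names and all parameter values except a = 1/5 are hypothetical.

```python
import math

A = 1 / 5  # inverse time constant "a" from the text


def n_iterations(tau, dt):
    """N = tau / dt; rounding down is an assumption of this sketch."""
    return int(math.floor(tau / dt))


def energy(y, x, prior, alpha, gamma):
    """Assumed quadratic energy: weighted feedforward and prior mismatches.

    y[j][i] is the response of unit i in layer j, x is the input-layer
    response vector, prior[i] is the prior for unit i.
    """
    total = 0.0
    for j in range(len(y)):
        drive = x if j == 0 else y[j - 1]  # identity weights (assumption)
        ff = sum((y[j][i] - drive[i]) ** 2 for i in range(len(x)))
        pr = sum((y[j][i] - prior[i]) ** 2 for i in range(len(x)))
        total += alpha[j] * (gamma[j] * ff + (1 - gamma[j]) * pr)
    return total


def step(y, x, prior, alpha, gamma, a=A):
    """One gradient-descent step on the energy (constant factor 2 absorbed in a)."""
    n_layers = len(y)
    new_y = []
    for j in range(n_layers):
        drive = x if j == 0 else y[j - 1]
        row = []
        for i in range(len(x)):
            g = alpha[j] * (gamma[j] * (y[j][i] - drive[i])
                            + (1 - gamma[j]) * (y[j][i] - prior[i]))
            if j + 1 < n_layers:
                # Feedback term: it appears only because layer j feeds layer j + 1.
                g -= alpha[j + 1] * gamma[j + 1] * (y[j + 1][i] - y[j][i])
            row.append(y[j][i] - a * g)
        new_y.append(row)  # all layers are updated simultaneously
    return new_y


# Hypothetical usage: unit order (house, face, scrambled), face input, face-favoring prior.
x = [0.0, 1.0, 0.0]
prior = [0.25, 0.75, 0.0]
alpha, gamma = [1.0, 1.0, 1.0], [0.5, 0.5, 0.5]
y = [[0.0] * 3 for _ in range(3)]
for _ in range(n_iterations(tau=100, dt=10)):
    y = step(y, x, prior, alpha, gamma)
```

In this sketch the feedback term is not built in by hand; it falls out of differentiating the energy with respect to a lower-layer response, which mirrors the remark above about feedback interactions emerging from the derivative.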

The values of the priors, Ŷ_i, are initialized based on the cue and its validity at the beginning of a trial (t = 0). Later (t > 0), however, the priors are updated based on the responses of layer 3 units in previous iterations. This amounts to using priors that are updated over time.

Input layer units

We defined the input stimulus, s = (s₁, s₂, s₃), as a three-element vector, and at each iteration we computed a noisy abstracted observation, where σ_s defines the noise level. Next, we calculated the input layer responses, where ψ_i are the noise-free neuronal responses based on their tuning curves. Note that the input layer units were not subject to the energy minimization, and they did not receive feedback or prior drive.
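As a rough illustration of the two steps above, the following sketch adds Gaussian noise to the stimulus vector and passes each element through a tuning curve. The additive noise model, the Gaussian shape of the tuning curve, its preferred value of 1, and the names `noisy_observation`, `tuning_response` and `sigma_t` are all assumptions made for illustration, not the forms used in the model.

```python
import math
import random


def noisy_observation(s, sigma_s, rng):
    """Add independent Gaussian noise with standard deviation sigma_s (assumed noise model)."""
    return [s_i + rng.gauss(0.0, sigma_s) for s_i in s]


def tuning_response(o_i, sigma_t, preferred=1.0):
    """Noise-free response psi_i from an assumed Gaussian tuning curve of width sigma_t."""
    return math.exp(-((o_i - preferred) ** 2) / (2.0 * sigma_t ** 2))


def input_layer(s, sigma_s, sigma_t, rng):
    """Input-layer responses for one iteration: observe with noise, then apply tuning."""
    observation = noisy_observation(s, sigma_s, rng)
    return [tuning_response(o_i, sigma_t) for o_i in observation]


# Hypothetical usage: a stimulus that activates only the second channel.
rng = random.Random(0)
responses = input_layer([0.0, 1.0, 0.0], sigma_s=0.1, sigma_t=0.5, rng=rng)
```

Because a fresh observation is drawn at every iteration, the input-layer responses fluctuate over the course of a trial even though the stimulus itself is fixed.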

Prior units

At the beginning of each trial (t = 0) we defined initial prior probabilities, c = (c₁, c₂, c₃), which depended on the cue and its validity. For example, in a trial under the 75%-validity condition with a face cue, the initial prior probabilities reflect that cue and its validity. We then computed the activity of the prior units at t = 0 from these probabilities. For t > 0, the prior unit activities were updated recursively at each iteration (in a single trial) with the past values of unit responses. Specifically, the values of the layer 3 unit responses in the previous iteration become the prior in the next iteration.
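The recursive use of the prior can be sketched with a toy, single-layer version of the update. This is a simplification and not the fitted three-layer model: it tracks only a face unit and a house unit, assigns 1 minus the validity to the uncued category, starts responses at zero without noise, and uses a hypothetical weight `GAMMA`. It only illustrates how a cue-based initial prior is replaced by past responses.

```python
A = 1 / 5    # inverse time constant from the text
GAMMA = 0.5  # hypothetical weight of the feedforward drive relative to the prior drive


def initial_prior(cue, validity):
    """Initial prior over (face, house); 1 - validity for the uncued category is an assumption."""
    if cue == "face":
        return [validity, 1.0 - validity]
    return [1.0 - validity, validity]


def run_trial(cue, target, validity, n_iter):
    """Return the (face, house) responses after each iteration of one toy trial."""
    x = [1.0, 0.0] if target == "face" else [0.0, 1.0]
    prior = initial_prior(cue, validity)
    y = [0.0, 0.0]
    history = []
    for _ in range(n_iter):
        y_new = [
            y[i] - A * (GAMMA * (y[i] - x[i]) + (1 - GAMMA) * (y[i] - prior[i]))
            for i in range(2)
        ]
        prior = y  # responses from the previous iteration become the next prior
        y = y_new
        history.append(y)
    return history


def iterations_to_threshold(history, target_index, threshold):
    """First iteration at which the target unit reaches the threshold, or None."""
    for n, y in enumerate(history, start=1):
        if y[target_index] >= threshold:
            return n
    return None


# Hypothetical usage: face target, 75% validity, congruent versus incongruent cue.
congruent = run_trial("face", "face", 0.75, 30)
incongruent = run_trial("house", "face", 0.75, 30)
n_congruent = iterations_to_threshold(congruent, 0, 0.52)
n_incongruent = iterations_to_threshold(incongruent, 0, 0.52)
```

In this toy setting the target unit's response in the incongruent trial stays below its value in the congruent trial at every iteration, so it can never reach a fixed threshold earlier, which is the qualitative pattern the full model is meant to capture.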

Decision

To make a decision we calculated the sum of the last layer’s (layer 3) face and house unit responses for the left and right locations separately (T_LEFT, T_RIGHT). The model then makes a decision based on T_LEFT and T_RIGHT, where λ is the decision threshold (Heekeren, Marrett, Bandettini, & Ungerleider, 2004). If the conditions for choosing left or right are not satisfied, a choice is made randomly.
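One way to read this decision stage is sketched below. The sums T_LEFT and T_RIGHT follow the description above, but the exact comparison is an assumption: here a side is chosen only when its sum exceeds the other by more than λ. The function name `decide` and the dictionary layout are hypothetical.

```python
import random


def decide(layer3_left, layer3_right, lam, rng):
    """Choose 'left' or 'right' from layer 3 face and house responses.

    The margin rule (difference of sums greater than lam) is an assumed
    form of the decision condition; ties and sub-threshold cases are random.
    """
    t_left = layer3_left["face"] + layer3_left["house"]
    t_right = layer3_right["face"] + layer3_right["house"]
    if t_left - t_right > lam:
        return "left"
    if t_right - t_left > lam:
        return "right"
    return rng.choice(["left", "right"])


# Hypothetical usage: the intact image is on the left, a scrambled image on the right.
rng = random.Random(0)
choice = decide({"face": 0.7, "house": 0.2}, {"face": 0.1, "house": 0.1}, lam=0.3, rng=rng)
```

Under this reading, a larger λ requires more accumulated evidence before a non-random choice, so at short presentation durations more trials fall back on guessing.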

Results

Model Simulations of Behavioral Data

We tested whether the cortical model can explain the observed behavioral effect. To this end, we fit the model to the observer data at the individual participant level by optimizing three parameters: λ (decision threshold), Δt (duration of an iteration), and the variance of the tuning curves at the input layer.

Figure 3a shows the simulation results for a single participant (see Supplementary Material for simulations of all participants). The results agree well with the empirical findings: when the cues are invalid, the curve shifts to the right in the 75%-validity condition, indicating that the cortical model also needs a longer time to detect the location of the target image in an incongruent (unexpected) trial. There was no difference between the congruent and incongruent trials in the 50% validity condition, again consistent with the empirical data.

Figure 3: Cortical model simulations.

a. Simulation results for a single participant. See Supplementary Material for simulations of all participants. b. Average number of iterations, N, in congruent (expected) and incongruent (unexpected) trials under all validity conditions. The model simulation results are consistent with the behavioral findings (see Figure 1b) and support the view that the empirical differences between conditions and trial types can be explained by different amounts of internal processing required to reach a decision. Error bars are twice the standard error.

Next, we tested whether, in the cortical model, further processing, in other words a greater number of iterations, leads to the longer thresholds in incongruent trials. For this, we compared the number of iterations performed by the model in all trial types (congruent, incongruent) and validity conditions (100%, 75%, 50%). Recall that the number of iterations, N, is computed by taking the ratio of the duration of that trial, τ, and the time it takes to complete a single iteration, Δt. Figure 3b shows the number of iterations under each condition. The results show that N is greater for incongruent trials under the 75%-validity condition but not under the 50%-validity condition. Specifically, we performed a 2 (trial type: congruent, incongruent) x 2 (validity: 75%, 50%) repeated measures ANOVA to investigate the effect of expectation and validity on N. We found that the main effect of expectation was statistically significant (F(1,7) = 18.511, p = 0.004), but there was no main effect of validity and no interaction (F(1,7) = 0.299, p = 0.602; F(1,7) = 0.738, p = 0.419). The number of iterations was significantly greater in incongruent trials than in congruent trials in the 75%-validity condition (t(7) = 3.220, p = 0.015). However, there was no significant difference between the congruent and incongruent trials in the 50%-validity condition (t(7) = 2.047, p = 0.08), nor between the 100%-validity condition and the congruent trials of the 50%-validity (t(7) = -1.829, p = 0.110) and 75%-validity conditions (t(7) = -1.247, p = 0.253). These results are consistent with the empirical data, and show that further processing, and thus a longer time, is required to converge on a decision when the expectations are not met.

Timecourse of unit responses

Figure 4 shows the unit responses at each layer of the cortical model for a single trial. The trial is from the 75%-validity condition in which a face cue is presented. In all layers, at the beginning of the trial, face units respond more strongly than other units regardless of whether the presented image is congruent or incongruent with the cue. However, the responses change throughout the iterations of the trial. Specifically, in the congruent trials (i.e. when the presented image is a face), face units continue to be the most responsive units until the end of the trial. In the incongruent trials (i.e. when the presented image is a house), by contrast, face unit responses decrease while house unit responses gradually increase throughout the trial. The comparison between the unit responses of the congruent and incongruent trials shows that the model responses (for a correct decision) are delayed in the incongruent trials compared to the congruent trials.

Figure 4: Cortical model unit responses at each layer within a single trial.

The figure shows cortical model unit responses when a face cue is presented in the 75%-validity condition. Left and right panels show unit responses in a congruent trial and an incongruent trial, respectively. As can be seen, at each layer the unit responses for a correct decision are delayed in the incongruent trial compared to the congruent trial.

Discussion

In this study we present a recurrent cortical model to explain the behavioral effects of expectation on early visual processes that we found in Urgen and Boyaci (2021). Recurrent models have been suggested to be superior for visual inference compared to models with only a feedforward architecture (van Bergen & Kriegeskorte, 2020). Model fitting results reveal that the cortical model can successfully predict the behavioral effects of expectation. Specifically, when expectations are not met, the cortical model needs to compute more iterations, which results in longer processing, to converge on a decision. Notably, this result is in line with our previous findings with Bayesian modeling of the same data (Urgen & Boyaci, 2021), and further supports the view that additional steps of computation are responsible for the higher perceptual thresholds under unmet expectations.

There are several mechanistic cortical models which have computational constructs that are analogous to the ones in the Bayesian framework (e.g., Friston, 2005; Heeger, 2017; Mumford, 1992; Rao & Ballard, 1999). Despite the architectural differences between these models, referred to as predictive processing models here, they all provide a compelling framework to understand the involvement of prior knowledge in cortical information processing. In line with this, several neuroimaging findings provide strong neural evidence that top-down information coming from higher regions has a modulatory effect on the activity of early visual areas as well as higher visual processing areas (e.g. FFA) (Bar, 2004; Egner, Monti, & Summerfield, 2010; Gilbert & Sigman, 2007; Kok, Bains, van Mourik, Norris, & de Lange, 2016; Kok, Brouwer, van Gerven, & de Lange, 2013; Kok, Jehee, & De Lange, 2012; Muckli et al., 2015; Muckli & Petro, 2013; Summerfield et al., 2006; Summerfield & Koechlin, 2008). Specifically, recent neuroimaging findings showed that prior knowledge and expectations influence several information processing stages, including the early and late stages of visual processing (e.g., Alink, Schwiedrzik, Kohler, Singer, & Muckli, 2010; Egner et al., 2010; Kok et al., 2013, 2012; Richter, Ekman, & de Lange, 2018; Summerfield et al., 2006; Summerfield & Koechlin, 2008). Accordingly, recent neural evidence has been interpreted to be consistent with the predictive processing account of brain function. However, the models have not been directly tested against empirical data, which limits their practical use and explanatory power. Our primary effort in this study was to directly test a predictive processing model against empirical data and provide a mechanistic understanding of the effect of expectation on visual perception.

The recurrent cortical model adapted in this study is a simple and parsimonious one that includes neither subpopulations of special neural units, e.g. for error or prediction computation, nor a comparison of (low-level) sensory input and (high-level) predictions (Heeger, 2017). The model assumes that information processing in the brain can be accomplished simply by feedforward and feedback connections. Our findings show that even such a simple and parsimonious model can successfully account for the behavioral effects of expectation on perceptual processes. Specifically, we suggest that when we are exposed to an unexpected stimulus, there might be a change in feedforward-feedback interactions, e.g. additional neural units may become active and get involved in the process. This may in turn elicit additional processing, and consequently result in longer computations. This idea can account for why an unexpected stimulus leads to higher perceptual thresholds and a delay in sensory processes, as found in Urgen and Boyaci (2021).

Conclusion

We contend that the cortical model we propose here offers a parsimonious explanation for the effects of expectation on sensory processes. Delays in human responses to unexpected stimuli can be explained simply by the further, and consequently longer, computations required by the system. The model simulations also agree well with a Bayesian model. From a broader perspective, the model offers a biologically plausible mechanism underpinning Bayesian perceptual inference in the brain and a rigorous link between behavioral and neuronal responses.

Author Contributions

HB and BMU conceived the original study. BMU implemented the cortical model and performed the modeling with support from HB.

Competing Interests

The authors declare no competing interests.

Acknowledgements

This work was funded by a grant of the Turkish National Scientific and Technological Council (TÜBİTAK 217K163) awarded to HB. We thank Katja Doerschner for her valuable comments on an earlier version of the manuscript.

References

Alink, A., Schwiedrzik, C. M., Kohler, A., Singer, W., & Muckli, L. (2010). Stimulus predictability reduces responses in primary visual cortex. Journal of Neuroscience, 30 (8), 2960–2966.

Bar, M. (2004). Visual objects in context. Nature Reviews Neuroscience, 5 (8), 617.

Chalk, M., Seitz, A. R., & Seriès, P. (2010). Rapidly learned stimulus expectations alter perception of motion. Journal of Vision, 10 (8), 2–2.

de Lange, F. P., Heilbron, M., & Kok, P. (2018). How do expectations shape perception? Trends in Cognitive Sciences.

Egner, T., Monti, J. M., & Summerfield, C. (2010). Expectation and surprise determine neural population responses in the ventral visual stream. Journal of Neuroscience, 30 (49), 16601–16608.

Ernst, M. O., & Banks, M. S. (2002). Humans integrate visual and haptic information in a statistically optimal fashion. Nature, 415 (6870), 429.

Friston, K. (2005). A theory of cortical responses. Philosophical Transactions of the Royal Society of London B: Biological Sciences, 360 (1456), 815–836.

Gilbert, C. D., & Sigman, M. (2007). Brain states: top-down influences in sensory processing. Neuron, 54 (5), 677–696.

Heeger, D. J. (2017). Theory of cortical function. Proceedings of the National Academy of Sciences, 114 (8), 1773–1782.

Heeger, D. J., & Mackey, W. E. (2019). Oscillatory recurrent gated neural integrator circuits (ORGaNICs), a unifying theoretical framework for neural dynamics. Proceedings of the National Academy of Sciences of the United States of America, 116 (45), 22783–22794. doi: 10.1073/pnas.1911633116

Heekeren, H. R., Marrett, S., Bandettini, P. A., & Ungerleider, L. G. (2004). A general mechanism for perceptual decision-making in the human brain. Nature, 431 (7010), 859.

Kersten, D., Mamassian, P., & Yuille, A. (2004). Object perception as Bayesian inference. Annu. Rev. Psychol., 55, 271–304.

Knill, D. C., & Pouget, A. (2004). The Bayesian brain: the role of uncertainty in neural coding and computation. Trends in Neurosciences, 27 (12), 712–719.

Kok, P., Bains, L. J., van Mourik, T., Norris, D. G., & de Lange, F. P. (2016). Selective activation of the deep layers of the human primary visual cortex by top-down feedback. Current Biology, 26 (3), 371–376.

Kok, P., Brouwer, G. J., van Gerven, M. A., & de Lange, F. P. (2013). Prior expectations bias sensory representations in visual cortex. Journal of Neuroscience, 33 (41), 16275–16284.

Kok, P., Jehee, J. F., & De Lange, F. P. (2012). Less is more: expectation sharpens representations in the primary visual cortex. Neuron, 75 (2), 265–270.

Ma, W. J., Beck, J. M., Latham, P. E., & Pouget, A. (2006). Bayesian inference with probabilistic population codes. Nature Neuroscience, 9 (11), 1432–1438. doi: 10.1038/nn1790

Maloney, L. T., & Mamassian, P. (2009). Bayesian decision theory as a model of human visual perception: Testing Bayesian transfer. Visual Neuroscience, 26 (1), 147–155.

Mamassian, P., Landy, M., & Maloney, L. T. (2002). Bayesian modelling of visual perception. Probabilistic models of the brain, 13–36.

Muckli, L., De Martino, F., Vizioli, L., Petro, L. S., Smith, F. W., Ugurbil, K., … Yacoub, E. (2015). Contextual feedback to superficial layers of V1. Current Biology, 25 (20), 2690–2695.

Muckli, L., & Petro, L. S. (2013). Network interactions: Non-geniculate input to V1. Current Opinion in Neurobiology, 23 (2), 195–201.

Mumford, D. (1992). On the computational architecture of the neocortex. Biological Cybernetics, 66 (3), 241–251.

Rao, R. P., & Ballard, D. H. (1999). Predictive coding in the visual cortex: a functional interpretation of some extra-classical receptive-field effects. Nature Neuroscience, 2 (1), 79.

Richter, D., Ekman, M., & de Lange, F. P. (2018). Suppressed sensory response to predictable object stimuli throughout the ventral visual stream. Journal of Neuroscience, 38 (34), 7452–7461.

Shipp, S. (2016). Neural elements for predictive coding. Frontiers in Psychology, 7, 1792.

Summerfield, C., & De Lange, F. P. (2014). Expectation in perceptual decision making: neural and computational mechanisms. Nature Reviews Neuroscience, 15 (11), 745.

Summerfield, C., Egner, T., Greene, M., Koechlin, E., Mangels, J., & Hirsch, J. (2006). Predictive codes for forthcoming perception in the frontal cortex. Science, 314 (5803), 1311–1314.

Summerfield, C., & Koechlin, E. (2008). A neural representation of prior information during perceptual inference. Neuron, 59 (2), 336–347.

Teufel, C., Subramaniam, N., & Fletcher, P. C. (2013). The role of priors in Bayesian models of perception. Frontiers in Computational Neuroscience, 7, 25.

Urgen, B. M., & Boyaci, H. (2021). Unmet expectations delay sensory processes. Vision Research, 181, 1–9. doi: 10.1016/j.visres.2020.12.004

van Bergen, R. S., & Kriegeskorte, N. (2020). Going in circles is the way forward: the role of recurrence in visual inference. arXiv preprint arXiv:2003.12128.

Weiss, Y., Simoncelli, E. P., & Adelson, E. H. (2002). Motion illusions as optimal percepts. Nature Neuroscience, 5 (6), 598.

Yuille, A., & Kersten, D. (2006). Vision as Bayesian inference: analysis by synthesis? Trends in Cognitive Sciences, 10 (7), 301–308.