Abstract
Experiments and models in perceptual decision-making point to a key role of an integration process that accumulates sensory evidence over time. We endow a probabilistic agent with several such integrators, with widely spread time scales, and let it learn, by trial and error, to weight the different filtered versions of a noisy signal. The agent discovers a strategy markedly different from the literature “standard”, according to which a decision is made when the accumulated evidence hits a predetermined threshold. The agent instead decides during fleeting windows corresponding to the alignment of many integrators, akin to a majority vote. This strategy presents three distinguishing signatures. 1) Signal neutrality: a marked insensitivity to the signal coherence in the interval preceding the decision, as also observed in experiments. 2) Scalar property: the mean of the response times varies widely across signal coherences, yet the shape of the distribution stays largely unchanged. 3) Collapsing boundaries: the agent learns to behave as if subject to a non-monotonic urgency signal, reminiscent in shape of the theoretically optimal one. These three characteristics, which emerge from the interaction of a multi-scale learning agent with a highly volatile environment, are hallmarks, we argue, of an optimal decision strategy in challenging situations. As such, the present results may shed light on general information-processing principles leveraged by the brain itself.
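As a rough illustration of the ingredients named above, the following Python sketch filters one noisy signal through a bank of leaky integrators with widely spread time constants and combines the filtered versions through a weighted, probabilistic readout. It is only a sketch under stated assumptions, not the model used in this work: the first-order update rule, the geometric spacing of the time constants, the logistic readout, the uniform weights, and all function and parameter names (`leaky_integrators`, `decision_probability`, `noisy_signal`, `taus`, `weights`) are hypothetical choices made for illustration, and the trial-and-error learning of the weights is not implemented.

```python
import math
import random


def leaky_integrators(signal, taus, dt=1.0):
    """Filter `signal` with one first-order leaky integrator per time constant.

    Assumed update rule (not taken from the paper): x <- x + (dt / tau) * (s - x).
    Returns a list with the integrator states after each time step.
    """
    states = [0.0] * len(taus)
    history = []
    for s in signal:
        states = [x + (dt / tau) * (s - x) for x, tau in zip(states, taus)]
        history.append(list(states))
    return history


def decision_probability(states, weights, bias=0.0):
    """Logistic readout of the weighted integrator states (an assumed form)."""
    z = sum(w * x for w, x in zip(weights, states)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def noisy_signal(n_steps, coherence, noise_sd, seed=0):
    """Constant drift `coherence` plus Gaussian noise, a stand-in for the stimulus."""
    rng = random.Random(seed)
    return [coherence + rng.gauss(0.0, noise_sd) for _ in range(n_steps)]


# Hypothetical settings: time constants spread geometrically from 2 to 64 steps.
taus = [2.0 ** k for k in range(1, 7)]
weights = [1.0] * len(taus)  # placeholder values; the agent would learn these

signal = noisy_signal(n_steps=200, coherence=0.2, noise_sd=1.0)
history = leaky_integrators(signal, taus)
p_decide = decision_probability(history[-1], weights)
```

Fast integrators (small `tau`) track the noise closely, while slow ones average it out; a weighted readout over all of them is what lets an agent favour moments when many time scales agree.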
Author summary
The rate at which sensory information is integrated prior to a decision needs to be versatile and adaptable to different situations. While driving can require quick reactions, evaluating the authenticity of a painting can require long observation, and consequently representations over multiple timescales appear intuitively necessary. Nevertheless, there is a lack of theoretical research that exploits multiple timescales, even though a variety of integration rates has been observed experimentally. In this work, we developed a decision-making model based on integrators with multiple characteristic times and analysed its behaviour on a highly volatile and biologically relevant task. Through trial and error and reward maximisation, the model discovers an effective strategy that is surprisingly different from, and more robust than, the more “classical” single time-scale approach. More importantly, the learnt strategy exhibits remarkable agreement with experimental findings, suggesting a fundamental role of multiple timescales in decision-making. Our model, despite being abstract, achieves a good degree of biological realism and performs robustly in different environments.
Competing Interest Statement
The authors have declared no competing interest.