ABSTRACT
Behavioral experience and flexibility are crucial for survival in a constantly changing environment. What are the neuronal processes that selectively transform dynamic sensory information into an appropriate behavioral response, and how do these processes adapt to changes in the environment? Here, we use voltage imaging to measure signals in primary somatosensory cortex (S1) during sensory learning and behavioral adaptation in the mouse. We found that in response to changing sensory stimulus statistics, mice adopt a task strategy that modifies their detection behavior in a context-dependent manner so as to maintain reward expectation. Correspondingly, neuronal activity in S1 shifts from simply representing stimulus properties to adaptively representing stimulus context in an experience-dependent manner. Our results suggest that neuronal signals in S1 are part of an adaptive and dynamic framework that facilitates flexible behavior as an individual gains experience.
INTRODUCTION
Survival in a constantly changing sensory environment requires a high degree of behavioral flexibility and experience. While much is known about how and where in the brain of human and non-human animals sensory signals are processed and where decision signals are accumulated1–3, far less is known about how behavioral strategies are formed with practice and experience and whether primary sensory areas are involved in this dynamic process. At the core of any learning process is the detection of sensory stimuli, which requires efficient neuronal coding of sensory features in the early sensory pathway. However, studies investigating the role of primary sensory cortex in visual4–6, auditory7–10 and somatosensory behaviors11,12 have reached conflicting conclusions about its function. This discrepancy could result from the complexity of the chosen behavioral paradigm, or it could arise because cortical signals are highly dynamic and context driven. More recent findings have shown that signals in primary somatosensory cortex can enhance stimulus selectivity with behavioral training13, fluctuate according to the behavioral state14, or even remap depending on downstream signals15. Together, these findings raise many questions and lead us to hypothesize that neuronal signals in primary sensory areas may be highly dynamic, context- or experience-dependent, and part of an adaptive framework.
Here, we investigate perceptual capabilities of the mouse vibrissa system during learning and behavioral adaptation. We hypothesize that signals in primary somatosensory cortex (S1) not only represent the strength of a sensory input but also play a key role in the transformation of context dependent behaviors. To test this hypothesis, we designed a series of psychophysical experiments evaluating behavioral performance and neuronal activity at different training stages. The training stages include gradual learning of a basic detection task and an advanced stage with changing sensory contingencies16. To repeatedly measure signals of large neuronal pools across training stages, we performed chronic wide-field imaging of S1 activity with the genetically encoded voltage indicator (GEVI) ‘ArcLight’ 17,18 in behaving mice.
We found that in response to changing statistical properties of the sensory stimulus, mice adopt a strategy that modifies their behavior so as to maintain reward in the face of these changes. Our results further reveal that S1 activity correlates with behavioral changes in a way that depends upon experience. During learning of the basic task, S1 sensitivity is mostly stimulus driven and uncorrelated with gradual changes in behavioral performance. However, once an animal reaches expert level and is trained to adapt to a change in stimulus statistics, neuronal activity dynamically shifts between changes in S1 sensitivity and decision criterion downstream. The change in S1 sensitivity is more pronounced at a later training stage, when the animal has already experienced the task modulation. Our findings suggest a translation of these context-dependent changes between different brain structures along the hierarchy, where S1 is not simply representing sensory stimulus properties, but instead reflecting an adaptive process as part of a behavioral strategy to engage changing stimulus contingencies in a changing sensory environment.
MATERIALS AND METHODS
Animals, surgery, and general procedures for behavioral testing
All experimental and surgical procedures were approved by the Georgia Institute of Technology Institutional Animal Care and Use Committee and were in agreement with guidelines established by the NIH. Subjects were seven male mice (C57BL/6, Jackson Laboratories), aged 4-6 weeks at time of implantation. The basic procedures of virus delivery, head-plate preparation and cortical imaging followed those published in a recent paper17. In the following text, only the procedures specific to this study are described in detail.
Virus delivery
At least four weeks prior to experimentation, mice were anesthetized with isoflurane (3% to 5%) in a small induction chamber and then placed on a heated platform (FHC, Inc.) to maintain body temperature, with a stereotaxic nose cone to maintain anesthesia. During surgery, the anesthesia level was adjusted to 1–1.5% to achieve a breathing rate of ~1 breath per second. For virus delivery, three small craniotomies (burr holes of 0.7 mm diameter) were created over the barrel field of the primary somatosensory cortex (S1) according to stereotaxic measurements taken from the bregma ([1 × 3 mm, 3 × 3 mm, 3 × 1 mm] bregma × lateral). One additional craniotomy and injection were performed over motor cortex (M1, −1 × 1 mm bregma × lateral). The virus was loaded into a neural syringe (Hamilton Neuros Syringe 700/1700). The injection needle was initially lowered to 1000 μm below the pia surface for pre-penetration and then retracted to the target depth of 500 μm, using a 10-μm resolution stereotaxic arm (Kopf, Ltd.). Following a 1-min delay to allow for tissue relaxation, each animal was injected with a total of 2 μl of adeno-associated virus (AAV)1-hsyn1-ArcLight-D-WPRESV40 (UPenn Viral Vector Core, AV-1-36857P) at a flow rate of 0.05 μl/min (0.5 μl at each of the four injection sites). After injection, the needle remained in place for an additional 5 min before slowly being removed from the brain. The craniotomies were left to close naturally. In all cases, the skull was sealed by suturing the skin. Throughout the experiment, sterile techniques were used to keep the injection area clean and free from infection. Additionally, opioid and non-steroidal anti-inflammatory analgesics were administered (SR-Buprenorphine 0.8-1 mg/kg, SC, pre-operatively and Ketoprofen 5-10 mg/kg, IP, post-operatively).
Head plate Implantation
At least four weeks after injection, a metal head-plate was secured to the skull in order to reduce vibration and allow head-fixation during imaging and behavior experiments. Following anesthetization and analgesia, a large incision was made over the skull. The connective tissue and muscles surrounding the skull were removed using a fine scalpel blade (Henry Schein #10). The custom titanium head-plate formed an open ring (half-moon shape with an inner radius of 5 mm) and was placed on top of the two hemispheres with an extended bar above the cerebellum (~10 mm, perpendicular to the midline of the skull). The extended bar was designed to attach to a stainless steel holder, serving the purpose of stable head-fixation. The head-plate was attached to the bone using a three-stage dental acrylic, Metabond (Parkell, Inc.). The Metabond was chilled using ice, slowly applied to the surface of the skull, and allowed to cure for 5 to 10 min. After securing the head-plate, the skull was cleaned and covered with a thin layer of transparent Metabond. During preparation for histological validation, the head-plate could not be separated from the attached skull and the brain was extracted by removing the lower jaw. The final head-plate and dental acrylic structure additionally created a well for mineral oil that helped maintain skull transparency for the upcoming imaging sessions. Mice were allowed to recover for at least 7 days before habituation training. After wound healing, subjects were housed together, with a maximum of three animals per group cage, and kept under a 12/12 h inverted light/dark cycle.
Whisker Stimulation
For whisker stimulation, a galvo-motor (galvanometer optical scanner model 6210H, Cambridge Technology) was used as described in a previous study19. The rotating arm of the galvo-motor contacted a single whisker on the right of the mouse’s face at 5 mm (±1 mm tolerance) distance from the skin, and thus directly engaged the proximal whisker shaft, largely overriding bioelastic whisker properties. All the remaining whiskers were trimmed to prevent them from being touched by the rotating arm. Across mice, different whiskers were chosen (M01: C1, M02: D2, M03: C1, M04: E2, M05: D1, M06: C1, M07: D1). Voltage commands for the actuator were programmed in Matlab and Simulink (Ver. 2015b; The MathWorks, Natick, Massachusetts, USA). A stimulus consisted of a single event, a sinusoidal pulse (half period of a 100 Hz sine wave, starting at one minimum and ending at the next maximum). The pulse amplitudes used (A = [0, 1, 2, 4, 8, 16]°, corresponding to maximal velocities Vmax = [0, 314, 628, 1256, 2512, 5023]°/s or mean velocities Vmean = [0, 204, 408, 816, 1631, 3262]°/s) were well within the range reported for frictional slips observed in natural whisker movement20,21.
Cortical GEVI Imaging
ArcLight-transfected mice were chronically imaged through the intact skull using a wide-field fluorescence imaging system to measure cortical spatial activity (MiCAM05-N256, Scimedia, Ltd.). Figure 1a shows the experimental apparatus and schematically describes the wide-field fluorescence microscope. During all imaging experiments, mice were awake and head-fixed. The head-plate was used as a well for mineral oil in order to keep the bone surface wet and maintain skull transparency. The skull was covered with a silicone elastomer (Kwik-Cast Sealant, World Precision Instruments) between imaging sessions for protection. The barrel cortex was imaged using a 256 × 256-pixel CMOS camera (Scimedia Model N256CM) at 200 Hz with a pixel size of 69 μm × 69 μm and an active imaging area of 11.1 × 11.1 mm, given a magnification of 0.63. Note that this resolution does not consider the scattering of the light in the tissue. During experimental imaging, the excitation light was left on continuously. The entire cortical area was illuminated at 465 nm with a 400-mW/cm2 LED system (Scimedia, Ltd.) to excite the ArcLight fluorophore. The excitation light was further filtered (472/30 nm bandpass filter, Semrock, Inc.) and projected onto the cortical surface using a dichroic mirror (cutoff: 495 nm, Semrock, Inc.). Collected light was filtered with a bandpass emission filter (520/35 nm, Semrock, Inc.). The imaging system was focused at ~300 μm below the cortical surface to target cortical layer 2/3. The first imaging session was used for identifying the barrel field in the awake and naïve animal at least four weeks after ArcLight viral injection. The barrel field was mapped by imaging the rapid response to a sensory stimulus given to a single whisker (A = 4-16°, or mean velocities respectively: V = 816-3262°/s).
We used two criteria to localize and isolate the barrel field: standard stereotaxic localization (~3 mm lateral, 0.5 to 1.5 mm caudal from bregma) and relative evoked spatial and temporal response (visible evoked activity 20 to 25 ms after stimulation). A single whisker was chosen if it elicited a clear response within the barrel field. All subsequent imaging experiments were centered on the same location, and the same whisker was chosen in repeated sessions for a given animal. Figure 1b shows the characteristic spread of ArcLight expression in an example coronal brain section. The same hemisphere is shown in vivo in Figure 1c with spontaneous fluorescence activity in S1 at the beginning of a behavioral session. Spontaneous and sensory evoked activity in S1 was imaged on every trial, from 2 s before to 2 s after the punctate whisker stimulus. Figure 1d shows frames with typical fluorescence activity patterns recorded at a frame rate of 200 Hz and shown separately for three different stimulus amplitudes (rows). The calculation of the ΔF/F0 metric, the region of interest (ROI) and the data processing are described under ‘Imaging analysis’.
Figure 1. a, Top: Schematic of the imaging system. The GEVI ‘ArcLight’ is expressed in superficial layers of S1. Bottom: Schematic of the behavior setup. b, Confocal image of a coronal mouse brain section (100 μm, right hemisphere) showing the characteristic spread of ArcLight in S1. Blue: DAPI staining, Green: ArcLight fluorescence. c, Top view of the same hemisphere in vivo showing ArcLight fluorescence in S1. d, Images of the same hemisphere in a trained mouse with ArcLight fluorescence color coded. Punctate stimuli (16 degree whisker angle) or catch trials (0 degree) were presented. Frames were captured at 200 Hz and depicted from stimulus onset onward. Each frame is normalized to the frame at stimulus delivery (ΔF/F0 = (F − F0)/F0). Shown are fluorescence responses averaged over an example imaging session (n = 24 trials per condition). A region of interest (red box, 434 × 434 μm) is located with its center at the peak fluorescence to extract and average voltage traces for further analysis. Scale bar = 1 mm. e, Go/No-Go detection task. A punctate stimulus (10 ms) has to be detected by the mouse with an indicator lick to receive reward. Reward is only delivered in hit trials. Impulsive licks in a 2 s period before trial onset are mildly punished by a time-out triggering a new inter-trial interval (4–10 s, gray arrow). f, Example traces of Go and No-Go trials (n = 284 each). Top: Readouts from the stimulator (Stim = 16 degree whisker angle or catch trial = 0 degree). Middle: Lick response histograms from a trained mouse. Bottom: Average voltage response from the region of interest of the same mouse. g, Different task versions under investigation. 1. Basic learning: Learning of the Go/No-Go task. 2. Adaptive behavior: Once animals have learned the basic task, they are challenged with multiple stimulus amplitudes and changes in the statistics of the stimulus distribution.
Figure 2. a, Learning curve for 3 mice trained on the basic Go/No-Go detection task with a weak stimulus (4 degree). Performance is expressed as the probability of a hit (lick after stimulus) and false alarm (lick after no stimulus) for a given session. Individual mice are represented by symbols, and the mean by bold lines. b, Fluorescent activity in S1 from an example mouse during learning of the task. Shown are frames at the peak response to the 4 degree stimulus; catch trials are shown below. Scale bar = 1 mm. c, d’ metrics for both behavioral and neuronal data during learning of the task. The d’behav was calculated from the observed hit rate and false alarm rate for a given training session. The d’neuro was calculated by comparing single trial distributions of all evoked signal peaks (maximum ΔF/F0 (%) within 100 ms post stimulus) with the corresponding noise distributions when no stimulus was present. Shown are means across all data for a given session; error bars represent bootstrapped estimates of 95% confidence limits. The dotted lines separate performance into naive (d’ < 0.8) and acquired (d’ > 1.5). The right panel shows the same data separated for individuals (symbols) before (naive) and after learning (acquired). Bars represent means across mice; error bars represent SD. * P < 0.05; ** P < 0.01; *** P < 0.001; ‘n.s.’ not significant, two-sided Wilcoxon rank-sum test or Kruskal-Wallis test.
Behavioral paradigm and training
During successive days of behavioral testing, water intake was restricted to the experimental sessions, where animals were given the opportunity to earn water to satiety. Testing was paused and water was available ad libitum on 2 days per week. Body weight was monitored daily, and was typically observed to increase or remain constant during training. In some cases, the body weight dropped slightly across successive training days due to a higher task difficulty. If the weight dropped by more than ~5 g, supplementary water was delivered outside training sessions to maintain the animal’s weight. Usually, 1-2 experiments comprising 50-250 trials were conducted per day. During behavioral testing, constant auditory white background noise (70 dB) was produced by an arbitrary waveform generator to mask any sound emission of the galvo-motor-based whisker actuator. All seven mice were trained on a standard Go/No-Go detection task (Fig. 1e) employing a protocol similar to that described before16,22-25. In this task, the whisker is deflected at intervals of 4-10 s (flat probability distribution) with a single pulse (detection target). A trial was categorized as a ‘hit’ (H) if the animal generated the ‘Go’ indicator response, a lick at a water spout within 1000 ms of target onset. If no lick was emitted, the trial counted as a ‘miss’ (M). In addition, catch trials were included, in which no deflection of the whisker occurred (A = 0°), and a trial was categorized as a ‘correct rejection’ (CR) if licking was withheld (‘No-Go’). However, a trial was categorized as a ‘false alarm’ (FA) if random licks occurred within 1000 ms of catch onset. Premature licking in a 2 s period before the stimulus was mildly punished by resetting time (‘time-out’) and starting a new inter-trial interval of 4-10 s duration, drawn at random from a flat probability distribution. Note that these time-out trials were excluded from the main data analysis.
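The trial categories above can be illustrated with a minimal sketch. The function and constant names are hypothetical, and lick times are assumed to be given in seconds relative to stimulus (or catch) onset; in the actual task a premature lick restarts the inter-trial interval, which is represented here simply by a separate 'timeout' label.

```python
from collections import Counter

RESPONSE_WINDOW_S = 1.0    # a lick within 1000 ms of onset counts as a 'Go' response
PRE_TRIAL_WINDOW_S = 2.0   # licks in the 2 s before onset trigger a time-out


def classify_trial(amplitude_deg, lick_times_s):
    """Label one trial as hit, miss, false_alarm, correct_rejection or timeout."""
    # Premature licks (assumed to be reported as negative times) exclude the trial.
    if any(-PRE_TRIAL_WINDOW_S <= t < 0.0 for t in lick_times_s):
        return "timeout"
    licked = any(0.0 <= t <= RESPONSE_WINDOW_S for t in lick_times_s)
    if amplitude_deg > 0:                      # stimulus trial
        return "hit" if licked else "miss"
    return "false_alarm" if licked else "correct_rejection"   # catch trial (A = 0)


def response_rates(trials):
    """Return (p_hit, p_fa) from (amplitude, lick_times) pairs, ignoring time-outs."""
    counts = Counter(classify_trial(a, licks) for a, licks in trials)
    n_stim = counts["hit"] + counts["miss"]
    n_catch = counts["false_alarm"] + counts["correct_rejection"]
    p_hit = counts["hit"] / n_stim if n_stim else float("nan")
    p_fa = counts["false_alarm"] / n_catch if n_catch else float("nan")
    return p_hit, p_fa


# Made-up example trials: (amplitude in degrees, lick times in seconds)
example = [(16, [0.3]), (16, []), (0, [0.5]), (0, [1.4]), (4, [-0.5, 0.2])]
p_hit, p_fa = response_rates(example)
```

In this example the last trial is labelled a time-out and does not enter either rate, mirroring the exclusion described in the text.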
The first step of behavioral training was systematic habituation to head-fixation and experimental chamber lasting for about one week. During the second training phase, a single whisker deflection with fixed amplitude was presented interspersed by catch trials (Pstim=0.8, Pcatch=0.2). Immediately following stimulus offset, a droplet of water became available at the waterspout to condition the animal’s lick response thereby shaping the stimulus-reward association. Once subjects showed stable and immediate consumption behavior (usually within 1-2 sessions), water was only delivered after an indicator lick of the spout within 1000 ms, turning the task into an operant conditioning paradigm in which the response is only reinforced by reward if it is correctly emitted after the stimulus. Subsequent experiments were performed systematically and the behavioral performance was measured with simultaneous GEVI imaging. The different experiments are described in detail in the following section.
Basic learning
Learning was studied once an animal had entered the operant phase of training after the basic habituation procedure. From this point forward, experiments were conducted with equal conditions across sessions and without manual interference by the experimenter. To assess differences in learning based on stimulus strength, animals were separated into two groups: group 1 (M01-M03) receiving only one low amplitude stimulus and catch trials (A = [4 0]°) and group 2 (M04-M06) receiving only one high amplitude stimulus and catch trials (A = [16 0]°). Performance metrics are described under data analysis and statistics.
Adaptive behavior
After mice had learned the basic detection task, the psychometric curve was measured in a subgroup of animals (M04-M07) using the method of constant stimuli, which entails the presentation of repeated stimulus blocks containing multiple stimulus amplitudes. On a single trial, one out of multiple possible stimulus amplitudes was presented after a variable time interval (4-10 s), each with equal probability (uniform distribution, p = 0.2). A stimulus block consisted of a trial sequence comprising all stimuli and a catch trial in pseudorandom order (i.e., each type once per block). A behavioral session consisted of repeated stimulus blocks until the animal disengaged from the task, i.e. when it did not generate lick responses for at least an entire stimulus block. Therefore, the chosen stimulus occurred repetitively but randomly within a session. An experiment consisted of multiple sessions comprising two conditions, where each condition is defined by a different stimulus distribution. A condition was always kept constant within and across multiple behavioral sessions before the task was changed. In the ‘high range’ condition, four stimulus amplitudes plus a catch trial were used (A = [0, 2, 4, 8, 16]°) and presented in multiple successive sessions. Following this, a new set of four stimulus amplitudes plus a catch trial was presented (A = [0, 1, 2, 4, 8]°), forming the ‘low range’ condition. Both stimulus distributions shared three of the four stimulus amplitudes (2°, 4° and 8°); however, the largest stimulus amplitude of the high range condition (16°) was not part of the low range, and vice versa, the smallest amplitude of the low range condition (1°) was not part of the high range. In order to test the reversibility of potential behavioral and neuronal changes, another high range condition was presented after the animals had completed the low range condition. This reversal of conditions also served the purpose of testing an animal’s experience level at multiple transitions (first, second etc.).
Each condition consisted of 8-10 sessions performed over approximately 5 days. Each animal performed 1-2 sessions per day with a minimal break of 3 hours in between.
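The block structure of the method of constant stimuli can be sketched as follows. This is an illustration only, not the authors' Matlab/Simulink implementation: the function names, the fixed number of blocks and the random seed are assumptions, and the real sessions ended when the animal disengaged rather than after a fixed block count.

```python
import random

HIGH_RANGE = [0, 2, 4, 8, 16]   # degrees; 0 is the catch trial
LOW_RANGE = [0, 1, 2, 4, 8]


def make_block(amplitudes, rng):
    """One stimulus block: every amplitude (and the catch trial) once, shuffled."""
    block = list(amplitudes)
    rng.shuffle(block)
    return block


def make_session(amplitudes, n_blocks, rng):
    """Concatenate blocks; each trial gets an inter-trial interval drawn from U(4, 10) s."""
    trials = []
    for _ in range(n_blocks):
        for amp in make_block(amplitudes, rng):
            trials.append((amp, rng.uniform(4.0, 10.0)))
    return trials


rng = random.Random(0)                      # fixed seed, for reproducibility only
session = make_session(HIGH_RANGE, n_blocks=20, rng=rng)
shared = sorted((set(HIGH_RANGE) & set(LOW_RANGE)) - {0})   # amplitudes common to both conditions
```

Because every block contains each trial type exactly once, each amplitude and the catch trial occur with probability 0.2 over a session.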
Data analysis and statistics
Behavior
The learning curve was measured by calculating a d’ metric, d’behav, from the observed hit rate and false alarm rate of each training session:

$$d'_{behav} = Z(p_{hit}) - Z(p_{fa})$$
where the function Z(p), p ∈ [0,1], is the inverse of the cumulative distribution function of the Gaussian distribution. A criterion of d’ = 2.3 (calculated with p(hit) = 0.95 and p(fa) = 0.25) was used to determine the end of the basic learning period, and learning progress was separated into three equally distributed stages: naive (d’ = 0-0.8), intermediate (d’ = 0.8-1.5) and expert level (d’ = 1.5-2.3). Psychometric data were assessed as response probabilities averaged across sessions within a given stimulus condition. This was done separately for each of the seven animals. Psychometric curves were fit using Psignifit26–28. Briefly, a constrained maximum likelihood method was used to fit a modified logistic function with 4 parameters: α (the displacement of the curve), β (related to the inverse of the slope of the curve), γ (the lower asymptote or guess rate), and λ (the lapse rate, which sets the upper asymptote at 1 − λ) as follows:

$$P(GO \mid s_i) = \gamma + (1 - \gamma - \lambda)\,\frac{1}{1 + e^{-(s_i - \alpha)/\beta}}$$
where si is the stimulus on the ith trial. Response thresholds were calculated from the average psychometric function for a given experimental condition using Psignifit. The term “response threshold” refers to the inverse of the psychometric function at some particular performance level with respect to the stimulus dimension. Throughout this study, we use a performance level of 50% (probability of detection = 0.5). Statistical differences between psychophysical curves were assessed using bootstrapped estimates of 95% confidence limits for the response thresholds provided by the Psignifit toolbox.
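As a rough illustration of the two behavioral metrics, the sketch below computes d’behav and evaluates and inverts a four-parameter logistic psychometric function. It assumes the standard logistic form γ + (1 − γ − λ)/(1 + exp(−(s − α)/β)); the fitting itself was done with the Psignifit toolbox and is not reproduced here, and the function names and parameter values are made up.

```python
import math
from statistics import NormalDist


def d_prime_behav(p_hit, p_fa):
    """d' = Z(p_hit) - Z(p_fa), with Z the inverse standard normal CDF.

    Rates of exactly 0 or 1 are outside the domain of Z and raise an error.
    """
    z = NormalDist().inv_cdf
    return z(p_hit) - z(p_fa)


def psychometric(s, alpha, beta, gamma, lam):
    """Assumed logistic form of P(GO|s) with guess rate gamma and lapse rate lam."""
    return gamma + (1.0 - gamma - lam) / (1.0 + math.exp(-(s - alpha) / beta))


def response_threshold(alpha, beta, gamma, lam, level=0.5):
    """Stimulus amplitude at which P(GO|s) equals `level` (inverse of the curve)."""
    f = (level - gamma) / (1.0 - gamma - lam)   # required value of the logistic core
    if not 0.0 < f < 1.0:
        raise ValueError("performance level lies outside the range of the curve")
    return alpha + beta * math.log(f / (1.0 - f))


criterion = d_prime_behav(0.95, 0.25)          # end-of-learning criterion from the text
params = dict(alpha=6.0, beta=1.5, gamma=0.1, lam=0.05)   # hypothetical fit parameters
threshold = response_threshold(**params)       # amplitude giving P(GO) = 0.5
```

With p(hit) = 0.95 and p(fa) = 0.25 the first function returns approximately 2.3, in line with the learning criterion quoted above.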
Reward accumulation
Let the stimulus amplitude delivered on the ith trial be denoted as si, the corresponding reward as ri (since ri is a fixed value, it can simply be termed r), and the accumulated reward for N trials as RN. Over N trials, the expected accumulated reward is

$$E\{R_N\} = N\, r \sum_{s_i > 0} P(s_i)\, P(GO \mid s_i)$$
where P(si) comes from the experimentally controlled stimulus distribution, P(GO|si) is the probability of a positive response (or “Go”) for the given stimulus amplitude, and E{} denotes statistical expectation.
We considered the null hypothesis of this behavioral paradigm to be that animals do not adapt their behavior in response to an experimentally forced change in stimulus distribution and thus operate from the same psychometric curve (represented as dotted curves in Figure 3b). Note that this corresponds to the same curve but for a different range of stimuli (P(GO|si) across different si). For example, in moving from the high range to the low range stimulus condition, this would result in a decrease in the total accumulated reward for the same number of trials.
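A minimal numerical sketch of the null hypothesis follows. It assumes that reward is earned only on stimulus trials (hits), that all trial types are equally likely, and that P(GO|s) follows a logistic psychometric function; the parameter values are hypothetical and chosen only for illustration.

```python
import math

HIGH_RANGE = [0, 2, 4, 8, 16]   # degrees; 0 is the catch trial
LOW_RANGE = [0, 1, 2, 4, 8]


def psychometric(s, alpha, beta, gamma, lam):
    """Assumed logistic form of P(GO|s)."""
    return gamma + (1.0 - gamma - lam) / (1.0 + math.exp(-(s - alpha) / beta))


def expected_reward(amplitudes, params, n_trials, reward=1.0):
    """E{R_N} = N * r * sum over stimuli of P(s) * P(GO|s).

    Catch trials (s = 0) are part of the distribution but never rewarded.
    """
    p_s = 1.0 / len(amplitudes)             # uniform stimulus distribution
    return n_trials * reward * sum(
        p_s * psychometric(s, **params) for s in amplitudes if s > 0
    )


params = dict(alpha=6.0, beta=1.5, gamma=0.1, lam=0.05)   # hypothetical animal
reward_high = expected_reward(HIGH_RANGE, params, n_trials=100)
reward_low_h0 = expected_reward(LOW_RANGE, params, n_trials=100)   # same curve, new stimuli
```

Because each low-range amplitude is no larger than its high-range counterpart and the curve is monotonic, `reward_low_h0` is smaller than `reward_high`, which is the reward loss predicted under H0.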
Figure 3. a, Manipulation of stimulus distribution range. Every stimulus and catch trial (not shown) is presented with equal probability (p = 0.2). The design involves amplitudes common to both high-range (magenta) and low-range (green) conditions. b, Psychometric curves and response thresholds for an example animal (M04) working in both conditions. Each dot corresponds to response probabilities from a single session. Solid curves are logistic fits to the average data (n = 10-11 sessions). Dotted line is a hypothetical curve assuming no change in performance (H0). Dashed line is a hypothetical curve assuming a change in performance to maintain reward (H1) when switching from high range to low range stimulus distributions. Response thresholds are shown as vertical lines with 95% confidence limits. Inset. Response thresholds of all mice (grey symbols). Bars represent means across mice with SD. c, Number of rewards (correct trials) accumulated by the same animal from b. Each line corresponds to one session. Inset. Average total reward number per session for each mouse. The average number of trials is shown on top. Figure conventions are the same as in b. d, Frames of evoked cortical fluorescence activity from two example sessions (M04), one with a high range stimulus distribution (top) and the other with a low range stimulus distribution (bottom). The frames are aligned for amplitudes common to both datasets. Data with a deflection angle of 8 degrees was chosen for further analysis (outlined box). e, Temporal fluorescent signal in response to 8 degree stimulation, extracted from the region of interest and averaged across sessions and mice. PSTHs of behavioral lick responses are shown on top. f, Observer model based on signal detection theory. Behavioral adaptation can either be induced by changes in sensitivity (d’) intrinsic to S1, changes in decision criterion by a downstream observer, or both. g, Distributions of evoked trials (signal) and catch trials (noise) computed separately for the high and low range condition. Dashed lines indicate mean ΔF/F0, black numbers indicate d’ metrics. Blue indicates criterion metrics as computed from ROC curves. Statistical difference was assessed using bootstrapped estimates of 95% confidence limits for the d’ and criterion. * P < 0.05; ** P < 0.01; *** P < 0.001; ‘n.s.’ not significant, two-sided Wilcoxon rank-sum test or Kruskal-Wallis test.
As an alternative hypothesis, one possible strategy the animal could take in response to a change in the stimulus distribution would be to adjust behavior to maintain the same amount of accumulated reward during a session. For example, in moving from high range stimuli to low range stimuli, the accumulated reward would be assumed fixed, and we can determine a new set of probabilities P(GO|si) that define an adapted psychometric function. Note that there is not a unique solution, but one simple possibility is that the original psychometric function maintains the same asymptotes (γ and λ) and false alarm rate but is compressed, with a decrease in response threshold and an increase in slope to maintain the same total accumulated reward. We denote this situation as our hypothetical psychometric function, represented as dashed curves in Figure 3b.
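One way to make the compressed psychometric function concrete is sketched below. It assumes a logistic P(GO|s) and models the compression by dividing both α and β by a common factor c, which lowers the threshold, steepens the slope, and leaves the asymptotes and the false alarm rate unchanged. The factor is found by bisection so that the expected reward per trial in the low range matches that in the high range. This is one possible solution, not necessarily the one used by the authors, and all names and parameter values are hypothetical.

```python
import math

HIGH_RANGE = [0, 2, 4, 8, 16]
LOW_RANGE = [0, 1, 2, 4, 8]


def psychometric(s, alpha, beta, gamma, lam):
    """Assumed logistic form of P(GO|s)."""
    return gamma + (1.0 - gamma - lam) / (1.0 + math.exp(-(s - alpha) / beta))


def reward_per_trial(amplitudes, params):
    """Expected reward per trial with unit reward; catch trials are never rewarded."""
    return sum(psychometric(s, **params) for s in amplitudes if s > 0) / len(amplitudes)


def compress(params, c):
    """Compress the curve along the stimulus axis by a factor c (alpha/beta ratio kept)."""
    return dict(params, alpha=params["alpha"] / c, beta=params["beta"] / c)


def solve_compression(params, high, low, lo=1.0, hi=10.0, tol=1e-9):
    """Bisection for the c at which low-range reward equals high-range reward."""
    target = reward_per_trial(high, params)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if reward_per_trial(low, compress(params, mid)) < target:
            lo = mid        # not enough reward yet: compress further
        else:
            hi = mid
    return 0.5 * (lo + hi)


params_high = dict(alpha=6.0, beta=1.5, gamma=0.1, lam=0.05)   # hypothetical animal
c = solve_compression(params_high, HIGH_RANGE, LOW_RANGE)
params_h1 = compress(params_high, c)      # adapted curve under H1
```

Because every low-range amplitude is exactly half of a high-range amplitude, the solution here is c = 2: halving the threshold restores the original reward rate while the response probability at s = 0 stays the same.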
Imaging analysis
All voltage imaging data were analyzed using custom-written image-analysis software (MATLAB 2015a, Mathworks, Inc.). The specific methods of processing the ArcLight raw fluorescence signal and basic data analysis followed those of a recent study from our laboratory17. Briefly, raw images were loaded and converted from the proprietary file format of the imaging system using custom scripts. Due to the natural decay of the fluorescent signal caused by photobleaching, each trial was first normalized to a baseline and reported as a percent change in fluorescent activity (%ΔF/F0). The ΔF/F0 measurement was calculated by subtracting and dividing each trial’s fluorescence F(x, y, t) by the frame preceding the stimulus delivery:

$$\frac{\Delta F}{F_0}(x, y, t) = \frac{F(x, y, t) - F_0(x, y)}{F_0(x, y)}$$
where F0(x,y) is the frame of stimulus delivery (F0 = F at t = 0). Note that an extended analysis was performed with different normalization methods by subtracting and dividing each trial’s fluorescence F by the fluorescence averaged across different time windows before stimulus onset (t = −50 to 0 ms, −100 to 0 ms, −200 to 0 ms, etc.). Increasing the normalization window slightly altered the magnitude and variance of the evoked fluorescence change; however, varying the normalization window did not affect the adaptive cortical response reported in this study (Suppl. Fig. 3).
A single region of interest (ROI) was identified as the 10 × 10 pixel (434 × 434 μm) area with the largest response 25 ms following stimulus onset. The average activity within this region was extracted across all frames to compute the temporal dynamics of the fluorescent signal. Note that, owing to the properties of the fluorophore18, positive changes in membrane potential correspond to a decrease in ArcLight fluorescent activity. In line with our previous study17, all traces have been inverted to show a positive increase in fluorescence. Fluorescent voltage traces and behavioral lick responses were acquired within the same time window and aligned with regard to stimulus onset (Fig. 1f).
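The normalization, ROI selection and trace inversion can be sketched on a small synthetic movie. This is a plain-Python illustration, not the authors' MATLAB pipeline: the movie is assumed to be indexed as movie[t][y][x], the frame rate of 200 Hz places the 25 ms frame five frames after stimulus delivery, and the exhaustive window search, function names and test data are all assumptions.

```python
def delta_f_over_f0(movie, stim_frame):
    """Percent dF/F0 per pixel, normalized to the frame at stimulus delivery."""
    f0 = movie[stim_frame]
    return [[[100.0 * (f - f0[y][x]) / f0[y][x] for x, f in enumerate(row)]
             for y, row in enumerate(frame)] for frame in movie]


def invert(dff):
    """ArcLight fluorescence drops with depolarization, so traces are sign-inverted."""
    return [[[-v for v in row] for row in frame] for frame in dff]


def find_roi(frame, size=10):
    """Top-left corner (y, x) of the size x size window with the largest summed response."""
    best_sum, best_yx = float("-inf"), (0, 0)
    for y in range(len(frame) - size + 1):
        for x in range(len(frame[0]) - size + 1):
            total = sum(frame[yy][xx]
                        for yy in range(y, y + size) for xx in range(x, x + size))
            if total > best_sum:
                best_sum, best_yx = total, (y, x)
    return best_yx


def roi_trace(dff, roi_yx, size=10):
    """Mean response inside the ROI for every frame."""
    y0, x0 = roi_yx
    return [sum(frame[y][x] for y in range(y0, y0 + size) for x in range(x0, x0 + size))
            / (size * size) for frame in dff]


# Synthetic 20 x 20 pixel movie, 12 frames: a 2% fluorescence drop in a 10 x 10 patch.
STIM_FRAME, FRAMES_TO_25_MS = 2, 5              # 5 frames x 5 ms = 25 ms at 200 Hz
movie = [[[1000.0] * 20 for _ in range(20)] for _ in range(12)]
for t in range(STIM_FRAME + 3, 12):
    for y in range(5, 15):
        for x in range(3, 13):
            movie[t][y][x] = 980.0

dff = invert(delta_f_over_f0(movie, STIM_FRAME))
roi = find_roi(dff[STIM_FRAME + FRAMES_TO_25_MS])
trace = roi_trace(dff, roi)
```

In this synthetic example the ROI lands on the simulated patch and the inverted trace rises from 0% to 2% after stimulus onset.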
Ideal observer analysis
To quantify the fluorescence signal over the course of learning, a metric d’neuro was computed. For a given day or session, single trial distributions of evoked signal peaks (maximum %ΔF/F0 within 100 ms post stimulus) are compared to the corresponding noise distributions when no stimulus was present. d’neuro is then defined as:

$$d'_{neuro} = \frac{\mu_S - \mu_N}{\sqrt{\tfrac{1}{2}\,(v_S + v_N)}}$$
where μS and μN are the means and vS and vN are the variances of the signal and noise distributions. d’neuro was then directly compared with the behavioral learning curve derived from d’behav. The same analysis was repeated for the data acquired with changing stimulus statistics, with the exception that only trials with an intermediate stimulus amplitude shared between the high and the low range condition were used. Note that the chosen stimulus of 8-degree whisker angle represents the midpoint of the high range and the upper limit of the low range condition. Distributions consisting of all evoked trials, termed ‘signal’ (peak ΔF/F0 with 8°), and catch trials, termed ‘noise’ (peak ΔF/F0 with 0°), were computed separately for the high and low range condition with one switch from high to low or vice versa. Within a given condition, trials were concatenated for consecutive sessions and data from all four mice were used to calculate d’neuro and d’behav. Statistical differences between distributions were assessed using bootstrapped estimates of 95% confidence limits for the d’ metrics. This means resampling the d’ from a given session with replacement 1000 times, taking the average of each resampled dataset, and then taking the interval that spans the central 95% of this distribution of averages across resampled datasets. Significance values were further estimated with a non-parametric Wilcoxon rank-sum or Kruskal-Wallis test. Throughout this manuscript, ‘*’ indicates p < 0.05; ‘**’ indicates p < 0.01; ‘***’ indicates p < 0.001; and ‘n.s.’ indicates ‘not significant’.
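The ideal observer metric and the bootstrap procedure described above can be sketched as follows. The sketch assumes the common definition d’ = (μS − μN)/sqrt((vS + vN)/2) with sample variances, and a percentile bootstrap over a list of d’ values; the function names, seed and example numbers are made up.

```python
import random
from statistics import mean, variance


def d_prime_neuro(signal_peaks, noise_peaks):
    """Separation of signal and noise peak distributions in pooled-SD units."""
    pooled_sd = ((variance(signal_peaks) + variance(noise_peaks)) / 2.0) ** 0.5
    return (mean(signal_peaks) - mean(noise_peaks)) / pooled_sd


def bootstrap_ci(values, n_boot=1000, alpha=0.05, seed=0):
    """Central (1 - alpha) interval of the means of datasets resampled with replacement."""
    rng = random.Random(seed)
    n = len(values)
    means = sorted(mean(rng.choices(values, k=n)) for _ in range(n_boot))
    lower = means[int((alpha / 2.0) * n_boot)]
    upper = means[int((1.0 - alpha / 2.0) * n_boot) - 1]
    return lower, upper


# Made-up peak dF/F0 values (%) for stimulus trials and catch trials
example_d = d_prime_neuro([2.0, 3.0, 4.0], [0.0, 1.0, 2.0])

# Made-up d' values to be resampled
d_values = [1.0, 1.5, 2.0, 2.5, 3.0]
ci_low, ci_high = bootstrap_ci(d_values)
```

For the first example both distributions have unit variance and means of 3 and 1, giving d’ = 2.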
Cascade framework
This analytical framework was created to describe the correspondence between the sensory input distribution, the neuronal response function derived from the GEVI signal in S1, and the behavioral readout. The experimentally controlled stimulus amplitude on a given trial si is drawn randomly from the input distribution. The evoked GEVI signal in S1 can then be expressed as a stimulus response function G(si). To establish a link between G(si) and the behavior, a mathematical function f(.) was created that transforms G(si) to a probability of a lick response P(GO|si), i.e. the psychometric function, as illustrated in Figure 4a. In other words, f(.) represents a matching of the evoked fluorescence to the lick response, given a particular stimulus. Combining both G(si) and f(.) results in a function f[G(si)], as an estimate of the psychometric curve, P̂(GO|si), that can then be directly compared to the actual psychometric curve P(GO|si). This approach can be applied across the different stimulus conditions (high versus low), and comparisons made across the corresponding G(si) and f(.) functions.
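The cascade can be illustrated by composing two functions. The functional forms below are assumptions chosen for illustration only: a saturating stimulus-response function for G(s) and a logistic linker for f(.); the source specifies only that G is a curve fitted to the fluorescence data and that f maps fluorescence to lick probability. All parameter values are hypothetical.

```python
import math


def g_response(s, g_max, s_half):
    """Hypothetical saturating stimulus-response function G(s), in %dF/F0."""
    return g_max * s / (s + s_half)


def linker(g, g_mid, g_scale, gamma, lam):
    """Hypothetical linker f(.): maps evoked fluorescence to a lick probability."""
    return gamma + (1.0 - gamma - lam) / (1.0 + math.exp(-(g - g_mid) / g_scale))


def predicted_psychometric(s, g_params, f_params):
    """Cascade estimate of P(GO|s), i.e. f[G(s)]."""
    return linker(g_response(s, **g_params), **f_params)


# Hypothetical parameters for one stimulus condition (e.g. the high range)
g_high = dict(g_max=1.2, s_half=6.0)
f_high = dict(g_mid=0.5, g_scale=0.1, gamma=0.1, lam=0.05)

amplitudes = [0, 2, 4, 8, 16]
prediction = [predicted_psychometric(s, g_high, f_high) for s in amplitudes]
```

The resulting `prediction` values can be set against measured response probabilities for the same amplitudes; repeating the composition with a second parameter set would give the corresponding estimate for the other condition.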
a, Model incorporating the sensory input distribution, the neuronal response function derived from the GEVI signal in S1, and the behavioral readout. The experimentally controlled stimulus amplitude on a given trial si is drawn randomly from the input distribution. The evoked GEVI signal in S1 can then be expressed as a stimulus response function G(si). To establish a link between G(si) and the behavior, a mathematical function f(.) is created that transforms G(si) to a probability of a lick response P(GO|si), i.e. the psychometric function. Combining G(si) and f(.) results in a function f[G(si)], an estimate of the psychometric curve that can then be directly compared to the actual psychometric curve P(GO|si). b, Example functions GHigh(si) and GLow(si) of an experienced mouse challenged with a change in stimulus statistics. Data points represent fluorescence changes (ΔF/F0) in response to different stimulus amplitudes averaged across sessions. The functions GHigh(si) and GLow(si) are represented by curves fitted to the data. c, Linker function computed separately for both conditions, fHigh(.) and fLow(.). d, Combination of G(si) and f(.). The predictions from the model fHigh[GHigh(si)] and fLow[GLow(si)] (dotted curves) are superimposed on the measured psychometric functions PHigh(GO|si) and PLow(GO|si) (solid curves).
The estimate fHigh[GLow(si)] is only influenced by changes in G(si), and thus serves as a null test for the prediction based on changes in neural activity in S1 alone. In going from the high to the low range condition, the fraction explained by G(si) can be estimated by comparing the area under the curve of fHigh[GLow(si)] with the area under the curve of fLow[GLow(si)].
To quantitatively estimate how much of the behavioral variation can be explained by S1 activity versus downstream processing, the following control was performed. G(si) is considered to change between the high and the low range condition, as observed experimentally, from GHigh(si) to GLow(si). To capture how much of the observed change in behavior is explained solely by the change in G(si) in the transition from the high to the low condition, f(.) is held constant, operating from a function fHigh(.) that only reflects the high range condition. The combination of GLow(si) and fHigh(.) then produces an estimated psychometric function, fHigh[GLow(si)], that is only influenced by changes in G(si), and thus serves as a null test for the prediction based on changes in neural activity in S1 alone. The fraction explained by G(si), and thus by S1, can be estimated by comparing the area under the curve of fHigh[GLow(si)] with the area under the curve of fLow[GLow(si)]. The remainder is attributed to processing downstream of S1.
All curves were fit using the psignifit toolbox26. To rule out the possibility of poor fitting in the cascade framework, goodness of fit was assessed using the deviance metric (D) together with the position of that deviance within the bootstrapped cumulative probability distribution of this error metric (CPE). Due to the steep increase in ΔF/F0 at lower stimulus amplitudes, we find that a Weibull function provides the best fit for G(si) (example G(si) in Fig. 4b, Dhigh = 0.12, CPEhigh = 0.06; Dlow = 1.03, CPElow = 0.46). Both the linker function f(.) and the psychometric function are best fit by a logistic function due to the sigmoid shape of the data (example f(.) in Fig. 4c, Dhigh = 0.6, CPEhigh = 0.54; Dlow = 1.57, CPElow = 0.67).
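The cascade and its null test can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the fitting procedure used here: the psignifit toolbox is not called, all parameter values are hypothetical, the areas are taken on a linear stimulus axis, and the normalization of the "fraction explained" to the high range baseline fHigh[GHigh(si)] is one possible reading of the area comparison described above.

```python
import numpy as np

def weibull(s, scale, shape, amplitude):
    """G(s): saturating Weibull response (dF/F0 in %) to stimulus amplitude s."""
    s = np.asarray(s, dtype=float)
    return amplitude * (1.0 - np.exp(-(s / scale) ** shape))

def logistic(g, midpoint, slope, guess=0.1, lapse=0.05):
    """f(.): logistic linker mapping evoked fluorescence g to P(GO)."""
    g = np.asarray(g, dtype=float)
    return guess + (1.0 - guess - lapse) / (1.0 + np.exp(-slope * (g - midpoint)))

def area(curve, s):
    """Trapezoidal area under a curve sampled at stimulus amplitudes s."""
    return float(np.sum(np.diff(s) * (curve[1:] + curve[:-1]) / 2.0))

def fraction_explained_by_s1(g_high, g_low, f_high, f_low, s):
    """Share of the change in area attributable to the change in G alone.

    Assumption: areas are referenced to the high range prediction.
    """
    base = area(f_high(g_high(s)), s)  # f_High[G_High(s)]
    null = area(f_high(g_low(s)), s)   # f_High[G_Low(s)], only G changes
    full = area(f_low(g_low(s)), s)    # f_Low[G_Low(s)], G and f change
    return (null - base) / (full - base)

# Hypothetical parameters, chosen only to make the example run.
s = np.linspace(0.0, 16.0, 161)
g_high = lambda x: weibull(x, scale=6.0, shape=1.2, amplitude=0.8)
g_low = lambda x: weibull(x, scale=4.0, shape=1.2, amplitude=1.0)
f_high = lambda g: logistic(g, midpoint=0.4, slope=10.0)
f_low = lambda g: logistic(g, midpoint=0.3, slope=10.0)
s1_fraction = fraction_explained_by_s1(g_high, g_low, f_high, f_low, s)
```

Under this normalization the value is 1 when the linker function does not change between conditions and 0 when G(si) does not change.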
RESULTS
The current study investigates learning and experience dependent adaptation in the mouse vibrissa system. Our main goal here is to establish a relationship between controlled whisker inputs, S1 activity, and behavior output, firstly, during basic detection learning and secondly, during flexible adaptation to changing sensory contingencies. To repeatedly measure signals of large neuronal pools at different training stages, we performed chronic wide-field imaging of S1 activity with the genetically encoded voltage indicator (GEVI) ‘ArcLight’ in behaving mice.
Figure 1 outlines the experimental design and summarizes the basic neuronal and behavioral metrics of this study. Seven mice were transfected with the GEVI ArcLight before being imaged through the skull using a wide-field fluorescence microscope (Fig. 1a). We have previously described this imaging technique in detail17. Figure 1b shows the characteristic spread of ArcLight expression in an example coronal brain section after a subject had undergone all tests and its brain was extracted for histological validation. The same hemisphere is shown in vivo in Figure 1c with spontaneous fluorescence activity in S1 at the beginning of a behavioral session. Figure 1d shows sequential frames of typical fluorescence activity patterns of a trained mouse recorded at a frame rate of 200 Hz and separated into catch trials (top row) and stimulation trials (bottom row) in response to a single whisker deflection (see methods for details about GEVI imaging).
All transfected mice were first trained on a tactile Go/No-Go detection task as described previously16,22,23,29,30. Briefly, this task requires animals to detect pulse-shaped deflections of a single whisker and report the decision by either generating a lick on a waterspout (‘Go’) or by withholding licking (‘No-Go’) if no stimulus is present (Fig. 1e). The trained mouse shows a stereotypical lick response pattern and a clear cortical fluorescence response to stimulation trials (Fig. 1f). Note, the temporal resolution of ArcLight allows the disentangling of the sensory response from potential motor related signals caused by licking (at 200-300 ms post stimulus).
To investigate behavior and neuronal signals at different training stages, we split the experiments into two main parts (Fig. 1g). The first part was designed to evaluate learning from one session to another, covering both behavior and neuronal S1 data in mice acquiring the basic principles of the Go/No-Go task, which we refer to here as “basic learning”. The second part describes S1 activity when mice are challenged with a change in sensory statistics. We refer to this part as “adaptive behavior”. In contrast to the first part, these experiments employed multiple whisker deflection amplitudes presented randomly and we systematically manipulated the statistical distribution of the stimulus.
S1 responses during basic learning
Naive subjects received single whisker stimulation at intervals of 4-10 s with a single pulse or catch trial. Learning was measured by calculating the hit rate p(hit) and false alarm rate p(fa) of successive training sessions, with one session performed each day. A criterion of p(hit) = 0.95 and p(fa) = 0.25 was used to determine successful acquisition of the task. Figure 2a shows hit and false alarm rates for three mice trained on a weak stimulus (A = 4°). A weak stimulus was chosen to make the task difficult, thereby requiring more training and facilitating sufficient data collection, especially during early learning. Indeed, with this training protocol, subjects required extended exposure to the task, with more than 30 sessions to achieve successful performance. Figure 2b shows frames of cortical fluorescence activity from an example mouse over the course of learning. Each square represents a top view of the same cortical hemisphere on selected training days with the average fluorescence activity (ΔF/F0 in %) at the peak of the sensory evoked signal (A = 4°, top row) or during catch trials (A = 0°, bottom row). Qualitative inspection of the fluorescence signal regarding its magnitude and spread in cortical space reveals a high level of variability from one day to another, but a systematic change with learning progress (e.g. a signal increase or decrease) cannot be identified. We further quantitatively evaluated the fluorescence signal over the course of learning by computing a neurometric sensitivity measure d’neuro. Single trial distributions of evoked signal peaks are compared to the corresponding noise distributions when no stimulus is present. The d’neuro for a given day or session is then calculated by subtracting the means of the distributions and normalizing the difference by a variance term (see methods). To compare this metric to the behavior, a d’behav was calculated from the observed behavioral hit and false alarm rate of each training session (see methods). 
Data from all three mice were used for this analysis. Figure 2c shows the d’neuro (orange) along with the behavioral learning curve d’behav (black), across all sessions and mice. The dashed lines separate performance into ‘naive’ (d’<0.8), and ‘acquired’ (d’>1.5). In contrast to the behavior, the average neurometric sensitivity remains at a relatively constant level, representing a stable signal to noise relationship independent of the continuing learning progress. By separating the data into naive and acquired (Fig. 2c, Inset) across the group of mice, this finding can further be confirmed statistically (d’neuro naive vs. acquired, p>0.1, Wilcoxon rank-sum test). Note that a second group of animals was trained on a much stronger stimulus for comparison (A = 16°). Those mice achieved much higher hit rates at the beginning of training and reached successful task acquisition in half the time (Suppl. Fig. 1a); however, the neurometric sensitivity was likewise independent of learning progress (Suppl. Fig. 1b-c). We conclude that neuronal sensitivity in S1 measured from a large population of neurons does not change during basic learning.
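For reference, a behavioral d’ can be obtained from a session's hit and false alarm rates using the standard signal detection relation d’ = z(p(hit)) − z(p(fa)), where z is the inverse of the standard normal cumulative distribution. The sketch below is a minimal illustration; the function name `d_prime_behav` and the clipping constant `eps` (used to keep rates of exactly 0 or 1 finite) are assumptions, and the exact computation used in this study is the one given in the methods.

```python
from statistics import NormalDist

def d_prime_behav(p_hit, p_fa, eps=0.01):
    """Behavioral sensitivity d' = z(p_hit) - z(p_fa).

    Rates are clipped to [eps, 1 - eps] so that perfect sessions
    do not produce infinite values (eps is an illustrative choice).
    """
    z = NormalDist().inv_cdf  # inverse standard normal CDF
    p_hit = min(max(p_hit, eps), 1.0 - eps)
    p_fa = min(max(p_fa, eps), 1.0 - eps)
    return z(p_hit) - z(p_fa)
```

With the acquisition criterion stated above, `d_prime_behav(0.95, 0.25)` evaluates to approximately 2.32, which lies above the ‘acquired’ boundary of d’ > 1.5.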
Adaptive behavior and changes in S1 in the highly trained animal
To investigate behavioral and neuronal dynamics with regard to changing context, we performed experiments in which we systematically manipulated stimulus statistics. First, we present an analysis of data from highly trained animals only, which we refer to here as “experienced”. Those animals had successfully completed the basic training and a first adaptation to changing stimulus statistics. The psychophysical techniques were based on a behavioral paradigm we recently developed in the rat16. Figure 3a shows the manipulation of the stimulus distribution range; that is, the upper and lower limits of the statistical distribution of whisker deflection amplitudes presented in a psychophysical test. The first distribution consists of four different stimulus amplitudes and a catch trial (A = [0, 2, 4, 8, 16]°, magenta), which we refer to as the ‘high range’ condition. The second distribution consists of four stimulus amplitudes and a catch trial (A = [0, 1, 2, 4, 8]°, green), which we refer to as the ‘low range’ condition. Each stimulus or catch trial was presented with equal probability (uniform distribution). Importantly, the experimental design involves amplitudes common to both conditions: three of the four stimulus amplitudes are shared. However, the largest amplitude of the high range (16°) is not part of the low range, and vice versa, the smallest amplitude of the low range (1°) is not part of the high range. Figure 3b depicts typical psychometric curves from an example mouse, performing the task first under the high range (magenta) and then under the low range condition (green). There is an obvious and consistent shift of the psychometric curve in response to the changing stimulus statistics, which we refer to here as “adaptive behavior”.
In response to a shift in stimulus distribution, we consider two extreme hypotheses as established previously by a simple reward expectation model16. The null hypothesis (H0) asserts that the animal does not adjust its behavior, and thus operates from the same psychometric function (black dotted curve on top of magenta curve). In moving from the high range to the low range condition, this would result in a decreased reward rate for the same number of trials. The alternative hypothesis (H1) predicts that the animal adapts its behavior to maintain accumulated reward in the face of a changing stimulus distribution. The black dashed curve denotes the hypothetical psychometric function with the same lapse and guess rate (see methods) as the original curve from the mouse, but allowing it to shift to the left such that the expected reward per trial remains constant. The experimentally measured psychometric function in the low range condition (green) comes quite close to the hypothetical performance level based on the assumption of maintained reward expectation. The observed shift results in a significant decrease in response threshold for all mice (Thigh = 4.83 ± 0.33, Tlow = 2.93 ± 0.34, Mean and SD, p<0.05, Kruskal-Wallis test). Figure 3c depicts the actual trial-by-trial reward accumulation by the example mouse. Overlaid are results for n=11 sessions with the high range distribution (magenta) and n=10 sessions with the low range distribution (green). The slope of reward accumulation in the low range condition nearly matches that of the high range condition, and the slope for the low case (green) is close to the prediction from the maintenance of accumulated reward hypothesis, H1, while being clearly separable from the slope representing the null hypothesis (dotted line). The total number of rewards acquired on average per session and across all mice further confirms this (total # high range = 44.5 ± 8.6, total # low range = 42.0 ± 10.6, Mean and SD, Fig. 3c inset), whereas there was no evidence for an alternative strategy to maintain the total number of rewards by working substantially more trials.
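The two hypotheses can be illustrated with a small numerical sketch. It assumes a logistic psychometric function on a linear amplitude axis, a uniform stimulus distribution, and reward delivered only for hits; the slope, guess, and lapse values are hypothetical, and the function names are not taken from the reward expectation model cited above. Only the stimulus sets and the high range threshold of 4.83 come from the text.

```python
import math

HIGH_RANGE = [0, 2, 4, 8, 16]  # deflection amplitudes in degrees, 0 = catch trial
LOW_RANGE = [0, 1, 2, 4, 8]

def p_go(s, threshold, slope=1.0, guess=0.1, lapse=0.05):
    """Logistic psychometric function P(GO|s); parameter values are illustrative."""
    return guess + (1.0 - guess - lapse) / (1.0 + math.exp(-slope * (s - threshold)))

def expected_reward(amplitudes, threshold):
    """Expected reward per trial under a uniform stimulus distribution.

    Assumption: only licks on stimulus trials (hits) are rewarded.
    """
    return sum(p_go(s, threshold) for s in amplitudes if s > 0) / len(amplitudes)

def matched_threshold(target, amplitudes, lo=-20.0, hi=20.0, n_iter=60):
    """Bisection for the threshold whose expected reward equals `target`.

    Expected reward decreases monotonically as the threshold increases.
    """
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if expected_reward(amplitudes, mid) > target:
            lo = mid  # reward still above target: threshold lies further right
        else:
            hi = mid
    return 0.5 * (lo + hi)

t_high = 4.83                                      # high range threshold from the text
reward_high = expected_reward(HIGH_RANGE, t_high)
reward_h0 = expected_reward(LOW_RANGE, t_high)     # H0: psychometric curve unchanged
t_h1 = matched_threshold(reward_high, LOW_RANGE)   # H1: expected reward maintained
```

In this sketch, `reward_h0` falls below `reward_high`, and `t_h1` lies to the left of `t_high`, which mirrors the qualitative predictions of H0 and H1.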
This finding shows that detection behavior is highly flexible in the face of a changing stimulus distribution and it further supports the concept of statistical integration and probabilistic computations in the mouse brain. The stability of the stimulus representation during basic learning might suggest that this occurs downstream of S1. Figure 3d shows frames of evoked S1 activity of the same mouse from Figure 3b and c engaged in two example sessions, one presenting the high range (top) and the other presenting the low range distribution (bottom). The frames are aligned for stimuli common to both datasets. Qualitatively, both examples show a higher change and spatial spread in fluorescence with increasing stimulus amplitude. Note that there is a fixed linear relationship between the percent change in fluorescence magnitude (% ΔF/F0) and the activated cortical area, as both scale with stimulus strength in a highly correlated fashion (Suppl. Fig. 2).
For simplicity, we use percent change in fluorescence magnitude as a metric for cortical activation.
Interestingly, the evoked activity for the same stimulus is clearly higher in the low range compared to the high range condition. To further investigate this modulated activity, we focus our analysis first on trials with an intermediate stimulus shared between both datasets (outlined box). The chosen deflection angle of 8 degrees represents the midpoint of the high range distribution and the upper limit of the low range distribution. According to the psychophysical results, this particular stimulus is considered a ‘supra-threshold’ stimulus and therefore within a detectable range. Figure 3e depicts the temporal fluorescent signal averaged across sessions from all mice at the experienced level. Histograms of behavioral lick responses are shown on top. As we suspected from qualitative assessment of the cortical response, the evoked response to the same 8-degree deflection is higher on average if presented within the low range compared to the high range distribution.
To compare the modulated S1 activity with the adapted behavior, we used classical signal detection theory31,32. Figure 3f outlines the approach schematically. Multiple scenarios are considered to trigger a change within sensory and higher order processing stages; behavioral adaptation can either be induced by changes in sensitivity (d’) intrinsic to S1, changes in decision criterion (also called bias or decision boundary) by a downstream observer, or both. Sensitivity in S1 improves through reduction in the overlap between sensory signal and noise distributions (left panel). In addition, the downstream observer may value hits and false alarms differently by altering the decision criterion (right panel). To test these predictions, distributions consisting of many evoked trials termed ‘signal’ (ΔF/F0 with 8°) and catch trials termed ‘noise’ (ΔF/F0 with 0°) were computed separately for the high and low range condition (Fig. 3g). Trials were concatenated for consecutive sessions with one switch from high to low or vice versa and data from all experienced mice were used for this analysis. The noise distribution (grey) is comparable for the high and the low range condition, with identical means (ΔFhigh=0%, ΔFlow=0%). However, the signal distribution for the low range (green) is shifted towards higher fluorescence changes as compared to the signal distribution of the high range (magenta), with a clear difference in its mean (ΔFhigh=0.35%, ΔFlow=0.52%). A significant difference in neuronal sensitivity was confirmed by calculating the neurometric d’ as introduced earlier in this study (d’high=0.69, d’low=1.04, p<0.01, Kruskal-Wallis test). Note that while the neuronal d’ was altered slightly by different normalization methods of the ΔF/F0 metric, an extended analysis revealed that the difference in d’ across conditions persisted (Suppl. Fig. 3a-c). 
In addition, we created receiver operating characteristic (ROC) curves by varying a criterion threshold across the ΔF/F0 signal and noise distributions and plotting the hit rate (signal detected) against the false alarm rate (incorrect guess) (Suppl. Fig. 3d). The ROC curve for the low range condition was higher than that for the high range condition, quantified by a larger area under the low range ROC curve (AUClow = 0.77) than under the high range ROC curve (AUChigh = 0.69), thereby confirming a change in S1 sensitivity. This approach further allowed us to infer changes in criterion by comparing the hit rate in ROC space (neurometric) with the average hit rate measured from the behavior (psychometric). The criterion (expressed as % ΔF/F0) shows a slight, yet non-significant decrease when operating from the low range compared to the high range condition (Chigh = 0.17, Clow = 0.08, p>0.05, Kruskal-Wallis test, Fig. 3g blue). Note that S1 sensitivity and downstream criterion change in opposite directions; that is, an increase in hit rate can be caused by an increase in S1 sensitivity and/or a decrease in criterion. We conclude that, in experienced animals, within the signal detection framework, there is evidence of adaptive sensitivity in S1 yet comparatively smaller adaptive changes in criterion by a downstream observer.
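The ROC construction described above can be sketched as follows: a criterion is swept across the pooled single-trial ΔF/F0 values, and at each criterion the fraction of signal trials (hits) and noise trials (false alarms) at or above it is recorded. The function names are hypothetical, and `criterion_for_hit_rate` reflects only one possible reading of how a criterion could be inferred by matching the neurometric to the behavioral hit rate; it is not necessarily the procedure used in this study.

```python
import numpy as np

def roc_curve(signal, noise):
    """False alarm and hit rates for a criterion swept from high to low."""
    signal = np.asarray(signal, dtype=float)
    noise = np.asarray(noise, dtype=float)
    values = np.unique(np.concatenate((signal, noise)))  # sorted ascending
    # Start above every observation so the curve begins at (0, 0).
    criteria = np.concatenate(([np.inf], values[::-1]))
    hit = np.array([(signal >= c).mean() for c in criteria])
    fa = np.array([(noise >= c).mean() for c in criteria])
    return fa, hit

def auc(fa, hit):
    """Area under the ROC curve by the trapezoidal rule."""
    return float(np.sum(np.diff(fa) * (hit[1:] + hit[:-1]) / 2.0))

def criterion_for_hit_rate(signal, behavioral_hit_rate):
    """Criterion (in dF/F0 units) exceeded by roughly the given fraction of signal trials."""
    signal = np.asarray(signal, dtype=float)
    return float(np.quantile(signal, 1.0 - behavioral_hit_rate))
```

An area of 0.5 corresponds to fully overlapping signal and noise distributions, and an area of 1 to perfectly separable ones.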
Changes across training stages
The data described thus far were derived from highly trained (“experienced”) subjects in response to a stimulus that is close to the perceptual threshold. For this case, the signal detection approach suggests that, while changes in S1 dominate, both S1 sensitivity and downstream criterion may be altered in a complementary or constructive way. However, it is unclear whether such a co-modulation is dependent on the individual’s level of experience. To answer this question, we created an analytic cascade framework that enables us to evaluate the relative contributions from activity in S1 and downstream of S1 in predicting the observed behavior (see methods for details). Briefly, the framework establishes a link between the sensory input distribution si, the S1 response function G(si), and the psychometric curve P(GO|si), as shown in Figure 4a. The function f(.) represents a downstream process, matching the evoked fluorescence to the lick response, given a particular stimulus, and is referred to as the “linker” function, as it links S1 activity to behavior.
This framework can then be used to describe the changes underlying the adaptive behavior. Figure 4b shows an example function G(si) of an experienced mouse challenged with a change in stimulus statistics. The function G(si) is represented by a curve fitted to the data, and separately shown for each condition, GHigh(si) (magenta) and GLow(si) (green). As we expected based on the qualitative observations of the GEVI response and the ROC analysis, a difference between GHigh(si) and GLow(si) is clearly visible. This is particularly obvious for larger stimulus strengths, but it is also true for smaller stimulus strengths, where it is less obvious on the logarithmic plot. Figure 4c shows the linker function f(.) computed separately for the high and low range condition, resulting in fHigh(.) (magenta) and fLow(.) (green). Another critical observation is that fHigh(.) and fLow(.) are different, confirming our previous finding that a change in S1 sensitivity cannot fully explain the observed differences in behavior. Figure 4d shows the combined functions fHigh[GHigh(si)] and fLow[GLow(si)] (dotted curves, magenta and green, respectively) which serve as predictors of the psychometric curve, along with the original psychometric curves PHigh(GO|si) and PLow(GO|si) (solid curves, magenta and green, respectively). The predictions and actual data match well, by construction, with minor differences due to the fitting accuracy when generating both G(si) and f(.).
This framework now enables us to assess the relative roles of the two stages of the model in the context of the high and low range conditions imposed during the behavior. Specifically, as a next step we sought to quantitatively estimate how much of the behavioral adaptation can be explained by S1 activity versus other processes downstream. To capture how much of the observed change in behavior is explained solely by the change in G(si) in the transition from the high to the low condition, f(.) is held constant, operating from a function fHigh(.) that only reflects the high range condition. The combination of GLow(si) and fHigh(.) then produces an estimated psychometric function, fHigh[GLow(si)], that is only influenced by changes in G(si), and thus serves as a null test for the prediction based on changes in neural activity in S1 alone, as shown in Figure 4d (black dashed curve). The fraction explained by G(si) can be estimated by comparing the area under the curve of fHigh[GLow(si)] with the area under the curve of fLow[GLow(si)]. In this example of an experienced mouse, changes of the S1 neuronal response function G(si) explain 67% of the behavioral adaptation, whereas the remaining 33% would have to be explained downstream, by changes in f(.).
To investigate the effect of experience, we applied the cascade model separately for each mouse and at different training stages. So far, we have presented data from highly trained (“experienced”) animals; we now include data from the same animals at the first exposure to changing stimulus statistics, which we refer to here as “novel”. Figures 5a and b show that, in the face of changing stimulus statistics, all animals adapted their behavior equally well across training stages, as revealed by their similar performance metrics (threshold, Figure 5a) and their adaptive change in psychometric thresholds for the high range and the low range condition (Figure 5b, Novel [Thigh = 4.78 ± 1, Tlow = 2.93 ± 0.34], Experienced [Thigh = 4.83 ± 0.33, Tlow = 2.93 ± 0.34], Mean and SD, p<0.05, Kruskal-Wallis test). However, by assessing the S1 contribution at different training stages, an interesting pattern emerged. Figure 5c shows the relative explanatory power of S1 versus downstream, quantified for all mice undergoing repeated task changes. At the first transition, only two out of four individuals show a small change in G(si) (M04 = 21%, M05 = 0%, M06 = 16%, M07 = 0%), even though all animals show behavioral flexibility. This indicates that the associated neuronal processes that are predictive of the behavior could primarily be performed outside S1. However, at the experienced level (bottom), all four mice show a substantial change in S1 activity or G(si), each explaining a significant fraction of the behavior (M04 = 67%, M05 = 67%, M06 = 45%, M07 = 82%, p < 0.05, Wilcoxon rank-sum test, Fig. 5d). To rule out that this effect is due to the direction of task manipulation, we performed a control experiment with multiple changes occurring with alternating conditions (high-low-high-low), showing that S1 sensitivity closely followed the direction of changes in a way that increasingly explained the behavioral adaptation (Suppl. Fig. 4). 
Again, the psychometric threshold itself is consistently changing once contextual changes are introduced. In other words, animals performed equally well when challenged with a change whether they had experienced it before or not. This suggests that the first time behavioral adaptation is required, areas outside of S1 could be primarily involved in driving this kind of behavior. However, as animals gain experience, adaptive responses seem to emerge in S1. We conclude that neuronal computations associated with behavioral flexibility can be found in S1 at an advanced training stage and propose that such computations might shift across the cortical hierarchy in an experience dependent manner.
a, Psychometric thresholds for different experience levels. ‘Novel’ refers to animals that experience the first switch in stimulus statistics (left panel). ‘Experienced’ refers to the second switch or later (right panel). b, Behavioral adaptation is calculated as the change in psychometric threshold (Δ threshold) across mice and experience level. Bars represent means across mice, errorbars represent SD. c, Explanatory power of S1 versus downstream quantified for the behavior of all mice undergoing repeated task changes. Pie plots depict the fraction explained by S1 activity G(si) versus downstream f(.). Columns depict different mice, rows correspond to training stages. d, Percent explained by S1 versus downstream on average, errorbars represent SD across mice. * P < 0.05; ** P < 0.01; *** P < 0.001; ‘n.s.’ not significant, two-sided Wilcoxon rank-sum test.
DISCUSSION
In this study, we have investigated learning and experience dependent behavior in the mouse somatosensory system. Our findings provide evidence that activity in primary somatosensory cortex can be highly dynamic in support of flexible sensory processing and experience dependent behavioral adaptation. We present the following novel aspects. First, S1 population activity is stable and mostly stimulus driven during early learning despite changes in behavioral performance. Second, detection behavior can be modified so as to maintain reward in the face of changing statistical properties of the stimulus. Third, S1 activity is highly dynamic in the face of a changing sensory environment, predicting behavioral adaptation as individuals gain experience.
Learning occurs when an individual forms an association based on a new stimulus or context. This process provides obvious benefits such as flexible hunting, optimal foraging, and social communication, especially in environments that tend to change frequently and unpredictably. There is no doubt that associative learning can occur in animals without cortex, including all classes of vertebrates33 and a large number of invertebrate species34. Several studies elucidating the role of S1 with classical or operant conditioning have found that lesions of S1 do not affect learning12,35. Hong and colleagues showed that mice can learn to detect objects even after chronic inactivation of primary somatosensory cortex. This finding seems in direct contrast to the large body of work that argues the contrary36–38. However, this disagreement can presumably be resolved by considering the possibility that neuronal signals in these areas are actually highly dynamic, context dependent, and part of a flexible or adaptive framework. When probing S1 wide field activity during learning of the basic detection task we found that neuronal sensitivity is mostly stimulus driven and does not change during basic learning. This finding suggests that S1 provides a stable sensory representation over time as an individual learns, but it does not correlate with the increase in performance or the formation of a basic association.
However, we extended our training program and challenged subjects with a change in stimulus statistics once they had learned the basics of the task. Our results show that mice can develop a flexible behavioral strategy that is closely modulated by the change in sensory statistics allowing them to maintain a constant payoff. When probing S1 wide field activity during this behavioral adaptation, we find that neuronal sensitivity changes in an experience dependent manner; the change in S1 sensitivity emerges at a later training stage when subjects had already experienced the task modulation before.
The general behavioral question posed here relates to how the animal responds to changes in the sensory environment. The behavioral paradigm was designed as a highly simplified but carefully controlled manipulation of the statistical distribution of the magnitude of whisker deflections experienced by the animal. Aside from matching amplitudes and velocities of the whisker movement that have been described in a range of studies20,21,39-41, the current study does not attempt to place this in the context of the natural sensory environment for the animal, and the passive stimulus paradigm does not speak to active sensing. Importantly, due to practical considerations, the primary behavioral experiments in this study are admittedly limited to a fairly narrow view of the adaptive behavior of the animal, with blocks of behavioral sessions with a single controlled switch in stimulus statistics, as opposed to presumably a more complex evolution of stimulus statistics in the natural environment. However, we performed a control with multiple changes occurring with alternating conditions, showing that S1 sensitivity closely followed the direction of changes in a way that increasingly explained the behavioral adaptation (Suppl. Fig. 4), suggesting that the findings here likely generalize to more complex scenarios. Furthermore, our previous study employing a similar psychophysical approach in the rat shows that behavioral adaptation takes place independent of the direction of task manipulation, and that the adaptive behavior generalizes to changes in other aspects of the statistics of the sensory stimulus16. How the behavior, and corresponding sensory representations change in more complex, naturalistic settings, and at a finer time resolution, are important, and need further investigation in future studies.
Nevertheless, our finding of context dependent adaptive responses in S1 is surprising as it suggests that reward based choice signals might shift across the cortical network and can appear and influence sensory representation in S1 once an individual has successfully adapted its behavioral strategy. In this context it is important to note that the behavioral adaptation itself is similar at different experience levels, showing consistent changes in performance already before the adaptive response even appears in S1 (novel performer). This is important, as the change in S1 we observe through GEVI imaging averaged across trials is therefore not simply reflecting a difference in performance (e.g. ratio of hits to misses) across the levels of experience, but instead a true change in the S1 response with experience in the context of this task.
So, what are the downstream targets that drive early behavior and ultimately change the stimulus representation of primary sensory areas? A recent study probing flexible decision making in the somatosensory pathway of mice found that orbitofrontal cortex dynamically interacts with sensory cortex triggering plasticity based on value signals15. However, other important reward based choice signals have been reported to influence neuronal signaling throughout cortex42. Furthermore, studies of visual attention in primates have distinguished changes in neuronal sensitivity from an observer’s response criterion in extra striate cortex, superior colliculus, as well as lateral prefrontal cortex43–45. Hence, there is a large number of downstream candidates potentially modulating the sensory signal stream. Admittedly, in the current study we did not record in any other area outside S1, but the effects described by those other studies collectively point to the interesting idea that the neuronal signal transfer identified by our study could be a common principle across neocortex with “earlier” stages emerging with experience. Moreover, we propose that this shift in cognitive signaling could even affect subcortical structures such as primary thalamus as we have shown recently in highly trained animals25.
Admittedly, our current study was not specifically designed to identify attention as a driver of adaptive behavior, even though the measured changes in performance are indicative of a somewhat conscious change in the subjects’ behavioral strategy. Effects related to the subjects’ thirst, satiation, or arousal level within this behavioral framework can be excluded, as we have shown previously16. Although speculative, the experience dependent change in S1 sensitivity and the corresponding change in downstream criterion may be signatures of a higher order cognitive process that could be described as an “attentional spotlight” or gain control moving down the cortical hierarchy. Such a brain-wide dynamic process would facilitate selective and efficient cognitive processing as an individual adapts to changes in the environment. This transfer could explain some of the disparities in the literature and therefore needs to be investigated in future studies.
Supplemental Figures
a, Learning curve for 3 mice trained on the basic Go/No-Go detection task with a strong stimulus (16 degrees). b, Fluorescence activity in S1 from an example mouse during learning. Shown are frames at the peak response to a 16 degree stimulus; catch trials are shown below. Scale bar = 1 mm. c, Dprime metrics for both behavioral and neuronal data during learning of the task. Note that mice achieved higher hit rates at the beginning of training (interim) and reached successful task acquisition in half the time compared with mice detecting weak stimuli (Fig 2). The dotted lines separate performance into interim (intermediate performance, d’ = 0.8-1.5) and acquired (d’ > 1.5). The right panel shows the same data separated for individuals (symbols) before (interim) and after learning (acquired). Other figure conventions are the same as in Figure 2.
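The d’ thresholds in this caption can be illustrated with a minimal sketch. It assumes the standard signal detection definition d’ = z(hit rate) − z(false alarm rate); the caption does not state how behavioral d’ was computed, so this definition, the clipping value `eps`, and the label "below interim" are all assumptions made for illustration.

```python
from statistics import NormalDist


def behavioral_dprime(hit_rate, false_alarm_rate, eps=0.01):
    """d' = z(hit rate) - z(false alarm rate).

    Assumption: rates are clipped to [eps, 1 - eps] so that perfect
    rates of 0 or 1 do not produce infinite z-scores.
    """
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    h = min(max(hit_rate, eps), 1.0 - eps)
    f = min(max(false_alarm_rate, eps), 1.0 - eps)
    return z(h) - z(f)


def performance_stage(dprime):
    """Map d' onto the stages named in the caption.

    'interim' covers d' = 0.8-1.5 and 'acquired' covers d' > 1.5.
    'below interim' is a hypothetical label for everything lower.
    """
    if dprime > 1.5:
        return "acquired"
    if dprime >= 0.8:
        return "interim"
    return "below interim"
```

For example, `performance_stage(behavioral_dprime(0.84, 0.16))` classifies a session with an 84% hit rate and a 16% false alarm rate.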
a, Fluorescence activity in response to different stimuli. Each frame is normalized to the frame at stimulus delivery (ΔF/F0 = (F-F0)/F0). Shown are images at the response peak. The dotted line represents a slice through the images aligned with the maximum fluorescence. Scale bar = 1 mm. b, Magnitude (response amplitude ΔF/F0) versus cortical area extracted from the slice in a. Activity patterns are shown in different shades of grey for different stimulus amplitudes. c, Relationship between fluorescence magnitude and width of activated area; both scale with stimulus strength in a highly correlated fashion. The width is derived from a threshold (ΔF/F0 = 0.25%, dashed line in b). Data points correspond to different stimulus amplitudes fitted with a linear regression. Plots show average data across mice (n = 4) and sessions (n = 40).
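The quantities in this caption can be sketched in a few lines of NumPy. This is an illustrative sketch, not the analysis code used for the figure: the function names, the `pixel_size_mm` parameter, and the choice to measure width by counting supra-threshold pixels along the slice are assumptions.

```python
import numpy as np


def delta_f_over_f0(f, f0):
    """Return (F - F0) / F0 in percent."""
    f = np.asarray(f, dtype=float)
    f0 = np.asarray(f0, dtype=float)
    return 100.0 * (f - f0) / f0


def activated_width_mm(profile_pct, pixel_size_mm, threshold_pct=0.25):
    """Width of the slice profile that exceeds the threshold.

    Assumption: width is the number of pixels above threshold times
    the pixel size (hypothetical parameter, in mm).
    """
    profile_pct = np.asarray(profile_pct, dtype=float)
    return np.count_nonzero(profile_pct > threshold_pct) * pixel_size_mm


def fit_width_vs_magnitude(magnitudes_pct, widths_mm):
    """Least-squares line through (magnitude, width) points.

    Returns (slope, intercept), one point per stimulus amplitude.
    """
    slope, intercept = np.polyfit(magnitudes_pct, widths_mm, 1)
    return slope, intercept
```

Here `delta_f_over_f0` takes the frame at stimulus delivery as `f0`, matching panel a, and the default `threshold_pct` corresponds to the dashed line in panel b.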
a, Different normalization methods. F0 was varied by using the mean fluorescence across different time windows before stimulus onset. The ΔF/F0 measurement is then calculated as (F-F0)/F0. Mean fluorescence traces (n=4 mice, 831-870 trials) in response to the high (magenta) and the low range (green) stimulus condition. The grey box depicts the window for calculating F0. From left to right: F0=0 ms (frame at stimulus delivery), F0 from [-100,0] ms, and F0 from [-200,0] ms. b, Single-trial signal-peak and noise distributions from the data analysis as in a. Red and green numbers are mean fluorescence values in % ΔF/F0 for the high and low range condition. Black numbers represent dprime metrics. c, Dprime and variance for different F0 calculations. d, Receiver operating characteristic (ROC) curves created by shifting the criterion across the ΔF/F0 signal and noise distributions of the high and low range condition. The downstream criterion can be inferred by comparing the hit rate in ROC space with the average behavioral hit rate (dashed lines).
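The steps in this caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the neuronal d’ is assumed to be the difference of means divided by the root of the averaged variances, the baseline window is assumed to include both endpoints, and all function and parameter names are hypothetical.

```python
import numpy as np


def baseline_f0(trace, times_ms, window_ms):
    """Mean fluorescence within [start, end] ms relative to stimulus onset.

    A window of (0, 0) returns the frame at stimulus delivery.
    """
    trace = np.asarray(trace, dtype=float)
    times_ms = np.asarray(times_ms, dtype=float)
    start, end = window_ms
    mask = (times_ms >= start) & (times_ms <= end)
    return trace[mask].mean()


def neuronal_dprime(signal, noise):
    """Assumed definition: (mean_s - mean_n) / sqrt((var_s + var_n) / 2)."""
    signal = np.asarray(signal, dtype=float)
    noise = np.asarray(noise, dtype=float)
    pooled_sd = np.sqrt(0.5 * (signal.var(ddof=1) + noise.var(ddof=1)))
    return (signal.mean() - noise.mean()) / pooled_sd


def roc_by_shifting_criterion(signal, noise, n_criteria=200):
    """Sweep a criterion across both single-trial distributions.

    For each criterion, the hit rate is the fraction of signal trials
    above it and the false alarm rate is the fraction of noise trials
    above it.
    """
    signal = np.asarray(signal, dtype=float)
    noise = np.asarray(noise, dtype=float)
    lo = min(signal.min(), noise.min())
    hi = max(signal.max(), noise.max())
    criteria = np.linspace(lo, hi, n_criteria)
    hits = np.array([(signal > c).mean() for c in criteria])
    false_alarms = np.array([(noise > c).mean() for c in criteria])
    return criteria, hits, false_alarms


def infer_criterion(criteria, hits, behavioral_hit_rate):
    """Criterion whose ROC hit rate is closest to the behavioral hit rate."""
    idx = np.argmin(np.abs(hits - behavioral_hit_rate))
    return criteria[idx]
```

Calling `baseline_f0` with windows `(0, 0)`, `(-100, 0)`, and `(-200, 0)` mirrors the three F0 choices in panel a, and `infer_criterion` corresponds to reading off the criterion at the dashed behavioral hit rate in panel d.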
Pie plots depict the fraction explained by S1 [G(Si)] versus downstream [f(.)] quantified for the behavior of a control animal undergoing four changes (high-low-high-low).
Acknowledgements
C. W. was supported by a fellowship from the Deutsche Forschungsgemeinschaft (GZ: WA 3862/1-1) and NIH grant U01NS094302. P.Y.B. was supported by an NIH NRSA pre-doctoral fellowship F31NS09869. G. B. S. was supported by NIH Brain Grants R01NS104928 and U01NS094302. We thank Cornelius Schwarz for helpful feedback on the manuscript. We also acknowledge April Reedy with the Emory University Integrated Cellular Imaging Core.