Abstract
Visual perceptual decision-making involves multiple components, including visual encoding, attention, accumulation of evidence, and motor execution. Recent research suggests that EEG oscillations can identify the time of encoding and the onset of evidence accumulation during perceptual decision-making. Although spatial attention is known to improve participants' performance in decision making, little is known about how spatial attention influences the individual cognitive components that give rise to that improvement in performance. In this work we found evidence that both visual encoding time (VET) before evidence accumulation and other non-decision time processes after or during evidence accumulation are influenced by top-down spatial attention, but not evidence accumulation itself. Specifically, we used an open-source data set in which participants were informed about the location of a target stimulus in the visual field on some trials during a face-car perceptual decision-making task. Fitting neural drift-diffusion models to response time, accuracy, and single-trial N200 latencies (~125 to 225 ms post-stimulus) of EEG allowed us to separate the processes of visual encoding and the decision process from other non-decision time processes such as motor execution. These models were fit in a single step in a hierarchical Bayesian framework. Model selection criteria and comparison to model simulations show that spatial attention manipulates both VET and other non-decision time processes. We discuss why spatial attention may affect other non-evidence accumulation processes, such as motor execution time (MET), and why this may seem unexpected given the literature. We make recommendations for future work on this topic.
1 Introduction
In daily life, we frequently experience environments in which we are obliged to make fast decisions from ambiguous, low-coherence sensory information [1, 52, 53]. Perceptual tasks are useful for studying decision-making because they allow precise control of the quantity and quality of sensory information as well as manipulation of latent influences on response time and accuracy, such as visual attention [20]. One of the most important factors that influences this decision process is top-down spatial attention. In everyday life, top-down prioritized factors such as knowledge, expectation, and current goals can control visual attention [10, 51]. In spatial prioritization tasks that explore the effects of top-down visual attention, a cue (e.g. an arrow) on some experimental trials is often used to inform participants to attend covertly, without saccadic eye movements or head movements, to a location in the periphery of the visual field. These experiments translate directly to everyday experiences. For instance, when driving, expectations about the location of traffic lights help drivers more quickly execute the choice to stop or accelerate.
Visual perceptual decision making immediately after a visual stimulus is suspected to involve two classes of processes: decision process(es) containing an accumulation of evidence toward decision choices and/or urgency signals, and non-decision time processes, thought to contain at least visual encoding time (VET) and motor execution time (MET) [50, 53, 51, 7, 12, 44], although non-decision time could contain any non-evidence accumulation and non-urgency process. Top-down spatial cues are known to improve behavioral performance and manipulate neural mechanisms ([76, 47, 55]). But while spatial prioritization is well studied, traditionally proposed models could not separately identify the effects of spatial prioritization on VET, evidence accumulation, and other non-decision time components, such as MET. In this study we find evidence that spatial prioritization affects VET and other non-decision time components, and not evidence accumulation.
Two-alternative forced-choice tasks are described well by sequential sampling models. These models assume individuals accumulate information until sufficient evidence is reached for one of two choices, typically conceptualized as hitting an upper or lower boundary [53]. Sequential sampling models often contain parameters with cognitive interpretations. These parameters can then be compared across experimental conditions to understand cognitive effects as well as compared directly to neural measures. For instance, researchers revealed that increasing the task difficulty of the stimuli specifically increases decision time by decreasing the drift rate, a parameter that tracks the average rate of evidence accumulation within a trial [48, 21, 53].
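To make this class of models concrete, the following is a minimal sketch of simulating choices and response times from a simple drift-diffusion process. It is an illustration only, not the hierarchical model fit in this study, and all parameter values are arbitrary examples.

```python
import numpy as np

def simulate_ddm(n_trials=1000, drift=1.5, boundary=1.5, beta=0.5,
                 ndt=0.4, dt=0.001, seed=0):
    """Euler approximation of a drift-diffusion process.

    Returns response times in seconds and choices (1 = upper boundary, 0 = lower).
    """
    rng = np.random.default_rng(seed)
    rts = np.empty(n_trials)
    choices = np.empty(n_trials, dtype=int)
    for i in range(n_trials):
        evidence = beta * boundary      # relative starting point
        t = 0.0
        while 0.0 < evidence < boundary:
            evidence += drift * dt + np.sqrt(dt) * rng.standard_normal()
            t += dt
        rts[i] = ndt + t                # non-decision time shifts every response time
        choices[i] = int(evidence >= boundary)
    return rts, choices

# Lowering the drift rate (harder stimuli) lengthens decision time but leaves
# the non-decision time untouched.
easy_rts, _ = simulate_ddm(drift=2.5)
hard_rts, _ = simulate_ddm(drift=1.5)
print(easy_rts.mean(), hard_rts.mean())
```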
Event Related Potentials (ERPs) are averages of EEG across experimental trials, time-locked to specific events such as the onset of a stimulus or the execution of a response, such as a button press. Although early ERP responses from the brainstem occur within a few milliseconds of the onset of auditory stimuli, ERP responses from the primary visual cortex take approximately 40-60 milliseconds ([38]). Visual evidence received by the retina must pass through the LGN to reach the primary visual cortex, where that information is preprocessed, decoded, and prepared before further cognitive use [24]. Part of this process is target selection ([37]) while another component is figure-ground segregation. The time course of figure-ground segregation is thought to depend upon visual elements such as distractors within visual stimuli and low coherence of stimuli [34, 44]. We define the time between the onset of stimuli and the beginning of the accumulation process as Visual Encoding Time (VET). VET is thought to occur before evidence can be accumulated during decision making, although some researchers show that evidence accumulation and motor planning could occur in parallel rather than sequentially [59, 11]. VET is expected to finish between 150 ms and 225 ms after stimulus onset [64, 68, 44, 27]. For instance, when monkeys were trained to report the coherent direction of motion in a random dot motion task by a saccadic eye movement, groups of neurons in the lateral intraparietal cortex (LIP) were found to represent evidence accumulation for a saccadic choice [54]. The onset of this neural evidence accumulation typically starts around 200 ms after the presentation of random dot motion stimuli ([54, 30, 60]). Furthermore, Thorpe et al. ([64]) found that participants in a go/no-go decision-making task complete visual processing at around 150 ms after stimulus onset. Finally, Loughnane et al. ([37]) identified two pairs of N200 ERPs, negative deflections occurring around 200 ms after changes in bi-hemispheric visual stimuli, in temporo-occipital electrodes which affect the onset of accumulated sensory evidence during a random dot motion task.
Over the last few years, researchers have begun to examine the underlying cognition and neural correlates of the decision making process simultaneously with neuro-cognitive modeling. These models allow consideration of underlying connections between cognitive model parameters and brain dynamics. Neuro-cognitive joint modeling is thought to be the most powerful technique for linking the electrophysiological dynamics of the brain across experimental trials to cognition and associated cognitive model parameters [49]. This research has resulted in models that can integrate single-trial electroencephalography (EEG) measures and individuals' behavioral performance to make inferences about underlying states of the brain and behavior. Using single-trial EEG analysis, it has been shown that P200 measures after visual noise and N200 measures after visual stimuli (i.e. positive and negative deflections around 200 ms) can delineate single-trial visual attention effects on evidence accumulation and non-decision times [45]. Nunez et al. ([44, 45, 46]) proposed new neuro-cognitive hierarchical models of decision making that separate measures of non-decision processes by fitting models to both brain electrophysiology (EEG) and human response times and choices. These model parameters can then be related to visual attention as enforced by experimental paradigms.
In our previous work, we showed that non-decision time is affected by top-down spatial attention in a face-car perceptual decision-making task [19]. The main purpose of the current study is to differentiate which components of the non-decision process are affected, using new measures derived from the same experimental data. Specifically, we hypothesised that single-trial N200 peak-latencies would reveal the effects of spatial attention on the non-decision process across experimental conditions and participants. We then sought to differentiate whether spatial prioritization influences VET, other non-VET non-decision times, or both during perceptual decision making. Using a public dataset of a face-car perceptual decision-making task, we used singular value decomposition (SVD) to extract single-trial N200 latencies for all conditions (two levels of spatial prioritization and two levels of visual coherence) and then applied neuro-cognitive modeling to find associations between DDM parameters and the N200 latency on each experimental trial. We constructed hierarchical models to identify which components of non-decision time (NDT) are most influenced by top-down spatial cues and then conducted model comparison informed by a simulation study. We found evidence that spatial prioritization can affect other non-decision time processes in addition to VET while not affecting decision-making itself.
2 Methods
2.1 Data collection
We used a public dataset from an experiment designed to understand the interaction of perceptual decision making and top-down spatial attention [18]. In this work, seventeen participants (8 females, mean age 25.9 years, range 20-33 years, 2 left-handed) from the University of Birmingham were recruited to perform a face-car perceptual decision-making task (see Figure 1). Both behavioral data (reaction time and accuracy) and electroencephalogram (EEG) data were recorded while participants performed the task. The data collection was separated into two experimental sessions of approximately 10 minutes each. At the beginning of each trial, a one-way arrow cue (left or right) or a two-way arrow cue was presented for one second, followed by a visual stimulus (face or car) that was shown for 200 milliseconds. Participants were instructed to press a button based on whether they perceived a face or a car stimulus, using the index finger and middle finger of the right hand. To avoid anticipatory responses, an inter-stimulus interval (ISI) between 0 and 300 ms was used. During the task, all participants were instructed to maintain their gaze at the fixation point. This approach ensures that effects reflect covert spatial attention to the stimuli rather than overt eye movements. The two arrow types made up the spatial prioritization experimental condition, with the "prioritized" level given by trials with a one-way arrow and the "non-prioritized" level given by trials with a two-way arrow. There was an additional experimental condition of stimulus coherence, such that the coherence of the visual stimuli (face or car) was manipulated, split into "low" and "high" coherence levels. The two-level coherence manipulation and the prioritized versus non-prioritized cues were independent variables that manipulated the ambiguity of sensory information and spatial attention to a location in the visual field, respectively. One participant was excluded from the data set because of corruption of both behavioral and EEG data. Trials were randomly assigned to each of the four manipulations. For more information about the experimental task design and procedure, interested readers can refer to the original work on the data by Georgie et al. [18].
The study design was a 2×2 factorial design. Two levels were given to participants for coherence (high and low) and two levels for spatial attention (prioritized vs non-prioritized), randomly mixed across trials. For each trial, a one-way arrow (informative cueing) or a two-way arrow (uninformative cueing) was shown in the center of the screen for 1000 milliseconds. Then, a picture of a car or face was shown on the left or right side of the screen for 200 milliseconds. Participants then pressed a button to respond whether they perceived a car or a face.
Each participant performed the task for a total of 288 trials. This resulted in 72 trials for each unique combination of experimental manipulations: high coherence and prioritization, high coherence and non-prioritization, low coherence and prioritization, and low coherence and non-prioritization. Within each manipulation, 36 face trials and 36 car trials were randomly presented. EEG was acquired from 64 channels positioned according to the 10-20 system, plus two extra sensors for recording the electrocardiogram (ECG) and for correcting eye-blink artifacts (EOG). The ECG sensor was attached approximately 2 cm under the left collarbone, and the EOG sensor was placed under the left eye. Finally, if a response was faster than 150 ms, the trial was discarded.
2.2 EEG preprocessing
EEG data are an amalgam of muscle and other biological artifacts, electrical noise, and true brain oscillations; the data therefore need to be cleaned to extract signal and task-related fluctuations [38]. We applied oft-used preprocessing steps to the EEG data, some of which, such as down-sampling and band-pass filtering, relate directly to the N200 latency estimation and the SVD decomposition. These preprocessing steps were conducted using the MNE Python module [22].
The summary of preprocessing steps is as follows: (1) down-sampling the raw data to 1024 samples per second from the original 5000 samples per second, (2) applying a Butterworth IIR band-pass filter from 1 to 10 Hz to match the filter parameters used by Nunez et al. [44] to calculate single-trial N200s (note that a 1 Hz highpass could decrease the amplitude of slower and later potentials such as the P300), (3) re-referencing to the common average reference, splitting the EEG data into epochs from −100 ms to 400 ms time-locked to the face/car onset, and subtracting the average baseline potential, (4) running Independent Component Analysis (ICA) with the fastica algorithm [26] in order to inspect and remove artifactual components based on visual inspection (using the power spectrum and variance of the component's signal), such as muscle, eye blinks, or eye movements, without removing the affected data portions, (5) automatically removing Independent Components whose time courses matched the EOG or ECG sensors (such that these ICs reflected eye blinks and heart rhythm artifacts respectively), and finally (6) converting the data back into sensor space from Independent Component space. The EEG preprocessing code was implemented with the MNE package in Python [22].
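A minimal sketch of these steps with MNE is shown below. The file name, event codes, and EOG/ECG channel names are placeholders (assumptions for illustration); the actual scripts are in the repository linked in the Data and code sharing section.

```python
import mne

# Hypothetical file name; the raw recordings in the original dataset may use a different format.
raw = mne.io.read_raw_fif("sub-01_task-facecar_raw.fif", preload=True)

raw.resample(1024)                                    # (1) down-sample to 1024 Hz
raw.filter(l_freq=1.0, h_freq=10.0, method="iir",     # (2) 1-10 Hz Butterworth band-pass
           iir_params=dict(order=4, ftype="butter"))  #     (filter order is an assumption)
raw.set_eeg_reference("average")                      # (3) common average reference

events = mne.find_events(raw)                         # stimulus triggers (assumed present)
epochs = mne.Epochs(raw, events, event_id={"face": 1, "car": 2},   # hypothetical event codes
                    tmin=-0.1, tmax=0.4, baseline=(None, 0.0), preload=True)

ica = mne.preprocessing.ICA(n_components=20, method="fastica", random_state=0)  # (4)
ica.fit(epochs)
eog_idx, _ = ica.find_bads_eog(epochs, ch_name="EOG")  # (5) ICs matching the EOG/ECG sensors
ecg_idx, _ = ica.find_bads_ecg(epochs, ch_name="ECG")
ica.exclude = list(set(eog_idx + ecg_idx))
epochs_clean = ica.apply(epochs.copy())                # (6) back to sensor space
```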
2.3 Estimation of single-trial N200 waveforms
Calculating ERPs is easy to implement and results in large signal-to-noise ratios [39]. As a result, much research has studied individual differences relating to variance in evoked potentials, e.g. see [57, 28]. Traditional trial-averaged event-related potentials (ERPs) have two main disadvantages: (1) they miss potentially relevant information on each individual trial and (2) they require a large number of participants to test a scientific hypothesis [8, 2]. One alternative is to use information from EEG records on single trials [4].
However, although single-trial EEG analyses provide information on every trial, they are less robust to artifact and noise, which produce variability across trials and make it difficult to draw inferences. Moreover, analyses based on single electrodes often fail to reveal signals above higher-amplitude noise. To mitigate these problems with single-trial ERP analysis, we applied Singular Value Decomposition (SVD) to boost the signal-to-noise ratios of the single-trial estimates [45]. This method uses a component of the trial-averaged ERP across all electrodes to find optimal weights across electrodes, which serve as a spatial filter to extract the N200s on single trials. The method thus produces a topographic map on the scalp for each individual relating to their own waveform. One of the most important advantages of the SVD technique is therefore that it does not require a set of predefined electrodes. Note that the SVD algorithm we used is deterministic and non-stochastic.
After applying SVD to the trial-averaged data, we found N200 waveforms and associated weight maps in either the first or second principal components (i.e. the two components contributing most to the overall variance in the trial-averaged data). The input to the SVD algorithm was the matrix of ERPs consisting of the samples from 125 ms to 250 ms post-stimulus (see Figure 2). Thus for each participant, the following decomposition was constructed:

$$E = U S V^{\top},$$

where $E$ is the $T \times C$ matrix of ERPs for each electrode, $T$ is the number of time points between 125 and 250 ms after the stimulus (72 points for our data), $C$ is the number of electrodes (64), $U$ contains the left-singular vectors, $S$ is the diagonal matrix of singular values, and $V$ contains the right-singular vectors.
SVD to build channel weights and single-trial N200 latency for all electrodes.
The first or second right singular vector v (C × 1), corresponding to the first or second component respectively, was multiplied with the continuous data matrix E (from 125 to 225 ms after stimulus) on every single trial, yielding a filtered time course e = Ev for that trial. This approach differs from traditional ERP estimation, which computes potentials from single electrodes or pre-specified groups of electrodes rather than all electrodes; our single-trial technique computes single-trial estimates using a weighted map of all electrodes. In order to compute the minimum latencies of N200 waveforms, several measurement windows were considered by observing the distributions of N200 latencies and the N200 waveforms (see Figure 5) before using these values in further analysis. The following windows were explored but not ultimately used: 100-300 ms post-stimulus, 125-275 ms post-stimulus, 125-225 ms post-stimulus, 150-250 ms post-stimulus, and 175-274 ms post-stimulus. A 125-225 ms post-stimulus window of the ERP data was chosen to be submitted to SVD to best isolate N200 SVD components. A slightly larger window of 100-250 ms was then used to compute the N200 minimum latency. Minimum values found at the 100 or 250 ms boundaries of the single-trial N200 peak-latency window were removed before neuro-cognitive model fitting since they were characteristic of a ramping potential on a trial rather than a clear N200 fluctuation. This resulted in removing a mean of 15% of trials (a maximum of 25% and a minimum of 2%) across participants (see Figure 5).
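The sketch below illustrates this SVD spatial-filtering procedure with NumPy. The array layout, sampling rate, and epoch timing are assumptions made for illustration; the actual implementation is in the linked repository.

```python
import numpy as np

# `epochs_data` is assumed to be a (n_trials, n_channels, n_times) array sampled at
# 1024 Hz with the epoch starting 100 ms before stimulus onset (assumptions).
SFREQ, EPOCH_START = 1024.0, -0.100

def _window(tmin, tmax):
    """Slice of sample indices covering tmin..tmax seconds post-stimulus."""
    return slice(int((tmin - EPOCH_START) * SFREQ), int((tmax - EPOCH_START) * SFREQ))

def n200_latencies(epochs_data, component=0):
    # Trial-averaged ERP restricted to the 125-225 ms SVD window, shaped (time, channels)
    erp = epochs_data[:, :, _window(0.125, 0.225)].mean(axis=0).T
    U, s, Vt = np.linalg.svd(erp, full_matrices=False)
    v = Vt[component]                      # channel weights (spatial filter)
    if (erp @ v).mean() > 0:               # SVD sign is arbitrary; make the N200 negative
        v = -v

    # Project every single trial onto the channel weights and take the most negative
    # deflection within the wider 100-250 ms search window.
    search = _window(0.100, 0.250)
    filtered = np.einsum("tcn,c->tn", epochs_data[:, :, search], v)
    idx = filtered.argmin(axis=1)
    latencies = 0.100 + idx / SFREQ        # seconds post-stimulus

    # Minima at the window edges are treated as missing (no clear N200 peak).
    at_edge = (idx == 0) | (idx == filtered.shape[1] - 1)
    return np.where(at_edge, np.nan, latencies), v
```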
2.4 Integrated neuro-cognitive model fitting
Nunez et al. [44] proposed a neuro-cognitive hierarchical Bayesian model to separate VET, other non-decision time processes, and evidence accumulation using EEG measures and behavioral performance. This approach helps researchers test hypotheses about trial-to-trial and individual differences with the help of neural activity and human behavior during forced-choice fast perceptual decision-making (see Figure 2). We used a new form of the neuro-cognitive drift-diffusion model by adding intrinsic trial-to-trial variability (η) in drift rate that was not related to trial-to-trial variability in the N200 in any model, along with four main parameters: boundary α, indicating the speed-accuracy trade-off and the level of conservatism; drift rate δ, indicating the mean slope of evidence accumulation and the level of task difficulty; starting point β = .5, indicating bias of the accumulation process; and non-accumulation time τ, indicating both encoding time and execution time. To test our hypothesis, we fit four different hierarchical neuro-cognitive models with embedded linear connections between single-trial N200 latencies from EEG and the non-decision time parameter. We also fit a comparison model with linear connections between single-trial N200 latencies and all parameters. We found in simulation (see later text) that these models can separate underlying latent components of non-decision time and can also answer the question of whether VET and/or other non-decision time components are shifted significantly by spatial prioritization.
Cognitive scientists typically focus on group-level differences due to manipulations and disorders. However, individual differences in spatial attention may unveil some latent aspects of prioritization related to sub-components of non-decision time [35]. Also, in an individual-level Bayesian framework, there would be no common hierarchical parameters shared across participants. Hierarchical models often result in more accurate estimates of parameters due to "shrinkage" towards mean parameters [16].
We therefore fit hierarchical neuro-cognitive models in order to incorporate intrinsic differences in individuals' cognitive strategies and abilities when assessing whether spatial prioritization affects visual encoding time and other non-decision times. These hierarchical neuro-cognitive models also allowed us to identify individual differences in the relationship of single-trial N200s to non-decision times.
To fit each model to all participants’ data simultaneously, we used the probabilistic programming language and Markov Chain Monte Carlo (MCMC) sampler Stan [5] within Python using the pystan connector. For each model and non-parametric bootstrap iteration (see Appendix ”Testing hypotheses on behavioral data” section), three MCMC chains were run to generate 10,000 samples from the joint posterior distribution of parameters. The initial 2000 samples were discarded as a burn-in phase to minimize the effect of initial values on posterior inference, and then a thinning parameter of 2 was used to result in 4000 posterior samples in each chain. The convergence of the Markov chains was assessed through visual inspection as well as by calculating the Gelman-Rubin statistic, R-hat, to ensure that the models had properly converged [16]. Specifically, R-hat is a statistic that compares between-chain and within-chain variance. The collection of posterior samples from each chain was used to form one posterior sample of 12,000 samples for each parameter. All four models converged based on R-hat statistics that were less than 1.01 for all parameters in each model.
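The sampler configuration described above corresponds to a call like the one sketched below using the PyStan 2 interface. The Stan program shown is only a stand-in (the actual neuro-cognitive model programs are in the linked repository); the chain, warmup, and thinning settings match those reported here, while the seed is an arbitrary assumption.

```python
import numpy as np
import pystan

# Stand-in Stan program and data; the real models and data preparation are in the repository.
model_code = """
data { int<lower=1> N; vector[N] y; }
parameters { real mu; real<lower=0> sigma; }
model { y ~ normal(mu, sigma); }
"""
stan_data = {"N": 100, "y": np.random.default_rng(0).normal(0.6, 0.1, 100)}

sm = pystan.StanModel(model_code=model_code)
# 3 chains x (10,000 iterations - 2,000 warmup) / thin 2 = 4,000 kept draws per chain
fit = sm.sampling(data=stan_data, chains=3, iter=10000, warmup=2000, thin=2, seed=2021)

# Convergence check: the Gelman-Rubin R-hat should be below 1.01 for every parameter
summary = fit.summary()
rhat = summary["summary"][:, list(summary["summary_colnames"]).index("Rhat")]
print("max R-hat:", np.nanmax(rhat))
```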
Model1 assumes that the linear relationship between single-trial N200 latencies and single-trial non-decision times does not depend on spatial prioritization. This model is designed to test the hypothesis that spatial prioritization would only shift the single-trial N200 latencies themselves, and not the relationship to non-decision times nor any residual non-decision time component unrelated to N200 latencies. Model2 assumes that spatial prioritization only shifts the residual non-decision time unrelated to N200 latencies, and not the relationship of single-trial N200 latencies to non-decision time. Model3 assumes that spatial prioritization only affects the relationship of single-trial N200 latencies to non-decision time. Finally, Model4 assumes that spatial prioritization may affect both the non-decision time related to N200 latencies and the non-decision time unrelated to N200 latencies. For all models, uninformative prior distributions were used, such that our model-fitting procedure was a completely data-driven analysis; see the check (tick) marks in Table 1.
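Schematically, the four models differ only in which parts of the assumed linear link between the single-trial N200 latency and non-decision time are allowed to vary with spatial prioritization (k2). The snippet below is an illustration of that bookkeeping, not the authors' Stan code.

```python
# Single-trial non-decision time on trial i is assumed to follow
#     tau_i = r[j, k2?] + lambda[j, k2?] * n200_i
# where "k2?" marks whether that parameter may vary with spatial prioritization.
MODEL_VARIANTS = {
    "Model1": {"r_varies_with_k2": False, "lam_varies_with_k2": False},
    "Model2": {"r_varies_with_k2": True,  "lam_varies_with_k2": False},
    "Model3": {"r_varies_with_k2": False, "lam_varies_with_k2": True},
    "Model4": {"r_varies_with_k2": True,  "lam_varies_with_k2": True},
}

def single_trial_ndt(n200_latency, r, lam):
    """Non-decision time implied by the assumed linear link to the N200 latency."""
    return r + lam * n200_latency
```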
Based on the four model selection criteria, Model4 is the best model to describe the manipulation of spatial prioritization. Lower WAIC, -LPPD, and -ELPD values indicate better fits to the data after accounting for model complexity.
For Model1 (see Figure 3a): the index i refers to experimental trials, j to participants, k to conditions, k1 to coherence, and k2 to spatial prioritization. Neither the residual (r) nor the linear coefficient (λ) parameters were free to vary with spatial prioritization.
Schematic diagrams of the hierarchical Bayesian models based on the convention of Lee and Wagenmakers [36]. Nodes show random variables in the model and arrows specify which variables affect other variables. The reaction time yijk is an observed variable for trial i, participant j, and condition k. Indices k1 and k2 denote the two-level coherence and two-level spatial prioritization conditions.
Distributions of parameters for each participant j and condition k were as follows:
Normal prior and hyperprior distributions with average and variance parameters were used for non-variance variables. Gamma (Γ) prior and hyperprior distributions with shape and scale parameters were used for variance parameters. Wiener is the Wiener first passage time distribution for the Drift-Diffusion Model (DDM) [70].
For Model2 only residual (r) parameters were free to vary with spatial prioritization (k2). Prior and likelihood distributions were the same as Model1 besides the following lines that now varied with spatial prioritization (k2):
For Model3 only linear coefficient (λ) parameters were free to vary with spatial prioritization (k2). Prior and likelihood distributions were the same as Model1 besides the following lines that now varied with spatial prioritization (k2):
For Model4 (see Figure 3b) both the residual (r) and linear coefficient (λ) parameters were free to vary with spatial prioritization (k2). Again, prior and likelihood distributions were the same as Model1 besides the following lines that now varied with spatial prioritization (k2):
Finally we fit a comparison model, Model5 with all variables free to vary by trial-to-trial N200 latencies. This model can be found in the Appendix ”The full neuro-cognitive model, Model5” section and Figure 9.
Each hierarchical neuro-cognitive model thus contained linear connections between single-trial N200 latencies and model parameters, so the relationship between DDM parameters and visual processing in the brain was evaluated on every trial. In this way, single-trial estimates of N200 latencies were assumed to be related, non-linearly through the model, to choice-response times across all data points.
2.5 Non-parametric bootstrapping
Bootstrapping is a flexible statistical procedure that resamples observed data to generate new simulated samples, and its simple implementation makes it attractive for use in psychological applications [71, 75]. Bootstrapping has a variety of uses. For instance, researchers often apply bootstrapping to approximate confidence intervals of parameters estimated by point-estimation methods such as maximum likelihood [42]. There are two different approaches to the bootstrap. Non-parametric bootstrapping uses only the observed data to resample and generate more data, while parametric bootstrapping generates new samples from a parametric model that has first been fitted to the observed data. Both bootstrapping strategies have previously been used with maximum likelihood estimation (MLE) to construct difference distributions of goodness-of-fit for two different models and to differentiate between those models [71].
In this study, we used a non-parametric bootstrapping procedure that generated new samples of participants with replacement 30 times from the observed data for the hierarchical Bayesian models. We performed this analysis in order to improve model comparison inference and to mitigate the influence of particular participants on model comparison. Note that the power of our study to find the effects of spatial attention was based on the number of single trials. For each iteration of model fitting, 30 participants were drawn randomly with replacement from the 15 participants to estimate the posterior distribution of parameters.
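A minimal sketch of this participant-level resampling is shown below. The data layout (a dict keyed by participant ID) is an assumption about storage, not the repository's exact format.

```python
import numpy as np

def bootstrap_participants(data, n_resampled=30, n_iterations=30, seed=0):
    """Yield bootstrap data sets of participants drawn with replacement.

    `data` maps participant IDs to their trial-level arrays (RTs, choices, N200 latencies).
    Each yielded dict would be passed to one hierarchical model fit.
    """
    rng = np.random.default_rng(seed)
    ids = list(data.keys())
    for _ in range(n_iterations):
        drawn = rng.choice(ids, size=n_resampled, replace=True)
        # A participant drawn twice enters the resampled data set as two "participants".
        yield {f"boot_{i}": data[pid] for i, pid in enumerate(drawn)}
```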
In order to replicate and re-analyze the current behavioural dataset with a different approach from our previous work [19] using non-parametric bootstrapping, we built the four models presented in Appendix part A and reported their results in Figures 7 and 8. The results of model selection criteria such as WAIC, PWAIC and ELPD show that non-decision time is the foremost parameter for explaining spatial prioritization during perceptual decision making. We also performed non-parametric bootstrapping for comparisons between Models 1 to 4 on the current dataset. The results of these comparisons, in addition to model comparison for Model 5 without using bootstrapping, are located in Table 3. For each model and bootstrap iteration, three Markov Chain Monte Carlo chains ([14]) were used to generate 10,000 samples from the joint posterior distribution of parameters, of which the initial 2000 samples were discarded as a burn-in phase to minimize the effect of initial values on the posterior inference, and a thinning parameter of 2 was used, resulting in 4000 posterior samples in each chain. The collection of posterior samples from each chain was used to form one posterior sample of 12,000 samples for each parameter. All four models converged, based on R-hat statistics that were less than 1.01 for all parameters in each model. The original 16 participants were re-sampled with the non-parametric bootstrap to 30 participants.
2.6 Comparing models
Several model selection criteria were used to select the best model: the widely applicable information criterion (WAIC), for which lower is better; the log pointwise predictive density (lppd), the log predictive accuracy of the fitted model to the data, for which higher is better; the expected log pointwise predictive density for a new dataset (ELPD), for which higher is better; and the effective number of parameters (PWAIC), which penalizes the complexity of a model and can be used for comparison across these models [63, 69, 13]. According to the parsimony principle, if two models explain the observed data equally well, we should choose the simpler model over the more complex one, resulting in a lower PWAIC value [67, 71]. These criteria were calculated from the posterior draws, where S is the number of sample draws from the posterior distribution, P(yi|θs) is the likelihood of data point yi (from the observed data y1, …, yN) under posterior sample θs for s from 1 to S, and Var denotes the sample variance across posterior draws [66, 17, 69].
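In the standard form given by the cited sources [66, 17, 69], these criteria are computed from the S posterior draws as:

$$\mathrm{lppd} = \sum_{i=1}^{N} \log\!\left(\frac{1}{S}\sum_{s=1}^{S} P(y_i \mid \theta_s)\right), \qquad P_{\mathrm{WAIC}} = \sum_{i=1}^{N} \operatorname{Var}_{s=1}^{S}\!\big[\log P(y_i \mid \theta_s)\big],$$

$$\mathrm{ELPD} = \mathrm{lppd} - P_{\mathrm{WAIC}}, \qquad \mathrm{WAIC} = -2\left(\mathrm{lppd} - P_{\mathrm{WAIC}}\right).$$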
2.7 Individual differences analysis
The assumption of the hierarchical levels is that the posteriors of participants' parameters come from a parent distribution, such as a normal distribution. This often leads to better parameter estimates due to shrinkage, even if the hierarchical distributional assumptions are not exactly true [16]. However, our neuro-cognitive hierarchical models had the disadvantage of containing potentially noisy EEG measures that differed across participants, and it was difficult to incorporate these differences in EEG noise with hierarchical parameters. Therefore, to ensure we picked the best model out of Model1, Model2, Model3, Model4, and Model5, we also fit individualized (non-hierarchical) models to each individual and performed model comparison for each participant.
2.8 Simulation of three hypotheses and parameter recovery
In order to answer the question of whether spatial attention affects only visual encoding times, other non-decision times, or both, we considered three theories. These three theories were simulated directly and the simulated data were then fit with each of the hierarchical neuro-cognitive models previously discussed. For each simulation we simulated single-trial Visual Encoding Time (VET) and Motor Execution Time (MET). The specific single-trial VETs (τ(e)) and METs (τ(m)) were both assumed to come from random draws from normal distributions with means that could change based on the spatial prioritization condition and with a standard deviation of 100 ms (.1 sec). We assumed that single-trial N200 latencies were drawn from a normal distribution with mean .5τ(e)i and variance of 50 ms. Note that N200 latencies are a purely visual ERP and have been shown to track the onset of evidence accumulation [38, 37, 44]. For this reason, we assumed that true VET positively influenced only the single-trial N200 latencies, and that true MET (a reflection of all other possible non-decision times unrelated to visual processing) was unaffected by single-trial N200 latencies. In each of the three simulations, we varied which of VET or MET would change with the two spatial prioritization (k2) levels, while the drift rate always changed with the two coherence levels (k1).
The specific formulations of the simulations were as follows:
In Simulation 1, we simulated N200 latencies and choice-response times with VET μτe = .3 (300 ms) in the prioritized level of the spatial prioritization condition and VET μτe = .5 (500 ms) in the non-prioritized level of the same condition. Also in Simulation 1, we simulated MET μτm = .4 (400 ms) in each level, and thus MET (μτm) did not change with spatial prioritization. In Simulation 2, we simulated data with VET μτe = .3 in each spatial prioritization level but with MET changing across spatial prioritization levels, such that MET μτm = .4 (400 ms) in the prioritized level and MET μτm = .6 (600 ms) in the non-prioritized level. In Simulation 3 both VET and MET changed, with VET μτe = .3 (300 ms) and MET μτm = .4 (400 ms) in the prioritized level and VET μτe = .5 (500 ms) and MET μτm = .6 (600 ms) in the non-prioritized level.
For each of the three simulations we simulated the data with δ = 2.5 (evidence units per second) in the high coherence level of the spatial coherence condition and δ = 1.5 in the low coherence condition. In each of the three simulations the other parameters were fixed across all levels of the two conditions: α = 1.5 evidence units, β = .5 evidence units, η = 0 evidence units / sec.
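The sketch below shows how one cell of such a simulation could be generated (a simplified illustration, not the exact simulation scripts, which are in the repository): single-trial VET and MET are drawn around their condition means, the N200 latency tracks only VET, and the response time adds the decision time from the diffusion process.

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_cell(mu_vet, mu_met, drift, n_trials=500, boundary=1.5, dt=0.001):
    """One condition cell of a simulation: returns (rt, choice, n200) in seconds."""
    vet = rng.normal(mu_vet, 0.1, n_trials)    # single-trial VET, spread of 100 ms
    met = rng.normal(mu_met, 0.1, n_trials)    # single-trial MET, spread of 100 ms
    n200 = rng.normal(0.5 * vet, 0.05)         # N200 latency tracks VET only (50 ms spread)
    rt = np.empty(n_trials)
    choice = np.empty(n_trials, dtype=int)
    for i in range(n_trials):
        evidence, t = 0.5 * boundary, 0.0      # beta = .5, eta = 0
        while 0.0 < evidence < boundary:
            evidence += drift * dt + np.sqrt(dt) * rng.standard_normal()
            t += dt
        rt[i] = vet[i] + t + met[i]
        choice[i] = int(evidence >= boundary)
    return rt, choice, n200

# Simulation 3, high coherence: both VET and MET shift with spatial prioritization.
rt_p, _, n200_p = simulate_cell(mu_vet=0.3, mu_met=0.4, drift=2.5)     # prioritized
rt_np, _, n200_np = simulate_cell(mu_vet=0.5, mu_met=0.6, drift=2.5)   # non-prioritized
```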
We then fit each of the three simulated datasets with the first four hierarchical models. For each model fit, three MCMC chains ([14]) were used to generate 10,000 samples from the joint posterior distribution of parameters, of which the initial 4000 samples were discarded as a burn-in phase to minimize the effect of initial values on the posterior inference, and a thinning parameter of 3 was used, resulting in 6000 posterior samples. All models converged, based on R-hat statistics that were less than 1.01 for all parameters in each model.
3 Results
3.1 Behavioral modeling results
To understand the effects of both independent variables, two-way repeated-measures ANOVAs were used to assess the main effects and interaction of the variables for both response time and accuracy. Fitting an ANOVA to mean response times across participants revealed significant main effects of phase coherence (F(2,14) = 31.72, p < 0.001) and spatial cueing (F(2,14) = 14.29, p = 0.002). These results imply that both the spatial cueing and phase coherence manipulations shifted response times in the face-car perceptual decision making task. In particular, the high phase coherence condition had faster response times than the low phase coherence condition. Also, informative one-way arrows giving prioritized cues resulted in faster response times than two-way arrows giving non-prioritized cues. Note that, for response time, the Cohen's d effect sizes of the spatial manipulation and phase coherence are 1.5 and 1.01 respectively, and for accuracy the Cohen's d effect sizes are 2 and .47 respectively [9]. The interaction effect between phase coherence and spatial cueing was not significant (F(2,15) = 0.81, p = 0.38). Because of these results, we fit neuro-cognitive models with certain parameters that could change based on each experimental manipulation, reflecting the 2 × 2 task design, instead of allowing all parameters to change based on a four-condition design.
A two-way repeated-measures ANOVA was also fit to mean accuracy across participants. The main effect of phase coherence was significant (F(2,14) = 55.82, p < 0.001), but the main effect of spatial cueing was not (F(2,13) = 3.0, p = 0.1). Similarly, the interaction of spatial cueing and phase coherence was not significant (F(2,13) = 0.05, p = 0.81).
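For reference, these analyses correspond to a design like the sketch below, shown here with synthetic stand-in cell means (the column names and numbers are placeholders, not the study data).

```python
import numpy as np
import pandas as pd
from statsmodels.stats.anova import AnovaRM

# Synthetic per-participant condition means standing in for the real data.
rng = np.random.default_rng(0)
rows = []
for pid in range(16):
    for coh in ("high", "low"):
        for cue in ("prioritized", "non-prioritized"):
            mean_rt = (0.6 + 0.08 * (coh == "low")
                       + 0.04 * (cue == "non-prioritized") + rng.normal(0, 0.03))
            rows.append(dict(participant=pid, coherence=coh, cueing=cue, mean_rt=mean_rt))
df = pd.DataFrame(rows)

# Two-way repeated-measures ANOVA with coherence and cueing as within-subject factors.
res = AnovaRM(df, depvar="mean_rt", subject="participant",
              within=["coherence", "cueing"]).fit()
print(res.anova_table)   # main effects of coherence and cueing, plus their interaction
```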
The results of cognitive modeling of behaviour, fitting hierarchical DDMs across participants, confirmed that non-decision time is the parameter most influenced by spatial prioritization (see Figure 7 in Appendix part A). We used a non-parametric bootstrap procedure with 30 draws. This was a direct replication of our previous work [19]. Specifically, to assess the four cognitive models (see Appendix), we computed WAIC across non-parametric bootstrap samples. For each bootstrap sample, non-decision time was selected as the most likely parameter manipulated by spatial cueing. For each of the cognitive models, the means of WAIC, PWAIC and ELPD across bootstrap samples are displayed in Appendix part A, Table 5. This result corresponds to the behavioral result, because spatial cueing significantly changed mean response time and not mean accuracy, which is exactly what we would expect if this effect only influenced non-decision time and not evidence accumulation.
3.2 Parameter recovery in the simulated data
In order to understand how the model fits map onto the three simulated hypotheses, we checked both the model comparison criteria and the posteriors of the recovered parameters for each of the three simulations. For Simulation 1, Model 1 fit the data the best among Models 1 through 4 in terms of both WAIC and ELPD, see Table 2 and Figures 10 and 11. For Simulation 2, Model 2 fit the data the best among Models 1 through 4, see Table 2 and Figure 12. For Simulation 3, Model 4 fit the data the best among Models 1 through 4, see Table 2 and Figure 4.
Model comparison criteria for the three simulated participants. * indicates the best model fit statistic among Models 1 to 4. Bold text indicates the best model fit statistic overall.
Parameter posteriors of Model 4 fit to Simulation 3. The shaded areas represent the 95% Bayesian credible intervals (BCI) of the posteriors.
Although the minimum WAIC and maximum ELPD are signs of the best model, Model 5 did not differentiate between the three hypotheses: both Simulation 1 and Simulation 3 were best fit by Model 5. Therefore, a result in which Model 5 was the best-fitting model for the real data would not differentiate between two hypotheses, namely that only VET or that both VET and MET were affected by spatial prioritization. We did not find evidence that Model 5 fit the real data the best, and thus we differentiated the hypotheses using the model fits of Models 1 to 4.
3.3 Single-trial N200s were recovered for each participant with various levels of noise
We found single-trial N200s by using either the first or second component of the SVD method described above. For each participant we found component channel weights and component waveforms that best represented N200 latencies, see Figure 5. However, the results indicate that the amount of noise across trials in the single-trial estimation varied by participant. Specifically, the number of trials with latency estimates at the boundaries of realistic N200 values varied over participants, suggesting that some single trials in those participants did not contain quality N200 latencies. In the neuro-cognitive models, we excluded these boundary effects by imputing N200 latencies for these trials while still modeling choice-response times on these trials. Specifically, we used a Bayesian model as implemented in the Stan language, in which missing data were represented as new parameters.
Weighted map representations of the first or second SVD component for each participant. These scalp maps indicate the negative activation of occipito-parietal electrodes around the single-trial N200 peaks. Also displayed are the N200 waveform and the distribution of single-trial latencies across all trials and electrodes for each participant.
We also analyzed each specific participant's data in separate neuro-cognitive model fits without hierarchies across participants. In particular, Participant 5 had the fewest boundary effects in single-trial N200 estimation and had clear posterior channel weights and a clear N200 waveform; therefore we considered the results of neuro-cognitive model fits of only this participant, in addition to those of all participants (see Results below).
3.4 Spatial prioritization shifts both VET and MET
We wished to know which non-decision time sub-component was most affected by the spatial prioritization manipulation during face-car perceptual decision making. We used the results of the four neuro-cognitive models to discover which of VET or other non-decision time components were manipulated by the top-down spatial cue. Based on model comparison criteria across non-parametric bootstrapping, Model4 was found to have the smallest WAIC and largest ELPD of all models. For each of the non-parametric bootstrap samples, the WAIC and -ELPD of Model4 are lower than those of the other models (see Table 1). For each of the neuro-cognitive models, the means of WAIC, PWAIC and ELPD across bootstrap samples are given in Table 3.
Based on the four criteria for model fitting, Model4 is the best model to describe the manipulation of spatial prioritization. Lower WAICs indicate better fits to the data after accounting for model complexity. For the other model selection criteria, larger values indicate better model fits. Note that PWAIC is the number of effective parameters and thus is not a great measure of model comparison on its own. Bold text indicates the best model fit statistic.
Because Model4 was selected as the best-fitting model based on four model selection criteria, we report the results of that model's parameters and the sub-components of its non-decision time, see Figure 6. Of particular note is a difference in the posterior distributions of the residuals r between the prioritization and non-prioritization levels. Fourteen of 15 participants had significant differences in single-trial N200 latencies between the prioritization and non-prioritization conditions at an alpha level of 0.05, see Figure 6.
Posterior distributions of Hierarchical Bayesian model 4. HCP: high coherence and prioritization stimulus, HCNP: high coherence and non-prioritization stimulus, LCP: low coherence and prioritization stimulus, LCNP: low coherence and non-prioritization stimulus.
We also fit the neuro-cognitive models to each participant's data individually. We found that Model 3 fit 9 participants' data the best, Model 4 fit 2 participants' data the best, and Model 5 fit 1 participant's data the best. However, because the single-trial N200 results suggested that some participants had lower quality N200 estimation across trials (see the Results section above), we also used the cleanest participant's single-trial N200 data to differentiate the three main hypotheses. Fitting the neuro-cognitive models to Participant 5 also resulted in Model 4 being the best-fitting model, see Table 4.
Model comparison for individual-level Bayesian analysis to compare five nested models. * indicates the participant with the most trustworthy single-trial N200 results. Bold text indicates the best model fit statistic.
Based on the four criteria for model fitting, Modelt is the better model to describe the manipulation of spatial prioritization. Lower WAICs indicate better fits to the data after accounting for model complexity. For the other model selection criteria, higher values indicate better fits to the data.
4 Discussion
Accumulation models can split the cognitive time course of the perceptual decision making process into a decision time and a non-decision time. However, relying on fitting stochastic processes or formulations to behavioural data alone is not sufficient to decompose sub-components of non-decision time. Thus, answering cognitive questions about the effects of experimental manipulations on non-decision processes during perceptual decision making requires more information than just participants' behavioural data. Scientists have previously used electrophysiological activity to help quantitative models better track latent psychological mechanisms, including encoding and motor execution sub-components [59, 44, 72]. Neuro-cognitive models combining drift-diffusion and single-trial EEG measures are a candidate for separating independent components of the decision process and for testing hypotheses or differentiating theories [46]. In this study, we used single-trial N200 latencies from occipito-parietal electrodes, assumed to dynamically track visual encoding time, to understand the effects of spatial attention on visual encoding and other non-decision time components.
4.1 Brain networks of spatial attention
We found evidence that Visual Encoding Time (VET), as encoded by relationships of single-trial N200 latencies to response time distributions, is affected by spatial attention. Specifically, these single-trial N200 latencies were found in posterior EEG electrodes, time-locked to the onset of visual stimuli. These results correspond to a large body of previous work identifying spatial attention networks. A dorsal fronto-parietal network has been reported to play an effective role in top-down spatial attention [6]. This network usually extends to the occipital cortex (specifically, V1 and extrastriate areas), which likely reflects a top-down cue modulating sensory representations [43, 3]. Bilateral BOLD signals in the primary visual cortex have been shown to be modulated by spatial cueing, such that responses were greater when the subject attended to stimuli in the contralateral hemifield [15, 29]. Findings show that successive locations of visual targets lead to sustained activation of the intraparietal sulcus (IPS) and momentary activation of occipital cortex [10, 6]. The posterior parietal cortex, consisting of the superior parietal lobule (SPL) and IPS, is also a key component of an endogenous orienting network [41]. Finally, pre-stimulus alpha (8–14 Hz) oscillatory EEG activity at parieto-occipital sites is modulated by endogenous spatial cueing paradigms [74, 65, 56].
However, we also found evidence that spatial attention affects additional non-decision times apart from Visual Encoding Time (VET). One possible reason is that Motor Execution Time (MET) is affected by spatial attention in these data. Before the study we considered an effect of spatial attention on MET improbable. However, our results can be placed in the context of previous work. For instance, areas carrying top-down attention signals, such as the lateral intraparietal area (LIP), have effects on response or action selection [10]. Attention-related modulations seem to happen primarily in response-related processes, and a collection of evidence shows that the primate premotor cortex, including the dorsal premotor cortex (PMd) and supplementary motor area (SMA), plays a role in spatial attention [62, 31, 40]. Finally, the premotor theory of attention derives from neurophysiological studies indicating that spatial attention is linked to the programming of motor plans [61, 25].
4.2 Top-down spatial cue prioritization does not affect evidence accumulation
Researchers have previously found evidence that fixation (gaze) has an amplifying effect on the attended option [32]. This phenomenon can be modeled with a sequential sampling theory of perceptual decision making, the attentional Drift Diffusion Model (aDDM) [33]. The assumption of the aDDM is that attention can influence the rate at which evidence is gathered from the options during decision making; thus, attention may shift the drift-rate (δ) parameter [33, 23]. Also, the aDDM assumes that selective gaze will result in relevant information being accumulated while irrelevant information is ignored. This simple assumption is not necessarily correct, because subjects may gaze at a low-value option while attending to the high-value option, resulting in more evidence being gathered in favor of the high-value option.
However, covert top-down spatial cue prioritization in the current experiment has a different mechanism for amplifying performance. In this spatial attention manipulation, participants are only informed about the location at which a stimulus will appear, and therefore top-down attention should not be related to dynamically shifting attention or eye movements between options during the decision task. There is only one stimulus to which the participant should attend, the face or the car stimulus. Thus, after seeing the prioritization cue, participants make a decision between a face and a car at a certain level of coherence. It is therefore understandable that the most important latent parameter affected by spatial cueing is not the drift rate in the current task (see top left of Figure 9 in the Appendix).
We proposed five nested models (scenarios) to test which parameters and sub-components are influenced by spatial attention. After comparing the hierarchical models, we found that Model 4 best fit the data, which in simulation was the best model when both VET and other non-decision time components were shifted by spatial attention. Hierarchical effects of single-trial N200 latencies were positive, indicating relationships to VET, see Figure 9g. The remaining non-decision time is assumed to be an approximate measure of motor execution time (MET) or a mixture of other non-decision time processes. We found evidence that this non-decision time was also manipulated by spatial cueing, such that the prioritization level resulted in faster non-VET non-decision times than the non-prioritization level.
4.3 Individual differences in the findings of this study
While across all participants the model results suggest that both VET and other non-decision time components are affected by spatial cueing, the exact results varied by participant. We suspect that this could be due to actual strategic or cognitive differences across individuals, or to different levels of noise and contaminants in the EEG N200 measures and behavioural data.
Specifically, we found that most participants' data were best described by Model 3 while some other participants' data were best described by Model 4 (see Table 4). This implies that for most participants, only the relationship of the N200 to VET is affected by spatial prioritization. However, when evaluating a particular participant's data with less noisy and seemingly more precise single-trial N200 estimates, we found that Model 4 best describes the data. Further study is necessary to determine whether there are true cognitive differences in spatial attention between individuals or whether these results were driven by differences in EEG artifacts between the two conditions in some participants.
4.4 Limitations of this study and future work
We decided to focus on occipito-parietal electrodes based on our literature review and previous research. We used the first or second component of the SVD to extract single-trial latencies, which are assumed to track single-trial visual encoding. The SVD provides a weighted map over all electrodes and a waveform at the same time, such that the weighted map should be positive over occipito-parietal sensors and the waveform should have a negative peak. However, more advanced signal processing techniques for extracting single-trial measures of EEG oscillations could be developed to improve inference with less across-trial noise or artifact. Furthermore, neuro-cognitive models that better account for sources of single-trial artifact could be fit to EEG and behavioural data jointly.
Weindel et al. have proposed a novel approach based on the onsets of electromyographic (EMG) activity to separate response time into a pre-motor time (PMT) and a motor time (MT) in perceptual decision tasks [72, 73]. Furthermore, researchers have shown that such data can be described well by Dual-Threshold Diffusion Models (DTDMs) [58]. In the future, researchers should seek to combine both EEG and EMG activity within neuro-cognitive models to better decompose non-decision components from decision components in perceptual decision making tasks. These models will provide even more insight into the role of spatial attention on non-decision time components. Specifically, these models can test whether evidence accumulation models that include non-decision time stages between evidence accumulation stages better explain the effects of spatial prioritization on non-decision times, because VET, evidence accumulation, and MT could be separately identified within such models. These models could also directly test whether spatial prioritization affects motor time (MT) given appropriate experimental paradigms. This was not a question that we could easily answer with existing data sets.
5 Conclusion
Top-down spatial prioritization amplifies individuals' performance, allowing quicker decisions, since they do not need to spend time searching the visual field for the location where the stimulus will appear. Neuro-cognitive modeling incorporated human behavior and electrophysiological activity at the same time, each constraining the other, in order to decompose the non-decision time and understand the differential effects of spatial attention on visual encoding time (VET) and other non-decision time processes.
We revealed that both visual stimulus encoding and other non-VET non-decision processes are manipulated by spatial cueing. The group-level effect of single-trial N200 latencies on non-decision times was positive, and the group-level residual time for prioritization was lower than for non-prioritization. Moreover, we show that occipito-parietal deflections extracted by SVD play a key role in the study of spatial attention, derived directly from N200 latencies thought to track visual encoding time and the onset of evidence accumulation [37, 44].
5.1 Data and code sharing
Five neuro-cognitive models and four cognitive models were constructed. All models were run on a PC with 15 cores and 60 GB of RAM running Debian. All implementations for EEG preprocessing, singular value decomposition (SVD), and hierarchical and individual-level Bayesian DDM models in Python and PyStan are available at https://github.com/AGhaderi/spatial_attenNCM, and the dataset used in this research is publicly available on the Open Science Framework (https://osf.io/q4t8k/).
6 Appendix
6.1 Testing hypotheses on behavioral data
We re-analysed this dataset with non-parametric bootstrapping to enlarge the dataset and further verify our results in [19]; those results are reported here. We proposed four hierarchical models: Modelt, assuming spatial attention may shift non-decision time; Modelv, assuming spatial attention may shift the drift rate parameter; Modela, assuming spatial attention may manipulate the boundary; and Modelp, assuming spatial attention manipulates no parameters significantly. Based on model selection criteria for the hierarchical Bayesian models, Modelt is the best model for all non-parametric samples (see Figures 8 and 7).
The four applicable information criteria for M = 30 non-parametric bootstrap samples and four models. Each sample is of size n = 24, drawn with replacement (double size) from the 12 training samples. To choose the best model, the WAIC measure should be lowest and the ELPD should be highest.
Schematic diagrams of the four hierarchical Bayesian models based on the convention of Lee and Wagenmakers [36]. Nodes show random variables in the model and arrows specify which variables affect other variables. The reaction time yijk is an observed variable for trial i, participant j, and condition k. Indices k1 and k2 denote the two-level coherence and two-level spatial prioritization conditions. In Modelt, participant-level parameters such as the evidence accumulation rate δjk1, non-decision time τjk2, boundary separation αj and trial-to-trial drift rate variability ηj, conditional hierarchical-level parameters μ(δ)k1 and μ(τ)k2, and non-conditional hierarchical-level parameters such as σ(δ), σ(τ), μ(α), σ(α), μ(η) and σ(η) are estimated by MCMC sampling.
For Modelt (Figure 8a): the index i refers to trials (the random variables in these behavioral models do not vary by trial); j indexes participants, k conditions, k1 coherence, and k2 spatial prioritization.
Distributions of parameters for each participant j and condition k are as follows:
For Modelv (Figure 8b): prior distributions of drift-rate and non-decision time parameters were changed as follows:
For Modela (Figure 8c): prior distributions of drift-rate free from coherence and non-decision time parameters free from spatial prioritization as follows:
For Modelp (Figure 8d): only prior distributions of drift-rate free from coherence as follows:
Finally, R-hat was less than 1.1 for all parameters in each model and non-parametric bootstrap. Also, the mean effective sample size across all parameters, non-parametric bootstraps, and models was 8200.
6.2 The full neuro-cognitive model, Model5
Model5 with all variables free to vary by trial-to-trial N200 latencies:
6.3 Three simulations
Posterior distributions of Hierarchical neuro-cognitive Bayesian model 5.
Parameter posteriors of Model 1 fit to simulated participant data 1. The shaded areas represent the 95% Bayesian credible intervals (BCI) of the posteriors.
Parameter posteriors of Model 5 fit to simulated participant data 1. The shaded areas represent the 95% Bayesian credible intervals (BCI) of the posteriors.
Parameter posteriors of Model 2 fit to simulated participant data 2. The shaded areas represent the 95% Bayesian credible intervals (BCI) of the posteriors.
6.4 The proportion of N200 latency
The proportion of posterior samples above zero for the non-prioritized minus prioritized N200 latency difference is reported for each participant as follows:
[0.63, 0.69, 0.50, 0.77, 0.85, 0.58158333, 0.46, 0.66, 0.32, 0.82, 0.27, 0.39, 0.82, 0.60, 0.38].
References