Stimulus-choice (mis)alignment in primate MT cortex

For stimuli near perceptual threshold, the trial-by-trial activity of single neurons in many sensory areas is correlated with the animal’s perceptual report. This phenomenon has often been attributed to feedforward readout of the neural activity by downstream decision-making circuits. The interpretation of choice-correlated activity is ambiguous, but its meaning can be better understood in light of population-wide correlations among sensory neurons. Using a statistical nonlinear dimensionality reduction technique on single-trial ensemble recordings from the middle temporal (MT) area during perceptual decision-making, we extracted low-dimensional neural trajectories that captured the population-wide fluctuations. We dissected the contributions of sensory-driven versus choice-correlated activity in the low-dimensional population code. We found that the neural trajectories strongly encoded the direction of the stimulus in a single dimension with a temporal signature similar to that of single MT neurons. If the downstream circuit were optimally utilizing this information, choice-correlated signals should be aligned with this stimulus-encoding dimension. Surprisingly, we found that a large component of the choice information resides in the subspace orthogonal to the stimulus representation, which is inconsistent with the optimal readout view. This misaligned choice information allows the feedforward sensory information to coexist with the decision-making process. The time course of these signals suggests that this misaligned contribution is likely feedback from downstream areas. We hypothesize that this non-corrupting choice-correlated feedback might be related to learning or reinforcing sensory-motor relations in the sensory population.

Author summary

In sensorimotor decision-making, the internal representation of sensory stimuli is used to generate behavior appropriate for the context. Therefore, the correlation between variability in sensory neurons and perceptual decisions is sometimes explained by a causal, feedforward role of sensory noise in behavior. However, this correlation could also originate via feedback from decision-making mechanisms downstream of the sensory representation. This cannot be resolved by analyzing single-unit responses, but requires a population-level analysis. Area MT contains both sensory and choice information and is known to be the key sensory area for visual motion perception. Thus the decision-making process may be corrupting the sensory representation. However, we find that the sensory stimuli and choice variables are separate at the population level, contradicting previous interpretations based on single-unit recordings. This new insight suggests how neural systems can maintain a mixed representation while allowing learning and adaptation.

Sensory cortical neurons exhibit substantial variability to repeated presentations of the same stimulus [1,2]. This variability depends on the specifics of the sensory stimulus and task being performed [3-7], and is often correlated with the trial-by-trial perceptual report of the animal [8-11]. This trial-by-trial correlation between neural responses and perceptual reports, often quantified as choice probability (CP), has long been of interest for its potential to reveal the mechanisms by which downstream areas read out the responses of the relevant population of sensory neurons [12-14]. However, this interpretation is complicated by the presence of interneuronal correlations [15] and top-down feedback [9,16], and it also depends on assumptions about the readout mechanisms of downstream brain areas [12,14,16,17].

Several models of perceptual decision-making have been proposed to explain the empirical relationships between stimuli, neural responses, and behavioral choices [12,14,16]. Existing proposals come in two basic flavors: those that posit an optimal readout that is limited by shared neural variability [14,18,19] and those that assert that choice-related feedback modifies the signals in sensory areas [16,20]. Several recent experimental results support the feedback hypothesis [7,9,20]. Although feedback can be interpreted in terms of probabilistic inference [16], the resulting pattern of variability in sensory areas will reduce the information about the stimulus [16,19,21] and impair performance on the task [20]. Why would the brain bother to feed back a choice or decision that corrupts the sensory information and makes it do worse on the task? Here, we propose an alternative hypothesis: that the feedback can be non-corrupting, effectively multiplexing choice signals in a sensory population without diminishing information about the stimulus.

To visualize the space of hypotheses and how they can be distinguished, it is helpful to summarize the joint activity of a population of neurons with respect to the stimulus-driven activity. Figure 1 demonstrates this alignment conceptually and the effect of each type of choice model in this space. Specifically, for a population of only two neurons, the joint activity of the population can be represented as points in a 2D space where each axis represents an individual neuron's activity (Figure 1A). For a one-dimensional stimulus (as is typically used in discrimination paradigms), different values of the stimulus (red and black) will drive activity that falls along a one-dimensional "stimulus axis". Increased variability along the stimulus axis will decrease the amount of information about changes in the stimulus, while, importantly, variability orthogonal to the stimulus axis will not [19,22]. We call the subspace containing this orthogonal variability the "non-stimulus subspace" (Figure 1A). In larger populations, the "stimulus axis" could span more than one dimension; however, there will still be a subspace that is orthogonal to the stimulus axes and, therefore, will not affect information about the stimulus (i.e., the null space of the stimulus axes).
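As a minimal sketch of the geometric argument above, the following Python snippet simulates two hypothetical neurons whose trial-to-trial noise is confined to the non-stimulus subspace and checks that a readout along the stimulus axis is unaffected. The axis orientation, the noise level, and all function names are illustrative assumptions, not quantities taken from the recordings.

```python
import math
import random

def project(point, axis):
    """Scalar projection of a 2D point onto a unit-length axis."""
    return point[0] * axis[0] + point[1] * axis[1]

# Hypothetical unit-length stimulus axis for two neurons, and the direction orthogonal to it.
stim_axis = (math.cos(math.pi / 4), math.sin(math.pi / 4))
null_axis = (-stim_axis[1], stim_axis[0])

def simulate_trial(stim_value, rng, null_sd):
    """Mean response along the stimulus axis plus noise confined to the non-stimulus subspace."""
    noise = rng.gauss(0.0, null_sd)
    return (stim_value * stim_axis[0] + noise * null_axis[0],
            stim_value * stim_axis[1] + noise * null_axis[1])

rng = random.Random(0)
# 100 trials for each of two stimulus values; the noise is large but purely orthogonal.
trials = [simulate_trial(s, rng, null_sd=5.0) for s in (-1.0, 1.0) for _ in range(100)]
# Reading out along the stimulus axis recovers the stimulus value on every trial.
readout = [project(p, stim_axis) for p in trials]
```

In this toy setting the point cloud is strongly elongated along `null_axis`, yet the two stimulus values remain perfectly separable along `stim_axis`, which is the sense in which orthogonal variability does not reduce stimulus information.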

By realigning the population activity to the "stimulus axis", the effects of noise correlations and feedback can be visualized clearly. Noise correlation appears as any elongation of the joint activity point cloud for repeats of the same stimulus (Figure 1B). In this space, the optimal readout of such a population is to draw a criterion line (decision boundary) orthogonal to the stimulus axis and report which side the population activity on that trial fell on (Figure 1C) [...] reduces the performance on the task for weak stimuli (Figure 1E). In contrast, feedback could be non-corrupting by pushing choice information only into the non-stimulus subspace (Figure 1F). This increases CP in the non-stimulus subspace without adding CP along the stimulus axis and does not diminish stimulus decoding performance. In each of these examples (Figure 1C,E,F) the readout is optimal (orthogonal to the stimulus axis). For completeness, one additional possibility is that the readout is suboptimal and the downstream areas are mistakenly including variability that is in the non-stimulus subspace, giving rise to CP in the non-stimulus subspace (Figure 1D).

To test these different hypotheses requires an analysis of the joint statistics of [...] stimulus, choice, and trial-to-trial variability present in the population activity are decomposed into shared low-dimensional neural trajectories and noise that is private to each neuron (Figure 2). As expected, low-dimensional shared signals capture a majority of the variability in these data, as seen previously in other areas [6,23,25,26]. By aligning the latent signals to the stimulus and task variables, we were able to investigate how stimulus and choice are encoded by neurons collectively.

We found that the task variable (visual motion) was primarily captured by a single latent dimension, indicating that the high-dimensional visual stimulus was represented in a low-dimensional, task-relevant manner across the MT population. Additionally, we found that the choice-correlated variability in the population was mainly captured by [...] non-stationary environments [27,28].

To understand how stimulus and perceptual choice are encoded across the population, we employed the variational latent Gaussian process (vLGP) method to extract single-trial low-dimensional neural trajectories from population recordings in area MT. We used recordings from the period spanning 100 ms before stimulus onset to 350 ms after offset, and binned the spike counts at 1 ms resolution. Let $x_k$ denote the $k$-th dimension of the latent process. We assumed that the spatial dimensions of the latent process are independent and imposed a Gaussian process (GP) prior on the temporal correlation of each dimension,

$x_k \sim \mathcal{N}(0, K)$.
To obtain smoothness, we used the squared exponential covariance function and the respective covariance matrix $K$ in the case of discrete time. Let $y_{tn}$ denote the occurrence of a spike of the $n$-th neuron at time $t$: $y_{tn} = 1$ if there was a spike at time $t$ and $y_{tn} = 0$ otherwise at this time resolution. Then $y_t$ is the vector of length $N$, the total number of neurons in a session, that concatenates all neurons at time $t$. The spikes $y_t$ are assumed to be a point process generated by the latent state $x_t$ at that time via a linear-nonlinear model, [...]

December 18, 2019 4/15

To infer the latent process ($x_t$ for each trial) and the model parameters ($A$ and $b$), we used variational inference, as the pair of prior and likelihood does not have a tractable posterior. We assumed a parametric variational posterior distribution of the latent process, [...] We analyze the mean $\{\mu_k\}$ as the latent trajectory in this study. The details of inference are described in [23]. To accelerate the inference, we initialized the algorithm at the result of Gaussian Process Factor Analysis (GPFA). The dimensionality of the latent process was determined to be 4 by leave-one-neuron-out cross-validation on the session with the largest population (2). All sessions with more than 10 simultaneously recorded units were included in this study.
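The discrete-time squared exponential covariance matrix used for the GP prior can be sketched as follows. The parameterization (a variance and a timescale expressed in bins) and the function name are assumptions made for illustration; the exact form and hyperparameter values used with vLGP may differ and are described in [23].

```python
import math

def squared_exponential_cov(n_bins, variance, timescale):
    """Return K with K[s][t] = variance * exp(-(s - t)^2 / (2 * timescale^2)).

    s and t are time-bin indices; variance and timescale are hypothetical hyperparameters.
    """
    return [[variance * math.exp(-((s - t) ** 2) / (2.0 * timescale ** 2))
             for t in range(n_bins)]
            for s in range(n_bins)]

# Small example: 5 time bins, unit variance, timescale of 2 bins.
K = squared_exponential_cov(n_bins=5, variance=1.0, timescale=2.0)
```

Entries of `K` decay smoothly with the distance between bins, which is what encourages temporally smooth latent trajectories under the prior.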
[...] where $W$ is the weight matrix to estimate, $E$ is the Gaussian noise matrix, and the regularization hyperparameter $\gamma$ was chosen by generalized cross-validation (GCV) [30]. The PTA was calculated with the design matrices of unit-strength pulses and the estimated weights $\beta$. We smoothed the PTA with a temporal Gaussian kernel (40 ms kernel width).

Since some recording sessions had fewer frozen trials than is ideal for the calculation of choice probability, we instead analyzed the "weak" trials, for which the monkeys' correct rate was below a threshold (65%). We started with the trials of zero pulse coherence and gradually increased the magnitude of coherence (absolute value) until the correct rate reached the threshold. One session containing fewer than 100 weak trials was excluded from this analysis.

We removed the stimulus directions that are encoded in the latent process and raw population activity of weak trials by regressing out the pulses, and analyzed the residuals. The latent process and population activity were re-binned at 100 ms resolution, where the value of each bin is the sum of the latent state $x_t$ or spike counts $y_t$ over the bin for $t = 1, 2, \ldots, T$. For each $t$, we assumed a linear model to predict its value, [...] where $s_i$ denotes the strength of the $i$-th pulse, $w_{ti}$ is the weight vector corresponding to the bin and pulse, and $e$ is the homogeneous Gaussian noise across all bins. We [...] Again, the regularization hyperparameter was chosen by GCV. For the raw population activity, we did the same regression, replacing $x_t$ with the spike count $y_t$.

We then analyzed the contribution of behavioral choice to the residuals [...] For the whole trial we used the sum of the residuals over the windows, $r = \sum_t r_t$. The range of $t$ depends on the period of interest.
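The weak-trial selection described above could be implemented along the following lines. This is only one reading of the procedure: here a coherence level is pooled only if the pooled correct rate stays below the threshold, and the input format (pairs of coherence and a correct flag) as well as the function name are assumptions for illustration.

```python
def select_weak_trials(trials, threshold=0.65):
    """Pool trials from the lowest |coherence| upward while the pooled correct rate stays below threshold.

    trials: list of (coherence, correct) pairs, where correct is a bool.
    """
    levels = sorted({abs(coh) for coh, _ in trials})
    selected = []
    for level in levels:
        candidate = selected + [t for t in trials if abs(t[0]) == level]
        rate = sum(1 for _, correct in candidate if correct) / len(candidate)
        if rate >= threshold:
            break  # adding this coherence level would reach the threshold
        selected = candidate
    return selected

# Hypothetical session: chance performance at zero coherence, better at higher coherence.
example = ([(0.0, True), (0.0, False)] * 10
           + [(0.1, True)] * 3 + [(-0.1, False)]
           + [(0.5, True)] * 20)
weak = select_weak_trials(example)
```

In this made-up example the zero-coherence and 0.1-coherence trials are pooled, while the 0.5-coherence trials are left out because including them would push the pooled correct rate past 65%.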
We trained logistic models, which we refer to as choice decoders, to predict the subject's choice on each trial using either latent trajectories or population responses. The weights $\beta$ and bias $\beta_0$ were estimated by maximum likelihood with [...]

The conventional choice probability only applies to univariate variables. However, both the latent process and the population activity are multivariate. We transformed the multivariate variables mentioned above onto a one-dimensional subspace that has the same direction as the choice through the choice decoders, [...] We refer to this transform as the choice mapping. The quantity $c$ is a normalized value within $[0, 1]$ that maps the residual onto the choice direction [31], and enables [...]

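To make the choice mapping and the resulting univariate choice probability concrete, here is a minimal sketch. It assumes that the choice mapping is the logistic function applied to the decoder's linear output (consistent with $c$ lying in $[0, 1]$, but not confirmed by the text above) and computes CP as the area under the ROC curve via pairwise comparisons. All function names are hypothetical.

```python
import math

def choice_mapping(residual, weights, bias):
    """Map a multivariate residual to a value in [0, 1] (assumed logistic form)."""
    z = bias + sum(w * r for w, r in zip(weights, residual))
    return 1.0 / (1.0 + math.exp(-z))

def choice_probability(values_choice_a, values_choice_b):
    """Probability that a random choice-A value exceeds a random choice-B value.

    Equivalent to the area under the ROC curve; ties count as one half.
    """
    wins = 0.0
    for a in values_choice_a:
        for b in values_choice_b:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5
    return wins / (len(values_choice_a) * len(values_choice_b))
```

A CP of 0.5 means the mapped values carry no information about the choice, while values near 0 or 1 indicate that the two choice groups are well separated.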
To investigate the effect of different dimensions on the choice, we performed sequential likelihood ratio tests by adding the choice-mapped values of the stimulus dimension, the non-stimulus dimensions, and the population, one by one, to a logistic model that predicts the choice, [...]

Three monkeys performed a motion-pulse direction discrimination task with an eye movement to one of two targets [29]. The visual stimulus was presented as a sequence of 7 temporally coherent motion pulses of varying strength. An ensemble of MT neurons was simultaneously recorded using multi-electrode arrays. Given the recording, we statistically inferred a low-dimensional latent process that explains the shared component of the high-dimensional variations in the observed spiking activity. Conventional analysis methods such as factor analysis or principal component analysis assume either observation models inappropriate for spikes (e.g., Gaussian) or linear dynamics that lack the expressive power to describe any non-trivial computation. To overcome these disadvantages, we imposed a general (nonlinear) Gaussian process prior on the latent trajectories and assumed a point-process observation model to account for spikes. The generative model was fit using the variational latent Gaussian process (vLGP) method to recover nonlinear smooth latent trajectories from population recordings [23]. Figure 2A [...]

To validate the model, we evaluated the pairwise noise correlations between neurons on randomly interleaved frozen trials where the stimulus was held constant (Figure 2B). With the inferred latent process and loading matrix, we can generate spike trains from the model. We calculated the noise correlation matrices from data and reconstructed [...]

The latent process is subject to arbitrary rotation [23], which results in models with equivalent explanatory power.
Hence, we rotated the latent processes for each session so that the effects of motion pulses are concentrated in decreasing order across dimensions (Figure 3A). For both subjects, the pulses are faithfully represented as a transiently modulated latent process, and most of the motion information is encoded in the mean value of the first factor; we refer to this factor as the stimulus axis.

We pooled the stimulus-explaining latent factor alignment across all sessions. The first dimension explains most (> 90%) of the PTA in the latent process for all but one session (Figure 3B). This concentration of stimulus information in one dimension is [...] manifold that spans multiple dimensions in the neural space [32]. [...] should rely only on the stimulus and ignore the off-axis "noise" [17]. Hence, for a purely feedforward system, only the noise in the stimulus dimension should influence the choice, resulting in choice correlation reflecting the optimal strategy (Figure 1C). Otherwise, sub-optimal "readout" can show choice correlation through stimulus-irrelevant variability (Figure 1D). On the other hand, feedback paths can mix the downstream choice process signals back into the MT representation: if the feedback is aligned with the stimulus axis, it will corrupt the encoding of the sensory signal (Figure 1E), while misaligned feedback that stays orthogonal to the continuous stream of stimulus-modulated population activity [...] subspace (Figure 1F). [...] non-stimulus axes, and the population, one by one (Figure 5). The choice is significantly correlated with the latent non-stimulus axes ($p < 2.2 \times 10^{-16}$), which indicates that the choice axis is not perfectly aligned with the stimulus axis as the optimal readout or corrupting feedback models suggest. Therefore, our analysis supports representation of choice information in the non-stimulus latent subspace. This misalignment of stimulus axis and choice axis can occur through either non-optimal readout (Figure 1D) or non-corrupting feedback (Figure 1F). The misalignment between choice and stimulus in MT provides evidence for a feedback source of choice information in sensory neurons.
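The likelihood ratio test behind a comparison of nested choice models can be sketched as below. The snippet assumes the log-likelihoods of the reduced and full logistic models have already been obtained, and that the test statistic is referred to a chi-square distribution with degrees of freedom equal to the number of added parameters. The function names and the pure-Python chi-square tail computation are illustrative choices, not the authors' implementation.

```python
import math

def chi2_sf(x, df):
    """Upper tail probability of a chi-square distribution with integer df >= 1."""
    if x <= 0.0:
        return 1.0
    half = x / 2.0
    # Closed forms for df = 1 and df = 2, then a recurrence in steps of 2.
    if df % 2 == 1:
        k, q = 1, math.erfc(math.sqrt(half))
    else:
        k, q = 2, math.exp(-half)
    while k < df:
        # Q(x; k + 2) = Q(x; k) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1)
        q += math.exp((k / 2.0) * math.log(half) - half - math.lgamma(k / 2.0 + 1.0))
        k += 2
    return q

def likelihood_ratio_test(loglik_reduced, loglik_full, extra_params):
    """Return (statistic, p-value) for a nested model comparison."""
    stat = 2.0 * (loglik_full - loglik_reduced)
    return stat, chi2_sf(stat, extra_params)

# Hypothetical log-likelihoods: adding one regressor improves the fit by 2 log units.
stat, p_value = likelihood_ratio_test(-100.0, -98.0, extra_params=1)
```

In a sequential version, each newly added regressor (for example, the choice-mapped non-stimulus dimensions) is tested against the model that contains only the previously added regressors.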

The presence of CP orthogonal to the stimulus axis suggests that choice information is not just a result of noise on the sensory response, but rather arises from another process altogether. [...] [16,20,33-35]. To disambiguate the two, we investigated the temporal profile of choice probability. Behavioral analysis showed that the sensory information immediately after its presentation has a strong influence on the choice [24]. In turn, one would expect to see choice information early in the population activity. If the choice information is only present late in the trial, then we can conclude that feedback from the downstream decision-making process is contributing to the misaligned choice information we observed in the previous section.

To investigate the temporal profile of choice correlation in the non-stimulus axes, we calculated the time course of CP. We fit three linear choice decoders to the latent non-stimulus axes during the early (200-500 ms), middle (600-900 ms), and late (1000-1300 ms) periods, and then used them to decode the whole period with a 100 ms moving window. Figure 6 shows that the middle and late decoders start climbing late during the visual motion presentation and reach a peak around the time the motion stimulus was terminated. This temporal profile is consistent with a choice variable that accumulates sensory evidence [12], and supports non-corrupting feedback from the decision-making process. On the other hand, the early decoder shows a constant choice probability throughout the motion presentation period (Figure 6), which could represent a per-trial choice bias. These observations suggest that the choice information resides in more than one dimension within the non-stimulus subspace.

Figure 6. Time course of choice probability in the non-stimulus subspace suggests feedback from the decision-making process. Decoders were fit to early (yellow), middle (red), and late (purple) periods (300 ms, marked by the colored bars) of non-stimulus latent dimensions to predict choice. We used the resulting weights of the decoders to perform choice mapping on the whole time interval divided into 100 ms non-overlapping moving windows (aligned at the center). The colored curves correspond to the choice probability time course using the respective decoder.

[...] to MT is picked up as population variability along with other noise correlations, denoted $x_1(t)$, $x_2(t)$, $x_3(t)$. To optimally perform the task, the choice should rely only on the stimulus dimension, and hence noise in $x_1$ shows up as CP in relevant units, reflecting their 'readout' strategy (case 1). Non-optimal readout can provide CP through stimulus-irrelevant variability (case 3). Alternatively, feedback from the decision-making process to MT can provide choice correlation in the stimulus-irrelevant subspace (case 4) without corrupting the optimal representation, or in the stimulus-driven shared dimension (case 2), causing non-optimal behavior.
[...] choice, and trial-to-trial variability present in the population activity are decomposed to reveal the underlying signals: individual neurons' private activity and low-dimensional shared signals. As expected, latent low-dimensional shared signals capture the majority of the variability present in the population recordings. By aligning the latent signals to the stimulus and behavioral choice, we were able to investigate how stimulus and choice are shared across neurons. We found that the sensory task variable was primarily captured by a single latent dimension, indicating that high-dimensional [...]