ABSTRACT
We provide evidence that the brain may use time division multiplexing, or interleaving of different signals across time, to represent multiple items in a single neural channel. We evaluated single unit activity in an auditory coding "bottleneck", the inferior colliculus, while monkeys reported the location(s) of one or two simultaneous sounds. Using novel statistical methods to evaluate spiking activity on a variety of time scales, we found that on dual-sound trials, neurons sometimes alternated between firing rates similar to those observed for each single sound. These fluctuations could occur either across or within trials and appeared coordinated across pairs of simultaneously recorded neurons. Fluctuations could be predicted by the state of local field potentials prior to sound onset, and, in one monkey, predicted which sound the monkey would ultimately saccade to first. Alternation between activity patterns corresponding to each of multiple items may be a general strategy employed by the brain to enhance its processing capacity, suggesting a potential connection between such disparate phenomena as variable neural firing, neural oscillations, and limits in attentional or memory capacity.
ONE SENTENCE SUMMARY The brain may use time division multiplexing, or alternation between signals corresponding to different items, to enhance its processing capacity.
Introduction
In the natural world many stimuli or events occur at the same time, evoking activity in an overlapping population of neurons. When neurons are exposed to more than one stimulus to which they can respond, how might they preserve information about each stimulus? In this study we investigated whether spike trains contain interleaved signals corresponding to each stimulus, akin to time-division multiplexing used in telecommunications (Figure 1), and postulated to occur in some form in the brain (1–8).
Multiplexing is most likely to occur when there is an information-processing bottleneck. The coding of sound locations involves such a bottleneck. Sound waves stemming from two sources sum in the world and are sampled at only two locations, i.e. at each ear. In barn owls, multiple locations appear to be de-multiplexed from these signals and encoded as distinct peaks in auditory space maps (9-12). But in primates (including humans) and several other mammalian species, the neural representations themselves involve a bottleneck (13,20). The inferior colliculus (IC) and other auditory structures encode sound location not in a map but in a “meter”: a firing rate code in which neural activity is roughly proportional to the horizontal angle of the sound, reaching an apex (or nadir) at 90 degrees contralateral (or ipsilateral) along the axis of the ears, where the binaural timing and level differences reach their maximal (or minimal) values (Figure 2D, F) (13-20).
A strict meter/firing rate code would seem unable to represent more than one sound location except via multiplexing. The auditory pathway’s maps for sound frequency can only partially ameliorate this situation. Such maps serve to separate the coding of sounds of different frequencies to somewhat different neural subpopulations. However, most natural sounds are spectrally rich and will activate overlapping “hills” of neural activity; even a single pure tone of a particular frequency can evoke activity in 40-80% of IC neurons (21). This raises the question of how a population consisting of such broadly-tuned neurons can preserve information about combinations of sounds, even when they differ in sound frequency. Alternating the coding of different sounds across time would potentially solve this problem.
Results
Monkeys can report the locations of both sounds, indicating that both are coded in the brain
We first tested whether monkeys can perceptually preserve information about multiple sounds presented simultaneously. Monkeys performed a localization task in which they made eye movements to each of the sounds they heard: one saccade on single-sound trials and two saccades in sequence on dual-sound trials (Figure 2A). The sounds were separated horizontally by 30 degrees and consisted of band-limited noise with different center frequencies. They were thus physically distinguishable in principle, and humans can indeed distinguish such sounds (22–24). The monkeys learned the task successfully (example session shown in Figure 2B), and, like humans, typically performed better when the frequency separation between the two sounds was larger (Figure 2C, ~72 vs. ~77% correct for frequency differences of 3.4 vs. 6.8 semitones).
If the monkeys can report the locations of two sounds presented simultaneously, it follows that their brains must preserve information about both sound items. To evaluate the neural basis of this, we focused on the IC because it lies comparatively early along the auditory pathway (a few synapses in from the periphery, and about two synapses prior to signals reaching auditory cortex) (25, 26) and because it is a nearly obligatory station along this pathway (27). Thus, preservation of information about both sound locations in the IC would appear to be required for performance of this task.
Time-and-trial pooled neural activity in the IC is consistent with an “average”, but an average is inconsistent with behavior
Conventional analysis of spike data typically involves two simplifications: spikes are counted within a fairly long window of time, such as a few hundred milliseconds, and activity is pooled across trials for statistical analysis. If IC neurons multiplex signals related to each of the two sounds (arbitrarily dubbed “A” or “B” for the single-sound trials), then they might appear to show “averaging” responses on dual (or “AB”) trials when activity is pooled across time and across trials. But they should not appear to show “summation” responses, i.e. in which the responses on dual-sound trials resemble the sum of the responses exhibited on single-sound trials involving the component sounds. Such summation has been observed in some neural populations in areas such as primary visual cortex (28, 29), the hippocampus (30), or the superior colliculus (31) when multiple stimuli are presented.
To investigate whether responses to two sounds are more similar to the sum or the average of the two single-sound responses, we considered matched combinations of a particular pair of stimuli A and B presented alone or in combination. The set of stimulus A alone, stimulus B alone, and stimuli A and B in combination is referred to as a “triplet”, a term we will use throughout. Using an analysis similar to that of (31), dual-sound responses were converted to Z-scores relative to either the sum or the average of the corresponding single-sound responses (see Methods). Figure 2D-G shows that such trial-and-time-pooled responses more closely resemble averaging than summation: 93% of Z-scores (N=761) were consistent with averaging (gray zone indicating +/-1.96 units of standard deviation) whereas far fewer, 55%, were consistent with summation. This was true even when both sound A and sound B evoked excitatory responses (dark bars). Findings were similar regardless of whether the signals delivered to the audio speakers were identical on dual- and single-sound trials or were adjusted to equate loudness across single- vs. dual-sound trials (see Methods and Supplementary Figure 1). Consequently, in subsequent analyses we pooled across sound level.
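The sum-versus-average comparison can be illustrated with a minimal Python sketch. The exact Z-score formula is given in the Methods and is not reproduced here, so the error model below (independent standard errors of the three trial means) is an assumption, as are the function name `z_vs_prediction` and the made-up spike counts.

```python
from math import sqrt
from statistics import mean, variance

def z_vs_prediction(a_counts, b_counts, ab_counts, mode):
    """Z-score of the dual-sound mean relative to a 'sum' or 'average' prediction."""
    # Weight applied to (mean_A + mean_B): 1 for summation, 0.5 for averaging.
    w = 1.0 if mode == "sum" else 0.5
    predicted = w * (mean(a_counts) + mean(b_counts))
    # Assumed error model: independent standard errors of the three means.
    var_pred = w ** 2 * (variance(a_counts) / len(a_counts)
                         + variance(b_counts) / len(b_counts))
    var_obs = variance(ab_counts) / len(ab_counts)
    return (mean(ab_counts) - predicted) / sqrt(var_pred + var_obs)

# Hypothetical spike counts for one triplet (not data from the study).
a = [10, 12, 11, 9, 13]
b = [20, 22, 19, 21, 18]
ab = [15, 16, 14, 17, 15]
z_avg = z_vs_prediction(a, b, ab, "average")
z_sum = z_vs_prediction(a, b, ab, "sum")
print(abs(z_avg) < 1.96, abs(z_sum) < 1.96)
```

In this toy case the dual-sound counts fall inside the +/-1.96 zone for the averaging prediction and far outside it for the summation prediction, which is the pattern described for most triplets.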
However, such apparent averaging response patterns are inconsistent with the behavioral results: if the neurons truly responded at an average firing rate, then presumably the monkeys should respond to dual sounds as if there were only a single sound at the midpoint of the two sources (Figure 2F). Since monkeys can indicate the locations of both sounds (Figure 2B, C), multiplexing might provide a better explanation for so-called averaging response patterns.
Within and between trial activity fluctuations consistent with multiplexing: visualization and statistical analyses at multiple time scales
Visualization
To determine whether neural activity fluctuates within and/or between trials, creating an overall averaging response but retaining information about each sound at distinct moments, we first sought to visualize the activity on individual trials. Figure 3 shows the activity of two example neurons on dual-sound trials compared to single-sound trials. The colored backgrounds illustrate the median and 25-75% quantiles of the activity on single-sound trials, in 50 ms time bins. Superimposed on these backgrounds is the activity on individual trials. Individual single-sound (A alone, B alone) trials align well with their corresponding 25-75% quantiles, by definition (Figure 3A–B, E–F). But on dual-sound (AB) trials, for any given trial or time bin, some individual traces correspond well to one of the component sounds’ 25-75% quantiles, and on other trials or time bins they correspond well to the 25-75% quantiles of the other component sound. For the neuron in Figure 3C–D, there are whole trials in which the activity appears to match that evoked by sound “A” alone and others in which it better corresponds to that evoked by sound “B” alone. For the neuron in Figure 3G, the firing pattern on dual-sound trials appears to switch back and forth between the levels observed for sounds A and B as the trial unfolds. In short, for these two examples, the activity on dual-sound AB trials does not appear to occur at a consistent value intermediate between those evoked on single-sound A and B trials, but can fluctuate between those levels at a range of time scales.
We developed a series of statistical analyses to test for the presence of these various forms of alternation in firing rates. Several unknowns must be taken into consideration when testing for activity fluctuations. Specifically, the time scale, repeatability, and potential correlations across the neural population are uncertain. Accordingly, we sought to make minimal assumptions about the time scale at which neurons might alternate between encoding each stimulus, and we assumed that any such switching might vary from trial to trial and/or across time within a trial.
Statistical analysis of whole trial spike counts
If neurons alternate firing rates at the time scale of trials, as appears to be the case for the neuron in Figure 3A,B,C–D, then the spike counts from dual-sound responses should resemble a mixed bag of spike counts from each of the component single-sound responses. We tested this hypothesis against other reasonable competing possibilities in a Bayesian model comparison. For this analysis, we evaluated the subset of triplets whose spike counts on single-sound A and B trials could be well modeled by Poisson distributions with statistically different mean rates λA and λB (N=363, see Methods for details).
The competing scenarios to describe the corresponding dual sound trials were:
(a) Mixture: The spike counts observed on individual trials are best described as having come from a weighted mixture of Poi(λA) and Poi(λB) (Figure 4A, purple dashed line). This possibility is consistent with multiplexing across trials.
(b) Intermediate: A single Poisson distribution best describes the spike counts, and this Poisson has a rate λAB that is between λA and λB (Figure 4A, pink dashed line). This possibility is consistent with either multiplexing at faster, sub-trial time scales or with true averaging/normalization.
(c) Outside: Again, a single Poisson, but the rate λAB is outside the range of λA and λB (i.e. is greater than both or less than both; Figure 4A, green dashed line). Summation-type responses would be captured under this heading, as would inhibitory interactions.
(d) Single: A single Poisson describes the dual-sound trial spike counts, but the rate λAB is equal to one of the single-sound rates λA or λB (Figure 4A, red/blue dashed lines). A winner- (or loser-) take-all pattern would fit this category.
In summary, these four models capture the spectrum of possibilities at the whole-trial time scale. A Bayesian model comparison with default priors and intrinsic Bayes factor calculation was used to compute the posterior probabilities of the four models given the neural data.
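To make the four hypotheses concrete, the following Python sketch scores hypothetical dual-sound spike counts under each one. It is only an illustration: it uses maximized log-likelihoods rather than the intrinsic Bayes factors and default priors used in the actual analysis, and the function names, the grid search over the mixture weight, and the example counts are all assumptions.

```python
from math import exp, lgamma, log

def log_pois(k, lam):
    """Log of the Poisson probability of count k at rate lam (lam > 0)."""
    return k * log(lam) - lam - lgamma(k + 1)

def loglik_single(counts, lam):
    return sum(log_pois(k, lam) for k in counts)

def loglik_mixture(counts, lam_a, lam_b, w):
    """Log-likelihood under w * Poi(lam_a) + (1 - w) * Poi(lam_b)."""
    return sum(log(w * exp(log_pois(k, lam_a))
                   + (1 - w) * exp(log_pois(k, lam_b))) for k in counts)

def compare_models(counts, lam_a, lam_b, grid=99):
    """Maximized log-likelihood of each hypothesis; NOT a Bayes factor."""
    weights = [(i + 1) / (grid + 1) for i in range(grid)]
    mix_ll, mix_w = max((loglik_mixture(counts, lam_a, lam_b, w), w)
                        for w in weights)
    lam_free = sum(counts) / len(counts)  # maximum likelihood single rate
    lo, hi = min(lam_a, lam_b), max(lam_a, lam_b)
    free_label = "intermediate" if lo < lam_free < hi else "outside"
    return {
        "mixture": mix_ll,
        "mixture_weight": mix_w,
        "free_rate": loglik_single(counts, lam_free),
        "free_rate_label": free_label,
        "single_A": loglik_single(counts, lam_a),
        "single_B": loglik_single(counts, lam_b),
    }

# Hypothetical counts: half the trials look A-like, half look B-like.
bimodal = [5, 4, 6, 5, 20, 21, 19, 22, 5, 20]
# Hypothetical counts: every trial sits at a stable in-between rate.
steady = [12, 11, 13, 12, 14, 10, 12, 13, 11, 12]
res_bimodal = compare_models(bimodal, 5.0, 20.0)
res_steady = compare_models(steady, 5.0, 20.0)
print(res_bimodal["mixture"] > res_bimodal["free_rate"])
print(res_steady["free_rate"] > res_steady["mixture"])
```

The bimodal counts favor the mixture, whereas the steady counts favor a single intermediate rate. A full Bayesian comparison would additionally penalize model flexibility through the priors, which this sketch does not attempt.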
For a sizeable portion of the triplets, the spike counts on dual-sound trials were better fit by a mixture of the single-sound Poisson distributions than by any single Poisson distribution (Figure 4B, bar labeled “mixture”). These conditions are potentially consistent with time division multiplexing at the level of individual trials; the neuron illustrated in Figure 3A,B,C–D met these criteria. Of the 72 triplets in which one model had a winning probability >0.95, 50 or 69% were categorized this way.
For the next largest category, the best fitting model involved a unique λAB between λA and λB (Figure 4B, bar labeled “intermediate”). These triplets are ambiguous: they could exhibit a true intermediate firing rate on the dual-sound trials, or they could simply show alternation at a time scale more rapid than individual trials (the neuron illustrated in Figure 3E–G was classified as “intermediate”). Of the 72 triplets in which one model had a winning probability >0.95, 18 or ~25% were categorized this way.
The remaining triplets were categorized as “single”, or λAB = λA or λB (a narrowly defined category that consequently did not produce any winning model probabilities >0.95) or “outside”, λAB greater or less than both λA and λB. “Single” can be thought of as a winner-take-all response pattern. “Outside” may be consistent with a modest degree of summation in the neural population, particularly as λAB was generally greater than both λA and λB in this subgroup.
Statistical analysis of within-trial spike counts
We next turned to the question of whether firing patterns fluctuated or remained stable across time within a trial. In particular, might triplets categorized as “intermediate” in the whole trial analysis show evidence of fluctuating activity on a faster time scale?
This is a more challenging statistical question, and required development of a novel statistical approach. We focused on the same triplets selected above, and analyzed single-trial spike counts in 50 ms time bins (see Methods). For each triplet, individual single-sound trials were assumed to be independent realizations from a nonhomogeneous Poisson process with unknown time-dependent firing rates (λA(t) for sound A and λB(t) for sound B). To assess how individual time-varying dual-sound responses related to single-sound responses, each trial from the dual-sound condition was assumed to be a realization of a Poisson process but with its own firing rate function, modeled as an unknown weighted average of the two single-sound firing rate functions: λAB(t)=α(t)λA(t)+(1-α(t))λB(t). The weight function α(t), unique to each dual-sound trial, quantified the potentially time-varying relative contribution of sound A on that trial at time t, while 1-α(t) quantified the complementary contribution of sound B (Figure 5A). A value of α(t)=1 would indicate that the corresponding dual trial’s response at time t closely matched the response distribution at time t of single-sound A trials, and a value of 0 would indicate it matched that of single-sound B trials. An α function realizing values strictly between 0 and 1 would indicate some contribution from both sounds at all times. An α centering around a value close to 0.5 would indicate comparable aggregate contributions from both sounds, whereas one centering close to 0 or 1 would indicate dominance of one sound over the other (Figure 5A, trial-wise mean alphas). A wavy shape of the function would indicate that the relative contributions of the two sounds changed across time at a sub-trial timescale (Figure 5A, max swing sizes).
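The weighted-average rate λAB(t)=α(t)λA(t)+(1-α(t))λB(t) can be illustrated by simulating one dual-sound trial in Python. This is a generative sketch only, not the inference procedure: the per-bin rates (expected spikes per 50 ms bin), the step-shaped α(t), and the helper names are hypothetical.

```python
import random
from math import exp

def dual_rate(alpha, lam_a, lam_b):
    """Per-bin dual-sound rate: alpha(t)*lamA(t) + (1 - alpha(t))*lamB(t)."""
    return [al * la + (1 - al) * lb
            for al, la, lb in zip(alpha, lam_a, lam_b)]

def poisson_draw(lam, rng):
    """Poisson sample via Knuth's multiplication method (fine for small means)."""
    limit, k, p = exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

# Hypothetical expected spike counts per 50 ms bin for each single sound.
lam_a = [4.0] * 6
lam_b = [1.0] * 6
# A trial that is fully A-like for three bins, then switches to B-like.
alpha_switch = [1, 1, 1, 0, 0, 0]
# A trial that sits at a stable midpoint throughout.
alpha_flat = [0.5] * 6

rates_switch = dual_rate(alpha_switch, lam_a, lam_b)
rates_flat = dual_rate(alpha_flat, lam_a, lam_b)
rng = random.Random(0)  # fixed seed so the example is repeatable
counts_switch = [poisson_draw(r, rng) for r in rates_switch]
print(rates_switch, rates_flat, counts_switch)
```

Both trials have the same whole-trial mean rate (2.5 per bin), which is why a whole-trial count alone cannot separate within-trial switching from a stable intermediate rate.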
That we allowed each dual sound trial to have its own temporal pattern is a real novelty of our modeling approach. For each α(t) function we assumed its dynamic pattern was given by a transformed Gaussian process governed by three parameters that directly controlled the function’s long-term centering, and the frequency and amplitude with which the function fluctuated around its long-term centering. These sets of three parameters, one set for each trial, were assumed to arise from a shared but unknown probability distribution – a dynamic pattern generator that was a property of the triplet and could be used to describe its properties. All α(t) functions were then estimated together, jointly with the dynamic pattern generator, within a Bayesian inference framework.
For each triplet, we summarized its dynamic pattern generator by quantifying three features: (1) waviness, (2) centrality, and (3) symmetry (Figure 5A). Quantification was done by repeatedly simulating α(t) functions for hypothetical new trials and summarizing the sampled functions along the three dimensions (Figure 5AB–C). The waviness metric was computed as the odds of obtaining an α(t) function exhibiting a swing of at least 50% between its peak and trough:
Where P denotes the sampling proportion of the simulated α draws. Centrality was computed as the odds of obtaining an α(t) function with its long-term average being closer to the mid-way mark of 50% than the extremes:
Skewness was computed as the maximum of A-skew and B-skew, where A-skew was computed as the odds of obtaining an α(t) function with long-term average closer to 1 than 0, and B-skew being its reverse:
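Written out from the verbal definitions, the three odds could take the following form. This is a sketch under stated assumptions rather than the original notation: ᾱ denotes the long-term (time-averaged) value of a simulated α(t), the subscripted symbols r_w, r_c and r_s correspond to rw, rc and rs, “closer to the mid-way mark than the extremes” is taken to mean |ᾱ − 0.5| < 0.25, and “closer to 1 than 0” is taken to mean ᾱ > 0.5.

```latex
\begin{align}
r_w &= \frac{P\left(\max_t \alpha(t) - \min_t \alpha(t) \ge 0.5\right)}
            {P\left(\max_t \alpha(t) - \min_t \alpha(t) < 0.5\right)} \\
r_c &= \frac{P\left(|\bar{\alpha} - 0.5| < 0.25\right)}
            {P\left(|\bar{\alpha} - 0.5| \ge 0.25\right)} \\
r_s &= \max\left\{
        \frac{P(\bar{\alpha} > 0.5)}{P(\bar{\alpha} < 0.5)},\;
        \frac{P(\bar{\alpha} < 0.5)}{P(\bar{\alpha} > 0.5)}
      \right\}
\end{align}
```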
The three quantified features were then thresholded to generate a 3-way classification of all triplets. Along waviness, a triplet was categorized as “wavy”, “flat” or “ambiguous” according to whether rw>1.3, rw<0.77, or 0.77≤rw≤1.3, respectively. Along centrality, the categories were “central”, “extreme”, or “ambiguous” according to whether rc>3.24, rc<1.68, or 1.68≤rc≤3.24, respectively. Along skewness, the categories were “skewed”, “symmetric” or “ambiguous” according to whether rs>4, rs<2, or 2≤rs≤4, respectively. Supplementary Table 1 and Supplementary Figures 2 and 3 give the results of this 3-way classification, cross-tabulated with the classification done under the whole-trial spike count analysis.
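The tagging step can be sketched in Python as follows. The cutoffs are the ones quoted in the text, but several details are assumptions: the lower cutoff on each axis is treated as a “less than” comparison, “closer to the mid-way mark” is read as a long-term average within 0.25 of 0.5, the simulated α(t) draws are hand-made lists rather than samples from the fitted model, and all function names are hypothetical.

```python
def odds(p):
    """Odds p / (1 - p), returning infinity when p equals 1."""
    return float("inf") if p >= 1 else p / (1 - p)

def summarize(alpha_draws):
    """Return (r_w, r_c, r_s) from a list of simulated alpha(t) sequences."""
    n = len(alpha_draws)
    swings = [max(a) - min(a) for a in alpha_draws]
    means = [sum(a) / len(a) for a in alpha_draws]
    p_wavy = sum(s >= 0.5 for s in swings) / n
    p_central = sum(abs(m - 0.5) < 0.25 for m in means) / n
    p_a = sum(m > 0.5 for m in means) / n
    return odds(p_wavy), odds(p_central), max(odds(p_a), odds(1 - p_a))

def tag(r, upper, lower, high_label, low_label):
    """Three-way threshold: above upper, below lower, otherwise ambiguous."""
    if r > upper:
        return high_label
    if r < lower:
        return low_label
    return "ambiguous"

def classify(alpha_draws):
    r_w, r_c, r_s = summarize(alpha_draws)
    return (tag(r_w, 1.3, 0.77, "wavy", "flat"),
            tag(r_c, 3.24, 1.68, "central", "extreme"),
            tag(r_s, 4, 2, "skewed", "symmetric"))

# Hypothetical draws: stable contributions near the midpoint.
flat_central = [[0.45] * 8] * 10 + [[0.55] * 8] * 10
# Hypothetical draws: contributions that swing between the two sounds.
switching = [[0.1, 0.9, 0.1, 0.9]] * 10 + [[0.9, 0.1, 0.9, 0.1]] * 10
tags_flat = classify(flat_central)
tags_switching = classify(switching)
print(tags_flat, tags_switching)
```

In practice the draws would come from the fitted dynamic pattern generator for a triplet, and the proportions would be estimated over many simulated trials.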
The DAPP tags, i.e. the waviness, centrality and skewness classifications from the within-trial model, confirmed and extended the results of the whole-trial analysis. Triplets categorized as “intermediate” in the whole-trial analysis showed a different distribution of tags as compared to those categorized as “mixtures”. “Mixture” triplets tended to be classified as showing “flat” single-sound contributions, centering around “extreme” rather than “central” values of long-term average contribution (Figure 5C), and the distribution of the long-term averages was either symmetric or unlabeled with regard to symmetry (Supplementary Table 1). In contrast, “intermediate” triplets showed a combination of two types of labelling patterns relevant to our hypothesis. Some showed flat firing at a central (and symmetric) intermediate value, indicating stable firing at roughly the average of the responses evoked by each sound separately. Such a firing pattern is consistent with some form of normalization occurring in this subpopulation. However, there were also triplets that showed wavy, i.e. fluctuating, response patterns symmetric around a central value. This type of response pattern suggests that under some circumstances, neurons can “switch” relatively rapidly between a response pattern consistent with one stimulus vs. the other on dual-stimulus trials.
Consistent with this statistical evidence for activity fluctuations at the subtrial timescale in the “intermediate” category, we also found that the local field potential (LFP) at such sites showed greater oscillatory activity. Figure 5D shows the average LFP power spectrum for dual trials of triplets categorized as “mixtures” vs. those categorized as “intermediates” and their statistical comparison (lower panel, two-tailed t-test between the LFP power spectrum of dual trials classified as intermediate and that of dual trials classified as mixtures, for each time point and frequency combination). The LFP for intermediate sites showed higher energy across a range of frequencies, including frequencies well above the 20 Hz (50 ms) frequency range that we were able to evaluate at the spike-count single unit level.
Coordination of fluctuations across the neural population: within and between trials and relation to behavior
We next considered the question of whether and how activity fluctuations are coordinated across the neural population, in two ways: (1) by evaluating activity correlations across time within trials between pairs of simultaneously recorded neurons, and (2) by evaluating whether the state of the local field potential prior to sound onset predicts between-trial fluctuations in activity (e.g. 32, 33).
Neural pairs and within trial correlations
To evaluate correlations in within-trial switching patterns, we evaluated the neuron-to-neuron correlation between how “A-like” vs. how “B-like” the responses were on a time bin by time bin basis on individual trials, in a total of 91 pairs of triplet conditions from 34 pairs of neurons recorded simultaneously (from among the 363 triplets used for the previous analyses). For each 50 ms bin of a dual-sound trial in a given triplet, we assigned a probability score between 0 and 1 that the spike count in the bin was drawn from the Poisson distribution with rate equaling the bin’s sound A rate, and the complementary probability to the same count being drawn from the Poisson distribution with rate equaling the bin’s sound B rate (Figure 6A; see Methods: A vs. B assignment scores). We normalized these probabilities by converting them to Z-scores within a given time bin but across trials, to minimize the contribution of shared correlations due to stimulus responsiveness or changes in motivational state across time (34). We then calculated the neuron-to-neuron correlation coefficients between the normalized assignment scores across the set of time bins within each trial, i.e. one correlation coefficient value estimated per trial. This analysis is conceptually similar to conventional cross-correlation analysis of spike trains in neural pairs, but does not focus on precise timing of spikes or the relative latency between them (35, 36).
Generally, the observed correlations were positive, indicating that the activity was coordinated within the neural population. Figure 6 illustrates analysis of the dual-sound trials for a particular triplet in an example pair of neurons (A), and the distribution of the mean neuron-to-neuron correlations in the population for all the triplets’ dual-sound conditions (B). The distribution of mean correlation coefficients was skewed positive (t-test, p = 6.8 × 10^−6). Similar results were obtained when the raw spike counts were analyzed rather than the assignment scores (Supplementary Figure 4). This was the case even though we included triplets that were not categorized as showing “wavy” behavior in the DAPP analysis. It may be that coordinated activity fluctuations occur in more neurons than those that met our statistical criteria.
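The three steps just described (per-bin assignment score, Z-scoring within bin across trials, per-trial correlation across bins) can be sketched in Python. The precise score is defined in the Methods; here it is assumed to be the posterior probability of sound A under equal prior weight on A and B, and the function names and score matrices are hypothetical.

```python
from math import exp, lgamma, log, sqrt
from statistics import mean, stdev

def pois_pmf(k, lam):
    """Poisson probability of count k at rate lam (lam > 0)."""
    return exp(k * log(lam) - lam - lgamma(k + 1))

def a_score(k, lam_a, lam_b):
    """Assumed assignment score: P(A | count) with equal priors on A and B."""
    pa, pb = pois_pmf(k, lam_a), pois_pmf(k, lam_b)
    return pa / (pa + pb)

def zscore_within_bin(scores):
    """Z-score scores[trial][bin] across trials, separately for each bin."""
    out = [row[:] for row in scores]
    for b in range(len(scores[0])):
        col = [row[b] for row in scores]
        m, s = mean(col), stdev(col)
        for t in range(len(scores)):
            out[t][b] = (scores[t][b] - m) / s if s > 0 else 0.0
    return out

def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input has no variance."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy) if sxx > 0 and syy > 0 else 0.0

def per_trial_correlations(scores_1, scores_2):
    """One correlation per trial, computed across that trial's time bins."""
    z1, z2 = zscore_within_bin(scores_1), zscore_within_bin(scores_2)
    return [pearson(r1, r2) for r1, r2 in zip(z1, z2)]

# Hypothetical assignment scores, scores[trial][bin], for two neurons.
neuron_1 = [[0.9, 0.1, 0.8, 0.2], [0.2, 0.7, 0.1, 0.9], [0.5, 0.4, 0.6, 0.3]]
neuron_2 = [row[:] for row in neuron_1]  # a perfectly coordinated partner
corrs = per_trial_correlations(neuron_1, neuron_2)
print(a_score(5, 5.0, 20.0), corrs)
```

Z-scoring within each bin before correlating removes structure shared by all trials at that time point, so a positive per-trial correlation reflects trial-specific co-fluctuation rather than a common stimulus-locked response.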
Local Field Potentials and between-trial fluctuations
To determine whether the state of the local field potential prior to sound onset predicts between-trial fluctuations in activity, we analyzed the LFP data recorded simultaneously with single unit spiking data. We combined data across triplets, creating two “bags” of trials based on whether the whole-trial spike count on a given dual-sound trial more closely resembled the responses evoked by sound A alone (where A is the contralateral sound) or sound B alone (see Methods: A vs. B assignment scores). Figure 6C shows the average LFP for the two groups of dual-sound trials. We quantified differences between these two groups with a t-test in the 600 ms windows before and after sound onset (each trial contributed one mean LFP value in each time window). As expected, the LFP signals statistically differed after sound onset in these two trial groupings (red vs. blue traces, time period 0-600 ms, p-val = 1.0474 × 10^−5). But the LFP signals also differed prior to sound onset (p-val = 0.0064), suggesting that the state of activity in the local network surrounding an individual neuron at the time of sound onset is predictive of whether the neuron “encodes” the contralateral or the ipsilateral sound on that particular trial.
Relationship to behavior
If fluctuations in neural activity are coordinated across the population, and if one particular stimulus dominates the representation at any given instant, it follows that there should be a relationship between trial-by-trial variability in neural activity and behavior. Accordingly, we investigated whether the activity on individual trials predicted whether the monkey would look first to sound “A” or sound “B” on that trial. As noted in the Methods, we trained the monkeys on sequential sounds first and this training strategy tended to promote performing the task in a stereotyped sequence. Partway through neural data collection, we provided monkey Y with additional training on the non-sequential task, after which that monkey began displaying less stereotypical behavior and sometimes saccaded first to A and sometimes first to B for a given AB dual-sound combination (see Figure 7A for example). We then analyzed recording sessions after this training (N=73 triplets) and found that at both the whole-trial and sub-trial time scales, the activity of individual neurons was predictive of what saccade sequence the monkey would choose on that particular trial. Specifically, the average dual-sound AB assignment score for a given triplet was computed separately for trials in which the first saccade was toward A vs. toward B. The average scores statistically differed between the two groups of dual-sound trials (t-test, p-val = 5 × 10^−9, Figure 7B) and in the expected direction, with more A-like scores occurring on trials in which the monkey looked at A first. This relationship was also present when looking at finer, 50 ms bin time scales (Figure 7C).
DISCUSSION
Our results show that the activity patterns of IC neurons fluctuate, and that these fluctuations may be consistent with encoding of multiple items in the same processing channels (i.e. the set of neural spike trains occurring in the IC). The time scale of these fluctuations ranges from the level of individual trials down to at least 50 ms bins within a trial. The fluctuations are positively correlated across pairs of neurons (at least, those recorded within the IC on a given side of the brain), are reflective of the state of local field potentials at the time of sound onset, and are predictive of the behavioral response to follow.
There are several limitations to the present statistical approach. First, the analyses could only be conducted on a subset of the data, requiring a good fit of a Poisson distribution to the single-sound trials and adequate separation of the responses on those trials. For the moment, it is unknown whether any of the excluded data exhibit meaningful response fluctuations. In principle, the modeling approach can be extended to other types of response distributions, which should reduce the amount of data that is excluded. Second, the range of time scales at which fluctuations occur is unknown. Fluctuations that occur faster than the 50 ms bin time scale used for the DAPP model would likely have been (erroneously) categorized as flat-central. Third, our statistical approach based on the DAPP model involves a categorization step that summarizes the dominant features of a triplet. If a neuron sometimes behaves as a “flat-extreme” type and sometimes as a “wavy-central” type for a given triplet of conditions, it would likely be categorized as ambiguous. In other words, even though the DAPP model can pick up composite response patterns, the results we present ignore the existence of any such patterns.
The observed fluctuations have broad implications because they provide a novel account linking a number of other well-known aspects of brain function under a common explanation. First, it is widely recognized that neural firing patterns are highly variable. This variability is often thought to reflect some fundamental inability of neurons to code information accurately. Here, we suggest that some of this variability may actually reflect interleaved periods of potentially quite accurate coding of different items. What else individual neurons may commonly be coding for in experiments involving presentation of only one stimulus at a time is not known, but possibilities include stimuli not deliberately presented by the experimenter, memories of previous stimuli, or mental imagery as suggested by the theory of embodied cognition (37). In the present study, we were able to demonstrate signal in these fluctuations by virtue of statistical tests comparing each of the trial types in A-B-AB triplets, but it may be the case that fluctuations were occurring in the single-stimulus trials as well. We could not test this because our analysis required having as benchmarks the response distributions corresponding to the potentially encoded items.
Second, as a concept, multiplexing provides insight into why limitations in certain types of cognition exist. Working memory capacity is limited; attention filters stimuli to allow in-depth processing of a selected set of items. These limitations may stem from using the same population of neurons for each attended or remembered item. If this is the case, then the puzzle becomes why these limits are often greater than one. Multiplexing suggests that cycling between different items across time allows the brain to evade what might otherwise be a one-item limit (2). Here, we investigated only two time scales, 50 ms and whole trials. Future work will be needed to more fully explore the time scales on which this occurs and to tie the resulting information on duty cycle to perceptual capacity.
Third, brain oscillations are ubiquitous, have been linked specifically to attentional and memory processes (33, e.g. 38, see also 39), and have been suggested as indicating multiplexing (2-8). Oscillations indicate that neural activity fluctuates, although they capture only the portion of such fluctuation that is coordinated across the underlying neural population and is regular in time. It remains to be determined to what degree oscillations in field potentials reflect the activity of neural circuits that control such temporal coordination in other neural populations vs. the activity of the neural circuits subject to the effects of such coordination. In a highly interconnected system such as the brain, both are likely to occur.
In the case of our particular experimental paradigm, several additional questions arise. How do signals related to different items come to be multiplexed? Are they later de-multiplexed? If so, how?
To some degree, sounds are multiplexed in the world. That is, the sound waves from multiple sources sum in the world and are never purely distinct from one another. The air pressure waves arriving at each ear reflect the combined contribution of all sound sources. However, if the IC’s neural fluctuations were driven by the sound signals arriving at the ears, then individual neurons should always respond the same way on every trial, and they do not. Instead, it seems likely that the externally-multiplexed sound waves interact with neural circuit states at the time that the incoming signal arrives to govern how individual neurons respond on a moment by moment basis.
Where and how signals may be de-multiplexed critically depends on the nature of the representation to which a de-multiplexed output could be written. In barn owls, which have maps of auditory space, the coding bottleneck intrinsic to meter/rate coding does not occur, and two sounds produce two separate active populations (9-12). Such distinct peaks suggest that the multiplexed-in-the-air signals have been de-multiplexed and segregated into two hills of activity.
In primates and several other mammals, neural representations of space employ meters (rate codes) rather than maps throughout the pathway from sound input to eye movement output, as far as we currently know (13-20, 40). This is the case even at the level of the superior colliculus (41), which has a well-deserved reputation for mapping when activity is evoked by non-auditory stimuli (42, 43).
Given that different types of codes exist in different species, and given that coding format is not known in all the circumstances in which multiplexing might apply (e.g. attention, working memory), we developed two different models to illustrate a range of different de-multiplexing possibilities (Figure 8) based on the nature of the recipient representation. In the first (Figure 8A), a multiplexed signal in a meter is converted into two hills of activity in a map, using a basic architecture involving graded thresholds and inhibitory interneurons suggested previously (44). Adding an integration mechanism such as local positive feedback loops would then serve to latch activity “on” at the appropriate locations in the map, producing a more sustained firing pattern. No clock signal is necessary for this model.
In the second model (Figure 8B), there are multiple output channels, each capable of encoding one item. An oscillating circuit that knows about the timing of the input gates signals to each output channel at the appropriate moments. As in the first model, a local positive feedback mechanism acts to sustain the activity during the gaps in the input. This model thus retains the efficient coding format of a meter but requires a controlling signal with knowledge of when to latch input flow through to each output channel. In our data, it is possible that within-trial fluctuating units lie at the input stage of such a circuit, and that between-trial fluctuating units actually lie at the output stage. A given unit might be allocated to either the “A” or the “B” pool based on the state of the network (as detected by the LFP measurements) on different trials.
Data Analysis
All analyses concerned correctly performed trials.
Analysis of activity pooled across time and/or trials: Summation and Averaging
To evaluate IC activity using conventional analysis methods that pool across time and/or across trials, we counted action potentials during two standard time periods. The baseline period (Base) was the 600 ms period before target onset, and the sensory-related target period (Resp) was the 600 ms period after target onset (i.e. ending before, or at the time of, the offset of the fixation light; Figure 2A).
Summation/Averaging Indices: We quantified the activity on dual-sound trials in comparison to the sum and the average of the activity on single-sound trials, expressed in units of standard deviation (Z-scores), similar to a method used by (31). Specifically, we calculated
Predicted sum = mean(RespA) + mean(RespB) − mean(BaseA,B)
and
Predicted average = (mean(RespA) + mean(RespB)) / 2
where RespA and RespB were the number of spikes of a given neuron for a given set of single-sound conditions A and B (location, frequency, and intensity) that matched the component sounds of the dual-sound trials being evaluated. As the “response” may actually include a contribution from spontaneous baseline activity, we subtracted the mean of the baseline activity for the single sounds (BaseA,B). Without this subtraction, the predicted sum would be artificially high because two “copies” of baseline activity are included under the guise of the response activity.
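The Z scores for the dual-sound trials were computed by subtracting these predicted values from the mean of the dual-sound trials (mean(RespAB)) and dividing by the mean of the standard deviations of the responses on single-sound trials:
Zsum = (mean(RespAB) − Predicted sum) / mean(std(RespA), std(RespB))
and
Zavg = (mean(RespAB) − Predicted average) / mean(std(RespA), std(RespB))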
If the Z score relative to the predicted sum or the predicted average was within +/- 1.96, we could say the actual dual response was within the 95% confidence interval for addition or averaging of the two single responses, respectively.
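The summation and averaging indices can be sketched in a few lines. This is a minimal illustration rather than the analysis code used in the study: it assumes the predicted sum is the sum of the two mean single-sound responses minus one copy of the mean baseline, that the predicted average is the mean of the two single-sound means, and that sample standard deviations are used. The function and argument names are hypothetical.

```python
from statistics import mean, stdev

def summation_averaging_z(resp_a, resp_b, resp_ab, base_a, base_b):
    """Return (z_sum, z_avg) for one triplet of spike-count lists.

    resp_a, resp_b : spike counts on single-sound trials A and B
    resp_ab        : spike counts on the matching dual-sound trials
    base_a, base_b : baseline spike counts on the single-sound trials
    """
    mean_a, mean_b = mean(resp_a), mean(resp_b)
    # Remove one copy of baseline so it is not counted twice in the sum
    # (assumption: baseline is pooled over both single-sound conditions).
    baseline = mean(list(base_a) + list(base_b))
    predicted_sum = mean_a + mean_b - baseline
    predicted_avg = (mean_a + mean_b) / 2
    # Scale factor: mean of the two single-sound standard deviations.
    scale = mean([stdev(resp_a), stdev(resp_b)])
    dual = mean(resp_ab)
    return (dual - predicted_sum) / scale, (dual - predicted_avg) / scale

z_sum, z_avg = summation_averaging_z(
    resp_a=[10, 12, 14], resp_b=[20, 22, 24], resp_ab=[16, 17, 18],
    base_a=[2, 2, 2], base_b=[2, 2, 2],
)
# A dual response counts as consistent with averaging when |z_avg| <= 1.96.
```

In this toy triplet the dual response sits exactly at the predicted average and far below the predicted sum.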
Analyses of fluctuations in neural firing across and within-trials, and inclusion criteria
Our statistical tests for fluctuations in neural firing were conducted on triplets, or related sets of single and dual-sound trials (A, B, AB trials). To evaluate whether neural activity fluctuates across trials in a fashion consistent with switching between firing patterns representing the component sounds, we evaluated the Poisson characteristics of the spike trains on matching dual and single-sound trials (triplets: AB, A and B). Spike train data from each trial were summarized by the total spike count between 0-600 ms or 0-1000 ms from sound onset (i.e. whatever the minimum duration of the overlap between fixation and sound presentation was for that recorded neuron, see section Events of Task). We modeled the distribution of spike counts in response to single sounds A and B as Poisson distributions with unknown rates λA, denoted Poi(λA), and λB, denoted Poi(λB). Four hypotheses were considered for the distribution of sound AB spike counts:
a mixture distribution α·Poi(λA) + (1−α)·Poi(λB) with an unknown mixing weight α (“mixture”)
a single Poi(λAB) with some λAB in between λA and λB (“intermediate”)
a single Poi(λAB) where λAB is either larger or smaller than both λA and λB (“outside”)
a single Poi(λAB) where λAB exactly equals one of λA and λB (“single”)
Relative plausibility of these competing hypotheses was assessed by computing their posterior probabilities with equal prior weights (1/4) assigned to the models, and with default Jeffreys’ prior (48) on model specific Poisson rate parameters, and a uniform prior on the mixing weight parameter α. Posterior model probabilities were calculated by computation of relevant intrinsic Bayes factors (49).
Triplets were excluded if either of the following applied: 1) the Poisson assumption on A and B trial counts was not supported by data; or 2) λA and λB were not well separated. To test the Poisson assumption on single-sound trials A and B of a given triplet, we used an approximate chi-square goodness of fit test with Monte Carlo p-value calculation. For each sound type, we estimated the Poisson rate by averaging counts across trials. Equal probability bins were constructed from the quantiles of this estimated Poisson distribution, with the number of bins determined by an expected count of 5 trials in each bin or at least 3 bins, whichever resulted in more bins. A lack-of-fit statistic was calculated by summing across all bins the ratio of the square of the difference between observed and expected bin counts to the expected bin count. Ten thousand Monte Carlo samples of Poisson counts, with sample size given by the observed number of trials, were generated from the estimated Poisson distribution and the lack-of-fit statistic was calculated from each one of these samples. The p-value was calculated as the proportion of these Monte Carlo samples with a lack-of-fit statistic larger than the statistic value from the observed data. The Poisson assumption was considered invalid if the resulting Monte Carlo p-value was < 0.1.
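The goodness-of-fit procedure described above can be sketched as follows. This is an illustrative reading of the text, not the authors' implementation. It assumes the bin edges and expected counts from the observed data are reused for every Monte Carlo sample, and that duplicate quantile edges (which arise because counts are discrete) are merged, so bins are only approximately equal in probability. All function names are hypothetical.

```python
import bisect
import math
import random

def poisson_cdf(rate, tail=1e-12):
    """Return [P(X<=0), P(X<=1), ...] until the upper tail is below `tail`."""
    pmf, total, k, cdf = math.exp(-rate), 0.0, 0, []
    while True:
        total += pmf
        cdf.append(min(total, 1.0))
        if k >= rate and 1.0 - total < tail:
            return cdf
        k += 1
        pmf *= rate / k

def make_bins(cdf, n_trials):
    """Bins from Poisson quantiles: (sorted upper edges, bin probabilities)."""
    n_bins = max(3, n_trials // 5)  # about 5 expected trials per bin, at least 3 bins
    # Upper edge of bin i: smallest count whose CDF reaches i / n_bins.
    edges = sorted({bisect.bisect_left(cdf, i / n_bins) for i in range(1, n_bins)})
    cum = [cdf[e] for e in edges] + [1.0]
    probs = [c - p for c, p in zip(cum, [0.0] + cum[:-1])]
    return edges, probs

def lack_of_fit(counts, edges, probs):
    """Sum over bins of (observed - expected)^2 / expected."""
    observed = [0] * len(probs)
    for x in counts:
        observed[bisect.bisect_left(edges, x)] += 1
    n = len(counts)
    return sum((o - n * p) ** 2 / (n * p) for o, p in zip(observed, probs) if p > 0)

def poisson_gof_pvalue(counts, n_sim=10000, seed=0):
    """Monte Carlo p-value for the Poisson assumption on one set of trials."""
    rng = random.Random(seed)
    n = len(counts)
    cdf = poisson_cdf(sum(counts) / n)  # rate estimated by the mean count
    edges, probs = make_bins(cdf, n)
    observed_stat = lack_of_fit(counts, edges, probs)
    exceed = 0
    for _ in range(n_sim):
        # Inverse-CDF sampling from the fitted Poisson distribution.
        sample = [min(bisect.bisect_left(cdf, rng.random()), len(cdf) - 1)
                  for _ in range(n)]
        exceed += lack_of_fit(sample, edges, probs) > observed_stat
    return exceed / n_sim

# Strongly bimodal counts should fail the test (p < 0.1).
p_bimodal = poisson_gof_pvalue([0] * 10 + [20] * 10, n_sim=2000)
```

Under the criterion in the text, a set of single-sound trials would be retained only when the returned p-value is at least 0.1.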
For triplets with valid Poisson assumption on sound A and B spike counts, we tested for substantial separation between λA and λB, by calculating the intrinsic Bayes factor of the model λA ≠ λB against λA = λB with the non-informative Jeffreys’ prior on the λ parameters: λA,λB, or their common value. The triplet was considered well separated in its single sounds if the logarithm of the intrinsic Bayes factor equaled 3 or more, which is the same as saying the posterior probability of λB≠λA exceeded 95% when a-priori the two models were given 50-50 chance.
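The stated equivalence between a log Bayes factor of 3 and a posterior probability above 95% can be checked directly. The sketch below assumes the logarithm is natural and that the Bayes factor is already available; computing the intrinsic Bayes factor itself is not shown. The function name is hypothetical.

```python
import math

def posterior_from_log_bf(log_bf, prior=0.5):
    """Posterior probability of the favored model given a natural-log Bayes factor."""
    prior_odds = prior / (1.0 - prior)
    posterior_odds = math.exp(log_bf) * prior_odds
    return posterior_odds / (1.0 + posterior_odds)

# With 50-50 prior odds, log BF = 3 gives exp(3) / (1 + exp(3)), just above 0.95.
p_separated = posterior_from_log_bf(3.0)
```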
Dynamic Admixture Point Process Model
To evaluate whether neural activity fluctuates within trials, we developed a novel analysis method we call a Dynamic Admixture Point Process model (DAPP), which characterizes the dynamics of spike trains on dual-sound trials as an admixture of those occurring on single-sound trials. The analysis is carried out by binning time into moderately small time intervals. Given a predetermined bin-width w = T/C for some integer C, we divide the response period into contiguous time intervals I1 = [0, w), I2 = [w, 2w), …, IC = [(C−1)w, Cw) and reduce each trial to a C-dimensional vector of bin counts (Xej1, …, XejC) for e∈{A, B, AB} and j = 1, …, ne. Mathematically, Xejc = Nej(Ic), the number of spikes on trial j of condition e that fall in interval Ic. We typically use w = 25 or 50 (with time measured in ms and T = 600 or 1000).
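The reduction of a trial to a vector of bin counts can be sketched as below. This assumes spike times are given in ms relative to sound onset, and the function name is hypothetical.

```python
def bin_counts(spike_times_ms, total_ms=600, width_ms=25):
    """Reduce one trial to C = total_ms / width_ms counts over [0, w), [w, 2w), ..."""
    n_bins, remainder = divmod(total_ms, width_ms)
    if remainder:
        raise ValueError("bin width must divide the response period evenly")
    counts = [0] * n_bins
    for t in spike_times_ms:
        # Intervals are half-open, so a spike at exactly total_ms is excluded.
        if 0 <= t < total_ms:
            counts[int(t // width_ms)] += 1
    return counts

trial = bin_counts([0.0, 10.0, 25.0, 599.9, 600.0], total_ms=600, width_ms=25)
```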
Our model for the bin counts is the following. Below we denote by t*c the mid-point (c−1/2)w of sub-interval Ic.
Xejc ~ Poi(w λe(t*c)) independently for e∈{A, B}; we assume both λA(t) and λB(t) are smooth functions over t∈[0,T].
XABjc ~ Poi(w λj(t*c)) independently, where λj(t) = αj(t)λA(t) + {1−αj(t)}λB(t) with αj:[0,T]→(0,1) being unknown smooth functions.
We model αj(t) = S(ηj(t)), where S(t) = 1/(1+e^(−t)) is the sigmoid function, and each ηj(t) is a (smooth) Gaussian process with E{ηj(t)}≡ϕj, Var{ηj(t)}≡ψj, and a correlation function governed by a length-scale ℓj. The three parameters (ϕj,ψj,ℓj) respectively encode the long-term average value, the total swing magnitude and the waviness of the αj(t) curve. While the temporal imprint carried by each αj is allowed to be distinct, we enforce the dual trials to share dynamic patterns by assuming (ϕj,ψj,ℓj), j=1,…,nAB, are drawn from a common, unknown probability distribution P, which we call a dynamic pattern generator and view as a characteristic of the triplet to be estimated from the data.
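The admixture of the two single-sound rates can be illustrated with a short sketch. It shows only the deterministic mapping from a given η value to a dual-trial rate. The Gaussian process that generates η, and the estimation of its parameters, are not reproduced here. The function names and example rates are hypothetical.

```python
import math

def sigmoid(x):
    """Numerically stable logistic function S(x) = 1 / (1 + exp(-x))."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def dual_rate(eta, rate_a, rate_b):
    """Admixture rate alpha * rate_a + (1 - alpha) * rate_b with alpha = S(eta)."""
    alpha = sigmoid(eta)
    return alpha * rate_a + (1.0 - alpha) * rate_b

# A trial whose eta swings from strongly positive to strongly negative moves
# from an A-like rate toward a B-like rate within the trial.
eta_path = [4.0, 2.0, 0.0, -2.0, -4.0]
rates = [dual_rate(eta, rate_a=0.08, rate_b=0.02) for eta in eta_path]
```

When η stays near 0 the rate sits midway between the two single-sound rates, whereas large swings in η produce the alternation between A-like and B-like firing that the model is designed to detect.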
To facilitate estimation of P, we assume it decomposes as P=Pϕψ×Pℓ, where Pϕψ is an unknown distribution on (-∞,∞)×(0,∞) generating (ϕj,ψj), and Pℓ is an unknown distribution on (0,∞) generating ℓj. To simplify computation, we restrict ℓj to take only finitely many positive values, representative of the waviness range we are interested in (in our analyses, we took these representative values to be {75, 125, 200, 300, 500}, all in ms). This restricts Pℓ to be a finite dimensional probability vector.
We perform an approximate Bayesian estimation of model parameters. Note that only λA(t) and λB(t) are informed by the single sound trial data. All other model parameters are informed only by the dual sound trial data conditionally on the knowledge of λA(t) and λB(t). To take advantage of this, we first smooth each set of single sound trial data to construct a conditional gamma prior for the corresponding rate λe(t), e∈{A, B}, evaluated at the bin mid-points, where the gamma distribution’s mean and standard deviation are matched with the estimate and standard error of that rate. A formal Bayesian estimation is then carried out on all model parameters jointly by (a) using only the dual sound trial data, (b) utilizing the conditional gamma priors on λA(t) and λB(t), and (c) assuming a Dirichlet process prior (50) on Pϕψ and an ordinary Dirichlet prior on Pℓ. This final step involves a Markov chain Monte Carlo computation whose details will be reported in a separate paper.
A vs. B assignment scores: individual neurons, pairs of neurons, local field potential, and behavioral prediction
A vs. B assignment scores were computed for several analyses (the example shown in Figure 3A-D; pairs of recorded neurons; the relationship between spiking activity and local field potential; and the relationship between saccade sequences and spiking activity). For each triplet, every dual-sound trial received an “A-like” score and a “B-like” score, either for the entire response window (600 or 1000 ms after sound onset) or for 50 ms time bins. The scores were computed as the posterior probability that the spike count in each dual-sound trial was drawn from the Poisson distribution of the corresponding single-sound spike counts.
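One way to compute such a score is sketched below. It is a simplified illustration under stated assumptions, not necessarily the exact computation used here: the single-sound rates are treated as known (for example, the mean single-sound counts), both rates are positive, and the A and B sources are given equal prior probability. The function names are hypothetical.

```python
import math

def log_poisson_pmf(count, rate):
    """Log of the Poisson probability of `count` given a positive `rate`."""
    return count * math.log(rate) - rate - math.lgamma(count + 1)

def a_like_score(count, rate_a, rate_b):
    """Posterior probability that `count` came from Poi(rate_a) rather than Poi(rate_b).

    Assumes equal prior weight on A and B; the B-like score is 1 minus this value.
    """
    log_a = log_poisson_pmf(count, rate_a)
    log_b = log_poisson_pmf(count, rate_b)
    # Subtract the larger log-likelihood before exponentiating to avoid underflow.
    m = max(log_a, log_b)
    w_a, w_b = math.exp(log_a - m), math.exp(log_b - m)
    return w_a / (w_a + w_b)

# A dual-sound trial with 10 spikes, when A alone evokes about 10 and B about 20.
score = a_like_score(10, rate_a=10.0, rate_b=20.0)
```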
For the pairs analysis, the A vs. B assignment scores were computed within each 50 ms time bin independently for each pair of neurons recorded simultaneously. The scores were normalized across trials by subtracting the mean score and dividing by the standard deviation of scores for that bin (a Z-score in units of standard deviation). Only conditions for which both recorded neurons exhibited reasonably different responses to the “A” vs. the “B” sound (t-test, p < 0.05) and for which there were at least 5 correct trials for each of the A, B, and AB trial types were included. A total of 206 conditions were included in this analysis.
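The per-bin normalization across trials can be sketched as follows. The sketch assumes scores are stored as one list of bin values per trial and uses the sample standard deviation. Whether the sample or population form was used is not stated, and the function name is hypothetical.

```python
from statistics import mean, stdev

def zscore_across_trials(scores):
    """Normalize scores[trial][bin] within each time bin, across trials."""
    columns = list(zip(*scores))  # one tuple of trial scores per time bin
    stats = [(mean(col), stdev(col)) for col in columns]
    return [
        # Bins with no variability across trials are set to 0 (an assumption).
        [(s - m) / sd if sd > 0 else 0.0 for s, (m, sd) in zip(row, stats)]
        for row in scores
    ]

# Three trials, two 50 ms bins each.
z = zscore_across_trials([[0.2, 0.5], [0.4, 0.5], [0.6, 0.5]])
```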
Local field potential analysis
We analyzed the local field potential from 87 sites in both monkeys (30 sites from monkey P’s left IC, 31 sites from monkey Y’s right IC and 26 sites from monkey Y’s left IC). The LFP was recorded either in discrete temporal epochs encompassing behavioral trials (roughly 1.2 to 2 seconds long) at a sampling rate of 20 kHz (Dataset I, part of Dataset II), or as a continuous signal during each session, at a sampling rate of 20 kHz or 1 kHz (rest of Dataset II). We standardized the LFP signals by trimming the continuous LFP into single trial intervals and down-sampling all signals to 1 kHz. The MAP system filters LFP signals between 0.7 and 300 Hz; no additional filtering was applied. For each site we subtracted the overall mean LFP value calculated over the entire session, to remove any DC shifts, and we excluded trials that exceeded 500 mV. For each triplet, we assigned individual dual-sound trials to two groups based on the total spike count in a 600 ms response window (see Methods: A vs. B assignment scores). The average LFP was then compared across the two groups in two 600 ms windows, one before and one after sound onset (baseline and response periods). The results reported here refer to these mean-normalized LFP signals. We obtained similar results when the amplitude of each trial’s LFP was scaled as a proportion of the maximum response within the session.
ACKNOWLEDGMENTS
We are grateful for expert technical assistance from Jessi Cruger, Karen Waterstradt, Christie Holmes, Stephanie Schlebusch, Tom Heil, and Eddie Ryklin. We have benefitted from thoughtful discussions with Michael Lindon, Winrich Freiwald, Liz Romanski, Stephen Lisberger, Marty Woldorff, David Bulkin, Kurtis Gruters, Bryce Gessell, Luke Farrell, David Murphy, and Akinori Ebihara. We thank Bao Tran-Phu, Will Hyung, Stephen Spear, Francesca Tomasi, and Ashley Wilson for assistance with animal training and/or recordings. Financial support for the research was provided by the National Science Foundation (0924750) to JMG and the National Institutes of Health (5R01DC013906-02) to ST and JMG.