Abstract
An analytical approach for quantifying the information in auditory-nerve (AN) fiber responses for the task of level discrimination is described. A simple analytical model for AN responses is extended to include temporal response properties, including the nonlinear-phase effects of the cochlear amplifier. Use of simple analytical models for AN discharge patterns allows quantification of the contributions of level-dependent aspects of the patterns to level discrimination. Specifically, the individual and combined contributions of the information contained in discharge rate, synchrony, and relative phase cues are explicitly examined for level discrimination of tonal stimuli. It is shown that the rate information provided by individual AN fibers is more constrained by increases in variance with increases in rate than by saturation. As noted in previous studies, there is sufficient average-rate information within a narrow-CF region to account for robust behavioral performance over a wide dynamic range; however, there is no model based on a simple limitation or use of AN information consistent with parametric variations in performance. This issue is explored in the current study through analysis of performance based on different aspects of AN patterns. For example, we show that performance predicted from use of all rate information degrades significantly as level increases above low–medium levels, inconsistent with Weber’s Law. At low frequencies, synchrony information extends the range over which behavioral performance can be explained by 10–15 dB, but only at low levels. In contrast to rate and synchrony, nonlinear-phase cues are shown to provide robust information at medium and high levels in near-CF fibers for low-frequency stimuli. The level dependence of the discharge rate and phase properties of AN fibers are influenced by the compressive nonlinearity of the inner ear. 
Evaluating the role of the compressive nonlinearity in level coding is important for understanding neural encoding mechanisms and because of its association with the cochlear amplifier, which is a fragile aspect of the ear believed to be affected in common forms of hearing impairment.
INTRODUCTION
The focus of this article is the classical problem of level encoding and its relation to the physiological response properties of auditory-nerve (AN) fibers. The pioneering work in this area is a series of papers from Siebert (1965, 1968). Siebert took a mathematical modeling approach to derive expressions for the sensitivity index for performance in intensity discrimination. Siebert assumed that the action potentials on a single AN fiber could be represented mathematically as a stochastic point process, specifically a Poisson process. He further assumed that the variability of the firing times on each neuron was statistically independent from fiber to fiber, consistent with the results of Johnson and Kiang (1976). With these assumptions, the only additional information needed to specify the model completely was the rate of firing for each fiber and, most important, the dependence of this firing rate on stimulus level. Siebert assumed a convenient form for this dependence that allowed an analytic solution for the performance of an ideal observer (basically the best performance that could be achieved given the statistical nature of the firing patterns) based on the complete set of neural firings. The current study extends Siebert’s work with analytic expressions that allow explicit description of the level dependence of the temporal response. A simple description of the rate function for each nerve fiber is specified as a function of time and level, and a nonstationary Poisson process is assumed. Analytical performance measures are derived that allow comparisons among the different information sources regarding the level of the stimulus, including the average rate of responses, and the temporal synchronization and relative phases of responses at low frequencies.
Although many people have extended Siebert’s work with computations of performance based on more detailed assumptions about peripheral coding (e.g., Goldstein 1980; Delgutte 1987; Viemeister 1988; Winslow and Sachs 1988; Winter and Palmer 1991; Huettel and Collins 1999; Heinz et al. 2001a,b; reviewed by Delgutte 1996), most of these studies were essentially computational in nature. The computational approach does not take advantage of mathematical expressions that give insight into the relationship among the various sources of information and parameters of dependence. In addition, these studies have shown that the robust level-discrimination performance demonstrated by human listeners is not accounted for by the optimal use of average-rate information in the AN. Thus, it is of interest to examine whether the optimal use of temporal information in AN responses provides a better account of robust performance.
In the following section, general results are derived that are used throughout the article. Then, analytical results based on average rate and on temporal information are presented in separate sections, followed by general discussion.
THEORETICAL CALCULATIONS
General methods for characterizing performance
A convenient parameter for summarizing empirical performance and theoretical predictions is the sensitivity per decibel δ′ (Heinz et al. 2001a; based on the sensitivity-per-Bel measure of Durlach and Braida 1969; Braida and Durlach 1972). This parameter is used here because it has been shown to be generally appropriate for intensity discrimination experiments and allows convenient combination of information from independent sources. Specifically, the sensitivity per decibel δ′ is defined in terms of the usual sensitivity coefficient d′ between two levels L and L + ΔL (Rabinowitz et al. 1976; Buus and Florentine 1991):
$$\delta'(L) = \frac{d'(L, L+\Delta L)}{\Delta L} \tag{1}$$
where L and L + ΔL are measured in decibels (SPL). If the just-noticeable difference (JND) in level is defined by the value of ΔL giving unity d′, then the JND is equal to 1/δ′. It follows directly that Weber’s Law, which refers to a constant JND as a function of level, corresponds to a δ′ that is independent of the reference level L. Similarly, the “near miss” to Weber’s Law, which refers to the slight improvement in level discrimination of tones that has been experimentally observed as level increases (McGill and Goldberg 1968b; Florentine et al. 1987), corresponds to a δ′(L) that increases with L.
The theoretical significance of the parameter δ′ can be appreciated from the combination of Eq. (1) with the definition of d′(L, L + ΔL) from signal detection theory. Specifically, if it is postulated that decisions are made by comparing the value of an underlying random variable X (the decision variable) with a threshold that is chosen for each experiment in a manner that accounts for bias and judgment factors, then achievable performance can be characterized by a single parameter. This parameter is d′(L, L + ΔL) and is defined by the relation
$$d'(L, L+\Delta L) = \frac{E[X; L+\Delta L] - E[X; L]}{\sqrt{\operatorname{var}[X; L]}} \tag{2}$$
where E[X;L] and var[X;L] are the expected value and variance of X given the level L. This form of the equation for d′ assumes that the variances of X at L and L + ΔL are approximately the same, which is true for the level increments of interest, i.e., near threshold. The δ′ measure is convenient because (δ′)² [and (d′)²] is an additive parameter for an optimum linear combination of uncorrelated decision variables. That is, if X is given by the relation
$$X = \sum_m c_m Y_m \tag{3}$$
where the c_m are weighting parameters and the Y_m are uncorrelated random variables, then the best performance (maximum δ′) that can be obtained (allowing any choice of the parameters c_m) is given by
$$(\delta')^2 = \sum_m (\delta'_m)^2 \tag{4}$$
where δ′_m is the sensitivity per decibel for the variable Y_m and is defined by relations parallel to Eqs. (1) and (2). This additivity theorem for (δ′_m)² is especially significant for constant-variance Gaussian (normal) or Poisson random variables where the differences in the distributions of the decision variables are determined by changes in the means. In these cases, the optimum linear combination of the random variables results in performance as good as or better than any nonlinear combination.
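The additivity property can be checked numerically. The following is a minimal sketch, assuming three uncorrelated channels whose mean changes and variances are made-up illustrative numbers (they do not come from the article). It uses the standard result that the optimum weights are proportional to the mean change divided by the variance; because δ′ is d′ divided by ΔL, additivity of (d′)² carries over directly to (δ′)².

```python
import math

def dprime_linear(weights, delta_means, variances):
    """d' of X = sum(c_m * Y_m) for uncorrelated Y_m (same variance at both levels)."""
    mean_change = sum(c * d for c, d in zip(weights, delta_means))
    variance = sum(c * c * v for c, v in zip(weights, variances))
    return mean_change / math.sqrt(variance)

# Hypothetical per-channel statistics (illustrative numbers only).
delta_means = [1.0, 0.5, 2.0]   # change in E[Y_m] between L and L + dL
variances = [4.0, 1.0, 9.0]     # var[Y_m]

# Optimum weights: mean change divided by variance, channel by channel.
optimal_weights = [d / v for d, v in zip(delta_means, variances)]

d_opt = dprime_linear(optimal_weights, delta_means, variances)
d_equal = dprime_linear([1.0, 1.0, 1.0], delta_means, variances)
sum_of_squares = sum(d * d / v for d, v in zip(delta_means, variances))

print(d_opt ** 2, sum_of_squares, d_equal ** 2)
```

With these numbers the squared d′ of the optimally weighted sum equals the sum of the individual squared d′ values, while an equal-weight sum does slightly worse.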
The general calculation of the best achievable performance as limited by the statistical properties of the data can be done with the likelihood ratio test. In this test, a decision is based on the relative probabilities (or probability densities) of the observations under the two hypotheses under consideration. In other words, to discriminate between the levels L and L + ΔL, one calculates the ratio of the conditional probabilities of the available observations (conditional on each of these levels) and then compares the ratio to a threshold. This threshold is determined by the chosen performance criterion and is based on a priori probabilities and the relative costs and benefits of the possible outcomes. Since comparison to a threshold is not affected by monotonic transformations of both sides of the inequality, the log-likelihood ratio (formed by taking the logarithm of the likelihood ratio and the threshold) is commonly used for specific computations. In the case of statistically independent observations, the log operation results in a summation of the log-likelihood ratios for the individual observations.
Performance based on Poisson observations
Performance can be characterized for the specific case of a nonstationary (time-varying) Poisson process, which is specified by r(t,L), the instantaneous rate as a function of time t and level L (e.g., Siebert 1970; Rieke et al. 1997; Heinz et al. 2001a). The likelihood ratio test can be shown to be equivalent to the following test:
$$\sum_{i=1}^{N} \ln\left[\frac{r(t_i, L+\Delta L)}{r(t_i, L)}\right] \gtrless C \tag{5}$$
where the set of t_i are the times at which discharges occur for a given stimulus presentation, N is the total number of discharges (the count) during the presentation of the stimulus, and C is the threshold. This inequality [Ineq. (5)] describes a processor for AN responses that could perform the discrimination task when the rate function r(t,L) is known to the central processor. The processor calculates the ratio of the two rate functions at each observed discharge time t_i, sums the log of the ratio across all discharges, and compares the value of the sum to a threshold. This processor is evaluated below with different assumptions about the rate function r(t,L).
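The processor described in words above can be sketched in a few lines. This is an illustration only: `example_rate`, the spike times, and the threshold value are hypothetical and are not taken from the article.

```python
import math

def example_rate(t, level_db):
    """Hypothetical time-independent rate in spikes/s (not from the article)."""
    return 10.0 + 5.0 * max(level_db, 0.0)

def log_likelihood_statistic(spike_times, rate, level_db, delta_db):
    """Sum over discharges of ln[r(t_i, L + dL) / r(t_i, L)]."""
    return sum(
        math.log(rate(t, level_db + delta_db) / rate(t, level_db))
        for t in spike_times
    )

def decide_higher_level(spike_times, rate, level_db, delta_db, threshold):
    """Return True when the statistic exceeds the threshold C."""
    statistic = log_likelihood_statistic(spike_times, rate, level_db, delta_db)
    return statistic > threshold

spike_times = [0.012, 0.047, 0.081]  # made-up discharge times in seconds
statistic = log_likelihood_statistic(spike_times, example_rate, 10.0, 1.0)
print(statistic, decide_higher_level(spike_times, example_rate, 10.0, 1.0, 0.1))
```

Because the example rate does not depend on time, every discharge contributes the same amount and the statistic is just the count times a constant.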
In the case that r(t,L) is independent of time t (during the response to the stimulus), the summand in Ineq. (5) is independent of t, and the optimum test reduces to
$$N \gtrless C \tag{6}$$
where the count N contains all relevant information from the Poisson process and C is again the threshold. Since performance depends only on N, the general results for a decision variable X given above can be applied here. It follows directly from the statistics of the (Poisson) variable N that the parameter δ′ is given by
$$(\delta')^2 = \frac{T\,[r(L+\Delta L) - r(L)]^2}{(\Delta L)^2\, r(L)} \cong \frac{T}{r(L)}\left[\frac{dr(L)}{dL}\right]^2 \tag{7}$$
where r(L) is the value of r(t,L) during the stimulus of duration T. For the final approximation in Eq. (7), it is assumed that r(L) varies continuously with L over the increment ΔL so that the derivative exists, and r(L + ΔL) ≅ r(L) + (ΔL)[dr(L)/dL]. Since L and ΔL are measured in decibels, this approximation is only meaningful if the derivative of the function r(L) is taken with respect to L in decibel units. The expression in Eq. (7), which is used extensively in subsequent sections, is equivalent to the result provided by Siebert (1965, 1968).
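The quality of the derivative approximation can be illustrated numerically. The sketch below assumes the count-based expression reads (δ′)² ≅ T[dr(L)/dL]²/r(L), with the finite-increment form obtained from the change in mean count divided by the Poisson standard deviation. The compressive `rate_level` function and all parameter values are hypothetical.

```python
import math

def rate_level(level_db):
    """Hypothetical compressive rate-level function in spikes/s (illustration only)."""
    return 10.0 + 150.0 * (1.0 - math.exp(-level_db / 15.0))

def delta_prime_increment(rate, level_db, delta_db, duration_s):
    """delta' from the change in mean count over the Poisson standard deviation."""
    mean_change = duration_s * (rate(level_db + delta_db) - rate(level_db))
    return mean_change / (delta_db * math.sqrt(duration_s * rate(level_db)))

def delta_prime_derivative(rate, level_db, duration_s, step=1e-4):
    """delta' from the derivative form, with dr/dL taken per decibel."""
    slope = (rate(level_db + step) - rate(level_db - step)) / (2.0 * step)
    return math.sqrt(duration_s / rate(level_db)) * slope

T = 0.1  # stimulus duration in seconds
approx = delta_prime_derivative(rate_level, 10.0, T)
exact_small = delta_prime_increment(rate_level, 10.0, 0.01, T)
exact_large = delta_prime_increment(rate_level, 10.0, 5.0, T)
jnd_db = 1.0 / approx  # JND defined as the increment giving d' = 1

print(approx, exact_small, exact_large, jnd_db)
```

For small increments the two forms agree closely; for a 5-dB increment on this compressive function the derivative form overestimates sensitivity, which is why the approximation is restricted to near-threshold increments.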
In order to include the information provided by the specific times of the neural firings (i.e., temporal information), an expression is required for δ′ when the decision variable is equal to the left side of Ineq. (5) and the rate r(t,L) depends on t. Using approximations similar to those used for Eq. (7), the resulting expression for δ′ is given by
$$(\delta')^2 = \int_0^T \frac{1}{r(t,L)}\left[\frac{\partial r(t,L)}{\partial L}\right]^2 dt \tag{8}$$
A detailed derivation of this expression can be found in Heinz et al. (2001a).
Performance based on a simple description of the rate function for AN responses
Attention is restricted to a simple analytical expression for r(t,L) that is a good description of the discharge patterns in response to tones for many AN fibers (see Colburn 1973). Specifically, it is assumed that r(t,L) for a tonal stimulus at frequency f is given by
$$r(t,L) = r(L)\,\frac{\exp\{g(L)\cos(2\pi f t + \Theta(L))\}}{I_0[g(L)]} \tag{9}$$
This expression for r(t,L) can be understood by noting that the shape of the time dependence is described by an exponentially rectified cosine: exp{g(L) cos(2πft + Θ(L))}. The instantaneous discharge rate is monotonically related to the sinusoid in a way compatible with most AN data. When the value of the sinusoidal function is large and positive, the instantaneous discharge rate is large and positive; when the value is large and negative, the rate approaches a value near zero. The size of the level-dependent parameter g determines the degree of synchrony. This synchrony parameter g is related to the familiar synchrony index or vector strength VS by the relation VS = I_1[g]/I_0[g].
To be compatible with AN data, g(L) must increase as a function of level to a maximum value that depends on the stimulus frequency (Johnson 1980). Note that the modified Bessel function I_0[g] in Eq. (9) is equal to the time average of the exponentially rectified cosine term and thus, the average discharge rate is given by r(L). The average rate r(L) must be specified independent of the synchrony g(L) to separate the contributions of rate and synchrony cues in the computations below. Finally, note that the phase parameter Θ(L) depends on level, and the phase of the response could provide information about stimulus level, as discussed below.
The r(t,L) given in Eq. (9) can now be used to compute an explicit expression for the sensitivity index as represented in Eq. (8). Ignoring edge effects due to integrating over fractions of periods, one can show that the sensitivity index is given by the sum of three terms:
$$(\delta')^2 = \frac{T}{r(L)}\left[\frac{dr(L)}{dL}\right]^2 + T\,r(L)\left[1 - \frac{I_1^2[g]}{I_0^2[g]} - \frac{I_1[g]}{g\,I_0[g]}\right]\left[\frac{dg(L)}{dL}\right]^2 + T\,r(L)\,g\,\frac{I_1[g]}{I_0[g]}\left[\frac{d\Theta(L)}{dL}\right]^2, \qquad g = g(L) \tag{10}$$
These terms arise in the following manner: The integrand in Eq. (8) can be rewritten in terms of the rate times the square of the partial derivative of the log rate. When the log of r(t,L),
$$\ln r(t,L) = \ln r(L) + g(L)\cos(2\pi f t + \Theta(L)) - \ln I_0[g(L)] \tag{11}$$
is inserted into the equation for (δ′)² [Eq. (8)], the partial derivative with respect to L results in a sum of derivatives of the terms on the right side of Eq. (11). The square of the derivative thus produces several cross terms within the integral, which can be solved and simplified to arrive at Eq. (10). The terms in the solution can be factored into those depending upon the derivative of rate with respect to level, dr(L)/dL; those depending upon the derivative of synchrony with respect to level, dg(L)/dL; and those depending on the derivative of the phase with respect to level, dΘ(L)/dL. Thus, the three terms in Eq. (10) represent the separate contributions of changes in count (or mean rate), synchrony, and phase to optimum level-discrimination performance. Note that the first term is consistent with the results from the count-based Poisson model described above [Eq. (7)].
The analysis resulting in Eq. (10) implicitly assumes that the processor makes full use of knowledge of the statistics of the process, which includes complete knowledge of the function r(t,L). Accordingly, knowledge of the time origin of the stimulus, and thus a phase reference, is assumed in order to use all of the synchrony and nonlinear-phase information. The third term in Eq. (10) expresses the maximum information contained within the phase dependence on level. When no absolute phase reference is available (as is generally believed), nonlinear-phase information is available from differences in timing of discharges across fibers tuned to different frequencies. This issue is addressed below.
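The decomposition into rate, synchrony, and phase terms can be checked numerically. The sketch below is not the authors' computation. It assumes the rate function r(t,L) = r(L) exp{g(L) cos(2πft + Θ(L))}/I_0[g(L)] described in the text, and the closed-form terms were rederived here from that function, so their exact form is an assumption about Eq. (10). The functions `avg_rate`, `sync`, and `phase` and their coefficients are hypothetical, and the duration is assumed to span a whole number of stimulus periods so that edge effects vanish.

```python
import math

DURATION_S = 0.1  # assumed to cover a whole number of stimulus periods

def avg_rate(level_db):   # hypothetical r(L), spikes/s
    return 50.0 + 4.0 * level_db

def sync(level_db):       # hypothetical g(L)
    return 0.5 + 0.05 * level_db

def phase(level_db):      # hypothetical Theta(L), radians
    return 0.02 * level_db

def bessel_i(order, g, n=512):
    """I_order(g) as the period average of exp(g cos x) cos(order x)."""
    h = 2.0 * math.pi / n
    return sum(math.exp(g * math.cos(k * h)) * math.cos(order * k * h)
               for k in range(n)) / n

def rate_over_period(level_db, n=512):
    """Samples of r(t, L) over one stimulus period."""
    g = sync(level_db)
    norm = avg_rate(level_db) / bessel_i(0, g)
    return [norm * math.exp(g * math.cos(2.0 * math.pi * k / n + phase(level_db)))
            for k in range(n)]

def delta_prime_sq_numeric(level_db, d_level=1e-4, n=512):
    """Integral of (1/r)(dr/dL)^2 over the duration, by finite differences."""
    lo, mid, hi = (rate_over_period(x, n)
                   for x in (level_db - d_level, level_db, level_db + d_level))
    mean = sum(((h - l) / (2.0 * d_level)) ** 2 / m
               for l, m, h in zip(lo, mid, hi)) / n
    return DURATION_S * mean

def delta_prime_sq_terms(level_db, d_level=1e-4):
    """Rate, synchrony, and phase terms of the rederived closed form."""
    def deriv(fn):
        return (fn(level_db + d_level) - fn(level_db - d_level)) / (2.0 * d_level)
    r, g = avg_rate(level_db), sync(level_db)
    ratio = bessel_i(1, g) / bessel_i(0, g)
    rate_term = DURATION_S * deriv(avg_rate) ** 2 / r
    sync_term = DURATION_S * r * (1.0 - ratio ** 2 - ratio / g) * deriv(sync) ** 2
    phase_term = DURATION_S * r * g * ratio * deriv(phase) ** 2
    return rate_term, sync_term, phase_term

terms = delta_prime_sq_terms(20.0)
numeric = delta_prime_sq_numeric(20.0)
print(terms, sum(terms), numeric)
```

Agreement between the summed terms and the direct integral indicates that the cross terms cancel as described, leaving one term per cue.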
Performance of multiple-channel models: several combination rules
The peripheral auditory system is clearly a multiple-channel structure, and any serious attempt to relate level discrimination to peripheral physiology must allow cross-channel combination of information. A fundamental consideration for understanding multiple-channel models is how activities on the individual channels are combined. There are, of course, an unlimited number of possible combination rules, and predicted behavior depends on this choice. The most important rule for this analysis is probably the optimum rule: If the activities on the individual channels are specified probabilistically (including the interchannel statistical dependencies), then signal detection theory allows calculation of the best performance achievable by any processing scheme. Models exist, possibly ad hoc and complex, that can achieve any level of performance between this “best” or optimum performance and chance performance. As in most black-box modeling tasks, the merit of a given model is usually based on its simplicity and economy of assumptions relative to the amount and complexity of the data it is able to predict or describe. In this section, several combination rules are outlined that have been suggested for level discrimination.
Three specific rules illustrate some of the important issues. The first rule is the optimum combination rule, suggested and analyzed by Siebert (1965, 1968) for AN fibers and used by Florentine and Buus (1981) for combining channels in an excitation-pattern model. As noted above, this rule allows computation of the limitations imposed on performance by peripheral encoding when individual channels correspond to individual nerve fibers. The second rule, suggested by Zwicker (1956) and Maiwald (1967b,c), is based on the use of a single channel at a time. This channel is the one that results in the best performance in a given situation. The third rule, analyzed by Goldstein (1974) for loudness judgments and by Teich and Lachs (1979) for discrimination, postulates that the sum of the counts from all fibers, i.e., the total count, is the decision variable.
For each of these rules, the relation between the total sensitivity per decibel (δ′_op, δ′_sc, or δ′_tc, for optimum, single-channel, or total-count rules, respectively) and the statistics of the individual channels can be found. The relation is particularly straightforward for the optimum combination rule when it is assumed that the activities on the channels are Gaussian or Poisson and uncorrelated. In this case the optimal combination is a weighted sum of the individual-channel decision variables (when the distributions of the decision variable differ only in the means), and the square of δ′_op is the sum of the squares of the individual δ′_m; that is,
$$(\delta'_{\mathrm{op}})^2 = \sum_m (\delta'_m)^2 \tag{12}$$
where m indexes the individual channels. This relation also holds more generally (i.e., for distributions other than Gaussian or Poisson) whenever the final decision variable is specified to be an optimally weighted linear sum of the statistically independent decision variables for the individual channels. For the single-channel rule, δ′_sc is simply the maximum value of the δ′_m (maximum over all m); that is,
$$\delta'_{\mathrm{sc}} = \max_m \delta'_m \tag{13}$$
The third rule, the total-count-decision-variable case, results in the relation
$$(\delta'_{\mathrm{tc}})^2 = \frac{\left[\sum_m \Delta E_m\right]^2}{(\Delta L)^2 \sum_m V_m} \tag{14}$$
where ΔE_m is the change in the mean of the decision variable on the m-th channel and V_m is the corresponding variance. The total variance is the sum of the individual variances due to the assumption of statistically uncorrelated channels.
These relations can be compared and understood by considering a few special cases. If there is a single channel, performance is the same for all cases. If there are many channels but only one channel has a significant change in the mean for the two levels being discriminated, the optimum decision rule looks only at this channel (the same is true for the single-channel rule); however, the total count rule adds all channels. The variance therefore increases dramatically but the change in the mean remains equal to the single-channel case, thereby degrading performance relative to the other two rules. Another useful example is the case of N statistically identical channels. In this case, the optimum rule and the summation rule show an improvement in δ′ by a factor of √N relative to the single-channel rule, i.e., δ′_op = δ′_tc = √N δ′_sc.
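The special cases above are easy to reproduce. The following sketch implements the three combination rules for uncorrelated channels described by a mean change and a variance per channel. The channel statistics are made-up illustrative numbers, and the function names are not from the article.

```python
import math

def delta_prime_optimum(delta_means, variances, delta_db=1.0):
    """Optimum rule: squared sensitivities add across channels."""
    return math.sqrt(sum(d * d / v for d, v in zip(delta_means, variances))) / delta_db

def delta_prime_single_channel(delta_means, variances, delta_db=1.0):
    """Single-channel rule: use only the best channel."""
    return max(abs(d) / math.sqrt(v) for d, v in zip(delta_means, variances)) / delta_db

def delta_prime_total_count(delta_means, variances, delta_db=1.0):
    """Total-count rule: unweighted sum of all channels."""
    return sum(delta_means) / math.sqrt(sum(variances)) / delta_db

n_channels = 16

# Case 1: N statistically identical channels.
identical = ([1.0] * n_channels, [4.0] * n_channels)
op_same = delta_prime_optimum(*identical)
sc_same = delta_prime_single_channel(*identical)
tc_same = delta_prime_total_count(*identical)

# Case 2: only one channel carries a change in the mean.
one_active = ([1.0] + [0.0] * (n_channels - 1), [4.0] * n_channels)
op_one = delta_prime_optimum(*one_active)
sc_one = delta_prime_single_channel(*one_active)
tc_one = delta_prime_total_count(*one_active)

print(op_same, sc_same, tc_same)
print(op_one, sc_one, tc_one)
```

In the first case the optimum and total-count rules both gain a factor of √16 = 4 over the single channel; in the second, the total-count rule is degraded by the added variance of the uninformative channels.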
As a last general step in preparation for the specific models addressed below, consider a set of channels with identical rate-level functions except for their thresholds, which are distributed according to a density function n(L). In these conditions, the sum over m above becomes a convolution of the threshold distribution, n(L), with [δ′(L)]², the squared sensitivity per decibel of a fiber with L_thr = 0. The optimum and summed-channel combination rules then result in the following equations:
$$(\delta'_{\mathrm{op}})^2 = n * (\delta')^2 \tag{15}$$
$$(\delta'_{\mathrm{tc}})^2 = \frac{[n * \Delta E]^2}{(\Delta L)^2\,[n * V]} \tag{16}$$
where * represents the convolution operation and all quantities except ΔL are functions of the level L.
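A discrete version of the convolution for the optimum rule can be sketched as follows. Everything numeric here is an assumption made for illustration: the single-fiber function uses a linear rate-level function that saturates at `l_sat`, with hypothetical spontaneous rate and slope, and the threshold density is taken as one fiber per decibel from 0 to 60 dB.

```python
def single_fiber_dp_sq(level_re_thr, sr=10.0, slope=5.0, l_sat=40.0, duration=0.1):
    """Squared sensitivity per dB of one Poisson fiber, level in dB re: its threshold.

    Assumes a rate of sr + slope * level between threshold and l_sat and a
    constant rate elsewhere, so the sensitivity is zero outside that range.
    """
    if 0.0 <= level_re_thr < l_sat:
        return duration * slope ** 2 / (sr + slope * level_re_thr)
    return 0.0

thresholds_db = list(range(0, 61))          # hypothetical threshold grid, 1-dB steps
fibers_per_db = [1.0] * len(thresholds_db)  # hypothetical uniform density n(L)

def pooled_dp_sq(level_db):
    """Discrete convolution of the threshold density with the single-fiber curve."""
    return sum(n * single_fiber_dp_sq(level_db - thr)
               for thr, n in zip(thresholds_db, fibers_per_db))

for level in (10.0, 20.0, 45.0, 55.0):
    print(level, pooled_dp_sq(level))
```

Once the level exceeds `l_sat`, and until the highest threshold is reached, the same number of fibers sits in the unsaturated range at every level, so the pooled sensitivity is constant even though each fiber is informative over only 40 dB.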
RESULTS: LEVEL DISCRIMINATION BASED ON AVERAGE-RATE INFORMATION
The ability of single-fiber counting models to explain Weber’s Law
The degree to which the average discharge rate of individual AN fibers can account for robust level discrimination over their limited dynamic range depends on the shape of their rate-level function and on the level dependence of the variance in their response. First, several simple rate-level functions are considered for the Poisson case to evaluate the relationship between the requirements for single AN fibers to produce Weber’s Law and known physiological response properties. The sensitivity per decibel δ′ (and thus the JND) can be calculated if the rate-level function r(L) is specified. Second, the effects of non-Poisson variance on the ability to explain Weber’s Law are explored by evaluating an existing model that includes dead-time refractoriness.
The effect of rate-level shape
The first rate-level function considered is given by the following equations in which all levels are in decibels re: threshold:
where SR is equal to the spontaneous rate of discharge, and L_sat is the level above which the rate saturates. This function is plotted in Figure 1 for a value of L_sat equal to 40 dB and for several values of SR. It can be verified that, for this choice of r(L), (δ′)² in Eq. (7) is given by
In Figure 2, (δ′)² vs. L is plotted for T = 0.1 s and three values of SR, corresponding to the three classes of fibers suggested by Liberman (1978): those with low spontaneous rates (Fig. 2A with SR = 0.5 sp/s), those with medium spontaneous rates (Fig. 2B with SR = 10 sp/s), and those with high spontaneous rates (Fig. 2C with SR = 50 sp/s). Saturation at L_sat makes the function zero above L_sat. The dashed curves show the JND in decibels as a function of L for a single channel with the corresponding (δ′)². Note that all functions are plotted relative to a threshold (which would vary among AN fibers).
Several observations are relevant here. First, note that a single low-to-medium spontaneous-rate fiber provides sufficient rate information for a JND of 3 or 4 dB at levels just above threshold. When a longer duration, say T = 0.3 s, and a higher slope, say 10 sp/s/dB (achieving a discharge rate of 200 sp/s at 20 dB above threshold), are used in the calculations, a single fiber provides sufficient rate information for a JND of approximately 1 dB.
Second, note that high-SR fibers provide significantly less information in terms of average discharge rate than do low-SR fibers. In Figure 2, (δ′)² for a low-SR unit is approximately three times larger than that for a high-SR unit. This effect comes from the larger variance associated with higher means in Poisson random variables. The details of the rate-level functions vary among the different spontaneous rate groups (Sachs and Abbas 1974; Winter et al. 1990; Schoonhoven et al. 1997); however, a more precise description of the rate-level functions would be expected to have only a small quantitative, but not qualitative, effect on the calculations and conclusions of this study.
Third, the shape of the dependence of (δ′)² for a single Poisson channel with the basic rate-level function of Eq. (17) differs grossly from psychoacoustic observations. It is suggested by the results plotted in Figure 2 [and is easily verified analytically, see Eq. (7)] that whenever r(L) is increasing linearly (on a dB scale), (δ′)² is decreasing with level (due to increased variance with increases in rate). There is no physiological evidence of fibers with rate-level functions that increase faster than linearly over a range greater than 10 or 20 dB. Thus, it can be concluded that a single Poisson channel with a rate-level function compatible with available physiology cannot provide sufficient information even for Weber’s Law, let alone improvement of performance with level, or “the near miss to Weber’s Law.”
The second rate-level function considered is given by
where L is in dB relative to a threshold reference value. It is easily verified that Weber’s Law predictions are obtained from a Poisson channel with this rate-level function. In this case δ′ is equal to a constant value [10√(4cT)] that is independent of L for levels above threshold. It follows that a near-miss prediction on a single Poisson channel requires a rate-level function that grows faster than quadratically on a decibel level scale.
The last rate-level function considered has an exponential shape (on a dB level scale). This type of function was used in the counting models of McGill and Goldberg (1968a,b) and Luce and Green (1974). In both models, count is a Poisson (or nearly Poisson) random variable with a rate-level function that can be written as
$$r(L) = a\,e^{bL} \tag{20}$$
where a and b are constants and L is the level in dB re: some reference level L_ref. In this case, the resulting expression for (δ′)² is T a b² e^{bL}. The result is a JND that decreases with increasing level, and thus these models can predict the observed near-miss behavior. However, rate-level functions with this shape have not been observed in AN fibers and at best could represent combinations of many fibers.
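The three rate-level shapes discussed in this section can be compared side by side. The sketch below evaluates the count-based expression, taken here as (δ′)² ≅ T[dr(L)/dL]²/r(L), for a linear, a quadratic, and an exponential function. The constants (spontaneous rate, slope, c, a, b) are hypothetical values chosen only to make the trends visible.

```python
import math

DURATION_S = 0.1

def dp_sq(rate, level_db, duration_s=DURATION_S, step=1e-4):
    """Count-based squared sensitivity per dB for a Poisson channel."""
    slope = (rate(level_db + step) - rate(level_db - step)) / (2.0 * step)
    return duration_s * slope ** 2 / rate(level_db)

def linear_rate(level_db):        # hypothetical: 10 sp/s spontaneous, 5 sp/s/dB
    return 10.0 + 5.0 * level_db

def quadratic_rate(level_db):     # hypothetical: c = 0.1
    return 0.1 * level_db ** 2

def exponential_rate(level_db):   # hypothetical: a = 1, b = 0.1
    return 1.0 * math.exp(0.1 * level_db)

levels = [10.0, 20.0, 30.0]
linear_vals = [dp_sq(linear_rate, L) for L in levels]
quadratic_vals = [dp_sq(quadratic_rate, L) for L in levels]
exponential_vals = [dp_sq(exponential_rate, L) for L in levels]

print(linear_vals)       # decreases with level
print(quadratic_vals)    # constant with level
print(exponential_vals)  # increases with level
```

The linear function gives sensitivity that falls with level, the quadratic gives a level-independent value (Weber's Law), and the exponential gives sensitivity that grows with level (a near-miss trend).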
Since single-channel Poisson counting models of level discrimination require rate-level functions that do not represent physiological data directly, we next consider whether deviations from Poisson variability can account for Weber’s Law in single AN fibers.
The effect of deviations from Poisson variability
The importance of the assumptions for the statistical properties of the model discharge patterns is illustrated by single-channel predictions using the formulae of Teich and Lachs (1979). They give expressions for the mean and variance of the count for a dead-time-modified Poisson process, assuming that the rate of the original Poisson process grows proportionally to stimulus level in decibels. The mean and variance of the count in the modified process are given by
and
where E is the stimulus energy, E_ref is a threshold constant, T is the duration, and τ is the dead time. If (δ′)² is computed for a single channel with these statistics, one obtains
Thus, (δ′)² for a single channel would saturate and become independent of level.
This example shows the importance of variance assumptions, since the mean rate-level function [mean count in Eq. (21) divided by T] has a shape very similar to Eq. (17) with L_sat = 20 dB (if thresholds are adjusted), and yet the predicted (δ′)² in Eq. (23) is dramatically different from that given in Eq. (7) and shown in Figure 2. Also, a saturating rate-level function can provide information sufficient for Weber’s Law and even at a δ′ level consistent with a JND of 0.3 dB (δ′ = 3.2) when τ/T = 0.005.
However, a question for this article is how well non-Poisson models of this type describe AN behavior. The mean function in the model of Teich and Lachs (1979) is similar to observed rate-level functions; the variance, however, is clearly inconsistent with available data near saturation. For example, with a saturation rate of 100 sp/s, the variance of the count over 1 s at a rate of 90 sp/s is less than unity, and the coefficient of variation (the ratio of the standard deviation to the mean count) is approximately 0.01. Furthermore, this relative variability continues to decrease inversely proportionally to the stimulus energy because the model fibers are stimulated to discharge almost immediately upon the conclusion of the fixed dead time after each firing. AN data are closer to the Poisson assumption. For example, the count data from Young and Barta (1986, their Fig. 6a) show that a count of 20 discharges per 200 ms (100 sp/s) has a standard deviation of approximately 3 discharges per 200 ms. (For a count of 100 discharges over a full second, this would correspond to a standard deviation of 3√5 = 6.7.) Thus, the coefficient of variation at this mean count would be 0.067, which is roughly a factor of 1.5 less than expected for a Poisson process (0.1), but much greater than that of the model used by Teich and Lachs (1979). It follows that this non-Poisson model does not appropriately describe AN patterns and thus overestimates the amount of AN information at high levels.
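The numbers quoted for the dead-time model can be reproduced under stated assumptions. The sketch below assumes the standard fixed (nonparalyzable) dead-time counter, with mean count λT/(1 + λτ) and variance λT/(1 + λτ)³, and an underlying rate λ proportional to stimulus energy. These forms are an assumption about the Teich and Lachs expressions, not a quotation of them, and the sensitivity formula is derived here from those forms.

```python
import math

def dead_time_count_stats(lam, duration_s, dead_time_s):
    """Mean and variance of the count for the assumed dead-time-modified process."""
    factor = 1.0 + lam * dead_time_s
    mean = lam * duration_s / factor
    variance = lam * duration_s / factor ** 3
    return mean, variance

def delta_prime_dead_time(lam, duration_s, dead_time_s):
    """Sensitivity per dB when lam is proportional to energy (derived, assumed form)."""
    k = math.log(10.0) / 10.0  # d(ln lam)/dL for level L in decibels
    return k * math.sqrt(lam * duration_s / (1.0 + lam * dead_time_s))

# Saturation rate of 100 sp/s corresponds to a dead time of 10 ms.
tau = 0.01
lam_90 = 90.0 / (1.0 - 90.0 * tau)  # underlying rate giving 90 sp/s after dead time
mean_90, var_90 = dead_time_count_stats(lam_90, 1.0, tau)
cv_model = math.sqrt(var_90) / mean_90

cv_poisson = 1.0 / math.sqrt(100.0)     # Poisson, mean count of 100
cv_data = 3.0 * math.sqrt(5.0) / 100.0  # from 3 discharges per 200 ms, scaled to 1 s

# High-level limit of delta' for tau / T = 0.005.
dp_limit = (math.log(10.0) / 10.0) * math.sqrt(1.0 / 0.005)

print(mean_90, var_90, cv_model, cv_poisson, cv_data, dp_limit)
```

Under these assumptions the model variance at 90 sp/s is 0.9, its coefficient of variation is close to 0.01, and the limiting δ′ is close to 3.2 (a JND near 0.3 dB), in line with the values quoted in the text.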
This example illustrates the extent to which variance must be reduced from Poisson statistics to produce Weber’s Law and that this reduction is much greater than has been reported for AN fibers (e.g., Young and Barta 1986; Delgutte 1987; Winter and Palmer 1991). Thus, the deviation from Poisson discharge-count variance observed in AN fibers cannot account for the inability of Poisson counting models to predict robust level encoding.
The ability of multiple-CF counting models to explain the “near miss” to Weber’s Law
Since single-fiber models cannot simultaneously be consistent with physiological observations and psychophysical observations, multiple-channel models are considered. When models for level discrimination of narrowband stimuli are considered, the spread of excitation to fibers with CFs that differ from stimulus frequency becomes a central issue. This section begins with a description of a simple AN model (Siebert 1965, 1968) to demonstrate how a population of AN fibers with limited dynamic range can produce Weber’s Law. Several modifications to Siebert’s model are then discussed in terms of their ability to produce the “near miss” to Weber’s Law.
Siebert’s model of Weber’s Law based on spread of excitation
Unlike many other modeling studies that also explicitly included a spread of excitation over CF (e.g., Zwicker 1956; Maiwald 1967a,b,c; Florentine and Buus 1981), Siebert (1965, 1968) included the AN discharge patterns explicitly in his multiple-CF model. Siebert (1965, 1968) assumed optimum processing of a population of Poisson counts, which were based on a saturating rate-level function that was the same for all fibers except that tone threshold varied with CF based on AN frequency tuning. With these assumptions, level discrimination using fibers within a narrow CF band is poor except for a narrow range of levels near threshold (as discussed above), so that the robustness of performance across level is almost completely determined by the spread of excitation over CF bands. Siebert (1965, 1968) showed that Weber’s Law is predicted for tonal stimuli by this model if one assumes a uniform-in-log-frequency distribution of CFs, and two-piece linear tuning curves with constant slopes (in decibels versus log-frequency axes). Although AN fibers with CFs near the tone frequency saturate, the edges of the activity pattern provide a constant amount of information as level increases.
Possible modifications of Siebert’s model to explain the “near miss”
If the distribution of CFs is changed from uniform-in-log-frequency to uniform-in-linear-frequency, a “near miss” deviation from Weber’s Law is predicted with the amount of deviation dependent upon assumptions about the slopes of the tuning curves. This deviation is a direct consequence of having more fibers in the nonsaturated region of CFs as level increases. Specifically, the increase in the number of fibers in the nonsaturated region with CFs above the stimulus frequency is much greater than the decrease in the number with CFs below the stimulus frequency. However, the original uniform-in-log-frequency assumption is much more descriptive of available physiological data than the uniform-in-linear-frequency alternative, thus rejecting this possibility for the purposes of the present study. [Note that the uniform-in-linear-frequency assumption with this model results in the incorrect prediction that the masking of high-CF fibers results in a decrease in performance as level increases when the masking forces the system to use information on low-frequency fibers, since the number of fibers in the useful range (nonsaturated) decreases as level increases.]
If the shape of the tuning curves changes as a function of CF such that higher-CF fibers have lower slopes (decreasing Q), then the spread of excitation would proceed more quickly and place more high-CF fibers in the useful range at higher levels. A model with this assumption would also result in a “near miss” prediction. Although the narrowly tuned “tip” portion of tuning curves shows an increasing Q with increasing CF, the tails of the tuning curves at high CFs (Kiang and Moxon 1974) provide a clear physiological basis for this assumption. Other examples of the dependence of the tails on CF can also be seen in Kiang (1980) and Evans (1972). Instead of describing available tuning curves and the distribution of CFs and calculating the spread of excitation, one can measure the spread directly by measuring the distribution of thresholds for a fixed stimulus waveform for all AN fibers. A sample of measured thresholds for a 1-kHz tone from three cats can be seen in Figure 4 in Kiang and Moxon (1974). The slope of the mean threshold as a function of CF decreases with increasing CF when plotted on the log-frequency axis, consistent with the increasing number of useful fibers as the level increases (if the distribution of CFs is approximately uniform on a log-frequency scale and if the distribution of thresholds at a fixed CF is independent of CF). There are not sufficient data to characterize this factor with quantitative precision; it is clear, however, that this effect would contribute to a deviation from Weber’s Law in the observed direction, i.e., an improvement in performance with increasing level.
The third factor is the shape of the rate-level functions for fibers with CFs above and below the stimulus frequency (Sachs and Abbas 1974; Cooper and Yates 1994). The slope of the rate-level function for a given fiber decreases as the stimulus frequency increases above CF. As frequency decreases below CF, the slope either increases or remains roughly constant. This result indicates that many high-CF fibers will have steeper rate-level functions than fibers with CFs near the stimulus frequency. This would also predict an improvement in discrimination performance at higher levels (other things being equal) relative to Siebert’s prediction of Weber’s Law. If the slope increases by a factor of 3, the predicted δ′ for a single fiber increases by a factor between √3 and 3, depending on the spontaneous rate. Note that such a slope change is consistent with the nonlinear growth of the output of the high-frequency channels in Zwicker’s (1956) model that leads to a predicted improvement in performance at high levels. Furthermore, the fibers with CF below the stimulus frequency are less useful than the fibers with higher CFs. If it were possible to eliminate the higher-CF fibers, performance (i.e., sensitivity per decibel) would be expected to decrease as level increased as a consequence of this effect.
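The quoted bounds of √3 and 3 can be traced with a short calculation. This is a sketch under two assumptions that are not restated in this section: a single fiber is treated as a Poisson counter observed for a duration T (count variance equal to the mean r(L)T), and a threefold increase in slope is modeled as scaling the driven rate r_d(L) by 3 while leaving the spontaneous rate SR unchanged.

```latex
\begin{align*}
\delta'(L) &= \frac{\sqrt{T}\,\dfrac{dr}{dL}}{\sqrt{r(L)}},
  \qquad r(L) = \mathrm{SR} + r_d(L) \\[4pt]
\frac{\delta'_{3\times}}{\delta'_{1\times}}
  &= 3\,\sqrt{\frac{\mathrm{SR} + r_d}{\mathrm{SR} + 3\,r_d}}
  \;\longrightarrow\;
  \begin{cases}
    3, & \mathrm{SR} \gg r_d \quad \text{(variance set by spontaneous activity)} \\
    \sqrt{3}, & \mathrm{SR} \to 0 \quad \text{(variance grows with the driven rate)}
  \end{cases}
\end{align*}
```

The ratio decreases monotonically from 3 toward √3 as r_d grows relative to SR, which is why the gain depends on the spontaneous rate.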
To summarize the conclusions from Siebert’s model (optimum processing of stationary Poisson patterns), deviations from Weber’s Law that are comparable to psychophysical data (a near miss) could be predicted for tones by modifying the model to incorporate the tails of tuning curves for high-CF fibers and/or changes in slope of the rate-level function with tone frequency relative to CF. It is important to consider how well the data being predicted constrain the models being investigated. For example, as discussed above, many modifications of Siebert’s basic model can produce a “near miss” to Weber’s Law based on spread of excitation (also see Lachs et al. 1984; Delgutte 1996; Heinz et al. 2001a,b). Thus, the ability to predict the “near miss” rather than Weber’s Law for tones in quiet is not a critical issue for evaluating level encoding in the AN. A much stronger constraint is the ability to explain the observation that level-discrimination performance is still robust in the presence of off-frequency masking noise (e.g., Moore and Raab 1974, 1975; Viemeister 1983). The simplest (and most common) interpretation of this result is that spread of excitation is not necessary for robust level encoding. This interpretation is based on the assumption that the only influence of the off-frequency masker is prevention of any spread of excitation to CFs away from the tone frequency. If this is true, it becomes critical to account for Weber’s Law only on the basis of information in AN fibers with CFs near the tone frequency. In fact, models that assume Weber’s Law within single-CF channels produce a near miss to Weber’s Law for tones in quiet based on spread of excitation (e.g., Florentine and Buus 1981). The influence of off-frequency maskers may be more complicated than typically assumed because of nonlinear interactions between the signal and masker (e.g., Rhode et al. 1978); however, a quantitative evaluation of these effects requires a more complex AN model than is considered in the present study (see Heinz 2000; Heinz et al. 2002). Nonetheless, it is informative to evaluate level encoding in single-CF channels, and thus the next section continues with the analytical approach to examine the ability of rate information to account for Weber’s Law based on pooling across AN fibers with similar CFs.
The ability of single-CF counting models to explain Weber’s Law
In this section, level discrimination performance (as characterized by δ′ vs. L) is obtained from a population of AN fibers with a common CF (equal to the stimulus frequency). The results depend upon the postulated combination rule as well as the set of assumptions about the discharge patterns. This section focuses on encoding in terms of discharge rate for illustrative purposes, while contributions of temporal information are evaluated below.
Optimum processing
First consider optimum processing of time-invariant Poisson processes (i.e., optimally weighted Poisson counting variables) with rate-level functions given by Eq. (17) as plotted in Figure 1. Since δ′ for L_thr = 0 has been calculated for this case (as plotted in Fig. 2), overall performance can be calculated by combining across individual AN fibers according to Eq. (15). The distribution of threshold values, n(L), must be specified along with the values for spontaneous discharge rate. To specify the thresholds, the observation that the rate thresholds of fibers at their CFs are (negatively) correlated with the spontaneous rates of discharge (SRs) is incorporated (Liberman 1978). Three distributions of thresholds are chosen, one for each of the SR categories (low SR = 0.5 sp/s, medium SR = 10 sp/s, and high SR = 50 sp/s). The threshold distributions shown in Figure 3 are based on the data of Liberman (1978). With these assumptions, the optimum sensitivity per decibel is given by
\[ (\delta'_{op})^2(L) = (\delta'_L)^2 \ast n_L(L) + (\delta'_M)^2 \ast n_M(L) + (\delta'_H)^2 \ast n_H(L) \]
where δ′L, δ′M, and δ′H are the sensitivities per decibel for L_thr = 0 described by the functions in Figure 2 for the low, medium, and high SR cases, respectively; n_L(L), n_M(L), and n_H(L) represent the threshold distributions shown in Figure 3; and * represents convolution. The result of this calculation for (δ′op)² is shown in Figure 4 for a band that is assumed to contain 2200 fibers, corresponding roughly to the number of fibers in a single 1/3-octave band of CFs when frequencies are uniformly distributed on a logarithmic scale (1350 high-SR, 500 medium-SR, and 350 low-SR fibers).
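The convolution of single-fiber sensitivity with the threshold distributions can be sketched as follows. Only the SR values, the fiber counts, and T = 0.1 s are taken from the text. The logistic rate-level function stands in for Eq. (17), and the Gaussian threshold distributions (means and standard deviations) stand in for Figure 3; both are hypothetical placeholders, so the numbers produced are illustrative and are not a reproduction of Figure 4.

```python
import math

T = 0.1                          # counting window, s
LEVELS = list(range(-20, 121))   # 1-dB grid of candidate thresholds (assumed axis)

def rate(level_re_thr, sr, r_driven=200.0):
    """Hypothetical saturating rate-level function; placeholder for Eq. (17)."""
    return sr + r_driven / (1.0 + math.exp(-(level_re_thr - 15.0) / 5.0))

def info_thr0(level, sr, step=1e-3):
    """Squared sensitivity per dB for one Poisson fiber with threshold at 0 dB."""
    slope = (rate(level + step, sr) - rate(level - step, sr)) / (2 * step)
    return T * slope ** 2 / rate(level, sr)

def gaussian_counts(n_fibers, mean_thr, sd_thr):
    """Hypothetical threshold distribution n(L) on the grid, summing to n_fibers."""
    weights = [math.exp(-0.5 * ((lev - mean_thr) / sd_thr) ** 2) for lev in LEVELS]
    total = sum(weights)
    return [n_fibers * w / total for w in weights]

def convolve_at(info_fn, n_of_thr, level):
    """Discrete convolution: sum over thresholds of n(L_thr) * info(level - L_thr)."""
    return sum(n * info_fn(level - thr) for thr, n in zip(LEVELS, n_of_thr))

# (SR in sp/s, number of fibers, assumed mean threshold, assumed SD of threshold)
GROUPS = [
    (0.5, 350, 30.0, 8.0),    # low SR, higher thresholds
    (10.0, 500, 15.0, 6.0),   # medium SR
    (50.0, 1350, 5.0, 4.0),   # high SR, lower thresholds
]

def optimum_info(level):
    """Sum of the three convolutions, i.e. the squared optimum sensitivity."""
    total = 0.0
    for sr, n_fibers, mean_thr, sd_thr in GROUPS:
        n_of_thr = gaussian_counts(n_fibers, mean_thr, sd_thr)
        total += convolve_at(lambda u, sr=sr: info_thr0(u, sr), n_of_thr, level)
    return total
```

Evaluating `optimum_info` over a range of levels gives a curve whose shape depends entirely on the assumed rate-level function and threshold distributions; with saturating functions such as the one used here, it falls off once most fibers are saturated.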
It is apparent in Figure 4 that optimum use of the counts on all fibers in a common CF band does not predict a level dependence corresponding to Weber’s Law or the near miss to Weber’s Law. Rather, it predicts a significant decrease in performance as level increases above about 15 dB. However, predictions for reference levels near 15 dB using 2200 fibers are better than observed performance (e.g., δ′op ≅ 4.5, whereas δ′observed ≅ 1 since the JND ≅ 1 dB). The inability of single-CF Poisson rate information to account for Weber’s Law is consistent with similar studies that have used more accurate rate-level shapes (i.e., that vary with spontaneous rate and threshold) and discharge-count variance based on AN data from cat (e.g., Delgutte 1987; Viemeister 1988; Winslow and Sachs 1988). In contrast, Winter and Palmer (1991) predicted robust level-discrimination performance over at least 110 dB based on single-CF AN rate-level responses in guinea pig. Robust level encoding at high levels in their model resulted from the contribution of high-threshold, low-SR fibers with nonsaturating (“straight”) rate-level functions. However, “straight” rate-level functions were not observed in the guinea pig data for CFs below 1.5 kHz (Winter and Palmer 1991) and have not been observed in data from cat at any CF (e.g., Sachs and Abbas 1974; Delgutte 1987; Winslow and Sachs 1988). Thus, optimal processing of rate information within a single-CF band does not generally predict Weber’s Law. This conclusion implies that this type of rate-based single-CF model alone cannot describe the action of a single (critical-band) channel in models of the type suggested by Zwicker (1956) and Maiwald (1967a,b,c) since performance [e.g., δ′ (L)] is postulated to be independent of L for a single channel stimulated at its CF. However, the wide dynamic range over which enough single-CF rate information is available to account for human performance suggests that combination rules other than the optimal rule should be examined.
Other (nonoptimal) combination rules
Throughout the level range for which predicted performance is superior to observed performance (from less than 0 dB to greater than 70 dB in Fig. 4), there is generally sufficient information available in this single band of fibers to allow performance equal to observed performance if appropriate nonoptimum processing is assumed. This means in essence that many nonoptimum models could describe the observed results in this range. Most of these nonoptimum models may be contrived and ad hoc, but some may be simple and appealing.
In the discussion of combination rules above, total-count and single-fibers-at-a-time rules were considered in addition to the optimum rule. The total count statistic can give performance only equal to or poorer than optimum. Since saturated fibers contribute maximum variance and a negligible change in the mean to the total count, total-count performance will be significantly worse than optimum at high levels. Since this degradation will be relatively less important at lower levels, the total-count statistic will give a description of level discrimination that is even worse (more rapid decrease in performance with level) than the optimum use of Poisson counts. Further, as seen in Figure 2, a single-fiber-at-a-time rule does not provide adequate sensitivity; however, a similar rule applied to groups of fibers (i.e., using a different set of fibers at each level, e.g., Winslow et al. 1987) could be constructed to give Weber’s Law performance over a range of at least 80 dB. Similarly, Delgutte (1987) has shown that a combination rule in which low-SR, high-threshold fibers were processed more efficiently than high-SR, low-threshold fibers could extend the dynamic range over which Weber’s Law was predicted; however, performance still degraded significantly above 80 dB SPL.
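The relation between the total-count and optimum rules can be checked numerically. The following sketch assumes independent Poisson counts and a hypothetical logistic rate-level function (the rates and thresholds are illustrative, not taken from the paper). The optimum rule sums the squared per-fiber sensitivities, whereas the total-count rule divides the summed change in mean count by the square root of the summed variance.

```python
import math

T = 0.1  # counting window, s

def rate(level_re_thr, sr=50.0, r_driven=200.0):
    """Hypothetical saturating rate-level function (logistic in dB)."""
    return sr + r_driven / (1.0 + math.exp(-(level_re_thr - 15.0) / 5.0))

def compare_rules(level_db, thresholds_db, step=1e-3):
    """Return (optimum, total_count) sensitivity per dB for a set of fibers."""
    means, slopes = [], []
    for thr in thresholds_db:
        u = level_db - thr
        means.append(T * rate(u))  # Poisson: mean count equals count variance
        slopes.append(T * (rate(u + step) - rate(u - step)) / (2 * step))
    optimum = math.sqrt(sum(s * s / m for s, m in zip(slopes, means)))
    total_count = sum(slopes) / math.sqrt(sum(means))
    return optimum, total_count
```

By the Cauchy-Schwarz inequality the second value can never exceed the first. The two coincide when all fibers are identical, and the gap opens when saturated fibers add variance to the total count without adding any change in the mean, for example in `compare_rules(60.0, [0.0, 20.0, 40.0, 60.0])`.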
The considerations for cases in which only fibers within a single CF band are available can be summarized as follows: Performance based on rate information would ultimately degrade at high levels, and therefore the full range of CFs must be included to understand level discrimination of tones at the highest levels. When all fibers within a given CF band are included, and when all uncertainties are considered, it is not possible to exclude the possibility of performance consistent with Weber’s Law over a wide range of levels using only the rate information in a single-CF band. However, a parsimonious and general model for predicting robust level encoding based on the processing of average-rate information does not exist at this time. Thus, it is of interest to extend the analytical approach used in the present study to the quantification of other sources of level information contained in single-CF AN responses, specifically temporal information.
RESULTS: LEVEL DISCRIMINATION BASED ON TEMPORAL INFORMATION
The ability of synchrony information to explain Weber’s Law
The time-varying Poisson single-channel case [Eq. (10)] is considered here, assuming that the time-varying rate-level function is given by Eq. (9) with Θ independent of level (i.e., level-dependent synchrony is included, but not level-dependent phase). Since the characteristics of the first term in Eq. (10) (i.e., rate information) have been described above, attention is focused on the second, synchrony term.
To evaluate the effect of the second term in Eq. (10), specific assumptions about the function g(L) are made. The maximum value of g(L) depends on frequency; in cat, the largest values are about 5 and occur for low frequencies (as do the largest slopes of g vs. L) (Johnson 1980). The maximum value of g(L) decreases steadily above about 1–3 kHz (Johnson 1980; Weiss and Rose 1988; Koppl 1997). As a convenient approximation to available data (Evans 1980; Johnson 1980), it is assumed in the following that g(L) increases linearly over a range of 20 dB as shown in Figure 5A for a low-frequency fiber. Also, since the discharge patterns on AN fibers often show phase-locking to the stimulus at levels below the level at which the average rate of discharges starts to increase (Johnson 1980), a hypothetical fiber is considered for which g(L) increases to its maximum value before the rate increases above the spontaneous rate. [In actuality, the dynamic range for synchronization partly overlaps that of average rate, but the conclusions drawn here are not affected by this simplification.] For easy comparison to the average-rate-alone results in Figure 2, the duration is again taken to be T = 0.1 s. For dg(L)/dL = 1/4, (δ′)² reduces to (5/8) SR d²[ln I_0(g)]/dg², where SR is the spontaneous rate. This function is plotted in Figure 5B for two values of SR (SR = 50 sp/s and SR = 10 sp/s). Note that in contrast to rate information, which decreases as SR increases, synchrony information increases with SR because of the increased number of discharges that encode temporal information.
This example shows that synchrony can provide much information for level discrimination at low frequencies. Since the synchrony threshold is clearly below the rate threshold, this source of information could extend the range of levels over which a single fiber could provide robust performance. If synchrony information is included, the (δ′)² for synchrony in Figure 5B is essentially added to each fiber’s (δ′m)² from the rate-alone analysis in accordance with Eq. (10) above. For the single-CF population model considered above (see Fig. 4), this information could add 10–15 dB to the range of levels over which (δ′op)² is above observed performance, but it does not change the fact that predicted performance deteriorates rapidly at high levels.
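The synchrony expression for the hypothetical fiber can be evaluated directly. The sketch below takes the quoted form (5/8) SR d²[ln I_0(g)]/dg² at face value and assumes that g(L) ramps linearly from 0 to 5 over 20 dB starting at an arbitrary onset level of 0 dB (the onset level is an assumption made for illustration). The Bessel functions are computed from their standard integral representation so that only the Python standard library is needed, and the second derivative is written as (1 + I_2/I_0)/2 − (I_1/I_0)², which is algebraically equivalent and avoids dividing by g.

```python
import math

G_MAX = 5.0      # maximum synchrony parameter (about 5 in cat at low frequencies)
RAMP_DB = 20.0   # g(L) rises linearly over 20 dB, so dg/dL = 1/4 per dB
PREFACTOR = 5.0 / 8.0   # as quoted in the text for dg/dL = 1/4 and T = 0.1 s

def bessel_i(order, g, n=2000):
    """Modified Bessel function I_order(g), integer order, via
    (1/pi) * integral_0^pi exp(g cos t) cos(order t) dt (midpoint rule)."""
    h = math.pi / n
    acc = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        acc += math.exp(g * math.cos(t)) * math.cos(order * t)
    return acc * h / math.pi

def d2_ln_i0(g):
    """Second derivative of ln I_0(g) with respect to g."""
    i0 = bessel_i(0, g)
    return 0.5 * (1.0 + bessel_i(2, g) / i0) - (bessel_i(1, g) / i0) ** 2

def sync_info(level_db, sr, onset_db=0.0):
    """Synchrony term of the squared sensitivity; zero where g(L) is flat."""
    x = (level_db - onset_db) / RAMP_DB
    if not 0.0 < x < 1.0:
        return 0.0
    return PREFACTOR * sr * d2_ln_i0(G_MAX * x)
```

Under these assumptions the term is largest near the bottom of the ramp, where d²[ln I_0(g)]/dg² approaches 1/2, and it scales in direct proportion to SR.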
The ability of nonlinear-phase information to explain Weber’s Law
An additional source of information in the phase-locked discharges of low-frequency AN fibers is the nonlinear phase (Anderson et al. 1971), which introduces the third term on the right side of Eq. (10). As mentioned above, the usefulness of this cue is dependent upon either the availability of an absolute phase reference, which is unlikely, or the use of relative times of the discharges of fibers with different CFs (Carney 1994). The Poisson model with nonlinear phase cues can be studied using the expression for the time-varying rate given in Eq. (9), which includes level-dependent rate and synchrony in addition to level-dependent phase. The average-rate-level function used in this section is described in Eq. (17) (Fig. 1) and the level-dependent synchrony was described in the last section (Fig. 5A).
The level-dependent phase is described by a simple function that captures the key features described by Anderson et al. (1971) for AN responses, Ruggero et al. (1997) for basilar membrane responses, and Cheatham and Dallos (1998) for inner hair cells. The phase of a fiber’s response to tones has increasing lag as a function of level in response to stimulus frequencies below CF, has no change with level at CF, and has decreasing lag with level in response to frequencies above CF. Figure 6 shows the dependence of phase on frequency for a single model fiber’s responses at several levels; the plotted phases are referenced to phase at 90 dB SPL [using Anderson et al.’s (1971) convention]. The model phase varies linearly between 30 and 90 dB SPL. This is a conservative range of levels over which the nonlinear-phase cue might convey information for level discrimination; Ruggero et al. (1997) showed that in the most sensitive experimental preparations, the compressive nonlinearity has a threshold of about 20 dB SPL and extends to levels of 100 dB or higher. The maximum difference in phase between the nonlinear-phase threshold (30 dB SPL) and 90 dB SPL is specified as π/2, and that maximum is reached at frequencies 1/2-octave above and below CF.
This AN model has a highly simplified representation of the nonlinear phase, which facilitates the calculations here. A more accurate representation would vary the amount and frequency range of the level-dependent phase as a function of CF to incorporate the change in the strength of the active process as a function of CF (see Heinz 2000). Nevertheless, the form chosen here yields phase-level curves that are comparable to those of Anderson et al. (1971) for low CFs. As in the treatments of level-dependent rate and synchrony, the details of the level-dependent phase are not important to the goal of illustrating a method for quantifying the information in this neural cue.
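The simplified phase rule described above can be written as a short function. This is a sketch of one reading of the description, not the authors' implementation: the phase is taken to grow linearly with frequency offset (in octaves) up to the 1/2-octave points, and holding it at its maximum beyond those points is an assumption. The sign convention (phase lag relative to the lag at 90 dB SPL) is also a choice made here.

```python
import math

L_LOW, L_REF = 30.0, 90.0   # dB SPL: phase varies linearly between these levels
MAX_SHIFT = math.pi / 2     # maximum phase difference between 30 and 90 dB SPL
HALF_OCTAVE = 0.5           # offset re CF at which the maximum is reached

def phase_re_90(level_db, freq_hz, cf_hz):
    """Phase lag in radians relative to the lag at 90 dB SPL.

    Negative values mean less lag than at 90 dB SPL (below CF, where lag
    grows with level); positive values mean more lag (above CF, where lag
    shrinks with level); the value is zero at CF.
    """
    octaves = math.log2(freq_hz / cf_hz)
    # Assumed: linear in octaves out to +/- 1/2 octave, then held constant.
    frac = max(-1.0, min(1.0, octaves / HALF_OCTAVE))
    level = max(L_LOW, min(L_REF, level_db))         # no change outside 30-90 dB
    remaining = (L_REF - level) / (L_REF - L_LOW)    # 1 at 30 dB, 0 at 90 dB
    return MAX_SHIFT * frac * remaining
```

For a tone a quarter-octave below CF, the function gives a total excursion of π/4 between 30 and 90 dB SPL, i.e., half of the maximum cue.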
When quantifying the information for level discrimination that is available in responses that contain all three level-dependent response properties, the three terms in Eq. (10) can be plotted separately to illustrate the relative contributions of each cue. The upper panels of Figure 7 show rate r(L), synchrony g(L), and phase Θ(L) versus level for a high-SR, 1200-Hz CF model fiber in response to a 1000-Hz tone. Siebert’s (1968) tuning curve function,
was used to compute the threshold for this off-CF tone. For illustration, the frequency of the tone was chosen to be approximately a quarter-octave below CF, resulting in a half-maximal phase cue (see Fig. 6). Recall that the nonlinear-phase cue for tones exists only for fibers responding to frequencies above or below CF. The lower panels of Figure 7 show (δ′)² for each of the three terms in Eq. (10). The rate-level and sync-level functions are shifted approximately 15 dB to the right compared with Figures 1 and 5 because the fiber is responding to a tone at a frequency away from CF. As before, the changes in rate with SPL contribute information over a limited level range between rate threshold and L_sat. The synchrony contributes a relatively large amount of information, but only at very low levels. In contrast, the nonlinear phase contributes values of (δ′)² comparable to those of the rate term, which are maintained at mid-to-high SPLs. The nonlinear-phase cue increases from 30 to 55 dB SPL because the rate-level function has still not saturated at these levels. Above 55 dB SPL, where rate is saturated, the phase cue remains constant until 90 dB SPL, where the phase becomes level-independent in the model and no information about level change is provided.
Relative amounts and CF distributions of rate, synchrony, and nonlinear-phase information
The definition of the time-varying rate function in Eq. (9) resulted in the ability to “parse” the level information into the three terms in Eq. (10). The overall information for level discrimination contributed by the three cues can be examined by simply summing the three terms of (δ′)² (Fig. 8), which illustrates the differing importance of the rate and temporal forms of information over different ranges of sound levels for a single fiber. Of course, the distribution of information provided by some of these cues also varies with CF. The CFs that convey information in the form of rate and synchrony vary with level because of spread of excitation, saturation, and the change in amount of compression as a function of CF.
Figure 9 illustrates (δ′)² vs. CF for the three terms in Eq. (10) and their sum at three sound levels of a 1000-Hz tone. The level that excites each model fiber is determined by the simple triangular tuning-curve filter described in Eq. (25). The effects of saturation for fibers with CF near the stimulus frequency and the spread of excitation with increasing level are clear in the “rate” and “synchrony” terms. The “phase” term illustrates that, at moderate-to-high levels, the fibers tuned near the tone frequency have information for level discrimination. The sum of the three terms illustrates that the CF range near the tone frequency provides information at all three SPLs, due to synchrony and rate at low sound levels and to phase at moderate-to-high sound levels. Thus, at low CFs, where the average-rate dynamic ranges of both low- and high-SR fibers are limited, the nonlinear-phase cues may be especially important for conveying information related to changes in level.
GENERAL DISCUSSION
This study explored several issues related to the encoding of level in AN discharge patterns. Analytical models of AN tone responses and signal detection theory were used to quantify optimal performance limits based on the stochastic responses of the AN. Simple analytical AN models provided insight into the relative importance of different sources (rate and temporal) of neural information for level encoding. Specifically, simple equations were derived for the relative contributions of average-rate, synchrony, and phase cues. The inclusion of temporal information in analytical AN models extends previous modeling studies of level encoding, which have been primarily limited to average-rate information (e.g., Siebert 1965, 1968; Delgutte 1987; Winslow and Sachs 1988; Viemeister 1988; Winter and Palmer 1991).
The ability of individual AN fibers to robustly encode level changes based on average rate depends on the shape of the rate-level function and on the nature of the discharge randomness. It was shown that the rate information provided by individual AN fibers is maximal at stimulus levels within 5–10 dB above fiber threshold and that information begins to degrade at levels well below those for which rate saturation limits performance. This degradation is primarily due to the variance of AN discharge counts increasing significantly with increases in rate, while AN rate-level curves do not increase faster than linearly (versus decibels) over wide level ranges. Thus, individual AN fibers are even more limited in their ability to robustly encode changes in stimulus level based on rate than saturation would suggest.
Since there is considerably more than enough information in the AN population response to allow observed performance in level discrimination in quiet over a wide range of levels, the interesting question becomes how to understand the parametric dependencies and the effects of off-frequency maskers. It is typically assumed that good performance in the presence of off-frequency maskers implies that Weber’s Law must be produced by AN fibers within a narrow CF band. However, consistent with previous studies, it was shown here that optimal processing of average-rate information does not account for Weber’s Law based on fibers with a limited CF range because performance degrades significantly above about 40 dB SPL.
While the predicted trends in optimal performance were inconsistent with behavioral performance, it is not possible to rule out rate-based models because there is enough total rate information to account for robust level-discrimination performance over a wide range of levels (for general discussions of the use of optimal performance limits to evaluate neural encoding, see Siebert 1968, 1970; Colburn 1973; Delgutte 1996; Heinz et al. 2001a). Optimal performance limits superior to behavioral performance suggest the need for a suboptimal combination scheme (as discussed below). However, the strong degradation in rate information as level increases above medium levels suggests that parsimonious suboptimal combination schemes based on rate information may not exist and that other sources of neural information may be needed to account for robust level encoding in the AN.
The analytical AN model used in the present study allowed for the quantitative comparison of the relative contributions of rate and of temporal information. The level dependence of synchrony provides information that extends the dynamic range for robust level encoding at low frequencies, but only at low levels. Thus, synchrony information per se does not help account for robust level encoding at high levels based on fibers within a narrow range of CFs. In contrast, it was shown that nonlinear-phase cues provide robust level information within a narrow CF range over a wide range of levels, including high levels.
The third term of Eq. (10) illustrates the dependence of nonlinear-phase information on basic AN response properties. It was shown that phase information depends not only on the rate of change in phase with level, but also on average discharge rate and strength of synchrony. This makes sense intuitively, as changes in phase are easier to decode when many spikes are observed and when these spikes are strongly phase locked to the stimulus. This dependence implies that nonlinear-phase information at low frequencies is robust up to high levels in all fibers because average rate and synchrony are essentially constant at levels more than ~30 dB above fiber threshold, and the rate of change of the phase is essentially constant with level (Anderson et al. 1971; Ruggero et al. 1997).
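The stated dependence on rate, synchrony, and phase slope can be made explicit with a short derivation. This is a sketch under assumptions: Eqs. (9) and (10) are not reproduced in this section, so the exponential-cosine form of the time-varying rate below is inferred from the I_0(g) that appears in the synchrony expression, the tone frequency is written as f, and the duration T is taken to span many stimulus cycles. The normalization of the corresponding term in Eq. (10) may differ.

```latex
\begin{align*}
r(t;L) &= \bar r(L)\,
  \frac{\exp\{\,g(L)\cos[\,2\pi f t - \Theta(L)\,]\}}{I_0[g(L)]}
  && \text{(assumed form of the rate)} \\[4pt]
\frac{\partial r}{\partial \Theta} &= r\, g \,\sin(2\pi f t - \Theta) \\[4pt]
(\delta')^2_{\text{phase}}
  &= \int_0^T \frac{1}{r}
     \left(\frac{\partial r}{\partial \Theta}\,\frac{d\Theta}{dL}\right)^{2} dt
   = \bar r\, T\; g\,\frac{I_1(g)}{I_0(g)}\,
     \left(\frac{d\Theta}{dL}\right)^{2}
\end{align*}
```

The last step uses the cycle average of exp(g cos φ) sin²φ, which equals I_1(g)/g. In this form the phase information is proportional to the expected number of spikes, grows with the synchrony parameter through g I_1(g)/I_0(g), and scales with the square of the phase slope, consistent with the intuition given above.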
The relation between nonlinear-phase responses and nonlinear tuning implies that nonlinear-phase cues exist over the entire range of levels for which the cochlear amplifier produces compressive BM responses (i.e., at least up to ~90 dB SPL; Ruggero et al. 1997). The predicted optimal performance limits do not depend on (or suggest) a specific mechanism for decoding the nonlinear-phase cues. However, these phase cues can be decoded by any mechanism that compares the relative phase response across fibers with different CFs (discussed further below) because the level dependence of phase differs across frequency relative to CF (Anderson et al. 1971; also see Fig. 6). Thus, nonlinear-phase responses appear to provide a realistic source of robust level information near CF and may provide an alternative explanation at low frequencies to the level-dependent processing schemes that are necessary to account for Weber’s Law based on average rate.
The present study provides constraints for two possible explanations, one based on average rate and one based on nonlinear phase, for robust level encoding at high stimulus levels based on AN fibers within a narrow range of CFs. As discussed in the following paragraphs, a specific neural mechanism has been proposed for each explanation of how AN information could be decoded in the cochlear nucleus to produce robust level encoding. Further support for or against each explanation can be garnered by considering whether there are cell types in the cochlear nucleus that could perform the proposed neural processing.
Winslow et al. (1987) have proposed a “selective listening” mechanism in which average-rate information from high-SR, low-threshold fibers is used at low levels, while that from low-SR, high-threshold fibers is used at high levels. Lai et al. (1994) have demonstrated that such a selective-listening strategy can be performed by a simple model of a cochlear nucleus stellate neuron based on shunting inhibition. However, the required anatomical innervation patterns of the different SR fibers to stellate neurons and quantitative psychophysical predictions have not been demonstrated for this mechanism. Furthermore, it is not clear that a model that relies solely on low-SR fibers at high levels would produce Weber’s Law because the information provided by low-SR fibers also begins to degrade within 10 dB above their threshold (see Fig. 2).
Carney (1994) suggested that a monaural, across-frequency coincidence detection mechanism could be used to decode the level information provided by nonlinear-phase cues. There is physiological evidence that some cell types in the cochlear nucleus (e.g., globular bushy cells) have responses that are consistent with a coincidence-detection mechanism (e.g., Carney 1990; Joris et al. 1994a,b). Heinz et al. (2001b) have quantitatively evaluated the ability of a simple across-frequency coincidence-counting mechanism to account for robust level encoding based on the information in AN responses. They showed that a near-CF population of coincidence counters could reliably decode the robust nonlinear-phase cues provided at low frequencies. In addition, the coincidence-detector population also produced Weber’s Law at high frequencies based on the more robust average-rate cues associated with stronger compression at high frequencies. Carney et al. (2002) have also demonstrated the ability of a monaural, across-frequency, coincidence-detection mechanism to account for detection of tones in noise. Future physiological studies are needed to test specific single-unit-response predictions for the coincidence-detection mechanism, as well as the selective-listening mechanism, in order to provide further support for the types of AN information that are important for robust level encoding.
As noted above, available data indicate that AN phase locking to the cycles of a tone decreases at frequencies higher than approximately 1–3 kHz (Johnson 1980; Weiss and Rose 1988; Koppl 1997); however, this rolloff in synchrony was not included in the simple analytical model. If it is assumed that synchrony information in the human AN is similarly reduced at high frequencies, then the information conveyed by synchrony and level-dependent phase cues for encoding the level of a tone is significantly reduced at high frequencies. At high frequencies, the contributions of rate cues from high-threshold, low-SR fibers with wider dynamic ranges are potentially more important (Winter and Palmer 1991; Heinz 2000; Heinz et al. 2001b). The low-SR fibers depend upon large amounts of compression for their wide dynamic ranges and the amount of compression increases as a function of CF (e.g., Cooper and Yates 1994). These facts are consistent with the observation that fibers with non-saturating (“straight”) rate-level functions are not observed at CFs below about 1500 Hz (Winter and Palmer 1991) in guinea pig.
If robust level encoding were dependent on phase cues at low frequencies and on rate cues at high frequencies, then a variation in level-discrimination performance across frequency could be expected. However, Heinz et al. (2001b) have shown that linear spread of excitation plays a strong role for level discrimination in quiet, which suggests that this frequency effect would be subtle. In fact, a subtle frequency dependency has been observed in level-discrimination performance (Jesteadt et al. 1977; Florentine et al. 1987). While the near miss to Weber’s Law occurs for low frequencies, a small but significant nonmonotonicity in performance as a function of level occurs at high frequencies. This “midlevel bump,” which begins to appear between 1 and 4 kHz, can be accounted for by the strong BM compression at high frequencies that starts around 30 dB SPL (Heinz et al. 2001b). Finally, it should also be mentioned that the present analysis does not address the time variation in the rate that occurs after the onset of a stimulus (Smith and Brachman 1979); the level dependence of this adaptation (i.e., a wider dynamic range at onset) could also provide level information (cf. Evans 1980) and is not limited to low frequencies.
The significance of potential variations across species is another issue that requires future work. For example, “straight” rate-level curves have been observed in guinea pig AN responses for high CFs (Winter et al. 1990) but not for low CFs (Winter and Palmer 1991), whereas “straight” rate-level functions have not been observed for any CFs in cat (e.g., Sachs and Abbas 1974; Delgutte 1987; Winslow and Sachs 1988). This result suggests that the strength and frequency dependence of compression may differ for cats and guinea pigs. Heinz et al. (2001b) have demonstrated that the strength of compression has a large effect on the ability of near-CF rate information to account for Weber’s Law. Thus, an important remaining issue is the strength of compression in humans relative to species for which physiological BM and AN data are available. Psychophysical methods have recently been developed that estimate BM compression based on forward-masking studies (e.g., Oxenham and Plack 1997; Nelson et al. 2001). These methods have been shown to produce estimates of human compression that are consistent with the amount of BM compression that has been measured at high frequencies. However, these methods rely on assumptions for which the physiological evidence at low frequencies is not definitive, e.g., that below-CF responses are linear. These methods show promise for estimating the strength of cochlear nonlinearity in humans, but their ability to accurately estimate compression strength as a function of frequency remains to be shown.
In summary, it is likely that level discrimination is mediated by a multiplicity of attributes of the physiological data and that the relative usefulness of these attributes is dependent upon the stimulus circumstances, such as masked or unmasked, wideband or narrowband, short or long duration, and fast or slow stimulus onsets and offsets. The present study provides a quantitative framework to analyze and compare different types of information available in AN responses for encoding level. Future studies with more complex AN models can extend the results in the present study by using this quantitative approach.
References
DJ Anderson, JE Rose, JE Hind, JF Brugge (1971) Temporal position of discharges in single auditory nerve fibers within the cycle of a sinewave stimulus: Frequency and intensity effects. J. Acoust. Soc. Am. 49:1131–1139
LD Braida, NI Durlach (1972) Intensity perception. II. Resolution in one-interval paradigms. J. Acoust. Soc. Am. 51:483–502
S Buus, M Florentine (1991) Psychometric functions for level discrimination. J. Acoust. Soc. Am. 90:1371–1380
LH Carney (1990) Sensitivities of cells in the anteroventral cochlear nucleus of cat to spatiotemporal discharge patterns across primary afferents. J. Neurophysiol. 64:437–456
LH Carney (1994) Spatiotemporal encoding of sound level: Models for normal encoding and recruitment of loudness. Hear. Res. 76:31–44
LH Carney, MG Heinz, ME Evilsizer, RH Gilkey, HS Colburn (2002) Auditory phase opponency: A temporal model for masked detection at low frequencies. Acustica 88:334–347
MA Cheatham, P Dallos (1998) The level dependence of response phase: Observations from cochlear hair cells. J. Acoust. Soc. Am. 104:356–369
HS Colburn (1973) Theory of binaural interaction based on auditory-nerve data. I. General strategy and preliminary results on interaural discrimination. J. Acoust. Soc. Am. 54:1458–1470
HS Colburn (1981) Intensity perception: relation of intensity discrimination to auditory-nerve firing patterns. Internal Memorandum, Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, MA
NP Cooper, GK Yates (1994) Nonlinear input–output functions derived from the responses of guinea-pig cochlear nerve fibres: Variations with characteristic frequency. Hear. Res. 78:221–234
B Delgutte (1987) Peripheral auditory processing of speech information: implications from a physiological study of intensity discrimination. In: MEH Schouten (Eds) The Psychophysics of Speech Perception. Nijhoff, Dordrecht, pp 333–353
B Delgutte (1996) Physiological models for basic auditory percepts. In: HL Hawkins, TA McMullen, AN Popper, RR Fay (Eds) Auditory Computation. Springer-Verlag, New York, pp 157–220
NI Durlach, LD Braida (1969) Intensity perception. I. Preliminary theory of intensity resolution. J. Acoust. Soc. Am. 46:372–383
EF Evans (1972) The frequency response and other properties of single fibers in the guinea pig cochlear nerve. J. Physiol. 266:263–287
EF Evans (1980) “Phase-locking” of cochlear fibers and the problems of dynamic range. In: G van den Brink, FA Bilsen (Eds) Psychophysical, Physiological and Behavioural Studies in Hearing. Delft University Press, Delft, The Netherlands
M Florentine, S Buus (1981) An excitation-pattern model for intensity discrimination. J. Acoust. Soc. Am. 70:1646–1654
M Florentine, S Buus, CR Mason (1987) Level discrimination as a function of level for tones from 0.25 to 16 kHz. J. Acoust. Soc. Am. 81:1528–1541
JL Goldstein (1974) Is the power law simply related to the driven spike response rate from the whole auditory nerve? In: HR Moskowitz (Eds) Sensation and Measurement. Reidel Publishing Co., Dordrecht-Holland
JL Goldstein (1980) On the signal processing potential of high-threshold auditory-nerve fibers. In: G van den Brink, FA Bilsen (Eds) Psychophysical, Physiological and Behavioural Studies in Hearing. Delft University Press, Delft, The Netherlands, pp 293–299
MG Heinz (2000) Quantifying the effects of the cochlear amplifier on temporal and average-rate information in the auditory nerve. PhD Thesis, Massachusetts Institute of Technology
MG Heinz, HS Colburn, LH Carney et al. (2001a) Evaluating auditory performance limits: I. One-parameter discrimination using a computational model for the auditory nerve. Neural Comput. 13:2273–2316
MG Heinz, HS Colburn, LH Carney (2001b) Rate and timing cues associated with the cochlear amplifier: Level discrimination based on monaural cross-frequency coincidence detection. J. Acoust. Soc. Am. 110:2065–2084
MG Heinz, HS Colburn, LH Carney (2002) Quantifying the implications of nonlinear cochlear tuning for auditory-filter estimates. J. Acoust. Soc. Am. 111:996–1011
LG Huettel, LM Collins (1999) Using computational auditory models to predict simultaneous masking data: model comparison. IEEE Trans. Biomed. Eng. 46:1432–1440
W Jesteadt, CC Wier, DM Green (1977) Intensity discrimination as a function of frequency and sensation level. J. Acoust. Soc. Am. 61:169–177
DH Johnson (1980) The relationship between spike rate and synchrony in responses of auditory-nerve fibers to single tones. J. Acoust. Soc. Am. 68:1115–1122
DH Johnson, NYS Kiang (1976) Analysis of discharges recorded simultaneously from pairs of auditory nerve fibers. Biophys. J. 16:719–733
PX Joris, LH Carney, PH Smith, TCT Yin (1994a) Enhancement of neural synchronization in the anteroventral cochlear nucleus. I. Responses to tones at the characteristic frequency. J. Neurophysiol. 71:1022–1036
PX Joris, PH Smith, TCT Yin (1994b) Enhancement of neural synchronization in the anteroventral cochlear nucleus. II. Responses in the tuning curve tail. J. Neurophysiol. 71:1037–1051
NYS Kiang (1980) Processing of speech by the auditory nervous system. J. Acoust. Soc. Am. 68:830–835
NYS Kiang, EC Moxon (1974) Tails of tuning curves of auditory-nerve fibers. J. Acoust. Soc. Am. 55:620–630
C Koppl (1997) Phase locking to high frequencies in the auditory nerve and cochlear nucleus magnocellularis of the barn owl, Tyto alba. J. Neurosci. 17:3312–3321
G Lachs, R Al-Shaikh, Q Bi, RA Saia, MC Teich (1984) A neural counting model based on physiological characteristics of the peripheral auditory system. V. Application to loudness estimation and intensity discrimination. IEEE Trans. Systems Man Cybern. SMC-14:819–836
YC Lai, RL Winslow, MB Sachs (1994) A model of selective processing of auditory-nerve inputs by stellate cells of the antero-ventral cochlear nucleus. J. Comp. Neurosci. 1:167–194
MC Liberman (1978) Auditory-nerve response from cats raised in a low-noise chamber. J. Acoust. Soc. Am. 63:442–455
RD Luce, DM Green (1974) Neural coding and psychophysical discrimination data. J. Acoust. Soc. Am. 56:1554–1563
D Maiwald (1967a) Beziehungen zwischen Schallspektrum, Mithörschwelle und der Erregung des Gehörs. Acustica 18:69
D Maiwald (1967b) Ein Funktionsschema des Gehörs zur Beschreibung der Erkennbarkeit kleiner Frequenz- und Amplitudenänderungen. Acustica 18:81
D Maiwald (1967c) Die Berechnung von Modulationsschwellen mit Hilfe eines Funktionsschemas. Acustica 18:194
WJ McGill, JP Goldberg (1968a) Pure-tone intensity discrimination and energy detection. J. Acoust. Soc. Am. 44:576–581
WJ McGill, JP Goldberg (1968b) A study of the near-miss involving Weber’s Law and pure-tone intensity discrimination. Percept. Psychophys. 4:105–109
BCJ Moore, DH Raab (1974) Pure-tone intensity discrimination: Some experiments relating to the “near miss” to Weber’s law. J. Acoust. Soc. Am. 55:1049–1054
BCJ Moore, DH Raab (1975) Intensity discrimination for noise bursts in the presence of a continuous, bandstop background: Effects of level, width of the bandstop, and duration. J. Acoust. Soc. Am. 57:400–405
DA Nelson, AC Schroder, M Wojtczak (2001) A new procedure for measuring peripheral compression in normal-hearing and hearing-impaired listeners. J. Acoust. Soc. Am. 110:2045–2064
AJ Oxenham, CJ Plack (1997) A behavioral measure of basilar membrane nonlinearity in listeners with normal and impaired hearing. J. Acoust. Soc. Am. 101:3666–3675
WM Rabinowitz, JS Lim, LD Braida, NI Durlach (1976) Intensity perception. VI. Summary of recent data on deviations from Weber’s law for 1000-Hz tone pulses. J. Acoust. Soc. Am. 59:1506–1509
WS Rhode, CD Geisler, DT Kennedy (1978) Auditory nerve fiber responses to wide-band noise and tone combinations. J. Neurophysiol. 41:692–704
F Rieke, D Warland, R de Ruyter van Steveninck, W Bialek (1997) Spikes: Exploring the Neural Code. MIT Press, Cambridge, MA
MA Ruggero, NC Rich, A Recio, S Narayan, L Robles (1997) Basilar-membrane responses to tones at the base of the chinchilla cochlea. J. Acoust. Soc. Am. 101:2151–2163
M Sachs, P Abbas (1974) Rate versus level functions for auditory-nerve fibers in cats: tone-burst stimuli. J. Acoust. Soc. Am. 56:1835–1847
R Schoonhoven, VF Prijs, JHM Frijns (1997) Transmitter release in inner hair cell synapses: a model analysis of spontaneous and driven rate properties of cochlear nerve fibers. Hear. Res. 113:247–260
WM Siebert (1965) Some implications of the stochastic behavior of primary auditory neurons. Kybernetik 2:206–215
WM Siebert (1968) Stimulus transformations in the peripheral auditory system. In: PA Kolers, M Eden (Eds) Recognizing Patterns. MIT Press, Cambridge, MA, pp 104–133
WM Siebert (1970) Frequency discrimination in the auditory system: place or periodicity mechanisms? Proc. IEEE 58:723–730
RL Smith, ML Brachman (1979) Dynamic response of single auditory-nerve fibers: Some effects of intensity and time. In: G van den Brink, FA Bilsen (Eds) Psychophysical, Physiological and Behavioural Studies in Hearing. Delft University Press, Delft, The Netherlands, pp 312–319
MC Teich, G Lachs (1979) A neural-counting model incorporating refractoriness and spread of excitation. I. Application to intensity discrimination. J. Acoust. Soc. Am. 66:1738–1749
NF Viemeister (1983) Auditory intensity discrimination at high frequencies in the presence of noise. Science 221:1206–1208
NF Viemeister (1988) Intensity coding and the dynamic range problem. Hear. Res. 34:267–274
TF Weiss, C Rose (1988) A comparison of synchronization filters in different auditory receptor organs. Hear. Res. 33:175–179
RL Winslow, MB Sachs (1987) Rate coding in the auditory nerve. In: WA Yost, CS Watson (Eds) Auditory Processing of Complex Sounds. Erlbaum Associates, Mahwah, NJ, pp 212–224
RL Winslow, MB Sachs (1988) Single-tone intensity discrimination based on auditory-nerve rate responses in backgrounds of quiet, noise, and with stimulation of the crossed olivocochlear bundle. Hear. Res. 35:165–189
IM Winter, AR Palmer (1991) Intensity coding in low-frequency auditory-nerve fibers of the guinea pig. J. Acoust. Soc. Am. 90:1958–1967
IM Winter, D Robertson, GK Yates (1990) Diversity of characteristic frequency rate-intensity functions in guinea pig auditory nerve fibres. Hear. Res. 45:191–202
ED Young, PE Barta (1986) Rate responses of auditory nerve fibers to tones in noise near masked threshold. J. Acoust. Soc. Am. 79:426–442
E Zwicker (1956) Die elementaren Grundlagen zur Bestimmung der Informationskapazität des Gehörs. Acustica 6:365–381
Acknowledgements
This article is an expanded and updated version of Colburn (1981), an unpublished but oft-referenced technical report. That report addressed coding of level in terms of average discharge rate and synchrony only, and this article extends the approach to include nonlinear-phase cues. We acknowledge the editorial assistance of Susan Early. This work was supported by NIH Grants R01DC00100, R01DC01641, and T32DC00038 and by National Science Foundation Grant 9983567.
Cite this article
Steven Colburn, H., Carney, L.H. & Heinz, M.G. Quantifying the Information in Auditory-Nerve Responses for Level Discrimination . JARO 4, 294–311 (2003). https://doi.org/10.1007/s10162-002-1090-6
DOI: https://doi.org/10.1007/s10162-002-1090-6