Abstract
In psychophysics, participants are often asked to discriminate between a constant standard and a variable comparison. Previous studies have shown that discrimination performance is better when the comparison follows, rather than precedes, the standard. Prominent difference models of psychophysics and decision making cannot easily explain this order effect. However, a simple extension of this model class involving dynamical updating of an internal reference accounts for this order effect. In addition, this Internal Reference Model (IRM) predicts sequential response effects. We examined the predictions of IRM in two duration discrimination experiments. The obtained results are in agreement with the predictions of IRM, suggesting that participants update their internal reference on every trial. Additional simulations show that IRM also accounts for the negative sequential effects observed in single-stimulus paradigms.
A fundamental issue in human performance concerns the mechanisms that allow people to discriminate between two stimuli (e.g., Falmagne, 1985; Gescheider, 1997; Green & Swets, 1966; Macmillan & Creelman, 2005; Wickens, 2002). It is usually assumed that when participants are asked to compare two stimuli differing in magnitude (e.g., pitch, duration, luminance), they base their judgments on the difference of the internal representations of these stimuli. Models that incorporate this idea are often based on Thurstone’s (1927a, 1927b) original difference model, which also underlies Signal Detection Theory and other accounts in psychophysics (Falmagne, 1985; García-Pérez & Alcalá-Quintana, 2010; Luce & Galanter, 1963; Yeshurun, Carrasco, & Maloney, 2008). Difference models state that the internal representation \( \mathbf{X}_1 \) of one stimulus is compared with the internal representation \( \mathbf{X}_2 \) of the other stimulus. More specifically, these models assume that an internal comparison mechanism operates on the difference \( \mathbf{D} = \mathbf{X}_1 - \mathbf{X}_2 \). When this difference is greater than a fixed constant (which is zero if the participant has no bias to prefer one of the response alternatives), the participant assumes that the stimulus associated with \( \mathbf{X}_1 \) is the larger one. Likewise, if the difference is smaller than the fixed constant, the participant assumes that the stimulus associated with \( \mathbf{X}_2 \) is the larger one.
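The decision rule of a difference model can be sketched in a few lines of Python. This is only an illustration, not an implementation taken from the cited work: the Gaussian noise, the noise level `sigma`, and the function name are assumptions introduced here.

```python
import random

def difference_model_choice(stim1, stim2, sigma=50.0, criterion=0.0, rng=random):
    """Return 1 if the first stimulus is judged larger, else 2.

    Assumption: each internal representation equals the physical magnitude
    plus independent Gaussian noise with standard deviation sigma.
    """
    x1 = rng.gauss(stim1, sigma)   # internal representation X_1
    x2 = rng.gauss(stim2, sigma)   # internal representation X_2
    d = x1 - x2                    # difference D = X_1 - X_2
    # criterion = 0 corresponds to an unbiased participant
    return 1 if d > criterion else 2
```

Because the rule treats both positions symmetrically apart from the criterion, swapping the order of the two stimuli can shift the psychometric function but cannot change its slope.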
Although difference models provide a simple discrimination mechanism, they cannot account for certain order effects on discrimination performance (Lapid, Ulrich, & Rammsayer, 2008; Rammsayer & Ulrich, 2012; Ulrich, 2010; Ulrich & Vorberg, 2009). More specifically, if a constant standard s and a variable comparison c are presented successively, difference models predict that discrimination performance as indexed by the difference limen (DL)Footnote 1 should not depend on the order of s and c, even if the observer has a bias favoring one stimulus position. In particular, these models predict equal slopes of the psychometric functions and, thus, equal DLs for stimulus orders 〈sc〉 (i.e., c follows s) and 〈cs〉 (i.e., c precedes s; see Appendix 1 for a proof). Figure 1 provides an illustration. As can be seen, the predicted psychometric functions for the two stimulus orders can be shifted—as one would expect if the participant exhibits a response bias—but their slopes and, thus, the DLs are identical.
In several studies including discrimination of temporal intervals (e.g., Lapid et al., 2008; Stott, 1935; Ulrich, 2010; Woodrow, 1935), weight discrimination (e.g., Ross & Gregory, 1964), and contrast discrimination in visual perception (Nachmias, 2006), DL estimates have been found to be larger for the 〈cs〉 than for the 〈sc〉 stimulus order. Table 1 gives a brief overview of studies in which discrimination performance on 〈sc〉 trials and 〈cs〉 trials was analyzed separately. As one can see, in all of these studies, discrimination performance is superior for 〈sc〉 trials, as compared with 〈cs〉 trials (but see Hellström, 2003, Table 7; Hellström & Rammsayer, 2004). Thus, in sharp contrast to the predictions of difference models, presentation order of stimuli does indeed affect discrimination performance.
For example, Yeshurun et al. (2008) pointed out that performance on detection tasks often differs depending on whether the to-be-detected signal is presented in the first or the second of two intervals. These authors proposed that this might be due to decisional biases or to differences in sensitivity for the comparison of both intervals. In a similar vein, Ulrich and Vorberg (2009) introduced the term Type B effect to distinguish the effect of stimulus order on discrimination performance as indexed by DL from the well-studied effect of stimulus order on the point of subjective equality (PSE).Footnote 2 The latter effect, commonly known as time-order error (TOE),Footnote 3 has been known since the advent of psychophysics and has been subjected to extensive psychophysical research since then (Fechner, 1860; Hellström, 1979, 1985, 2003; Hellström & Rammsayer, 2004; Helson, 1964; Michels & Helson, 1954; Woodworth & Schlosberg, 1954; Guilford, 1954, pp. 305–311 provides a detailed overview of the classical literature on the TOE). In contrast to the TOE, the Type B effect reflects a genuine difference in sensitivity between the two presentation orders, and it has been studied less often (see, however, Hellström, 2003, Table 7; Hellström & Rammsayer, 2004; Rammsayer & Wittkowski, 1990).Footnote 4 It is important to understand why the Type B effect occurs. A major purpose of the present work is to contribute to a better understanding of this effect’s origin.
Trial-by-trial updating of an internal reference
One possibility for the origin of the Type B effect is that participants build up a virtual standard or internal reference across the experiment (e.g., Durlach & Braida, 1969; Helson, 1947, 1964; Michels & Helson, 1954; Morgan, Watamaniuk, & McKee, 2000; Nachmias, 2006). Elaborating on this idea, Lapid et al. (2008) proposed a quantitative account of how such an internal reference \( \mathbf{I} \) is established and updated on every trial. According to this Internal Reference Model (IRM), the internal reference \( \mathbf{I}_n \) on the current trial n is a weighted sum of the internal reference \( \mathbf{I}_{n-1} \) from the previous trial and the internal representation of the first stimulus \( \mathbf{X}_{1,n} \) on the current trial. Specifically, the updated internal reference \( \mathbf{I}_n \) on trial n is assumed to follow a geometric moving average (Roberts, 1959),
\( \mathbf{I}_n = g \cdot \mathbf{I}_{n-1} + (1 - g) \cdot \mathbf{X}_{1,n} \) (1)
with the constant weight g, 0 ≤ g < 1. Participants compare the internal representation of the second stimulus \( \mathbf{X}_{2,n} \) on the current trial with this internal reference. If \( \mathbf{D}_n = \mathbf{I}_n - \mathbf{X}_{2,n} > 0 \), they judge the first stimulus to be the larger one, and, otherwise, they judge the second stimulus to be the larger one. The IRM is an extension of the standard difference model because, if weight g is set to zero, only the internal representation \( \mathbf{X}_{1,n} \) of the first stimulus is compared with \( \mathbf{X}_{2,n} \) on trial n (see Appendix 2 for further details).
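The updating and decision steps can be written out as a minimal Python sketch. It assumes that the weight g applies to the previous reference and 1 − g to the new first-stimulus representation, as implied by the special case g = 0. The function names and the noise-free example values are hypothetical.

```python
def update_reference(prev_reference, x1, g):
    """Geometric moving average: I_n = g * I_{n-1} + (1 - g) * X_{1,n}."""
    if not 0.0 <= g < 1.0:
        raise ValueError("g must satisfy 0 <= g < 1")
    return g * prev_reference + (1.0 - g) * x1

def irm_choice(reference, x2):
    """Return 1 if D_n = I_n - X_{2,n} > 0 (first stimulus judged larger), else 2."""
    return 1 if reference - x2 > 0 else 2

# Noise-free walk-through with hypothetical values (ms) and g = 0.5:
# three successive first stimuli of 400 ms pull the reference away from 500 ms.
reference = 500.0
for x1 in (400.0, 400.0, 400.0):
    reference = update_reference(reference, x1, g=0.5)
    print(reference)   # prints 450.0, then 425.0, then 412.5
```

With g = 0 the reference simply equals the current first stimulus, which reproduces the standard difference model.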
IRM can be regarded as a special case of Hellström’s (1979, 1985, 2003) prominent Sensation Weighting model, which is related to Michels and Helson’s (1954; Helson, 1964) quantitative TOE theory (Hellström 1985) and Helson’s (1947, 1964) adaptation-level theory. The complete form of Hellström’s Sensation Weighting model is given by the following equation (see, e.g., Hellström, 1985, p. 45; Hellström & Rammsayer, 2004, p. 3; Patching, Englund, & Hellström, 2012):
where d is the subjective difference between two compared successive stimuli, \( \Psi_1 \) and \( \Psi_2 \) are the sensation magnitudes of the stimuli, \( s_1 \) and \( s_2 \) are weighting coefficients, and the term b accounts for a possible judgment bias. Furthermore, \( \Psi_{r1} \) and \( \Psi_{r2} \) are reference levels that reflect the current average subjective level of stimulation. According to Hellström and Rammsayer (2004), the ratio \( s_1/s_2 \) determines the size and the direction of the time-order effect. For \( s_2 = 1 \) and b = 0, the discrimination process in the Sensation Weighting model is analogous to the discrimination process proposed by IRM.
Taken together, IRM can be regarded as intermediate between classical difference models and the Sensation Weighting model. However and most important, IRM explicitly tackles how the internal reference builds up during an experiment. Specifically, a core feature of IRM is that it incorporates a dynamic trial-by-trial updating process for the internal reference. This assumption is in line with the idea that psychometric functions may be based on a nonstationary process (Fründ, Haenel, & Wichmann, 2011). IRM’s dynamic updating process allows us to derive a set of specific predictions with regard to stimulus order and trial sequence, which will be outlined subsequently.
Predicted effects of stimulus order on DL
IRM predicts worse discrimination performance for 〈cs〉 than for 〈sc〉 trials when these stimulus orders are presented in separate experimental blocks (see Appendix 2). A rough approximation that ignores the variance of the distribution from which the comparisons are drawn shows that for blocked stimulus order, \( DL_{\langle sc \rangle} \approx (1 - g) \cdot DL_{\langle cs \rangle},\; 0 < 1 - g \leq 1 \) (for an exact analytical derivation, see Appendix 2). The worse performance on such blocked 〈cs〉 trials than on blocked 〈sc〉 trials is a consequence of integrating the variable comparison c, instead of the constant standard s, into the internal reference. However, it was unclear whether the same prediction applies when stimulus order is random, as in the two-alternative forced choice task. We therefore conducted Monte Carlo simulations to examine the predictions of the model under blocked and random stimulus order.Footnote 5 The predicted DLs of these simulations are depicted in Fig. 2 for blocked and random stimulus order and for various values of g. As one can see, the model predicts a clear order effect on DL, and this effect increases with increasing g. Surprisingly, the model predicts about the same size of order effect for blocked and random stimulus order.
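A simulation of this kind can be sketched as follows for the blocked conditions. This is not the simulation code used for Fig. 2. It assumes Gaussian internal noise with a hypothetical standard deviation of 30 ms, an initial reference equal to s, the update rule with weight g on the previous reference, and a DL read off by linear interpolation of the 25 % and 75 % points rather than by fitting a psychometric function.

```python
import random

def crossing(levels, props, target):
    """Linearly interpolate the level at which the proportion first reaches target."""
    pairs = list(zip(levels, props))
    for (c0, p0), (c1, p1) in zip(pairs, pairs[1:]):
        if p0 <= target <= p1 and p1 > p0:
            return c0 + (target - p0) * (c1 - c0) / (p1 - p0)
    raise ValueError("target proportion not bracketed by the data")

def simulate_dl(order, g, sigma=30.0, s=500.0, n_per_level=2000, seed=1):
    """Monte Carlo estimate of DL (ms) for a blocked stimulus order ('sc' or 'cs')."""
    rng = random.Random(seed)
    levels = [400.0 + 20.0 * i for i in range(11)]
    trials = levels * n_per_level
    rng.shuffle(trials)
    longer = {c: 0 for c in levels}          # "c judged longer" counts per level
    reference = s                            # assumed starting value of I
    for c in trials:
        first, second = (s, c) if order == "sc" else (c, s)
        x1 = rng.gauss(first, sigma)         # noisy representation of stimulus 1
        x2 = rng.gauss(second, sigma)        # noisy representation of stimulus 2
        reference = g * reference + (1.0 - g) * x1
        first_longer = reference - x2 > 0
        c_longer = first_longer if order == "cs" else not first_longer
        longer[c] += c_longer
    props = [longer[c] / n_per_level for c in levels]
    # DL = half the distance between the 25 % and 75 % points
    return (crossing(levels, props, 0.75) - crossing(levels, props, 0.25)) / 2.0

if __name__ == "__main__":
    for g in (0.0, 0.3, 0.6):
        print(g, round(simulate_dl("sc", g), 1), round(simulate_dl("cs", g), 1))
```

Under these assumptions the two orders yield similar DLs for g = 0, and the 〈cs〉 DL grows relative to the 〈sc〉 DL as g increases.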
Psychophysical data pertinent to these predictions are scarce and mixed. First, Nachmias (2006, Experiment 2) instructed five observers to discriminate between visual stimuli (e.g., contrast discrimination of Gabor patches). Nachmias’ data revealed an effect of stimulus order on discrimination performance. Specifically, the standard deviation of the fitted psychometric functions was 41 % larger for 〈cs〉 than for 〈sc〉 trials when these stimulus orders were blocked. When they were randomly intermixed, the effect of stimulus order was somewhat smaller (35 %; Table 1, p. 2461). Although the main effect of stimulus order was significant, neither the main effect of block type (blocked vs. random) nor the interaction was statistically reliable. Thus, these results are consistent with the predictions of IRM. The lack of a statistical effect might, however, reflect low statistical power due to the rather small sample size. Second, Nahum, Daikhin, Lubin, Cohen, and Ahissar (2010) examined auditory two-tone frequency discrimination performance for blocked orders of s and c (i.e., 〈sc〉 and 〈cs〉), as well as for an alternating order of the two stimuli. For alternating stimulus order, they observed worse discrimination performance on 〈cs〉 trials than on 〈sc〉 trials, that is, a Type B effect. For blocked stimulus order, this effect was also present numerically, but it was much reduced and statistically not reliable. Thus, in contrast to Nachmias (2006) and the predictions of the IRM, these data suggest that mixing stimulus order across trials has an impact on discrimination performance. This finding is in accordance with a variety of studies that commonly indicate that blocking versus mixing of experimental conditions can have a severe effect on performance, which is often attributed to adopting different strategies or decision criteria (e.g., Grice & Hunter, 1964; Mattes & Ulrich, 1997; Niemi & Näätänen, 1981; Rogers & Monsell, 1995; Sanders, 1998).
In addition to the Type B effect, IRM accounts for other findings in the literature. For example, it is a well-established finding that discrimination performance increases when the standard is presented repeatedly before the comparison process takes place (e.g., Drake & Botte, 1993; Ivry & Hazeltine, 1995; Schulze, 1989). According to IRM, this is because the signal-to-noise ratio of the internal reference tends to increase with each successive presentation of the standard (see Appendix 2 for a demonstration of IRM’s noise reduction property).
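The noise reduction property can be illustrated with a short variance calculation. This is a sketch under added assumptions, not the demonstration given in Appendix 2: each presentation of the standard is taken to trigger one update of the form \( \mathbf{I}_n = g \cdot \mathbf{I}_{n-1} + (1 - g) \cdot \mathbf{X}_{1,n} \), and the representations \( \mathbf{X}_{1,n} \) are assumed to be independent with a common variance \( \sigma^2 \) (a symbol introduced here for illustration).

\( \mathrm{Var}(\mathbf{I}_n) = g^2 \cdot \mathrm{Var}(\mathbf{I}_{n-1}) + (1 - g)^2 \cdot \sigma^2 \)

\( \lim_{n \to \infty} \mathrm{Var}(\mathbf{I}_n) = \frac{(1 - g)^2}{1 - g^2} \cdot \sigma^2 = \frac{1 - g}{1 + g} \cdot \sigma^2 \)

For example, with g = 0.5 the limiting variance of the reference is \( \sigma^2/3 \), compared with \( \sigma^2 \) for a single presentation, so the reference becomes less noisy as presentations of the standard accumulate.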
Predicted effects of the previous trial on PSE
IRM implies a less stable internal reference on blocked 〈cs〉 than on blocked 〈sc〉 trials because, in the former case, the variable comparison c is integrated into the internal reference (see Equation 1). This unstable reference not only worsens discrimination performance (i.e., larger DL), but also sequentially modulates participants’ judgments.
Specifically, for stimulus order 〈cs〉, the size of c on the previous trial must affect the size of the internal reference because, for this order, c is integrated into the internal reference. First, if c and, thus, \( \mathbf{X}_1 \) were small on the preceding trial (say, c = 400 ms) relative to the standard (say, s = 500 ms), the internal reference \( \mathbf{I}_n \) on the present trial will also tend to be small. Hence, the standard in the second position of the current trial appears relatively large when compared with \( \mathbf{I}_n \). Therefore, the magnitude of c on the present trial will be underestimated. As a consequence of such an underestimation, a larger value of c is necessary in order to yield a sensation equivalent to the one associated with s. Accordingly, PSE will be larger than s for these trials. Second, if c was large on the preceding trial (say, c = 600 ms), the internal reference \( \mathbf{I}_n \) on the present trial also tends to be large, leading to an overestimation of c on the present trial relative to s. Consequently, a smaller value of c suffices in order to yield a sensation equivalent to the one associated with s. This, in turn, leads to a PSE smaller than s for these trials.
In short, conditioning on the size of c on the preceding trial, the estimated PSE for stimulus order 〈cs〉 is greater than s if c on the preceding trial was small. Likewise, the estimated PSE for stimulus order 〈cs〉 is smaller than s if c on the preceding trial was large. For stimulus order 〈sc〉, c is not integrated into the internal reference, and thus, the magnitude of c on the preceding trial cannot affect the value of \( \mathbf{I}_n \). Therefore, PSE should not depend on the magnitude of c on the preceding trial. To sum up, PSE should depend on the magnitude of c on the previous trial for stimulus order 〈cs〉, but not for stimulus order 〈sc〉.
In order to illustrate this prediction, we analyzed the data for the blocked conditions from our Monte Carlo simulations as a function of stimulus order and the magnitude of c on the preceding trial. As can be seen in Fig. 3, IRM predicts a clear sequence effect on PSE for 〈cs〉 trials. PSE is greater if the comparison on the previous trial was small (i.e., \( c_{n-1} < s \)) than if it was large (i.e., \( c_{n-1} > s \)). This effect increases with increasing g. However, if the order of stimuli is reversed and the first stimulus is the constant standard s, as in the blocked 〈sc〉 condition, the internal reference and, hence, PSE do not depend on the magnitude of the comparison c on the previous trial. The model thus predicts a sequence effect for one stimulus order and its absence for the other.
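The predicted sequence effect can be reproduced with a small simulation that conditions on the previous comparison. As before, this is only a sketch and not the simulation behind Fig. 3: the noise level (30 ms), the initial reference, the random sampling of c, and the interpolated 50 % point used as PSE are assumptions made here.

```python
import random

def crossing(levels, props, target):
    """Linearly interpolate the level at which the proportion first reaches target."""
    pairs = list(zip(levels, props))
    for (c0, p0), (c1, p1) in zip(pairs, pairs[1:]):
        if p0 <= target <= p1 and p1 > p0:
            return c0 + (target - p0) * (c1 - c0) / (p1 - p0)
    raise ValueError("target proportion not bracketed by the data")

def simulate_pse_by_previous(order, g, sigma=30.0, s=500.0, n_trials=44000, seed=2):
    """Return (PSE after small c, PSE after large c) for a blocked order under IRM."""
    rng = random.Random(seed)
    levels = [400.0 + 20.0 * i for i in range(11)]
    # counts[kind][c] = [number of "c judged longer" responses, number of trials]
    counts = {kind: {c: [0, 0] for c in levels} for kind in ("small", "large")}
    reference, prev_c = s, None
    for _ in range(n_trials):
        c = rng.choice(levels)
        first, second = (s, c) if order == "sc" else (c, s)
        x1, x2 = rng.gauss(first, sigma), rng.gauss(second, sigma)
        reference = g * reference + (1.0 - g) * x1
        first_longer = reference - x2 > 0
        c_longer = first_longer if order == "cs" else not first_longer
        if prev_c is not None and prev_c != s:   # skip trials preceded by c == s
            cell = counts["small" if prev_c < s else "large"][c]
            cell[0] += c_longer
            cell[1] += 1
        prev_c = c
    pses = []
    for kind in ("small", "large"):
        props = [counts[kind][c][0] / counts[kind][c][1] for c in levels]
        pses.append(crossing(levels, props, 0.5))
    return tuple(pses)

if __name__ == "__main__":
    for order in ("sc", "cs"):
        print(order, [round(p, 1) for p in simulate_pse_by_previous(order, g=0.5)])
```

In this sketch the two PSEs nearly coincide for 〈sc〉, whereas for 〈cs〉 the PSE after a small previous comparison lies clearly above the PSE after a large one.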
Aim of the present study
The present experiments were designed to test the two major predictions of IRM outlined above. First, IRM predicts smaller DLs, that is, better discrimination for stimulus order 〈sc〉 than for 〈cs〉. This effect should be about the same for blocked and random stimulus order. Second, IRM predicts sequential effects of the previous comparison magnitude on PSE for blocked stimulus order 〈cs〉, but not for 〈sc〉. To evaluate these predictions, participants performed an auditory (Experiment 1) or visual (Experiment 2) two-interval duration discrimination task (Grondin, 2010) under three conditions: (1) s always preceded c within a single experimental session (i.e., 〈sc〉 blocked), (2) s always followed c (i.e., 〈cs〉 blocked), and (3) stimulus order was random (i.e., 〈sc〉 and 〈cs〉 intermixed). The three conditions differed only in stimulus order; the stimuli themselves were physically identical across all conditions. The present experiments therefore enable a systematic and comprehensive analysis of the Type B effect and of sequential effects on PSE.
Experiment 1
Method
Participants
Seventeen female and 7 male volunteers with normal hearing (mean age: 24.5 ± 6.8 years) participated in three experimental sessions on separate days. Each session lasted approximately 60 min. All participants were naïve about the purpose of the study and were reimbursed for participating in the experiment. One participant was replaced because of too many trials without a response.
Apparatus and stimuli
The experiment was run in a sound-attenuated booth. A Mac Pro 3.1 (Apple, Inc.) controlled both stimulus presentation and response recording. Instructions and feedback appeared on a computer screen (Samsung SyncMaster 1100 MB, 1,024 × 768 pixels, 150 Hz), placed approximately 60 cm from the participant. The “y” (i.e., “z” on a QWERTY keyboard) and “m” keys of a standard Apple QWERTZ USB keyboard served as the left and right response keys, respectively.Footnote 6 The experiment was programmed in MATLAB (The MathWorks, Inc., Version R2009a) using the Psychophysics Toolbox 3.0.8 (Brainard, 1997; Pelli, 1997). The auditory stimuli were filled intervals of white noise, sine ramped and damped with rise and fall times of 10 ms, and were presented binaurally through headphones (Sennheiser HD 212Pro) at a peak level of 65 dB SPL. A new interval of white noise was generated for each stimulus on each trial.
Procedure
Participants performed a duration discrimination task. On each trial, two auditory intervals were presented in succession. One of these intervals had a constant duration of 500 ms (standard s), and the other interval had a variable duration ranging from 400 to 600 ms in constant steps of 20 ms (comparison c). A trial started with the presentation of the first stimulus. After an interstimulus interval of 1,000 ms, the second stimulus was presented. Following stimulus presentation, participants pressed the left response key with their left index finger to indicate that they judged the first stimulus as being longer than the second one and the right response key with their right index finger to indicate that they judged the second stimulus as being longer than the first one. Immediately after the response, participants received feedback, displayed for 400 ms at the center of the screen, about which stimulus was physically longer. A “1” or “2” indicated that the first or second stimulus was longer, respectively, and an “=” sign indicated that the two stimuli were physically identical in duration. If participants did not respond within 5,000 ms after the offset of the second stimulus, the trial was terminated, and “zu langsam” (too slow) was displayed for 800 ms on the screen. After an intertrial interval of 1,600 ms, the next trial began.
There were three conditions tested in separate sessions. The order of sessions was counterbalanced across participants. In the 〈sc〉 blocked condition, the first stimulus was s, and the second stimulus was c, so the temporal order of stimuli was 〈sc〉 on each trial. In the 〈cs〉 blocked condition, the temporal order of stimuli was reversed; that is, the first stimulus was c, and the second stimulus was s on each trial. In the 〈sc〉 and 〈cs〉 random condition, stimulus order was random on each trial; that is, the two possible orderings 〈sc〉 and 〈cs〉 occurred randomly intermixed within a single session. Note that the stimuli were physically identical in all conditions; only the order of stimuli and block type (i.e., blocked vs. random) differed. In all conditions, participants received the same written instruction displayed on the computer screen—namely, to indicate by keypress which of the two stimuli (the first or the second) was longer. Participants were not informed about the procedural details of the experiment. For example, they were not told that there was a constant and a variable stimulus on each trial. In addition, they were not told whether stimulus order was blocked or random. A postexperimental interview revealed that participants were not aware of these conditions.
In the blocked conditions, each of the 11 levels of c was presented 60 times, resulting in a total of 660 trials. To equalize the total number of trials per condition in the random condition, each of the 11 levels of c was presented 30 times for the stimulus order 〈sc〉 and 30 times for 〈cs〉. Participants could take a short rest after every 110 trials. At the beginning of each session, there was a practice block of 22 trials, and participants were informed when this block was finished (each of the 11 levels of c was presented twice in the blocked conditions and once for each stimulus order in the random condition). Practice trials did not enter into the data analysis.
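The trial counts described above can be made concrete with a short Python sketch that builds one session's trial list. The experiment itself was programmed in MATLAB, so this is not the original code. The function name is hypothetical, and shuffling across the whole session is an assumption, since the text does not state how trials were randomized within the 110-trial segments.

```python
import random
from collections import Counter

LEVELS_MS = list(range(400, 601, 20))   # the 11 comparison durations c

def make_schedule(condition, rng):
    """Return a shuffled list of (order, c) tuples for one 660-trial session."""
    if condition in ("sc", "cs"):
        # blocked: 11 levels x 60 repetitions of a single stimulus order
        trials = [(condition, c) for c in LEVELS_MS for _ in range(60)]
    elif condition == "random":
        # random: 11 levels x 30 repetitions for each of the two orders
        trials = [(order, c) for order in ("sc", "cs")
                  for c in LEVELS_MS for _ in range(30)]
    else:
        raise ValueError("condition must be 'sc', 'cs', or 'random'")
    rng.shuffle(trials)   # assumption: full shuffle across the session
    return trials

schedule = make_schedule("random", random.Random(0))
print(len(schedule), Counter(order for order, _ in schedule))
```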
Design and dependent variables
The data from the random condition were analyzed separately for the two stimulus orders. Thus, there was a stimulus order (〈sc〉 vs. 〈cs〉) × block type (blocked vs. random) within-subjects design. The dependent variables were the DL and the PSE. In order to assess whether participants traded speed against accuracy, we also measured response time (RT) from the offset of the second interval to the onset of response.
Psychometric functions and estimation of DL and PSE
For the blocked conditions, a separate logistic psychometric function for each stimulus order (i.e., 〈sc〉 and 〈cs〉) was fitted to individual data sets using a maximum likelihood procedure (for an implementation, see Bausenhart, Dyjas, Vorberg, & Ulrich, 2012). In this logistic psychometric function, which gives the probability \( P(c) \) of judging the comparison to be longer than the standard,
\( P(c) = \frac{1}{1 + \exp\left( -\frac{c - a}{b} \right)} \)
c denotes the duration of the comparison, a is the PSE, and b > 0 determines the slope (a larger b means a shallower function) and, thus, assesses discrimination performance, that is, DL = ln(3)·b (e.g., Bush, 1967, p. 448). For the random conditions, DL and PSE were estimated under a constraint that forces the psychometric functions averaged over stimulus orders to pass through the point (s, 0.5) (Ulrich, 2010; Ulrich & Vorberg, 2009).
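The relation between b and DL, and the idea of a maximum likelihood fit, can be sketched as follows. This is not the procedure of Bausenhart et al. (2012): the logistic form is written as implied by the parameter definitions above, and the coarse grid search is a deliberately simple stand-in for a proper optimizer. All function names are hypothetical.

```python
import math

def logistic_pf(c, a, b):
    """Probability of judging c longer than s; a is the PSE, b > 0 the spread."""
    return 1.0 / (1.0 + math.exp(-(c - a) / b))

def dl_from_b(b):
    """DL is half the distance between the 25 % and 75 % points, i.e., ln(3) * b."""
    return math.log(3.0) * b

def neg_log_likelihood(a, b, levels, n_longer, n_total):
    """Binomial negative log-likelihood of the 'c longer' counts at each level."""
    nll = 0.0
    for c, k, n in zip(levels, n_longer, n_total):
        p = min(max(logistic_pf(c, a, b), 1e-9), 1.0 - 1e-9)   # guard against log(0)
        nll -= k * math.log(p) + (n - k) * math.log(1.0 - p)
    return nll

def fit_grid(levels, n_longer, n_total):
    """Return the (a, b) pair on a 1-ms grid that minimizes the negative log-likelihood."""
    candidates = ((a, b) for a in range(400, 601) for b in range(5, 201))
    return min(candidates,
               key=lambda ab: neg_log_likelihood(ab[0], ab[1], levels, n_longer, n_total))
```

Setting the function equal to 0.75 gives c = a + ln(3)·b, and setting it equal to 0.25 gives c = a − ln(3)·b, which is where DL = ln(3)·b comes from.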
Results and discussion
Figure 4 shows the data for each participant and the fitted psychometric functions for stimulus orders 〈sc〉 and 〈cs〉 in the blocked conditions. For almost all participants, the estimated psychometric function for stimulus order 〈cs〉 is shallower than the one for stimulus order 〈sc〉, indicating worse discrimination performance for 〈cs〉 trials, that is, a pronounced Type B effect. As one may expect, the pattern of results varies across participants. First, the discrimination performance differs greatly between participants. Second, for some participants, a strong Type B effect can be observed (e.g., participants 6, 13, 17, and 24), whereas for others, this effect is only weak or even absent (e.g., participants 1, 3, 9, and 20). As was noted by a reviewer, it might be possible that participants fall into two groups, such that one small group does not show the Type B effect, whereas the other group does. Within the framework of IRM, this would mean that one group of participants relies only on the stimulus information present on the current trial (i.e., g = 0), whereas the other group relies on an internal reference in order to additionally use more remote stimulus information (i.e., g > 0). It remains to be tested whether such interpersonal differences exist and whether they are stable or can be changed strategically. For some participants, a marked shift of the psychometric function is also visible (e.g., participants 4, 11, and 16).
Figure 5 depicts the data for the random condition. As for the blocked conditions, the psychometric functions for stimulus order 〈cs〉 are shallower than those for 〈sc〉, again indicating inferior discriminability on 〈cs〉, as compared with 〈sc〉, trials. As before, the pattern of results varies across participants. It is remarkable that the data patterns between blocked and random stimulus order are quite consistent across participants. This also indicates that the effects for each participant are stable across sessions and conditions.
Analysis for effects of stimulus order and block type
From the psychometric functions depicted in Figs. 4 and 5, DLs and PSEs were estimated for each participant (see Bausenhart et al., 2012). Separate analyses of variance (ANOVAs) were performed for DL, PSE, and RT. Figure 6a shows mean DL as a function of stimulus order and block type. As was evident from the data for individual participants and in line with IRM’s predictions and Nachmias’ (2006, Table 1) results, discrimination performance was better for the stimulus order 〈sc〉 (DL = 53 ms) than for 〈cs〉 (DL = 104 ms), F(1,23) = 13.57, p = .001, \( \eta_p^2 \) = .37. Neither the main effect of block type nor the interaction of the two factors was significant (both Fs < 1). The Weber fraction, DL/s, was 0.11 and 0.21 for stimulus order 〈sc〉 and 〈cs〉, respectively. This amounts to a meaningful increase of almost 100 %.
In contrast to Nachmias’ (2006, Table 2) results,Footnote 7 PSE did not differ significantly for the two stimulus orders, F(1,23) = 2.15, p = .16 (Fig. 6b). Again, neither the effect of block type nor the interaction of stimulus order and block type was significant (both Fs < 1). The average PSE across participants and conditions was equal to 502 ms. Analysis of RT suggests that the DL difference between 〈sc〉 and 〈cs〉 is not due to a speed–accuracy trade-off (Fig. 6c). Neither stimulus order, F(1,23) = 2.29, p = .14, nor block type, F(1,23) = 1.30, p = .27, nor the interaction of these factors, F < 1, significantly influenced RT.
Analysis for trial sequence effects
As was discussed in the introduction, IRM predicts a specific pattern of sequence effects, that is, effects of the comparison magnitude on the previous trial on discrimination on the current trial (cf. Fig. 3). We therefore separately analyzed data from trials for which the magnitude of the comparison on the previous trial \( c_{n-1} \) was small (i.e., \( c_{n-1} < s \)) and from trials for which it was large (i.e., \( c_{n-1} > s \)). Only data from the blocked conditions could be used for this analysis, because in the random condition, not only the magnitude of \( c_{n-1} \), but also the order of s and c varies. Trials for which the magnitude of \( c_{n-1} \) was physically identical to the standard were excluded. Thus, separate logistic psychometric functions were estimated for the two previous comparison magnitudes (i.e., small vs. large), and this was done for each blocked condition (〈sc〉 and 〈cs〉). From these psychometric functions, DL and PSE were computed. In addition, RT was calculated. Each dependent measure was submitted to a separate ANOVA with the two factors previous comparison magnitude (small vs. large) and stimulus order (〈sc〉 vs. 〈cs〉).
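The conditioning step can be sketched as a small helper that groups the trials of one blocked session by the magnitude of the preceding comparison. The function name is hypothetical. The handling of the first trial (dropped because it has no predecessor) is an assumption, and the text does not say how rest breaks or trials without a response were treated.

```python
def split_by_previous(comparisons, s=500):
    """Group trial indices by the previous trial's comparison magnitude.

    comparisons: list of c values (ms) in presentation order for one session.
    Returns a dict with index lists for 'small' (c_{n-1} < s) and 'large'
    (c_{n-1} > s); trials preceded by c_{n-1} == s are excluded.
    """
    groups = {"small": [], "large": []}
    for n in range(1, len(comparisons)):      # trial 0 has no predecessor
        previous = comparisons[n - 1]
        if previous < s:
            groups["small"].append(n)
        elif previous > s:
            groups["large"].append(n)
    return groups

# Hypothetical four-trial sequence
print(split_by_previous([400, 500, 600, 520]))   # {'small': [1], 'large': [3]}
```

A separate psychometric function would then be fitted to the trials in each group.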
Figure 7b depicts PSE as a function of stimulus order and previous comparison magnitude. The ANOVA for PSE revealed a significant main effect of previous comparison magnitude, F(1,23) = 6.96, p = .015, \( \eta_p^2 \) = .23, and PSE was greater (513 ms) if the previous comparison was small than if it was large (491 ms). This main effect, however, should not be interpreted on its own, because it was qualified by an interaction of stimulus order and previous comparison magnitude, F(1,23) = 13.72, p = .001, \( \eta_p^2 \) = .37. Consistent with the prediction of IRM, PSE for 〈cs〉 trials was greater if the previous comparison was small than if it was large. For 〈sc〉 trials, the magnitude of the previous comparison exerted virtually no influence on PSE. The main effect of stimulus order did not reach significance, F(1,23) = 1.29, p = .268.
Consistent with the previous overall analysis of DL, DL was smaller for 〈sc〉 trials (45 ms) than for 〈cs〉 trials (103 ms), F(1,23) = 11.61, p = .002, \( \eta_p^2 \) = .34, again indicating a Type B effect (cf. Fig. 7a). There was a statistical trend for better discrimination performance if the previous comparison was large (DL = 69 ms) than if it was small (DL = 79 ms), F(1,23) = 4.03, p = .057, \( \eta_p^2 \) = .15. The interaction of stimulus order and previous comparison magnitude was not significant, F < 1.
Analysis of RT suggested that the results above are not due to a speed–accuracy trade-off (see Fig. 7c). Neither stimulus order nor previous comparison magnitude (both Fs < 1), nor the interaction, F(1,23) = 1.60, p = .219, was significant.
Association between Type B effect and trial sequence effect
It might be informative to assess whether the Type B effect (i.e., an effect of stimulus order on DL) and the trial sequence effect (i.e., an effect of the comparison on the previous trial on PSE) are associated. If the Type B effect and the trial sequence effect are due to a common underlying mechanism, there might be an association between the magnitude of both effects. We conducted a correlational analysis to investigate whether such an association exists. Since the sequential effects can be examined only for blocked stimulus order, we restrict the following analysis to the blocked conditions. In order to quantify the Type B effect for each participant, we subtracted DL for 〈sc〉 trials from DL for 〈cs〉 trials, such that \( \text{Type B effect} = DL_{\langle cs \rangle} - DL_{\langle sc \rangle} \). In a similar vein, in order to quantify the trial sequence effect, we subtracted PSE for trials with large c on the preceding trial from PSE for trials with small c on the preceding trial, such that \( \text{sequence effect} = PSE_{c_{n-1} < s} - PSE_{c_{n-1} > s} \). The correlation between these two measures was r = .89, t(22) = 8.09, p < .001, indicating that the Type B effect and the trial sequence effect are indeed associated, supporting the notion of a common underlying mechanism.
In summary, the results of Experiment 1 revealed a strong Type B effect that was of the same size for blocked and random stimulus order. This result is consistent with the predictions of IRM and suggests that g does not differ between blocked and random stimulus presentations. Furthermore and also in line with IRM, sequential effects on PSE were observed for stimulus order 〈cs〉, but not for stimulus order 〈sc〉.
Experiment 2
In order to assess whether the obtained effects in Experiment 1 are specific to the auditory modality or generalize across modalities, Experiment 2 employed a visual duration discrimination task.
Method
Participants
A new sample of 24 female participants (mean age: 20.1 ± 2.4 years) participated in three sessions on separate days.
Apparatus, stimuli, and procedure
The apparatus was identical to the one in Experiment 1, except that no headphones were used. The visual stimuli were discs (diameter 50 pixels) presented on the computer screen in light gray (49.2 cd/m2) on a dark gray (5.4 cd/m2) background. Procedure and time course were identical to those in Experiment 1, except that the range of c was increased to take into account the inferior discrimination performance in the visual modality (e.g., Grondin, 2001). Specifically, the duration of c ranged from 300 to 700 ms in steps of 40 ms, but the duration of s was again 500 ms.
Results and discussion
Figures 8 and 9 depict the individual psychometric functions in the blocked and random conditions, respectively.
Analysis for effects of stimulus order and block type
As in Experiment 1, DL was about twice as large for stimulus order 〈cs〉 (207 ms) as for 〈sc〉 (98 ms), demonstrating again a strong Type B effect, F(1,23) = 10.28, p = .004, \( \eta_p^2 \) = .31 (cf. Fig. 10a). In contrast to Experiment 1, DL was reliably larger in the random (178 ms) than in the blocked (127 ms) condition, F(1,23) = 5.62, p = .027, \( \eta_p^2 \) = .20. Within IRM, this increase could reflect an increase of g for random stimulus order, indicating that participants rely more on the internal reference in this condition. There was no significant interaction, F(1,23) = 2.61, p = .120. The analogous ANOVA on PSE revealed no significant effects, Fs < 1 (Fig. 10b). Finally, RT was longer on 〈sc〉 than on 〈cs〉 trials (497 vs. 458 ms, respectively; Fig. 10c), F(1,23) = 5.61, p = .027, \( \eta_p^2 \) = .20. Furthermore, longer RTs were observed in the random than in the blocked condition (499 vs. 456 ms, respectively), although this effect was only marginally significant, F(1,23) = 3.16, p = .089, \( \eta_p^2 \) = .12. The two factors did not interact significantly, F < 1. The RT results suggest that discrimination in the random condition is more demanding than in the blocked condition. Surprisingly, RTs were longer for 〈sc〉 trials than for 〈cs〉 trials. A speculative yet plausible explanation for this finding is that for stimulus order 〈cs〉, participants might sometimes decide which stimulus is the larger one as soon as the first stimulus, that is, the variable c, has been presented. Accordingly, they might respond faster for this stimulus order than for order 〈sc〉, in which the first stimulus never conveys sufficient information about the required response.
Analysis for trial sequence effects
As for Experiment 1, we separately analyzed the data of the blocked conditions from trials for which c on the previous trial was small and from trials for which c on the previous trial was large. The ANOVA for PSE revealed that, consistent with the predictions of IRM and in line with the results of Experiment 1, stimulus order and previous comparison magnitude interacted significantly, F(1,23) = 10.33, p = .004, \( \eta_p^2 \) = .31 (Fig. 11b). Separate t tests were performed for the two stimulus orders. For 〈sc〉 trials, the PSE for small versus large c on the preceding trial did not differ significantly, t(23) = −1.70, p = .103. However, for 〈cs〉 trials, the PSE was significantly greater for small c on the preceding trial than for large c on the preceding trial, t(23) = 2.13, p = .044. The main effects of the factors stimulus order and previous comparison magnitude were not significant, Fs < 1. Consistent with the analysis above, discrimination performance as indexed by DL was better on 〈sc〉 than on 〈cs〉 trials (90 and 172 ms, respectively; Fig. 11a), F(1,23) = 9.14, p = .006, \( \eta_p^2 \) = .28. There were no other significant effects on DL, Fs < 1. The ANOVA on RT revealed neither a significant effect of stimulus order, F(1,23) = 1.61, p = .217, nor one of previous comparison magnitude, F(1,23) = 2.63, p = .119, nor an interaction of these factors, F < 1 (Fig. 11c).
Association between Type B effect and trial sequence effect
The correlation between the Type B effect and the trial sequence effect was smaller than in Experiment 1, r = .30, and was only marginally significant, t(22) = 1.36, p = .093, indicating that the association between these effects was weaker than in Experiment 1. This might be due to more noisy data in Experiment 2, which employed a visual duration discrimination task, as compared with the auditory duration discrimination task employed in Experiment 1.
In sum, the results of Experiment 2 again revealed a strong Type B effect and a sequential dependency of PSE on previous comparison magnitude. These results are thus consistent with the results of Experiment 1 and the predictions of IRM. In contrast to Experiment 1, larger DLs were observed in the random than in the blocked condition.
General discussion
The idea that an internal reference or internal standard builds up across trials in a discrimination task has been put forward by several authors (e.g., Durlach & Braida, 1969; Helson, 1947, 1964; Michels & Helson, 1954; Morgan et al., 2000; Nachmias, 2006). The IRM proposed here provides an especially simple and plausible mechanism of how such an internal reference emerges. The mechanism underlying this model involves a geometrically moving average with a single free parameter, namely weight g. The present experiments were designed to evaluate two major inherent predictions of IRM that are independent of the specific value of g.
Evaluation of IRM’s major predictions
According to IRM’s first major prediction and in contrast to standard difference models (e.g., Luce & Galanter, 1963), discrimination performance as indexed by DL is better when the constant standard s precedes, rather than follows, the variable comparison c. Consistent with this prediction of a Type B effect, DL was considerably increased for stimulus order 〈cs〉, as compared with stimulus order 〈sc〉. This effect was observed for an auditory duration discrimination task (Experiment 1), as well as for a visual duration discrimination task (Experiment 2). In addition, this Type B effect had about the same magnitude for random as for blocked stimulus orders, a result that is also consistent with IRM.
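The Type B prediction can be illustrated with a minimal Monte Carlo sketch. It assumes the updating rule of Lemma 1 (the internal reference is a geometric moving average of the first stimulus' internal representation, and the second stimulus is compared with that reference) together with the simulation settings given in the Notes (s = 500 ms, c from 400 to 600 ms in steps of 20 ms, SD = 50 ms, 2,000 trials per level). The weight g = 0.5, the seed, the function names, and the unweighted probit line used to estimate DL are illustrative assumptions, not the authors' routines.

```python
import random
from statistics import NormalDist

S = 500.0                            # standard duration (ms)
LEVELS = list(range(400, 601, 20))   # comparison durations (ms)
SIGMA = 50.0                         # SD of every internal representation
G = 0.5                              # assumed weight of the previous reference
N_PER_LEVEL = 2000


def simulate(order, rng):
    """Return {c: proportion of 'c > s' judgments} for one blocked order."""
    trials = LEVELS * N_PER_LEVEL
    rng.shuffle(trials)
    counts = {c: 0 for c in LEVELS}
    reference = None
    for c in trials:
        first, second = (S, c) if order == "sc" else (c, S)
        x1 = rng.gauss(first, SIGMA)
        x2 = rng.gauss(second, SIGMA)
        # First trial: the reference equals X1; afterwards it is a moving average.
        reference = x1 if reference is None else G * reference + (1 - G) * x1
        second_larger = x2 > reference
        c_judged_larger = second_larger if order == "sc" else not second_larger
        counts[c] += c_judged_larger
    return {c: counts[c] / N_PER_LEVEL for c in LEVELS}


def difference_limen(proportions):
    """Estimate DL from a least-squares line through probit-transformed data."""
    std_normal = NormalDist()
    xs = list(proportions)
    zs = [std_normal.inv_cdf(min(max(p, 0.001), 0.999))
          for p in proportions.values()]
    mean_x = sum(xs) / len(xs)
    mean_z = sum(zs) / len(zs)
    slope = (sum((x - mean_x) * (z - mean_z) for x, z in zip(xs, zs))
             / sum((x - mean_x) ** 2 for x in xs))
    # DL is half the interquartile range, i.e., z(.75) divided by the slope.
    return std_normal.inv_cdf(0.75) / slope


rng = random.Random(1)
dl_sc = difference_limen(simulate("sc", rng))
dl_cs = difference_limen(simulate("cs", rng))
print(f"DL <sc>: {dl_sc:.1f} ms, DL <cs>: {dl_cs:.1f} ms")
```

Under these assumptions the estimated DL for 〈cs〉 should come out clearly larger than the DL for 〈sc〉, because in order 〈cs〉 only the fraction 1 − g of the current comparison enters the reference that is compared with the standard.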
Although a large body of literature on order effects exists, previous studies have focused primarily on the effects of stimulus order on PSE, that is, on the classical time-order error (for a review, see Eisler, Eisler, & Hellström, 2008). Nonetheless, there is also some evidence for the existence of the Type B effect not only in temporal discrimination, but also across a variety of tasks and modalities (cf. Table 1). The majority of studies reviewed in the introduction reported better discrimination for stimulus order 〈sc〉 than for 〈cs〉. To our knowledge, the only exceptions (i.e., reversed Type B effects) were reported for very brief stimuli and interstimulus intervals (Hellström, 2003, Table 7; Hellström & Rammsayer, 2004). These findings are inconsistent with the predictions of the current version of IRM. It might be that the mechanism underlying discrimination of brief stimuli presented with short interstimulus intervals is special; for example, memory processes and interference between stimuli might play a crucial role. Future extensions of IRM may incorporate such processes (e.g., by allowing for negative weighting of prior experience, that is, −1 < g < 1) in order to explain an even broader range of phenomena (see Footnote 8).
Regardless of whether discrimination performance is better for 〈sc〉 or for 〈cs〉 trials, the theoretical significance of these Type B effects has often been neglected in previous research, presumably because standard difference models cannot account for any influence of stimulus order on discrimination performance. Nevertheless, these difference models form the basis of various very prominent psychophysical theories, including Signal Detection Theory. Due to its dynamic updating mechanism, IRM directly implies effects of stimulus order on discrimination performance. Therefore, IRM is a promising extension of standard difference models.
A second major prediction of IRM concerns sequential dependencies of PSE in the blocked conditions. As was outlined in the introduction (see also Fig. 3), IRM predicts that the magnitude of c on the preceding trial modulates the judged stimulus magnitude on the current trial. Specifically for stimulus order 〈cs〉, according to IRM, a large c on the previous trial increases the internal reference. Consequently, the magnitude of c on the present trial is overestimated, and thus PSE becomes smaller than s. Likewise, a small c on the previous trial decreases the internal reference, and thus c on the present trial is underestimated, leading to a PSE larger than s. In contrast, on 〈sc〉 trials, c does not enter into the internal reference, and therefore the magnitude of c on the preceding trial should not modulate PSE on the present trial. This second major prediction of IRM was established empirically in Experiments 1 and 2 as well.
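The sequential prediction can be sketched in the same way. The following minimal simulation assumes the updating rule of Lemma 1 and splits trials by whether the comparison on the preceding trial was smaller or larger than s. The weight g = 0.5, the trial count, the seed, and all names are illustrative assumptions.

```python
import random

S, SIGMA, G = 500.0, 50.0, 0.5       # standard (ms), SD (ms), assumed weight
LEVELS = list(range(400, 601, 20))   # comparison durations (ms)


def sequential_effect(order, n_trials, rng):
    """Return P('c > s') after a small and after a large previous comparison."""
    judged_larger = {"small": 0, "large": 0}
    totals = {"small": 0, "large": 0}
    reference, previous_c = None, None
    for _ in range(n_trials):
        c = rng.choice(LEVELS)
        first, second = (S, c) if order == "sc" else (c, S)
        x1, x2 = rng.gauss(first, SIGMA), rng.gauss(second, SIGMA)
        reference = x1 if reference is None else G * reference + (1 - G) * x1
        second_larger = x2 > reference
        c_larger = second_larger if order == "sc" else not second_larger
        # Trials preceded by c equal to s belong to neither group.
        if previous_c is not None and previous_c != S:
            key = "small" if previous_c < S else "large"
            totals[key] += 1
            judged_larger[key] += c_larger
        previous_c = c
    return {key: judged_larger[key] / totals[key] for key in totals}


rng = random.Random(7)
sc = sequential_effect("sc", 40000, rng)
cs = sequential_effect("cs", 40000, rng)
print("<sc>:", sc)
print("<cs>:", cs)
```

In this sketch, a higher proportion of "c > s" judgments after a large previous c corresponds to a leftward shift of the psychometric function, that is, to a smaller PSE. The shift should appear for 〈cs〉 only, since in 〈sc〉 the comparison never enters the reference.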
It should be noted, however, that a more complex version of IRM can predict a similar pattern of results. This more complex version assumes that not only the first, but also the second stimulus of each trial is integrated into the internal reference. For example, the internal reference I n on trial n could alternatively be established as
with weight g, 0 ≤ g < 1. In this version, X 1,n denotes the internal representation of the first stimulus on the present trial, and X 2,n − 1 denotes the internal representation of the second stimulus on the preceding trial. Thus, the second stimulus on the present trial is compared with a conglomerate of all previously presented stimuli. Monte Carlo simulations employing Equation 3 revealed virtually identical results to those of the simple IRM embodied by Equation 1 (see Footnote 9). In accordance with Occam’s principle, we prefer the simple version. After all, the task itself may suggest the strategy of integrating only the first stimulus, given that the participant has to memorize the first stimulus until the second is presented and then judge which one was longer.
IRM and the single-stimulus paradigm
In the present study, we investigated the effect of the preceding comparison on the discrimination process for blocked stimulus order, and it was demonstrated that the size of the preceding comparison can strongly influence the judgment on the current trial. Similar sequential effects have been reported with the method of single stimuli (Lages & Treisman, 1998; Treisman & Williams, 1984). Here, the standard s is presented only at the beginning of the experiment. On subsequent trials, participants receive only the variable comparison c and judge whether c is smaller or greater than s.
In Lages and Treisman’s (1998) first experiment, participants indicated whether a comparison sine-wave grating was higher or lower in spatial frequency than the standard. In the analysis for sequential stimulus dependencies, the authors found that PSE was greater when the spatial frequency of the preceding stimulus was high than when it was low. Thus, in this single-stimulus experiment, the effect of the preceding c on PSE was opposite in direction to the effect observed in our experiments. Lages and Treisman explained this negative sequential effect within the framework of the Criterion Setting Theory, which is an extension of Signal Detection Theory (Green & Swets, 1966). Like Signal Detection Theory, Criterion Setting Theory assumes that discrimination is based on a decision process that compares sensory input represented on a decision axis with a response criterion. In contrast to Signal Detection Theory, Criterion Setting Theory assumes that the response criterion changes from trial to trial in order to optimize performance (see also Lages & Treisman, 2010; Treisman & Lages, 2010, for criterion setting in different tasks and contexts). Specifically, the criterion is increased when the internal representation of the preceding c was above the criterion and is lowered if this representation was below the criterion.
Although this theory accounts for the negative sequential effect, it should be noted that IRM can also account for this finding in the single-stimulus method. Accordingly, the participant compares the internal representation X n of the comparison on the present trial with the internal reference I n . If X n is larger than I n , the participant responds with “c > s”; otherwise, with “c < s.” Importantly, the internal reference is updated on every trial according to the process described above and is, in this case,
with weight g, 0 < g < 1.
In order to assess the behavior of IRM for the single-stimulus paradigm, we ran a Monte Carlo simulation that was based on this updating process. For this simulation, we used the same set of stimuli and the same parameter settings (cf. Fig. 2) as for the other simulations. Here s was presented only once at the beginning, and on all subsequent trials, only c was presented. Trials were grouped according to the magnitude of the preceding c, that is, whether c n − 1 < s or c n − 1 > s. Figure 12 depicts the two psychometric functions depending on the magnitude of the preceding c for weights g = 0.3, 0.5, and 0.7. As can be seen, the psychometric function is shifted to the left for c n − 1 < s and to the right for c n − 1 > s, demonstrating a negative sequential effect, just as Lages and Treisman (1998) observed empirically. This sequential effect decreases with increasing g.
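A minimal sketch of such a single-stimulus simulation is given below. It is not the authors' code. It assumes that the internal representation judged on one trial enters the reference used on the next trial with weight 1 − g, and that the reference is initialized by a single noisy representation of s. Stimulus values and SD follow the Notes; the trial count, seed, and names are illustrative assumptions.

```python
import random

S, SIGMA = 500.0, 50.0               # standard (ms) and SD of representations
LEVELS = list(range(400, 601, 20))   # comparison durations (ms)


def sequential_shift(g, n_trials, rng):
    """P('c > s') after a small previous c minus P('c > s') after a large one."""
    yes = {"small": 0, "large": 0}
    total = {"small": 0, "large": 0}
    reference = rng.gauss(S, SIGMA)  # s is presented once, before the first trial
    previous_c = None
    for _ in range(n_trials):
        c = rng.choice(LEVELS)
        x = rng.gauss(c, SIGMA)
        judged_larger = x > reference
        if previous_c is not None and previous_c != S:
            key = "small" if previous_c < S else "large"
            total[key] += 1
            yes[key] += judged_larger
        # Assumed update: the representation just judged enters the next reference.
        reference = g * reference + (1 - g) * x
        previous_c = c
    return yes["small"] / total["small"] - yes["large"] / total["large"]


rng = random.Random(2)
shifts = {g: sequential_shift(g, 40000, rng) for g in (0.3, 0.5, 0.7)}
for g, shift in shifts.items():
    print(f"g = {g}: difference in P('c > s') = {shift:.3f}")
```

A positive difference means that "c > s" is reported more often after a small preceding c, which corresponds to the leftward shift of the psychometric function described above. Under the stated assumptions the difference should shrink as g grows.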
In this simulation, the internal reference is completely driven by the history of the stimulus sequence and, thus, established by bottom-up processing. Nevertheless, the size of this internal reference might also be influenced by top-down processes to adjust for payoffs. Consequently, the internal reference might serve as a criterion that participants can shift according to task requirements.
Time-order error and Type B effect
A theoretically important issue is whether the classical TOE and the Type B effect are independent phenomena or are just two sides of the same coin. In the present experiments, the analyses revealed no effects of stimulus order or block type on PSE; that is, there was no classical TOE. Thus, our results are consistent with the notion that the TOE (i.e., effect on PSE) and the Type B effect (i.e., effect on DL) are dissociable. This, however, does not necessarily exclude the possibility that the two effects emerge from the same underlying mechanism. For instance, relations between the TOE and the Type B effect might rather be observable in experiments employing roving standards or no fixed standard at all (cf. Hellström, 2000, 2003). Since the present version of IRM involves no specific assumptions that allow one to predict classical TOEs, this version might suggest that the Type B effect and the classical TOE stem from different mechanisms. Future extensions of the model should address how TOEs can emerge within this framework. For example, merging IRM and Hellström’s Sensation Weighting model might be fruitful in order to provide a more general account for these phenomena. Such an extended model may also account for the reversed Type B effect that was observed for brief stimuli and short interstimulus intervals (Hellström, 2003, Table 7; Hellström & Rammsayer, 2004).
Conclusion
A strong Type B effect was observed in an auditory and a visual two-stimuli discrimination task. This effect was independent of whether the two stimulus orders 〈sc〉 and 〈cs〉 were presented in separate blocks or randomly within a single block. Moreover, PSE was modulated by the magnitude of the comparison on the preceding trial. These results are consistent with the predictions of IRM, which can be seen as a hybrid of the Sensation Weighting model (e.g., Hellström, 1985) and classical difference models (Thurstone, 1927a, 1927b). Given that a strong Type B effect was previously observed in various studies across a wide range of tasks and in different modalities (cf. Table 1), it seems rather unlikely that the obtained results are specific to the temporal discrimination tasks employed in the present study. Classical difference models of discrimination performance such as Signal Detection Theory cannot explain the Type B effect. The IRM is a straightforward extension of these standard models; it suggests a simple and plausible mechanism that accounts for the Type B effect, as well as for sequential dependencies across trials. A general question that may arise is how IRM relates to sensory processing, decision making, and memory.
Notes
The DL or just noticeable difference (jnd) is commonly defined as half the interquartile range of a psychometric function, that is, \( DL = \left( {{c_{{.75}}} - {c_{{.25}}}} \right)/2 \), where c denotes the variable comparison stimulus, and c .75 and c .25 are the 75th and 25th percentiles of the psychometric function, respectively (e.g., Luce & Galanter, 1963, p. 199). Thus, DL reflects the steepness of the psychometric function; the smaller the DL, the better discrimination performance is.
The PSE is commonly defined as the median of the psychometric function, that is, PSE = c .5 (e.g., Luce & Galanter, 1963, pp. 198–199). Thus, the PSE denotes the comparison magnitude that has probability .5 of being judged as greater than the constant standard.
These authors computed the distance between the 75 % percentile and s for s > c and for s < c in separate experimental blocks. They then defined the just noticeable difference as the mean of these two distances. This was done for each order of s and c separately. As has been pointed out by Ulrich and Vorberg (2009), however, the difference between s and the 75 % percentile does not necessarily reflect discrimination performance but may also be contaminated by shifts of psychometric functions. Therefore, it is not clear whether or not these results reflect a genuine difference in sensitivity.
In these Monte Carlo simulations, the standard s was set to 500 ms, and the comparison c could take values from 400 to 600 ms in constant steps of 20 ms. It was assumed that X (i.e., the internal representation of the standard or the comparison) is normally distributed with mean E(X|s) = s and mean E(X|c) = c. The standard deviation of X was SD(X) = 50 in either case. For blocked stimulus order, each of the comparison levels was administered 2,000 times. For random stimulus order, each comparison level was administered 1,000 times in order 〈sc〉 and 1,000 times in order 〈cs〉, also resulting in a total of 2,000 trials for each comparison level. The simulations mimicked the responses of a participant, and the obtained data were used to calculate psychometric functions, from which DLs were obtained.
Note that the position of the “m” key is the same on the QWERTZ and on the QWERTY keyboard.
Note that Nachmias (2006) uses the term “trial sequence” to refer to “stimulus order.” The effects of previous comparison magnitude, termed “trial sequence effects” in this article, were not investigated in Nachmias’ experiments.
We thank Åke Hellström for pointing this out.
Note that Lapid et al. (2008, Appendix B) focused on the psychometric functions denoting the probability that a participant judges the second stimulus as larger than the first one for the two stimulus orders, \( P\left( \text{“}S_2 > S_1\text{”} \mid \langle sc \rangle \right) \) and \( P\left( \text{“}S_2 > S_1\text{”} \mid \langle cs \rangle \right) \). In contrast, we focus on the psychometric functions denoting the probability that a participant judges c as longer than s for the two stimulus orders, that is, \( P\left( \text{“}c > s\text{”} \mid \langle sc \rangle \right) \) and \( P\left( \text{“}c > s\text{”} \mid \langle cs \rangle \right) \). Thus, for stimulus order 〈sc〉, the psychometric function in Lapid et al. and the psychometric function in this article are equivalent, but for stimulus order 〈cs〉, the two functions differ, and \( P\left( \text{“}c > s\text{”} \mid \langle cs \rangle \right) = P\left( \text{“}S_1 > S_2\text{”} \mid \langle cs \rangle \right) = 1 - P\left( \text{“}S_2 > S_1\text{”} \mid \langle cs \rangle \right) \) holds.
See the end of this appendix.
References
Bausenhart, K. M., Dyjas, O., Vorberg, D., & Ulrich, R. (2012). Estimating discrimination performance in two-alternative forced-choice tasks: Routines for MATLAB and R. Behavior Research Methods. doi:10.3758/s13428-012-0207-z.
Brainard, D. H. (1997). The psychophysics toolbox. Spatial Vision, 10, 433–436.
Bruno, A., Ayhan, I., & Johnston, A. (2012). Effects of temporal features and order on the apparent duration of a visual stimulus. Frontiers in Psychology, 3(90), 1–7. www.frontiersin.org/psychology
Bush, R. R. (1967). Estimation and evaluation. In R. D. Luce, R. R. Bush, & E. Galanter (Eds.), Handbook of mathematical psychology (2nd ed., Vol. 1, pp. 429–469). New York: John Wiley & Sons.
Cousineau, D. (2007). Confidence intervals in within-subject designs: A simpler solution to Loftus and Masson’s method. Tutorials in Quantitative Methods for Psychology, 1, 42–45.
Drake, C., & Botte, M.-C. (1993). Tempo sensitivity in auditory sequences: Evidence for a multiple-look model. Perception & Psychophysics, 54, 277–286.
Durlach, N. I., & Braida, L. D. (1969). Intensity perception. I. Preliminary theory of intensity resolution. The Journal of the Acoustical Society of America, 46, 372–383.
Eisler, H., Eisler, A. D., & Hellström, Å. (2008). Psychophysical issues in the study of time perception. In S. Grondin (Ed.), Psychology of time (pp. 75–109). Bingley: Emerald.
Falmagne, J.-C. (1985). Elements of psychophysical theory. Oxford: Clarendon Press.
Fechner, G. T. (1860). Elemente der Psychophysik [Elements of psychophysics]. Leipzig, Germany: Breitkopf und Härtel.
Fründ, I., Haenel, N. V., & Wichmann, F. A. (2011). Inference for psychometric functions in the presence of nonstationary behavior. Journal of Vision, 11, 1–19.
García-Pérez, M. A., & Alcalá-Quintana, R. (2010). Reminder and 2AFC tasks provide similar estimates of the difference limen: A re-analysis of the data from Lapid, Ulrich, & Rammsayer (2008) and a discussion of Ulrich & Vorberg (2009). Attention, Perception, & Psychophysics, 72, 1155–1178.
Gescheider, G. A. (1997). Psychophysics: The fundamentals (3rd ed.). Mahwah, New Jersey: Lawrence Erlbaum Associates.
Green, D. M., & Swets, J. A. (1966). Signal detection theory and psychophysics (Rev. ed.). Los Altos, CA: Peninsula Publishing. Reprinted Edition 1988.
Grice, G. R., & Hunter, J. J. (1964). Stimulus intensity effects depend upon the type of experimental design. Psychological Review, 71, 247–256.
Grondin, S. (2001). From physical time to the first and second moments of psychological time. Psychological Bulletin, 127, 22–44.
Grondin, S. (2010). Timing and time perception: A review of recent behavioral and neuroscience findings and theoretical directions. Attention, Perception, & Psychophysics, 72, 561–582.
Grondin, S., & McAuley, J. D. (2009). Duration discrimination in crossmodal sequences. Perception, 38, 1542–1559.
Guilford, J. P. (1954). Psychometric methods (2nd ed.). New York: McGraw-Hill.
Hellström, Å. (1979). Time errors and differential sensation weighting. Journal of Experimental Psychology. Human Perception and Performance, 5, 460–477.
Hellström, Å. (1985). The time-order error and its relatives: Mirrors of cognitive processes in comparing. Psychological Bulletin, 97, 35–61.
Hellström, Å. (2000). Sensation weighting in comparison and discrimination of heaviness. Journal of Experimental Psychology. Human Perception and Performance, 26, 6–17.
Hellström, Å. (2003). Comparison is not just subtraction: Effects of time- and space-order on subjective stimulus difference. Perception & Psychophysics, 65, 1161–1177.
Hellström, Å., & Rammsayer, T. H. (2004). Effects of time-order, interstimulus interval, and feedback in duration discrimination of noise bursts in the 50- and 1000-ms ranges. Acta Psychologica, 116, 1–20.
Helson, H. (1947). Adaptation-level as frame of reference for prediction of psychophysical data. The American Journal of Psychology, 60, 1–29.
Helson, H. (1964). Adaptation-level theory. New York: Harper & Row.
Ivry, R. B., & Hazeltine, R. E. (1995). Perception and production of temporal intervals across a range of durations: Evidence for a common timing mechanism. Journal of Experimental Psychology. Human Perception and Performance, 21, 3–18.
Lages, M., & Treisman, M. (1998). Spatial frequency discrimination: Visual long-term memory or criterion setting? Vision Research, 38, 557–572.
Lages, M., & Treisman, M. (2010). A criterion setting theory of discrimination learning that accounts for anisotropies and context effects. Seeing and Perceiving, 23, 401–434.
Lapid, E., Ulrich, R., & Rammsayer, T. (2008). On estimating the difference limen in duration discrimination tasks: A comparison of the 2AFC and the reminder task. Perception & Psychophysics, 70, 291–305.
Luce, R. D., & Galanter, E. (1963). Discrimination. In R. D. Luce, R. R. Bush, & E. Galanter (Eds.), Handbook of mathematical psychology (Vol. I, pp. 191–243). New York: John Wiley & Sons.
Macmillan, N. A., & Creelman, C. D. (2005). Detection theory: A user’s guide (2nd ed.). Mahwah, New Jersey: Lawrence Erlbaum Associates.
Mattes, S., & Ulrich, R. (1997). Response force is sensitive to the temporal uncertainty of response stimuli. Perception & Psychophysics, 59, 1089–1097.
Michels, W. C., & Helson, H. (1954). A quantitative theory of time-order effects. The American Journal of Psychology, 67, 327–334.
Morgan, M. J., Watamaniuk, S. N. J., & McKee, S. P. (2000). The use of an implicit standard for measuring discrimination thresholds. Vision Research, 40, 2341–2349.
Nachmias, J. (2006). The role of virtual standards in visual discrimination. Vision Research, 46, 2456–2464.
Nahum, M., Daikhin, L., Lubin, Y., Cohen, Y., & Ahissar, M. (2010). From comparison to classification: A cortical tool for boosting perception. Journal of Neuroscience, 30, 1128–1136.
Niemi, P., & Näätänen, R. (1981). Foreperiod and simple reaction time. Psychological Bulletin, 89, 133–162.
Patching, G. R., Englund, M. P., & Hellström, Å. (2012). Time- and space-order effects in timed discrimination of brightness and size of paired visual stimuli. Journal of Experimental Psychology. Human Perception and Performance, 38, 915–940.
Pelli, D. G. (1997). The VideoToolbox software for visual psychophysics: Transforming numbers into movies. Spatial Vision, 10, 437–442.
Rammsayer, T., & Ulrich, R. (2012). The greater temporal acuity in the reminder task than in the 2AFC task is independent of standard duration and sensory modality. Canadian Journal of Experimental Psychology, 66, 26–31.
Rammsayer, T., & Wittkowski, K. M. (1990). Zeitfehler und Positionseffekt des Standardreizes bei der Diskrimination kurzer Zeitdauern [Time-order error and position effect of the standard stimulus in the discrimination of short durations]. Archiv für Psychologie, 142, 81–89.
Roberts, S. W. (1959). Control chart tests based on geometric moving averages. Technometrics, 1, 239–250.
Rogers, R. D., & Monsell, S. (1995). Costs of a predictable switch between simple cognitive tasks. Journal of Experimental Psychology. General, 124, 207–231.
Ross, H. E., & Gregory, R. L. (1964). Is the Weber fraction a function of physical or perceived input? The Quarterly Journal of Experimental Psychology, 16, 116–122.
Sanders, A. F. (1998). Elements of human performance: Reaction processes and attention in human skill. Mahwah, New Jersey: Lawrence Erlbaum Associates.
Schulze, H. H. (1989). The perception of temporal deviations in isochronic patterns. Perception & Psychophysics, 45, 291–296.
Stott, L. H. (1935). Time-order errors in the discrimination of short tonal durations. Journal of Experimental Psychology, 18, 741–766.
Thurstone, L. L. (1927a). A law of comparative judgment. Psychological Review, 34, 273–286.
Thurstone, L. L. (1927b). Psychophysical analysis. The American Journal of Psychology, 38, 368–389.
Treisman, M., & Lages, M. (2010). Sensory integration across modalities: How kinaesthesia integrates with vision in visual orientation discrimination. Seeing and Perceiving, 23, 435–462.
Treisman, M., & Williams, T. C. (1984). A theory of criterion setting with an application to sequential dependencies. Psychological Review, 91, 68–111.
Ulrich, R. (2010). DLs in reminder and 2AFC tasks: Data and models. Attention, Perception, & Psychophysics, 72, 1179–1198.
Ulrich, R., Nitschke, J., & Rammsayer, T. (2006). Crossmodal temporal discrimination: Assessing the predictions of a general pacemaker-counter model. Perception & Psychophysics, 68, 1140–1152.
Ulrich, R., & Vorberg, D. (2009). Estimating the difference limen in 2AFC tasks: Pitfalls and improved estimators. Attention, Perception, & Psychophysics, 71, 1219–1227.
Wickens, T. D. (2002). Elementary signal detection theory. Oxford: Oxford University Press.
Woodrow, H. (1935). The effect of practice upon time-order errors in the comparison of temporal intervals. Psychological Review, 42, 127–152.
Woodworth, R. S., & Schlosberg, H. (1954). Experimental psychology (3rd ed.). London: Methuen.
Yeshurun, Y., Carrasco, M., & Maloney, L. T. (2008). Bias and sensitivity in two-interval forced choice procedures: Tests of the difference model. Vision Research, 48, 1837–1851.
Author Note
This research was supported by DFG UL 116/13-1. We thank Åke Hellström, Jeff Miller, and two anonymous reviewers for helpful comments and Julia Dietrich for assistance in data collection. Correspondence concerning this article should be addressed to Oliver Dyjas, Cognition and Perception, Department of Psychology, University of Tübingen, Schleichstr. 4, 72076 Tübingen, Germany. E-mail may be sent to oliver.dyjas@uni-tuebingen.de.
Appendices
Appendix 1 Difference models do not predict a Type B effect
This appendix shows that difference models (e.g., Thurstone, 1927a, 1927b) predict identical slopes of the psychometric functions and, thus, identical DLs for stimulus orders 〈sc〉 and 〈cs〉. According to the standard difference model, s and c are internally represented as X s and X c , respectively. In accordance with Signal Detection Theory, X s = s + E s and X c = c + E c are noisy, that is, are random variables with errors E s and E c . The means of these random variables are E(X s ) = s and E(X c ) = c, with variances \( Var({{\mathbf{X}}_s}) = \sigma_s^2 \) and \( Var({{\mathbf{X}}_c}) = \sigma_c^2 \). These internal representations form the basis for the participant’s comparison judgment, that is, for the responses R 1 (larger stimulus in the first interval) or R 2 (larger stimulus in the second interval). Furthermore, X 1 and X 2 denote the perceived stimulus magnitudes in the first and second positions, respectively. For the order 〈sc〉, therefore, X 1 = X s and X 2 = X c ; for the order 〈cs〉, therefore, X 1 = X c and X 2 = X s . The difference model with bias (γ) postulates the following decision rule (e.g., Luce & Galanter, 1963, p. 225; Wickens, 2002, p. 100):
From this model, the psychometric function for the judgment “c > s” (c is judged as larger than s) for the stimulus order 〈sc〉 is calculated as follows:
Equation 6 shows that the psychometric function (Equation 5) can be recast as distribution function of the random variable Δ = E s − E c . As one would expect, this psychometric function increases monotonically with c from 0 to 1. Accordingly, for stimulus order 〈cs〉, the following psychometric function holds:
As can be seen from these equations, the predicted psychometric functions in Equations 6 and 8 differ only in location (i.e., are shifted by 2 · |γ| against each other) but are identical in shape. Hence, according to this model, the predicted slope and, thus, DL do not depend on the order of stimuli s and c. When there is no bias (i.e., γ = 0), the two predicted psychometric functions are identical.
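The argument of this appendix can be sketched compactly. The sketch assumes that the decision rule has the form "respond R 1 if X 1 − X 2 > γ, and R 2 otherwise"; the sign convention for γ is an assumption chosen to agree with the shift of 2 · |γ| stated above, and F Δ denotes the distribution function of Δ = E s − E c , assumed to be continuous.

$$ P\left( \text{“}c > s\text{”} \mid \langle sc \rangle \right) = P\left( \mathbf{X}_s - \mathbf{X}_c \le \gamma \right) = P\left( \mathbf{E}_s - \mathbf{E}_c \le c - s + \gamma \right) = F_{\Delta}\left( c - s + \gamma \right) $$

$$ P\left( \text{“}c > s\text{”} \mid \langle cs \rangle \right) = P\left( \mathbf{X}_c - \mathbf{X}_s > \gamma \right) = P\left( \mathbf{E}_s - \mathbf{E}_c < c - s - \gamma \right) = F_{\Delta}\left( c - s - \gamma \right) $$

Both expressions evaluate the same function F Δ at arguments that differ by the constant 2 · γ, so the two psychometric functions are horizontal translations of each other and have the same interquartile range, and hence the same DL.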
Appendix 2 Internal Reference Model: Mean, variance, and psychometric functions
This appendix consists of two parts. In the first part, we derive the theoretical mean and the variance of the internal reference I n for the two stimulus orders 〈sc〉 and 〈cs〉. In addition, the first part contains the assumptions underlying these derivations and demonstrates that the internal reference stabilizes quickly after only a few trials. In the second part, we derive the psychometric functions for each stimulus order predicted by IRM. Specifically, this part contains an elaboration of the proof provided by Lapid et al. (2008, Appendix B). The derivations provided here are for blocked stimulus orders; for random stimulus order, similar derivations apply. The results obtained in Part 1 are in line with Roberts’ (1959) results on geometrically moving averages. However, in order to keep this appendix self-contained, we have tailored the derivations to the present framework.
Part 1: Theoretical mean and variance of the internal reference
Some notation is needed before we can proceed. Let the index i denote the ith trial, and let n be the current trial. In particular, let i = 0 be the initial trial, and thus, i = 0, 1…n. Furthermore, X 1,i and X 2,i are the internal representations of the first and second stimuli, respectively, on trial i. Therefore, the sequence
symbolizes the history of internal representations from the initial trial up to the current trial.
Consistent with Signal Detection Theory, we assume that all internal representations are noisy and independent, that is, uncorrelated random variables. Moreover, let ϕ be the stimulus magnitude. For the sake of simplicity, we assume that X is normally distributed with E(X|ϕ) = ϕ and Var(X|ϕ) = σ 2 and, therefore,
In order to derive the mean and the variance of I n , the following lemma is useful (see also Roberts, 1959).
Lemma 1
The recurrence relation \( {{\mathbf{I}}_n} = g \cdot {{\mathbf{I}}_{{n - 1}}} + \left( {1 - g} \right) \cdot {{\mathbf{X}}_{{1,n}}} \) incorporated in IRM implies the geometric moving average I n at trial n
$$ {{\mathbf{I}}_n} = {g^n} \cdot {{\mathbf{X}}_{{1,0}}} + \left( {1 - g} \right) \cdot \sum\limits_{{i = 1}}^n {{g^{{n - i}}}} \cdot {{\mathbf{X}}_{{1,i}}}. $$(10)
Proof.
and, in general, we see that Equation 10 follows.
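The unrolling behind this proof can be written out for the first few trials. The sketch below assumes that the reference is initialised with the first internal representation, I 0 = X 1,0, which is what Equation 10 gives for n = 0; it is a restatement, not necessarily the typesetting of the original proof.

$$ \begin{aligned} {{\mathbf{I}}_0} &= {{\mathbf{X}}_{1,0}} \\ {{\mathbf{I}}_1} &= g \cdot {{\mathbf{X}}_{1,0}} + \left( {1 - g} \right) \cdot {{\mathbf{X}}_{1,1}} \\ {{\mathbf{I}}_2} &= g \cdot {{\mathbf{I}}_1} + \left( {1 - g} \right) \cdot {{\mathbf{X}}_{1,2}} = {g^2} \cdot {{\mathbf{X}}_{1,0}} + \left( {1 - g} \right) \cdot \left( {g \cdot {{\mathbf{X}}_{1,1}} + {{\mathbf{X}}_{1,2}}} \right) \\ {{\mathbf{I}}_3} &= g \cdot {{\mathbf{I}}_2} + \left( {1 - g} \right) \cdot {{\mathbf{X}}_{1,3}} = {g^3} \cdot {{\mathbf{X}}_{1,0}} + \left( {1 - g} \right) \cdot \left( {{g^2} \cdot {{\mathbf{X}}_{1,1}} + g \cdot {{\mathbf{X}}_{1,2}} + {{\mathbf{X}}_{1,3}}} \right) \end{aligned} $$

Each step multiplies all earlier weights by g and appends the new representation with weight 1 − g, which is the pattern summarized by Equation 10.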
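The moving average expressed in Equation 10 attaches higher weights to the most recent internal representations than to the temporally remote ones. Specifically, the weights of X 1,1 to X 1,n increase geometrically toward the current trial, whereas the weight g n of the initial representation X 1,0 vanishes as n grows. For example, with g = 0.3 and n = 4, one obtains
$$ {{\mathbf{I}}_4} = 0.0081 \cdot {{\mathbf{X}}_{1,0}} + 0.0189 \cdot {{\mathbf{X}}_{1,1}} + 0.0630 \cdot {{\mathbf{X}}_{1,2}} + 0.2100 \cdot {{\mathbf{X}}_{1,3}} + 0.7000 \cdot {{\mathbf{X}}_{1,4}}, $$
so that the weight attached to X 1,4 is about 86 times larger than the weight attached to X 1,0.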
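The weights in this example, and the equivalence of the recurrence and the closed form in Lemma 1, can be checked numerically. The following is a minimal sketch; the function names and the sample values in `xs` are made up for illustration and do not come from the experiments.

```python
def geometric_weights(g, n):
    """Weights attached to X_{1,0}, X_{1,1}, ..., X_{1,n} in Equation 10."""
    return [g ** n] + [(1.0 - g) * g ** (n - i) for i in range(1, n + 1)]


def reference_by_recurrence(xs, g):
    """Apply I_n = g * I_{n-1} + (1 - g) * X_{1,n}, assuming I_0 = X_{1,0}."""
    ref = xs[0]
    for x in xs[1:]:
        ref = g * ref + (1.0 - g) * x
    return ref


weights = geometric_weights(0.3, 4)
ratio = weights[-1] / weights[0]  # weight of X_{1,4} relative to X_{1,0}

# Hypothetical internal representations for trials 0..4 (arbitrary units).
xs = [510.0, 480.0, 525.0, 495.0, 502.0]
closed_form = sum(w * x for w, x in zip(weights, xs))
recursive = reference_by_recurrence(xs, 0.3)

print([round(w, 4) for w in weights], round(ratio, 1))
print(round(closed_form, 6), round(recursive, 6))
```

The weights sum to 1 for any g and n, so the internal reference is a proper weighted average of the first-stimulus representations.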
Theorem 1
(Expectation of the internal reference) Under the assumptions outlined above, the expected value of I n for the two stimulus orders 〈sc〉 and 〈cs〉 is \( E\left( {\left. {{{\mathbf{I}}_n}} \right|\left\langle {sc} \right\rangle } \right) = s \) and \( E\left( {\left. {{{\mathbf{I}}_n}} \right|\left\langle {cs} \right\rangle } \right) = s \), respectively.
Proof. Because of Lemma 1, one can write
Since it is assumed that the mean of the first stimulus’ internal representation does not depend on i, that is, \( E\left( {{{\mathbf{X}}_{{1,i}}}} \right) = \mu, i = 0 \ldots n \), and because \( \sum\nolimits_{{i = 1}}^n {{g^{{n - i}}} = \frac{{1 - {g^n}}}{{1 - g}}} \), it follows that
For stimulus order 〈sc〉, μ = s holds, and hence, \( E\left( {\left. {{{\mathbf{I}}_n}} \right|\left\langle {sc} \right\rangle } \right) = s \). If the values of the comparison are symmetrically distributed around s—which is the case in most experiments—then μ = s also applies to stimulus order 〈cs〉; hence, \( E\left( {\left. {{{\mathbf{I}}_n}} \right|\left\langle {cs} \right\rangle } \right) = s \).
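Written out, the step from Lemma 1 to this result is short. The lines below restate it using only the two facts quoted in the proof (a constant mean μ and the geometric sum); they are a sketch and not necessarily identical to the original display.

$$ \begin{aligned} E\left( {{{\mathbf{I}}_n}} \right) &= {g^n} \cdot E\left( {{{\mathbf{X}}_{1,0}}} \right) + \left( {1 - g} \right) \cdot \sum\limits_{i = 1}^n {{g^{n - i}} \cdot E\left( {{{\mathbf{X}}_{1,i}}} \right)} \\ &= {g^n} \cdot \mu + \left( {1 - g} \right) \cdot \mu \cdot \frac{{1 - {g^n}}}{{1 - g}} \\ &= {g^n} \cdot \mu + \left( {1 - {g^n}} \right) \cdot \mu = \mu \end{aligned} $$

The result holds for every n, not only asymptotically, because the weights in Equation 10 sum to 1.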
Theorem 2
(Variance of the internal reference) The variance of the internal reference on trial n depends on stimulus order and is given by
$$ Var\left( {\left. {{{\mathbf{I}}_n}} \right|\left\langle {sc} \right\rangle } \right) = {\sigma^2} \cdot \left[ {{g^{{2n}}} + \frac{{\left( {1 - g} \right)}}{{\left( {1 + g} \right)}} \cdot \left( {1 - {g^{{2n}}}} \right)} \right] $$(12)
$$ Var\left( {\left. {{{\mathbf{I}}_n}} \right|\left\langle {cs} \right\rangle } \right) = \left( {{\sigma^2} + \sigma_c^2} \right) \cdot \left[ {{g^{{2n}}} + \frac{{\left( {1 - g} \right)}}{{\left( {1 + g} \right)}} \cdot \left( {1 - {g^{{2n}}}} \right)} \right], $$(13)
where σ 2 denotes the variance of the first stimulus’ internal representation and \( \sigma_c^2 \) denotes the variance of the comparisons c 1…c k employed in the experiment.
Proof. Because of Lemma 1 and \( Var\left[ {\sum\nolimits_{{i = 1}}^n {{a_i} \cdot {{\mathbf{Y}}_i}} } \right] = \sum\nolimits_{{i = 1}}^n {a_i^2 \cdot Var\left( {{{\mathbf{Y}}_i}} \right)} \), one can write
Since the sequence X 1,0, X 1,1…X 1,n is stationary with \( Var\left( {{{\mathbf{X}}_{{1,i}}}} \right) = \sigma_x^2,i = 0 \ldots n \), and because \( \sum\nolimits_{{i = 1}}^n {{g^{{2\left( {n - i} \right)}}} = \frac{{1 - {g^{{2n}}}}}{{1 - {g^2}}}} \), it follows that
Note that \( \sigma_x^2 \) depends on stimulus order. For stimulus order 〈sc〉, \( \sigma_x^2 \) is equal to σ 2. In contrast, for stimulus order 〈cs〉 one needs also to take into account that the magnitude of the first stimulus varies from trial to trial. Although the distribution of X 1,i is stationary, it is a mixture distribution. Let C represent a random variable that denotes the randomly drawn magnitude of the comparison from the set of all available comparisons c j , j = 1…k, employed in the experiment. Therefore, \( \sigma_x^2 \) is generally given by
According to the assumptions embodied in Equations 9, \( Var\left( {\left. {{{\mathbf{X}}_{{1,i}}}} \right|{\mathbf{C}} = {c_j}} \right) = {\sigma^2} \), and \( E\left( {\left. {{{\mathbf{X}}_{{1,i}}}} \right|{\mathbf{C}} = {c_j}} \right) = {c_j} \), which directly implies \( E\left( {E\left( {\left. {{{\mathbf{X}}_{{1,i}}}} \right|{\mathbf{C}}} \right)} \right) = E\left( {\mathbf{C}} \right) \). Therefore, Equation 14 can be recast as
In short, the variance \( \sigma_x^2 \) of the first stimulus’ internal representation is σ 2 for stimulus order 〈sc〉, whereas for stimulus order 〈cs〉 the variance of the comparisons must be taken into account as an additional component.
It can be easily verified that the variance of the internal reference Var(I n ) decreases rapidly to an asymptotic level with increasing n. Table 2 provides an example to demonstrate this convergence. Note that the asymptotic variance of I n is obtained from Theorem 2 for n → ∞ and summarized in the following corollary.
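The convergence can be illustrated numerically. The sketch below evaluates the bracketed expression of Theorem 2 and cross-checks it against the variance recurrence implied by Lemma 1, assuming Var(I 0) equals the variance of the first stimulus’ internal representation. The parameter values (σ = 50, \( \sigma_c^2 \) = 4000, g = 0.5) are illustrative assumptions and are not meant to reproduce Table 2.

```python
def var_reference(n, g, var_x):
    """Closed form of Theorem 2 for a stationary first-stimulus variance var_x."""
    g2n = g ** (2 * n)
    return var_x * (g2n + (1.0 - g) / (1.0 + g) * (1.0 - g2n))


def var_reference_iterated(n, g, var_x):
    """Same quantity via Var(I_n) = g^2 * Var(I_{n-1}) + (1 - g)^2 * var_x."""
    v = var_x  # assumption: Var(I_0) = Var(X_{1,0})
    for _ in range(n):
        v = g * g * v + (1.0 - g) ** 2 * var_x
    return v


SIGMA2 = 50.0 ** 2   # illustrative variance of an internal representation
SIGMA_C2 = 4000.0    # illustrative variance of the comparison levels
G = 0.5

for n in (0, 1, 2, 3, 5, 10, 30):
    v_sc = var_reference(n, G, SIGMA2)              # order <sc>
    v_cs = var_reference(n, G, SIGMA2 + SIGMA_C2)   # order <cs>
    print(n, round(v_sc, 2), round(v_cs, 2))
```

Because the transient term shrinks with g 2n, smaller values of g reach the asymptote in fewer trials.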
Corollary 1
(Asymptotic variance of the internal reference) The asymptotic variance of I n for n → ∞ is given by
$$ Var\left( {\left. {{{\mathbf{I}}_{\infty }}} \right|\left\langle {sc} \right\rangle } \right) = {\sigma^2} \cdot \frac{{1 - g}}{{1 + g}} $$(15)
$$ Var\left( {\left. {{{\mathbf{I}}_{\infty }}} \right|\left\langle {cs} \right\rangle } \right) = \left( {{\sigma^2} + \sigma_c^2} \right) \cdot \frac{{1 - g}}{{1 + g}}. $$(16)
Note that \( 0 < \frac{{1 - g}}{{1 + g}} < 1 \) for 0 < g < 1, which implies that the asymptotic variance of the internal reference is always smaller than the variance of the first stimulus’ internal representation, at least when the standard is presented in the first temporal position.
Part 2: Predicted difference limen as a function of stimulus order
In this part, we derive explicit expressions for the DL for the two stimulus orders 〈sc〉 and 〈cs〉. This derivation is an elaboration of the proof provided by Lapid et al. (2008, Appendix B; see Footnote 10).
Theorem 3
(Predicted difference limen) The DL for the two stimulus orders 〈sc〉 and 〈cs〉 under the assumptions above and for large n is
$$ D{L_{{\left\langle {sc} \right\rangle }}} = 0.67 \cdot \sqrt {{{\sigma^2} \cdot \frac{2}{{1 + g}}}} $$(17)
$$ D{L_{{\left\langle {cs} \right\rangle }}} = 0.67 \cdot \frac{{\sqrt {{{g^2} \cdot \frac{{1 - g}}{{1 + g}} \cdot \sigma_c^2 + {\sigma^2} \cdot \frac{2}{{1 + g}}}} }}{{1 - g}}. $$(18)
Proof. We first derive DL for stimulus order 〈sc〉,
and since X 2,n − I n is normally distributed,
where Φ denotes the cumulative distribution function of a standard normal variable. Because of \( \varPhi \left( { - z} \right) = 1 - \varPhi (z) \), it follows that
The numerator of Equation 19 can be computed as
Furthermore, the denominator of Equation 19 is
The term \( Var\left( {\left. {{{\mathbf{I}}_{{n - 1}}}} \right|\left\langle {sc} \right\rangle } \right) \) is \( \frac{{1 - g}}{{1 + g}} \cdot {\sigma^2} \) for large n (see Corollary 1); thus,
Inserting Equations 20 and 21 into Equation 19 yields the predicted psychometric function
with the predicted DL
A similar proof applies to stimulus order 〈cs〉,
X 2,n − I n is assumed to be approximately normally distributed (see Footnote 11); thus,
The numerator of Equation 23 can be computed as
and the denominator of Equation 23 can be computed as
For large n we can write
with
Thus,
Finally, inserting Equations 24 and 25 into Equation 23 yields the predicted psychometric function
and, therefore,
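If the comparison c can take k values and the distance d from one level of c to the next level is constant, such that
$$ {c_{{j + 1}}} - {c_j} = d,\quad j = 1 \ldots k - 1, $$
and the comparison levels are equally likely, that is, P(C = c j ) = 1/k for j = 1…k, then the variance of the comparison, \( \sigma_c^2 \), can be calculated as
$$ \sigma_c^2 = \frac{{{d^2} \cdot \left( {{k^2} - 1} \right)}}{{12}}. $$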
In the introduction, the predictions of IRM for DL obtained by Monte Carlo simulations were presented for blocked and random stimulus order (see Fig. 2). The predicted DLs for blocked stimulus orders 〈sc〉 and 〈cs〉 based on these simulations are also given in Table 3, together with the predictions obtained analytically using Equations 17 and 18. As is evident, the analytical predictions closely match those obtained from the simulations, so the two approaches corroborate each other.
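Equations 17 and 18 are straightforward to evaluate. The sketch below is one way to do so; the function names are hypothetical, the factor 0.67 is the rounded 75th-percentile z-score used in the equations, and `comparison_variance` relies on the standard variance of a discrete uniform distribution over k equally spaced levels. The example values (σ = 50, g = 0.5, 11 levels spaced 20 apart) are illustrative assumptions and are not taken from Table 3.

```python
import math

Z_75 = 0.67  # rounded z-score of the 75th percentile, as in Equations 17 and 18


def comparison_variance(k, d):
    """Variance of k equally likely comparison levels spaced d apart."""
    return d * d * (k * k - 1) / 12.0


def dl_sc(sigma, g):
    """Equation 17: predicted DL when the standard comes first."""
    return Z_75 * math.sqrt(sigma ** 2 * 2.0 / (1.0 + g))


def dl_cs(sigma, g, var_c):
    """Equation 18: predicted DL when the comparison comes first."""
    inner = g ** 2 * (1.0 - g) / (1.0 + g) * var_c + sigma ** 2 * 2.0 / (1.0 + g)
    return Z_75 * math.sqrt(inner) / (1.0 - g)


var_c = comparison_variance(k=11, d=20.0)  # e.g. levels 400, 420, ..., 600
print(round(dl_sc(50.0, 0.5), 2), round(dl_cs(50.0, 0.5, var_c), 2))

# With g = 0 the reference is simply the current first stimulus, and both
# expressions reduce to the classical difference-model value.
print(round(dl_sc(50.0, 0.0), 2), round(dl_cs(50.0, 0.0, var_c), 2))
```

For any 0 < g < 1 the second expression exceeds the first, which is the order effect the model is meant to capture.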
To further illustrate, the psychometric functions as predicted by Equations 22 and 26 for weight g = 0.5 are shown in Fig. 13 for blocked stimulus order 〈sc〉 (left panel) and for blocked stimulus order 〈cs〉 (right panel). The predicted probabilities obtained from the Monte Carlo simulations are also shown. Again, the analytically obtained psychometric functions and the simulation-based predicted probabilities match closely. In this appendix, we derived DL 〈sc〉 and DL 〈cs〉 for blocked conditions only. A similar analysis applies when the order of s and c varies randomly from trial to trial.
Theorem 3 can also be used to estimate g and σ from the observed DLs. For example, in the blocked conditions of Experiment 1 the DLs are 48.6 and 104.8 ms for stimulus orders 〈sc〉 and 〈cs〉, respectively. Inserting these values into Equations 17 and 18 and noting that \( {\sigma_c} = 40\sqrt {{30}} \), one obtains \( \widehat{g} = 0.61 \) and \( \widehat{\sigma } = 65.0 \) ms. For Experiment 2, we have DL 〈sc〉 = 89.0 ms, DL 〈cs〉 = 165.3 ms, and \( {\sigma_c} = 80\sqrt {{30}} \) ms. With these values, one obtains \( \widehat{g} = 0.64 \) and \( \widehat{\sigma } = 120.3 \) ms.
X 2,n − I n is approximately normally distributed
For stimulus order 〈sc〉, I n is a weighted sum of the independent normal random variables X 1,0, X 1,1, …, X 1,n . Thus, for 〈sc〉, I n is normally distributed. For stimulus order 〈cs〉, however, X 1,i can be regarded as the sum of a normal variable Y and a discrete, uniformly distributed random variable C, that is, X 1,i = Y + C. Therefore, the internal reference I n is a weighted sum of symmetrically distributed random variables that can be expected (according to the central limit theorem) to approach a normal distribution. Finally, because X 2,n is normally distributed, the difference X 2,n − I n should be approximately normally distributed, as long as the variability of C does not strongly dominate the variance of X 1,i . For stimulus order 〈cs〉, we conducted Monte Carlo simulations to verify this conclusion. For these simulations, the standard was set to 500 (arbitrary unit), and the comparison could take values from 400 to 600 in constant steps of 20. The standard deviation SD(X) of the internal representations X 1 and X 2 was set to 50. The internal reference I n was updated according to Equation 1, and after 30 trials, the difference X 2,n − I n was computed. This was repeated 30,000 times. Figure 14 shows the results for weights g = 0.3 (left panel), g = 0.5 (middle panel), and g = 0.7 (right panel) as a histogram, together with the best-fitting normal distribution (thick solid line). As can be seen, the normal distribution approximates the simulated differences well.
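A simulation along these lines can be sketched with the Python standard library. The code below is an illustration under stated assumptions rather than the simulation code used for Figure 14: it assumes that the reference is initialised with the first stimulus of an initial trial, that "30 trials" means 30 subsequent updates, and that comparison levels are drawn with equal probability. Function names and the seed are arbitrary.

```python
import random
import statistics


def simulate_difference(g, n_trials=30, reps=30_000, s=500.0, sd=50.0, seed=1):
    """Simulate X_{2,n} - I_n for blocked order <cs> (comparison presented first)."""
    rng = random.Random(seed)
    comparisons = list(range(400, 601, 20))  # 400, 420, ..., 600
    diffs = []
    for _ in range(reps):
        # Assumption: trial 0 sets the reference to the first (comparison) stimulus.
        ref = rng.gauss(rng.choice(comparisons), sd)
        for _ in range(n_trials):
            x1 = rng.gauss(rng.choice(comparisons), sd)  # noisy comparison
            ref = g * ref + (1.0 - g) * x1               # reference update
        x2 = rng.gauss(s, sd)                            # noisy standard, last trial
        diffs.append(x2 - ref)
    return diffs


def summarize(diffs):
    """Return mean, variance, and skewness of the simulated differences."""
    m = statistics.fmean(diffs)
    v = statistics.pvariance(diffs, mu=m)
    skew = statistics.fmean([(x - m) ** 3 for x in diffs]) / v ** 1.5
    return m, v, skew


if __name__ == "__main__":
    for g in (0.3, 0.5, 0.7):
        m, v, skew = summarize(simulate_difference(g))
        print(g, round(m, 2), round(v, 1), round(skew, 3))
```

Under these assumptions the simulated variance should lie close to σ 2 plus the asymptotic variance of Equation 16, and a skewness near zero is consistent with the symmetry argument above.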
Cite this article
Dyjas, O., Bausenhart, K.M. & Ulrich, R. Trial-by-trial updating of an internal reference in discrimination tasks: Evidence from effects of stimulus order and trial sequence. Atten Percept Psychophys 74, 1819–1841 (2012). https://doi.org/10.3758/s13414-012-0362-4
DOI: https://doi.org/10.3758/s13414-012-0362-4