Abstract
Predicting the timing of incoming information allows the brain to optimize information processing in dynamic environments. Behaviorally, temporal expectations have been shown to facilitate processing of events at expected time points, such as sounds that coincide with the beat in musical rhythm. Yet, temporal expectations can develop based on different forms of structure in the environment, not just the regularity afforded by a musical beat. Little is known about how different types of temporal expectations are neurally implemented and affect performance. Here, we orthogonally manipulated the periodicity and predictability of rhythmic sequences to examine the mechanisms underlying beat-based and memory-based temporal expectations, respectively. Behaviorally and using EEG, we looked at the effects of beat-based and memory-based expectations on auditory processing with and without attention. At expected time points, both beat-based and memory-based expectations facilitated target detection and led to attenuation of P1 and N1 responses, even when expectations were task-irrelevant (unattended). At unexpected time points, we found reduced target detection and enhanced N1 responses when beat-based expectations could be formed, regardless of the presence of memory-based expectations or task relevance. This latter finding supports the notion that periodicity selectively induces rhythmic fluctuations in neural excitability and furthermore indicates that while beat-based and memory-based expectations may similarly affect auditory processing of expected events, their underlying neural mechanisms may be different.
To optimize sensory processing and perception in a changing environment, the human brain continuously tries to predict incoming information (Clark, 2013; Friston, 2005). Being able to not only predict the content of sensory input (“what”), but also its timing (“when”) allows the system to prepare for and focus on time points when useful information is likely to occur (Large & Jones, 1999; Nobre & van Ede, 2018). Indeed, temporal expectations have been shown to improve processing of events at expected time points (Haegens & Zion Golumbic, 2018; Henry & Obleser, 2012; Nobre & van Ede, 2018; Rohenkohl, Cravo, Wyart, & Nobre, 2012; ten Oever, Schroeder, Poeppel, van Atteveldt, & Zion-Golumbic, 2014). Additionally, temporal expectations allow us to align our actions to sensory input, enabling complex behaviors such as dancing and synchronizing to musical rhythm (Honing & Bouwer, 2019; McGarry, Sternin, & Grahn, 2019; Merchant, Grahn, Trainor, Rohrmeier, & Fitch, 2015). Entrainment models, such as Dynamic Attending Theory (DAT), propose that temporal expectations result from synchronization between internal (neural) oscillations and external rhythmic stimulation, so-called entrainment (Haegens & Zion Golumbic, 2018; Henry & Herrmann, 2014; Jones & Boltz, 1989; Large & Jones, 1999). DAT explains the behavioral benefits of temporal expectations by assuming that the internal oscillations represent fluctuations in attentional energy over time, peaking at expected time points (Jones & Boltz, 1989; Large & Jones, 1999). On a neural level, the internal oscillations can be thought of as fluctuations in low-frequency oscillatory activity, or cortical excitability, such that the high-excitability phase of low-frequency neural oscillations coincides with the timing of expected events, facilitating their processing (Lakatos, Karmos, Mehta, Ulbert, & Schroeder, 2008; Schroeder & Lakatos, 2009).
Temporal expectations are often studied in the context of some form of periodic input, such as a regular beat in music (beat-based expectations). However, temporal expectations can be formed based on different types of structure in the environment, which need not necessarily be periodic. For example, temporal expectations can result from learning the relationship between a cue and a particular temporal interval, or from learning (nonperiodic) sequences of temporal intervals (Nobre & van Ede, 2018). In the latter two cases, expectations rely on memory of absolute durations. We will refer to these as memory-based expectations. Note that elsewhere, the terms duration-based timing, absolute timing, and interval-based timing have also been used (Breska & Ivry, 2018; Merchant & Honing, 2014; Teki, Grube, Kumar, & Griffiths, 2011).
Beat-based and memory-based timing have been differentiated in terms of their occurrence in different species (Honing, Bouwer, Prado, & Merchant, 2018; Honing & Merchant, 2014), the neural networks involved (Teki et al., 2011), and in how they are affected by neuropsychological disorders (Breska & Ivry, 2018), suggesting that beat-based and memory-based expectations are subserved by separate mechanisms. However, some have argued for one integrated system for timing (Schwartze & Kotz, 2013; Teki, Grube, & Griffiths, 2012), based on neuropsychological evidence (Cope, Grube, Singh, Burn, & Griffiths, 2014), and for reasons of parsimony (Rimmele, Morillon, Poeppel, & Arnal, 2018). Also, it is unclear whether entrainment, thought to underlie beat-based expectations, can account for memory-based expectations, which do not rely on periodic input (Breska & Deouell, 2017b; Morillon, Schroeder, Wyart, & Arnal, 2016; Rimmele et al., 2018). Thus, whether beat-based and memory-based temporal expectations are based on shared or separate underlying mechanisms is presently a matter of active debate. In the current EEG study, we addressed this outstanding question by directly comparing the effects of beat-based and memory-based expectations on auditory processing and behavior.
To date, many studies examining temporal expectations used isochronous sequences of events to elicit expectations (for examples, see Arnal, Doelling, & Poeppel, 2014; Henry & Obleser, 2012; Lakatos et al., 2013; Lawrance, Harper, Cooke, & Schnupp, 2014; Rohenkohl, Cravo, Wyart, & Nobre, 2012; Schwartze, Farrugia, & Kotz, 2013). Isochronous sequences are fully predictable, both in terms of their absolute intervals, and in terms of their ongoing periodicity, and therefore do not allow for differentiation between beat-based and memory-based expectations.
Only a handful of studies directly compared beat-based and memory-based expectations. In a behavioral experiment, responses to sequences that were isochronous (affording both beat-based and memory-based expectations), predictably speeding up or slowing down (affording only memory-based expectations), or with random timing (no expectations) were compared (Morillon et al., 2016). Beat-based expectations improved both perceptual sensitivity and response speed, while memory-based expectations only affected perceptual sensitivity, which was suggested to result from a special relationship between beat-based expectations and the motor system (Morillon et al., 2016). However, in another behavioral study, both beat-based and memory-based expectations improved response speed (Breska & Ivry, 2018), and phase coherence of delta oscillations, which is often used as a proxy for neural entrainment, was shown to be similarly enhanced by memory-based and beat-based expectations (Breska & Deouell, 2017b), suggesting that entrainment is a general, rather than a context-specific, mechanism of temporal expectations (Rimmele et al., 2018). However, in the latter two studies (Breska & Deouell, 2017b; Breska & Ivry, 2018), visual stimuli were used, which are arguably less optimal for creating beat-based temporal expectations than auditory stimuli (Merchant et al., 2015).
Here, we orthogonally manipulated beat-based and memory-based expectations using auditory stimuli, and examined their effects on both behavioral and auditory ERP responses. Interestingly, temporal expectations have been reported to both enhance and attenuate the auditory P1 and N1 responses. Enhancement of sensory responses at expected time points (Bouwer & Honing, 2015; Escoffier et al., 2015; Fitzroy & Sanders, 2008; Hsu, Hämäläinen, & Waszak, 2013; Rimmele, Jolsvai, & Sussman, 2011; Tierney & Kraus, 2013) is in line with entrainment models of temporal expectations, which assume heightened attention at expected time points (Large & Jones, 1999). By contrast, attenuation of sensory responses at expected time points (Lange, 2009; Paris, Kim, & David, 2016; Sanabria & Correa, 2013; Schwartze et al., 2013; Sherwell, Garrido, & Cunnington, 2017; van Atteveldt et al., 2015) is in line with predictive models of brain function that assert more efficient processing of incoming information when predicted information is suppressed (Friston, 2005; Marzecová, Widmann, Sanmiguel, Kotz, & Schröger, 2017; Schröger, Kotz, & SanMiguel, 2015; Schröger, Marzecová, & Sanmiguel, 2015). The idea that attention and prediction have separable effects on sensory responses is well established (Alilović, Timmermans, Reteig, van Gaal, & Slagter, 2019; Kok, Rahnev, Jehee, Lau, & de Lange, 2012; Lange, 2013; Schröger, Kotz, et al., 2015; Todorovic, Schoffelen, Ede, Maris, & de Lange, 2015). In the context of temporal expectations, whether a listener orients attention to a relevant time point, or predicts the probability of an event occurring at that time point may depend on the type of temporal structure that affords an expectation. Periodic input, affording beat-based expectations, may lead to entrainment and heightened attention at expected time points (Haegens & Zion Golumbic, 2018). By contrast, learned probabilistic information about timing, affording memory-based expectations, may induce temporal predictions. Thus, beat-based and memory-based expectations may have opposing effects on sensory responses: enhancement and attenuation of responses, respectively.
In addition to the effects of beat-based and memory-based expectations on behavioral and auditory responses, we examined the effects of attention (or task relevance) on both types of expectations. Entrainment and beat-based processing have been shown to be somewhat independent of task relevance (Bouwer, Van Zuijen, & Honing, 2014; Bouwer, Werner, Knetemann, & Honing, 2016; Breska & Deouell, 2014a; Rohenkohl, Coull, & Nobre, 2011). Predictive processing, however, has been shown to depend on and interact with attention in both the visual and the auditory domain (Hsu, Hämäläinen, & Waszak, 2018; Kok et al., 2012; Paris et al., 2016; Todorovic et al., 2015). Thus, if beat-based expectations rely on entrainment while memory-based expectations rely on learning probabilistic information, they may be affected by task relevance differently.
In the current study, in two experiments, we compared responses to auditory sequences that were either periodic, affording beat-based expectations, or aperiodic, thus not affording beat-based expectations. Also, sequences could either consist of fully predictable temporal intervals, affording memory-based expectations, or unpredictable, randomly concatenated intervals. To examine effects of temporal expectations on behavioral performance, we introduced targets in the form of rare softer tones at different positions in these sequences. We expected faster and more accurate detection of targets with expected than unexpected timing, both for beat-based and memory-based expectations. Additionally, beat-based and memory-based expectations may interact in two distinct ways. First, the presence of beat-based expectations may lead to heightened attention to sounds on the beat, which in turn should increase the precision of memory-based predictions (Feldman & Friston, 2010). Thus, such an interaction would lead to enhanced effects of memory-based expectations in the presence of beat-based expectations. Alternatively, if beat-based and memory-based expectations rely on shared mechanisms, the simultaneous presence of both types of expectations may lead to interference, with smaller effects of either type when both need to be engaged. Finally, we not only examined responses for events on the beat (i.e., in phase with the periodicity, at expected times), but also off the beat (i.e., out of phase with the periodicity, at less expected times), as entrainment theories predict not only heightened attention at expected moments, but also reduced attention in between (Breska & Deouell, 2014b). In Experiment 2, we recorded ERP responses to all sound events, to examine the effects of temporal expectations on the P1 and N1 responses, both when sequences were task-relevant and task-irrelevant.
Experiment 1
Participants
Thirty-four participants (26 women), aged between 19 and 45 years old (M = 24.6, SD = 5.7) with no history of neurological or hearing disorders took part in the experiment. Data from two participants were removed due to technical problems, leaving 32 participants for the analysis. All participants provided written consent prior to the study, and participants were reimbursed with either a monetary fee or course credit. The study was approved by the Ethics Review Board of the Faculty of Social and Behavioral Sciences of the University of Amsterdam. The statistical analysis of Experiment 1 was preregistered (aspredicted.org/blind.php?x=mt5z2h).
Stimuli
We created sound patterns of five or six consecutive temporal intervals (Figure 1A), marked by woodblock sounds generated in GarageBand (Apple Inc.). Patterns of five or six intervals are short enough to allow for learning of the temporal intervals (Schultz, Stevens, Keller, & Tillmann, 2013), and to avoid placing excessive demands on working memory (Grahn & Schuit, 2012). At the same time, with a total length of 1800 ms, patterns were long enough to avoid the perception of a regular, periodic beat when patterns were concatenated into sequences, as people do not readily perceive a beat with a period of 1800 ms (London, 2012). Patterns were concatenated into sequences of 128 patterns, with a final tone added to each sequence. Each sequence thus lasted for 3 minutes and 51 seconds.
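The stated sequence duration can be checked with a few lines of arithmetic. The sketch below computes only the span from the first onset to the onset of the added final tone; the duration of that final tone is not given in the text, so the remaining fraction of a second up to the reported 3 minutes and 51 seconds is presumably the final tone itself (an assumption). The function names are illustrative.

```python
PATTERN_MS = 1800            # pattern length stated in the text
PATTERNS_PER_SEQUENCE = 128  # patterns per sequence stated in the text


def onset_span_ms(pattern_ms: int = PATTERN_MS,
                  n_patterns: int = PATTERNS_PER_SEQUENCE) -> int:
    """Time from the first tone onset to the onset of the added final tone."""
    return pattern_ms * n_patterns


def as_min_sec(duration_ms: float) -> tuple[int, float]:
    """Split a duration in milliseconds into whole minutes and seconds."""
    minutes, seconds = divmod(duration_ms / 1000, 60)
    return int(minutes), round(seconds, 1)


print(as_min_sec(onset_span_ms()))  # (3, 50.4)
```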
Beat-based expectations
For the periodic, beat-based patterns, temporal intervals were related by the integer ratios of 1:2:2:3:4 (five intervals) and 1:1:1:2:3:4 (six intervals). The shortest interval was set at 150 ms, leading to inter-onset intervals for the other intervals of 300, 450, and 600 ms. In the periodic patterns, temporal intervals were organized into groups four units in length (600 ms), such that a perceptually accented tone was present at the start of each group (Grahn & Brett, 2007; Povel & Essens, 1985). In these patterns a beat could be perceived with an inter-beat interval of 600 ms (100 BPM or 1.7 Hz), the optimal rate for human beat perception (London, 2012). These patterns could be regarded as strictly metric, with the periodicity of the pattern always being marked by a sound (Grahn & Brett, 2007). Each pattern was 12 units long, comprising three beats of four units each.
To create aperiodic equivalents of the beat-based patterns, we changed the ratios by which the temporal intervals were related. For the aperiodic patterns, intervals were related by non-integer ratios of 1:1.4:1.4:3:5.2 (five intervals) and 1:1:1:1.4:3:4.6 (six intervals). The aperiodic patterns were equal to their periodic counterparts in terms of length, grouping, and number of tones. However, the aperiodic patterns did not contain a periodic beat at unit length four. A pilot confirmed that aperiodic patterns were rated as less beat-inducing than periodic patterns.
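As a quick check on the interval ratios, the following minimal sketch converts the ratios listed above into inter-onset intervals, assuming the 150 ms unit applies to the aperiodic patterns as well (implied by the equal pattern length, but an assumption here). The order of intervals within a pattern is shown in Figure 1A and is not reconstructed; the dictionary keys are illustrative labels.

```python
from fractions import Fraction

UNIT_MS = 150  # shortest interval, as stated in the text

# Ratios as listed in the text. The order of intervals within each
# pattern is shown in Figure 1A and is NOT implied by this listing.
RATIOS = {
    "periodic_5": ["1", "2", "2", "3", "4"],
    "periodic_6": ["1", "1", "1", "2", "3", "4"],
    "aperiodic_5": ["1", "1.4", "1.4", "3", "5.2"],
    "aperiodic_6": ["1", "1", "1", "1.4", "3", "4.6"],
}


def intervals_ms(ratios: list[str], unit_ms: int = UNIT_MS) -> list[int]:
    """Convert ratio strings to inter-onset intervals in milliseconds."""
    # Fraction avoids floating-point error for ratios such as 1.4.
    return [int(Fraction(r) * unit_ms) for r in ratios]


for name, ratios in RATIOS.items():
    ivals = intervals_ms(ratios)
    print(name, ivals, sum(ivals))
```

Under this assumption, all four ratio sets sum to 12 units, so every pattern lasts 1800 ms, which is what makes the periodic and aperiodic patterns equal in length.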
Note that it is impossible to create sequences of sounds that are not to some extent (quasi-)periodic (Breska & Deouell, 2017a; Obleser, Henry, & Lakatos, 2017). However, first, in the aperiodic patterns, contrary to the periodic patterns, events did not align with a possible beat at a rate close to the ideal tempo for beat perception. Second, the periodicity that was present in the aperiodic sequences at the level of concatenated patterns (with a period of 1800 ms, or 33 BPM or 0.6 Hz) was too slow for humans to readily perceive a beat in (London, 2012). Finally, we performed an informal pilot experiment in which we asked 17 participants to rate on a scale from 1 to 10 how strongly they heard a beat in the aperiodic and periodic patterns. On average, each periodic pattern was rated as containing more beat (on average 9.3) than each aperiodic pattern (on average 6.7), confirming that our manipulation of periodicity indeed affected perception at the level of hearing a beat. Thus, while we are aware that the aperiodic patterns could be classified as quasi or weakly periodic, for clarity, we will refer to them as aperiodic.
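The tempo values quoted for the two periodicities follow directly from their periods. A minimal sketch of the conversion (the function name is illustrative):

```python
def period_to_tempo(period_ms: float) -> tuple[float, float]:
    """Return (beats per minute, frequency in Hz) for a period in ms."""
    return 60_000 / period_ms, 1_000 / period_ms


for label, period in [("beat", 600), ("concatenated pattern", 1800)]:
    bpm, hz = period_to_tempo(period)
    print(f"{label}: {bpm:.1f} BPM, {hz:.2f} Hz")
```

The 600 ms beat corresponds to 100 BPM (about 1.67 Hz), and the 1800 ms pattern period to about 33.3 BPM (about 0.56 Hz), matching the rounded values in the text.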
Memory-based expectations
Fully predictable sequences were created by concatenating 128 identical patterns into a sequence (Figure 1B). The surface structure of temporal intervals in these sequences could easily be predicted based on probabilistic information alone. Unpredictable sequences were created by concatenating 128 semi-randomly chosen patterns. Patterns were chosen both from the original patterns, which were also used for the predictable sequences (patterns starting at beat 1, see Figure 1), and from cyclic permutations of these (patterns starting at beats 2 or 3, see Figure 1). The cyclic permutations were identical to the original patterns when looped (as in the predictable sequences), but not when concatenated in random order (as in the unpredictable sequences). Within an unpredictable sequence, only patterns with either five or six intervals were concatenated. To retain control over event density and entropy, we did not combine the sets. Each pattern could occur maximally twice consecutively, to retain the unpredictability of the sequence.
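One way to picture the construction of unpredictable sequences is sketched below. It is not the authors' stimulus code: the interval ordering `[2, 2, 1, 3, 4]` is hypothetical (chosen only so that each beat group sums to four units; the real ordering is in Figure 1A), the pattern labels are illustrative, and uniform sampling is an assumption. The only constraint taken from the text is that a pattern may occur at most twice in a row.

```python
import random


def rotate(pattern: list, k: int) -> list:
    """Cyclic permutation: move the first k intervals to the end."""
    return pattern[k:] + pattern[:k]


def semi_random_sequence(pattern_ids, n_patterns=128, max_run=2, seed=None):
    """Draw pattern labels at random, allowing at most `max_run` repeats in a row."""
    rng = random.Random(seed)
    sequence = []
    for _ in range(n_patterns):
        candidates = list(pattern_ids)
        # If the last `max_run` draws were all the same pattern, exclude it.
        if len(sequence) >= max_run and len(set(sequence[-max_run:])) == 1:
            candidates = [p for p in candidates if p != sequence[-1]]
        sequence.append(rng.choice(candidates))
    return sequence


# HYPOTHETICAL ordering in 150 ms units, for illustration only.
original = [2, 2, 1, 3, 4]
starting_at_beat_2 = rotate(original, 2)  # [1, 3, 4, 2, 2]
starting_at_beat_3 = rotate(original, 4)  # [4, 2, 2, 1, 3]

# Looping the original reproduces each permutation as a shifted window,
# which is why the permutations are indistinguishable when looped.
looped = original * 3
assert looped[2:7] == starting_at_beat_2

order = semi_random_sequence(["beat1", "beat2", "beat3"], seed=1)
```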
Position
With manipulations of periodicity and predictability, we were able to compare responses to events that were expected based on a beat or on learned interval structure with responses that could not be predicted based on their timing. However, we also wanted to examine how beat-based expectations affected responses to events with unexpected timing (i.e., not unpredicted, but rather mispredicted, see also Hsu et al., 2018). Therefore, we not only probed events that were in phase with the periodicity (i.e., on the beat), but also events that were out of phase with the periodicity (i.e., off the beat, see Figure 1). Events in the aperiodic patterns were similarly classified as on the beat or off the beat, depending on their grouping in the periodic counterpart. We assumed that people would not perceive a beat in the aperiodic patterns. However, by including a distinction between events on and off the beat in aperiodic patterns as well, we could not only test that assumption, but also take possible effects of grouping into account. Thus, when we refer to an event on the beat in an aperiodic pattern, it is an event that falls on the beat in the periodic equivalent.
Targets
In the behavioral task, temporal expectations were probed implicitly, by introducing infrequent intensity decrements as targets. Based on previous experiments, we expected that temporal expectations would improve the detection of these targets (Bouwer & Honing, 2015; Bouwer et al., 2014, 2016; Potter, Fenwick, Abecasis, & Brochard, 2009). Intensity decrements of 6 dB were used (Bouwer & Honing, 2015). In each sequence of 128 patterns, 32 patterns (25 percent) contained a target. Half of the targets appeared on the beat, and half of the targets off the beat. In each sequence, 26 targets were in positions after temporal intervals with unit lengths 1 and 3, present in both periodic and aperiodic patterns. Only these targets were used for the analysis, to equate their acoustic context. Patterns containing a target were separated by at least two standard patterns.
Procedure
A total of 16 sequences were presented to each participant, four of each type. Sequences of different types were semi-randomized, with each type appearing once every four sequences, and therefore a maximum of two sequences of the same type in a row. Upon arrival, participants completed a consent form and were allowed to practice the task. They were instructed to avoid movement, listen to the rhythm carefully, and press a button as fast as possible when they heard a target. Participants were allowed breaks between sequences. An entire experimental session lasted for about 2 hours. Participants were tested individually in a dedicated lab at the University of Amsterdam. Sounds were presented at 70 dB SPL with one Logitech speaker positioned in front of the participants, using Presentation software (version 19.0, www.neurobs.com).
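The semi-randomization described above can be read as block randomization: each block of four sequences contains every type once, in shuffled order. A minimal sketch under that reading (the type labels and function name are illustrative, and this is not the Presentation script used in the study):

```python
import random

SEQUENCE_TYPES = [
    "periodic-predictable",
    "periodic-unpredictable",
    "aperiodic-predictable",
    "aperiodic-unpredictable",
]


def block_randomized_order(types, n_blocks=4, seed=None):
    """Shuffle the types within each block and concatenate the blocks."""
    rng = random.Random(seed)
    order = []
    for _ in range(n_blocks):
        block = list(types)
        rng.shuffle(block)
        order.extend(block)
    return order


order = block_randomized_order(SEQUENCE_TYPES, seed=0)
```

Because a type occurs exactly once per block, the same type can only repeat across a block boundary, which is why at most two sequences of the same type can occur in a row.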
Data analysis
All responses made within 2000 ms of a target were recorded. Responses faster than 150 ms were discarded, as were responses that were more than 2.5 standard deviations from the mean of a participant’s reaction time in any specific condition. Removal of outliers led to the exclusion of 2.9 percent of the responses in Experiment 1 and 3.1 percent of the responses in Experiment 2. Participants with a hit rate of less than 50 percent in all conditions were excluded from the analysis. In Experiment 1, on this ground, two participants were excluded, leaving 30 participants for the analysis of hit rates. One additional participant was excluded for the analysis of reaction times, as this participant had fewer than five valid reaction times in one condition. In Experiment 2, no participants were excluded from the analysis. Hit rates for each condition and participant and average reaction times for each condition and participant were entered into three-way repeated measures ANOVAs, with Periodicity (periodic, aperiodic), Predictability (predictable, unpredictable), and Position (on the beat, off the beat) as within-subject factors. For significant interactions (p < 0.05), post hoc tests of simple effects were performed. Effect sizes are reported as partial eta squared. All statistical analyses were performed in SPSS 24.
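The reaction time cleaning rules can be sketched as follows for the responses of one participant in one condition. Two details are assumptions rather than statements from the text: fast responses are removed before the mean and standard deviation are computed, and the sample standard deviation is used. The analysis itself was run in SPSS; this is only an illustration.

```python
from statistics import mean, stdev


def trim_reaction_times(rts_ms, min_rt_ms=150.0, sd_cutoff=2.5):
    """Apply the two exclusion rules to one participant-by-condition cell."""
    # Rule 1: discard responses faster than 150 ms.
    kept = [rt for rt in rts_ms if rt >= min_rt_ms]
    if len(kept) < 2:
        return kept  # too few values to estimate a standard deviation
    # Rule 2 (ASSUMED order and SD type): discard responses more than
    # 2.5 sample standard deviations away from the cell mean.
    m, sd = mean(kept), stdev(kept)
    return [rt for rt in kept if abs(rt - m) <= sd_cutoff * sd]


rts = [100, 400, 410, 420, 430, 440, 450, 460, 470, 480, 1500]
print(trim_reaction_times(rts))
```

In this made-up example, the 100 ms response is dropped as too fast and the 1500 ms response as an outlier, leaving the nine responses between 400 and 480 ms.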
Experiment 2
Participants
Thirty-two participants (22 women), aged between 19 and 44 years old (M = 23.4, SD = 4.9) with no history of neurological or hearing disorders took part in Experiment 2. Data of one participant were removed due to excess noise in the EEG signal, leaving thirty-one participants for the analysis. All participants provided written consent prior to the study, and participants were reimbursed with either a monetary fee or course credit. The study was approved by the Ethics Review Board of the Faculty of Social and Behavioral Sciences of the University of Amsterdam.
Stimuli and Procedure
The materials and procedure for Experiment 2 were identical to those for Experiment 1. In Experiment 2, participants additionally completed an unattended version of the experiment. In the unattended condition, participants were asked to ignore the rhythmic sequences and focus on a self-selected muted movie. All participants first completed the unattended EEG experiment, and subsequently the attended EEG experiment. For Experiment 2, one experimental session was about 4 hours, including breaks, practice, and setting up equipment.
EEG recording
EEG was recorded using a 64-channel Biosemi Active-Two acquisition system (Biosemi, Amsterdam, The Netherlands), with a standard 10/20 configuration and additional electrodes for EOG channels, on the nose, on both mastoids, and on both earlobes. The EEG signal was recorded at 1 kHz.
EEG analysis
Preprocessing was performed in MATLAB and EEGLAB (Delorme & Makeig, 2004). Data were offline re-referenced to linked mastoids, bad channels were removed, and independent component analysis was used to remove eye-blinks. Subsequently, bad channels were replaced by values interpolated from the surrounding channels. Visual inspection of the ERPs revealed a postauricular muscle response (PAM) in several subjects. The auditory evoked potential can be easily contaminated by the PAM response (Bell, Smith, Allen, & Lutman, 2004; Picton, Hillyard, Krausz, & Galambos, 1974). To avoid contamination, we re-referenced the data to earlobes for all further analyses. For completeness, we also report results from the mastoid-referenced data.
Data were offline down-sampled to 512 Hz, and filtered using 0.1 Hz high-pass and 40 Hz low-pass finite impulse response filters. Epochs were extracted for each condition separately, from 200 ms preceding the onset of each event until 500 ms after the onset of each event. Only epochs for events following an interval of unit length 1 or unit length 3 (150 and 450 ms respectively) were included, to equate the acoustic context of events used in the analysis. Epochs with a voltage change of more than 150 μV in a 200 ms sliding window were rejected from further analysis. For each condition and participant, epochs were averaged to obtain ERPs and baseline corrected using the average voltage of the 50 ms window preceding each sound. Finally, ERPs were averaged over participants to obtain grand average waveforms.
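The rejection and baseline steps can be illustrated on a single-channel epoch stored as a plain list of microvolt values. This is a simplified sketch, not the EEGLAB pipeline used in the study: "voltage change" is interpreted here as the peak-to-peak range within each window (an assumption), and the function names and defaults are illustrative.

```python
def exceeds_voltage_change(epoch_uv, srate_hz=512.0, window_ms=200.0,
                           threshold_uv=150.0):
    """True if any sliding window has a peak-to-peak range above threshold."""
    win = max(2, round(window_ms * srate_hz / 1000))
    n_windows = max(1, len(epoch_uv) - win + 1)
    for start in range(n_windows):
        segment = epoch_uv[start:start + win]
        # ASSUMPTION: "voltage change" = max minus min within the window.
        if max(segment) - min(segment) > threshold_uv:
            return True
    return False


def baseline_correct(epoch_uv, srate_hz=512.0, pre_ms=200.0, baseline_ms=50.0):
    """Subtract the mean of the last `baseline_ms` before sound onset."""
    onset = round(pre_ms * srate_hz / 1000)       # sample index of sound onset
    n_base = round(baseline_ms * srate_hz / 1000)  # samples in the baseline
    baseline = epoch_uv[onset - n_base:onset]
    offset = sum(baseline) / len(baseline)
    return [v - offset for v in epoch_uv]


n_samples = round(0.7 * 512)  # -200 to +500 ms at 512 Hz
flat = [5.0] * n_samples
step = [0.0] * 200 + [200.0] * (n_samples - 200)
print(exceeds_voltage_change(flat), exceeds_voltage_change(step))
```

A sliding window catches abrupt jumps such as the step above, while a slow drift of the same total size spread over the whole epoch would stay below threshold within any single 200 ms window.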
Peak latencies for the P1 and N1 responses were determined independently of the statistical analysis, from the average waveform collapsed over all conditions. P1 peaked at 58 ms after tone onset. We defined P1 amplitude as the average amplitude in a 20 ms window around the peak (48-68 ms). N1 peaked at 124 ms, and was more distributed in time. Thus, we defined N1 amplitude as the average amplitude from a 40 ms window around the peak (104-144 ms). Auditory evoked potentials are known to be maximal over fronto-central electrodes (Picton et al., 1974; Ruhnau, Herrmann, Maess, & Schröger, 2011), which was also observed in the current dataset. Therefore, ERP amplitudes were computed from the average of a cluster of 15 fronto-central electrodes: F3, F1, Fz, F2, F4, FC3, FC1, FCz, FC2, FC4, C3, C1, Cz, C2, and C4. All statistics and figures reported here are based on the average amplitude from this region of interest.
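Extracting a window-average amplitude from an ERP amounts to mapping the window edges to sample indices and averaging. The sketch below assumes an ERP already averaged over the region-of-interest electrodes, sampled at 512 Hz and starting 200 ms before sound onset; rounding the window edges to the nearest sample is an assumption, as are the names.

```python
WINDOWS_MS = {"P1": (48, 68), "N1": (104, 144)}  # windows stated in the text


def window_mean(erp_uv, t_start_ms, t_end_ms, srate_hz=512.0,
                epoch_start_ms=-200.0):
    """Mean amplitude between two latencies (inclusive), in microvolts."""
    # ASSUMPTION: window edges are rounded to the nearest sample.
    first = round((t_start_ms - epoch_start_ms) * srate_hz / 1000)
    last = round((t_end_ms - epoch_start_ms) * srate_hz / 1000)
    segment = erp_uv[first:last + 1]
    return sum(segment) / len(segment)


# Synthetic ERP: zero everywhere except a 2 microvolt plateau in the P1 window.
erp = [0.0] * 358
for i in range(127, 138):
    erp[i] = 2.0

p1 = window_mean(erp, *WINDOWS_MS["P1"])
n1 = window_mean(erp, *WINDOWS_MS["N1"])
```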
Statistical analysis
Amplitudes for P1 and N1 were entered into repeated measures ANOVAs, with four within-subject factors: Periodicity (periodic, aperiodic), Predictability (predictable, unpredictable), Position (on the beat, off the beat), and Attention (attended, unattended). For significant interactions (p < 0.05), subsequent tests of simple effects were performed. Effect sizes are reported as partial eta squared. Analyses were performed in SPSS 24.
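For readers who want to relate the reported effect sizes to the test statistics, partial eta squared can be recovered from an F value and its degrees of freedom, since ηp2 = SS_effect / (SS_effect + SS_error) = (F × df_effect) / (F × df_effect + df_error). A minimal sketch (the function name is illustrative; the example values are the main effect of Predictability on hit rates reported in the Results):

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Main effect of Predictability on hit rates:
# Experiment 1, F(1,29) = 39.4; Experiment 2, F(1,30) = 46.4.
print(round(partial_eta_squared(39.4, 1, 29), 2))  # 0.58
print(round(partial_eta_squared(46.4, 1, 30), 2))  # 0.61
```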
Cluster-based permutation tests
Finally, we examined the effects of memory-based and beat-based expectations using cluster-based permutation tests. First, this allowed us to directly compare the effects of memory-based and beat-based expectations. Second, with this approach we could examine potential differences at all timepoints and at all electrodes, while taking into account the multiple comparisons along both the spatial and time axes. As ERP components often overlap in time, their real peaks may be obscured in grand average waveforms (Luck, 2005). The use of cluster-based permutation testing allowed us to make sure we did not miss potential differences between beat-based and memory-based expectations by selectively examining peak time windows and selected clusters of electrodes.
The effect of beat-based expectations was quantified as the difference between responses on the beat in periodic and aperiodic sequences. The effect of memory-based expectations was quantified as the difference between responses in predictable and unpredictable sequences. For the latter, we only included responses on the beat, to make sure that possible differences between beat-based and memory-based expectations could not be attributed to differences in grouping. Cluster-based permutation tests were performed using the FieldTrip toolbox (Oostenveld, Fries, Maris, & Schoffelen, 2011). Clusters were formed based on adjacent time-electrode samples that survived a statistical threshold of p < 0.01 when comparing the conditions of interest with dependent-samples t-tests. Clusters were subsequently evaluated with permutation tests, using 2000 iterations.
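The logic of a cluster-based permutation test can be illustrated with a deliberately reduced sketch. It is not FieldTrip code and makes several simplifying assumptions: clustering runs over time only (one channel, so no electrode adjacency), the cluster statistic is the summed t value, the null distribution is built by randomly flipping the sign of each participant's difference wave, and the critical t value is passed in by hand (roughly 2.75 for a two-sided p < 0.01 with 30 degrees of freedom). All names are illustrative.

```python
import random
from statistics import mean, stdev


def paired_t(diffs):
    """One-sample t statistic on paired differences at one time point."""
    sd = stdev(diffs)
    return mean(diffs) / (sd / len(diffs) ** 0.5) if sd > 0 else 0.0


def max_cluster_mass(diff_waves, t_crit):
    """Largest |summed t| over runs of adjacent same-sign supra-threshold samples."""
    best = mass = 0.0
    prev_sign = 0
    for t_idx in range(len(diff_waves[0])):
        t = paired_t([subj[t_idx] for subj in diff_waves])
        sign = (t > t_crit) - (t < -t_crit)
        if sign != 0 and sign == prev_sign:
            mass += t          # extend the current cluster
        elif sign != 0:
            mass = t           # start a new cluster
        else:
            mass = 0.0         # below threshold: no cluster
        prev_sign = sign
        best = max(best, abs(mass))
    return best


def cluster_permutation_p(diff_waves, t_crit, n_perm=2000, seed=0):
    """Return (observed max cluster mass, permutation p value)."""
    rng = random.Random(seed)
    observed = max_cluster_mass(diff_waves, t_crit)
    count = 0
    for _ in range(n_perm):
        # Under the null, condition labels are exchangeable within a
        # participant, which equals flipping the sign of the difference wave.
        flipped = []
        for subj in diff_waves:
            s = rng.choice((-1, 1))
            flipped.append([s * v for v in subj])
        if max_cluster_mass(flipped, t_crit) >= observed:
            count += 1
    return observed, (count + 1) / (n_perm + 1)
```

Here `diff_waves` would hold one list per participant containing the condition difference at each time point (for example, periodic minus aperiodic on the beat). Comparing the observed cluster against the maximum cluster of each permutation is what controls for multiple comparisons across samples.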
Results
Behavioral experiments
Mean hit rates and reaction times per condition from Experiments 1 and 2 are shown in Table 1, Table 2, and Figure 2. In Experiment 2, all results from Experiment 1 were replicated with comparable effect sizes. In general, targets were detected more often in predictable than in unpredictable sequences, as reflected in a significant main effect of Predictability (Experiment 1: F(1,29) = 39.4, p < 0.001, ηp2 = 0.58; Experiment 2: F(1,30) = 46.4, p < 0.001, ηp2 = 0.61). The simple effect of Predictability was significant for all comparisons (all ps < 0.016, see Table 1). Thus, as expected, memory-based expectations led to improved detection of targets at expected time points. In addition, targets on the beat were detected more often than targets off the beat, as visible in a main effect of Position (Experiment 1: F(1,29) = 99.9, p < 0.001, ηp2 = 0.78; Experiment 2: F(1,30) = 100.3, p < 0.001, ηp2 = 0.77), and in Experiment 2, we found a main effect of Periodicity (F(1,30) = 6.59, p = 0.016, ηp2 = 0.18).
As expected, the effect of Periodicity in both experiments depended on the position of the target. On the beat, targets were detected more often in periodic than aperiodic sequences, while off the beat, targets were detected more often in aperiodic than periodic sequences, as reflected in a significant interaction between Periodicity and Position (Experiment 1: F(1,29) = 35.0, p < 0.001, ηp2 = 0.54; Experiment 2: F(1,30) = 39.3, p < 0.001, ηp2 = 0.57). The interaction between Periodicity and Position was more pronounced for the unpredictable than the predictable sequences, as reflected in a significant three-way interaction between Periodicity, Position, and Predictability (Experiment 1: F(1,29) = 6.69, p = 0.015, ηp2 = 0.19; Experiment 2: F(1,30) = 5.59, p = 0.025, ηp2 = 0.16). The simple effect of Periodicity was significant for all comparisons off the beat (all ps < 0.029). Thus, beat-based expectations led to decreased detection of targets when they were presented out of phase with the periodicity and therefore occurred at unexpected time points. On the beat, the simple effect of Periodicity only reached significance for unpredictable sequences, and only in Experiment 2 (p = 0.003), showing improved detection through beat-based expectations only in the absence of memory-based expectations.
For reaction times (Table 2), the data showed numerical trends in the same direction as for hit rates, albeit non-significant for most comparisons. As for hit rates, there was a main effect of Predictability (Experiment 1: F(1,28) = 19.6, p < 0.001, ηp2 = 0.41; Experiment 2: F(1,30) = 25.0, p < 0.001, ηp2 = 0.46), with faster responses to predictable than unpredictable targets, showing faster target detection through memory-based expectations. The simple effect of Predictability was significant for all comparisons (all ps < 0.023), except for targets off the beat in periodic sequences in Experiment 2 (p = 0.055). In both experiments, targets were detected faster on the beat than off the beat, apparent from a main effect of Position (Experiment 1: F(1,28) = 37.1, p < 0.001, ηp2 = 0.57; Experiment 2: F(1,30) = 14.4, p = 0.001, ηp2 = 0.32). In line with the results for hit rates, numerically, detection of targets off the beat, but not on the beat, was slower in periodic than aperiodic sequences. However, the interaction between Periodicity and Position did not reach significance for the reaction times (Experiment 1: F(1,28) = 3.74, p = 0.063, ηp2 = 0.12; Experiment 2: F(1,30) = 1.91, p = 0.18, ηp2 = 0.06). In Experiment 2, we did find an interaction between Periodicity and Predictability, with a stronger effect of Predictability in aperiodic than periodic sequences (F(1,30) = 6.67, p = 0.015, ηp2 = 0.18).
Thus, behaviorally, memory-based expectations improved target detection both in terms of response accuracy and response speed. Beat-based expectations similarly led to improved target detection, but only when no memory-based expectations were present, and only in terms of response accuracy, not response speed. The improvements in target detection afforded by beat-based expectations were thus small and dependent on memory-based expectations. In contrast, off the beat, beat-based expectations led to decrements in performance, both in the absence and presence of memory-based expectations. Interestingly, while we found an interaction between beat-based and memory-based expectations, it was in the opposite direction of what we expected: rather than enhancing each other, when both types of expectations were present, their effects were diminished.
EEG results
ANOVA results
Figure 3 shows the auditory evoked potentials for all conditions. Average amplitudes as extracted from the P1 and N1 time-windows and fronto-central region of interest are depicted in Figure 4.
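For readers unfamiliar with this type of measure, the sketch below illustrates how a mean amplitude can be extracted from a time window and a region of interest. It assumes an ERP stored as a channels-by-samples array; the function name, the channel indices, and the window in the example are hypothetical placeholders, not the windows or electrodes used in this study.

```python
import numpy as np


def mean_amplitude(erp, times, window, roi):
    """Average an ERP over a time window and a set of channels.

    erp:    array of shape (channels, samples)
    times:  array of sample times in seconds, length = samples
    window: (start, end) in seconds, inclusive (placeholder values below)
    roi:    list of channel indices forming the region of interest
    """
    in_window = (times >= window[0]) & (times <= window[1])
    return erp[roi][:, in_window].mean()


# Toy data: 3 channels, 4 samples (values are arbitrary, not real EEG).
erp = np.arange(12.0).reshape(3, 4)
times = np.array([0.0, 0.05, 0.10, 0.15])
print(mean_amplitude(erp, times, window=(0.05, 0.10), roi=[0, 2]))  # prints 5.5
```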
As expected, P1 responses were larger in amplitude for attended than unattended events (main effect of Attention: F(1,30) = 70.8, p < 0.001, ηp2 = 0.70), and P1 responses were smaller for predictable than unpredictable events (main effect of Predictability: F(1,30) = 10.1, p = 0.003, ηp2 = 0.25), showing attenuation through memory-based expectations. The effect of Predictability did not depend on task relevance (no interaction between Predictability and Attention: F(1,30) = 1.62, p = 0.21, ηp2 < 0.05). Unexpectedly, for P1, the effect of Periodicity did not depend on the position of an event (no interaction between Periodicity and Position: F(1,30) = 0.009, p = 0.92, ηp2 < 0.001). Instead, we found a main effect of Periodicity (F(1,30) = 12.0, p = 0.002, ηp2 = 0.29), with smaller P1 responses to events in periodic than aperiodic sequences. On the beat, smaller P1 responses to events in periodic (i.e., beat-based expected, in phase with the periodicity) than aperiodic (no beat-based expectations present) sequences could indicate attenuation of expected events through beat-based expectations, in line with predictive processing. However, off the beat, smaller responses to events in periodic (i.e., beat-based unexpected, out of phase with the periodicity) than aperiodic (no beat-based expectations present) sequences would indicate attenuation of unexpected events through beat-based expectations, consistent with resource withdrawal and smaller responses off the beat, as proposed by DAT. Thus, the lack of an interaction between Periodicity and Position for the P1 responses is hard to interpret and warrants further research.
As expected, N1 responses to predictable events were smaller than to unpredictable events (main effect of Predictability: F(1,30) = 4.32, p = 0.046, ηp2 = 0.13), showing attenuation through memory-based expectations. In addition, we found a main effect of Position (F(1,30) = 11.1, p = 0.002, ηp2 = 0.27). However, the effect of Position depended on Attention (interaction between Position and Attention: F(1,30) = 17.5, p < 0.001, ηp2 = 0.37). In the attended condition, N1 responses to events off the beat were larger than to events on the beat (p < 0.001), while in the unattended condition, there was no difference between responses on and off the beat (p = 0.69). Importantly, we found a significant interaction between Periodicity and Position (F(1,30) = 12.0, p = 0.002, ηp2 = 0.29). On the beat, N1 responses were smaller to events in periodic than aperiodic sequences (though only marginally so: p = 0.060), in line with attenuation through beat-based expectations. Conversely, off the beat, responses were larger to events in periodic than aperiodic sequences (p < 0.001).
Figure 5 shows a summary of the main effects of memory-based expectations on performance and auditory-evoked potentials. Memory-based expectations led to improved target detection (Figure 5A), in line with behavioral advantages afforded by memory-based expectations, both in terms of response accuracy and response speed. Moreover, memory-based expectations were associated with an attenuation of both the P1 and N1 responses (Figure 5B and 5C). This attenuation of auditory responses suggests processes related to the prediction of events, which leads to suppression of predicted information. The effects of memory-based expectations were independent of task relevance, though numerically larger in attended than unattended conditions.
Figure 6 shows a summary of the effects of beat-based expectations on performance and auditory-evoked potentials. Beat-based expectations only weakly facilitated target detection (i.e., only when no memory-based expectations were present, and only in Experiment 2). However, for events that occurred at unexpected time points, beat-based expectations led to worse performance in terms of target detection (Figure 6A). Like memory-based expectations, beat-based expectations attenuated N1 responses to expected events (Figure 6B and 6C). In addition, N1 responses to events off the beat were larger for periodic than aperiodic sequences. Thus, N1 responses were smallest for events that were expected (i.e., on the beat, in phase with the periodicity) and largest for events that were unexpected (i.e., off the beat, out of phase with the periodicity), with the amplitude of the N1 to events that were neither expected nor unexpected (i.e., in aperiodic sequences where no beat-based expectations were present) in between. Attenuation of responses to expected events and enhancement of responses to unexpected events point to prediction, rather than attention, underlying the effects of beat-based expectations on perception, similar to the effects of memory-based expectations. As for memory-based expectations, the effects of beat-based expectations were independent of task relevance.
To check that the observed results were not confounded by our choice of earlobe reference, we repeated the analysis with a mastoid reference, as is more customary in auditory ERP analyses. Note that the mastoid-referenced results may have been confounded by the PAMR, which was present to some degree in several subjects. With mastoid reference, results were generally the same as with earlobe reference. However, with mastoid reference we found a four-way interaction between Attention, Predictability, Periodicity and Position for the P1 (F(1,30) = 4.97, p = 0.033, ηp2 = 0.14). This four-way interaction did not reach significance for the data referenced to earlobes (F(1,30) = 2.57, p = 0.12, ηp2 = 0.079). To pursue the interaction in the mastoid-referenced data, we split the data between attended and unattended conditions. In the unattended condition, no significant effects were present. In the attended condition, we found a three-way interaction between Periodicity, Predictability, and Position (F(1,30) = 4.59, p = 0.04, ηp2 = 0.13). The effect of Periodicity on the P1 response was significant only for events off the beat in predictable sequences (p = 0.006), and for events on the beat in unpredictable sequences (p = 0.005). Finally, with mastoid reference, the effect of Predictability on the N1 response was only marginally significant (F(1,30) = 3.80, p = 0.061, ηp2 = 0.11). While there were thus small differences in significance between the data sets, the main findings (the effect of Predictability and the interaction between Periodicity and Position) were robust to changes in reference.
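Changing the reference amounts to subtracting the average signal of the new reference channels from every channel. The sketch below shows this operation on a channels-by-samples array; it is a generic illustration under that assumption, and the function name and channel indices are hypothetical rather than taken from the software used in this study.

```python
import numpy as np


def rereference(data, ref_channels):
    """Re-reference EEG data to the mean of the given channels.

    data:         array of shape (channels, samples)
    ref_channels: list of channel indices (e.g., the two mastoid
                  electrodes; indices here are hypothetical)
    """
    reference = data[ref_channels].mean(axis=0, keepdims=True)
    return data - reference


# Toy data: 3 channels, 2 samples; channels 1 and 2 act as the reference.
data = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
print(rereference(data, [1, 2]))
```

After re-referencing, the mean of the reference channels is zero at every sample, which is a quick sanity check on the operation.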
Cluster-based permutation tests
Finally, and crucial to our main question, we directly compared the effects of memory-based and beat-based temporal expectations on auditory-evoked responses. To this end, we ran permutation tests on the ERPs obtained from subtracting responses to events on the beat in unpredictable from predictable sequences (index of memory-based expectations), and from subtracting responses to events on the beat in aperiodic from periodic sequences (index of beat-based expectations). As is visible in Figure 7, the difference waves indexing the two types of temporal expectations have a very similar morphology. Indeed, cluster-based testing showed no significant differences between the effects of Periodicity (on the beat) and Predictability on the auditory evoked potential, in either attended or unattended conditions.
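To illustrate the logic of such a comparison, the sketch below implements a simplified cluster-based permutation test for two sets of difference waves measured in the same participants. It is not the implementation used in this study: it assumes a single channel (clusters are formed over time only), a fixed cluster-forming threshold of |t| > 2.0, sign-flipping of the paired differences to build the null distribution, and the summed |t| as the cluster statistic. All function names and parameter values are illustrative.

```python
import numpy as np


def cluster_masses(t_vals, threshold):
    """Summed |t| of each run of adjacent supra-threshold samples of equal sign."""
    masses, current, current_sign = [], 0.0, 0
    for t in t_vals:
        sign = int(np.sign(t)) if abs(t) > threshold else 0
        if sign != 0 and sign == current_sign:
            current += abs(t)  # extend the running cluster
        else:
            if current_sign != 0:
                masses.append(current)  # close the previous cluster
            current = abs(t) if sign != 0 else 0.0
            current_sign = sign
    if current_sign != 0:
        masses.append(current)
    return masses


def paired_t(diff):
    """One-sample t-values over participants (rows) at each time sample."""
    n = diff.shape[0]
    return diff.mean(axis=0) / (diff.std(axis=0, ddof=1) / np.sqrt(n))


def cluster_permutation_test(wave_a, wave_b, threshold=2.0, n_perm=1000, seed=0):
    """Compare two (participants x samples) arrays of difference waves.

    Returns the largest observed cluster mass and its permutation p-value.
    """
    rng = np.random.default_rng(seed)
    diff = wave_a - wave_b
    observed = max(cluster_masses(paired_t(diff), threshold), default=0.0)
    null = np.empty(n_perm)
    for i in range(n_perm):
        # Under the null, the sign of each participant's difference is arbitrary.
        flips = rng.choice([-1.0, 1.0], size=(diff.shape[0], 1))
        null[i] = max(cluster_masses(paired_t(diff * flips), threshold), default=0.0)
    p_value = (np.sum(null >= observed) + 1) / (n_perm + 1)
    return observed, p_value


# Simulated example (not real data): 20 participants, 50 samples.
rng = np.random.default_rng(1)
memory_wave = rng.normal(0.0, 0.1, size=(20, 50))
beat_wave = memory_wave + rng.normal(0.0, 0.1, size=(20, 50))
mass, p = cluster_permutation_test(memory_wave, beat_wave, n_perm=200)
print(mass, p)
```

Taking the maximum cluster mass in each permutation controls for multiple comparisons across time samples, which is why a single p-value can be reported for the whole time course.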
Discussion
Temporal expectations facilitate sensory processing and perception in dynamic environments, and play an important role in synchronizing our actions to regularities in the outside world, for example, when dancing to a beat in music (Honing & Bouwer, 2019; McGarry et al., 2019). Currently, little is known about whether shared or separate mechanisms contribute to temporal expectations based on different types of structure in the environment. Specifically, it is unclear whether beat-based expectations (based on periodicity of the input) and memory-based expectations (based on learned predictions of absolute intervals) affect sensory processing and performance in similar ways, and whether they do so independently of each other or in interaction (Breska & Deouell, 2017b; Nobre & Rohenkohl, 2014). Moreover, the extent to which these effects rely on attention is also unclear.
In the current study, we show that beat-based and memory-based expectations cannot be differentiated in terms of their effects on auditory processing and performance at expected time points. Both types of expectations led to enhanced detection of expected events, and to attenuation of auditory responses to those events, the latter independent of attention. Also, beat-based and memory-based expectations interacted, with smaller behavioral effects when both were present. These findings are in line with the notion that beat-based and memory-based expectations are subserved by a shared mechanism for temporal predictive processing (Breska & Deouell, 2017b; Rimmele et al., 2018; Teki et al., 2012). Yet, beat-based expectations also led to reduced target detection and enhanced auditory responses to events out of phase with the periodicity, even when these events were fully predictable based on memory. This latter finding suggests that while the effects of beat-based and memory-based expectations on sensory processing and performance are the same at expected time points, the underlying computation may in fact be separate, with beat-based expectations relying (in part) on a rhythmic processing mode characterized by withdrawal of resources off the beat (Breska & Deouell, 2017b). Below, we discuss these findings and their theoretical implications in detail.
Behaviorally, both beat-based and memory-based expectations facilitated target detection in a rhythmic sound sequence, as reflected by enhanced hit rates for targets with expected timing. Although only memory-based expectations improved response speed, the results for beat-based expectations were numerically in the same direction. Thus, unlike previous research (Morillon et al., 2016), we did not find a qualitative difference between beat-based and memory-based expectations on performance. However, in previous work, beat-based expectations were elicited by isochronous sequences, which also elicit memory-based expectations, rendering interpretation of their findings difficult.
The observed interaction between the two types of expectations further supports the idea that the effects of beat-based and memory-based expectations on auditory processing rely (in part) on shared mechanisms. Beat-based facilitation of detection rates only reached significance in the absence of memory-based expectations. Given that detection rates on average did not exceed 82 percent, even in the fully predictable sequences, it is unlikely that the absence of a beat-based effect here was due to a ceiling effect. Moreover, we expected beat-based expectations to facilitate memory-based expectations, similar to the facilitation of content predictions (“what”) afforded by temporal expectations (Auksztulewicz et al., 2018; Hoch, Tyler, & Tillmann, 2012; Schwartze, Rothermich, Schmidt-Kassow, & Kotz, 2011; Selchenkova, Jones, & Tillmann, 2014). However, if anything, the effect of predictability on target detection was smaller in the periodic than the aperiodic sequences. Thus, when both types of expectations were present, their effects on auditory processing and performance were diminished. This may suggest that beat-based and memory-based expectations also to some extent compete for limited-capacity temporal processing resources to form expectations, leading to interference when both need to be engaged.
Consistent with our behavioral findings, our ERP results also do not suggest qualitative differences between the effects of beat-based and memory-based expectations. Both P1 and N1 responses to expected events were attenuated, in line with theories of predictive processing, which assume that the brain only processes input that is not predicted (Friston, 2005; Schröger, Marzecová, et al., 2015). Effects of beat-based expectations are often explained by entrainment models and DAT, which assume that attention, rather than prediction, is heightened at expected time points (Henry & Herrmann, 2014; Large & Jones, 1999), leading to enhanced, rather than attenuated, sensory responses (Haegens & Zion Golumbic, 2018). Our results, however, do not show enhanced auditory processing due to beat-based expectations.
While entrainment has often been equated with fluctuations in attention (Henry & Herrmann, 2014; Lakatos et al., 2008; Large, Herrera, & Velasco, 2015), other studies have also shown that entrainment leads to attenuation rather than enhancement of sensory responses (O’Connell et al., 2015; van Atteveldt et al., 2015), like the beat-based expectations in the current study. Moreover, the effects of periodicity, which guides expectations bottom-up, can be dissociated from the effects of task relevance and general top-down attention mechanisms (Kunert & Jongman, 2017). Thus, while entrainment may lead to fluctuations in neural excitability, these may be related to predictions, rather than attention, as proposed by DAT.
Several studies have found enhancement of sensory responses when manipulating temporal expectations, which may seem contradictory to the current findings. However, first, the observed enhancement may depend on the use of different manipulations of temporal expectations. For example, when temporal expectations are cue-based (Hsu et al., 2013), they may indeed be manipulations of attention (relevance of a time point), rather than prediction (Lange, 2013). Second, when comparing responses to sounds in phase and out of phase with some periodicity, differences in evoked responses may be due to grouping differences between sounds on and off the beat (Schnupp, Rajendran, Harper, Garcia-Lazaro, & Lesica, 2017). Finally, the perception of hierarchical structure in music (meter) may lead to perceived illusory metrical accents (Bouwer, Burgoyne, Odijk, Honing, & Grahn, 2018; Repp, 2010), causing enhanced responses on the beat when compared to off the beat. Indeed, two studies that specifically manipulated beat-based expectations by asking participants to imagine accents on the beat found enhancement of sensory responses (Iversen, Repp, & Patel, 2009; Schaefer, Vlek, & Desain, 2010). To sum up, it may be that entrainment and beat-based expectations are not necessarily related to fluctuations in attention, and that previously reported enhancement of auditory responses is due to attention, or grouping and metrical accenting, which we controlled for in the current study by including the aperiodic sequences. Which specific task- and stimulus-related factors cause auditory responses to be enhanced, and whether attention plays a role in this at all, remains an interesting question for further research.
In addition to considering the effects of beat-based expectations at expected time points, we also looked at how beat-based expectations affected processing at time points that were unexpected. Off the beat, N1 responses were larger for events in periodic than aperiodic sequences. Enhancement of auditory processing of events off the beat when periodicity is present is in line with these events being more unexpected than their aperiodic counterparts. Behaviorally, we found deteriorated target detection for events off the beat in periodic sequences when compared to aperiodic sequences, in line with reduced processing of events off the beat. Importantly, detection was hampered even if event timing was fully predictable based on learning the sequence of intervals.
Our findings illustrate the importance of also examining effects of beat-based expectations off the beat in distinguishing between beat-based and memory-based expectations. They support the view that beat-based expectations may allow the brain to go into the more efficient “rhythmic mode” of processing, instead of a continuous “vigilance mode” associated with non-periodic input. Rhythmic mode may be accompanied by automatic suppression of out-of-phase input when entrainment occurs (Schroeder & Lakatos, 2009; Zoefel & VanRullen, 2017). Beat-based, but not memory-based, expectations have indeed been associated with withdrawal of resources from unexpected moments in time, as apparent from immediate CNV resolution after expected time points for beat-based, but not memory-based, expectations (Breska & Deouell, 2017b). A rhythmic processing mode characterized by withdrawal of resources off the beat, rather than focusing of resources on the beat, could also explain why beat-based expectations only weakly affected responses on the beat, especially when memory-based expectations were also present. With the current design, we could not probe events that were unexpected in terms of memory-based expectations. Thus, we cannot be sure that memory-based expectations do not show similar withdrawal of resources from unexpected time points. However, focusing on the effects of expectations at unexpected instead of expected time points may be an interesting way to compare beat-based and memory-based expectations in the future.
We did not find any interactions between the effects of expectations and task relevance. For the effect of beat-based expectations on responses off the beat, the independence of attention can be explained by assuming that off the beat, events were mispredicted, rather than unpredicted (i.e., the aperiodic sequences did not allow for the formation of beat-based expectations, making all events unpredicted based on a beat, while in the periodic sequences, events off the beat were not in line with the beat-based expectations that could be formed). The effects of mispredicted stimuli have been shown to be independent of attention (Hsu et al., 2018), reminiscent of the independence of the MMN, an ERP response to mispredicted sounds, from attention (Näätänen et al., 2007). For the facilitating effects of both beat-based and memory-based expectations, no interaction with attention was present either. However, based on the current data, we cannot rule out that attention may facilitate effects of temporal expectations, as numerically, effects were larger in the attended than unattended condition. Indeed, with mastoid reference, the (four-way) interaction with attention did reach significance for the P1 response. We can therefore not draw strong conclusions about the interaction between attention and temporal expectations, other than noting that attention does not seem to be a prerequisite for expectations to develop.
Conclusion
To sum up, in the current study, we could not differentiate between beat-based and memory-based temporal expectations in terms of their effects on the detection of, and auditory responses to, targets with predictable timing, nor in terms of how they were affected by attention. Also, beat-based and memory-based expectations to some extent interfered with each other. Similar effects on ERPs and behavior and the presence of interference may point to a shared underlying mechanism, which may be surprising given the evidence from clinical studies (Breska & Ivry, 2018) and research in nonhuman animals (Honing et al., 2018) that suggests that beat-based expectations are distinct from other types of temporal expectations. Indeed, we also found evidence for distinct processing of beat-based and memory-based expectations: When the timing of events was fully predictable based on memory, the presence of beat-based expectations still deteriorated target detection and enhanced sensory responses for events off the beat (at unexpected moments).
To reconcile these findings, we propose that beat-based and memory-based expectations overlap in their effects on auditory responses and behavior, in line with both types of expectations serving the same function, but nonetheless rely on partly separate underlying mechanisms to form expectations. Future research will have to focus on distinguishing between possible separate underlying mechanisms by examining the neural dynamics and neural networks involved in different types of temporal expectations. This may also elucidate at which point in the processing stream different types of temporal expectations interact, be it at early stages, during the formation of expectations, or at later stages, where expectations exert their effect on perception and behavior. The special status of beat-based expectations, which has been linked with evolutionary advantages of music (Honing et al., 2015), thus remains an open question. We contribute to this question by being the first to look at the orthogonal effects of beat-based and memory-based expectations on responses to auditory rhythm, and by showing how focusing on the effects of expectations on processing at unexpected, rather than expected, time points may provide a fruitful way to differentiate beat-based from other types of expectations in future research.
Footnotes
FLB is supported by an ABC Talent Grant awarded by Amsterdam Brain and Cognition. HAS is supported by a European Research Council (ERC) starting grant (679399). We would like to thank Peter Saalbrink for his assistance with the data collection.