Neurocognitive studies of musicians have provided ample evidence for the existence of various experience-dependent plasticity changes in the brain (Jäncke, 2009). Long, intensive exposure to sounds during musical training and professional careers enhances auditory processing in musicians as compared to nonmusicians (Münte, Altenmüller, & Jäncke, 2002). However, little is known about the effects of musical training on rapid neural plasticity during auditory perceptual learning. Neural plasticity refers to the capacity of the neural system to change its functional properties after learning or maturation (Pascual-Leone, Amedi, Fregni, & Merabet, 2005). During perceptual learning, neural changes can occur relatively rapidly, even within minutes (Weinberger & Diamond, 1987). Perceptual learning is a type of procedural learning in which improved discrimination of stimuli at the sensory level can be evaluated by examining changes in neural processing and behavioral discrimination. Rapid and short-term plasticity in the auditory system is an essential feature of learning new languages or music (François & Schön, 2010). Auditory perceptual learning is also important for the rehabilitation of auditory functions.

Neurocognitive studies have consistently confirmed that the auditory system is capable of extracting the sound environment and its rules in a probabilistic manner without focused attention (Fiser, Berkes, Orban, & Lengyel, 2010). In other words, regularly repeated and familiar sounds are processed differently from irregular, deviating sounds. In addition to encoding stimulus features, the auditory system develops a prediction model for the sound environment that is used to process sound events in an optimized manner: Repeated, familiar events typically habituate, while unexpected, deviating sounds initially produce stronger responses (Grill-Spector, Henson, & Martin, 2006; Todorovic, van Ede, Maris, & de Lange, 2011). This probabilistic and fairly automatic process may be an essential component of auditory perceptual learning.

Two mechanisms of perceptual learning have been proposed: the feedback-guided attentional process is believed to lead to feature-dependent learning, while passive exposure to stimuli is hypothesized to lead to learning that can be generalized to untrained features (Zhang & Kourtzi, 2010). For example, learning to discriminate pitch contours in melodies could be generalized to the discrimination of linguistic pitch contours (i.e., prosody; Marques, Moreno, Castro, & Besson, 2007). These mechanisms can be studied with event-related potentials (ERPs). In the present study, we examined auditory perceptual learning by measuring rapid plasticity changes on the P3 ERP components following both passive exposure to sounds (P3a) and active discrimination tasks (P3b).

The P3a response is a positive deflection that occurs 200–400 ms following either a low-probability novel (infrequent nontarget) or salient (infrequent target) change in a stream of predictable (frequent) auditory stimulations (for a review, see Polich, 2007). Originally, the P3a was associated with novel sound (or visual: Courchesne, Hillyard, & Galambos, 1975) processing; however, it can be elicited by the infrequent but nonnovel changes in an oddball paradigm. For easily discriminated deviant sounds, P3a responses can occur even when a listener is instructed to ignore the auditory stimuli and to concentrate on other tasks (Schwent, Hillyard, & Galambos, 1976). Frontocentrally maximal P3a responses might reflect involuntary attention switching toward irregular deviant sounds that follow passive comparisons between regularly presented standard and irregularly presented deviant sounds (Polich, 2007). In contrast, slower and temporo-parietally maximal P3b responses reflect controlled attention for task-relevant stimulus characteristics (Pritchard, 1981). In general, P3a and P3b responses are suitable for studying both bottom-up and top-down influences; they are modulated by attention, subjective probability (familiarity), difficulty levels, and stimulus features such as the relative saliency when compared to frequent sounds. P3a and P3b responses show both short- and long-term plasticity changes following auditory training (Atienza, Cantero, & Stickgold, 2004; Uther, Kujala, Huotilainen, Shtyrov, & Näätänen, 2006). Within a single session, P3a and P3b amplitudes show repetition-dependent reductions for target sounds in the frontal areas and a shift from frontal to parietal cortical activation during both active and passive listening conditions (Friedman, Kazmerski, & Cycowicz, 1998). 
Ben-David, Campeanu, Tremblay, and Alain (2011) found that the amplitudes for late positivity (P3b or P600) were decreased in left-hemisphere electrodes during speech tasks but not during tone-learning tasks. These results were interpreted as learning rather than repetition effects because the amplitude decrease was also related to better behavioral discrimination. Reduced activation in the frontal areas may reflect a lower demand for attentional processing of target sounds when the auditory memory template for sounds develops in temporo-parietal areas in conjunction with auditory perceptual learning.

Several studies have demonstrated enhanced P3a and P3b responses to target deviant sounds in musicians relative to nonmusicians. In this study, we focused on P3a findings for target deviant sounds. For example, P3b responses were greater in musicians asked to listen for pitch deviants (for late positivity results, see Besson & Faïta, 1995; for the P3b specifically, see Nikjeh, Lister, & Frisch, 2009; Tervaniemi, Just, Koelsch, Widmann, & Schröger, 2005), rhythmic irregularities (Vuust, Ostergaard, Pallesen, Bailey, & Roepstorff, 2009), and sound location deviants (Nager, Kohlmetz, Altenmüller, Rodriguez-Fornells, & Münte, 2003). In rhythmically trained musicians, P3b latencies were shorter for irregular sound omissions in rhythmic contexts (Jongsma, Desain, & Honing, 2004). Similarly, P3a latencies for pitch deviant sounds were shorter when musically trained participants were asked to ignore sounds (Nikjeh et al., 2009). These findings indicate stronger, faster involuntary attention switching (P3a) and enhanced matching of the working memory trace (P3b) to relevant target sounds in musicians. Although these findings indicate experience-dependent and long-term plasticity changes to P3a and P3b responses in musicians, the effect of music training on the rapid plasticity of P3a or P3b during auditory perceptual learning has not been studied. In this study, we explored the interplay of long- and short-term plasticity effects by measuring changes in P3a responses for deviating target sounds during passive exposure to sounds, as well as changes in P3b responses during an active auditory discrimination task. Sounds were presented in an oddball paradigm in which infrequently presented deviant target sounds were interspersed among frequently presented standard sounds. 
On the basis of previous studies of P3a and P3b potentials in musicians (Nikjeh et al., 2009; Tervaniemi et al., 2005; Vuust et al., 2009), we hypothesized that auditory perceptual learning, as indicated by rapid plasticity changes of the P3a/P3b for deviating target sounds during a single experimental session, would differ between musicians and nonmusicians. In some studies, music training enhanced only P3b responses during attentive discrimination tasks, and not P3a responses to unattended sounds (Besson & Faïta, 1995; Tervaniemi et al., 2005). On the basis of these findings, we also assumed that the differences between musicians and nonmusicians would occur only during active discrimination tasks.

Method and materials

Participants

The study participants were musicians (n = 20, 15 women, age range = 21–39 years) and nonmusicians (n = 21, 11 women, age range = 19–31 years). We used the same data presented in Seppänen, Hämäläinen, Pesonen, and Tervaniemi (2011), in which we report the mismatch negativity for deviants. All musicians had received a professional musical education, had an average of 17 years of playing and training experience, and reported practicing an average of 13 h/week (range = 4–28 h). None of the nonmusicians had received professional musical training; however, most had played an instrument for a short time during their schooling. Five of the nonmusician participants reported currently practicing for 0.5–1 h/week. All participants had normal hearing and reported no history of neurological or psychiatric disorders. Before starting the experiment, all participants gave written informed consent. The experimental protocol was conducted in accordance with the Declaration of Helsinki and approved by the ethics committee of the Department of Psychology at the University of Helsinki.

Procedure

The experimental sessions were conducted on two separate days. During the first day, the session started by determining each participant’s hearing threshold by presenting a short excerpt of the experimental stimuli binaurally through headphones. Subsequently, the stimuli were presented at 50 dB above the threshold, and an electroencephalogram (EEG) was recorded. The stimuli were presented in Passive Blocks 1 and 2, Active Task 1, Passive Blocks 3 and 4, and Active Task 2 (see the illustration in Fig. 1). Passive blocks lasted 15 min each, and the active tasks lasted 5 min each. During the passive blocks, participants were asked to ignore the sounds and concentrate on a muted movie with subtitles. During the active tasks, participants were instructed to press a button whenever they noticed a deviant sound among the standard sounds (e.g., Fitzgerald & Picton, 1983; Friedman et al., 1998; Romero & Polich, 1996; Schwent et al., 1976). Half of the musician and nonmusician participants received visual feedback after each correct answer. The remaining participants were told to look at the fixation cross on the screen. The purpose of the feedback was to offer guidance, especially to nonmusicians, who had not been trained in auditory discrimination tasks. The second testing day occurred approximately one week after the first session. During this session, participants completed a follow-up of the behavioral discrimination task (Active Task 3) without any visual feedback or EEG recording. Participants were also administered a series of tests (not reported here), which consisted of the Immediate and Delayed Auditory Verbal Memory scales of the Wechsler Memory Scale–Revised (WMS-R) and the Stroop color-word interference test.

Fig. 1

Order of the passive and active blocks during EEG recording in our experiment

Stimuli

During both the passive and active conditions, oddball stimuli consisting of infrequent deviant sounds and frequent standard sounds (with 70% probability) were presented. Standard sounds consisted of harmonically rich tones of 466.16, 493.88, or 523.25 Hz that varied randomly between active tasks and between passive blocks. Each tone was 150 ms in duration, with 10-ms rise and fall times, and the fundamental frequency was combined with two harmonic partials (in proportions of 60%, 30%, and 15%). Each sound was created individually using Adobe Audition software. The fundamental frequency was varied between blocks to avoid the frequency-specific neuronal adaptation caused by repetition of the same physical stimulus (Grill-Spector et al., 2006). Among the standard sounds, pitch, duration, and location deviances of three difficulty levels (easy, medium, and difficult) were presented. In each passive block, the probability of each deviant type (10%) was equally distributed throughout the three difficulty levels, such that each of the nine deviant sounds was presented 75 times among 1,575 standard sounds. During each active task, a maximum of 75 trials for each deviant type was presented. Importantly, the number of trials was dependent on the number of correct answers that the participant gave; after five successive correct responses, the difficulty level was elevated. Although we intentionally used simple tones rather than long, melodic stimuli that would have given an advantage to the musicians, the adaptive task also allowed for an assessment of improved discrimination for demanding (difficult) deviances. The average numbers of correct trials in Active Tasks 1 and 2 were not significantly different between groups (34 and 39 in musicians and 37 and 38 in nonmusicians, respectively). The pitch deviants were 5%, 2.5%, and 1% higher than the standard tones at the easy, medium, and difficult levels, respectively.
Duration deviants, from easy to difficult, were 75 ms (50% shorter than the standard), 112.5 ms (25% shorter), and 131.25 ms (12.5% shorter). Location deviants were generated by creating interaural time and decibel-level differences between the left and right ears. On the stereo channel representing the left ear, the sound started 1,200 μs (easy), 700 μs (medium), or 300 μs (difficult) later, such that deviants were perceived as coming from the right ear. The sound location deviant data failed to show reliable P3 responses and were excluded from analyses. The stimulus onset asynchrony was 400 ms under all of the conditions.
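The stimulus parameters above can be checked with a few lines of arithmetic. The following Python sketch is illustrative only: the variable and function names are our own, and it simply restates the percentages, trial counts, and stimulus onset asynchrony given in the text. It is not the software used to generate the stimuli.

```python
# Illustrative restatement of the stimulus design; all names are hypothetical.
STANDARD_F0_HZ = (466.16, 493.88, 523.25)  # standard fundamentals from the text
STANDARD_DURATION_MS = 150.0               # standard tone duration
SOA_MS = 400.0                             # stimulus onset asynchrony

# Relative size of each deviance, by difficulty level.
PITCH_INCREASE = {"easy": 0.05, "medium": 0.025, "difficult": 0.01}
DURATION_DECREASE = {"easy": 0.50, "medium": 0.25, "difficult": 0.125}


def pitch_deviant_hz(standard_hz, level):
    """Frequency of a pitch deviant lying a fixed percentage above the standard."""
    return standard_hz * (1.0 + PITCH_INCREASE[level])


def duration_deviant_ms(level):
    """Duration of a deviant that is a fixed percentage shorter than the standard."""
    return STANDARD_DURATION_MS * (1.0 - DURATION_DECREASE[level])


# Composition of one passive block: 9 deviants (3 types x 3 levels), 75 trials each.
n_deviants = 9 * 75
n_standards = 1575
n_total = n_standards + n_deviants
p_standard = n_standards / n_total        # proportion of standard sounds
p_per_deviant_type = (3 * 75) / n_total   # proportion for each deviant type
block_minutes = n_total * SOA_MS / 1000.0 / 60.0

if __name__ == "__main__":
    for level in ("easy", "medium", "difficult"):
        print(level,
              round(pitch_deviant_hz(STANDARD_F0_HZ[0], level), 2),
              duration_deviant_ms(level))
    print(n_total, p_standard, p_per_deviant_type, block_minutes)
```

Under these numbers, one passive block contains 2,250 sounds, which at a 400-ms onset asynchrony corresponds to the 15-min block length reported in the Procedure.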

EEG recording and analysis

EEGs were recorded with the BioSemi ActiveTwo measurement system (BioSemi, The Netherlands) with a 64-channel cap and nose reference. Additional electrodes were used to record an electrooculogram (EOG) and mastoid site activation. Before filtering (0.5–35 Hz), the EEG data were down-sampled to 512 Hz offline, and artifacts, including movement-related distortions, were removed using BESA version 5.2 software (MEGIS Software GmbH, Germany). The data were divided into 500-ms epochs beginning 100 ms before sound onset (prestimulus baseline) and ending 400 ms after sound onset. Thereafter, deviant and standard ERPs were averaged separately for each participant, condition, and stimulus. Grand-average waveforms were computed for each stimulus, condition, and group.
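The epoching and averaging steps can be sketched as follows. This is a minimal single-channel illustration in plain Python, not the BESA pipeline used in the study; the function names are hypothetical, and subtracting the mean of the prestimulus interval is our assumption about how the baseline was applied.

```python
# Minimal single-channel epoching sketch; names and baseline handling are assumptions.
SAMPLING_RATE_HZ = 512  # down-sampled rate reported in the text
PRESTIM_MS = 100        # prestimulus baseline
POSTSTIM_MS = 400       # poststimulus interval


def ms_to_samples(ms, fs=SAMPLING_RATE_HZ):
    """Convert a duration in milliseconds to the nearest whole number of samples."""
    return int(round(ms * fs / 1000.0))


def extract_epoch(signal, onset_index, fs=SAMPLING_RATE_HZ):
    """Cut one epoch around a sound onset and subtract the prestimulus mean."""
    n_pre = ms_to_samples(PRESTIM_MS, fs)
    n_post = ms_to_samples(POSTSTIM_MS, fs)
    start, stop = onset_index - n_pre, onset_index + n_post
    if start < 0 or stop > len(signal):
        raise ValueError("epoch extends beyond the recording")
    segment = signal[start:stop]
    baseline = sum(segment[:n_pre]) / n_pre
    return [x - baseline for x in segment]


def average_epochs(epochs):
    """Point-by-point average of equally long epochs (an ERP for one condition)."""
    n = len(epochs)
    return [sum(samples) / n for samples in zip(*epochs)]
```

At 512 Hz, the 100-ms and 400-ms intervals round to 51 and 205 samples, so each epoch in this sketch is 256 samples long.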

Nose-referenced grand-average waveforms were used to determine peak latencies for each group, by visual inspection from Fz, for P3a responses (passive blocks), and Pz, for P3b responses (active tasks). Peak latencies were used to calculate mean amplitudes ±20 ms around the peak latency for each participant, deviant type, difficulty level, and block. Peak latencies for the maximum values were calculated between 200 and 400 ms for the P3a and P3b responses. It is possible to have longer onset latencies for P3b; however, due to the short stimulus onset asynchrony (400 ms), the selected time window avoided overlapping responses. The amplitude distributions of all 64 electrodes over the head are presented in scalp maps that were computed from the same time period and used to calculate the group’s mean amplitude. Due to technical difficulties, the data from 1 nonmusician participant were missing from Passive Block 4, and the medium and difficult deviants were missing from Passive Block 3. One participant had a distorted occipital electrode during Block 3 and was excluded from the corresponding scalp map. To keep the signal-to-noise ratio consistent, only participants completing a minimum of 14 trials per deviant were analyzed in the active tasks (Cohen & Polich, 1997). On average, the number of completed trials was higher (see the Stimuli section above).
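The amplitude measure described above (a mean over ±20 ms around a peak found between 200 and 400 ms) can be illustrated with a short sketch. Note the assumptions: in the study the peak latency was determined by visual inspection of the group grand average, whereas this sketch substitutes an automatic search for the maximum; the function names are hypothetical.

```python
# Illustrative peak-latency and mean-amplitude measures; names are hypothetical.

def time_axis_ms(n_samples, fs=512.0, prestim_ms=100.0):
    """Time of each sample relative to sound onset, in milliseconds."""
    return [i * 1000.0 / fs - prestim_ms for i in range(n_samples)]


def peak_latency_ms(waveform, times_ms, window=(200.0, 400.0)):
    """Latency of the largest value inside the search window.

    Stand-in for the visual inspection of grand averages used in the study.
    """
    in_window = [(v, t) for v, t in zip(waveform, times_ms)
                 if window[0] <= t <= window[1]]
    if not in_window:
        raise ValueError("no samples fall inside the search window")
    return max(in_window, key=lambda pair: pair[0])[1]


def mean_amplitude(waveform, times_ms, peak_ms, half_width_ms=20.0):
    """Mean of the waveform within +/- half_width_ms of the peak latency."""
    values = [v for v, t in zip(waveform, times_ms)
              if abs(t - peak_ms) <= half_width_ms]
    if not values:
        raise ValueError("no samples fall inside the averaging window")
    return sum(values) / len(values)
```

In use, the group-level peak latency (for example, from Fz in the passive blocks) would be passed as `peak_ms` when computing the mean amplitude of each individual participant's waveform.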

Statistical analysis

The significance of the P3a and P3b responses to deviant target sounds was tested by comparing the mean amplitudes between the deviant and standard sounds. We applied a mixed-effects model of the analysis of variance (ANOVA) that allowed a flexible dependency structure for the model and did not exclude the participant when a missing value was encountered (Gueorguieva & Krystal, 2004). Separate mixed-model ANOVAs were calculated for P3a and P3b responses. For the passive conditions, the block (Passive Blocks 1, 2, 3, or 4) was used as a repeated measure using the repeated statement in the SPSS mixed-model function. Participant was added as a random effect. This procedure assumes that within-subjects effects are “repeated measures,” as with traditional repeated measures ANOVAs. We used deviant type (pitch and duration), difficulty level (easy, medium, and difficult), and frontality (F3, Fz, and F4 for the frontal region; FC3, FCz, and FC4 for the frontocentral region; C3, Cz, and C4 for the central region; P3, Pz, and P4 for the parietal region) as within-subjects effects, and music training (musician and nonmusician) as a between-subjects effect. Laterality was tested with similar parameters, except that frontality was replaced by a within-subjects effect of laterality (F3, FC3, C3, and P3 for the left hemisphere; Fz, FCz, Cz, and Pz for midline; F4, FC4, C4, and P4 for the right hemisphere). For random effects (participants), a scaled identity covariance structure was used. A restricted maximum likelihood fitting with a first-order autoregressive (AR1) function was used as a variance–covariance structure for the model.
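To make the two covariance structures named above concrete, the following sketch builds them as plain matrices. The values of the variance and the autocorrelation are hypothetical placeholders; in the actual analysis these parameters were estimated by restricted maximum likelihood in SPSS, which this sketch does not attempt to reproduce.

```python
# Shapes of the two covariance structures; sigma2 and rho are hypothetical values.

def ar1_covariance(n_levels, sigma2, rho):
    """First-order autoregressive structure: cov(i, j) = sigma2 * rho ** |i - j|."""
    return [[sigma2 * rho ** abs(i - j) for j in range(n_levels)]
            for i in range(n_levels)]


def scaled_identity(n_levels, sigma2):
    """Scaled identity structure: equal variances and zero covariances."""
    return [[sigma2 if i == j else 0.0 for j in range(n_levels)]
            for i in range(n_levels)]


if __name__ == "__main__":
    # Four repeated measures, as for the four passive blocks.
    for row in ar1_covariance(4, sigma2=1.0, rho=0.5):
        print(row)
```

Under the AR1 structure, measurements from adjacent blocks are assumed to be more strongly correlated than measurements from blocks that are further apart.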

For the active conditions, separate mixed-model ANOVAs were calculated for pitch and duration deviants; only duration deviants had a sufficient number of trials at both medium and difficult levels, whereas pitch deviants had enough trials only at the difficult level. The small number of trials in the active tasks was due to the fact that task difficulty was adapted on the basis of individual learning profiles. Most participants discriminated deviants well enough to quickly move to the medium difficulty level and, later, to move from the medium to the difficult deviant level. Because we had a criterion of five consecutive successful identifications to move on to the next level of difficulty, most participants completed only a few easy trials. Easy deviants during active tasks were, therefore, not analyzed, due to the small number of trials. For both the duration and pitch analyses, task (Active Tasks 1 and 2) was used as a repeated measure, with frontality or laterality as a within-subjects effect. For duration deviants only, the difficulty level was also used as a within-subjects effect. Bonferroni-adjusted pairwise comparisons were used for all post hoc analyses. Additionally, trends were examined using contrasts between Passive Blocks 1 and 2 versus 3 and 4. All statistical tests are reported with an alpha level of .05 used as the significance criterion.
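The adaptive rule explains why so few easy trials were available. A minimal sketch of such a rule is given below, assuming that an error resets the count of consecutive correct responses and that the level never moves back down; neither detail is specified in the text, and the names are hypothetical.

```python
# Sketch of a five-in-a-row advancement rule; reset-on-error is an assumption.
LEVELS = ("easy", "medium", "difficult")
STREAK_TO_ADVANCE = 5


def run_adaptive_levels(responses):
    """Return the difficulty level presented on each trial.

    responses: iterable of booleans, True for a correctly detected deviant.
    """
    level_index = 0
    streak = 0
    presented = []
    for correct in responses:
        presented.append(LEVELS[level_index])
        streak = streak + 1 if correct else 0
        # Advance after five consecutive correct responses, up to the top level.
        if streak == STREAK_TO_ADVANCE and level_index < len(LEVELS) - 1:
            level_index += 1
            streak = 0
    return presented


if __name__ == "__main__":
    # A participant who never errs reaches the difficult level on trial 11.
    print(run_adaptive_levels([True] * 12))
```

With error-free performance, only five easy and five medium trials are presented before the difficult level is reached, which is consistent with the easy level being dropped from the ERP analyses.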

Behavioral performance in Active Tasks 1, 2, and 3 (the follow-up) was evaluated with a χ² test (more methodological details are given in the Notes to Supplementary Table 3, in the supplementary materials). The relationships between the improvements in behavioral discrimination accuracy and the active task, age, WMS-R memory scales, Stroop score, and neural changes were analyzed with Spearman’s nonparametric correlations. Correlations are reported with Bonferroni-adjusted criterion levels that were computed by dividing the level of significance by the number of tests (N = 51). The statistical comparisons of feedback effects during active tasks were performed only for the behavioral data and not for ERPs; the number of participants in the active tasks was insufficient after movement correction. Statistical analyses were performed using SPSS version 18 (SPSS Inc., Chicago, IL).

Results

Grand averages and amplitude topographies are shown in Figs. 2, 3, and 4. Summaries of all of the ANOVA results and the mean amplitude and peak latency values for each group are shown in Figs. 5 and 6, as well as Supplementary Tables 1 and 2. To examine whether rapid plasticity of the P3a and P3b differed between musicians and nonmusicians, separate mixed-model analyses were conducted to examine the P3a and P3b amplitude and latency changes between the four passive listening blocks and between the two active discrimination tasks. The focus in these analyses was to compare block-to-block neural changes between the musicians and nonmusicians.

Fig. 2

Grand-average waveforms for pitch deviant sounds during Passive Blocks 1, 2, 3, and 4 for musicians and nonmusicians

Fig. 3

Grand-average waveforms for duration deviant sounds during Passive Blocks 1, 2, 3, and 4 for musicians and nonmusicians

Passive condition: P3a amplitudes

During passive exposure to sounds, musical training modulated P3a amplitude changes between blocks [Block × Music training: F(3, 8146) = 21.05, p < .001; see upper left panel of Fig. 5]. In musicians, P3a amplitude increased from Block 1 to Block 2 (p = .002) but decreased from Blocks 1, 2, and 3 to Block 4 (all ps ≤ .001). In nonmusicians, however, P3a amplitude increased from Blocks 1 and 2 to Block 3 (both ps < .001) and to Block 4 (ps = .04 and .01, respectively). In addition, a trend analysis for collapsed Blocks 1 and 2 as compared with Blocks 3 and 4 showed significant changes of P3a amplitude in both groups (p < .001). In other words, musicians initially showed an enhancement of P3a but habituation after the active task, while nonmusicians showed enhancement of P3a only after the active task.

Deviant type (pitch and duration) and difficulty level also interacted with the P3a amplitude changes between blocks in the two groups [Block × Deviant Type × Difficulty Level × Music Training: F(6, 8139) = 17.12, p < .001]. Because we had no a priori hypotheses about the effects of deviant type or difficulty level, only the significant post hoc findings are summarized here (see also Fig. 5, lower left). For musicians, P3a amplitudes for easy and difficult pitch deviants were rapidly enhanced between the first two blocks but were diminished (habituated) after the active task. For medium-difficulty pitch deviants, however, the P3a amplitude diminished rapidly in musicians but was enhanced in nonmusicians, a pattern that continued after the active task. P3a responses habituated for easy duration deviants in both groups but were enhanced for difficult duration deviants after the first active task in musicians. Medium-difficulty duration deviants showed habituation in nonmusicians, with temporary enhancement observed after the active task.

Although there was no main effect of musical training, in the grand-average waveforms the pitch deviant P3a was visible and significant only for musicians. For duration deviants, nonmusicians also exhibited a P3a response for the easy and medium difficulty levels (Figs. 2 and 3, and Supplementary Table 1). One of the musicians displayed highly variable amplitude values for selected deviants (medium-difficulty pitch deviants in Passive Blocks 2 and 3, and easy pitch deviants in Passive Block 4), which probably eliminated the main effect in the passive condition.

Fig. 4

Grand-average waveforms for difficult levels of both pitch and duration deviants and for medium duration deviant sounds during Active Tasks 1 and 2 for musicians and nonmusicians

Fig. 5

Summary of the P3a results (with standard errors of the means) for amplitudes (left) and latencies (right) of deviants in the passive condition

Passive condition: P3a latencies

In both musicians and nonmusicians, P3a latencies were shortened during the experiment [Block × Music training: F(3, 8110) = 12.00, p < .001]. In musicians, P3a latencies shortened from Block 1 to Blocks 2, 3, and 4 (all ps < .001). In nonmusicians, P3a latencies shortened from Block 1 to Blocks 2 and 4, and from Block 3 to 4, but increased from Block 2 to 3 (all ps ≤ .001). In addition, a trend analysis for collapsed Blocks 1 and 2 as compared with Blocks 3 and 4 showed a significant change of P3a latency only in musicians (p < .001). In other words, P3a latencies shortened in both groups, but increased only in nonmusicians after the active task.

As with P3a amplitude, deviant type and difficulty level also modulated the rapid plasticity of P3a latencies [Block × Deviant Type × Difficulty Level × Music Training: F(6, 8105) = 5.36, p < .001]. To summarize the significant findings, in musicians, the P3a latency for easy pitch deviants shortened rapidly, while in nonmusicians, the P3a latency was shortened only after the active task. P3a latencies for the medium-difficulty pitch and duration deviants were shortened only in nonmusicians from Block 1 to Block 2, with an additional latency shortening for medium-difficulty duration deviants from Blocks 3 to 4. In both groups, the latencies shortened for hard-difficulty pitch deviants only after the active task. Musicians also showed increased latencies from Blocks 3 to 4. Also, in both groups, the P3a latency for difficult duration deviants shortened from Blocks 1 to 2, while the P3a latency increased after the active task in musicians only. No changes of P3a latency were found for the easy duration deviant.

Active condition: P3b for duration

In the active tasks (Fig. 4), P3bs were analyzed separately for duration and pitch deviants; there were sufficient duration deviant trials to compare the medium and hard difficulty levels; however, there were only enough pitch deviant trials to analyze the hard difficulty level. The hard-difficulty-level deviants that had not yielded significant responses during the passive condition produced significant responses in the active tasks (Supplementary Table 1). For duration deviants in the active tasks, the P3b amplitude was diminished between Active Tasks 1 and 2 for medium (p = .04) and hard (p < .001) difficulty levels only in musicians [Block × Difficulty Level × Music Training: F(1, 351) = 4.38, p = .04, Fig. 6]. In addition, P3b amplitudes for duration deviants were significantly diminished in all but the most frontal electrodes in musicians (frontocentral, p = .01; central, p = .01; parietal, p < .001). In nonmusicians, however, P3b responses were diminished significantly (p = .02) only in the most frontal (F3, Fz, and F4) electrodes [Block × Frontality × Music Training: F(3, 400) = 4.74, p = .01]. P3b latencies were shortened between Active Tasks 1 and 2 in musicians for medium duration deviants (p = .02) and for the difficult duration deviants in both groups (musicians, p = .02, nonmusicians, p < .001) [Block × Difficulty Level × Music Training: F(1, 682) = 8.85, p = .01].

Fig. 6

Summary of the P3b results (with standard errors of the means) for amplitudes (left) and latencies (right) of deviants in the active condition

Active condition: P3b for pitch

Separate analyses for pitch (only difficult level included) showed a significant reduction in P3b amplitudes between active tasks only in musicians (p < .001) [Block × Music Training: F(1, 344) = 5.73, p = .02]. In all participants, P3b latencies were shortened between active tasks [Block: F(1, 335) = 69.84, p < .001], but during Active Task 2, the right-hemisphere electrodes showed significantly (both ps < .01) longer latencies when compared to the midline and left hemisphere electrodes [Block × Laterality: F(2, 463) = 5.79, p = .01].

Behavioral measures

Improvement in behavioral discrimination accuracy between active tasks was evaluated by comparing performance between the tasks separately in musicians and nonmusicians. Only nonmusicians showed improved behavioral accuracy for hard-difficulty deviant sounds (sum score comprising both pitch and duration deviants) between Active Tasks 1 and 2 (χ² = 15.59, p = .01) and between Active Tasks 1 and 3 (the follow-up) (χ² = 7.37, p = .03). In musicians, accuracy started at ceiling level and remained there throughout testing (see Fig. 7). We did not study the effects of feedback on the neural measures, due to the small group sizes, which resulted in problems with the signal-to-noise ratio. Analyses of the behavioral data indicated that feedback did not significantly impact the performance of the musicians. In contrast, the nonmusician group showed a feedback-related improvement in the discrimination of hard-difficulty deviants between Active Tasks 1 and 2 (χ² = 6.88, p = .03). No significant improvement in behavioral discrimination accuracy was found between Active Tasks 2 and 3 in either group.

Fig. 7

Behavioral performance (with standard errors of the means) in Active Tasks 1, 2, and 3 (the follow-up) for musicians and nonmusicians

Correlations between neural and behavioral measures

Correlation analyses were run to examine the relationship between the P3a and P3b changes between blocks and the behavioral discrimination accuracy. Also, the relationship between neural changes and the attentional tests was examined. Using an adjusted alpha level of p ≤ .009, we found that participants who exhibited better discrimination performance during the active tasks tended to have a higher working memory capacity, as evaluated by the WMS-R Digit Span Test (Supplementary Table 6). Improved discrimination during the active tasks was also related to decreased changes in P3a responses between passive blocks. No significant correlations were found between changes in P3a/P3b responses between blocks and either the cognitive tests (WMS-R Immediate and Delayed Auditory Verbal Memory scales and Stroop color-word interference test) or age. Moreover, cognitive test scores did not differ between musicians and nonmusicians, but musicians showed a larger variance in the Stroop test (Levene’s test, p = .05; Supplementary Table 5). All cognitive tests showed greater variances among the musician group (Supplementary Fig. 1). It is possible that with a larger sample, musical training might have been found to influence auditory attention measures in a statistically significant manner.

Discussion

The main goal of this study was to examine the effects of music training on auditory perceptual learning, as reflected by rapid plasticity changes in P3a responses during passive exposure to sounds and in P3b responses during active auditory discrimination tasks. Confirming our hypothesis, we found that music training modulated the rapid plasticity of P3b responses for infrequently presented deviant target sounds during active listening tasks. Between active tasks, musicians exhibited habituation for both medium- and hard-difficulty duration deviants and hard-difficulty pitch deviants. Nonmusicians, however, showed habituation only for pitch deviants. When asked to ignore the sounds, musicians showed differential P3a plasticity for pitch deviants as compared to nonmusicians: Musicians exhibited a general trend of reduction (habituation) in P3a amplitudes, while nonmusicians showed enhancement of P3a amplitudes. For both groups, P3a responses were habituated for easy but enhanced for hard-difficulty duration deviants after the active task. We also found that while musicians were better able to discriminate the target sounds, only nonmusicians exhibited improvement in their behavioral discrimination accuracy. For all participants, behavioral discrimination performance was positively correlated with working memory capacity.

Rapid plasticity of P3b during active discrimination

On the basis of previous studies, we assumed that musicians would show enhanced rapid plasticity of P3b between the active discrimination tasks (e.g., Besson & Faïta, 1995; Tervaniemi et al., 2005). During active discrimination of deviant sounds, both musicians and nonmusicians showed significant reductions in P3b latencies and P3b amplitudes between Active Tasks 1 and 2 for pitch deviants. In musicians, however, the P3b amplitude for pitch deviants was stronger, and the amplitudes of medium- and hard-difficulty duration deviants diminished between tasks. The P3b latency reduction for hard-difficulty pitch deviants in both groups might reflect faster evaluation times for the target sounds as processing becomes easier during focused attention. Previous findings had shown that for easier deviants, the P3b latency is shorter and the P3b amplitude is larger during focused attention (Fitzgerald & Picton, 1983; Mazaheri & Picton, 2005). Our findings suggest that stimulus evaluation for more difficult deviants can be enhanced without musical training but requires focused attention on the deviating sounds.

Alternatively, the reduced P3b latencies and habituation of P3b amplitudes may indicate that the prediction error for task-relevant deviating sounds was reduced (Vuust et al., 2009). The predictive coding model of sensory processing holds that an active neural process extracts a set of rules relating sound events to one another. When a stimulus becomes easily predictable and familiar after repetition, the neural response habituates. In this study, we found that in musicians, the P3b amplitude decreased (habituated) significantly for target sounds deviating in both pitch and duration. Nonmusicians, however, showed P3b habituation only for pitch deviants (as did the participants in Romero & Polich, 1996). Our findings suggest that musicians are able to develop prediction models for sounds more efficiently. Enhanced predictive coding may explain why musicians also exhibited smaller P3b responses for musically relevant sounds than for speech (nonrelevant) deviants during active listening tasks (Tervaniemi et al., 2009). Musically relevant sounds are familiar and easily predictable for professional musicians and might therefore lead to smaller P3b responses. Of note, the optimal paradigm to evoke and analyze P3b responses during active conditions would require a longer stimulus onset asynchrony than was used here (400 ms).

Interestingly, for duration deviants, the P3b diminished between active tasks at frontal electrodes in nonmusicians, but at posterior electrodes in musicians. Similarly, in a previous study (Friedman et al., 1998), the P3 for attended novel sounds decreased only at frontal electrodes in young adults (the same age group as studied here). The lack of plasticity (habituation) of P3b responses at the frontal electrodes and the posterior scalp topography of the plasticity effects in musicians suggest more automated task performance among the musicians during active conditions. Alternatively, musicians may have enhanced auditory selective attention, which is associated with larger parietal activation (Pugh et al., 1996). In nonmusicians, the frontally maximal plasticity effects might indicate a developing memory template for auditory stimuli during perceptual learning. Although the present EEG data cannot confirm which brain structures were involved, our findings suggest differences in frontal and temporo-parietal networks between musicians and nonmusicians. Previous imaging studies have shown that both prefrontal and hippocampal structures are involved in passive encoding and habituation to repeated stimuli (Friedman, Cycowicz, & Gaeta, 2001; Strange, Fletcher, Henson, Friston, & Dolan, 1999). Also, reduced activation in parietal and prefrontal brain regions is associated with elevated behavioral performance in working memory tasks, thereby indicating practice effects (Jansma, Ramsey, Slagter, & Kahn, 2001). Temporo-parietal habituation in musicians may be related to their active use of auditory working memory (with significant contributions from temporal brain regions; Baddeley, 2003).

Indeed, we found that musicians had superior (ceiling-level) behavioral discrimination accuracy in the active tasks, but only nonmusicians exhibited improved accuracy between the tasks: More nonmusicians discriminated the most difficult deviants in the second than in the first active task. This finding does not necessarily indicate that those who discriminated the more difficult deviants were actually learning to discriminate better. However, together with the neural findings of amplitude and latency changes, the behavioral improvement also appears to reflect learning of the sounds. The accuracy improvement between Active Tasks 1 and 2 may be explained by enhanced neural processing. However, there was no significant improvement in discrimination accuracy between Active Task 2 and the follow-up; therefore, we conclude that the essential portion of perceptual learning occurred during the first experimental (EEG) session.

A potential criticism of our active-condition results is that the musicians and nonmusicians may have had unequal signal-to-noise ratios. Since the musicians had better behavioral discrimination, it is possible that they contributed more trials to the averages, that they therefore had a better signal-to-noise ratio in the active tasks, and that this influenced the findings. However, since there was no significant difference in the numbers of trials between the groups, this alternative explanation seems implausible.
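The reasoning behind this concern can be made explicit with the standard signal-averaging relation, given here only as a sketch. It assumes (these assumptions and the symbols $N$, $s(t)$, $n_i(t)$, and $\sigma$ are introduced for illustration and were not tested in this study) that the evoked response $s(t)$ is identical on every trial and that the background noise $n_i(t)$ is zero-mean, uncorrelated across trials, and of equal variance $\sigma^2$:

```latex
\begin{align}
\bar{x}(t) &= \frac{1}{N}\sum_{i=1}^{N}\bigl[s(t) + n_i(t)\bigr]
            = s(t) + \frac{1}{N}\sum_{i=1}^{N} n_i(t), \\
\operatorname{Var}\bigl[\bar{x}(t)\bigr] &= \frac{\sigma^{2}}{N}, \\
\mathrm{SNR}_N &= \frac{\lvert s(t)\rvert}{\sigma/\sqrt{N}}
               = \sqrt{N}\,\mathrm{SNR}_1 .
\end{align}
```

Under these assumptions, a group whose averages contain $N_A$ trials would have a signal-to-noise advantage of $\sqrt{N_A/N_B}$ over a group with $N_B$ trials. When the trial counts of the two groups are comparable, this ratio is close to 1.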

Rapid plasticity of P3a during passive exposure to sounds

During passive exposure to sounds, P3a responses for duration deviants were similar in musicians and nonmusicians. When participants were asked to ignore sounds, a small but significant P3a response was elicited in both musicians and nonmusicians for the easy duration deviants (Fig. 3). During the active condition, however, only musicians showed discernible P3b responses for the hard-difficulty duration deviants. Also, only musicians showed significant P3a responses for pitch deviants in all passive blocks at the easy level. After the first active task, P3a responses were reduced for easy deviants and enhanced for difficult duration deviants between passive blocks in both groups. In nonmusicians, the P3a response decreased at a faster rate than in musicians for the easy- and medium-difficulty duration deviants. In addition, P3a latencies were shortened in both groups for some of the deviants. In keeping with the results obtained for the P3b, a shortened P3a latency typically indicates faster stimulus evaluation and plasticity changes (i.e., habituation) for repeatedly presented nontarget novel stimuli (Debener, Makeig, Delorme, & Engel, 2005; Friedman et al., 1998). The lack of group differences in P3a plasticity for duration might be related to the fact that the Finnish participants were, in general, able to discriminate between duration variations that are essential for semantic differentiation in the Finnish language (Marie, Kujala, & Besson, in press; Tervaniemi et al., 2006). Thus, the participants’ familiarity with duration variations may have enhanced their rapid plasticity for infrequent duration deviants.

Although we did not make explicit assumptions about P3a plasticity, we found that musicians had differential P3a plasticity for pitch deviants; that is, P3a amplitudes for pitch changes habituated in musicians, whereas they were enhanced in nonmusicians. In fact, P3a responses were nearly absent for all pitch deviants in nonmusicians (Fig. 2), although nonmusicians had significant P3b responses for the difficult pitch deviants during active tasks (Fig. 4 and Supplementary Table 1). These findings suggest that music training might be required for eliciting P3a responses for unattended pitch changes. Stronger P3a habituation in musicians for unattended pitch deviants might also indicate enhanced change detection and involuntary attention switching to familiar pitch sounds. This interpretation is consistent with a previous study that found that classically trained musicians process pitch in a facilitated manner (Jäncke, 2009). Our findings suggest that music training modulates the exposure type of perceptual learning (Zhang & Kourtzi, 2010) for pitch. This skill could explain why musicians can generalize their auditory skills (i.e., pitch processing) beyond musically relevant tasks, such as discriminating pitch violations in foreign language prosody (Marques et al., 2007).

Relationship between working memory and auditory perceptual learning

A previous study had found a positive relationship between P3 responses during auditory discrimination and working memory capacity (Polich, Howard, & Starr, 1983). We found that the results of standardized tests of attentional inhibition and auditory memory did not differ between musicians and nonmusicians; nor did these results relate to P3a or P3b plasticity. However, higher working memory capacity, as evaluated by digit span, was related to better behavioral discrimination of target deviant sounds in active tasks. While our sample size did not allow for further generalizations, these results suggest that both auditory working memory and musical training influence behavioral discrimination of deviant sounds. It is likely that correlations between behavioral discrimination and working memory performance were somewhat biased by the maximal level of discrimination in musicians. Although we did not find better working memory performance in musicians than in nonmusicians, a recent study has suggested that music training enhances performance in working memory tasks (George & Coch, 2011).

It is possible that neurophysiological findings in musician studies are caused by factors other than musical training, such as musically enriched home environments in childhood, enhanced cognitive skills, or genetic predispositions to sound processing. However, in Norton et al. (2005) there were no preexisting cognitive, music, motor, or structural brain differences between the children starting instrumental training and the control groups at the pretraining phase. Furthermore, several neurocognitive studies on musicians have shown positive correlations between the length of musical training and the amount of neural processing for sounds. Although the selection effect caused by potential preexisting differences between musicians and nonmusicians cannot be totally ruled out, here we attempted to control for part of the variance in cognitive capacity by using standardized attention tasks. In these tasks, performance was not significantly different between the musicians and nonmusicians in our sample. Musicians, however, had greater variance in their attention task performance.

In summary, the present results suggest that auditory perceptual learning, as measured by rapid neural changes in P3a and P3b responses and behavioral discrimination accuracy, differs between musicians and nonmusicians. During passive exposure to sounds, musicians showed P3a habituation for pitch deviant sounds, while nonmusicians showed P3a enhancement. During active discrimination of deviant sounds, musicians showed greater habituation for duration deviants than did nonmusicians. Generally, habituation was stronger for easier deviants, while responses were enhanced for more difficult deviants. Taken together, these findings suggest that P3a and P3b plasticity effects may reflect auditory perceptual learning for deviant target sounds. In other words, music training modifies the exposure type of perceptual learning for pitch deviants and the attention-gated perceptual learning for duration deviant sounds. Musical training may improve attentional skills and the encoding of features and rules in the auditory environment, thereby explaining the differences in short-term plasticity between musicians and nonmusicians. While these results are among the first to show differential auditory plasticity of P3a and P3b responses within a single experimental session in musicians and nonmusicians, more research will be needed to address whether musical training also enhances rapid plasticity and learning for more complex (i.e., melodic or linguistic) auditory stimuli and over a longer time period. Additional studies are also needed to resolve learning-related changes in ERP generators and in the functional connectivity between different neural structures.