NeuroImage

Volume 83, December 2013, Pages 627-636

From Vivaldi to Beatles and back: Predicting lateralized brain responses to music

https://doi.org/10.1016/j.neuroimage.2013.06.064

Highlights

  • Brain activations can be predicted in the auditory, limbic, and motor regions.

  • The right STG is the core structure for processing complex acoustic features.

  • Orbitofrontal regions are also recruited while listening continuously to music.

  • The presence of lyrics weakens prediction of activations in the left STG.

  • A novel cross-validation method is applied to improve reliability of the results.

Abstract

We aimed at predicting the temporal evolution of brain activity in naturalistic music listening conditions using a combination of neuroimaging and acoustic feature extraction. Participants were scanned using functional Magnetic Resonance Imaging (fMRI) while listening to two musical medleys, including pieces from various genres with and without lyrics. Regression models were built to predict voxel-wise brain activations and were then tested in a cross-validation setting in order to evaluate their robustness across stimuli. To further assess the generalizability of the models, we extended the cross-validation procedure by including another dataset, which comprised continuous fMRI responses of musically trained participants to an Argentinean tango. Individual models for the two musical medleys revealed that activations in several areas of the brain belonging to the auditory, limbic, and motor regions could be predicted. Notably, activations in the medial orbitofrontal region and the anterior cingulate cortex, relevant for self-referential appraisal and aesthetic judgments, could be predicted successfully. Cross-validation across musical stimuli and participant pools helped identify a region of the right superior temporal gyrus, encompassing the planum polare and Heschl's gyrus, as the core structure processing complex acoustic features of musical pieces from various genres, with or without lyrics. Models based on purely instrumental music were able to predict activation in the bilateral auditory cortices as well as in parietal, somatosensory, and left-hemispheric primary and supplementary motor areas. The presence of lyrics, on the other hand, weakened the prediction of activations in the left superior temporal gyrus. Our results suggest spontaneous emotion-related processing during naturalistic listening to music and provide supportive evidence for hemispheric specialization for categorical sounds with realistic stimuli. We herewith introduce a powerful means to predict brain responses to music, speech, or soundscapes across a large variety of contexts.

Introduction

The past two decades have witnessed a surge of neuroimaging studies exploring music perception. In the classical approach commonly employed in these studies, specific hypotheses are tested in controlled conditions wherein the condition of interest is interspersed with a rest baseline or with other control conditions. As a result, this approach fails to capture the modulations caused by the continuous flow of information to our sensory modalities that occurs in the real world. Moreover, these studies typically employ the General Linear Model approach, in which the regressors are usually binary (1 representing a condition of interest and 0 representing baseline) or consist at most of a small number of discrete values; because the presented stimulus is not parameterized, the resulting models do not generalize to other contexts. Furthermore, analysis of such data typically includes averaging operations across scans, and sometimes also across regions of interest (ROIs), and therefore suffers from severe underestimation of the “amount of information collected in a single fMRI measurement” (Haynes and Rees, 2006). Despite these reservations, such an approach provides a macro-level description, spatially and temporally, of the functionality of brain regions, thereby providing a foundation for formulating hypotheses for further investigations that employ alternative techniques to study the temporal evolution of neural processing in a more fine-grained manner.
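
To make the contrast concrete, a binary boxcar regressor and a continuous, stimulus-derived regressor can be sketched as follows (a minimal Python illustration; the timing, HRF parameters, and feature values are placeholders, not those of any study cited here):

```python
import numpy as np
from scipy.stats import gamma

TR = 2.0                         # repetition time in seconds (illustrative)
n_scans = 200

# Canonical double-gamma HRF, sampled at the TR (standard illustrative parameters)
hrf_t = np.arange(0, 32, TR)
hrf = gamma.pdf(hrf_t, 6) - gamma.pdf(hrf_t, 16) / 6
hrf /= hrf.sum()

# Classical GLM regressor: binary boxcar (1 = condition of interest, 0 = baseline)
boxcar = np.zeros(n_scans)
boxcar[20:40] = 1
boxcar[80:100] = 1
x_binary = np.convolve(boxcar, hrf)[:n_scans]

# Stimulus-parameterized regressor: a continuous acoustic feature sampled once
# per scan (placeholder values), varying at every time point
feature = np.random.default_rng(0).normal(size=n_scans)
x_continuous = np.convolve(feature, hrf)[:n_scans]

# Both columns can enter a design matrix, but only the continuous one carries a
# parameterization of the stimulus that can transfer to new stimuli.
X = np.column_stack([x_binary, x_continuous, np.ones(n_scans)])
```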

Recent studies, especially in the visual modality, have promoted a novel method of exploring brain functionality at a micro-level, namely voxel-based encoding. Naselaris et al. (2011) describe it as a technique that enables prediction of brain activity, at the voxel level, from stimulus-based features. Because the stimulus is represented as features, the models created with such an approach can be tested on alternative datasets in order to assess their generalizability. However, this approach has so far been tested only with random selections of natural scenes (Kay et al., 2008, Naselaris et al., 2009) rather than with continuous natural stimuli. In addition, Wu et al. (2006) highlight the laboriousness involved in analyzing data obtained with natural stimuli. Nevertheless, studying neural processing as a continuous process is of vital importance for understanding how the brain processes information, through any sensory modality, in the real world.
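
As a schematic of the voxel-based encoding idea (not the exact pipeline of any of the studies above), each voxel's time course can be modeled as a linear combination of stimulus features estimated on one dataset and then evaluated on held-out data; all array names and dimensions below are illustrative:

```python
import numpy as np

def fit_encoding_models(X_train, Y_train):
    """Least-squares weights mapping stimulus features to every voxel.
    X_train: (time, features); Y_train: (time, voxels)."""
    X = np.column_stack([X_train, np.ones(len(X_train))])   # add intercept
    W, *_ = np.linalg.lstsq(X, Y_train, rcond=None)
    return W

def predict_and_score(W, X_test, Y_test):
    """Predict held-out voxel time courses and score each voxel by the
    Pearson correlation between predicted and measured activity."""
    X = np.column_stack([X_test, np.ones(len(X_test))])
    Y_hat = X @ W
    P = (Y_hat - Y_hat.mean(0)) / Y_hat.std(0)
    Z = (Y_test - Y_test.mean(0)) / Y_test.std(0)
    return (P * Z).mean(0)                 # one r value per voxel

# Usage with placeholder data: 400 scans, 9 acoustic features, 5000 voxels
rng = np.random.default_rng(1)
X_a, Y_a = rng.normal(size=(400, 9)), rng.normal(size=(400, 5000))
X_b, Y_b = rng.normal(size=(400, 9)), rng.normal(size=(400, 5000))
r_per_voxel = predict_and_score(fit_encoding_models(X_a, Y_a), X_b, Y_b)
```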

In the auditory modality, voxel-based encoding studies are scarce. One of the initial investigations was performed by Janata et al. (2002), who used fMRI to identify the rostral and medial part of the superior frontal gyrus, or medial prefrontal cortex (MPFC), as the area engaged in tonality-tracking of an artificially created stimulus that modulated systematically through major and minor keys. In a subsequent study, Janata (2009) examined music-evoked autobiographical memories and their interaction with musical features representing tonality. This time the stimuli comprised 30-second excerpts of natural music that were familiar or unfamiliar to the participants, who performed a behavioral rating task after each excerpt. The tonal structure in the music was captured by a perceptually validated toroidal model known to emulate the perceptual and cognitive processes involved in tonality processing. Results revealed the MPFC as the core area associated with processing music-evoked memories, in addition to several frontal, secondary visual, and sub-cortical regions. Chapin et al. (2010) reported that the level of expressivity in a piano performance affected the evolution of neural activity. The level of expressivity was characterized by two features representing the evolution of tempo and sound intensity, which were found to be processed by limbic and paralimbic regions of the brain as well as the dorsal MPFC. In particular, they reported the involvement of several regions previously known to process pulse, belonging to the primary motor and somatomotor areas.

The only study to date that has investigated voxel-based continuous encoding of several timbral, rhythmic, and tonal features simultaneously in the brain was performed by Alluri et al. (2012), who introduced a novel approach for investigating naturalistic music processing. Ecological musical material (an instrumental modern tango) was used in a realistic setting wherein musicians listened to the stimulus uninterruptedly and without distraction by any experimental task, unlike previous imaging studies that employed controlled settings with task requirements. Subsequently, a comprehensive set of acoustic features was extracted from the stimulus and correlation analyses were performed with the fMRI time series to determine where these features were processed. As a result, they were able to localize the processing of the extracted musical features in the brain, with timbral features recruiting auditory and somatomotor areas and higher-order rhythmic and tonal features also involving emotion-related limbic and paralimbic areas. However, the generalizability of these results is yet to be assessed. Here we extend this novel approach to other musical stimuli. In the present study, we aimed at predicting the temporal evolution of brain activity in relation to acoustic features extracted from musical pieces belonging to various genres, with and without lyrics. As generalizability was the main goal, we also assessed the robustness of the resulting models across stimuli via cross-validation.
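
A rough sketch of such a feature-to-fMRI correlation analysis is given below. It is only an approximation of the published pipeline: librosa stands in for the feature-extraction tools actually used, and the synthetic audio, feature subset, TR, and BOLD data are placeholders.

```python
import numpy as np
import librosa

TR = 2.0                                    # illustrative scan interval in seconds
sr = 22050
# In practice the stimulus audio would be loaded here (e.g. with librosa.load);
# a minute of noise is used instead so the sketch runs on its own.
y = np.random.default_rng(0).normal(size=sr * 60).astype(np.float32)

# Frame-level acoustic features (a small, illustrative subset)
hop = 512
centroid = librosa.feature.spectral_centroid(y=y, sr=sr, hop_length=hop)[0]
rms = librosa.feature.rms(y=y, hop_length=hop)[0]
zcr = librosa.feature.zero_crossing_rate(y, hop_length=hop)[0]
features = np.vstack([centroid, rms, zcr])

# Average the frame-level features within each TR-long window so that the
# feature time series aligns with the fMRI sampling rate
frames_per_tr = int(round(TR * sr / hop))
n_tr = features.shape[1] // frames_per_tr
feat_tr = (features[:, :n_tr * frames_per_tr]
           .reshape(features.shape[0], n_tr, frames_per_tr).mean(-1))

# Correlate each feature with each voxel time course (placeholder BOLD data)
bold = np.random.default_rng(2).normal(size=(n_tr, 3000))
F = (feat_tr - feat_tr.mean(1, keepdims=True)) / feat_tr.std(1, keepdims=True)
B = (bold - bold.mean(0)) / bold.std(0)
r = F @ B / n_tr                            # (n_features, n_voxels) correlation map
```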

In light of previous studies that examined the temporal evolution of musical features in the brain (Alluri et al., 2012, Chapin et al., 2010, Janata, 2009), we hypothesized that the encoding models of the voxels found to be associated with musical feature processing would possess a high degree of generalizability across musical stimuli. Specifically, we expected to successfully predict the activations in the auditory, default mode network (DMN)-related, limbic and paralimbic, and somatomotor-related brain areas as well as cerebellar cognitive areas, and dorsal and rostromedial regions of the prefrontal cortex. In addition, activations in regions belonging to the ventrolateral prefrontal cortex (VLPFC), extrastriate visual areas (cuneus), and cerebellum may also be predictable at more liberal significance thresholds (Janata, 2009). Furthermore, following the notion of lateralized processing of speech in the left hemisphere (e.g., Zatorre et al., 2002) and in line with a previous study comparing brain activity to music without lyrics and music including lyrics (Brattico et al., 2011), we anticipated that the presence of lyrics might shift the balance in the processing of musical features more to the right hemisphere, measured as the number of voxels whose activations can be modeled with high accuracy. Based on this, we predicted that voxel-based models for music with lyrics would have lower predictive accuracy for activations in the left auditory cortex than models for music without lyrics. Previous evidence indicated that voice stimuli (and other sound stimuli containing fast spectrotemporal transitions; Zatorre and Gandour, 2009) are predominantly processed in the left primary and adjacent auditory cortices (Heschl's gyrus and superior temporal gyrus), whereas musical stimuli are processed bilaterally or else preferentially in the corresponding areas of the right hemisphere (Garza-Villarreal et al., 2011, Hickok and Poeppel, 2000, Samson et al., 2011, Tervaniemi and Hugdahl, 2003, Zatorre et al., 2002). Of note, this evidence had been obtained with carefully controlled parametric studies or with artificial designs including direct contrasts between two sound categories. Our goal here was also to find support for hemispheric specialization for sound processing by utilizing realistic stimuli, a naturalistic listening condition, and novel data processing techniques.

One main advantage of the voxel-based encoding technique is the possibility of testing the models in alternative settings, which helps assess their generalizability. To this end, we further extended the cross-validation procedure by using a third dataset, specifically the one obtained by Alluri et al. (2012), as it was comparable to the current data in terms of experimental setting. That dataset comprised continuous fMRI measurements of musicians' brains while they listened to the tango Adios Nonino by Astor Piazzolla. Such comparisons would then facilitate the identification of brain regions dedicated to musical feature processing across different participant pools as well.
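
In essence, the extended cross-validation trains encoding models on one stimulus or participant pool and tests them on each of the others. A self-contained sketch with placeholder data follows; the dataset names, dimensions, and simple least-squares model are illustrative assumptions, not the study's implementation.

```python
import itertools
import numpy as np

rng = np.random.default_rng(3)

# Placeholder stand-ins for the three datasets (features, BOLD responses)
datasets = {
    name: (rng.normal(size=(400, 9)), rng.normal(size=(400, 5000)))
    for name in ("abbey_road", "medley", "tango")
}

def fit(X, Y):
    """Per-voxel least-squares encoding weights (intercept included)."""
    Xd = np.column_stack([X, np.ones(len(X))])
    return np.linalg.lstsq(Xd, Y, rcond=None)[0]

def score(W, X, Y):
    """Pearson correlation between predicted and measured voxel time courses."""
    P = np.column_stack([X, np.ones(len(X))]) @ W
    P = (P - P.mean(0)) / P.std(0)
    Z = (Y - Y.mean(0)) / Y.std(0)
    return (P * Z).mean(0)

# Train on each dataset and test on every other one; voxels predicted well in
# all pairings are candidates for stimulus- and group-general feature processing.
cross_r = {
    (tr, te): score(fit(*datasets[tr]), *datasets[te])
    for tr, te in itertools.permutations(datasets, 2)
}
```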

Section snippets

Stimuli

Two musical medleys, lasting approximately 15 min each, including pieces from various genres, with and without lyrics, were used in two separate scan sessions. One of the medleys comprised the B-side from Abbey Road by The Beatles (1969) and will be referred to as Abbey Road. The other consisted of four musical pieces without lyrics from different genres: Booker T and the MGs (Green Onions), Vivaldi (The Four Seasons — Spring), Miles Davis (Straight, no chaser), and The Shadows (Apache), and

ISC and PCR models

Fig. 1 displays the results of the correlation analysis performed on the participants' fMRI responses. As can be seen, significant mean inter-subject correlations were observed in the auditory cortices, with the maximum value in the superior temporal gyrus (STG) (r = .34, p < .01 for Abbey Road and r = .28, p < .05 for Medley).
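
For reference, inter-subject correlation of this kind can be computed as the mean pairwise Pearson correlation across subjects for every voxel; the sketch below uses placeholder dimensions and random data rather than the study's measurements.

```python
import numpy as np
from itertools import combinations

def intersubject_correlation(bold):
    """Mean pairwise Pearson correlation across subjects for every voxel.
    bold: array of shape (n_subjects, n_scans, n_voxels)."""
    z = (bold - bold.mean(1, keepdims=True)) / bold.std(1, keepdims=True)
    pair_r = [(z[i] * z[j]).mean(0)
              for i, j in combinations(range(len(z)), 2)]
    return np.mean(pair_r, axis=0)          # one mean ISC value per voxel

# Illustrative data: 11 subjects, 400 scans, 5000 voxels (placeholders)
bold = np.random.default_rng(4).normal(size=(11, 400, 5000))
isc = intersubject_correlation(bold)
```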

Subsequently, we performed PCR analyses on each of the datasets as described in the Principal component regression (PCR) modeling section. Fig. 2 demonstrates the correlation
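
Principal component regression itself amounts to projecting the (typically correlated) acoustic features onto a small number of orthogonal components and then regressing each voxel's time course on those components; a minimal scikit-learn sketch with illustrative data and component count follows.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(5)
X_train = rng.normal(size=(400, 25))      # acoustic features of one stimulus
Y_train = rng.normal(size=(400, 5000))    # corresponding voxel time courses
X_test = rng.normal(size=(400, 25))       # features of a held-out stimulus

# PCR: PCA on the feature space followed by ordinary least squares on the
# retained components, fitted jointly for all voxels
pcr = make_pipeline(PCA(n_components=5), LinearRegression())
pcr.fit(X_train, Y_train)
Y_hat = pcr.predict(X_test)               # predicted activity for every voxel
```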

Discussion

The work presented here aimed at predicting the temporal evolution of brain responses to musical pieces with and without lyrics. To this end, we employed the paradigm introduced by Alluri et al. (2012) wherein participants were scanned continuously while listening to pop songs by the Beatles and to a medley of instrumental pieces belonging to various genres. Following this, acoustic features were extracted from the stimuli and used to model the brain responses using principal component

Conclusions

It was demonstrated that using continuous fMRI recording during naturalistic stimulus presentation, brain activity in response to acoustic feature processing could be predicted in the auditory, limbic, and motor regions of the brain with significant accuracy. For the first time, it is demonstrated that orbitofrontal regions, previously associated with evaluative judgments and aesthetic appraisal, are also recruited while passively listening to full musical pieces and their activations can be

Acknowledgments

This research was supported by the Academy of Finland (project number 7118616), TEKES (Finland) grant 40334/10 ‘Machine Learning for Future Music and Learning Technologies’, MindLab grants from the Danish Ministry of Science, Technology and Innovation, the Academy of Finland (Post-Doctoral Researcher; project number 133673), and the University of Helsinki (Three-year Grant; project number 490083). Asoke K. Nandi would like to thank TEKES for the funding of the Finland Distinguished

References (44)

  • T. Naselaris et al. Bayesian reconstruction of natural images from human brain activity. Neuron (2009)

  • T. Naselaris et al. Encoding and decoding in fMRI. NeuroImage (2011)

  • A. Schirmer et al. On the spatial organization of sound processing in the human temporal lobe: a meta-analysis. NeuroImage (2012)

  • A.M. Smith et al. Investigation of low frequency drift in fMRI signal. NeuroImage (1999)

  • M. Tervaniemi et al. Lateralization of auditory-cortex functions. Brain Res. Rev. (2003)

  • R.J. Zatorre et al. Structure and function of auditory cortex: music and speech. Trends Cogn. Sci. (2002)

  • A.J. Blood et al. Intensely pleasurable responses to music correlate with activity in brain regions implicated in reward and emotion. Proc. Natl. Acad. Sci. (2001)

  • A.J. Blood et al. Emotional responses to pleasant and unpleasant music correlate with activity in paralimbic brain regions. Nat. Neurosci. (1999)

  • E. Brattico et al. A functional MRI study of happy and sad emotions in music with and without lyrics. Front. Psychol. (2011)

  • S. Brown et al. Passive music listening spontaneously engages limbic and paralimbic systems. NeuroReport (2004)

  • R.L. Buckner et al. The brain's default network: anatomy, function, and relevance to disease. Ann. N. Y. Acad. Sci., The Year in Cognitive Neuroscience 2008 (2008)

  • H. Chapin et al. Dynamic emotional and neural responses to music depend on performance expression and listener experience. PLoS ONE (2010)

    1 Department of Music, University of Jyväskylä, PL 35(M), 40014 Jyväskylä, Finland.

    2 Royal Academy of Music, Aarhus Skovgaardsgade 2C, DK-8000 Aarhus C, Denmark.

    3 Center of Functionally Integrative Neuroscience, Aarhus University Hospital, Nørrebrogade, 8000 Aarhus C, Denmark.

    4 Department of Mathematical Information Technology, University of Jyväskylä, P.O. Box 35 (Agora), FIN-40014, Finland.

    5 Department of Electronic and Computer Engineering, Brunel University, Uxbridge, Middlesex UB8 3PH, UK.

    6 Institute of Behavioral Sciences, P.O.B. 9, 00014, University of Helsinki, Finland.
