Abstract
The human brain can infer temporal regularities in auditory sequences with fixed sound-to-sound intervals and in pseudo-regular sequences where sound onsets are locked to cardiac inputs. Here, we investigated auditory and cardio-audio regularity encoding during sleep, when reduced vigilance may result in altered bodily and environmental stimulus processing. Using electroencephalography and electrocardiography in healthy volunteers (N=26) during wakefulness and sleep, we measured the response to unexpected sound omissions within three regularity conditions: isochronous, with fixed sound-to-sound intervals, synchronous, where sound and heartbeat are temporally coupled, and a control condition without specific regularity. The isochronous and synchronous sequences induced a modulation of the omission-evoked neural response in wakefulness and N2 sleep, the latter accompanied by a background oscillatory activity reorganization. Cardio-audio regularity encoding was accompanied by a heartbeat deceleration upon omissions in all vigilance states. The violation of auditory and cardio-audio regularity elicits neural and cardiac surprise responses across vigilance stages.
Introduction
The processing of auditory regularity is a basic brain mechanism that enables the rapid detection of unexpected stimuli and can persist even in altered states of consciousness such as sleep and coma [1]. The mechanism underlying auditory regularity encoding has mostly been studied by investigating the neural responses to deviant sounds interrupting a sequence of repeated standard stimuli, as reported in healthy human wakefulness [2,3], during sleep (e.g. [4–6]), and in patients with disorders of consciousness [7–9]. This rudimentary component of auditory discrimination, known as mismatch negativity (MMN), has often been interpreted in the framework of the predictive coding theory [2,3,10,11]. According to this theory, the MMN may arise from the contribution of multiple, non-exclusive mechanisms including repetition suppression in response to frequent stimuli and generation of a ‘prediction error’ following an unexpected mismatch between the predicted and presented stimuli. The predictive nature of the neural responses to violations within regular auditory sequences has received experimental support from studies on unexpected omissions [12–16] wherein the top-down prediction is not confounded by the neural response to deviant sound stimuli. In these paradigms, the neural response to rare deviant stimuli arises from a violation of ‘what’ is expected in the sequence. The mechanisms underlying the prediction of ‘when’ a sound will occur in the context of regular auditory stimuli remain far less studied and largely unresolved [17]. In this context, recent studies demonstrated that auditory prediction is not limited to fixed sound-to-sound regularities but extends to pseudo-regular sequences in which temporal prediction is induced by locking the sound onsets to interoceptive inputs, i.e. internally generated signals such as the heartbeat [18–20].
To date, it remains unknown whether interoceptive signals similarly affect auditory processing during sleep, when consciousness and attentional levels are altered, and perception of the external environment is reduced compared to wakefulness.
Here, we first aimed to investigate whether auditory regularity encoding is preserved across sleep stages, as revealed by the neural response to sound omissions, and second, to test whether detecting regularity in trains of auditory stimuli may benefit from their temporal alignment (or misalignment) with bodily signals such as the heartbeat. We hypothesized that – as the brain gradually disconnects from the environment during sleep – bodily signals may play an important role in informing auditory regularity encoding and the detection of unexpected violations of such regularities. Healthy volunteers underwent two separate sessions of simultaneous electroencephalography (EEG) and electrocardiography (ECG) recordings during wakefulness and full-night sleep. Participants passively listened to two types of auditory regularity and a control sequence (Fig. 1). In the first condition, sounds were presented at a fixed short delay relative to the ongoing heartbeat (synchronous or synch condition); in the second, sounds were presented at a fixed sound-to-sound interval (isochronous or isoch condition); in the third, which served as a control, sounds were presented without any specific regularity (asynchronous or asynch condition). To assess the strength of regularity encoding in all conditions, we measured the EEG and cardiac (ECG) responses to unexpected omissions interspersed within the auditory sequences. Upon sound omissions in wakefulness, we expected prediction error generation in the synch and isoch conditions driven by cardio-audio regularity and auditory regularity, respectively.
Results
1. Participant demographics and sleep characteristics
1.1. Participants
Twenty-six healthy volunteers participated in the study (14 female; 1 left-handed; mean age: 27 years, range: 20-35 years) and each took part in a wakefulness and sleep recording session. All 26 participants were included for wakefulness while the sleep dataset included 25 participants (13 female; 1 left-handed; mean age: 26 years, range: 20-35 years) due to malfunctioning equipment during the sleep session for 1 participant.
We performed the analyses of the simultaneously acquired EEG and ECG signals. For the EEG analysis, the data of 5 participants were excluded for the wakefulness session due to excessive artifactual periods during acquisition (which yielded a loss of more than 50% of the trials), resulting in the inclusion of 21 out of 26 participants. For the sleep session, from 25 participants, we included 12 for N1, 23 for N2, 14 for N3, and 13 for REM sleep. For the ECG analysis, we included all 26 participants for the wakefulness session and 15 for N1 sleep, 24 for N2 sleep, 18 for N3 sleep, and 15 for REM sleep (S1 Table & S2 Table).
1.2. Sleep characteristics
The sleep characteristics recorded during the experimental night across the eligible population (N=25) are summarized in Table 1.
2. Quality control analysis
We conducted a series of control analyses on the heart rate and the relation between heartbeat and sound onsets across auditory conditions in order to identify possible factors influencing the neural and cardiac response to sound omissions during the wakefulness (AWAKE: N=21) and sleep (N1: N=12; N2: N=23; N3: N=14; REM: N=13) sessions. In the following, R denotes the latency of the R peak of the ECG waveform (Fig. 1), S the sound onset latency, RR the R peak-to-R peak latency, SS the sound-to-sound latency, RS the R peak-to-sound latency, and SR the latency from a sound to the next R peak; variability is calculated as the standard error of the mean (SEM). One-way repeated measures Friedman tests with factor Condition (synch, asynch, isoch and baseline - a control condition without auditory stimulation - where relevant) were performed on the group RR interval, RR variability, SS interval, SS variability, RS interval and RS variability for sound trials, as well as SR interval and SR variability for omission trials (S1-S4 Fig.).
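The interval definitions above can be illustrated with a minimal sketch. It assumes sorted event times in milliseconds and takes RS from the most recent R peak; the function names, the toy event times, and this RS convention are illustrative assumptions rather than the original analysis code (the signed RS values reported below suggest the original analysis may use a different reference convention).

```python
from bisect import bisect_right
from math import sqrt
from statistics import mean, stdev


def sem(values):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return stdev(values) / sqrt(len(values))


def successive_intervals(times):
    """Intervals between consecutive events (RR from R peaks, SS from sounds)."""
    return [later - earlier for earlier, later in zip(times, times[1:])]


def rs_intervals(r_peaks, sounds):
    """Latency from the most recent R peak to each sound (assumed convention)."""
    intervals = []
    for s in sounds:
        i = bisect_right(r_peaks, s) - 1  # index of the last R peak at or before s
        if i >= 0:
            intervals.append(s - r_peaks[i])
    return intervals


def sr_intervals(r_peaks, events):
    """Latency from each event (sound or expected sound) to the next R peak."""
    intervals = []
    for s in events:
        i = bisect_right(r_peaks, s)  # index of the first R peak after s
        if i < len(r_peaks):
            intervals.append(r_peaks[i] - s)
    return intervals


# Hypothetical event times in ms (sorted), resembling a synch-like sequence.
r_peaks = [0, 800, 1650, 2450, 3300]
sounds = [52, 851, 1703, 2502]

rr = successive_intervals(r_peaks)
ss = successive_intervals(sounds)
rs = rs_intervals(r_peaks, sounds)
sr = sr_intervals(r_peaks, sounds)

# Mean interval and its variability (SEM) for each measure.
summary = {
    name: (mean(vals), sem(vals))
    for name, vals in {"RR": rr, "SS": ss, "RS": rs, "SR": sr}.items()
}
```

In this toy sequence the RS interval is nearly constant while the SS interval inherits the heartbeat variability, which is the pattern expected by construction in a synch-like condition.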
In the synch condition, we observed an average RS interval of 51.8 ms (SEM = 1.1 ms) for sound trials in wakefulness and sleep, and -2.2 ms (SEM = 5.2 ms) for the other auditory conditions (figure A in S1 Fig. for neural quality control analyses; see Materials and Methods, Auditory conditions for the full cohort results). By construction, we expected lower RS variability in the synch relative to the two other conditions (isoch and asynch), as was confirmed by one-way repeated measures Friedman tests with factor Auditory Condition (synch, asynch, isoch) (figure B in S1 Fig.; p<0.0005; AWAKE: χ2(1,3) =31.5; N1: χ2(1,3) =18.2; N2: χ2(1,3) =34.5; N3: χ2(1,3) =21.1; REM: χ2(1,3) =19.5; all post-hoc paired Wilcoxon signed-rank tests showed lower variability in the synch compared to the asynch or isoch condition). In addition, as expected, the SR interval within omission trials was more variable in the asynch and isoch conditions compared to the synch condition (figure B in S2 Fig.; p<0.0005; AWAKE: χ2(1,3) =31.7; N1: χ2(1,3) =18.2; N2: χ2(1,3) =35.0; N3: χ2(1,3) =21.6; REM: χ2(1,3) =19.5; post-hoc Wilcoxon signed-rank tests confirmed significant differences). This first series of control analyses demonstrated that the experimental manipulation, inducing an online fixed temporal alignment between R peak and sound in the synch and a variable one in the other conditions, was successful.
The analysis of auditory regularity based on the SS interval confirmed higher variability during the synch and asynch compared to the isoch condition (figure B in S3 Fig.; p<0.0005; AWAKE: χ2(1,3) =28.6; N1: χ2(1,3) =18.2; N2: χ2(1,3) =36.3; N3: χ2(1,3) =19.0; REM: χ2(1,3) =16.8; corroborated by post-hoc Wilcoxon signed-rank tests) with an average variability value of 1.5 ms in wakefulness and 3.4 ms across sleep stages for the isoch condition. This second analysis confirmed that the isoch condition was characterized by highly regular sound-to-sound intervals in comparison to the other conditions.
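The test sequence used in these control analyses (a repeated measures Friedman test followed by post-hoc paired Wilcoxon signed-rank tests) could be sketched as below. The per-participant variability values are simulated and purely hypothetical, the variable names are illustrative, and SciPy is an assumed tool choice; this is not the original analysis code.

```python
import numpy as np
from scipy.stats import friedmanchisquare, wilcoxon

rng = np.random.default_rng(0)
n_subjects = 21

# Hypothetical per-participant RS variability (SEM, in ms) for each condition.
# The synch values are simulated as clearly lower, mimicking a fixed RS delay.
synch = rng.normal(1.0, 0.2, n_subjects)
asynch = rng.normal(5.0, 0.5, n_subjects)
isoch = rng.normal(5.2, 0.5, n_subjects)

# Omnibus test across the three related samples (one value per participant).
chi2, p_omnibus = friedmanchisquare(synch, asynch, isoch)

# Post-hoc paired comparisons, run only to locate the omnibus effect.
pairs = {
    "synch vs asynch": (synch, asynch),
    "synch vs isoch": (synch, isoch),
    "asynch vs isoch": (asynch, isoch),
}
posthoc_p = {name: wilcoxon(a, b).pvalue for name, (a, b) in pairs.items()}

print(f"Friedman chi2 = {chi2:.1f}, p = {p_omnibus:.2g}")
for name, p in posthoc_p.items():
    print(f"{name}: p = {p:.2g}")
```

Both tests are rank-based, so they make no normality assumption about the interval variability values, which is one plausible reason for preferring them with small and unequal sample sizes across sleep stages.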
During the sleep session, the heartbeat was not globally affected during N2, N3 or REM sleep by the different cardio-audio and audio-audio stimulus presentation conditions as shown by 1×4 repeated measures Friedman tests on average RR intervals with factor Condition (synch, asynch, isoch, baseline) (figure A in S4 Fig.; p>0.05). Conversely, the same analysis performed in wakefulness and N1 sleep revealed significant differences in RR intervals (figure A in S4 Fig.; AWAKE: χ2(1,3)=14.4, p<0.005; N1: χ2(1,3) =12.3, p<0.05). Post-hoc Wilcoxon signed-rank tests uncovered that significantly reduced RR intervals were specific to the baseline condition (AWAKE: p<0.005 for synch/asynch/isoch vs baseline; N1: p<0.005 for asynch/isoch vs baseline) while no differences were observed across auditory conditions (p>0.05).
Finally, we carried out a control analysis on the ECG waveforms to exclude the potential confound of a different degree of contamination of the ECG artifacts in the EEG between conditions of interest when interpreting the differential EEG signals locked to the R peak. To this aim, we performed a time-wise ECG waveform comparison of the grand averaged ECG trials time-locked to R peaks during sound omissions. Statistical analysis based on non-parametric cluster permutation statistics (p<0.05, two-tailed) contrasting paired experimental conditions (synch, asynch, isoch, baseline) for each vigilance state (S5 Fig.) revealed no significant differences (p>0.05) in any of the sleep stages and for any of the comparisons. In wakefulness, significant differences (p<0.05) were observed in the synch vs baseline comparison between 365 ms and 446 ms and in the asynch vs baseline comparison between 368 ms and 443 ms.
Overall, this last series of analyses suggests that the heartbeat and ECG characteristics were well matched across auditory conditions and, as a consequence, do not represent a main confounding factor in explaining the cardiac and neural differential responses across regularity types.
3. Neural omission response
3.1. Auditory response comparisons
Auditory evoked potentials (AEPs) were computed as the average of the peri-stimulus EEG epochs between 100 ms pre-stimulus and 500 ms post-stimulus onset in the synch, asynch and isoch conditions for the wakefulness session (AWAKE: N=21) and all sleep stages (N1: N=12; N2: N=23; N3: N=14; REM: N=13). The AEPs revealed the expected event-related potential components, including the N100 at approximately 100 ms post-stimulus onset, and similar AEP morphologies across sleep stages provided qualitative evidence that sound stimuli were processed by the participants’ brains (Fig. 2). Cluster-based permutation statistical analysis confirmed no significant differences between the isoch and asynch AEPs in wakefulness or any of the sleep stages (p>0.05), suggesting that auditory processing of the single sounds within the sequence was similar across conditions. We refrained from performing these comparisons in the synch condition as the AEPs in that case were contaminated with the response to the heartbeat.
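The epoch averaging described above can be sketched as follows, assuming the EEG is held as a channels-by-samples NumPy array and sound onsets are given as sample indices. The function name, the toy signal, and the sampling rate are hypothetical; preprocessing steps such as artifact rejection or baseline correction are not shown.

```python
import numpy as np


def average_evoked(eeg, onsets, sfreq, tmin=-0.1, tmax=0.5):
    """Average peri-stimulus epochs from tmin to tmax (seconds) around each onset.

    eeg    : array of shape (n_channels, n_samples)
    onsets : iterable of stimulus onsets, in samples
    sfreq  : sampling frequency in Hz
    """
    start = int(round(tmin * sfreq))
    stop = int(round(tmax * sfreq))
    # Keep only epochs that lie entirely within the recording.
    epochs = [
        eeg[:, onset + start:onset + stop]
        for onset in onsets
        if onset + start >= 0 and onset + stop <= eeg.shape[1]
    ]
    if not epochs:
        raise ValueError("no complete epochs available")
    return np.mean(epochs, axis=0)


# Toy recording: 2 channels, 5 s at an assumed 1000 Hz, with a negative
# deflection on channel 0 placed 100 ms after each hypothetical sound onset.
sfreq = 1000
eeg = np.zeros((2, 5000))
onsets = [500, 1500, 2500, 3500]
for onset in onsets:
    eeg[0, onset + 100] = -1.0

evoked = average_evoked(eeg, onsets, sfreq)
times_ms = np.arange(evoked.shape[1]) / sfreq * 1000 - 100  # epoch time axis
```

The same function applies to omission trials by passing the expected sound onsets or the R peaks during omissions in place of the sound onsets.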
3.2. Omission response in auditory temporal regularities
3.2.1. Omission response comparisons
In order to test our hypothesis that auditory regularities induced an expectation of incoming stimuli, we derived omission responses during sound omissions in the isoch and asynch conditions in wakefulness (AWAKE: N=21) and all sleep stages (N1: N=12; N2: N=23; N3: N=14; REM: N=13). Average sound-based omission evoked potentials (OEPs) were calculated by extracting epochs from the continuous EEG recordings that were time-locked to the average sound interstimulus interval. A random selection of epochs was extracted from continuous baseline recordings, such that the latencies between epoch onset and closest heartbeat (i.e. R peak) were matched to the trial onsets in the sound-based isoch and asynch conditions at the single-trial level.
Based on previous reports of neural responses to omissions within regular auditory sequences during wakefulness (e.g. [12,15,21]), here, we expected central negativity in the isoch condition (where temporal regularity existed in sound stimuli) between 100 ms and 250 ms. In the wakefulness session, the cluster-based permutation test comparing the isoch vs asynch condition yielded a significant negative cluster (p<0.05, Cohen’s d = 0.67) at 226 ms to 288 ms following expected sound onset in central scalp electrodes (Fig. 3A). The existence of an omission response in the isoch condition is further confirmed by the isoch vs baseline comparison (figure A in S6 Fig.), revealing a significant negative cluster (p<0.05, Cohen’s d = 0.91) at 103 ms to 170 ms following expected sound onset on anterior-central scalp electrodes. A second significant negative cluster (p<0.05, Cohen’s d = 1.23) was observed for the same comparison at 251 ms to 295 ms on posterior electrodes and a significant positive cluster (p<0.05, Cohen’s d = 1.02) at roughly the same latency between 246 ms and 303 ms on anterior-central electrodes. Despite the absence of a regularity rule in the asynch condition, the asynch vs baseline comparison in wakefulness also revealed a significant positive cluster (p<0.05, Cohen’s d = 0.82) on central scalp electrodes at 197 ms to 288 ms following expected sound onset (figure B in S6 Fig.). This points to some degree of inference of the temporal relationship across auditory stimuli being established during asynch sequence administration in wakefulness, potentially driven by the pseudo-regularity of auditory inputs.
Sleep OEP differences across auditory conditions were significant only in N2 sleep. In addition, the isoch vs baseline and the asynch vs baseline comparisons during N2 sleep did not reveal significant differences. Isoch vs asynch OEP differences were largely similar between wakefulness and N2 sleep, at least in terms of latency relative to expected sound onset. In more detail, statistical evaluation of the isoch and asynch condition differences (Fig. 3B) identified a significant negative cluster (p<0.05, Cohen’s d = 0.66) at 85 ms to 223 ms following expected sound onset localized to posterior-central scalp electrodes. Our results in wakefulness and N2 sleep suggest that in the isoch condition, the fixed sound-to-sound interval induced an expectation of upcoming sounds and resulted in a neural surprise response upon violation of the regularity rule during omissions.
3.2.2. Sound vs omission response comparisons
We contrasted the isoch AEPs and OEPs in wakefulness and N2 sleep, in order to ensure that the observed neural omission responses were indeed related to the unfulfilled sound expectation and not due to differences of neural responses to stimuli across blocks. In wakefulness (figure A in S7 Fig.), cluster permutation statistical analysis contrasting the isoch AEPs and isoch OEPs revealed two significant positive clusters, a first (p<0.0005) between 72 ms and 362 ms and a second (p<0.05) between 394 ms and 500 ms, both localized to posterior electrodes. Two negative clusters were additionally identified, a first (p<0.0005) between 66 ms and 190 ms and a second between 259 ms and 360 ms, both on central scalp electrodes. In N2 sleep (figure B in S7 Fig.), the isoch AEP and OEP contrast yielded one positive cluster (p<0.005) between 136 ms and 407 ms on central electrodes and one negative cluster (p<0.0005) between 138 ms and 500 ms on outer electrodes. These latencies overlap with those obtained when comparing OEPs between isoch vs baseline (and isoch vs asynch) and provide support to our interpretation that these neural omission responses were due to regularity violation upon omission of expected sounds.
3.3. Omission response in cardio-audio regularities
3.3.1. Omission response comparisons
To investigate the effect of cardio-audio synchronicity on omission responses, we derived heartbeat-evoked potentials during sound omissions (OHEPs) in the synch and asynch conditions for the wakefulness session (AWAKE: N=21) and all sleep stages (N1: N=12; N2: N=23; N3: N=14; REM: N=13). Average OHEPs were calculated by extracting epochs from the continuous EEG recordings that were time-locked to the first R peak of the ECG signal during omissions. As an additional control condition for this analysis, we extracted a random selection of R peaks in the ECG signal to derive average HEPs from the continuous EEG recordings in the baseline condition.
In the wakefulness session, we expected to observe differences in the HEPs when comparing the synch to asynch and the synch to baseline conditions, as a consequence of the predictability of sound onset in the synch condition, based on the fixed delay between R peaks and sounds [19]. We expected no differences when comparing asynch vs baseline. For the synch vs asynch comparison (Fig. 3C), the cluster-based permutation test (p<0.05, two-tailed) revealed a significant positive cluster (p<0.05, Cohen’s d = 1.36) at 223 ms to 274 ms following R peak on posterior-central scalp electrodes. A significant negative cluster (p<0.05, Cohen’s d = 0.91) at 162 ms to 267 ms following R peak was additionally observed in this comparison within two distinct intervals, a first at 162 ms to 213 ms on anterior-central electrodes and a second at 217 ms to 267 ms on posterior-lateral rightmost electrodes. The synch vs baseline comparison (figure C in S6 Fig.) yielded one significant negative cluster (p<0.0005, Cohen’s d = 1.21) on posterior-central scalp electrodes at 36 ms to 275 ms. As anticipated, the asynch vs baseline comparison showed no significant differences (figure D in S6 Fig.).
In N2 sleep, we found differences between the OHEPs of the synch and asynch conditions (Fig. 3D) at -99 ms to 117 ms (p<0.05, Cohen’s d = 0.84) and at 322 ms to 500 ms (p<0.05, Cohen’s d = 0.62) following R peak onset; both clusters were localized to central electrodes. As sound omissions are unexpected, the early significant difference between the synch and asynch conditions observed only during N2 sleep could be explained by changes in the background oscillatory activity during this vigilance state (addressed in 4. Slow oscillations analysis). Similar to wakefulness, no significant clusters were identified for the asynch vs baseline comparison in N2 sleep. However, contrary to wakefulness results, comparing the synch and baseline OHEPs in N2 sleep revealed no significant differences between the two conditions. The same statistical tests carried out in the remaining sleep stages provided no significant results.
3.3.2. Sound vs omission response comparisons
We additionally contrasted the synch AEPs (time-locked to the R peak) and OHEPs in wakefulness and N2 sleep in order to identify the latencies at which the neural correlates of the unfulfilled auditory prediction arise within the synch condition. In wakefulness (figure C in S7 Fig.), cluster permutation statistics revealed two significant positive clusters, a first (p<0.0005) between 125 ms and 313 ms and a second (p<0.05) between 325 ms and 443 ms, both on posterior electrodes. Two negative clusters were additionally identified, a first (p<0.0005) between 113 ms and 246 ms and a second (p<0.05) between 314 ms and 442 ms, both on central electrodes. In N2 sleep (figure D in S7 Fig.), the synch AEPs vs OHEPs contrast yielded one positive cluster (p<0.0005) between 208 ms and 500 ms on central electrodes and one negative cluster (p<0.0005) between 203 ms and 500 ms on outer electrodes. Put together, the latencies of differential responses between synch AEPs and OHEPs included those observed when comparing the OHEPs between the synch and asynch conditions and between the synch and baseline conditions and are therefore coherent with our interpretation that the neural omission responses arise from the neural response to violation of cardio-audio regularity.
3.3.3. Heartbeat response comparisons
The OHEP comparisons between the synch and asynch conditions and between the synch and baseline conditions contain both the neural response to heartbeat signals and that to the sound omission. In an attempt to disentangle the contribution of these two neural responses, we built an estimation of the neural response to heartbeat in the synch condition free from the neural response to sounds. This was obtained by subtracting from the AEPs of the synch condition (which by construction include both the neural response to cardiac and auditory signals) the AEPs of the asynch condition, which are not contaminated in a systematic manner by the neural response to heartbeat. We then performed two comparisons for this clean ‘synch HEPs’ estimation, one to the asynch HEPs and the other to the baseline HEPs during sound presentations using cluster permutation statistical tests (p<0.05, two-tailed). In N2 sleep the ‘synch HEPs’ did not differ from the asynch HEPs or baseline HEPs (p>0.05). In wakefulness, we found no significant differences (p>0.05) between the synch and asynch condition HEPs. Significant differences (p<0.05) were only observed between the synch and baseline HEPs at 330 ms to 493 ms and 293 ms to 443 ms, as a positive and negative cluster respectively. The fact that these latencies occur later than those at which we observed differences between the OHEPs in the synch vs baseline condition corroborates our interpretation that the neural omission responses (Fig. 3CD) are a result of the unfulfilled prediction upon sound omission in the synch condition (and not a result of a modulation of the response to cardiac input).
4. Slow oscillations analysis
As outlined above (Fig. 3D), the analysis of OHEPs during N2 sleep (N=23) revealed that cardio-audio synchronization (compared to the asynch) gave rise to a central early onset positivity (at -99 ms to 117 ms) and late onset negativity (322 ms to 500 ms). This effect is reminiscent of the up and down states of slow oscillations (SOs) during N2 sleep (0.5-1.2 Hz oscillations [22,23]). This interpretation was confirmed by the analysis of the band-passed EEG data at 0.5 Hz to 4 Hz (delta band; reflecting the range of SO frequencies) which produced a highly similar profile for the synch vs asynch OHEPs during N2 sleep with an early onset central positivity difference (p<0.05) between -99 ms and 125 ms and a late onset central negativity difference (p<0.05) between 259 ms and 500 ms following heartbeat onset.
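The delta-band filtering step could be sketched as follows. Only the 0.5 Hz to 4 Hz band comes from the text; the filter family (Butterworth), its order, the zero-phase application, the sampling rate, and the function name are all assumptions made for illustration.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt


def delta_bandpass(x, sfreq, low=0.5, high=4.0, order=2):
    """Zero-phase band-pass filter along the last axis (assumed design)."""
    sos = butter(order, [low, high], btype="bandpass", fs=sfreq, output="sos")
    return sosfiltfilt(sos, x, axis=-1)


# Toy signal: a 1 Hz slow component plus a 20 Hz component, at an assumed 100 Hz.
sfreq = 100
t = np.arange(0, 60, 1 / sfreq)
slow = np.sin(2 * np.pi * 1.0 * t)
fast = np.sin(2 * np.pi * 20.0 * t)

filtered = delta_bandpass(slow + fast, sfreq)

# Away from the edges, the output should closely follow the slow component.
core = slice(10 * sfreq, -10 * sfreq)
max_error = np.max(np.abs(filtered[core] - slow[core]))
```

Forward-backward filtering is used here so that the filter introduces no phase shift, which matters when comparing the latency of slow components relative to the R peak.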
In light of this observation and since sound presentations are known to alter the background oscillatory activity in sleep, notably the SOs during NREM sleep [24–26], we investigated whether the three auditory conditions had differential effects on the ongoing oscillations during N2 sleep. We selected the SO positive half-wave peak latency at electrode Cz (Fig. 4AB) as the representative SO latency and computed the median sound-to-SO latency for all auditory conditions and median R peak-to-SO latency for all auditory conditions and the baseline.
Statistical assessment using a 1×3 repeated measures Friedman test on the mean sound-to-SO latencies yielded significant differences across auditory conditions (Fig. 4C; χ2(1,3) =7.9, p<0.05). Post-hoc Wilcoxon signed-rank tests confirmed lower latencies in synch compared to asynch (p<0.005) and in isoch compared to asynch (p<0.05). These results suggested a possible readjustment of the SOs with respect to sound onset depending on the regularity condition (Fig. 4C) since when regularity was present, either in the synch or isoch condition, SOs tended to align to the sound onset. Because of the fixed RS delay in the synch condition, the alignment between sound onset and SOs was also reflected in a lower heartbeat-to-SO peak latency in the synch compared to the asynch and isoch conditions (Fig. 4D). In order to rule out that this SO readjustment was due to a specific relation between R peak and SO irrespective of sound presentation, we carried out a 1×4 repeated measures Friedman test on the mean R peak-to-SO latencies, which revealed no significant differences across conditions (Fig. 4D; χ2 (1,4)=7.4, p>0.05), suggesting that potential readjustment of SOs was specific to auditory regularities and not to cardiac input.
5. Cardiac omission response
We further investigated whether regularity violation upon omission of expected sounds could also elicit a cardiac response across vigilance states. We analyzed heartbeat changes based on the RR intervals extracted from the ECG in response to sound omissions as a function of auditory conditions (synch, asynch, isoch).
5.1. Auditory condition contrast
We compared average RR intervals before, during and after omissions using two-way repeated measures ANOVAs with factors Auditory Condition (synch, asynch, isoch) and Trial Order (one trial before, trial during, first trial after, second trial after sound omissions) in the wakefulness and sleep sessions. To take full advantage of the available data in each vigilance state (S1 Table & S2 Table), separate repeated measures ANOVAs were performed for each vigilance state (AWAKE: N=26; N1: N=15; N2: N=24; N3: N=18; REM: N=15). These analyses were conducted on normalized RR intervals (Fig. 5; by subject-wise division of each of the investigated average RR intervals by the average RR interval prior to sound omission), so that the results would not be attributable to inter-subject variability in RR intervals. Of note, analyses on non-normalized RR intervals displayed similar results (data in S2 Text).
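The subject-wise normalization described above amounts to dividing each average RR interval by the average RR interval preceding the omission. A minimal sketch, with hypothetical participant labels and RR values in ms:

```python
def normalize_rr(rr_by_trial):
    """Divide each average RR interval by the pre-omission average (RR-1).

    rr_by_trial is ordered as (RR-1, RRom, RR+1, RR+2).
    """
    rr_before = rr_by_trial[0]
    return [rr / rr_before for rr in rr_by_trial]


# Hypothetical per-participant average RR intervals (ms) for one condition.
subject_rr = {
    "s01": [800.0, 824.0, 820.0, 804.0],
    "s02": [1000.0, 1020.0, 1015.0, 1000.0],
}

normalized = {subject: normalize_rr(values) for subject, values in subject_rr.items()}
```

After normalization the pre-omission value equals 1 for every participant, so values above 1 indicate heart rate deceleration relative to the trial before the omission, regardless of each participant's resting heart rate.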
Overall, in wakefulness and sleep, we observed that cardio-audio synchronization led to an increase in the average RR interval during sound omissions (RRom) compared to the trial before omission (RR-1) that persisted over the following trial (RR+1) and decreased over the second trial following omission (RR+2). This deceleration in the heart rate upon sound omission was absent or inconsistent in the asynch and isoch conditions.
Statistical analysis confirmed this difference across auditory conditions as follows. 3×4 repeated measures ANOVAs revealed a significant interaction of Auditory Condition x Trial Order for all vigilance states (Fig. 5A; p<0.0005; AWAKE: F(3,4)=14.1; N1: F(3,4)=6.6; N2: F(3,4)=11.6; N3: F(3,4)=6.2; REM: F(3,4)=6.2) with a main effect of Auditory Condition (AWAKE: F(3,1)=27.2, p<0.0005; N1: F(3,1)=11.4, p<0.0005; N2: F(3,1)=14.4, p<0.0005; N3: F(3,1)=9.0, p<0.005; REM: F(3,1)=17.6, p<0.0005) and a main effect of Trial Order (AWAKE: F(1,4)=20.1, p<0.0005; N1: F(1,4)=5.3, p<0.005; N2: F(1,4)=18.4, p<0.0005; N3: F(1,4)=3.5, p<0.05; REM: F(1,4)=7.5, p<0.0005). Permutation testing followed by Wilcoxon signed-rank tests evaluated the distribution of post-permutation F values against the original F values and confirmed the significance of the main effects and interactions (p<0.0005). Post-hoc paired Wilcoxon signed-rank tests with Bonferroni correction for multiple comparisons across conditions corroborated that in the synch condition (Fig. 5A, red line), omissions elicited a long-lasting heart rate deceleration, with higher RR interval during and immediately after the omission than before the omission across all vigilance states (AWAKE: p<0.0005; N1: p<0.0005; N2: p<0.0005; N3: p<0.005; REM: p<0.005).
5.2. Vigilance State Contrast
To better characterize how sound omissions modulated RR intervals as a function of auditory conditions and vigilance states (Fig. 5B), we computed a three-way repeated measures ANOVA with factors Vigilance State (AWAKE, N1, N2, N3, REM), Auditory Condition (synch, asynch, isoch), and Trial Order (one trial before, trial during, first trial after, second trial after sound omission), including only participants who had sufficient data in all vigilance states (N=6; S2 Table).
The ANOVA revealed a significant main effect of Auditory Condition (F(1,3,1)=27.5, p<0.0005), due to overall higher RR interval values (i.e. enhanced deceleration) during the synch condition, and a main effect of Trial Order (F(1,1,4)=3.9, p<0.05) but no main effect of Vigilance State (p>0.05). Critically, replicating the analyses on separate vigilance states (Fig. 5A), the interaction effect of Auditory Condition and Trial Order was also significant (F(1,3,4)=4.2, p<0.0005). Permutation testing followed by Wilcoxon signed-rank tests verified the significance of the main effects and interaction (p<0.0005). The interactions Vigilance State x Auditory Condition, Vigilance State x Trial Order, and the triple interaction across all factors (Vigilance State x Auditory Condition x Trial Order) were not significant (p>0.05). This pattern of results further confirmed that the observed heart rate deceleration upon sound omission in the synch condition and lack thereof in the asynch and isoch conditions was consistent across vigilance states.
Discussion
We investigated the neural and cardiac correlates of cardio-audio regularity processing by administering sounds at fixed temporal pace (isoch), in synchrony to the ongoing heartbeat (synch) and in a control condition without specific temporal regularity (asynch) while maintaining matched average sound-to-sound intervals across conditions in wakefulness and sleep. We tested whether auditory regularity encoding would result in a violation detection response as measured by the cardiac and neural signals upon unexpected omitted sounds. Neural responses to sound omissions revealed that sound sequences induced prediction of upcoming sounds both when sounds occurred at a fixed pace and when temporally synchronized to the ongoing heartbeat during wakefulness and N2 sleep. Enhanced prediction related to cardio-audio synchronization was also evident in all vigilance states observed as a strong cardiac deceleration after sound omission. Analysis of the SOs during N2 sleep revealed a reorganization of the ongoing background brain activity both when sounds occurred in synchrony with the ongoing heartbeat and at fixed temporal pace.
Auditory regularity and omission response in wakefulness
The neural response to sound omissions differed between the isoch and asynch conditions at 226 ms to 288 ms (Fig. 3A), a later onset compared to classic omission responses in the literature, observed at 100 ms to 250 ms post-stimulus onset ([12,15,16,21] and others). Two main factors can possibly explain this divergence. First, in our study, the sound-to-sound intervals were individually adjusted, in order to match the average heartbeat, which resulted in a variable SS interval across participants. Second, the neural omission response was derived from the difference between unexpected omissions in the isoch condition, inducing both auditory and temporal prediction (‘what’ and ‘when’), and in the asynch condition, which can only generate an auditory prediction (‘what’ without ‘when’) [17]. In previous reports, the omission response was instead based on the comparison of expected vs unexpected sound omissions, revealing a violation detection response typically in the range of 100 ms and up to 250 ms post-stimulus onset [15,27]. It is likely that in our analysis, the earliest of these effects was canceled as a result of having similar ‘what’ prediction about incoming sounds in both isoch and asynch. That some level of auditory prediction is also elicited in the asynch sequence is also confirmed by the differential response between the OEPs in the asynch vs baseline (figure B in S6 Fig.).
Evidence of a violation detection upon omission of expected sounds was further confirmed by the difference between isoch and baseline at 103 ms to 170 ms and 251 ms to 295 ms post-stimulus onset (figure A in S6 Fig.). The early difference (∼100ms) of central negative polarity in the isoch vs baseline comparison is consistent with previous reports on a sensory template formation at the predicted sound onset occurring in the same period as the auditory N100 response and traced back to the auditory cortex [15,16,21,28,29].
Cardio-audio regularity and omission response in wakefulness
During wakefulness, the temporal coupling between cardiac and auditory inputs in the synch condition resulted in different OHEPs (time-locked to the R peak) compared to the asynch condition (Fig. 3C) or the baseline condition (figure C in S6 Fig.). The observed centrally-located positive peak in the synch vs asynch comparison around 240 ms (Fig. 3C) closely matches previous results using a similar paradigm, where a difference of the same polarity and localization was observed at 260 ms [19]. When considering the average offset of 52 ms (i.e. average heartbeat to sound onset latency in the present study), the latencies of observed differences in the present study (peaking at synch vs asynch: 240 ms; synch vs baseline: 200 ms) overlap well with previously reported classic omission responses between 150 ms and 200 ms ([15,16,21] and others). Moreover, during the synch condition, the cardio-audio synchronicity was essential for inducing the OHEP response, as demonstrated by the lack of a differential OHEP response between the asynch and baseline conditions (time-locked to the R peak). Overall, these results suggest that the brain utilizes the temporal cue provided by the ongoing heartbeat to predict the onset of upcoming sounds.
Omission response in auditory and cardio-audio regularity in N2 sleep
During N2 sleep, we showed a neural response to violation of both auditory and cardio-audio regularity upon omission of expected sounds. First, OEPs during the isoch compared to the asynch conditions revealed a negative polarity difference at approximately 100 ms to 200 ms (Fig. 3B). Since, to the best of our knowledge, no other studies in sleep have utilized omissions to study violation detection, we can only refer to the deviant sound literature in interpreting these results. Accordingly, the observed isoch vs asynch negative polarity is in line with the finding of the classic MMN response in NREM sleep (e.g. [30–32]) and suggests an implicit tracking of the temporal relationship between sounds despite altered vigilance in N2 sleep. In the comparison of the OHEPs during the synch vs asynch condition, we found a positive and negative cluster between -99 ms and 117 ms and 322 ms and 500 ms, respectively (Fig. 3D). In order to corroborate the interpretation of the early OHEPs difference, we compared AEPs and OHEPs locked to the R peak during the synch condition (figure D in S7 Fig.), which confirmed that the only sound omission related effect was the one occurring at latencies starting at 322 ms.
Sound administration modulates SO activity in N2 sleep
The modulation of the OHEP in N2 sleep prompted us to look into the possible influence of auditory stimulation on the background oscillatory activity. Indeed, the latencies and topography unveiled in the synch vs asynch N2 comparison (i.e. fronto-central positivity at -99 ms to 117 ms and fronto-central negativity at 322 ms to 500 ms) could reflect the up and down states of SOs during N2 sleep; keeping in mind that SO peak to trough latencies typically range between 400 ms to 1000 ms (reflecting 0.5-1.2 Hz oscillations; [22–24]). Here, this influence was demonstrated by a significantly reduced median latency between the sound stimulus onset and the peak of SOs for the synch and isoch conditions compared to the asynch condition (Fig. 4C). This indicates that sound presentations induced a modulation of slow oscillatory activity in N2 sleep in our subjects in such a way that, when auditory prediction could be generated (i.e. synch and isoch), sound onset was more likely to occur close to the SO peaks. This evidence is reminiscent of the closed-loop auditory stimulation literature wherein temporary synchronization of sound stimuli to ongoing SOs induced an enhancement of the SO rhythm during NREM sleep [24–26,33]. In the present paradigm as well as in closed-loop auditory stimulation studies, the temporal proximity between sound onset and the SO positive peak suggests the existence of a preferential time window of stimulus processing which may coincide with the positive phase of the SO cycle when neuronal firing is maximal [23,34]. On this basis, auditory regularity encoding would induce a reorganization of the ongoing SO activity, in order to facilitate the neural processing of expected sounds in a sequence when sound onset can be predicted (see also [35] for consistent findings in associative learning bound to the SO peaks).
Of note, not only have sounds been shown to affect the latency of SOs in NREM sleep, but R peaks also tend to occur close to the positive peak of the SO rather than at other latencies [36,37]. With this in mind, we additionally investigated the potential impact of heartbeat signals on SOs (Fig. 4BD). We found no significant differences in the latency between R peak onset and SO peak across conditions. This finding suggests that SO latency modulation was specific to auditory regularities and not driven by a systematic temporal readjustment of the slow oscillations by the heartbeat, which would otherwise compromise our results in the synch condition where heartbeat and sound are temporally aligned.
Cardio-audio regularities result in heart rate deceleration upon sound omissions
In addition to studying whether and how the brain might track regularities across cardiac and auditory inputs, we investigated the ECG responses to sound omissions as another potential marker of auditory prediction. We found a deceleration of the heart rate during omissions across wakefulness and all sleep stages that was most evident and consistently observed in the synch condition (Fig. 5). One possible interpretation of the cardiac deceleration upon omission across vigilance states is in terms of attention reorientation following an unexpected and potentially dangerous event, a parasympathetically-driven effect often reported in conditioning paradigms [38–40]. This response can also be associated with a startle reaction and a physiological freezing response, linked to heart rate deceleration, pupil dilation and skin conductance alterations [41–45]. The heart rate deceleration upon threat detection may result from cholinergic system engagement, which modulates arousal, with similar mechanisms observed in rats when awakened from NREM sleep by activation of basal forebrain cholinergic neurons using chemical and optogenetic techniques [46,47].
Yet, another plausible explanation for the heart rate deceleration following unexpected omission in the synch condition relates to top-down adjustment of cardiac rhythm in order to account for the unexpected silence. As a sound is predicted following a heartbeat (in the synch), the omission may prolong the generation of the next heartbeat within physiologically plausible boundaries so as to ‘wait’ for a delayed auditory stimulus within the sequence, followed by a rapid readjustment to the original rhythm upon subsequent sound presentations. By construction, unlike the synch condition where the heartbeat deceleration was consistently observed, such deceleration was either reduced or absent in conditions where sound presentations bear no temporal relation to ongoing heartbeats (i.e. asynch and isoch). In a previous study investigating the cardiac response to sound omission as a function of heartbeat-to-sound onset delay and of interoceptive vs exteroceptive attention, a similar cardiac deceleration was reported only in the condition of external attention [20]. While this result seems at odds with our finding of preserved cardiac deceleration across vigilance states (and potentially attentional resources), a straightforward comparison is prevented by our lack of control of the focus of attentional resources during wake, also shown to be preserved in sleep [48,49].
Cardiac but not neural omission responses observed in N1, N3 and REM sleep
Despite previous reports of preserved deviance detection in auditory oddball paradigms during N1 [4,6,31], N3 [30] and REM [4–6] sleep, in the present study, we did not identify neural omission responses to violations of the auditory and cardio-audio regularity in these sleep stages. This is possibly explained by the low number of participants providing sufficient EEG-based artifact-free trials in N1, N3 and REM sleep (S1 Table & S2 Table), unlike N2 sleep, which occupies a much larger portion of the healthy sleep cycle. It therefore remains unresolved whether the lack of neural omission responses during these sleep stages resulted from low statistical power or from truly impaired auditory regularity processing and deviance detection.
The analysis of the cardiac response to unexpected omissions suggests that auditory regularity processing, especially when induced by the cardio-audio coupling, is preserved across all sleep stages and would possibly be detectable at the neural level provided sufficient available data. What also emerges from this observation is a possible strength of the cardiac omission response. Detecting the heart rate deceleration with a measure as easily obtained and artifact-free as the ECG signal could offer a simple and reproducible biomarker of the preservation of sensory regularity processing in health and disease. In line with this proposal, recent work posits that the analysis of heart rate fluctuations may be informative of the degree of preservation of conscious sensory processing in healthy populations and disorders of consciousness patients [50–53]. A further investigation into the origins of the cardiac omission response through the measure of pupil dilation, respiration or changes in skin conductance will help identify to what extent this response is a consequence of autonomic nervous system activity.
Limitations
First, as the OHEPs are contaminated by the heartbeat related artifacts, differential neural responses to sound omission across conditions could trivially be driven by differences between these artifacts. Nonetheless, in our study, we could reasonably exclude this confounding factor when interpreting the differences in the OHEPs because of the highly similar ECG signal waveforms during the omission period across auditory conditions and both in wakefulness and sleep (S5 Fig.). The only relevant difference was observed when comparing the baseline average ECG waveforms to that of the synch and the asynch conditions which nevertheless occurred at different latencies to those emerging from the comparison of the OHEPs. This difference could be explained by the fact that in the wakefulness session, the baseline was administered once at the start of the experiment, causing elevated heart rate and reduced heart rate variability (S4 Fig.), until participants settled with the experimental setting.
A final possible confound to mention relates to how sensory processing is influenced by the stimulus onset relative to the cardiac cycle [54–59], thought to rely on whether stimulus presentation occurs within the systole or diastole period [57,60]. Although in our case, the latencies of sound onset and periods of investigation are solely within the systole range, the slower heart rate in sleep in comparison to wakefulness may have resulted in sounds being presented at different timings within the cardiac cycle across vigilance states. Nonetheless, within the range of the systole period, we expect that sensory processing would be similar [60] both in wakefulness and sleep. In support of this, we already verified that changes of the order of 10 ms between R peak and sound onset produce similar results in the comparison between synch and asynch in wakefulness, upon consideration of the different average RS interval in the synch condition for this study, M = 52 ms, and an earlier study with M = 41 ms [19].
Key contributions
The experimental paradigm implemented herein has several advantages over previous investigations into the role of heartbeat signals in auditory regularity processing. First, the use of omissions as opposed to deviant sounds commonly employed in sleep MMN investigations (e.g. [4–6]) and interoceptive-auditory stimulus interactions [18] allows for a direct investigation of sensory regularity encoding, free from bottom-up auditory stimulus contributions. Second, this experimental paradigm enabled a counterbalanced ECG artifact between the synch, asynch and baseline conditions when contrasting the OHEPs time-locked to the R peaks. Third, by investigating both wakefulness and sleep in the same healthy volunteers, we now demonstrate that the human brain infers the temporal relationship across cardiac and auditory inputs in making predictions about upcoming auditory events across vigilance states. Finally, in the current investigation, we did not only focus on potential neural responses to omissions but additionally identified a cardiac deceleration as a result of violation detection. Importantly, the cardio-audio synchronicity created ad hoc in the experimental environment might reflect real-life readjustment of the heartbeat rhythm in order to optimize the temporal relationship between bodily signals and exteroceptive inputs for optimal sensory encoding.
Conclusion
Collectively, the present results suggest that the human brain can keep track of temporal regularities between exteroceptive inputs, and across interoceptive and exteroceptive inputs during both wakefulness and sleep. Inferring this relationship may allow humans to form predictions about upcoming sensory events and may aid violation detection when such predictions are unfulfilled. Our results replicate earlier findings in wakefulness [19] and support theories of an interoceptive predictive coding mechanism [61–63]. To our knowledge, this is the first study to investigate auditory regularity processing using omissions and to offer evidence for a potential role of interoceptive inputs under the predictive coding framework in sleep. The conscious and unconscious brain may implicitly process relationships across interoceptive and exteroceptive inputs in order to optimize the signalling and prediction of potential upcoming dangers.
Materials and Methods
Data and code availability
All the data used in the main analyses are available for review and will be made publicly available upon publication. Custom-made code used to run the quality control analyses and the EEG and ECG data analyses is available at https://github.com/DNC-EEG-platform/CardioAudio_Sleep/. Any additional information required to reanalyze the data reported in this paper is available from the corresponding authors upon request.
Ethics statement
Approval for the study was obtained by the local ethics committee (La Commission Cantonale d’Ethique de la Recherche sur l’Etre Humain), in accordance with the Helsinki declaration.
Human participants
Twenty-six self-reported good sleeper volunteers took part in both the wakefulness and sleep arms of this study. Participants were considered eligible if they had no history of psychiatric, neurological, respiratory or cardiovascular conditions, no sleep apnoea, and a regular sleep schedule, evaluated during a phone interview. Hearing conditions were an additional exclusion criterion. All participants gave written informed consent and received approximately 150 Swiss Francs as monetary compensation.
Experimental design
A two-way crossover experimental design was implemented in this study. Participants attended one wakefulness and one sleep session on two occasions separated by a minimum of one day and a maximum of ten days (wakefulness session first for 12 out of 26 participants). In both sessions, participants were instructed to passively listen to the administered sound sequences. They were naïve to the experimental manipulation, as suggested by informal verbal inquiry regarding the experimental design after the experiment. At the end of the second session, and if desired, participants were debriefed about the purpose and design of the experiment.
The sleep session recordings took place in a sound-attenuated hospital room equipped with a comfortable hospital bed to allow for overnight sleep recordings. During the sleep session, participants arrived at the laboratory at approximately 9 pm and following set-up preparation, they were instructed to lie down and inform the experimenters when they were ready to sleep. Lights were then switched off, the auditory stimulus administration commenced and the volunteers were left alone to naturally fall asleep. Although participants had the liberty to leave at any time, we explained that their inclusion in the study required a minimum of four hours of continuous data acquisition after sleep onset. They were free to choose to spend the night at the sleep laboratory and to be woken up at a desired time or by approximately 7 am the next morning, at which time lights were switched on.
In both sessions, participants were equipped with electrodes for heartbeat (ECG), eye movement (EOG), and EEG recordings (see below). For the sleep session, additional electrodes for submental electromyography (EMG) were attached, in accordance with the 2007 AASM guidelines for sleep scoring [64]. In-ear phones (Shure SE 215, Niles, IL) were utilized during both sessions instead of external headphones, in order to increase sound attenuation, subject comfort during sleep and to prevent physical contact with and thus displacement of EEG cap and electrodes. The online EEG, ECG, EOG and EMG were continuously monitored by the experimenters to ensure effective stimulus administration, data acquisition, heartbeat detection and sleep quality throughout both sessions.
Sleep onset was assessed qualitatively by monitoring the electrophysiological signals in real-time to identify well-known sleep markers in accordance with Iber et al. [64]. Specifically, sleep onset in the EEG signals was indicated by the dissipation of low-amplitude alpha (8-13 Hz) posterior activity, the appearance of high-amplitude slow wave activity (1-4 Hz) and observable K complexes or sleep spindles. In addition, the reduction in eye movements and muscle tone, as observed in the EOG and submental EMG peripheral electrophysiological signals, respectively, demonstrated that volunteers were asleep.
Stimuli
Sound stimuli were 1000 Hz sinusoidal tones of 100 ms duration (including 7 ms rise and fall times) and 0 μs inter-aural time difference. A 10 ms linear amplitude envelope was applied at stimulus onset and offset to avoid clicks. Stimuli were 16-bit stereo sounds sampled at 44.1 kHz and were presented binaurally with individually adjusted intensity to a comfortable level for wakefulness. A considerably lower than wakefulness intensity of approximately 45 dB was chosen for sleep, in order to facilitate a non-fragmented sleep session without multiple awakenings.
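As an illustration of the stimulus description above, the following is a minimal sketch (not the authors' stimulus code) that synthesizes a 1000 Hz, 100 ms tone at 44.1 kHz with a linear onset and offset ramp and duplicates it into two identical channels. The function name `make_tone`, the default 10 ms ramp, and the full-scale 16-bit conversion are assumptions made for this example; playback level calibration is not covered.

```python
import math

def make_tone(freq_hz=1000.0, dur_s=0.100, ramp_s=0.010, fs=44100):
    """Return a mono sine tone with linear on/off ramps, samples in [-1, 1].

    Hypothetical helper: parameter defaults follow the text, the ramp
    duration is left adjustable.
    """
    n = int(round(dur_s * fs))
    n_ramp = int(round(ramp_s * fs))
    samples = []
    for i in range(n):
        # Linear envelope: rises over the first n_ramp samples and
        # falls over the last n_ramp samples, flat (gain 1) in between.
        gain = min(1.0, i / n_ramp, (n - 1 - i) / n_ramp)
        samples.append(gain * math.sin(2.0 * math.pi * freq_hz * i / fs))
    return samples

tone = make_tone()
# Identical left and right channels correspond to a 0 µs inter-aural
# time difference; values are scaled to the signed 16-bit range.
stereo_16bit = [(int(round(s * 32767)),) * 2 for s in tone]
```

The envelope is applied sample by sample so that the first and last samples are exactly zero, which is what prevents audible clicks at onset and offset.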
Experimental procedure
During the wakefulness session, volunteers sat comfortably on a chair in a sound-attenuated experimental room and were instructed to keep their eyes open, avoid excessive eye blinking, body and jaw movements, and to breathe regularly; the aforementioned measures served in ensuring high signal quality. Each participant was presented with four types of stimulation conditions administered in separate experimental blocks in a pseudo-random order and was asked to passively listen to the sounds while keeping the eyes fixed on a cross centrally located in the visual field. The conditions were a baseline without auditory stimulation and three auditory conditions, namely synch, asynch and isoch. During wakefulness, the baseline lasted ten minutes and was acquired prior to auditory stimulation. During sleep, numerous two-minute baseline blocks were acquired in alternation to the sound sequences in an attempt to ensure that baseline background activity was comparable to the preceding sound stimulation during all stages of sleep. The three auditory conditions lasted five minutes each and corresponded to separate experimental blocks, which were repeated six times during wakefulness in a semi-randomized order. During the sleep session, sounds were administered for the entire length of the sleep recording in sequences of three auditory blocks always followed by a baseline (e.g. isoch-synch-asynch-baseline or synch-asynch-isoch-baseline).
Auditory conditions
All auditory conditions consisted of the sequential presentation of 250 stimuli (80% sounds and 20% omissions) administered in a pseudo-random order wherein at least one sound stimulus intervened between two subsequent omissions. Details for each auditory condition are given below and a thorough post-hoc evaluation of the experimental manipulation is provided in the Results and Supporting Information (Results 2. Quality control analysis & Fig. S1-4).
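One way to build such a sequence is sketched below, under the assumption that omissions may occupy any position (including the first or last) as long as two omissions are never adjacent. The function name `make_sequence` and the gap-sampling strategy are illustrative choices, not the authors' procedure.

```python
import random

def make_sequence(n_stimuli=250, p_omission=0.20, seed=None):
    """Return a list of 'sound'/'omission' labels with no adjacent omissions."""
    rng = random.Random(seed)
    n_om = round(n_stimuli * p_omission)
    n_snd = n_stimuli - n_om
    # There are n_snd + 1 gaps around the sounds (before, between, after).
    # Placing at most one omission per gap guarantees that at least one
    # sound separates any two omissions.
    gaps = set(rng.sample(range(n_snd + 1), n_om))
    seq = []
    for g in range(n_snd + 1):
        if g in gaps:
            seq.append("omission")
        if g < n_snd:
            seq.append("sound")
    return seq

sequence = make_sequence(seed=1)
```

Sampling gaps rather than rejecting invalid shuffles keeps the construction exact: every call returns a valid sequence with the requested proportions.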
In the synch condition, the temporal onset of each sound stimulus was triggered by the online detection of R peaks from raw ECG recordings. To enable effective online R peak detection, raw ECG recordings were analyzed in real-time using a custom MATLAB Simulink script (R2019b, The MathWorks, Natick, MA). The variance over the preceding 50 ms time window was computed and an R peak was detected when the online ECG value exceeded an individually adjusted 10–15 mV² variance threshold, which in turn triggered the presentation of a sound stimulus or an omission. This procedure resulted in a fixed R peak-to-sound average delay (i.e. RS interval) of 52 ms (SD = 5 ms) for wakefulness and sleep across participants.
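The sliding-variance trigger can be sketched as follows. This is a simplified Python illustration, not the Simulink implementation: the class name `VarianceRPeakDetector`, the use of the population variance, and in particular the refractory period (added here so that one QRS complex yields a single trigger) are assumptions that the text does not specify.

```python
from collections import deque

class VarianceRPeakDetector:
    """Flag an R peak when the variance of the last 50 ms exceeds a threshold."""

    def __init__(self, fs_hz, threshold, window_s=0.050, refractory_s=0.300):
        self.window = deque(maxlen=max(2, int(round(window_s * fs_hz))))
        self.threshold = threshold
        # Assumed refractory period: ignore crossings right after a detection.
        self.refractory = int(round(refractory_s * fs_hz))
        self.since_last = self.refractory

    def update(self, sample):
        """Feed one ECG sample; return True if a trigger should be sent now."""
        self.window.append(sample)
        self.since_last += 1
        if len(self.window) < self.window.maxlen:
            return False  # not enough history yet
        if self.since_last < self.refractory:
            return False  # still inside the refractory period
        mean = sum(self.window) / len(self.window)
        var = sum((x - mean) ** 2 for x in self.window) / len(self.window)
        if var > self.threshold:
            self.since_last = 0
            return True
        return False
```

In use, each incoming ECG sample is passed to `update`, and a `True` return value would trigger either a sound or an omission according to the predefined sequence. The threshold is in squared signal units and would need to be tuned per participant, as described in the text.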
In the asynch condition, the onset of sound presentation was based on the RR intervals extracted from a previously acquired synch block. Specifically, the ECG recorded during the preceding synch block was analyzed offline to extract RR intervals by automatic detection of R peaks and computation of RR intervals. 250 RR intervals were selected if they were above the 25th and below the 75th percentile of RR interval distribution in the synch block, in order to take into account possible missed R peaks in the online detection during the synch block. Next, RR interval order was shuffled giving rise to a predefined pseudo-random sequence closely resembling the participant’s heartbeat rhythm. By construction, differences between the synch and asynch conditions in terms of average and variance of the RR intervals were minimized, contrary to the RS interval being fixed in the synch condition and variable in the asynch condition. The variable sound to heartbeat relationship resulted in an average RS interval of −3 ms (SD = 25 ms, range = −206 to 96 ms), as well as an average SR interval of 848 ms (SD = 136 ms, range = 504–1259 ms) for wakefulness and sleep.
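The interval selection described above might be sketched as below. The text does not state how 250 intervals were obtained from the interquartile portion of a single block, so this example draws them with replacement from the retained intervals; that choice, the function name `asynch_intervals`, and the default quantile method of Python's `statistics.quantiles` are all assumptions.

```python
import random
import statistics

def asynch_intervals(rr_ms, n_keep=250, seed=None):
    """Return n_keep sound-to-sound intervals drawn from the central RR range."""
    rng = random.Random(seed)
    q1, _, q3 = statistics.quantiles(rr_ms, n=4)
    # Keep only intervals strictly between the 25th and 75th percentiles,
    # which discards doubled intervals caused by missed R peaks.
    central = [rr for rr in rr_ms if q1 < rr < q3]
    # Assumption: random draws with replacement, which also randomizes order.
    return rng.choices(central, k=n_keep)

def onsets_from_intervals(intervals_ms):
    """Cumulative sum of intervals gives stimulus onset times from t = 0."""
    onsets, t = [], 0.0
    for iv in intervals_ms:
        onsets.append(t)
        t += iv
    return onsets
```

Because the resulting onsets are fixed in advance, they bear no systematic relation to the heartbeats occurring while the asynch block is played.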
In the isoch condition, the onset of sound presentations was based on the median RR interval calculated during a previously acquired synch block. This procedure produced similar sound-to-sound intervals across the synch, asynch and isoch conditions. However, unlike the synch and asynch conditions, sound-to-sound intervals in the isoch condition had low variability (SS variability, i.e. SEM of the SS interval: 1 ms for wakefulness and 3 ms for sleep across participants). Similar to asynch, in the isoch condition, the variable sound to heartbeat relationship resulted in an average RS interval of −1 ms (SD = 32 ms, range = −116 to 165 ms) in addition to an average SR interval of 853 ms (SD = 141 ms, range = 516–1263 ms) for wakefulness and sleep.
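For comparison with the other conditions, the isochronous schedule reduces to a single fixed interval. A minimal sketch, assuming onsets are counted from the start of the block and using the hypothetical name `isoch_onsets_ms`:

```python
import statistics

def isoch_onsets_ms(rr_ms, n_stimuli=250):
    """Return stimulus onsets spaced by the median RR interval of a synch block."""
    ss_ms = statistics.median(rr_ms)  # fixed sound-to-sound interval
    return [i * ss_ms for i in range(n_stimuli)]
```

Using the median rather than the mean makes the interval robust to occasional doubled RR values from missed R peaks.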
Data acquisition
Continuous EEG (g.HIamp, g.tec medical engineering, Graz, Austria) was acquired at 1200 Hz from 63 active ring electrodes (g.LADYbird, g.tec medical engineering) arranged according to the international 10–10 system and referenced to the right ear lobe. Biophysical data were acquired using single-use Ag/AgCl electrodes. Three-lead ECG was recorded by attaching two electrodes (a third was a reference) to the participant’s chest on the infraclavicular fossae and above the heart. A vertical EOG electrode was attached below the right eye and a horizontal EOG electrode was attached to the outer right canthus. Since changes in muscle tone are associated with alterations in consciousness during sleep and are an essential marker for effective sleep staging [64], EMG was additionally acquired sub-mentally during the sleep session alone. Impedances of all active electrodes were kept below 50 kΩ. All electrophysiological data were acquired with an online band-pass filter between 0.1 and 140 Hz and a band-stop filter between 48 and 52 Hz to reduce electrical line noise.
Sleep scoring
An experienced sleep scoring specialist (Somnox SRL, Belgium), blind to the experimental manipulation in this study, performed the scoring of the continuous sleep electrophysiological data in order to pinpoint the periods of wakefulness and micro-arousal as well as periods of N1, N2, N3 and REM sleep. Sleep scoring was performed via visual inspection of contiguous 30-second segments of the EEG, EOG and EMG time-series, as outlined in the 2007 AASM guidelines for sleep scoring [64]. Segments scored as periods of wakefulness or micro-arousals in the sleep recordings were excluded from further analysis.
Data analysis
Electrophysiological data analyses were performed in MATLAB (R2019b, The MathWorks, Natick, MA) using open-source toolboxes EEGLAB (version 13.4.4b, [65]), Fieldtrip (version 20201205, [66]), as well as using custom-made scripts. Raincloud plots were generated using the Raincloud plot toolbox [67].
R peak detection
The R peaks in the continuous raw ECG signal were selected offline using a semi-automated approach as in Pfeiffer & De Lucia [19]. The custom-made MATLAB script, rpeakdetect.m (https://ch.mathworks.com/matlabcentral/fileexchange/72-peakdetect-m/content/peakdetect.m), was utilized to automatically identify the sharp R peaks in the raw ECG signal. Visual inspection of the online and offline detected R peaks ensured that the selected peaks fitted within the expected structure of the QRS complex in the continuous raw ECG signal. Frequent flawed online identification of the R peaks or faulty auditory stimulus presentation in a given block resulted in the exclusion of the block from a given participant’s dataset. For blocks that were included, unrealistic RR, RS and SR interval values, observed as a result of infrequent flawed offline marking of R peaks, were identified and excluded using the rmoutliers MATLAB function (R2019b, The MathWorks, Natick, MA) and visual inspection of the detected R peaks and selected outliers.
Quality control analysis
A series of control analyses was performed to verify that the experimental manipulation produced the expected RS, RR and SS means and variances, and to check for possible confounding factors. RR and SS intervals (for sound trials preceded by sound trials) were extracted for the synch, asynch, isoch and baseline conditions where relevant. In addition, RS intervals for sound trials and SR intervals for omission trials were computed to quantify the degree of cardio-audio synchronization and heartbeat onset variability during sound omission, respectively. Variability in the same interval measures, computed as the SEM, was additionally investigated. Of note, unlike RR, SR and RS variability, SS variability was first computed within a given experimental block and then across experimental blocks, to account for ongoing changes in RR intervals (and hence SS intervals) in sleep. Non-parametric one-way repeated-measures Friedman tests (p<0.05) were performed on the average RR, SS, RS and SR intervals and variabilities of each participant for wakefulness and all sleep stages with within-subject factor Condition (3 levels for SS, RS, SR intervals and variability: synch, asynch, isoch; 4 levels for RR intervals and variability: synch, asynch, isoch, baseline). Post-hoc paired Wilcoxon signed-rank tests (p<0.05) identified any significant pairwise comparisons (no multiple comparisons correction was applied since pairwise differences were of interest).
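The interval measures can be illustrated with a short sketch. Here the RS interval is taken as the signed latency from the nearest R peak to each sound and the SR interval as the latency from a stimulus to the next R peak; the text reports negative RS values in the asynch and isoch conditions, but the exact convention used by the authors is an assumption, as are the function names.

```python
import bisect
import math
import statistics

def rs_intervals(r_peaks_ms, sounds_ms):
    """Signed latency (sound minus nearest R peak); r_peaks_ms must be sorted."""
    out = []
    for s in sounds_ms:
        i = bisect.bisect_left(r_peaks_ms, s)
        # Candidates: the R peak just before and just after the sound.
        candidates = r_peaks_ms[max(0, i - 1): i + 1]
        nearest = min(candidates, key=lambda r: abs(s - r))
        out.append(s - nearest)
    return out

def sr_intervals(r_peaks_ms, stimuli_ms):
    """Latency from each stimulus to the next R peak, skipped if none follows."""
    out = []
    for s in stimuli_ms:
        i = bisect.bisect_right(r_peaks_ms, s)
        if i < len(r_peaks_ms):
            out.append(r_peaks_ms[i] - s)
    return out

def sem(values):
    """Standard error of the mean, the variability measure named in the text."""
    return statistics.stdev(values) / math.sqrt(len(values))
```

In a synchronous block the RS values cluster tightly around the trigger delay, so their SEM is small, whereas fixed or shuffled schedules spread the sounds across the cardiac cycle.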
To ensure no significant differences in the ECG signal across experimental conditions, ECG waveforms during omissions were extracted from ECG recordings between -100 ms and 500 ms relative to R peak onset, matching the EEG trial-based analysis (see below, EEG Data Analysis). The non-parametric cluster-based permutation statistical analysis approach [68] was employed to investigate ECG waveform differences between the various experimental conditions outlined herein. In order to reject the null hypothesis that no significant differences existed in the given set of experimental conditions being contrasted, maximum cluster-level statistics were determined by shuffling condition labels (5000 permutations), allowing for a chance-based distribution of maximal cluster-level statistics to be estimated. Since maxima are utilized by this method, it enables the correction of multiple comparisons over time. A two-tailed Monte-Carlo p-value allowed for the definition of a threshold of significance from the distribution (p<0.05, two-tailed).
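The logic of the cluster-based permutation approach can be sketched for a single time series per participant. This is a simplified illustration, not the FieldTrip implementation used in the study: it assumes a paired (within-subject) contrast, for which shuffling condition labels is equivalent to randomly flipping the sign of each participant's difference wave, and it uses an arbitrary cluster-forming threshold `t_crit`. All function names are hypothetical.

```python
import math
import random

def t_values(diffs):
    """One-sample t statistic per time point; diffs is [subject][time]."""
    n = len(diffs)
    ts = []
    for col in zip(*diffs):
        mean = sum(col) / n
        var = sum((x - mean) ** 2 for x in col) / (n - 1)
        ts.append(mean / math.sqrt(var / n) if var > 0 else 0.0)
    return ts

def cluster_masses(ts, t_crit):
    """Sum t over runs of adjacent supra-threshold points of the same sign."""
    masses, run, sign = [], 0.0, 0
    for t in ts:
        s = (t > t_crit) - (t < -t_crit)
        if sign != 0 and s != sign:
            masses.append(run)  # the current cluster has ended
            run = 0.0
        if s != 0:
            run += t
        sign = s
    if sign != 0:
        masses.append(run)
    return masses

def cluster_permutation_test(cond_a, cond_b, t_crit=2.0, n_perm=5000, seed=0):
    """Return (cluster mass, two-tailed Monte-Carlo p) for each observed cluster."""
    rng = random.Random(seed)
    diffs = [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(cond_a, cond_b)]
    observed = cluster_masses(t_values(diffs), t_crit)
    null_max = []
    for _ in range(n_perm):
        flipped = []
        for row in diffs:
            sgn = rng.choice((-1, 1))  # swap condition labels for this subject
            flipped.append([sgn * x for x in row])
        masses = cluster_masses(t_values(flipped), t_crit)
        # Keep only the largest absolute cluster mass of each permutation.
        null_max.append(max((abs(m) for m in masses), default=0.0))
    return [(m, sum(nm >= abs(m) for nm in null_max) / n_perm) for m in observed]
```

Because each observed cluster is compared against the distribution of the maximum cluster mass, a single test covers all time points, which is how the method controls for multiple comparisons over time.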
EEG data analysis
Continuous raw EEG data were band-pass filtered using second-order Butterworth filters between 0.1 and 40 Hz for the wakefulness session and 0.5 and 30 Hz for the sleep session. A larger high pass cut-off was chosen for sleep electrical signals in order to exclude periodic breathing and sweating artifacts which may occur in that frequency range in sleep and a lower low-pass cut-off was employed in agreement with the sleep literature (e.g. [24]). To look specifically at slow oscillatory activity, raw electrophysiological data were band-pass filtered between 0.5 and 4 Hz, a well-established frequency range for investigations of slow-wave activity in the delta band range [69].
Ocular and muscular activity in the continuous EEG was identified for wakefulness data by performing Independent Component Analysis (ICA) as implemented in Fieldtrip [65,66] and any selected eye movement or muscle activity related components were removed. We refrained from applying the ICA algorithm to the continuous sleep EEG data since changes in periodic eye movements and muscle activity form an integral part of the human sleep cycle. Therefore, contrary to wakefulness, excluding ocular and muscular contributions to the acquired EEG signal may have resulted in loss of relevant activity in sleep electrophysiology.
Wakefulness and sleep continuous EEG data were epoched between -100 ms and 500 ms relative to the selected event onset. This range was chosen in reflection of previously reported classic omission response latencies ([15,16,21] and others) and importantly, to ensure no overlap between consecutive auditory stimulation trials (as in [19], assuming a range of 60-100 beats per minute across volunteers). Various event onsets were selected, namely, heartbeat onset, sound onset and omission onset, depending on the comparison performed (Results 3. Neural omission response). Of note, while the isoch condition had fixed SS intervals, this was not the case in the asynch condition. To test whether sound prediction was based on computing the time of maximal sound occurrence probability, average sound-based omission responses were calculated by extracting epochs from the continuous EEG asynch and isoch recordings, now time-locked to the latency occurring during the omission at the average SS interval.
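The epoching step amounts to cutting fixed windows around event samples. A minimal single-channel sketch, assuming the 1200 Hz acquisition rate reported above and discarding events whose window would run past the recording edges (the edge handling and the name `extract_epochs` are assumptions):

```python
def extract_epochs(signal, event_samples, fs_hz=1200, t_min_s=-0.100, t_max_s=0.500):
    """Cut [t_min_s, t_max_s) windows around each event from a 1-D signal."""
    start_off = int(round(t_min_s * fs_hz))  # negative: samples before the event
    stop_off = int(round(t_max_s * fs_hz))
    epochs = []
    for ev in event_samples:
        lo, hi = ev + start_off, ev + stop_off
        # Skip events too close to the start or end of the recording.
        if lo >= 0 and hi <= len(signal):
            epochs.append(signal[lo:hi])
    return epochs
```

At 1200 Hz the window spans 720 samples, with the event itself at index 120 of each epoch. A 600 ms window stays within one cardiac cycle for heart rates up to 100 beats per minute, which matches the stated rationale of avoiding overlap between consecutive trials.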
The resting-state baseline recordings acquired as part of the experimental procedure (without auditory stimulation) were utilized to extract EEG evoked potentials with matched event onsets to auditory stimulation trials and epoched between -100 ms and 500 ms. First, we extracted HEPs based on heartbeat onset in the resting state for upcoming statistical comparisons to the synch and asynch OHEP. Second, in the absence of sound stimulation, the baseline allowed for comparing the OEPs of the isoch and asynch conditions to a control condition. In this case, a random selection of epochs was extracted from continuous baseline recordings, such that the latencies between epoch onset and closest heartbeat (i.e. R peak) were matched on the single-trial level to the trial onsets in the sound-based isoch and asynch conditions. Upcoming pre-processing steps were replicated for auditory stimulation and baseline trials, for each participant, and for wakefulness and all sleep stages as follows.
Artifact electrodes and trials were identified using a semi-automated approach for artifact rejection as implemented in Fieldtrip [66]. Noisy EEG electrodes were excluded based on a signal variance criterion (3 z-score Hurst exponent) and substituted with data interpolated from nearby channels using spherical splines [70]. Across eligible participants, an average of 7.3 (SD = 2.2) electrodes (M = 11.7%, SD = 3.6%) were interpolated in wakefulness and an average of 8.9 (SD = 1.1) electrodes (M = 14.4%, SD = 1.7%) were interpolated in sleep. Extended recording lengths, participant sweating, and head and body movement during the night could account for the greater number of artifactual electrodes observed in sleep recordings. In wakefulness data, trials containing physiological artifacts (e.g. eye movement, excessive muscle activity) not accounted for by ICA were identified by visual inspection and by applying a 70 μV absolute value cut-off to the EEG signal amplitude, and were excluded from further analysis. A higher absolute value threshold of 300 μV was employed for the selection of artifactual epochs in sleep, in order to prevent the exclusion of high-amplitude slow wave activity [71]. Extreme outliers in signal kurtosis and variance were additional criteria for the rejection of artifactual trials from sleep recordings. Finally, common average re-referencing was applied.
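The amplitude criterion can be illustrated with a short Python sketch. It assumes each trial is a single-channel list of amplitudes in μV; how the cut-off was applied across channels is not specified here, and the visual inspection and kurtosis/variance criteria are not reproduced. The names `keep_by_amplitude`, `WAKE_CUTOFF_UV` and `SLEEP_CUTOFF_UV` are hypothetical.

```python
WAKE_CUTOFF_UV = 70.0    # absolute-value cut-off used for wakefulness trials
SLEEP_CUTOFF_UV = 300.0  # higher cut-off used for sleep trials

def keep_by_amplitude(epochs, cutoff_uv):
    """Return indices of epochs whose absolute amplitude stays within the cut-off.

    Assumption: each epoch is a list of amplitudes (in microvolts) from one channel.
    """
    return [i for i, epoch in enumerate(epochs)
            if max(abs(v) for v in epoch) <= cutoff_uv]
```

With this sketch, a trial peaking at 150 μV would be rejected under the wakefulness cut-off but retained under the sleep cut-off, which is the intended behavior for preserving high-amplitude slow waves.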
The 30-second sleep stage labeled epochs were used to label artifact-free trials for all event onset types and for all three auditory conditions and the baseline. Hence, five different sets of trials formed the final processed dataset per participant for each of the five vigilance states: AWAKE, N1, N2, N3 and REM sleep where available (S1 Table & S2 Table). We note that we chose not to employ pre-stimulus baseline correction in wakefulness or sleep trials. In addition, since one-way ANOVAs on artifact-free trial numbers for each evoked response of interest confirmed that no significant differences (p>0.05) existed across auditory conditions alone, all trials were kept in auditory condition comparisons. Conversely, since significantly fewer trials (p<0.05) were acquired for the baseline than for the three auditory conditions, trial numbers were matched between auditory conditions and baselines for each participant and each vigilance state. The number of available trials for each vigilance state, averaged across experimental conditions and event onsets of interest, was AWAKE: M = 245, SD = 26 trials; N1: M = 97, SD = 22 trials; N2: M = 428, SD = 184 trials; N3: M = 159, SD = 72 trials; REM: M = 163, SD = 65 trials. Finally, grand average evoked responses to sound omissions were derived.
EEG statistical analysis
EEG statistical analyses were performed in MATLAB (R2019b, The MathWorks, Natick, MA) using Fieldtrip (version 20201205, [66]), as well as custom-made scripts. For the analysis of the electrophysiological signals during the wakefulness and sleep sessions, we imposed a minimum of 60 artifact-free trials, chosen based on the signal-to-noise ratio required to meaningfully interpret EEG statistical analysis results (Luck, 2005).
The non-parametric cluster-based permutation statistical analysis approach [72] was employed to investigate sensor-level EEG-based differences between the various experimental conditions outlined herein. Under this statistical framework, statistically significant individual data samples were grouped based on the degree of their shared spatial and temporal characteristics. The resulting clusters were statistically evaluated by summing the t-values of all samples forming a given cluster. In order to reject the null hypothesis that no significant differences existed in the given set of experimental conditions being contrasted, maximum cluster-level statistics were determined by shuffling condition labels (5000 permutations), allowing for a chance-based distribution of maximal cluster-level statistics to be estimated. A two-tailed Monte-Carlo p-value allowed for the definition of a threshold of significance from the distribution (p<0.05, two-tailed). In wakefulness, this procedure was performed from 0 ms to 350 ms relative to the event onset of interest (heartbeat onset for OHEP comparisons and expected sound onset based on average SS interval for OEP comparisons). This time window was chosen based on a previous publication from our laboratory [19], which suggested that changes of interest occurred within the selected period. Conversely, in sleep and in the absence of prior knowledge specific to this paradigm, we chose to apply cluster-based permutation statistics over the entire epoch length from -100 ms to 500 ms relative to event onset. Finally, in order to evaluate the size of the observed effects, the Cohen’s d statistic was calculated at the latencies of the largest significant clusters [73].
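The logic of the cluster-based permutation test can be sketched in Python for a simplified case. The sketch assumes a paired (within-participant) contrast at a single electrode, so clusters are formed over adjacent time samples only, and spatial adjacency across sensors is omitted. For paired data, shuffling condition labels within a participant amounts to flipping the sign of that participant's difference wave. The function names, the caller-supplied critical t-value `t_thresh` (e.g. 2.262 for 10 participants) and the `(count + 1) / (n_perm + 1)` p-value estimate are assumptions of the example, not the Fieldtrip implementation.

```python
import math
import random

def paired_t(diffs):
    """One-sample t-value of paired differences (one value per participant)."""
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    if var == 0:
        return 0.0 if mean == 0 else math.copysign(math.inf, mean)
    return mean / math.sqrt(var / n)

def max_cluster_stat(diff_matrix, t_thresh):
    """Largest absolute summed t-value over runs of adjacent supra-threshold samples."""
    n_times = len(diff_matrix[0])
    tvals = [paired_t([row[t] for row in diff_matrix]) for t in range(n_times)]
    best, current, sign = 0.0, 0.0, 0
    for t in tvals:
        s = (t > t_thresh) - (t < -t_thresh)  # +1, -1 or 0
        if s != 0 and s == sign:
            current += t            # extend the running cluster
        elif s != 0:
            current, sign = t, s    # start a new cluster
        else:
            current, sign = 0.0, 0  # sample below threshold: no cluster
        best = max(best, abs(current))
    return best

def cluster_permutation_test(cond_a, cond_b, t_thresh, n_perm=5000, seed=0):
    """Two-tailed Monte-Carlo p-value for the largest temporal cluster.

    cond_a and cond_b are lists (participants) of lists (time samples).
    """
    rng = random.Random(seed)
    diffs = [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(cond_a, cond_b)]
    observed = max_cluster_stat(diffs, t_thresh)
    exceed = 0
    for _ in range(n_perm):
        flipped = []
        for row in diffs:
            flip = rng.choice((-1, 1))  # swap condition labels for this participant
            flipped.append([flip * d for d in row])
        if max_cluster_stat(flipped, t_thresh) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)
```

Taking the maximum cluster statistic in each permutation is what controls the family-wise error rate across time samples, since every observed cluster is compared against the largest cluster obtainable by chance.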
SO data analysis
To examine whether N2 sleep EEG statistical analysis results could have been influenced by the relationship between sounds and SO activity [24,26,33] and/or between heartbeats and SO activity [36] in NREM sleep, we identified SOs in N2 sleep artifact-free EEG data. We then computed the latency at which the SOs occurred relative, first, to sound presentations in the auditory conditions and, second, to heartbeats in the auditory conditions and baseline.
SOs in the continuous EEG time-series were detected across experimental conditions over frontocentral electrode Cz where a high probability for SO detection is to be expected [22]. We marked SOs based on the method described in Ngo et al. [24] and in Besedovsky et al. [33]. In brief, the 0.5 to 30 Hz band-pass filtered data were downsampled from 1200 Hz to 100 Hz and a low pass finite impulse response filter of 3.5 Hz was used in order to improve the detection of SO components. Next, N2 sleep labeled segments were extracted for the identification of SOs. Consecutive positive-to-negative zero crossings were identified and selected only if their temporal distance was between 0.833 s and 2 s, yielding a frequency between 0.5 Hz and 1.2 Hz for the designated oscillations. Negative and positive peak potentials in the oscillations were defined as the minima and maxima present within eligible consecutive positive-to-negative zero crossings. The mean negative and mean positive-to-negative amplitude differences were calculated across selected oscillations at electrode Cz. For each identified oscillation that satisfied the frequency criterion, if the negative amplitude was 1.25 times lower than the mean negative amplitude and the positive-to-negative amplitude difference was 1.25 times higher than the mean positive-to-negative amplitude difference, the oscillation was marked as a SO.
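The detection criteria above can be sketched in Python as follows. The sketch assumes the input is a single-channel list that has already been filtered and downsampled to 100 Hz; filtering and sleep-stage selection are not reproduced. The function name, dictionary keys and default arguments are hypothetical, and this is an illustration of the criteria rather than the code used in the study.

```python
def detect_slow_oscillations(signal, fs=100.0, min_dur=0.833, max_dur=2.0, factor=1.25):
    """Mark slow oscillations between consecutive positive-to-negative zero crossings.

    Assumption: `signal` is a list of low-pass filtered samples from one electrode.
    """
    crossings = [i for i in range(1, len(signal))
                 if signal[i - 1] > 0 and signal[i] <= 0]
    candidates = []
    for start, stop in zip(crossings[:-1], crossings[1:]):
        duration = (stop - start) / fs
        if not (min_dur <= duration <= max_dur):
            continue  # outside the 0.5 to 1.2 Hz range
        segment = signal[start:stop]
        neg_peak, pos_peak = min(segment), max(segment)
        candidates.append({
            "start": start,
            "stop": stop,
            "neg": neg_peak,
            "p2p": pos_peak - neg_peak,
            "pos_peak_index": start + segment.index(pos_peak),
        })
    if not candidates:
        return []
    mean_neg = sum(c["neg"] for c in candidates) / len(candidates)
    mean_p2p = sum(c["p2p"] for c in candidates) / len(candidates)
    # Keep oscillations whose trough is deeper than 1.25 x the mean trough and
    # whose peak-to-peak amplitude exceeds 1.25 x the mean peak-to-peak amplitude.
    return [c for c in candidates
            if c["neg"] < factor * mean_neg and c["p2p"] > factor * mean_p2p]
```

Because both thresholds are relative to the mean across candidate oscillations, the criterion adapts to each recording's overall amplitude rather than relying on a fixed microvolt value.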
The positive half-wave peak time point was chosen as the representative latency for each SO in light of relevant literature [23,34] and following visual inspection of the SO and sound presentation time-series. Sound to SO latency for the synch, asynch and isoch conditions and R peak to SO latency for the synch, asynch, isoch and baseline conditions were computed for all artifact-free sound trials. Omission trials were excluded from this analysis since in this case, we were interested in how sounds and the sound to heartbeat relationship may modulate SOs in N2 sleep. Latencies were considered valid only if they were between -800 ms and 800 ms, a range chosen to minimize potential contamination of the sound or R peak to SO relationship by upcoming sounds or heartbeats. Median latencies were calculated for each subject and condition separately.
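A minimal Python sketch of the latency computation is given below. Two choices are assumptions of the example rather than details stated above: each event (sound or R peak) is paired with the nearest SO positive peak, and latency is signed as SO peak time minus event time. The function name is hypothetical.

```python
from statistics import median

def median_event_to_so_latency(event_times, so_peak_times, max_abs_latency=0.8):
    """Median latency (s) between events and their nearest SO positive peak.

    Assumptions: nearest-peak pairing and the sign convention (peak minus event).
    Latencies outside +/- max_abs_latency are discarded as invalid.
    """
    if not so_peak_times:
        return None
    latencies = []
    for event in event_times:
        nearest = min(so_peak_times, key=lambda peak: abs(peak - event))
        latency = nearest - event
        if -max_abs_latency <= latency <= max_abs_latency:
            latencies.append(latency)
    return median(latencies) if latencies else None
```

The function would be called once per participant and condition, which yields the per-subject median latencies entered into the subsequent statistics.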
SO statistical analysis
In N2 sleep, a non-parametric 1×3 repeated measures Friedman test was calculated on the median sound to SO latencies at electrode Cz with within-subject factor Auditory Condition (synch, asynch, isoch) and a 1×4 repeated measures Friedman test was computed on the median heartbeat to SO latencies with within-subject factor Condition (synch, asynch, isoch, baseline). Pairwise comparisons were calculated using post-hoc paired Wilcoxon signed-rank tests between all investigated within-subject variables (no multiple comparisons correction was applied since pairwise differences were of interest). Latency trial numbers were matched for the median heartbeat to SO latency comparison but not for the sound to SO latency comparison, since in the former, baseline recordings were significantly shorter than auditory condition recordings (as confirmed by 1×4 and 1×3 repeated measures ANOVAs on latency trial numbers, which differed significantly only across heartbeat to SO trials and not sound to SO trials).
ECG data analysis
The identified R peaks were used to derive omission trial related RR intervals. The representative R peak for an omission trial was selected as the first R peak following an omission in the continuous ECG signal. The omission RR Interval (RRom) was therefore calculated as the latency between the selected omission R peak and the R peak immediately preceding it. In order to investigate potential heartbeat alterations associated with the omission trial, additional RR intervals were identified using contiguous R peaks to reflect the RR interval for the trial prior to omission (RR-1) and up to two trials following omission (RR+1 and RR+2). Omission trials were considered only if they were followed by at least two sound stimuli to ensure no overlap between investigated RR intervals.
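The RR interval definitions can be illustrated with a short Python sketch. It assumes a sorted list of R peak times and a single omission time in the same units. The function name and the returned dictionary keys are hypothetical, and the check that two sound stimuli follow the omission is not included.

```python
from bisect import bisect_right

def omission_rr_intervals(r_peaks, omission_time):
    """RR intervals around one omission, or None if too few R peaks surround it.

    Assumption: `r_peaks` is a sorted list of R peak times.
    """
    k = bisect_right(r_peaks, omission_time)  # first R peak after the omission
    if k < 2 or k + 2 >= len(r_peaks):
        return None
    return {
        "RR-1": r_peaks[k - 1] - r_peaks[k - 2],  # interval before the omission
        "RRom": r_peaks[k] - r_peaks[k - 1],      # interval containing the omission
        "RR+1": r_peaks[k + 1] - r_peaks[k],      # first interval after
        "RR+2": r_peaks[k + 2] - r_peaks[k + 1],  # second interval after
    }
```

For example, with R peaks at 0, 800, 1600, 2500, 3350 and 4150 ms and an omission at 1700 ms, RRom is 900 ms, longer than the preceding RR-1 of 800 ms, which is the pattern expected for a heartbeat deceleration upon omission.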
The 30-second sleep stage labeled epochs were used to label omission trials for all auditory conditions. For the analysis of the ECG signals during the wakefulness and sleep sessions, we imposed a minimum of 30 artifact-free trials for each condition of interest. Hence, different sets of RR intervals formed the final dataset per participant and consisted of five vigilance states: AWAKE, N1, N2, N3 and REM sleep where available (S1 Table & S2 Table). One-way ANOVAs on RR interval quantities showed no significant differences (p>0.05) in trial numbers across auditory conditions within each vigilance state; therefore, all RR intervals were included in the statistical analysis. RR intervals for each omission trial order type were averaged for each auditory condition, participant and vigilance state.
ECG statistical analysis
First, 3×4 repeated measures ANOVAs were performed on the average RR intervals with within-subject factors Auditory Condition (synch, asynch, isoch) and Trial Order (one trial before, trial during, first trial after, second trial after sound omissions) for the wakefulness session (AWAKE) and each of the sleep stages (N1, N2, N3, REM) separately. Second, for a sub-set of participants (N=6) who had sufficient trials across wakefulness and all sleep stages, a 5×3×4 repeated measures ANOVA was performed on the average RR intervals with within-subject factors Vigilance State (AWAKE, N1, N2, N3, REM), Auditory Condition (synch, asynch, isoch) and Trial Order (one trial before, trial during, first trial after, second trial after sound omission). Of note, unlike comparisons within vigilance states, trial numbers were matched across vigilance states by randomly selecting a subset of RR intervals based on the minimum number (⩾30 trials) of available RR intervals across participants to ensure accurate comparison across vigilance states. In order to reject the null hypothesis that no significant differences existed in the given set of ANOVA factors being compared, each of the factors’ condition labels were shuffled (2000 permutations), allowing the chance-based distribution of maximal cluster-level statistics to be estimated. Wilcoxon signed-rank tests (p<0.05) were used to evaluate whether significant differences existed between the distribution of post-permutation F values and the true main effect and interaction effect F values. Post-hoc Wilcoxon signed-rank tests with Bonferroni correction for multiple comparisons (p<0.018 for auditory condition and p<0.013 for omission trial order comparisons) were utilized for pairwise comparisons between all investigated within-subject variables.
Funding Sources
This work was supported by a Spark SNSF Grant (no. 196194) awarded to M.D.L.
Declaration of interests
The authors declare no competing interests.
Author contributions
A.P., C.P., S.S. and M.D.L. designed the experiment and conceived the analysis. A.P. conducted the experiment and acquired the data. A.P. and M.D.L. analyzed the data and wrote the manuscript. C.P. and S.S. edited the manuscript.
Acknowledgments
We express our gratitude to Nathalie Nguepnjo Nguissi, Elsa Dosi and Diana Ortolani for assistance during data acquisition. We extend our thanks to Dr Aurore Guyon Postalci from SOMNOX for performing the stage scoring of the sleep data. We thank Rupert Ortner and the g.tec team for technical support.
Footnotes
1. A reorganization of the manuscript structure to improve the logical sequence of hypotheses and experimental questions.
2. A complete data re-analysis based on non-parametric statistical methods, which largely confirmed our previous results. In addition, the inclusion of effect sizes for the main results strengthens the significance of the findings.
3. A new set of data analyses to confirm the interpretability of the neural omission response in terms of neural correlates of sound omissions.
4. An improved results visualization based on raincloud plots and a reorganization of the figures.