Cross-modal temporal biases emerge during early sensitive periods

Human perception features stable biases, such as perceiving visual events as later than synchronous auditory events. The origin of such perceptual biases is unknown; they could be innate or shaped by sensory experience during a sensitive period. To investigate the role of sensory experience, we tested whether a congenital, transient loss of vision, caused by bilateral dense cataracts, has sustained effects on the ability to order events spatio-temporally within and across sensory modalities. Most strikingly, individuals with reversed congenital cataracts showed a bias towards perceiving visual stimuli as occurring earlier than auditory (Exp. 1) and tactile (Exp. 2) stimuli. In contrast, both normally sighted controls and individuals who could see at birth but developed cataracts during childhood reported the typical bias of perceiving vision as delayed compared to audition. Thus, we provide strong evidence that cross-modal temporal perceptual biases depend on sensory experience and emerge during an early sensitive period.


Introduction
At every moment, a multitude of information reaches our brain through the different senses. These sensory inputs need to be separated and ordered in space and time to derive a coherent representation of the environment. Yet, the perception of temporal order is seldom veridical. Reports illustrating the subjectivity of cross-modal temporal perception date back to 18th and 19th century astronomers; small but stable individual biases in the perceived timing of visual and auditory events caused significant differences in the measurements of stellar transit times and subsequent scientific disputes. These early reports inspired the pioneering work of Wilhelm Wundt and promoted the role of perceptual biases as a major but still unresolved topic in experimental psychology 1,2.
Determining the spatio-temporal order of events across sensory modalities poses an especially difficult challenge, as information arriving through different senses travels at different speeds, both in the environment and within the nervous system 3. However, the typical bias towards perceiving vision as delayed compared to audition cannot be explained by these physical and physiological speed differences 4,5,6.
The perceived temporal order of visual-auditory events can be transiently shifted by exposing humans to a series of asynchronous stimulus pairs with a constant lag between vision and audition 7,8. Yet, such recalibration effects quickly vanish, and in the long term cross-modal temporal biases are highly stable within individuals 1,6. This co-existence of short-term plasticity and long-term stability re-emphasizes the question that already troubled scientists 250 years ago: why does the brain not learn to compensate for such perceptual biases?
Given their long-term stability, biases in cross-modal temporal perception could either be innate, inherent to the structure of the underlying neural mechanisms, or emerge based on sensory experience, potentially during a sensitive period of development. The ability to optimally order spatially and temporally distinct events across sensory modalities develops only in late childhood and overall later than within one sensory modality 9. This late maturation points toward the possibility that cross-modal spatio-temporal biases are shaped by sensory experience accumulated during childhood. The role of early sensory experience for the genesis of perceptual biases is extremely difficult to address in humans; only naturally altered developmental sensory environments open a window into the influence of experience on perceptual development. We tested the hypothesis of a sensitive period for the emergence of cross-modal perceptual biases by measuring the ability to order events spatio-temporally across vision, audition, and touch in individuals born with dense, bilateral cataracts whose sight was restored 6 to 168 months after birth.

Results
Thirteen individuals with a history of transient, congenital, bilateral, dense cataracts (CC) participated in two spatial temporal order judgement tasks, ten in a visual-auditory task (Exp. 1) and ten in a visual-tactile task (Exp. 2). To test for the role of vision during infancy as well as to control for the role of persisting visual impairments, sixteen individuals, nine per experiment, who underwent surgery for cataracts which had developed after birth during childhood (DC) served as controls, in addition to age-matched typically sighted individuals (MCC and MDC). In every trial, two successive stimuli were presented, one in each hemifield. Visual-auditory and visual-tactile stimulus pairs were randomly interleaved with unimodal stimulus pairs. Participants reported the side of the first stimulus irrespective of its modality 5. We predicted preferential processing of, and consequently an increased bias toward, the auditory and tactile modality as well as a lower visual and cross-modal spatio-temporal resolution in the CC-group.

Discussion
We investigated whether cross-modal perceptual biases are innate or acquired during a sensitive phase in early childhood by testing sight-recovery individuals with a history of congenital visual loss. In two spatio-temporal order judgement tasks (visual-auditory and visual-tactile) CC-individuals reported visual stimuli as earlier than both auditory and tactile stimuli, exhibiting a reversed cross-modal bias compared to both their controls and sight-recovery individuals whose cataracts had developed later. Moreover, the ability to determine the spatio-temporal order of separate events across different sensory modalities and within vision was reduced after a transient phase of visual loss at any time during childhood. These results, for the first time, demonstrate that cross-modal perceptual biases are not innate but rather acquired during a sensitive period.
At first glance, CC-individuals' bias to perceive visual events as earlier than either auditory or tactile events seems to indicate that visual stimuli are processed faster than auditory and tactile stimuli following transient, congenital visual deprivation. However, CC-individuals' spatio-temporal resolution revealed a temporal processing impairment for vision but not for audition and touch. Consistently, event-related potentials have provided no evidence for earlier visually evoked brain activity in CC-individuals 10,11 and behavioral studies have demonstrated no visual advantage in reaction times to simple visual stimuli 12,13. Moreover, reduced visual contrast, typical for cataract-reversal individuals, is associated with delayed responses of the visual system and lower visual temporal sensitivity 14,15. Thus, there is currently no evidence indicating accelerated processing of visual information following congenital visual deprivation. Indeed, accelerated visual processing in the CC-group would be counterintuitive, given that these individuals had no reliable visual input for extensive periods after birth. Alternatively, biases in temporal order perception can be induced by allocating attention unequally across modalities 16,17,18,19. However, attention has been ruled out as a cause of persistent biases in cross-modal spatio-temporal processing 6. Moreover, it is not obvious why CC- and DC-individuals would entertain opposing attentional foci given that both groups experienced transient severe visual impairment and suffer from remaining visual acuity impairments. In sum, the diverging cross-modal temporal biases observed here between groups are unlikely to reflect differences in the speed of processing or attentional effects but rather represent a difference in the genuine perceptual biases of humans born with and without vision.
CC-individuals' surprising bias toward perceiving visual stimuli as occurring earlier than auditory and tactile stimuli is consistent with exposure to lagging visual stimuli while the cataracts were still present: Residual light perception, which exists even in the presence of dense cataracts, has likely been sluggish due to reduced retinal transduction rates 20 and suppression of visual cortex activity in the context of cross-modal stimulation 21. Thus, cataract-reversal individuals were likely exposed to a consistent delay of vision compared to audition and touch before cataract reversal surgery, resulting in a reversed visual-auditory and visual-tactile bias after cataracts were removed. Crucially, the CC-individuals were exposed to these altered sensory environments from birth on. Studies in owls have demonstrated that atypical multisensory experience due to prism-altered vision leads to permanent structural differences in the mapping of auditory and visual spatial representations in the juvenile brain 22 but not in the adult brain 23.
Thus, we suggest that the reversed bias exhibited by CC-individuals results from structural differences elicited by atypical cross-modal temporal experience after birth. In sum, stable cross-modal temporal biases 5,6 might exist because the brain optimizes cross-modal temporal perception 24 during a sensitive period and as a consequence establishes a setpoint for future recalibration.
The finding that CC-individuals but not DC-individuals showed a reversed visual-auditory temporal bias provides strong evidence that perceptual biases are shaped by sensory experience during early childhood. In contrast to CC-individuals, DC-individuals had encountered temporally aligned cross-modal stimuli after birth, which we suggest had enabled them to develop a typical bias. It has been shown in cats that even minimal visual experience prior to visual deprivation allows for a typical development of cortico-cortical connections 25. The sensitive period for cross-modal temporal perception might span the first six months of life, the minimal duration of deprived vision in our CC-group. A previous study tested CC-individuals whose vision was restored within early infancy (4 months of age on average) and their matched controls on simultaneity judgments of spatially aligned visual-auditory and visual-tactile stimulus pairs but did not report a reversed cross-modal bias 26. However, since simultaneity judgments of spatially aligned cross-modal stimuli are less sensitive to biases than the spatial temporal order judgements we used, it remains open whether these CC-individuals showed no reversed cross-modal bias because their sight was restored before the sensitive period was over. In sum, our results provide strong evidence for a sensitive period for the emergence of cross-modal spatio-temporal biases during human ontogeny.
Both cataract reversal groups exhibited a lower spatio-temporal resolution, indicative of an increased temporal uncertainty, than their controls in visual and cross-modal contexts. This general disadvantage suggests a strong dependence of cross-modal temporal ordering on the visual sense that might be related to the spatial nature of our task. Moreover, the finding that both cataract groups exhibited a lower temporal resolution but only the CC-group an altered cross-modal bias strongly suggests that cross-modal spatio-temporal biases and resolution are dissociable processes. In addition, the conjunction of CC- and DC-individuals' reduced resolution might point towards a long sensitive period for the development of spatio-temporal sensory resolution, which would be compatible with the protracted developmental time course of multisensory temporal processing 9,27. Yet, persisting visual impairments might have contributed to the reduced spatio-temporal resolution of both cataract-reversal groups.
Furthermore, the present finding of increased temporal uncertainty could explain why recent studies have persistently found altered multisensory integration following congenital, transient periods of visual deprivation 26,28,29. A higher spatio-temporal uncertainty predicts wider temporal integration windows for simple, spatially aligned stimuli 26, and at the same time hinders the detection of temporal correlations 30 between more complex signals such as speech stimuli 28,29.
In conclusion, congenital but not late transient visual deprivation was associated with a bias towards perceiving visual events as earlier than auditory or tactile events, suggesting an early sensitive period for the development of perceptual biases.

Participants
The sample of the visual-auditory experiment (Expt. 1) comprised 10 individuals who were born with bilateral dense cataracts (CC) and whose vision was restored later in life (for details see Table 1). The presence of congenital cataracts was affirmed by medical records. Since cataracts were sometimes diagnosed at a progressed age, additional criteria such as presence of nystagmus, the density of the lenticular opacity, the lack of fundus visibility prior to surgery, a family history of congenital cataracts, and parents' reports were employed to confirm the onset of the cataract.

Apparatus and Stimuli
Participants sat at a table, facing two speakers, positioned at 14° visual angle (15 cm at 60 cm distance) to the left and to the right of the participant's midline. Three LEDs were mounted on top of each speaker. In the visual-tactile experiment, custom-made, noise-attenuated tactile stimulators were attached to the dorsal sides of both index fingers. For stimulation, the LEDs emitted red light, the speakers played white noise, and the tactile stimulators vibrated at a frequency of 100 Hz. Each stimulus lasted 15 ms, independent of modality. All three LEDs were used for cataract-reversal participants, but only one LED for typically sighted participants to roughly compensate for persistent visual impairments in cataract-reversal participants. To rule out that typically sighted participants perceived vision as delayed due to the lower number of LEDs, we tested 5 additional typically sighted participants (all female and right-handed, 23-50 years old, mean age 34 years) in the visual-auditory experiment while using all three LEDs. These participants too showed a significant bias towards perceiving vision as delayed (t(4)=6.36, p=0.003, Fig. 1 -Figure Supplement

Task, Procedure, and Design
In each trial, two stimuli were presented in close succession; one stimulus in each hemifield. Participants indicated at which side they perceived the first stimulus. Responses had to be withheld until the second stimulus had been presented. Response times were not restricted, and the next trial started 2 s after the response had been registered.
The modality of the stimulus presented at either side (visual or auditory, Expt. 1; visual or tactile, Expt. 2) and the stimulus onset asynchrony (SOA; ±30, ±90, ±135, ±400 ms, with negative SOAs indicating 'left side first'-stimulus pairs) of the two stimuli varied pseudo-randomly across trials. Each of the 32 stimulus conditions (2 modalities x 2 sides x 8 SOAs) was repeated 10 times; the 320 trials were divided into 10 blocks. Participants additionally completed ten practice trials with an SOA of ±400 ms at the beginning of the experiment. If necessary, the practice trials were repeated until participants felt confident about the task. In the visual-tactile experiment, a subsample of participants was additionally tested while holding the hands crossed (data not reported here). Participants were encouraged to take breaks in between blocks. Some of the cataract-reversal participants did not complete the full experiment, mostly due to time constraints. Except for practice trials, participants did not receive feedback.
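The factorial design described above can be illustrated with a minimal Python sketch for the visual-auditory experiment. It rests on two assumptions that the text does not confirm: that "2 modalities x 2 sides" means the modality is chosen independently for the left and the right stimulus, and that each of the 10 blocks contains every condition exactly once in shuffled order. The names `build_blocks`, `SOAS_MS`, and `MODALITIES` are hypothetical and do not refer to the published scripts.

```python
import itertools
import random

# Signed SOAs in ms; negative values denote 'left side first' pairs.
SOAS_MS = [-400, -135, -90, -30, 30, 90, 135, 400]
# Expt. 1 modalities; Expt. 2 would use "tactile" instead of "auditory".
MODALITIES = ("visual", "auditory")
N_BLOCKS = 10


def build_blocks(seed=0):
    """Return N_BLOCKS shuffled blocks of (left, right, soa_ms) trials.

    Assumption: one repetition of each of the 32 conditions per block.
    """
    rng = random.Random(seed)
    # 2 (left modality) x 2 (right modality) x 8 SOAs = 32 conditions
    conditions = list(itertools.product(MODALITIES, MODALITIES, SOAS_MS))
    blocks = []
    for _ in range(N_BLOCKS):
        block = conditions.copy()
        rng.shuffle(block)  # pseudo-random order within the block
        blocks.append(block)
    return blocks


blocks = build_blocks()
n_trials = sum(len(block) for block in blocks)
```

Under these assumptions, unimodal pairs (same modality on both sides) and bimodal pairs (different modalities) each make up half of the trials, which matches the statement that cross-modal pairs were interleaved with unimodal ones.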

Data Analysis
Data and analysis scripts are made available online 31. Trials with reaction times shorter than 100 ms or more than 2.5 standard deviations above the participant's mean reaction time (RT) were excluded from the analysis (2.1% of trials; responses entered by the experimenter were not filtered). To test for temporal order biases toward one modality, we conducted a hierarchical logistic regression on single-trial 'visual first'-values with group as predictor. As planned comparisons, we first calculated pairwise contrasts comparing each cataract-reversal group with its matched control group and second estimated fixed contrasts separately for each group to evaluate whether the probability of perceiving the visual stimulus before the auditory or tactile stimulus differed significantly from chance level.
To analyze the spatio-temporal resolution across groups and modality conditions, we conducted a hierarchical logistic regression on single-trial accuracy values using group and modality as predictors. To resolve interactions between both predictors, we first conducted pairwise contrasts on both predictors comparing group differences between each cataract-reversal group and its matched control group across modalities and second pairwise contrasts testing for group differences separately for each modality condition.
Spearman's rho, a robust rank-based correlation coefficient, was calculated between the two performance measures (bias and resolution) and CC-individuals' key medical data (duration of visual deprivation, time period since visual restoration, and visual acuity). Partial correlations, correcting for the effect of age on TOJ performance 9, were used for temporal variables (duration of visual deprivation and time period since visual restoration). P-values were corrected for multiple comparisons using Benjamini and Hochberg's procedure.
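The reaction-time exclusion rule can be sketched as follows. This is an illustration only, not the published analysis script: it assumes that RTs are given in milliseconds, that the mean and the sample standard deviation are computed over all of a participant's trials before any exclusion, and that the rule is applied per participant. The function name `keep_trial_mask` is hypothetical.

```python
from statistics import mean, stdev


def keep_trial_mask(rts_ms, lower_ms=100.0, n_sd=2.5):
    """Return one boolean per trial: True if the trial is retained.

    A trial is dropped if its RT is below `lower_ms` or lies more than
    `n_sd` sample standard deviations above the participant's mean RT.
    Assumption: mean and SD are computed on the unfiltered RTs.
    """
    upper_ms = mean(rts_ms) + n_sd * stdev(rts_ms)
    return [lower_ms <= rt <= upper_ms for rt in rts_ms]


# Toy data for one participant: 18 typical RTs, one anticipation, one lapse.
rts = [480, 500, 520] * 6 + [90, 5000]
mask = keep_trial_mask(rts)
kept = [rt for rt, keep in zip(rts, mask) if keep]
```

In this toy example the 90 ms anticipation and the 5000 ms lapse are removed, while the 18 typical trials are retained.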
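Benjamini and Hochberg's step-up procedure can be expressed as adjusted p-values, as in the following minimal sketch. The function name `benjamini_hochberg` and the example p-values are hypothetical and are not taken from the study; the published scripts may rely on a library routine instead.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    For the p-value of ascending rank r among m tests, the adjusted value
    is the minimum of p * m / r over all ranks >= r, capped at 1.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down to enforce monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up p-values for illustration only.
example_p = [0.01, 0.04, 0.03, 0.005]
example_adjusted = benjamini_hochberg(example_p)
```

A test is then declared significant at a false discovery rate of 0.05 if its adjusted p-value does not exceed 0.05.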

Declaration of Interests
The authors declare no competing interests.

Figure Legends
Thirteen individuals with a history of congenital, bilateral dense cataracts (CC) and sixteen individuals whose reversed cataracts had developed later in life (DC) as well as age-, gender-, and handedness-matched typically-developed individuals (MCC, MDC) took part in the study (see Table 1 for details about the samples). Participants judged the spatio-temporal order of two successive stimuli, one presented in each hemifield, by indicating the location of the first stimulus. In Expt. 1, visual (grey), auditory (dark blue), and visual-auditory (light blue) stimuli were presented, in Expt. [...]
Group mean proportions of 'right side first'-responses are shown as a function of the stimulus onset asynchrony (SOA) of the two stimuli, with negative values indicating 'left side first'-stimulation. The data are split into responses to bimodal (top row) and unimodal (bottom row) stimulus pairs and according to the modality presented at the right side.
Sigmoid curves fitted to the group mean data reported in Figure 1 are shown as a reference for participants for whom the onset of the cataract could be identified.