Abstract
Neuronal oscillations putatively track speech to optimize sensory processing. However, it is unclear how isochronous brain oscillations can track pseudo-rhythmic speech input. Here we investigate how top-down predictions flowing from internal language models interact with oscillations during speech processing. We show that word-to-word onset delays are shorter when words are spoken in predictable contexts. A computational model including oscillations, feedback, and inhibition is able to track the natural, pseudo-rhythmic word-to-word onset delays. As the model processes the input, it generates temporal phase codes, a candidate mechanism for carrying information forward in time. Intriguingly, the model's response is more rhythmic for non-isochronous than for isochronous speech when onset times are proportional to predictions from the internal model. These results show that oscillatory tracking of temporal speech dynamics relies not only on the input acoustics, but also on the linguistic constraints flowing from knowledge of language.
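To make the core idea concrete, the following is a minimal illustrative sketch, not the model reported here: a single fixed-frequency oscillator gates word processing through its momentary excitability, and top-down predictability lowers the effective recognition threshold, so predictable words cross threshold earlier in the cycle. This yields a phase code and tolerates pseudo-rhythmic onsets. All parameter values (frequency, gain, the size of the predictability effect) are assumptions chosen for illustration only.

```python
import numpy as np

# Illustrative toy, not the authors' implementation: one theta-band oscillator
# whose excitability gates evidence accumulation for each incoming word.
FREQ_HZ = 4.0   # assumed theta-band tracking frequency
DT = 0.001      # simulation step (s)

def phase_at(t, phase0=0.0):
    """Oscillator phase (radians) at time t for a fixed-frequency oscillator."""
    return (phase0 + 2 * np.pi * FREQ_HZ * t) % (2 * np.pi)

def processing_phase(onset, predictability, gain=10.0):
    """Phase (and latency) at which a word crosses the excitability threshold.

    Sensory evidence accumulates from word onset; top-down predictability
    lowers the effective threshold (assumed linear effect), so predictable
    words are recognized sooner after onset, i.e. at an earlier phase.
    """
    threshold = 1.0 - 0.5 * predictability
    evidence, t = 0.0, onset
    while evidence < threshold:
        # excitability follows the oscillation: input weighted by (1 + cos(phase)) / 2
        excitability = 0.5 * (1.0 + np.cos(phase_at(t)))
        evidence += gain * excitability * DT
        t += DT
    return phase_at(t), t - onset

# Pseudo-rhythmic input: onsets advanced in proportion to word predictability,
# mirroring the observation that predictable words are spoken earlier.
base_period = 1.0 / FREQ_HZ
predictabilities = [0.1, 0.8, 0.4, 0.9]
onsets = [i * base_period - 0.04 * p for i, p in enumerate(predictabilities)]

for onset, p in zip(onsets, predictabilities):
    phi, latency = processing_phase(onset, p)
    print(f"predictability={p:.1f}  onset={onset:.3f}s  "
          f"latency={latency * 1e3:5.1f}ms  phase={phi:.2f}rad")
```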