Atypical speech production of multisyllabic words and phrases by children with developmental dyslexia

The prevalent ‘core phonological deficit’ model of dyslexia proposes that the reading and spelling difficulties characterizing affected children stem from prior developmental difficulties in processing speech sound structure, for example perceiving and identifying syllable stress patterns, syllables, rhymes and phonemes. Yet spoken word production appears normal. This suggests an unexpected disconnect between speech input and speech output processes. Here we investigated the output side of this disconnect from a speech rhythm perspective by measuring the speech amplitude envelope (AE) of multisyllabic spoken phrases. The speech AE contains crucial information regarding stress patterns, speech rate, tonal contrasts and intonational information. We created a novel computerized speech copying task in which participants copied aloud familiar spoken targets like “Aladdin”. Seventy-five children with and without dyslexia were tested, some of whom were also receiving an oral intervention designed to enhance multi-syllabic processing. Similarity of the child’s productions to the target AE was computed using correlation and mutual information metrics. Similarity of pitch contour, another acoustic cue to speech rhythm, was used for control analyses. Children with dyslexia were significantly worse at producing the multi-syllabic targets as indexed by both similarity metrics for computing the AE. However, children with dyslexia were not different from control children in producing pitch contours. Accordingly, the spoken production of multisyllabic phrases by children with dyslexia is atypical regarding the AE. Children with dyslexia may not appear to listeners to exhibit speech production difficulties because their pitch contours are intact.

Research Highlights
Speech production of syllable stress patterns is atypical in children with dyslexia.
Children with dyslexia are significantly worse at producing the amplitude envelope of multi-syllabic targets compared to both age-matched and reading-level-matched control children.
No group differences were found for pitch contour production between children with dyslexia and age-matched control children.
It may be difficult to detect speech output problems in dyslexia as pitch contours are relatively accurate.

fluctuations) produced at different temporal rates by speakers of human languages (Varnet et al., 2017). TS theory was developed to explain a decade of research studies across languages that indicated impaired perception of AE rise times and impaired amplitude modulation detection for children with dyslexia (e.g., Lorenzi et al., 2000; Goswami et al., 2002; see Goswami, 2011, 2015 for reviews). The perceptual effects of rise times in the AE help to create speech rhythm, as when we deliberately speak to a rhythm, the rise times of the vowels in stressed syllables are produced approximately isochronously (Scott, 1998; Goswami & Leong, 2013). AE rise times are also important sensory cues to stress placement in multi-syllabic words, as larger rise times at lower [...] for dyslexia (Kalashnikova et al., 2018), and are perceived poorly by children with dyslexia (e.g., Goswami et al., 2011). Children with dyslexia also show impaired neural representation of low frequency AE information (Power et al., 2016; Molinaro et al., 2016). As the AE is perceived and represented poorly by children with developmental dyslexia, it is logical to expect that the AE might also be produced poorly during natural speech.

To explore this hypothesis, we created a novel syllable stress interface for children which enabled us to measure different features of their natural speech production. The task was based on copying adult-produced oral target words, which were spoken as names for pictures of familiar child words such as "Aladdin". We then tested both children with dyslexia and CA- and RL-matched controls with this computerized interface. Based on TS theory, we expected the children with dyslexia to produce AEs for target items that were dissimilar to the target, while children without dyslexia should show speech production of AEs that were more similar to the target.
For example, children with dyslexia might be expected to produce inconsistent differences in amplitude between strong and weak syllables, resulting in the overall envelope of their speech production being substantially different from the target. Children heard the computer speak a pictured known word (e.g. 'Aladdin') and simultaneously saw the target AE as a line on the computer screen (see Figure 1). They were asked to repeat the target word 3 times, simultaneously viewing their own AE, which appeared in real time overlaying the target AE on each occasion to provide visual feedback. Children were encouraged to try to match their AE to the target AE. The online interface [...] speaking (panel C). The child was instructed to repeat the target stimulus in a time duration of 3 seconds. As the child repeated the stimulus, the envelope of the target stimulus, overlaid with the child's response envelope, was shown on each occasion on the main layout (panel C).

[...] (RL, average age 9 years 4 months), 19 children with dyslexia who were receiving an oral rhythmic intervention comprising 18 sessions of musical and rhythmic language activities spread over the school term (DY1, average age 11 years 4 months), and 17 children with dyslexia who were awaiting intervention (DY2, average age 10 years 8 months). The task was given as an add-on to the intervention for the DY1 children, but in contrast to the other oral rhythmic tasks being used in the intervention, the experimenter neither provided feedback nor helped the child to perform the [...] and DY2 groups, despite the groups being well-matched in age when the study began. Note that although this age difference could confound group comparisons (as the DY1 children have the advantage of being older than their age-matched controls), the age difference goes against our a priori hypothesis that the DY1 children will actually be worse than the CA children regarding AE production. Being older, the DY1 group will have more oral language experience than the CA children. Accordingly, if maturation were the sole driver of how adult-like the AE of children's speech becomes, then the DY1 group should be at an advantage rather than a disadvantage compared to the CA children concerning the adult-likeness of their speech production. [...] between groups, see Table 1

The auditory stimuli were 20 multi-syllabic words or phrases listed in Table 2.

Table 2. Words and phrases used in the study.
where $\bar{x}$ and $\bar{y}$ are the averages of variables $x$ and $y$, respectively. MI between two random variables is defined as a measure of the information that one random variable gives about the other, and is calculated as (Cover, 1999):

$$I(X;Y) = \sum_{x}\sum_{y} p(x,y)\,\log\frac{p(x,y)}{p(x)\,p(y)}$$

[...]

Figure 2 shows the speech production data by group for the AE and Figure 3 shows the speech production data by group for pitch contour. As noted, a priori, we expected the children with dyslexia to show worse speech production in terms of similarity to the target pronunciations for the AE metrics (r and MI). We did not make an a priori prediction for the pitch contour metrics.
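As an illustration of the two similarity metrics used in these analyses (r and MI), the following is a minimal sketch, not the authors' implementation. It assumes the target and produced envelopes are already extracted, time-aligned and of equal length. The histogram-based MI estimator, the bin count of 16, the base-2 logarithm, the function names and the synthetic envelopes are all assumptions introduced here for illustration.

```python
import numpy as np


def pearson_r(x, y):
    """Pearson correlation between two equal-length envelopes."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()  # deviations from the mean of x
    yc = y - y.mean()  # deviations from the mean of y
    return float(np.sum(xc * yc) / np.sqrt(np.sum(xc**2) * np.sum(yc**2)))


def mutual_information(x, y, bins=16):
    """Plug-in (histogram) estimate of I(X;Y) in bits.

    The bin count is an illustrative assumption; the estimate is
    biased upwards for small samples.
    """
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()             # joint probabilities p(x, y)
    px = pxy.sum(axis=1, keepdims=True)   # marginal p(x), shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)   # marginal p(y), shape (1, bins)
    nz = pxy > 0                          # skip empty cells (0 * log 0 = 0)
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))


# Synthetic envelopes (hypothetical data, not from the study):
# a 3-second target sampled at 1000 Hz with a slow amplitude modulation.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 3.0, 3000)
target = 0.5 * (1.0 + np.sin(2.0 * np.pi * 1.5 * t))

# A production that follows the target closely, plus small noise.
close_copy = target + 0.05 * rng.standard_normal(t.size)
# A production whose amplitude is unrelated to the target.
poor_copy = rng.random(t.size)

r_close = pearson_r(target, close_copy)
r_poor = pearson_r(target, poor_copy)
mi_close = mutual_information(target, close_copy)
mi_poor = mutual_information(target, poor_copy)

print(f"r:  close={r_close:.3f}  poor={r_poor:.3f}")
print(f"MI: close={mi_close:.3f}  poor={mi_poor:.3f}")
```

Under these assumptions, a production that tracks the target envelope yields higher values on both metrics than an unrelated one. Unlike r, MI is also sensitive to non-linear dependence between the two envelopes, which is one possible reason to report both.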

It was also thought possible that repeated practice and the visual feedback of viewing the AE could improve repetition accuracy from repetition 1 to repetition 3 for the AE measures. The repeated measures ANOVAs supported the first expectation regarding group differences, but not the second [...] Accordingly, there is no strong evidence for learning effects during the experiment, as similarity to the target did not increase parametrically across repetitions. Rather, all groups performed best on their second repetition. The data also suggest that the visual feedback of seeing the AE did not have a systematic effect on performance, as repetition effects were similar for both the AE and the pitch contour analyses. As will be recalled, pitch contours were not depicted visually during the task. It is difficult to explain why children got significantly worse on their third repetition; however, it could have been fatigue with the task.

[...] rhythmic intervention designed to support their phonological awareness of multi-syllabic words (DY2 group) were particularly impaired regarding AE similarity to the targets, as they also performed significantly more poorly than younger RL-matched control children for the MI similarity measure. These children (DY2 group) also performed significantly more poorly in matching the AE than the children with dyslexia who were receiving the oral intervention designed to support their phonological awareness of stress patterns (DY1 group). This suggests that the 18-session oral intervention within which the stress copying task was embedded for the DY1 group does help children with dyslexia to perceive and produce speech rhythm patterns. Nevertheless, the intervention appeared to have modest effects regarding AE copying, as the DY1 children were still significantly poorer at producing AEs that matched the targets than the CA children using the r similarity measure. Regarding the AEs that were produced by children with dyslexia, the depictions of speech output for individual children provided in Figure 4 suggest that children with dyslexia were not producing an envelope of amplitude (intensity or loudness) differences between strong (stressed) and weak (unstressed) syllables that was similar in overall AE shape to the oral targets.

Of particular note, all groups performed at similar levels in terms of matching the pitch contours of the target items (see Figure 3). This rules out any specific motoric or articulatory difficulties that could explain the impaired AE performance shown by the children with dyslexia. Indeed, inspection of the individual example data depicted in Figure 4 suggests that the pitch contours produced by the children with dyslexia (DY1 and DY2 groups) were highly similar to the target pitch contours, with the DY1 group producing significantly more similar pitch contours to the targets than the younger RL-matched controls. Pitch contours capture the overall changes in fundamental frequency (f0) as speakers alternate between stressed and unstressed syllables and are a core aspect of intonation (Ladd, 1980). The relative success of children with dyslexia in producing pitch contours may help to explain why the speech production difficulties of children with dyslexia regarding the AE appear to pass unnoticed by their listeners and teachers.

Accurate pitch contour production in dyslexia would not be consistent with a prior auditory theory of developmental dyslexia, in which difficulties relating to rapid spectro-temporal changes that reflect phonemes rather than the relatively slow spectro-temporal changes reflected in the AE