Universal brain signature of emerging reading in two contrasting languages

Despite dissimilarities among scripts, a universal hallmark of literacy in adults is the convergent brain activity for print and speech. Little is known, however, about how early this convergence emerges. Here we compare speech and orthographic processing systems in two contrasting languages, Polish and English, in 100 7-year-old children performing identical fMRI tasks. Results show limited language variation, with speech-print convergence evident mostly in left fronto-temporal perisylvian regions. Correlational and intersect analyses revealed subtle differences in the strength of this coupling in several regions of interest. Specifically, speech-print convergence was higher for transparent Polish than for opaque English in the right temporal area, associated with phonological processing. Conversely, speech-print convergence was higher for English than for Polish in the left fusiform, associated with visual reading. We conclude that speech-print convergence is a universal marker of reading even at the beginning of reading acquisition, while minor variations can be explained by differences in orthographic transparency.


Introduction
Less than 6000 years ago writing systems began to develop to convey linguistic information through space and time. Despite striking dissimilarities among writing systems in regularity, frame and arrangement, they all represent the units of a spoken language. Irrespective of the writing system, reading depends on access to existing brain regions dedicated to the processing of spoken words. In consequence, the convergence of the speech and print [...]

[...] the spoken linguistic stimuli (dynamic frequency and amplitude content). However, linguistic content has been eliminated (orthographic and phonetic, respectively). This design activates the language network, and is sensitive to individual differences in reading skills in both adults (Malins et al., 2016) and children (Chyl et al., 2018). Polish children were asked to pay attention to the stimuli, but no explicit task was given to the participants. American children were also asked to pay attention to the stimuli and were informed that two simple recognition questions would be asked after the task (e.g. "Did you hear the word 'banana'?"). This step was introduced in order to make sure that children were focused on the task. However, reading should occur implicitly even without explicit instruction to read (Price, Wise, & Frackowiak, 1996) and listening is automatic as well.

On each trial, four different stimuli from the same condition were presented in rapid succession in a 'tetrad', designed to evoke strong activation within a relatively short imaging time. Each visual stimulus was presented for 250 ms, followed by a 200 ms blank screen, whereas each auditory stimulus was allowed 800 ms to play out. 'Jittered' intertrial intervals (ITIs) were employed with occasional 'null' trials, resulting in ITIs ranging from 4 to 13 s (6.25 s on average). The task was performed in two runs, each lasting 5:02 minutes.
All conditions were presented in each run, with 48 trials per run presented pseudorandomly, with the restriction that no condition was repeated more than three times in a row. This resulted in 24 total trials per condition, and 96 total stimuli per condition. Stimuli were presented using Presentation software (Neurobehavioral Systems, Albany, CA) in Poland and E-Prime software in the United States.

fMRI data acquisition
fMRI data at each site were acquired on Siemens 3T Magnetom Trio scanners with a 12-channel head coil, using similar whole-brain echoplanar imaging sequences (32 slices, slice thickness = 4 mm, TR = 2,000 ms, TE = 30 ms, FOV = 220 x 220 mm², matrix size = 64 x 64, voxel size = 3 x 3 x 4 mm). The flip angle differed between sites (Polish = 80°, American = 90°). Anatomical data were acquired using a T1-weighted MP-RAGE sequence (176 slices, slice thickness = 1 mm, TR = 2,530 ms, TE = 3.32 ms, flip angle = 7°, matrix size = 256 x 256, voxel size = 1 x 1 x 1 mm). Generalized Autocalibrating Partial Parallel Acquisition (GRAPPA) acceleration was used at the Polish site (iPAT = 2), but not at the American site. To correct for scanner differences, we performed iterative smoothness equalization and included signal-to-fluctuation-noise-ratio (SFNR) as a covariate in all between-group comparisons (Friedman, Glover, & Fbirn Consortium, 2006).

fMRI data processing and analysis
The preprocessing and analyses were performed using SPM12 (Wellcome Trust Center for Neuroimaging, London, UK) and AFNI version 17.3.09 (Cox, 1996). In SPM12, images were realigned to the first functional volume. Then structural images from single subjects were coregistered to their mean functional images. Coregistered anatomical images were segmented using pediatric tissue probability maps (generated with the Template-O-Matic toolbox).
Next, DARTEL was used to create a group-specific template and flow fields based on segmented tissues (Ashburner, 2007). Functional images were normalized to MNI space with 2 x 2 x 2 mm voxel size using compositions of flow fields and a group-specific template. Next, in the univariate analyses, Gaussian spatial smoothing was performed using 3dBlurToFWHM in AFNI, which allows for the "adaptive smoothing" method, and the data were smoothed to equalize estimated FWHM at 10 mm. The data were modeled using the canonical hemodynamic response function convolved with the experimental conditions and fixation periods. Movement regressors were added to the design matrix using the ART toolbox to reject motion-affected volumes surpassing a movement threshold of 3 mm and a rotation threshold of 0.05 radians. On average, 4.02 volumes were removed in the US sample and 6.74 in the PL sample, with a non-significant difference between the groups.

To examine speech-print convergence we applied three different analytic approaches: intersect maps for print and speech on the whole brain and in selected regions of interest (ROIs), correlation analysis between brain activation to print and speech in selected ROIs, and representational similarity analysis (RSA). Selection of ROIs was guided by the results on skilled adults (Rueckl et al., 2015) as well as meta-analyses of reading studies (Linkersdörfer, Lonnemann, Lindberg, Hasselhorn, & Fiebach, 2012; Richlan, 2012 [...]

[...] activated regions for all contrasts of interest from both groups for 1) spoken or 2) printed stimuli were computed to control for the relative degree of brain activation for each participant and together with 3) local SFNR were used as regressors of no interest.

In the correlation analysis, regression parameter estimates (averaged within the ROIs) for print and speech were used to compute Pearson's r correlation coefficients across subjects in each group.
Correlation coefficients were then compared between languages using the Fisher r-to-z transformation.

The searchlight RSA was conducted for each subject by using [...]

[...] RSA Spearman's rho maps were then Fisher-z transformed and submitted to second-level statistical tests. All RSA results are presented at a voxel threshold of p < 0.005, FDR cluster corrected.

Additionally, activation to print only or speech only, as well as print>symbols and speech>vocoded speech, was compared between the languages within the selected ROIs, corrected for SFNR. Whole-brain group comparisons were not performed, as they are potentially more susceptible to cross-scanner differences, and could result in differences in regions outside the canonical reading and speech networks (Rueckl et al., 2015).

Behavioural data, ROI data, parameters of the items used in the fMRI experiment as well as the experimental protocols used at both sites are available online (https://osf.io/982ks).
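
As an illustration of the between-language comparison of correlation coefficients described in this section, the following minimal Python sketch applies the standard Fisher r-to-z test for two independent correlations. The function name is hypothetical, and the group sizes of 50 children each are an assumption (the text states only a total of 100 children). The r values are those reported for the right STG/MTG in the fMRI results.

```python
import math


def compare_independent_correlations(r1, n1, r2, n2):
    """Fisher r-to-z test for two correlations from independent samples.

    Returns the z statistic and a two-sided p-value.
    """
    z1 = math.atanh(r1)  # Fisher transform of the first correlation
    z2 = math.atanh(r2)  # Fisher transform of the second correlation
    # Standard error of the difference between two Fisher-transformed r values
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / se
    # Two-sided p-value from the standard normal distribution
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# ASSUMPTION: 50 children per group; only the total of 100 is given in the text.
z_stat, p_value = compare_independent_correlations(0.636, 50, 0.301, 50)
print(f"z = {z_stat:.2f}, p = {p_value:.3f}")
```

Under the assumed equal split, the sketch gives a z statistic of about 2.14, in line with the value reported for the right STG/MTG. It is not the authors' analysis code.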

Behavioral results
Demographics and test performance are presented in Table 1. Since the groups were matched for reading, no differences were found for word reading score. However, an independent samples t-test showed significant differences between Polish and American children in the estimated scores of letters in pseudowords read per second, with Polish children reading more efficiently than American children. Since no difference was found in pseudowords per minute, this result reflects the differences in test items, as pseudowords used in the US group were shorter. There was no difference in the fathers' education, but mothers in the PL group had obtained a higher level of education.

fMRI results
Language-independent activation
Figure 1 and Table 2 report the results of the group conjunction analysis revealing language-independent networks for printed and spoken word recognition. For print, the regions that were commonly employed by Polish and American children were bilateral occipital, frontal and temporal cortex. Print-specific (print > symbols) activation common to both groups was present solely in the left IFG and precentral gyrus (PrCG). For speech and speech-specific (speech > vocoded) conditions both groups activated bilateral temporal and frontal cortex, but speech-specific activation was less extensive. [...] Table S2), as well as regions convergently active for print and speech in both groups (Table 3) [...] p=0.07), however the difference between correlation coefficients was not significant (z=1.5; p=0.13). In the case of the right STG/MTG, the correlation was significant in both languages (r=0.636 [0.438; 0.778], p<0.001 and r=0.301 [0.030; 0.537], p=0.034 for PL and US respectively), but was significantly higher in Polish than in English (z=2.14; p=0.03). Additionally, a significant difference in the correlation coefficients was found in the left IFG [...]

A high degree of similarity in speech-print convergence between Polish and American children was also revealed in the RSA (Figure 4 and Table 4). Again, the convergence, as measured by similarity between brain responses to speech and print, was present in bilateral temporal regions and left frontal areas. No significant differences between the groups were found in the RSA ROI analyses.
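
To make the similarity measure concrete, here is a minimal Python sketch of a rank-based similarity between a print response pattern and a speech response pattern: Spearman's rho followed by the Fisher z transformation. The searchlight procedure itself is not reproduced. The sketch assumes that each condition is summarised as a vector of responses within a single searchlight sphere. All function names and the toy values are hypothetical.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the ranked data."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical toy patterns for one sphere (NOT data from the study)
print_pattern = [0.2, 1.1, 0.7, 1.9, 0.4, 1.5]
speech_pattern = [0.1, 0.9, 1.0, 1.6, 0.5, 1.2]

rho = spearman_rho(print_pattern, speech_pattern)
rho_z = math.atanh(rho)  # Fisher z transform applied before group-level tests
print(f"rho = {rho:.3f}, Fisher z = {rho_z:.3f}")
```

In a full analysis a value like `rho_z` would be computed for every sphere and every subject, and the resulting maps would be submitted to second-level statistics.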

Language-specific activation
Next, we examined group differences in activation to print only or speech only, as well as print>symbols and speech>vocoded speech within the selected ROIs. For visual conditions, only one significant difference was found, with English involving left IFG pars triangularis more than Polish in response to print (t(98) = 3.163, p < 0.002). In the print-specific condition no differences were found. For speech, English had higher activation than Polish in the left FG [...]

[...] Here, we present how young beginning readers of Polish and English process spoken and printed words. We particularly focused on the aspect of conjoint processing of print and speech, a hallmark of successful literacy acquisition (Chyl et al., 2018) and common to different languages in skilled adult readers (Rueckl et al., 2015). We also tested language-related similarities and differences in processing print and speech separately.

Our results show a striking resemblance to previous findings (Rueckl et al., 2015), and demonstrate that incorporating print into the existing speech network is similar in contrasting languages, not only in adulthood but also at the beginning of reading acquisition. Bilateral IFG and MTG/STG were activated by print and speech in both Polish and American children. Complementary RSA confirmed language-invariant speech-print coactivation in the left IFG and bilateral MTG/STG. Speech-print convergence in the previous study (Rueckl et al., 2015) was additionally present in left parietal cortex, which may be related to the task demands. Here, we measured implicit activation to speech and print, while in the previous study participants made semantic judgments. Nevertheless, we provide evidence that the core speech-print convergence is independent of reading experience and the fMRI task, at least for typical reading development.
When we tested the size of speech-print convergence in several ROIs of the language network, we found that Polish children had more convergent voxels in the right STG/MTG than American children, while a reversed pattern was present in the left FG. [...]

In summary, we have demonstrated that in the two groups of children speaking different languages the neural pattern of print and speech processing is remarkably similar. Importantly, the speech-print convergence is present in both groups, yet again suggesting that incorporating orthographic processing into the speech pathways shaped by evolution is universal for different languages and scripts. However, orthographic transparency of the language may evoke different [...]