Genome-wide association analyses of individual differences in quantitatively assessed reading- and language-related skills in up to 34,000 people

The use of spoken and written language is a capacity that is unique to humans. Individual differences in reading- and language-related skills are influenced by genetic variation, with twin-based heritability estimates of 30-80%, depending on the trait. The relevant genetic architecture is complex, heterogeneous, and multifactorial, and yet to be investigated with well-powered studies. Here, we present a multicohort genome-wide association study (GWAS) of five traits assessed individually using psychometric measures: word reading, nonword reading, spelling, phoneme awareness, and nonword repetition, with total sample sizes ranging from 13,633 to 33,959 participants aged 5-26 years (12,411 to 27,180 for those with European ancestry, defined by principal component analyses). We identified a genome-wide significant association with word reading (rs11208009, p=1.098 × 10−8) independent of known loci associated with intelligence or educational attainment. All five reading-/language-related traits had robust SNP-heritability estimates (0.13–0.26), and genetic correlations between them were modest to high. Using genomic structural equation modelling, we found evidence for a shared genetic factor explaining the majority of variation in word and nonword reading, spelling, and phoneme awareness, which only partially overlapped with genetic variation contributing to nonword repetition, intelligence and educational attainment. A multivariate GWAS was performed to jointly analyse word and nonword reading, spelling, and phoneme awareness, maximizing power for follow-up investigation. Genetic correlation analysis of multivariate GWAS results with neuroimaging traits identified association with cortical surface area of the banks of the left superior temporal sulcus, a brain region with known links to processing of spoken and written language. Analysis of evolutionary annotations on the lineage that led to modern humans showed enriched heritability in regions depleted of Neanderthal variants. Together, these results provide new avenues for deciphering the biological underpinnings of these uniquely human traits.


Introduction
The processing and production of complex spoken and written language are capacities that appear to be distinct to our species. Such skills have become fundamental for day-to-day life in modern society.
Decades of family and twin studies have revealed substantial genetic components contributing to individual variation in reading-and language-related traits, as well as to susceptibility to associated disorders. A recent meta-analysis integrated available data on these skills from 49 twin studies, with a total sample size of 38,000 children and adolescents, aged 4-18 years. The meta-analysis yielded heritability estimates of 66% for word reading (meta-analysis of 48 studies), 80% for spelling (15 studies), and 52% for phoneme awareness (the ability to identify and manipulate individual sounds of spoken words; 13 studies), and suggested greater genetic influences on reading-related abilities than language-related measures (heritability 34%; meta-analysis of 10 studies with measures on receptive and expressive vocabulary, oral language and naming abilities) 1 .
Linkage mapping and candidate gene studies have reported associations of single-nucleotide polymorphisms (SNPs) and/or genetic loci with reading and language-related traits, as well as with disorders such as dyslexia and developmental language disorder (DLD; which encompasses the older definition of specific language impairment or SLI) 2 . However, replication efforts have met with limited success. Genome-wide association studies (GWAS) are beginning to identify SNPs that show genomewide significant associations with reading-and language-related traits: rs7642482 near ROBO2 associated with expressive vocabulary in infancy 3 , rs17663182 within MIR924HG with rapid automatized naming of letters 4 , and rs1555839 near RPL7P34 with rapid automatized naming and rapid alternating stimulus, deficits of which are often implicated in dyslexia 5 . Nonetheless, insights into the genomic underpinnings of these types of skills from GWAS approaches have thus far been limited, which may reflect low power due to the relatively small sample sizes of the cohorts, such that the majority of genetic variance remains unexplained. Sample sizes have remained limited because of the labour-intensive assessment methods required for phenotyping reading-and language-related traits, which are difficult or even impossible to replace with simple questionnaires. Yet, well-powered GWAS efforts that characterize the molecular genetic variation involved in reading-and language-related traits can provide novel perspectives on the biological bases and origins of human cognitive specializations 6 .
Here, we present large-scale GWAS meta-analyses of a set of reading-and language-related traits, measured with psychometric tools. We captured variation across the phenotypic spectrum, extending beyond disorder. Our study focused on traits that have been assessed using continuous measures in multiple cohorts from the international GenLang network (www.genlang.org), together with several public datasets that have data available for the relevant phenotypes, matched to genome-wide genotype information. Five quantitative traits were identified for which phenotype data could be aligned across different cohorts, to yield sufficiently large sample sizes for GWAS: word reading, nonword reading, spelling, phoneme awareness and nonword repetition. Univariate GWAS metaanalyses were performed to identify genetic variation influencing these traits and to model the genetic overlaps between them. For comparative purposes, a GWAS meta-analysis for performance IQ was also performed in the same dataset. Together with publicly available GWAS summary statistics from prior studies of cognitive performance and educational attainment, these data were used to study genetic relationships between reading-and language-related traits, IQ and educational attainment. A multivariate approach allowed us to optimise the power of the word reading GWAS meta-analysis for functional follow-ups investigating the tissues, cell types, brain regions and evolutionary signatures involved.

GWAS meta-analysis for quantitative reading-and language-related traits
We studied five quantitative reading-and language-related traits: word reading accuracy, nonword reading accuracy, spelling accuracy, phoneme awareness and nonword repetition accuracy (Table 1).
These traits are thought to tap into a number of underlying processes involved in written and spoken language. For example, nonword reading relies heavily on basic decoding skills: translating graphemes one by one into phonemes 7 , while spelling utilizes lexical and orthographic knowledge: understanding of permissible letter patterns and how they are arranged in specific words 8 . Phoneme awareness measures the ability to distinguish and manipulate the separate phonemes in spoken words 9 . Nonword repetition tasks tap into speech perception, phonological short term memory and articulation 10 . A total of 22 cohorts, aggregated by the GenLang consortium combined with several publicly available datasets, provided data for one or several of these traits (Supplemental Table 1 and 2, Supplemental Figure 1 and 2). The cohorts connected in the GenLang network have either been originally ascertained through a proband with language/reading disorder (DLD/SLI or dyslexia) or were sampled from the general population; all cohorts include quantitative phenotypic data gathered via validated psychometric tests. Some of the samples are birth cohorts, and some involve family or twin designs.
The phenotype data were collected across an array of different ages, test instruments and languages (primarily English, and also Dutch, Spanish, German, French, Finnish, and Hungarian). We reduced heterogeneity of assessment age by excluding individuals over 18 years of age (except for three cohorts, see Supplemental Methods), and, where phenotype data were available from the same participant at multiple ages, by choosing the age that matched best with the assessment ages of the largest cohort(s). We limited the heterogeneity introduced by different test instruments by only including those that measured the phenotypes described in our analysis plan (Supplemental Notes), and, in cases where data from more than one test instrument were available, by selecting the test instrument that was used by the largest number of cohorts.  Figure 3) and LDSC ratios (Supplemental Table 4 Table 3).

A genome-wide significant locus associated with word reading
For evaluating statistical significance of SNP associations, we determined an appropriate threshold that was adjusted for multiple-testing based on the correlation structure of our five reading-/languagerelated traits, as estimated using phenoSpD. The significance threshold for assessing the GWAS metaanalysis results was thus set to 5x10 -8 / 2.15 independent traits = 2.33x10 -8 . We identified a genomewide significant locus associated with word reading (rs11208009 C/T on chromosome 1, p=1.10x10 -8 ) (Supplemental Figure 5). Rs11208009 has not shown association with general cognitive performance (IQ) or educational attainment, but other SNPs in LD with rs11208009 (r 2 >0.6) have been associated with triglyceride and total cholesterol levels in blood in previous GWAS studies (Supplemental Table   5). Three genes are located in the vicinity of rs11208009 and SNPs in LD (r 2 >0.6): DOCK7, encoding a guanine nucleotide exchange factor important for neurogenesis 12 ; ANGPTL3, which encodes a growth factor specific for the vascular endothelium that is expressed specifically in the liver 13 ; and USP1, encoding a deubiquitinating enzyme specific for the Fanconi anemia pathway 14 . The associated locus harbours an eQTL regulating DOCK7 and ATG4C (another nearby gene which encodes an autophagy regulator 15 ) in the cerebellum, and DOCK7, ATG4C and USP1 in blood samples (Supplemental Table 6).
Genome-wide significant loci were not identified for the other traits. Supplemental Table 7 lists all results with p<1x10 -6 .
Traits related to reading and language are highly correlated at the genetic level All five traits showed significant SNP heritability, with LDSC-based estimates ranging from 0.13 for nonword repetition to 0.26 for nonword reading (Supplemental Table 4), indicating that the captured common genetic variation accounts for a substantive proportion of the phenotypic variance in these skills. Pairwise genetic correlation analyses showed significant overlap among the reading-and language-related traits ( Figure 1A; Supplemental  17 showed small but significant positive genetic correlations with word reading, nonword reading and nonword repetition. Genomic structural equation modelling (GenomicSEM) 19 was used to further investigate the shared genetic architecture of the five reading-and language-related traits, together with performance IQ, full-scale IQ and educational attainment (Supplemental Table 9). In the final model ( Figure 1B), the first factor explains variation in nonword reading, spelling and phoneme awareness, word reading and full-scale IQ. For the first three of these traits, there is no evidence for additional genetic influences, suggesting high genetic similarity. The second factor explains additional variation in full-scale IQ, and is also related to performance IQ and educational attainment. The third factor explains variation in nonword repetition, word reading and educational attainment. Factors one and three are highly correlated, indicating that the genetic architecture underlying word reading does not differ much from nonword reading, spelling and phoneme awareness. Nonword repetition, on the other hand, is genetically more distinct, as indicated by evidence for specific genetic influences not captured by the model. Specific genetic influences were also evident for full-scale IQ and educational attainment. Thus, although reading-and language-related traits show genetic overlaps with full-scale IQ and educational attainment, the model indicates that these traits also have unique unshared components, in line with genetic correlation estimates that are lower than 1.

Genes and SNPs previously associated with reading-/language-related traits and disorders
Previous GWAS studies on reading-/language-related traits and disorders (Supplemental Table 10) have identified very few associations exceeding genome-wide significance. In those prior studies, there were a total of 48 independent SNPs meeting a less stringent threshold of p<1x10 -6 in the respective GWAS. We ran look-ups of each of those SNPs in our GenLang GWAS meta-analysis results (Supplemental Table 11). Where SNP associations passed a threshold adjusted for multiple testing of 48 SNPs and 2.15 independent GenLang traits (p<4.84x10 -4 ), we then reran the association analyses after exclusion of the original cohort(s) in which the association was first identified, to evaluate independent effects beyond those of the earlier study. According to these criteria, only one SNP, rs1555839, previously associated with rapid automatized naming in the GRaD cohort 5 , yielded a significant signal in the remainder of the GenLang cohorts, showing association with spelling (p=3.33x10 -4 ). Rs1555839 is one of five SNPs that reached the threshold for genome-wide significance in the original GWAS study of the GRaD cohort.
Some 20 genes have been described as candidate genes for reading-/language-related traits and disorders, based on a range of different mapping approaches 2 . Supplemental Table 12 gives genebased p-values from our GenLang GWAS meta-analysis, calculated by MAGMA 20 , for each of these genes. Variation in one of these genes, namely DCDC2, showed association with nonword reading that passed the significance threshold for multiple testing of 20 genes and 2.15 independent GenLang traits (p<0.0012). DCDC2 was originally identified in a linkage region for dyslexia susceptibility, and SNPs in and near this gene were subsequently associated with dyslexia in candidate gene studies 21 , although some investigations including a meta-analysis of seven studies have failed to support this 22 . Many dyslexic children perform poorly on nonword reading tests 23 . No single candidate SNP highlighted in prior studies of DCDC2, nor in any other candidate gene, was significantly associated with any of the traits in the GWAS meta-analysis results after correction for 54 SNPs and 2.15 independent GenLang traits (p<4.31x10 -4 ) (Supplemental Table 13).

Multivariate GWAS analysis for word reading
To improve the power of our GWAS meta-analysis and to gain insights into biological pathways shared by the traits of interest, utilizing the high genetic correlations between the traits, a multivariate GWAS analysis was performed with MTAG 24 Table 14). Functional MRI studies have linked this region to different aspects of written and spoken language processing [25][26][27][28] . To investigate further how the different reading-and language-related traits contribute to these findings, we went on to specifically assess genetic correlations for banks of the left STS with the results from the original univariate GenLang GWAS meta-analyses, finding consistent correlations (range 0.18-0.23, Supplemental Table 15).

Genetic correlation with traits from the UK biobank and brain-related traits from LD hub
We went on to assess genetic correlations of the multivariate word reading results with 20 cognitive, education, neurological, psychiatric and sleeping-related traits and 515 additional UK Biobank traits using LD hub. To further investigate overlaps with IQ, genetic correlations between these 535 traits and the published GWAS summary statistics for full-scale IQ 16 Figure 3. The genetic correlations of reading and full-scale IQ with other traits had similar directions and effect sizes, but several differences were evident. Cognitive and education-related traits were more highly genetically correlated with full-scale IQ than reading. In addition, several psychiatric and wellbeing traits showed significant (negative) genetic correlations with full-scale IQ but not with reading, for example depressive symptoms, cross-disorder susceptibility (from the Psychiatric Genomics Consortium GWAS), and tense, hurt and nervous feelings. In contrast, several traits related to physical health and lifestyle showed larger genetic correlations with the multivariate reading results than with full-scale IQ, including BMI, reduced alcohol intake as a health precaution, and usual walking pace. were obtained from the Social Science Genetic Association Consortium 16

Evolutionary analysis
We used LDSC heritability partitioning 29 Table 17). These regions are large stretches in the human genome that are depleted for Neanderthal ancestry, possibly due to critical functions in Homo sapiens and intolerance to gene flow 30 . The dashed line shows the p-value threshold for significant enrichment after Bonferroni-correction for testing 5 annotations. Results are also available in Supplemental Table 17.

Functional enrichment using heritability partitioning and MAGMA gene property analysis
We applied LDSC heritability partitioning analyses 29 Table 18). H3K4me1 is considered a marker for enhancer regions. Next, we used MAGMA gene property analysis 20 to study whether the multivariate reading results are enriched in a specific tissue or brain cell type, using tissue-specific and cell type-specific gene expression data in FUMA 31,32 . As MAGMA corrects for average expression, each comparison can only answer the question of whether the tissue or cell type is more related to the multivariate reading results than the average of the tissues or cell types in the dataset. After correction for 83 tissues (p<6.02x10 -4 ), no relation was found of the multivariate reading results with tissue-specific gene expression patterns of adult tissues from GTEx and brain tissues of a specific (developmental) time from Brainspan (Supplemental Figure 10 and Table 19). In the cell type-specific gene expression analysis, three single-cell RNA-sequencing datasets of embryonic, fetal and adult brain tissue were used. After correction for 142 cell types (p<3.09x10 -4 ), the multivariate reading results were significantly associated with one of the mature neurons from the fetal dataset: red nucleus neurons (Supplemental Figure 11 and Table 20). This association could reflect an association with this specific nucleus, or with the higher maturity of the neurons compared to the other cell types in the fetal dataset. The red nucleus is a large subcortical structure in the ventral midbrain that is part of the olivocerebellar and cerebello-thalamo-cortical systems. It plays an important role in locomotion and non-motor behaviour in various animal species, and in humans might also play a role in higher cortical functions 33 . A trend was observed for a relation with GABAergic neurons from the embryonic dataset.

Discussion
We performed GWAS meta-analysis of five quantitative reading-and language-related traits: word reading, nonword reading, spelling, phoneme awareness and nonword repetition, in sample sizes (up to ~34,000 participants) that are substantially larger than any prior genetic analyses of reading and/or language skills assessed with neuropsychological tools (n=1,331-10,819) 3-5,34-36 . We identified genomewide significant associations for word reading (rs11208009 at chromosome 1, p=1.10x10 -8 ), highlighting DOCK7, ATG4C, ANGPTL3 and USP1 as potential new candidate genes. Other SNPs from the locus have been previously associated with triglyceride and cholesterol levels 37  The five reading-and language-related traits in GenLang also showed high genetic overlaps with fullscale IQ and educational attainment, in line with the widespread pleiotropy found between many aspects of cognitive functioning including language, reading, mathematics and general cognitive ability 4,46 . Given the evidence of a highly shared genetic architecture between word reading, nonword reading, spelling, and phoneme awareness, we performed a multivariate analysis with MTAG, to increase our power and reduce multiple-testing constraints for follow-up analyses of biological pathways. The multivariate word reading results showed genetic correlations with phenotypes related to education, eyesight, chronotype, wellbeing, lifestyle, physical health, exercise and socioeconomic status. These associations may in a large part reflect shared biology between the reading-/languagerelated traits and cognitive abilities, educational attainment and socioeconomic status. No significant genetic correlations were observed with neuropsychiatric disorders based on available data in LDhub.
However, we note that summary statistics from the current investigation have also been used to investigate genetic overlaps with self-report of dyslexia diagnosis in an independent GWAS by 23andMe (~52k cases), yielding substantial negative genetic correlations between GenLang scores on phoneme awareness and spelling appear to be genetically correlated with better performance in written than in oral exams. This may reflect the greater importance of phoneme awareness and spelling for proficient writing than for oral language. "Component 4", corresponding to relatively better performance in Danish (the native language of the participants in that study) as compared to performance in English, showed negative genetic correlations with our GenLang nonword repetition measure, possibly reflecting the particular importance of verbal short-term memory in second-language learning 50,51 . The results of the GWAS-by-subtraction, previously suggested to represent so-called "non-cognitive" abilities related to educational attainment such as motivation, curiosity and persistence, were genetically correlated with word reading, nonword reading, and nonword repetition in GenLang.
Human abilities to process spoken and written language depend on an array of distributed brain circuits [52][53][54][55][56] . We performed a genetic correlation analysis of our multivariate GenLang GWAS with summary statistics from 58 MRI-based neuroanatomical phenotypes, chosen because they concerned brain areas and/or tracts with known links to language processing [53][54][55][56] . This selection included brain regions involved in early modality-specific pre-processing of spoken and written language, for example the auditory regions in transverse temporal gyrus, also known as Heschl's gyrus, and the superior temporal gyrus (spoken language), and several other temporal regions (written language). Other regions included are involved in aspects of language comprehension, including the middle temporal gyrus, the inferior parietal area and parts of the inferior frontal gyrus. We identified a significant genetic correlation of our multivariate GWAS with cortical surface area of the banks of the superior temporal sulcus (STS) on the left hemisphere. The STS is a location where the processing of spoken and written language converges, in between modality-specific pre-processing and language comprehension [25][26][27][28] . A broad range of language-related functions have been previously linked with the left STS through (meta-analyses of) fMRI and PET studies, including those essential for the reading-and language-related traits included in the GWAS meta-analyses: sublexical processing of speech 57,58 and representation of phonological word forms 59 . The importance of this brain area for reading-related traits is also evident from a meta-analysis of structural MRI studies that found lower grey matter volume in the left STS related to reading disability and poor reading comprehension 60  These loci are enriched for promoters and regions conserved in primates 63 , as well as enhancers present in many tissues and those specific for fetal brain and muscle 64 . Only brain enhancers show signs of stringent purifying selection against Neanderthal variation, indicating that these regions mark parts of the genome where variation has a high probability of deleterious consequence 64 . Our results may reflect divergent selection acting on skills that underlie language in humans compared to Neanderthals. However, in our analyses we could not pinpoint any specific timeframe of human evolution during which genetic variants associated with reading-and language-related traits were introduced.
Regarding functional implications of our findings, heritability partitioning analyses of the multivariate GenLang GWAS results identified enrichment in enhancer regions present in fetal brain tissue and the adult germinal matrix. The enhancer regions of the germinal matrix are highly similar to those of fetal brain tissues, and not the other adult brain tissues we studied, likely reflecting the neural stem cell population present in that tissue 65 . In these analyses there was no specific association with one particular brain cell type. However, follow-up work with MAGMA, using single-cell RNA sequencing data from fetal, embryonic and adult brain, uncovered a significant association with fetal neurons from the Red nucleus, which may relate to the more adult state of these neurons compared to the other cell types in the fetal dataset 66 . The MAGMA analysis could not be used to test for replication of the association with (fetal) brain tissue of the LDSC heritability partitioning, as results in this case are corrected for the association with the average expression of the dataset 20,31 , and the datasets with fetal data only included brain samples.
We used the GWAS meta-analysis results to investigate evidence for association of previously reported candidate SNPs/genes and suggestive genome-wide screening results from prior studies of reading-/language-related traits and disorders. Out of the 54 candidate SNPs and 20 candidate genes that we assessed (none of which met genome-wide significance), only DCDC2 yielded an association that survived correction for multiple testing in the context of targeted replications. This locus showed association only at the gene-based level and with one trait: nonword reading. Some previously reported associations in the literature could reflect the specific language, phenotype, or recruitment procedure of the cohort in which the gene or variant was investigated, and/or differences between contributions of common and rare variation at a locus of interest. Yet the lack of support here also suggests that false positive results have made an impact on the field, most likely related to limited sample size in prior reports, which is known to elevate the risk of type-1 error 67 . Few SNPs have shown genome-wide significant associations (p<5 x 10 -8 ) in previous GWAS studies of quantitative reading-/language-related traits 3,5 . In our GWAS-meta analysis, the SNP rs1555839, previously associated with rapid automatized naming and rapid alternating stimulus 5 , was significantly associated with spelling.
Overall, these results highlight the need for a genome-wide perspective, and the importance of large well-powered samples, if we are to obtain reliable insights into the role of common genetic variants in language-and reading-related traits.
Reading-and language-related phenotypes pose special challenges for scaling-up genetic analysis, since psychometric assessments can be labour-intensive to administer and score, and because of the heterogeneity introduced by differences in assessment tools, ages, populations and languages, among other factors 68,69 . One-item questions have enabled increases in sample size for GWAS of a wide range of traits and disorders, especially when available through large resources such as the UK Biobank. For example, a recent paper reporting development of a polygenic index repository included a polygenic score for a one-item question asking adult participants (customers of 23andMe) "at what age did you start to read?" 70 . This "age-started-reading" item was somewhat confusingly referred to in that study as "childhood reading". While answers to the question were available in a large sample (n of >173k), the item has substantial limitations, including its reliance on adult self-report of the timing of specific events from early childhood, ambiguities over how to interpret the phrase "start to read", as well as confounding effects of regional, cultural and historical differences in the age at which children first receive formal reading instruction. Moreover, the age a person started to read is a poor proxy of This first wave of analysis from GenLang represents the largest GWAS meta-analyses for direct quantitative assessments of reading-and language-related abilities to date, including 22 cohorts with data available for at least one of the phenotypes. Nonetheless, although substantially increased over prior work in this area, sample sizes may still be considered relatively modest compared to the stateof-the-art for genetic association analyses of other complex traits. While they captured a significant proportion of the genetic variation underlying each phenotype, yielding several novel insights into the associated biology, detection of genome-wide significant loci was still limited. In addition, a number of phenotypes of interest (such as those that tap into syntactic skills) could not (yet) be pursued due inadequate sample sizes, even when combining data available from multiple cohorts. We note that despite our best efforts at harmonizing the included datasets, and limited signs of heterogeneity in the results based on Cochran Q statistics, LDSC intercepts and genetic correlations between subsets of the data, we cannot fully exclude that heterogeneity is introduced by the inclusion of data from different assessment tools, languages, and ages. The choice of assessment tools for future collection of readingand language-related phenotypes for genomic studies, to increase the sample sizes of these GWAS meta-analyses and also to collect additional language-related phenotypes, should therefore be based at least partially on optimal matching with existing data. At the same time, we should invest in facilitating and simplifying the collection of language-related phenotypes, in part by developing and optimizing test batteries that could be reliably administered online in web/app-based settings. Indeed, these are major areas of focus for the GenLang consortium moving forward.
In summary, our GWAS meta-analyses of five reading-and language-related phenotypes in sample sizes of up to ~34,000 participants have demonstrated significant SNP heritability for all traits, and identified genome-wide significant associations with word reading on chromosome 1 (rs11208009, p=1.098 x 10 -8 ). Structural equation models revealed a single factor accounting for much of the genetic architecture underlying word reading, nonword reading, spelling and phoneme awareness, prompting a multivariate GWAS analysis of these four highly correlated traits. The multivariate results were genetically correlated with cortical surface area of the banks of the left STS, a brain region where the processing of spoken and written language comes together. Finally, partitioned heritability analyses showed enrichments in fetal brain enhancers, highlighting links to early brain development, and in Neanderthal depleted regions, suggesting that genomic regions associated with emerging languagerelated skills in Homo sapiens may have been intolerant to gene flow from other archaic hominins.
These efforts by GenLang open up novel avenues for deciphering the biological underpinnings of spoken and written language.

Phenotypes
Word reading accuracy was measured as the number of correct words read aloud from a list in a time restricted or unrestricted fashion. Nonword reading accuracy was measured as the number of nonwords read aloud correctly from a list in a time restricted or unrestricted fashion. A nonword is a group of phonemes that looks or sounds like a word, obeys the phonotactic rules of the language, but has no meaning. Spelling accuracy was measured by the number of words correctly spelled orally or in writing, after being dictated as single words or in a sentence. Phoneme awareness was measured in phoneme deletion/elision and spoonerism tasks. Nonword repetition accuracy was measured by the number of nonwords or phonemes repeated aloud correctly. Performance IQ, included for follow-up analyses of the results of the reading-and language-related traits, employing a matching study design, was measured with nonverbal subtests of broader batteries that test for cognitive skills. The Supplemental Notes contain further details of phenotyping procedures, and Supplemental Table 1 provides an overview of the tests used to measure these phenotypes for each cohort.

Study cohorts
The meta-analyses included GWAS summary statistics from 22 independent cohorts. These were,  Different measures for the reading-and language-related traits had been assessed in each cohort and could be included in the GWAS meta-analyses (see Supplemental Notes for details of each measure, and Supplemental Table 1 for an overview of the included measures and sample sizes for each cohort).
Data from children, adolescents and young adults were included in the meta-analysis (age at time of assessment ranging from 5 years to 26 years). Outlier samples, based on the phenotype data (>4 SD), were removed for each phenotype separately. The phenotype data were then adjusted for covariates (age, age 2 , sex and ancestry principal components; age-normed phenotypes were not adjusted for age and age 2 ). See Supplemental Table 2 for details on the included covariates per cohort. Phenotype data for word reading, nonword repetition and performance IQ were standardized to z-scores. Phenotype data for spelling, phoneme awareness and nonword reading were rank transformed to acquire normally distributed data for all cohorts. For follow-up analyses involving multiple phenotypes -the genetic correlation analysis with LDSC 71 and the multivariate GWAS analysis with MTAG 24 -a separate rank transformation was performed for word reading, nonword repetition and performance IQ to further harmonize the phenotype data processing. For male-and female-only association analyses, phenotype data were filtered, adjusted and transformed for male and female subsets separately.
The genotype data were subjected to stringent quality control according to a detailed analysis plan following standard procedures for GWAS, including SNP filters for minor allele frequency, call rate and Hardy-Weinberg equilibrium and sample filters including missingness and (for cohorts of unrelated individuals) relatedness. Cohort-specific details on quality control can be found in Supplemental Table   2. Individuals of European ancestry were identified using PCA-based analysis of genetic diversity.
Individuals with non-European ancestry were excluded from all cohorts, with exception of ABCD, GenR, GRaD and PING. For the ABCD, GenR and PING cohorts, two association analyses were performed, one including and one excluding individuals of non-European ancestry. Results of datasets including individuals of non-European ancestry were excluded from follow-up analyses with LDSC 71 , MAGMA 20 and MTAG 24 , since these methods utilize LD information based on European-ancestry reference data when raw genotyping data is not available (as is the case for this GWAS meta-analysis). The X chromosome was included by all cohorts except for NeuroDys. Genotype data were imputed using the Haplotype Reference Consortium version 1.1 panel for 20 out of the 22 cohorts, and using the 1000 Genomes Project Phase 3 reference panel for the GRaD and SYS cohorts. Single variant association analyses were performed using linear regression methods with the imputed additive genotype dosages for the full dataset, and for males and females separately. For the X chromosome, males were treated as homozygous diploids. Descriptions for each cohort of the samples, phenotype measures, genotyping, quality control and analysis procedures can be found in Supplemental Table 1 and 2.

Meta-analyses
The summary statistics for each GWAS cohort for each trait were subjected to stringent quality control measures. SNPs were excluded from the meta-analyses based on low imputation quality scores <0.  (Table 1 and Supplemental Table 1). SNPs for which data were available from less than 5,000 individuals were excluded from the meta-analysis results. For the heritability and genetic correlation analyses with LDSC, separate meta-analyses without genomic control correction were performed, because the LDSC regression intercept can be used to estimate a more powerful and accurate correction factor than genomic control 71 . Only data of individuals of the PCA-selected European subgroup were included, to allow use of pre-computed LD scores, as genotyping data of all cohorts was not available at a single site to allow the computation of LD-scores for the full partially admixed dataset.
To accommodate the multiple-testing burden present in performing separate meta-analyses for the five reading-and language-related traits, while taking into account the high phenotypic correlations between them, we calculated the effective number of independent variables (VeffLi) from the metaanalysis results using PhenoSpD 74 (v1.0.0). The Bonferroni-corrected genome-wide significant P-value threshold was determined at 2.33x10 -8 (5x10 -8 / 2.15 independent traits).
We investigated the degree to which differences between cohorts in age distribution and phenotyping tools introduced heterogeneity in the meta-analysis results. First, Cochran's Q test statistics, which assess whether estimated effect sizes are homogeneous across studies, were obtained with METAL, visualized with quantile-quantile plots (Supplemental Figure 3) and used to decide between a fixedeffects and random-effects meta-analysis. Based on these analyses, a fixed-effects meta-analysis was performed for all traits except for nonword repetition. Second, LDSC intercept and ratio were inspected to distinguish polygenicity from confounders 71 . Third, meta-analyses of subsets of the cohorts were performed, split up by mean age or the type of reading test applied. Heterogeneity caused by difference in mean age and type of reading test was studied by calculating genetic correlations between data subsets using LDSC 11 . In addition, meta-analyses for male-and female-only subsets of the data were run as sensitivity analyses, to investigate the degree to which males and females might show differences in SNP heritability for these traits and show genetic overlap as calculated with genetic correlation analyses.

GenomicSEM
To investigate the high genetic correlations between the reading-and language-related traits, and with cognitive performance and educational attainment, we used genomic structural equation modeling (GenomicSEM; version 0.03) 19 to model the joint genetic architecture. Summary statistics of the five GenLang traits and performance IQ, and published GWAS summary statistics for cognitive performance and educational attainment from the Social Science Genetic Association Consortium (https://www.thessgac.org/data) 16 were used as input. GenomicSEM first runs multivariable LDSC to obtain genetic covariance and sampling covariance matrices. Next, exploratory factor analyses were run using a maximum-likelihood factor analysis, for models with one to four factors. Confirmatory . PhenoSpD 74 was used to calculate the effective number of independent variables (VeffLi) to inform the multiple testing correction. A total of 18.28 independent comparisons were performed in Figure 1, the p-value threshold is therefore set to p=2.74*10 -3 .

Multivariate GWAS analysis
Publicly available GWAS summary statistics of neuroimaging traits were obtained via the Oxford Brain Imaging Genetics Server 75 (http://big.stats.ox.ac.uk/). Out of 3,144 brain imaging-derived traits with summary statistics available from the UK Biobank, a total of 58 neuroanatomical phenotypes were selected based on their relevance to language processing. Brain imaging traits encompassed surfacebased morphometric (SBM) and diffusion tensor imaging (DTI) phenotypes. For SBM, data were originally generated with Freesurfer by parcellation of the white surface (the surface area between the white and grey matter) using the Desikan-Killiany atlas. Both cortical surface area and mean cortical thickness were selected for brain areas that overlapped regions previously related to language processing, based on literature review [53][54][55][56] . For DTI, tracts spanning the extended language network 52 were selected, and fractional anisotropy values derived from both tract-based-spatial statistics and probabilistic tractography were used (both mean and weighted-mean fractional anisotropy). Again, in LD Hub 76 (v1.9.3, http://ldsc.broadinstitute.org/ldhub/). These 535 traits comprise all phenotypes available through LDhub with relevance to brain function (beyond neuroimaging traits), and all available traits from the UK Biobank. Genetic correlations between these 535 traits and the published GWAS summary statistics for full-scale IQ 16 were obtained as well. The Bonferroni corrected p-value threshold for significance of the LDhub results was 0.05 / (535*2) = 4.67x10 -5 . Genetic correlations may reflect pleiotropy, correlation between causal loci or spurious associations, and can inform about shared biological mechanisms and causal relationships between traits 77 .

Functional Mapping and Annotation of GWAS meta-analysis results
The  cell types data of the middle temporal gyrus; http://celltypes.brain-map.org/) brain samples were used in the Cell Type analysis in FUMA 31 . MAGMA performs a one-sided test which essentially assesses the positive relationship between tissue specificity and genetic association of genes.

Partitioning heritability of chromatin and evolutionary signatures
LDSC heritability partitioning 29 was used to estimate the enrichment of heritability of the MTAG results in annotations reflecting tissue-specific chromatin modification patterns. Annotations were based on data from the Roadmap Epigenomics project and ENTEX, processed by Finucane et al. 78 .
In addition, LDSC heritability partitioning was used to study the association with several annotations Neanderthal-depleted regions 30 . All enrichments are controlled for the baseline LD v2 model 29,84 , and heritability enrichment in human adult and fetal HGEs were additionally controlled for adult and fetal brain active regulatory elements.

Data availability
The full GWAS summary statistics will be made freely available through the GWAS Catalog (https://www.ebi.ac.uk/gwas/) and the website of the GenLang network (www.genlang.org).

Code availability
Code used to perform the meta-analysis and follow-up analyses is available at https://gitlab.gwdg.de/else.eising/genlang_quantitative_trait_gwasma.  We are extremely grateful to all the children, twins, families and participants who took part and are taking part in the 22 cohorts whose data contributed to these GWAS meta-analyses. We also thank the staff working on the different cohorts, including volunteers, study coordinators, interviewers, teachers, nurses, research scientists, general practitioners, midwives, psychologists, psychometrists, computer and laboratory technicians and colleagues who assisted in the quality control and preparation of the imputed GWAS data and the pharmacies and hospitals that were involved.

Author contributions
We gratefully acknowledge iPSYCH for sharing their summary statistics. The iPSYCH team was

Adolescent Brain Cognitive Development SM Study
Data used in the preparation of this article were obtained from the Adolescent Brain Cognitive Development SM (ABCD) Study (https://abcdstudy.org), held in the NIMH Data Archive (NDA). This is a multisite, longitudinal study designed to recruit more than 10,000 children age 9-10 and follow them over 10 years into early adulthood. implemented the study and/or provided data but did not necessarily participate in the analysis or writing of this report. This manuscript reflects the views of the authors and may not reflect the opinions or views of the NIH or ABCD consortium investigators. The ABCD data used in this report are summarized in this NDA study (https://doi.org/10.15154/1522876).

Aston cohort
Data collection for the Aston Cohort has been supported by funding from the European Union Horizon

Avon Longitudinal Study of Parents and their Children (ALSPAC)
We are extremely grateful to all the families who took part in this study, the midwives for their help in recruiting them, and the whole ALSPAC team, which includes interviewers, computer and laboratory technicians, clerical workers, research scientists, volunteers, managers, receptionists and nurses. publication is the work of the authors and they will serve as guarantor for the ALSPAC contribution to this article.

The Basque Center on Cognition, Brain and Language (BCBL)
Data collection was supported by the Basque Government through the BERC program, and by the Agencia Estatal de Investigación through BCBL Severo Ochoa excellence accreditation.

Brisbane Adolescent Twins Study (BATS)
The research was supported by the Australian Research Council (A7960034, A79906588, A79801419, DP0212016 and DP0343921), with genotyping funded by the NHMRC (Medical Bioinformatics Genomics Proteomics Program, 389891).

Colorado Learning Disabilities Research Center (CLDRC)
The CLDRC was supported by a NICHD grant P50 HD 27802.

The Early Language in Victoria Study (ELVS)
Funding by the NHMRC grant number 436958 is gratefully appreciated.

Familial Influences on Literacy Abilities (FIOLA)
The FIOLA cohort (PIs EvB and PFdJ) is supported by the University of Amsterdam, the MPI Nijmegen, and by fellowships awarded to EvB (NWO's Rubicon 446-12-005 and VENI 451-15-017). We are grateful to the NEMO Science Museum.

Generation R
The Generation R Study is conducted by the Erasmus Medical Center in close collaboration with School

UK Dyslexia
Cohort recruitment and data collection was supported by Wellcome Trust (076566/Z/05/Z and 075491/Z/04) and The Waterloo Foundation (grants to JBTa and SP; 797-1720). Genotype data were generated and analyses funded by the EU (Neurodys, 018696) and the Royal Society (UF100463 grant to SP).

York
The York cohort was funded by Wellcome Trust Programme Grant 082036/B/07/Z.

Conflict of interest
The authors declare no conflict of interest.