PT - JOURNAL ARTICLE
AU - Tim Sainburg
AU - Anna Mai
AU - Timothy Q Gentner
TI - Long-range sequential dependencies precede complex syntactic production in language acquisition
AID - 10.1101/2020.08.19.256792
DP - 2020 Jan 01
TA - bioRxiv
PG - 2020.08.19.256792
4099 - http://biorxiv.org/content/early/2020/08/20/2020.08.19.256792.short
4100 - http://biorxiv.org/content/early/2020/08/20/2020.08.19.256792.full
AB - To convey meaning, human language relies on hierarchically organized, long-range relationships spanning words, phrases, sentences, and discourse. The strength of the relationships between sequentially ordered elements of language (e.g., phonemes, characters, words) decays following a power law as a function of sequential distance. To understand the origins of these relationships, we examined long-range statistical structure in the speech of human children at multiple developmental time points, along with non-linguistic behaviors in humans and phylogenetically distant species. Here we show that adult-like power-law statistical dependencies precede the production of hierarchically-organized linguistic structures, and thus cannot be driven solely by these structures. Moreover, we show that similar long-range relationships occur in diverse non-linguistic behaviors across species. We propose that the hierarchical organization of human language evolved to exploit pre-existing long-range structure present in much larger classes of non-linguistic behavior, and that the cognitive capacity to model long-range hierarchical relationships preceded language evolution. We call this the Statistical Scaffolding Hypothesis for language evolution.
Significance Statement: Human language is uniquely characterized by semantically meaningful hierarchical organization, conveying information over long timescales. At the same time, many non-linguistic human and animal behaviors are also often characterized by richly hierarchical organization.
Here, we compare the long-timescale statistical dependencies present in language to those present in non-linguistic human and animal behaviors as well as language production throughout childhood. We find adult-like, long-timescale relationships early in language development, before syntax or complex semantics emerge, and we find similar relationships in non-linguistic behaviors like cooking and even housefly movement. These parallels demonstrate that long-range statistical dependencies are not unique to language and suggest a possible evolutionary substrate for the long-range hierarchical structure present in human language.
Competing Interest Statement: The authors have declared no competing interest.