PT - JOURNAL ARTICLE
AU - Peter Ranacher
AU - Nico Neureiter
AU - Rik van Gijn
AU - Barbara Sonnenhauser
AU - Anastasia Escher
AU - Robert Weibel
AU - Pieter Muysken
AU - Balthasar Bickel
TI - Contact-tracing in cultural evolution: a Bayesian mixture model to detect geographic areas of language contact
AID - 10.1101/2021.03.31.437731
DP - 2021 Jan 01
TA - bioRxiv
PG - 2021.03.31.437731
4099 - http://biorxiv.org/content/early/2021/04/01/2021.03.31.437731.short
4100 - http://biorxiv.org/content/early/2021/04/01/2021.03.31.437731.full
AB - When speakers of two or more languages interact, they are likely to influence each other: contact leaves traces in the linguistic record, which in turn can reveal geographic areas of past human interaction and migration. However, the complex, multi-dimensional nature of contact has hindered the development of a rigorous methodology for detecting its traces. Specifically, other factors may contribute to similarities between languages. Inheritance (a property is passed from an ancestor to several descendant languages) and universal preference (a property is universally preferred) may both overshadow contact signals. How can we find geographic contact areas in language data while accounting for the confounding effects of inheritance and universal preference? We present sBayes, an algorithm for Bayesian clustering in the presence of confounding effects. The algorithm learns which similarities in a set of features are better accounted for by confounders and which are due to contact effects. Contact areas are free to take any shape or size, but an explicit geographic prior ensures their spatial coherence. We test the clustering method on simulated data and apply it in two case studies to reveal language contact in South America and the Balkans. Our results are supported by (mostly qualitative) findings from previous studies.
While we focus on the specific problem of language contact, the method can also be used to uncover other traces of shared history in cultural evolution, and more generally, to reveal latent spatial clusters in the presence of confounders.
Competing Interest Statement: The authors have declared no competing interest.