PT - JOURNAL ARTICLE
AU - Adeelia S. Goffe
AU - Julia Fischer
AU - Holger Sennhenn-Reulen
TI - Bayesian inference and simulation approaches improve the assessment of Elo-scores in the analysis of social behaviour
AID - 10.1101/160671
DP - 2017 Jan 01
TA - bioRxiv
PG - 160671
4099 - http://biorxiv.org/content/early/2017/07/07/160671.short
4100 - http://biorxiv.org/content/early/2017/07/07/160671.full
AB - The construction of rank hierarchies based on agonistic interactions between two individuals (“dyads”) is an important component in the characterization of the social structure of groups. To this end, winner-loser matrices are typically created, which collapse the outcome of dyadic interactions over time, resulting in the loss of all information contained in the temporal domain. Methods that track changes in the outcome of dyadic interactions (such as “Elo-scores”) are experiencing increasing interest. Critically, individual scores are not just based on the succession of wins and losses, but depend on the values of starting scores and an update (“tax”) coefficient. Recent studies improved existing methods by introducing a point estimation of these auxiliary parameters on the basis of a maximum likelihood (ML) approach. For a sound assessment of the rank hierarchies generated this way, we argue that measures of uncertainty of the estimates, as well as a quantification of the robustness of the methods, are also needed. We introduce a Bayesian inference (BI) approach using “partial pooling”, which rests on the assumption that all starting scores are samples from the same distribution. We compare the outcome of the ML approach to that of the BI approach using real-world data.
In addition, we simulate different scenarios to explore how the Elo-score responds to social events (such as rank takeovers) and to low numbers of observations. Estimates of the starting scores based on “partial pooling” are more robust than those based on ML, including in scenarios where some individuals have only a few observations. Our simulations show that assumed rank differences may fall well within the “uncertain” range, and that low sampling density, unbalanced designs, and coalitionary leaps involving several individuals within the hierarchy may yield unreliable results. Our results support the view that Elo rating can be a powerful tool in the analysis of social behaviour when the data meet certain criteria. Assessing the uncertainty greatly aids in the interpretation of results. We strongly advocate running simulation approaches to test how well Elo-scores reflect the (simulated) “true” structure, and how sensitive the score is to “true” changes in the hierarchy.
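
The dependence of Elo-scores on the starting scores and the update coefficient can be illustrated with a minimal sketch. It is not the authors' implementation: the logistic win probability with a scale of 400 is the chess-style convention and is assumed here, and the names `expected_win`, `elo_scores`, `start`, `k`, and the toy interaction list are all hypothetical. Every individual gets the same starting score for simplicity, whereas the abstract describes estimating starting scores from the data.

```python
def expected_win(score_a, score_b, scale=400.0):
    """Assumed logistic win probability of A over B (chess-style convention)."""
    return 1.0 / (1.0 + 10 ** ((score_b - score_a) / scale))


def elo_scores(interactions, start=1000.0, k=100.0):
    """Sequentially update scores over (winner, loser) pairs.

    `start` is a shared starting score and `k` the update coefficient;
    both are illustrative values, not estimates from any data set.
    """
    scores = {}
    for winner, loser in interactions:
        w = scores.setdefault(winner, start)
        l = scores.setdefault(loser, start)
        p = expected_win(w, l)          # probability the winner was expected to win
        change = k * (1.0 - p)          # unexpected wins move scores further
        scores[winner] = w + change
        scores[loser] = l - change
    return scores


# Toy sequence of dyadic interactions (hypothetical data).
interactions = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "A")]
low_k = elo_scores(interactions, k=20.0)
high_k = elo_scores(interactions, k=200.0)

for label, result in (("k=20", low_k), ("k=200", high_k)):
    ranking = sorted(result, key=result.get, reverse=True)
    print(label, ranking, {name: round(s, 1) for name, s in result.items()})
```

Running the same interaction sequence with different values of `k` shows how strongly the resulting scores, and potentially the inferred rank order, can depend on an auxiliary parameter rather than on the observed wins and losses alone.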