RT Journal Article
SR Electronic
T1 Hierarchical Compression of C. elegans Locomotion Reveals Phenotypic Differences in the Organisation of Behaviour
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 029462
DO 10.1101/029462
A1 Gomez-Marin, Alex
A1 Stephens, Greg J.
A1 Brown, André E.X.
YR 2016
UL http://biorxiv.org/content/early/2016/06/10/029462.abstract
AB Regularities in animal behaviour offer insight into the underlying organisational and functional principles of nervous systems and automated tracking provides the opportunity to extract features of behaviour directly from large-scale video data. Yet how to effectively analyse such behavioural data remains an open question. Here we explore whether a minimum description length principle can be exploited to identify meaningful behaviours and phenotypes. We apply a dictionary compression algorithm to behavioural sequences from the nematode worm Caenorhabditis elegans freely crawling on an agar plate both with and without food and during chemotaxis. We find that the motifs identified by the compression algorithm are rare but relevant for comparisons between worms in different environments, suggesting that hierarchical compression can be a useful step in behaviour analysis. We also use compressibility as a new quantitative phenotype and find that the behaviour of wild-isolated strains of C. elegans is more compressible than that of the laboratory strain N2 as well as the majority of mutant strains examined. Importantly, in distinction to more conventional phenotypes such as overall motor activity or aggregation behaviour, the increased compressibility of wild isolates is not explained by the loss of function of the gene npr-1, which suggests that erratic locomotion is a laboratory-derived trait with a novel genetic basis. Because hierarchical compression can be applied to any sequence, we anticipate that compressibility can offer insight into the organisation of behaviour in other animals including humans.