Abstract
Two language laws have been identified as manifestations of universal principles of animal behaviour, both acting on the organisation of numerous vocal and behavioural communicative systems. Zipf’s law of brevity describes a negative relationship between behavioural length and frequency of behaviour. Menzerath’s law defines a negative correlation between the number of behaviours in a sequence and the average length of the behaviours composing it. Both laws have been linked with the information-theoretic principle of compression, which tends to minimise code length. We investigate the presence of these two laws in the repertoire of chimpanzee sexual solicitation gestures. We find that chimpanzee solicitation gestures do not follow either Zipf’s law of brevity or Menzerath’s law consistently. For the second time in ape gestural communication, evidence supporting Zipf’s law of brevity was absent, and, here, the presence of Menzerath’s law appears individually driven. Ape gesture does not appear to manifest a principle of compression or pressure for efficiency that has been previously proposed to be universal. Importantly, the same signals were shown to adhere to these laws when used in a different behavioural context, highlighting that signallers consider signalling efficiency broadly and that diverse factors play important roles in shaping investment in signal production.
Introduction
Over the past 100 years, quantitative linguistics has revealed important statistical regularities present across human languages (Altmann & Gerlach, 2016; Köhler et al., 2005; Menzerath, 1954; Zipf, 1936). These are hypothesized to be manifestations of the information-theoretic principle of compression (Ferrer-i-Cancho et al., 2020), which has been suggested to be a universal principle towards coding efficiency (Ferrer-i-Cancho et al., 2013) and argued to be present across systems of biological information (Gustison et al., 2016). Compression is a particular case of the principle of least effort (Zipf, 1949), in which there is an advantage to choosing the outcome that requires the least amount of energy to produce or achieve. In communication, compression is expressed as a pressure towards reducing the energy needed to compose a code, limited by the need to retain the critical information in the transmission (Cover & Thomas, 2006; Ferrer-i-Cancho et al., 2020).
Compression predicts diverse statistical patterns at different levels of organization. Zipf’s law of abbreviation is the tendency of more frequent words to be shorter in length (Strauss et al., 2007; Zipf, 1949), and is generalised as the tendency for more frequent elements of many kinds (e.g., syllables, words, calls) to be shorter or smaller (Ferrer-i-Cancho et al., 2013). Besides being found in human spoken, signed, and written languages (Bentz & Ferrer-I-Cancho, 2016; Börstell et al., 2016; Hernández-Fernández et al., 2019; Sanada, 2008; Wang & Chen, 2015), Zipf’s law of brevity has been identified in genomes (Naranan & Balasubrahmanyan, 2000) and in the communication of diverse taxa: dolphins (Ferrer-i-Cancho & Lusseau, 2009), bats (Luo et al., 2013), penguins (Favaro et al., 2020), hyraxes (Demartsev et al., 2019), and various primates (macaques: Semple et al., 2013; marmosets: Ferrer-i-Cancho & Hernández-Fernández, 2013; gibbons: Huang et al., 2020).
At the level of larger constructs, Menzerath’s law states that “the greater the whole, the smaller its constituents” (Altmann, 1980; Köhler, 2012; Menzerath, 1954); for example, longer sentences have words of shorter average length, and words with more syllables contain syllables of shorter length. Menzerath’s law has been identified in human languages (Altmann, 1980), genomes (Ferrer-i-Cancho & Forns, 2009; Li, 2012), music (Boroda & Altmann, 1991), and in the communication of penguins (Favaro et al., 2020) and primates (geladas: Gustison et al., 2016; chimpanzees: Fedurek et al., 2017; Heesen et al., 2019; gibbons: Clink et al., 2020; Huang et al., 2020; gorillas: Watson et al., 2020).
Chimpanzee gestural communication represents a powerful model in which to explore compression and language laws. Apes have large repertoires of over 70 distinct gesture types (Byrne et al., 2017); as compared to vocal communication, gestural repertoires are larger and are more flexibly deployed, with individual gesture types used to achieve multiple goals (Bard et al., 2019; Call & Tomasello, 2007; Hobaiter & Byrne, 2011a; Liebal et al., 2004). Gestures are also used intentionally, i.e., to reach social goals by influencing the receivers’ behaviour or understanding (Graham et al., 2018; Hobaiter & Byrne, 2011a, 2014; Schel et al., 2013), and flexibly across contexts (Call & Tomasello, 2007; Hobaiter & Byrne, 2011a; Liebal et al., 2004). In a first study, Menzerath’s law was found to hold in chimpanzee play gesture sequences, whereas the same play gestures represented a rare example of a failure of Zipf’s law of brevity – at least at the level of the play gesture repertoire as a whole (Heesen et al., 2019). Play is a particularly prominent context for gesturing and involves the majority of the available repertoire (Hobaiter & Byrne, 2014). Zipf’s law of brevity held only in subsets of the gesture types used in play; given its widespread presence across diverse species’ systems of communication, the repertoire-level failure of Zipf’s law of brevity in chimpanzee gesture remains a conundrum.
Although failures of Zipf’s law of abbreviation have been previously reported (e.g., European heraldry: Miton & Morin, 2019; computer based neural networks: Chaabouni et al., 2019), in non-human animal communication these failures have typically been limited to long-distance communication (e.g. gibbon song: Clink et al., 2020; bats: Luo et al., 2013; although cf. female hyrax calls: Demartsev et al., 2019) where the impact of distance on signal transmission fidelity may alter the costs of compression (Ferrer-i-Cancho et al., 2013; Gustison et al., 2016). Moreover, the reported failure in Heesen et al. (2019) is a first for non-vocal systems of communication, as Zipf’s law of brevity has previously been described in signed languages (Börstell et al., 2016) and body-signals (Ferrer-i-Cancho & Lusseau, 2009). Signal compression may also vary in response to different socio-ecological constraints. Play is produced when there is an excess of time and energy (Held & Špinka, 2011; Pellis & Pellis, 1996; Smith, 2014) and the need to reduce signal effort through increased compression may be limited. As a result, it remains unclear whether the failure of Zipf’s law of brevity in chimpanzee gesture was due to the use of gestures from within play, or whether it reflects a system-wide failure of brevity.
Sexual solicitations represent a more urgent context for communication than play (Hobaiter & Byrne, 2012). With females limited by long inter-birth intervals of 4-5 years (Clark, 1977; Thompson, 2013), and males having substantial variation in reproductive success (Newton-Fisher et al., 2009; Tutin, 1979), both sexes rely on diverse strategies to improve individual fitness including mate guarding (Muller & Wrangham, 2009), opportunistic mating (Tutin, 1979; Watts, 2015), and consortship (Tutin, 1979). Sexual solicitation signals are subject to strong selection pressures and represent an excellent novel model in which to explore compression in gesture.
We test for patterns predicted by Zipf’s law of brevity and Menzerath’s law, at the level of single gesture types and of gesture sequences, respectively. To investigate Zipf’s law of brevity, we test (1) the correlation between frequency of use of gesture types and their average duration. To investigate Menzerath’s law, we test (2) the correlation between the number of gestures in the sequence (i.e., size of the sequence construct) and the average duration of gestures in the sequence. For both laws, following previous studies and to allow for a robust assessment, we also compute compression values related to the respective patterns (Heesen et al., 2019). In doing so, our study provides an assessment of compression in a novel, evolutionarily urgent context of gestural communication.
Results
We measured N=560 male to female sexual solicitation gestures from 173 videos of 16 wild, habituated East African chimpanzees (Pan troglodytes schweinfurthii). Within the 560 gestural tokens, we identified 26 gesture types: 21 manual gestures and 5 whole-body gestures (for full repertoire and distribution of gesture tokens across gesture types see S4 Table and S5 Fig) performed by 16 male individuals aged 10-42 years old. On average each individual produced 35 ± 70.7 gesture tokens (range 2-290 tokens).
Sequence length ranged from 1 to 6 tokens (Table 1). 243 tokens were single gestures; the remaining 317 were part of sequences with length n>1. Gesture duration ranged from 0.04 to 15.04 seconds (mean: 2.39 ± 2.35s). Of the 116 sequences analysed that were composed of 2 or more gesture tokens, 27 (23%) were formed by the repetition of the same gesture type, whereas the remaining 89 (77%) included more than one gesture type (Table 1).
Table 1. Number of sequences composed of the same or different gesture types, listed according to sequence length.
Do chimpanzee sexual solicitation gestures follow Zipf’s law of brevity?
We did not find a pattern in agreement with Zipf’s law of brevity; there was no significant negative correlation between mean gesture type duration (d) and frequency of use (f) (Spearman correlation: rs=0.30, n=26, p=0.066; with outlier excluded: Spearman correlation: rs=0.22, n=25, p=0.147; Fig 1). Consistent with this result, the compression test revealed that the expected mean code length of gesture types L had a magnitude of 2.39s and was not significantly small (pleft=0.951). Rather, L was significantly big (pright=0.05). Excluding the outlier, L was 0.91s and was neither significantly small (pleft=0.861) nor significantly big, suggesting a weak but nonlinear association between f and d.
Fig 1. Whole-body gestures are in yellow, and manual gestures are in blue. Plot A: the outlier gesture type was Object shake, performed more than 250 times. Plot B magnifies the area delimited by the red rectangle in plot A. The black dashed line indicates the relationship between gesture type frequency and gesture duration across the whole dataset. Whiskers indicate s.e.m. Absence of whiskers indicates either small variation of durations for the gesture type or that the gesture was only used once.
These results were in line with our mixed model analysis (see Supporting Information S6 File). While Model 1 fitted the data significantly better than the null model (full-null model comparison: χ²(3)=91.1, p<0.001), when controlling for signaller ID, neither proportion of gesture type within the dataset (Proportion) nor Category of gesture type (manual vs whole-body gesture type), nor their interaction, had an effect on gesture duration (Table 2a). Model 2, without the outlier gesture type Object shake, failed to fit the data better than the null model (full-null model comparison: χ²(3)=2.47, p=0.48; see S6 File). There was little variation between individuals in both models.
Although Category and the interaction between Category and Proportion were retained when evaluating Model 1 fit with AIC (Table 2b), their effect was negligible when ranking the models based on their BIC (Table 2c), which introduces a stronger preference for parsimonious models. Thus, Proportion was identified as the factor that best fit the data.
Subset analysis: whole-body and manual gesture types
We found no evidence for a negative correlation between d and f when separating whole-body gestures from manual gestures (Spearman’s rank correlation: whole-body, rs=-0.3, n=5, p=0.342; manual, rs=0.42, n=21, p=0.969; Fig 1). Rather, manual gestures showed a significant positive correlation (rs=0.42, n=21, p=0.031). Compression tests revealed that for whole-body gestures, L=0.13s and was neither significantly big nor significantly small (pleft=0.174, pright=0.817), and for manual gestures, L=2.26s and, if anything, tended towards being significantly big (pright=0.058) rather than small (pleft=0.942).
Do chimpanzee sexual solicitation gesture sequences follow Menzerath’s law?
We tested Menzerath’s law in 359 sequences, composed of 560 gesture tokens; there was no relationship between mean constituent duration and sequence size (Spearman’s rank correlation: rs=-0.08, n=359, p=0.076; Fig 2).
Fig 2. The scatterplot indicates the average gesture duration (t) for each sequence across different sequence sizes (n). Whiskers indicate s.e.m.
The compression test revealed that the total sum of the duration of each sequence M had a value of 1300.67 and was significantly small (n=359, p=0.003), suggesting a linear association between n and t that could not be sufficiently captured by the Spearman correlation test. Model 3 fit the data significantly better than the null model (full-null model comparison: χ²(3)=9.85, p=0.020). When controlling for signaller ID, Size was the only factor to have a significant effect on gesture duration (longer sequences were formed of shorter gestures; Table 3a). Proportion of whole-body gesture types in the sequence (PWB) as well as the interaction between Size and PWB had no effect on gesture duration (Table 3a). Average sequence duration was not well described by any of the factors or their interactions. Model ranking based on AIC (Table 3b) showed no improvement in model fit between the null model and a model including any other factors or their interaction. When ranking the models based on BIC, the null model provided a similar fit (Table 3c).
The inconsistency between the AIC and BIC rankings and the other tests may have derived from the presence of individual effects. We found slightly higher individual variation in Model 3, as compared to Models 1 and 2. To investigate this further, we performed an additional unplanned analysis on the presence of Menzerath’s law in chimpanzee gestures on a subset of the dataset that excluded data from the most prominent individual. When excluded, we found no significant correlation between sequence size and average constituent duration, M was not significantly small and the GLMM revealed no effect of any fixed factor (S8 Fig, S9 File).
Discussion
Chimpanzee sexual solicitation gestures did not follow either Menzerath’s law or Zipf’s law of brevity consistently. In contrast to previous findings from play gestures, where subsets of gestures adhered to Zipf’s law of brevity, no subsets of solicitation gestures followed the brevity law, and in sexual solicitations manual gestures showed an opposite-Zipf pattern: more frequently employed gesture types were longer in duration. Similarly, for Menzerath’s law, although longer sequences of gestures were not consistently made up of gestures of shorter average length, the total duration of sequences (M) was significantly small, hinting at the presence of some form of compression; however, careful additional analysis suggests that this effect may have been driven by a single individual.
These results represent a further failure of Zipf’s law of brevity in great ape gestural communication and support the wider finding that – unlike most other close-range systems of communication described to date – gestural communication does not seem to manifest a pressure for compression and efficiency, challenging the view that compression is a universal principle in human and other animal communication (Börstell et al., 2016; Ferrer-i-Cancho et al., 2013). It particularly highlights that compression does not act on communicative systems uniformly: 20 of the 26 gesture types described here in sexual solicitations overlapped with those in the study of play sequences, data were collected from the same community over the same period, but the two behavioural contexts produced conflicting results.
The previous study of play gesture found clear evidence for Menzerath’s law in gestural sequences (Heesen et al., 2019), and it was argued that these patterns might derive from the prolonged muscular activity necessary to produce extended gestural sequences (Scott, 2008), promoting energy-efficient communicative coding (Heesen et al., 2019). In contrast, we found no clear correlation between sequence length and the duration of its constituent gesture tokens. While there was limited evidence in favour of Menzerath’s law when all the individuals were included, it disappeared with the exclusion of a single prolific individual. Given the importance of the goal the signals are used for, any possible losses to fitness due to a lack of compression in signalling are more than offset by the potential gains to individual fitness from maximising opportunities for reproductive success.
Taken together, the presence of a pattern neutral or opposite to that dictated by Zipf’s law in sexual solicitations, the variation in the presence of Zipf’s brevity law and Menzerath’s law across behavioural contexts, and the possible presence of individual effects, underline how diverse factors play an important role in shaping investment in signal production. In contrast to vocal communication across primate species, and to the majority of the gestural repertoire during play, in chimpanzee sexual solicitations ‘inefficiency’ in signalling effort by the signaller appears to be slightly favoured, which nonetheless may be efficient in terms of achieving the signaller’s goal of successful communication in a context vital for reproductive success. Given the long inter-birth intervals and active mate guarding (Muller & Wrangham, 2009), chimpanzee paternity is often heavily biased towards a few high-ranking individuals (Newton-Fisher et al., 2009). With so few opportunities to mate, sexual solicitations may represent one of the most evolutionarily urgent contexts in which chimpanzee gestures are produced. Where the costs of signal failure are high, there is a pressure against compression and towards redundancy, as in chimpanzees’ use of gesture-vocal signal combinations in agonistic social interactions (Hobaiter et al., 2017). While there are examples of vocal communication systems used in urgent contexts that adhere to Zipf’s brevity law (Favaro et al., 2020), the benefits of successful communication to individual fitness in chimpanzee solicitation appear to outweigh the energetic costs associated with the production of a vigorous and conspicuous signal.
Research to date has typically focused on signal compression at the level of the communication system, but communication happens in situ. Signallers likely consider signalling efficiency more broadly: an intense but time-limited investment in clear signalling may be more efficient than the need to travel with a female for extended periods following a failed signal. A similar solicitation with a different audience may need to be produced rapidly and inconspicuously, as the detection of this activity by other males could be fatal (Fawcett & Muhumuza, 2000). Here, the same signals used by the same chimpanzees in a less urgent context – play – did show compression. While many vocalizations are relatively fixed, gestural flexibility (in goal and context – Bard et al., 2019; Call & Tomasello, 2007; Hobaiter & Byrne, 2011a; Liebal et al., 2004) allows us to explore how compression acts within specific instances of communication as well as on communication systems as a whole. To do so will require large longitudinal datasets in which it is possible to test both individual variation and variation within individuals across different gesture types and sequence lengths, and across different socio-ecological contexts of use. The use of redundancy within specific subsets of gesture, or within specific contexts of gesture, demonstrates both the importance of compression in communicative systems in general and the flexibility present in each specific usage. In doing so, it highlights the importance of exploring the impact of individual and socio-ecological factors within wider patterns of compression in biological systems in evolutionarily salient scenarios.
Methods
We measured N=560 male to female sexual solicitation gestures from 173 videos of 16 wild, habituated East African chimpanzees (Pan troglodytes schweinfurthii) from the Sonso community of the Budongo Forest Reserve in Uganda (1°35’–1°55’ N, 31°08’–31°42’ E), collected between December 2007 and February 2014.
Sexual solicitation gestures
Sexual solicitation gestures were defined as those given by a male towards a female with the goal of achieving sex, usually accompanied by the male having an erection and the female being in oestrus (Hobaiter & Byrne, 2011a). We included solicitations in the context of sexual consortship; here a male gestures in order to escort a female away from the group to maintain exclusive sexual access, which can occur prior to the peak of the female oestrus (Tutin, 1979). We restricted our analyses to male to female sexual solicitation, as female to male sexual solicitation attempts rarely involve sequences of gestures in this population. We further restricted analysis to male solicitations by individuals at least 8 years old, as this is the minimum age of siring recorded in this community, limiting our signals to those on which there is more direct selective pressure.
Defining gesture types and tokens
In quantitative linguistics, word types are used to assess Zipf’s law of brevity, whereas tokens are used to assess patterns conforming to Menzerath’s law. The former involves the calculation of mean duration (L) of each word type (Ferrer-i-Cancho et al., 2013; Ferrer-i-Cancho & Hernández-Fernández, 2013), and the latter the quantification of the total duration of tokens (M) (Gustison et al., 2016). To distinguish the two, consider the question:
Which witch was which?
The question is composed of four tokens (the overall word count) and three different word types (which, witch, was). Gesture types (see S4 Table for a detailed repertoire description) were categorized according to the similarity of the gesture movement, which could be used either as a single instance or in a sequence, and each gestural instance represented an individual token. Sequence length was quantified as the number of gesture tokens produced with less than 1s between two consecutive gesture tokens; single gestures were coded as sequences of length one (Heesen et al., 2019; Hobaiter & Byrne, 2011b).
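The type/token distinction and the 1s pause rule can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `split_into_sequences`, the `(type, start, end)` token layout, and the timings are all hypothetical, and the sketch assumes tokens come from a single signaller and are already sorted by start time.

```python
from collections import Counter

# Word example from the text: tokens are instances, types are distinct forms.
words = "which witch was which".split()
n_tokens = len(words)          # every word counts as one token
type_counts = Counter(words)   # one entry per distinct word type
n_types = len(type_counts)


def split_into_sequences(tokens, max_gap=1.0):
    """Group time-ordered gesture tokens into sequences.

    tokens: list of (gesture_type, start_s, end_s), sorted by start time.
    A token joins the current sequence when the pause since the previous
    token ended is shorter than max_gap seconds; otherwise it opens a new
    sequence. Single gestures therefore become sequences of length one.
    """
    sequences = []
    for token in tokens:
        if sequences and token[1] - sequences[-1][-1][2] < max_gap:
            sequences[-1].append(token)
        else:
            sequences.append([token])
    return sequences


# Hypothetical tokens from one clip (illustrative values, not study data).
clip = [
    ("object_shake", 0.00, 2.40),
    ("object_shake", 2.80, 4.00),   # pause of about 0.4 s: same sequence
    ("other_gesture", 6.50, 7.10),  # pause of 2.5 s: new sequence
]
sequences = split_into_sequences(clip)
sizes = [len(seq) for seq in sequences]
print(n_tokens, n_types, sizes)
```

Under these assumptions the sentence yields four tokens and three types, and the clip is split into one sequence of size two followed by a single gesture.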
Gesture duration
Gesture duration was calculated using MPEG Streamclip (version 1.9.3beta). We measured gesture duration in frames, each lasting 0.04s. Following Heesen et al. (2019), gesture start was defined as the initial movement of a part of the body required for the production of the gesture. The end of a gesture corresponded to (1) the cessation of the body movement related to gesture production, or (2) a change in body positioning if the gesture relied on body alignment, or (3) the point at which the goal was fulfilled, and any further movement represented effective action (for example, locomotion or copulation).
Intra-observer reliability
Intra-observer reliability was tested by randomizing the order of the videos and re-coding the duration of the gestures of every ninth clip, for a total of 75 gestures from 23 clips. We performed an intraclass correlation coefficient (ICC) test – class 3 with n=1 rater (Landers, 2015) – which revealed very high agreement on gesture duration measurements (ICC=0.995, p<.001).
Statistical analysis
All data were analysed using R version 4.0.0 and RStudio version 1.2.5042 (R Core Team, 2020; RStudio Team, 2020). Compression predicts that mean duration should be smaller than expected by chance (Ferrer-i-Cancho et al., 2013). Similarly, optimal compression predicts linguistic laws as a correlation in a specific direction, i.e., the correlation cannot be positive (Ferrer-i-Cancho et al., 2013, 2020). Accordingly, we employed one-tailed tests throughout, but report the outcome of two-tailed equivalents in S3 File for comparison with previous findings (Heesen et al., 2019).
We conducted one-tailed Spearman rank correlation tests to analyse the relationship between the frequency within the sample of a gesture type (f) and its mean duration (d), calculated by dividing D, the total sum of all durations of the same gesture type, by f (i.e., d=D/f) (Semple et al., 2013). A similar procedure was used to test for a correlation between the mean gesture duration within a given sequence (t) and the number of gesture tokens in the same sequence (n). Mean gesture duration was calculated by dividing the total duration of a gestural sequence (T) – i.e., the sum of all durations of the gesture tokens present in the sequence excluding pauses between gestures – by the number of gesture tokens within that sequence n (i.e., t=T/n). A negative correlation between d and f coherent with Zipf’s law of abbreviation, and a negative correlation between t and n conforming to Menzerath’s law could both be unavoidable artefacts given the relationship between d and f, and between t and n – as defining d involves f, and defining t involves n – which could lead to d = 1/f and t=1/n (Ferrer-i-Cancho et al., 2014). Such artefacts can be excluded by establishing that D and f, and T and n are significantly positively correlated (Ferrer-i-Cancho et al., 2014; Semple et al., 2013), which we tested using two Spearman rank correlation tests. Earlier research demonstrated Zipf’s law of abbreviation can be present in parts of a repertoire when it appears to be absent in the whole repertoire (Ferrer-i-Cancho & Hernández-Fernández, 2013; Heesen et al., 2019). As a result, we also tested for Zipf’s law of abbreviation in specific subsets of the repertoire, namely manual versus whole-body gesture types which had been found to differ in previous work (Heesen et al., 2019).
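The quantities f, D and d, and the artefact check on D and f, can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions rather than the analysis code used in the study: the token list is invented, the helper names `average_ranks`, `pearson` and `spearman` are hypothetical, and p-values are omitted. The same steps apply to T, n and t for Menzerath's law.

```python
from collections import defaultdict


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman's rs: the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical (gesture type, duration in s) tokens, not study data.
tokens = [("a", 1.0), ("a", 1.0), ("a", 0.5), ("a", 1.5),
          ("b", 1.0), ("b", 2.0), ("c", 2.0)]

f = defaultdict(int)     # frequency of each gesture type
D = defaultdict(float)   # summed duration of each gesture type
for gesture_type, duration in tokens:
    f[gesture_type] += 1
    D[gesture_type] += duration

types = sorted(f)
freq = [f[t] for t in types]
total = [D[t] for t in types]
mean_d = [D[t] / f[t] for t in types]   # d = D / f

rs_zipf = spearman(freq, mean_d)        # negative if brevity holds
rs_artefact = spearman(freq, total)     # should be positive to rule out d = 1/f
print(freq, mean_d, rs_zipf, rs_artefact)
```

In this toy repertoire the more frequent types are shorter, and D still rises with f, so the negative correlation between d and f would not be attributed to the d = 1/f artefact.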
To control for repeated measures of individuals, we fitted Generalised Linear Mixed Models using the ‘lme4’ package (version 1.1-23; Bates et al., 2015), and based model evaluation on both AIC and BIC information criteria (Akaike, 1998; Schwarz, 1978). Model 1, which tested Zipf’s law of brevity, contained gesture token mean duration (s) as the response variable, the Proportion of occurrences of a particular gesture type in the dataset (Proportion) and gesture Category (manual vs whole-body) as main effects, and an interaction term between Proportion and Category. We included signaller ID as a random effect. Given the detection of the outlier gesture type Object shake, performed more than 250 times in the dataset, we ran a second model (Model 2) identical to Model 1 in structure but on a subset of the dataset that excluded the outlier gesture type. Model 3, which we used to test Menzerath’s law, contained average duration of gesture tokens (s) within a sequence as the response variable, sequence Size (number of gesture tokens) and proportion of whole-body gestures within the sequences (PWB) as main effects, and an interaction term between Size and PWB. Once more, we modelled signaller ID as a random effect. Prior to the GLMM analysis we assessed data distribution using the ‘fitdistrplus’ package (version 1.0-14; Delignette-Muller & Dutang, 2015). Following data inspection, we log-transformed gesture duration and average sequence duration, as data from the response variable were strongly skewed towards zero (for data inspection analysis and untransformed results, see S2 and S7 Files). We checked for collinearity of fixed factors by inspecting variance inflation factors (VIFs) using the function ‘vif’ of the ‘car’ package (version 3.0-8; Fox & Weisberg, 2019), which revealed collinearity not to be an issue in either model (all VIF < 1.10).
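Why AIC and BIC can rank the same pair of models differently follows from their penalty terms, as the following Python sketch shows. The log-likelihoods and parameter counts are made-up values chosen for illustration, not output from the fitted models; only the sample size of 560 tokens comes from the text.

```python
import math


def aic(log_lik, k):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * k - 2 * log_lik


def bic(log_lik, k, n_obs):
    """Bayesian information criterion: k ln(n) - 2 ln(L)."""
    return k * math.log(n_obs) - 2 * log_lik


n_obs = 560  # number of gesture tokens reported in the text

# Hypothetical fits: a null model (k = 3) and a full model with three
# additional fixed-effect terms (k = 6) and a slightly higher likelihood.
null_model = {"log_lik": -700.0, "k": 3}
full_model = {"log_lik": -695.0, "k": 6}

aic_null = aic(**null_model)
aic_full = aic(**full_model)
bic_null = bic(n_obs=n_obs, **null_model)
bic_full = bic(n_obs=n_obs, **full_model)

# Lower is better for both criteria.
print("AIC prefers full model:", aic_full < aic_null)
print("BIC prefers full model:", bic_full < bic_null)
```

With 560 observations each extra parameter costs 2 units under AIC but about 6.3 units under BIC, so a modest gain in likelihood can be enough for AIC to retain a term that BIC drops.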
Furthermore, we assessed model fit by comparing each full model against its relative null model containing only the random factor Signaller ID using the ‘anova’ function of the ‘car’ package (Fox & Weisberg, 2019). Finally, following Heesen et al. (2019), we used a test for compression (or the opposite effect of redundancy), assessing whether mean duration of all gesture types L and the total duration of the sequences M were significantly small (or significantly large). The permutation test produces a left p-value to check if L (or M) is significantly small and a right p-value to check if L (or M) is significantly large (S1 File; Heesen et al., 2019). The total number of permutations carried out was R=10^5.
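The logic of the compression test can be sketched as a Monte Carlo permutation test in Python. This is a simplified illustration, not the procedure in S1 File: the function name `compression_test`, the toy frequencies and durations, the seed, and the reduced number of permutations are all assumptions. The statistic is the weighted sum of f and d, which equals L once divided by the total number of tokens (the same function gives M when the weights are sequence sizes n and the values are mean durations t).

```python
import random


def compression_test(weights, values, n_perm=10_000, seed=0):
    """Permutation test on S = sum(w_i * v_i).

    values are shuffled across the fixed weights. Returns (S, p_left,
    p_right): p_left is the share of shuffles with S_shuffled <= S
    (evidence that S is small), p_right the share with S_shuffled >= S
    (evidence that S is large).
    """
    rng = random.Random(seed)
    observed = sum(w * v for w, v in zip(weights, values))
    shuffled = list(values)
    n_left = n_right = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        s = sum(w * v for w, v in zip(weights, shuffled))
        if s <= observed:
            n_left += 1
        if s >= observed:
            n_right += 1
    return observed, n_left / n_perm, n_right / n_perm


# Hypothetical repertoire: frequent types are short (not study data).
f = [10, 6, 3, 2, 1]             # frequency of each gesture type
d = [0.5, 1.0, 1.5, 2.0, 4.0]    # mean duration of each gesture type (s)

S, p_left, p_right = compression_test(f, d)
L = S / sum(f)   # mean code length; dividing by N leaves the p-values unchanged
print(round(L, 3), p_left, p_right)
```

Because the toy durations are perfectly ordered against frequency, the observed sum is the smallest one any shuffle can produce, so the left p-value is small and the right p-value equals one; real data would fall somewhere in between.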
The compression test and the correlation test above are related: it has been shown that the method to test if L (or M) is significantly small is equivalent to a one-tailed test on the Pearson correlation between f and d (or n and t; Ferrer-i-Cancho et al., 2020). Nonetheless, our correlation test and the compression test employ statistics that differ markedly. While a Pearson correlation is a measure of linear association, a Spearman correlation is a measure of monotonic, possibly non-linear association (de Siqueira Santos et al., 2014).
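A short Python sketch can make both points concrete. First, when d is shuffled across fixed f, the means and variances stay the same, so L is an increasing linear function of the Pearson correlation; the sketch checks this over every permutation of a toy repertoire. Second, a monotonic but non-linear relationship gives a perfect Spearman correlation while the Pearson correlation stays below one. All values and helper names here are hypothetical, and the `ranks` helper assumes no ties.

```python
from itertools import permutations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def ranks(values):
    """1-based ranks for values without ties (1 = smallest)."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


# Hypothetical frequencies and mean durations (not study data).
f = [10, 6, 3, 2, 1]
d = [0.5, 1.0, 1.5, 2.0, 4.0]
N, n = sum(f), len(f)
mean_f, mean_d = N / n, sum(d) / n
scale = (sum((a - mean_f) ** 2 for a in f)
         * sum((b - mean_d) ** 2 for b in d)) ** 0.5

# For every shuffle: L = (n * mean_f * mean_d + r * scale) / N.
max_gap = 0.0
for perm in permutations(d):
    L = sum(a * b for a, b in zip(f, perm)) / N
    L_from_r = (n * mean_f * mean_d + pearson(f, perm) * scale) / N
    max_gap = max(max_gap, abs(L - L_from_r))

# Monotonic but non-linear relationship: Spearman is 1, Pearson is not.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
r_pearson = pearson(x, y)
r_spearman = pearson(ranks(x), ranks(y))
print(max_gap, r_pearson, r_spearman)
```

The reverse situation is also possible: a few long or short values can shift a sum such as L or M, and hence the Pearson correlation, without changing the rank order much, which is one way the compression test and the Spearman test can disagree.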
Data and code
Data and code for all analyses are available in a public GitHub repository: github.com/Wild-Minds/LinguisticLaws_Papers
Acknowledgements
We thank the staff and field assistants of the Budongo Conservation Field Station for their assistance in the original gestural data collection, and the Ugandan National Council for Science and Technology and the Ugandan Wildlife Authority for permission to conduct the original research. We thank the Royal Zoological Society of Scotland for its funding of the field station. This research received funding from the European Union’s 8th Framework Programme, Horizon 2020, under grant agreement no 802719.