## Abstract

Dating population divergence within species from molecular data and relating such dating to climatic and biogeographic changes is not trivial. Yet it can help formulate evolutionary hypotheses regarding local adaptation and future responses to changing environments. Key issues include statistical selection of a demographic and historical scenario among a set of possible scenarios and estimation of the parameter(s) of interest under the chosen scenario. Such inferences greatly benefit from new statistical approaches including Approximate Bayesian Computation - Random Forest (ABC-RF), the latter providing reliable inference at a low computational cost, with the possibility to take into account prior knowledge on both biogeographical history and genetic markers. Here, we used ABC-RF, with or without independent information on evolutionary rate and pattern at microsatellite markers, to decipher the evolutionary history of the African arid-adapted pest locust, *Schistocerca gregaria*. We found that the evolutionary processes that have shaped the present geographical distribution of the species in two disjoint northern and southern regions of Africa were recent, dating back 2.6 Ky (90% CI: 0.9 – 6.6 Ky). ABC-RF inferences also supported a southern colonization of Africa from a low number of founders of northern origin. The inferred divergence history is better explained by the peculiar biology of *S. gregaria*, which involves a density-dependent swarming phase with some exceptional spectacular migrations, or by a brief fragmentation of the African forest core during the interglacial late Holocene, rather than a continuous colonization resulting from the continental expansion of open vegetation habitats during the past Quaternary glacial episodes.

## Introduction

Like other regions of the world, Africa has gone through several major episodes of climate change since the early Pleistocene (deMenocal 1995 and 2004). During glaciation periods, the prevailing climate was colder and drier than today, and it became more humid during warmer interglacial periods. These climatic phases resulted in shifts of vegetation (de Vivo and Carmignotto 2004) and are most likely at the origin of the current isolation between northern and southern distributions of arid-adapted species (Monod 1971). In Africa, at least fifty-six plant species show disjoint geographical distributions in southern and northern arid areas (Monod 1971; Jurgens 1997; Lebrun 2001).

Similarly, a number of vertebrate species show meridian disjoint distributions on this continent, including eight mammals and 29 birds (Monod 1971; de Vivo and Carmignotto 2004; Lorenzen *et al.* 2012). The desert locust, *Schistocerca gregaria*, is among the few examples of insect species distributed in two distinct regions along the north-south axis of Africa. Other known disjunctions in insects are interspecific and concern species of the families Charilaidae (Orthoptera) and Mythicomyiidae (Diptera), and of the genus *Fidelia* (Hymenoptera) (Le Gall *et al.* 2010). Similarities in extant distributions of African arid-adapted species across divergent taxonomic groups point to a common climatic history and an important role of environmental factors. Yet, to our knowledge, studies relating evolutionary history and climatic history have rarely been carried out on this continent (but see mitochondrial studies by Miller *et al.* 2011 on the ostrich, Atickem *et al.* 2018 on the black-backed jackal, and Moodley *et al.* 2018 on the white rhinoceros).

Dating population or subspecies divergence within a species and relating such dating to climatic and biogeographic changes in the species history is not trivial. First, global climate models have been largely calibrated using northern hemisphere drivers and validation datasets. Their quality has therefore been tested less often in Africa, even less so when it comes to hindcasting potential distributions using projections of such climate models into different past temporal windows. Recent comparisons between botanical and climate models have suggested that climate forcing in Africa may operate in a different way, and have therefore cast some doubt on the validity of such projections, in particular over long time periods extending several thousand years into the past (Chase and Meadows 2007; Dupont 2011). Second, finding a reliable calibration to convert measures of genetic divergence into units of absolute time is challenging, especially so for recent evolutionary events (Ho *et al.* 2008). Extra-specific fossil calibration may lead to considerable overestimates of divergence times and internal fossil records are often lacking (Ho *et al.* 2008). A sensible approach when internal calibration is available for a related species is to import an evolutionary rate estimated from sequence data of this species (Ho *et al.* 2008). Unfortunately, on the African continent, fossils, such as radiocarbon-dated ancient samples, remain relatively rare and are often not representative of modern lineages (*e.g.*, Le Gall 2010 for insects). The lack of paleontological and archaeological records is partly due to their fragility under the arid conditions of the Sahara. The end result is that the options to relate population divergence to biogeographic events in this region are very limited.

In this context, the use of versatile molecular markers, such as microsatellite loci, for which evolutionary rates can be obtained from direct observation of germline mutations in the species of interest, represents a useful alternative. Microsatellite mutation rates exceed by several orders of magnitude those of point mutations in DNA sequences, ranging from 10^{-6} to 10^{-2} events per locus and per generation (Ellegren 2000). This property makes it possible both to observe mutation events in parent-offspring segregation data of realistic sample size and to work out the recent history of related populations. However, using microsatellite loci to estimate divergence times at recent evolutionary time-scales still requires overcoming significant challenges. Since microsatellite allele sizes result from the insertion or deletion of single or multiple repeat units and are tightly constrained, these markers can be characterized by high levels of homoplasy that can obscure inferences about gene history (*e.g.*, Estoup *et al.* 2002). In particular, at large time scales (*i.e.*, for distantly related populations), genetic distance values no longer follow a linear relationship with time, reach a plateau and hence provide biased and unreliable estimates of divergence time (Takezaki and Nei 1996; Feldman *et al.* 1997; Pollock 1998). Microsatellites remain informative with respect to divergence time only if the population split occurred within the period of linearity with time (Feldman *et al.* 1997; Pollock 1998). The exact value of the differentiation threshold above which microsatellite markers would no longer accurately reflect divergence times will depend on constraints on allele sizes and population-scaled mutation rates (Feldman *et al.* 1997; Pollock 1998).
For any inferential framework, including independent information on microsatellite allele size constraints and mutation rates (for instance into priors when using Bayesian methods) is expected to improve the accuracy of parameter estimation, especially when considering divergence times between populations.
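
The saturation effect described in this paragraph can be illustrated with a minimal simulation sketch. It assumes a symmetric single-step mutation model with hard bounds on allele size, counts time in mutation steps per lineage, and uses the mean squared difference in allele size between two diverged lineages as the distance. The function name, the bounds and the step counts are illustrative choices and are not taken from this study.

```python
import random

def mean_sq_diff(n_steps, lo, hi, n_loci=1000, seed=1):
    """Mean squared allele-size difference between two lineages after n_steps
    single-step mutations each, with allele sizes held within [lo, hi]."""
    rng = random.Random(seed)
    start = (lo + hi) // 2  # both lineages inherit the same ancestral allele
    total = 0
    for _locus in range(n_loci):
        sizes = [start, start]
        for k in range(2):
            for _step in range(n_steps):
                new = sizes[k] + rng.choice((-1, 1))
                if lo <= new <= hi:  # a mutation leaving the allowed range is rejected
                    sizes[k] = new
        total += (sizes[0] - sizes[1]) ** 2
    return total / n_loci

for steps in (10, 50, 200):
    free = mean_sq_diff(steps, -10**6, 10**6)  # effectively unconstrained
    bounded = mean_sq_diff(steps, 0, 9)        # ten allowed allele sizes
    print(steps, round(free, 1), round(bounded, 1))
```

In this toy setting the unconstrained distance keeps growing with the number of steps, whereas the constrained one levels off, at which point it no longer carries information about the time since the split.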

The desert locust, *S. gregaria*, is a generalist herbivore that can be found in arid grasslands and deserts in both northern and southern Africa (Figure 1A). In its northern range, the desert locust is one of the most widespread and harmful agricultural pest species with a huge potential outbreak area, spanning from West Africa to Southwest Asia. The desert locust is also present in the southwestern arid zone (SWA) of Africa, which includes South Africa, Namibia, Botswana and south-western Angola. The southern populations of the desert locust are termed *S. g. flaviventris* and are geographically separated by nearly 2,500 km from populations of the nominal subspecies from northern Africa, *S. g. gregaria* (Uvarov 1977). The isolation of *S. g. flaviventris* and *S. g. gregaria* lineages was recently supported by distinctive mitochondrial DNA haplotypes and male genitalia morphologies (Chapuis *et al.* 2016). Yet, the precise history of divergence remains elusive.

The main objective of the present study is to unravel the historical and evolutionary processes that have shaped the present disjoint geographical distribution of the desert locust and the genetic variation observed both within and between populations of its two subspecies. To this aim, we first used paleo-vegetation maps to construct biogeographic scenarios relevant to African species from arid grasslands and deserts. We then used molecular data obtained from microsatellite markers for which we could obtain independent information on evolutionary rates and allele size constraints in the species of interest from direct observation of germline mutations (Chapuis *et al.* 2015). We applied newly available algorithms of the Approximate Bayesian Computation - random forest method (ABC-RF; Pudlo *et al.* 2016; Estoup *et al.* 2018a; Raynal *et al.* 2019) on our microsatellite population genetic data to compare a set of thoroughly formalized and justified evolutionary scenarios and estimate the divergence time between *S. g. gregaria* and *S. g. flaviventris* under the most likely of our scenarios. Finally, we interpret our results in the light of the paleo-vegetation information we compiled and various biological features of the desert locust.

### New approaches

Due to its great flexibility, Approximate Bayesian Computation (ABC, Beaumont *et al.* 2002) is an increasingly common statistical approach used to perform model-based inferences in a Bayesian setting, especially when complex models are considered (*e.g.*, Beaumont 2010, Bertorelle *et al.* 2010, Csilléry *et al.* 2010). However, both theoretical arguments and simulation experiments indicate that scenario posterior probabilities can be poorly evaluated by standard ABC methods, even though the numerical approximations of such probabilities can preserve classification (Robert *et al.* 2011). To overcome this problem, Pudlo *et al.* (2016) recently proposed a novel approach based on a machine learning tool named random forests (RF) (Breiman 2001), hence leading to the ABC-RF methodology. When compared with standard ABC methods, the ABC-RF approach enables efficient discrimination among scenarios and estimation of posterior probability of the best scenario while being computationally less intensive. Building on that success, Raynal *et al.* (2019) recently proposed an extension of the RF methodology applied in a (non-parametric) regression setting to estimate the posterior distributions of parameters of interest under a given scenario. When compared with various ABC solutions, this new RF method offers many advantages: a significant gain in terms of robustness to the choice of the summary statistics; independence from any type of tolerance level; and a good trade-off in terms of the precision of point estimates and the quality of credible interval estimates for a given computing time (Raynal *et al.* 2019). An overview of the ABC-RF methods used in the present paper is provided in Supplementary Material S1.
Readers can consult Pudlo *et al.* (2016), Fraimout *et al.* (2017), Estoup *et al.* (2018a,b) and Marin *et al.* (2018) for scenario choice, and Raynal *et al.* (2019) for parameter estimation, for further detailed statistical descriptions, testing and applications of ABC-RF algorithms.
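
To make the logic of scenario choice by classification votes more concrete, the following is a deliberately simplified sketch and not the abcrf implementation. It assumes a hypothetical toy simulator with three made-up summary statistics and two scenarios (0 = no bottleneck, 1 = bottleneck), and it replaces full decision trees with one-threshold "stumps" trained on bootstrap samples. It returns the votes for an observed dataset and an out-of-bag error rate computed over the simulated reference table. Vote counts are not posterior probabilities, which require a further estimation step that is not attempted here.

```python
import random
from collections import Counter

def simulate(scenario, rng):
    """Hypothetical toy simulator returning three summary statistics."""
    het = rng.gauss(0.5 if scenario == 1 else 0.7, 0.08)    # diversity, lower after a bottleneck
    fst = rng.gauss(0.15 if scenario == 1 else 0.10, 0.05)  # differentiation, weakly informative
    noise = rng.gauss(0.0, 1.0)                             # pure noise statistic
    return [het, fst, noise]

def fit_stump(X, y, rows, features):
    """Among the candidate features, keep the one-threshold rule that best
    classifies the bootstrap rows."""
    best = None
    for j in features:
        v0 = [X[i][j] for i in rows if y[i] == 0]
        v1 = [X[i][j] for i in rows if y[i] == 1]
        mean0, mean1 = sum(v0) / len(v0), sum(v1) / len(v1)
        thr = (mean0 + mean1) / 2
        high = 1 if mean1 > mean0 else 0  # label predicted above the threshold
        acc = sum((high if X[i][j] > thr else 1 - high) == y[i] for i in rows)
        if best is None or acc > best[0]:
            best = (acc, j, thr, high)
    return best[1:]

def predict(stump, x):
    j, thr, high = stump
    return high if x[j] > thr else 1 - high

def forest_choice(X, y, x_obs, n_trees=100, seed=0):
    """Return scenario votes for x_obs and the out-of-bag error rate."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    votes = Counter()
    oob = [Counter() for _ in range(n)]
    for _tree in range(n_trees):
        rows = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        stump = fit_stump(X, y, rows, rng.sample(range(p), 2))
        votes[predict(stump, x_obs)] += 1
        in_bag = set(rows)
        for i in range(n):
            if i not in in_bag:  # simulation not seen by this stump
                oob[i][predict(stump, X[i])] += 1
    scored = [(c.most_common(1)[0][0], y[i]) for i, c in enumerate(oob) if c]
    return votes, sum(pred != true for pred, true in scored) / len(scored)

rng = random.Random(42)
y = [i % 2 for i in range(400)]    # reference table: scenario labels
X = [simulate(s, rng) for s in y]  # and their summary statistics
votes, prior_error = forest_choice(X, y, x_obs=[0.5, 0.15, 0.0])
print(dict(votes), round(prior_error, 3))
```

The out-of-bag error reuses simulations left out of each bootstrap sample, so no separate test table has to be simulated; this is the sense in which such error estimates are computationally parsimonious.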

To our knowledge, the present study is the first one using recently developed ABC-RF algorithms to carry out inferences about both scenario choice and parameter estimation, on a real multi-locus microsatellite dataset. It includes and illustrates three novelties in statistical analyses that were particularly useful for reconstructing the evolutionary history of the divergence between *S. g. gregaria* and *S. g. flaviventris* subspecies: model grouping analyses based on several key evolutionary events, assessment of the quality of predictions to evaluate the robustness of our inferences, and incorporation of previous information on the mutational setting of the used microsatellite markers.

#### (1) Model grouping

Both the poor knowledge of the species history and the complex climatic history of Africa make it necessary to consider potentially complex evolutionary scenarios. We formalized eight competing scenarios including (or not) three key evolutionary events that we identified as having potentially played a role in setting up the disjoint distribution of the two locust subspecies (for details see the section *Formalization of evolutionary scenarios* in Materials and methods). Following the new approach proposed by Estoup *et al.* (2018a), we processed ABC-RF analyses grouping scenarios based on the presence or absence of each type of evolutionary event, before considering all scenarios separately. Such a grouping approach in scenario choice is of great interest for assessing the level of confidence with which each specific evolutionary event of interest can be inferred.
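
The grouping logic can be sketched as follows. The event labels and the enumeration order are illustrative assumptions and do not reproduce the scenario numbering used in this study; the point is only that three present/absent events yield eight scenarios, and that each event splits them into two groups of four that can be compared as a two-class problem.

```python
from itertools import product

EVENTS = ("contraction", "bottleneck", "admixture")

# Each scenario is a tuple of booleans, one per event (2**3 = 8 combinations).
scenarios = list(product((False, True), repeat=len(EVENTS)))

def group_by_event(event):
    """Split the scenarios into those without and with the given event."""
    k = EVENTS.index(event)
    return {flag: [s for s in scenarios if s[k] == flag] for flag in (False, True)}

for event in EVENTS:
    groups = group_by_event(event)
    print(event, len(groups[True]), "with /", len(groups[False]), "without")
```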

#### (2) Assessing the quality of predictions

For scenario choice and parameter estimation, we evaluated the robustness of our inferences at both a global (*i.e.*, prior) and a local (*i.e.*, posterior) scale. The global prior error was computed using the computationally parsimonious out-of-bag prediction method for scenario identity and parameter values covering the entire prior multidimensional space. Since error levels may differ depending on the location of an observed dataset in the prior data space, prior-based indicators are of limited relevance, aside from their use to select the best classification method and set of predictors, here our summary statistics. Therefore, in addition to global prior errors, we computed local posterior errors, conditionally to the observed dataset. The latter errors measure prediction quality exactly at the position of the observed dataset. For model choice, we demonstrated that the error measure given the observation can be computed as 1 minus the posterior probability of the selected scenario. For parameter estimation, we propose an innovative way to approximate local posterior errors, again relying partly on out-of-bag predictions. See the section *Local posterior errors* in Supplementary Material S1 for details. These statistical novelties were implemented in a new version of the R package abcrf (version 1.8), available on CRAN. Finally, for estimation of divergence time between the two subspecies, we evaluated how accurately the divergence time posterior distributions reflected true divergence time values and the threshold above which the divergence time posterior estimates reach a plateau. To do this, we used simulated pseudo-observed datasets to compute error measures conditionally to a subset of fixed divergence time values chosen to cover the entire prior interval.

#### (3) Incorporation of previous information into the microsatellite mutational setting

Our ABC-RF statistical treatments benefited from the incorporation of previous estimations of mutation rates and allele size constraints for the microsatellite loci used in this study. Microsatellite mutation rates and patterns of most eukaryotes remain to a large extent unknown and, to our knowledge, the present study is one of the few in which independent information on mutational features was incorporated into the microsatellite prior distributions. We thoroughly evaluated to what extent the incorporation of such independent information improved the performance of ABC-RF for choosing among evolutionary scenarios and for estimating the time of divergence between the two locust subspecies.

## Results

### Formalization of evolutionary scenarios

Using a rich corpus of (paleo-)vegetation data, we reconstructed the present time (Fig. 1C) and past time (Figs. 1D-F) distribution ranges of *S. gregaria* in Africa, going back to the Last Glacial Maximum period (LGM, 26 to 14.8 Ky ago). Maps of vegetation cover for glacial arid maximums (Figs. 1E and 1F) showed an expansion of open vegetation habitats sufficient to make the potential range of the species continuous from the Horn of Africa in the north-east to the Cape of Good Hope in the south. Maps of vegetation cover for interglacial humid maximums (Fig. 1D) showed a severe contraction of deserts. These maps helped us formalize eight competing evolutionary scenarios (Figure 2), as well as bounds of prior distributions for various parameters (see the section *Prior setting for divergence parameters* in Materials and methods). The eight competing scenarios included different combinations of three key evolutionary events that we identified as having potentially played a role in setting up the observed disjoint distribution of the two locust subspecies: (i) a long population size contraction in the ancestral population, due to the reduction of open vegetation habitats during the interglacial periods, (ii) a bottleneck in the southern subspecies *S. g. flaviventris* right after divergence, associated with a single long-distance migration event of a small fraction of the ancestral population, and (iii) a secondary contact with an asymmetrical genetic admixture from *S. g. gregaria* into *S. g. flaviventris*, in order to consider the many climatic transitions of the last Quaternary.

### Scenario choice

ABC-RF analyses supported the same best scenario or group of scenarios for all ten replicate analyses (Table 1). The classification votes and posterior probabilities estimated for the observed microsatellite dataset were the highest for the groups of scenarios in which (i) *S. g. flaviventris* experienced a bottleneck event at the time of the split (average of 2,890 votes out of 3,000 RF-trees; posterior probability = 0.965), (ii) the ancestral population experienced a population size contraction (2,245 of 3,000 RF-trees; posterior probability = 0.746), and (iii) no admixture event occurred between populations after the split (2,370 of 3,000 RF-trees; posterior probability = 0.742). When considering the eight scenarios separately, the highest classification vote was for scenario 4, which congruently excludes secondary contact and includes a population size contraction in the ancestral population and a bottleneck event at the time of divergence in the *S. g. flaviventris* subspecies (1,777 of 3,000 RF-trees). The posterior probability of scenario 4 averaged 0.584 over the ten replicate analyses (Table 1).

Table S2.1 (Supplementary Material S2) shows that only two other scenarios obtained at least 5% of the votes: scenario 2 including only a single bottleneck event in *S. g. flaviventris* (mean of 537 votes) and scenario 8 with a bottleneck event in *S. g. flaviventris*, a population size contraction in the ancestral population and a secondary contact with admixture from *S. g. gregaria* into *S. g. flaviventris* (mean of 380 votes). All other scenarios obtained less than 5% of the votes and were hence even more weakly supported. Scenario 4 obtained the highest number of votes also for analyses based on a naive mutational prior setting for microsatellite markers, *i.e.*, when drawing prior values for mean mutation parameters from uniform distributions instead of setting them to a fixed value as in our informed mutational prior setting (Table 1 and Table S3.1, Supplementary Material S3; see also the Materials and methods section *Microsatellite dataset, mutation rate and mutation model* for details about the microsatellite prior distributions for the informed and naive mutational settings). Posterior probability values for scenario 4 and for the best groups of scenarios were slightly lower when using a naive mutational prior setting, except for the group without any admixture event (Table 1).

We found that posterior error rates (*i.e.*, 1 minus the posterior probabilities) were lower than prior error rates for the analyses considering either groups of scenarios based on the presence (or not) of a bottleneck in *S. g. flaviventris* (*i.e.*, 3.5% versus 10.2%) or the scenarios separately (*i.e.*, 41.6% versus 47.9%). For other groups of scenarios, the discrimination power was similar at both the global (prior error rates) and local (posterior error rates) scales, with values ranging from 23.5% to 25.8% (Table 1). Altogether, these results indicate that the observed dataset belongs to a region of the data space where the power to discriminate among scenarios is higher than the global power computed over the whole prior data space, and that the presence or absence of a bottleneck in *S. g. flaviventris* is the demographic event with the most robust prediction in our ABC-RF treatments. These results hold true when using a naive mutational prior setting (Table 1). They can be visually illustrated by the projection of the reference table datasets and the observed one on a single (when analyzing pairwise groups of scenarios) or on the first two linear discriminant analysis (LDA) axes (when analyzing the eight scenarios considered separately) (Figure S2.1, Supplementary Material S2 and Figure S3.1, Supplementary Material S3).
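
As a worked check of the figures quoted in this paragraph, the short sketch below recomputes the local (posterior) error as 1 minus the posterior probability and sets it next to the global (prior) error. The numerical values are those quoted in the text; the dictionary keys and the function name are only illustrative.

```python
# Posterior probability of the selected scenario or group, and the matching
# prior error rate, as quoted in the text (Table 1).
analyses = {
    "bottleneck in S. g. flaviventris": (0.965, 0.102),
    "eight scenarios separately": (0.584, 0.479),
}

def posterior_error(posterior_probability):
    """Local error at the observed dataset: 1 minus the posterior probability."""
    return 1.0 - posterior_probability

for name, (post_prob, prior_error) in analyses.items():
    print(f"{name}: posterior error {posterior_error(post_prob):.1%}, "
          f"prior error {prior_error:.1%}")
```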

Figure S2.2, Supplementary Material S2, illustrates how RFs automatically rank the summary statistics according to their level of information. It shows that the set of most informative statistics differs depending on the comparison (groups of scenarios or individual scenarios). Two-sample statistics that measure the amount of genetic variation shared between populations (*F*_{ST}, LIK and DM2) were among the most informative when discriminating among groups of scenarios including or not an admixture event. For groups of scenarios differing by population size variation events, statistics summarizing variation between the two subspecies samples (*F*_{ST} and DM2 for the bottleneck event in *S. g. flaviventris*; DAS and LIK for the population size contraction in the ancestral population) and statistics summarizing genetic variation within subspecies samples (mean expected heterozygosity and mean number of alleles for both population size variation events) were among the most discriminative ones. Only eight single-sample statistics were not informative (according to their position relative to the noise statistics added to our treatments) when considering the eight individual scenarios separately. All those non-informative statistics were associated with the set of transcribed microsatellites (Figure S2.3, Supplementary Material S2). When using a naive mutational prior setting, twice as many summary statistics turned out to be non-informative (Figure S3.2, Supplementary Material S3).

### Parameter estimation

Figure 3A shows point estimates with 90% credibility intervals of the posterior distribution of the divergence time between the two subspecies under the best supported scenario 4. Our estimations point to a young age of subspecies divergence, with a median divergence time of 2.6 Ky and a 90% credibility interval of 0.9 to 6.6 Ky, when using informed mutational priors and assuming an average of three generations per year (Table 2 and Table S2.2, Supplementary Material S2). The naive mutational prior setting led to a median estimate of 1.7 Ky with a wider 90% credibility interval of 0.4 to 7.9 Ky (Fig. 3A, Table 2 and Table S3.2, Supplementary Material S3). Accuracy of divergence time estimation was similar at both the global and local scales (*i.e.*, normalized mean absolute errors, NMAE, of 0.369 and 0.359, respectively; Table 3). The incorporation of independent information into prior distributions of mutational parameters allowed a more accurate estimation of the median divergence time (NMAE values were 30% higher when using the naive mutational prior setting; Table 3). This observation holds true for the three other demographic parameters, with NMAE values 4 to 35% lower when using informed mutational priors (Table 3).
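
Two small computations underlie the figures in this paragraph and can be sketched as follows. The conversion uses the three generations per year stated in the text. The NMAE function assumes the usual definition, namely the absolute difference between point estimate and true simulated value, divided by the true value and averaged over test datasets; the toy inputs passed to it are hypothetical.

```python
GENERATIONS_PER_YEAR = 3  # average assumed in the text

def years_to_generations(years):
    return years * GENERATIONS_PER_YEAR

def nmae(estimates, true_values):
    """Mean of |estimate - truth| / truth over simulated test datasets."""
    errors = [abs(e - t) / t for e, t in zip(estimates, true_values)]
    return sum(errors) / len(errors)

# Lower bound, median and upper bound of the divergence time, in years.
for years in (900, 2600, 6600):
    print(years, "years ->", years_to_generations(years), "generations")

# Hypothetical point estimates for three pseudo-observed datasets.
print(nmae([110.0, 90.0, 200.0], [100.0, 100.0, 100.0]))
```

Because each error is scaled by the true value, the measure is comparable across parameters that differ by orders of magnitude.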

Using the median as a point estimate, we estimated that the population size contraction in the ancestor could have occurred at a time about threefold older than the divergence time between the subspecies (Table 2). Estimations of the ratio of stable effective sizes of the *S. g. gregaria* and *S. g. flaviventris* populations (*i.e., N*_{f} / *N*_{g}) showed wide 90% credibility intervals that included the ratio value of 1 (Table 2). Accuracy analysis indicates that our genetic data hold little information on this composite parameter (Table 3). The bottleneck intensity during the colonization of south-western Africa (*i.e., db*_{f} / *Nb*_{f}) shows the highest accuracy of estimation (Table 3). The median of 1 and the 90% credibility interval of 0.5 to 2.4 exclude both severe and mild bottlenecks and instead support a strong to moderate event (Table 2).

The most informative summary statistics were different depending on the parameter of interest (results not shown). For the time since divergence between the two subspecies, the most informative statistics corresponded to the expected heterozygosity computed within the *S. g. flaviventris* sample and the mean index of classification from *S. g. flaviventris* to *S. g. gregaria* (Figure S2.4, Supplementary Material S2). The addition of noise variables in our treatments showed that most statistics characterizing genetic variation within the *S. g. gregaria* sample were not informative. These results hold true when using a naive mutational prior setting (Figure S3.3, Supplementary Material S3).

Constraints on allele sizes in conjunction with high population-scaled mutation rates potentially strongly affect the linearity of the relationship between mutation accumulation and time of divergence estimated from microsatellite data. We thus evaluated the accuracy of ABC-RF estimation of the population divergence time as a function of the time scale, under scenario 4. Analyses of pseudo-observed datasets using informed mutational priors showed that the ABC-RF median estimate of divergence time reached a plateau for time scales ≥ 100,000 generations (Figure 4). Thus, the divergence time between *S. g. flaviventris* and *S. g. gregaria* estimated on our real microsatellite dataset (∼10,000 generations) is positioned within the period of linearity with time, well before reaching a plateau reflecting a saturation of genetic information at microsatellite markers. It is hence expected to represent a sensible estimation of the actual divergence time. Figure 4 also showed that the use of a naive mutational prior setting led to a downward bias of the point estimate and to a lower accuracy of estimations. As a result, the incorporation of independent information into the prior distributions of mutational parameters considerably decreased both the NMAE for median estimates and the relative amplitude for time-scales < 100,000 generations (Figure 5).

## Discussion

### A young age of subspecific divergence

With a 90% credibility interval of the posterior density distribution of the divergence time at 0.9 to 6.6 Ky, our ABC-RF analyses clearly point to a divergence of the two desert locust subspecies occurring during the present Holocene geological epoch (0 to 11.7 Ky ago; Figure 3A). The posterior median estimate (2.6 Ky) and interquartile range (1.8 to 3.7 Ky) postdate the middle-late Holocene boundary (4.2 Ky). This past time boundary corresponds to the last transition from humid to arid conditions in the African continent (Figure 3B). This increasing aridity was shown to be a progressive change, with a concomitant maximum in northern and southern Africa at around 4 to 4.2 Ky ago, when aridity caused a contraction of the forest at its northern and southern peripheries without affecting its core region (Guo *et al.* 2000; Maley *et al.* 2018). Interestingly, the earliest archaeological records of the desert locust, found at the Tin Hanakaten (Algeria) and Saqqara (Egypt) archaeological sites, date back to this period (see Figure 3B and references within). Pollen records also showed that during this period the plant community was dominated by the desert and semi-desert taxa found today, including some species of prime importance for the current ecology of the desert locust (Kröpelin *et al.* 2008, Shi *et al.* 1998, Duranton *et al.* 2012). Since then, the past 4 Ky are thought to have been environmentally stable and as dry as at present. One can therefore reasonably assume that, at the inferred divergence time between the two locust subspecies, the connectivity between the two African hemispheres was still limited by the moist equator, in particular in the west, and by the savannahs and woodlands of the eastern coast (Figure 1C).
Consequently, contrary to most phylogeographic studies on other African arid-adapted species (Atickem *et al.* 2018, Moodley *et al.* 2018), it is unlikely that the rather ancient Quaternary climatic history explains the southern range extension of the desert locust; see Supplementary Material S4 for additional points of discussion on the influence of climatic cycles on *S. gregaria*.

Recent geological and palynological research has shown that a brief fragmentation of the African primary forest occurred during the Holocene interglacial from 2.5 Ky to 2.0 Ky ago (reviewed in Maley *et al.* 2018). This forest fragmentation period is characterized by relatively warm temperatures and a lengthening of the dry season rather than an arid climate. Although this period does not correspond to a phase of general expansion of savannas and grasslands, it led to the opening of the Sangha River Interval (SRI). The SRI corresponds to a 400 km wide (14–18° E) open strip composed of savannas and grasslands dividing the rainforest in a north-south direction. The SRI corridor is thought to have facilitated the southern migration of Bantu-speaking pastoralists, along with cultivation of the semi-arid sub-Saharan cereal, pearl millet, *Pennisetum glaucum* (Schwartz 1992; Bostoen *et al*. 2015). The Bantu expansion took place between approximately 5 and 1.5 Ky ago and reached the southern range of the desert locust, including northern Namibia for the Western Bantu branch and southern Botswana and eastern South Africa for the Eastern Bantu branch (Vansina 1995). We cannot exclude that the recent subspecific distribution of the desert locust has been mediated by this recent climatic disturbance, which included a north-south corridor of open vegetation habitats and the diffusion of agricultural landscapes through the Bantu expansion. The progressive reappearance of forest vegetation 2 Ky ago would have then led to the present-day isolation and subsequent genetic differentiation of the new southern populations from northern parental populations.

Our ABC-RF results indicate that a demographic bottleneck (*i.e.*, a strong transitory reduction of effective population size) occurred in the nascent southern subspecies of the desert locust. The high posterior probability value (96.5%) shows that this evolutionary event could be inferred with strong confidence. This result can be explained by the abovementioned colonization hypothesis if the proportion of suitable habitats for the desert locust in the SRI corridor was low, strongly limiting the carrying capacity during the time of range expansion. Alternatively, the bottleneck event in *S. g. flaviventris* can be explained by a southern colonization of Africa through a long-distance migration event. Long-distance migrations are possible in the gregarious phase of the desert locust, with swarms of winged adults that regularly travel up to 100 km in a day (Roffey and Magor 2003). However, since effective displacements are mostly downwind in this species, the likelihood of a southwestern transport of locusts depends on the dynamics of winds and pressure over Africa (Nicholson 1996, Waloff and Pedgley 1986). Because in southern Africa winds blow mostly from the north-east toward the extant south-western distribution of the desert locust (at least in southern winter, *i.e.*, August; Figure 1A), only exceptional conditions of a major plague event may have brought one or a few swarms to East Africa (see Figure 1B) and sourced the colonization of south-western Africa. In agreement with this, rare southward movements of desert locust have been documented along the eastern coast of Africa, for instance in Mozambique in January 1945 during the peak of the major plague of 1941-1947 (Waloff 1966).

### Gain in statistical inferences when incorporating independent information into the mutational prior setting

The mutational rate and spectrum at molecular markers are critical parameters for model-based population genetics inferences (*e.g.*, Estoup *et al.* 2002). We found that the specification into prior distributions of previous estimations of microsatellite mutation rates and allele size constraints substantially improved the accuracy of the divergence time estimation. Using a naive mutational prior setting, where values for mutational parameters were drawn from uniform distributions allowing for larger uncertainties with respect to mutation rates and allele size constraints, resulted in a larger credibility interval of the divergence time estimated from the observed dataset. The latter credibility interval did not include, however, another transition to a dry climatic period, such as the Younger Dryas (YD, 12.9 to 11.7 Ky) or the Last Glacial Maximum (LGM, 21.1 to 17.2 Ky), two periods with a more continuous potential ecological range for the desert locust. Simulation studies also showed that a naive mutational prior setting resulted in a downward bias in the median estimate, which could have altered the historical interpretation of our results. For example, the down-biased estimate of the divergence time obtained when using a naive mutational prior setting (median of 1.7 Ky) agrees less with the timing of the aridity associated with the SRI opening (2.5 Ky to 2 Ky). For scenario choice, the inferential gain in incorporating independent information in mutational prior setting was weaker, with power and error rates decreasing by only a few percent.

It is legitimate to ask whether the observed increases in confidence levels in scenario choice and parameter estimation are worth the substantial efforts required to estimate microsatellite mutation rates from direct observation of germline mutations in non-model species. As food for thought, the use of a uniform prior rather than a log-uniform prior for time period parameters led to an absolute bias and an increase in the credibility interval of the divergence time estimate similar to those observed when using a naive rather than an informed mutational prior setting (Supplementary Material S5). Using a log-uniform distribution remains a sensible choice for parameters with ranges of values covering several if not many log-intervals, as doing so allows assigning equal probabilities to each of the log-intervals. The observed effect of prior shape distributions highlights, once again, the well-known potential impacts of the prior settings assumed in Bayesian analyses, and calls for processing various error and accuracy analyses using different prior settings as done in the present study.

### Implication for the evolution of phase polyphenism

Interestingly, the southern subspecies *S. g. flaviventris* lacks, at least partly, the capacity to mount some of the phase polyphenism responses associated with swarming observed in the northern subspecies *S. g. gregaria* (reviewed in Chapuis *et al.* 2017). Since the *S. g. flaviventris* lineage arose about 7,700 generations ago, it seems unlikely that a hard selective sweep from *de novo* mutation(s) is responsible for the loss of phase polyphenism, although the large effective population sizes may prevent their loss by genetic drift and increase the efficacy of selection (Kimura 1962). Selection on standing genetic variation may therefore better explain such a rapid evolution, since beneficial alleles are immediately available, less likely to be lost by drift than new mutations, and may have been pre-tested by selection in past environments (Barrett and Schluter 2008). Such a scenario would require that variants associated with the reduction of phase polyphenism in *S. g. flaviventris* were already present in past *S. g. gregaria* environments at relatively high frequencies, which may have occurred through *prior* adaptation. First, temporal heterogeneity in selection between low-density (solitarious) and high-density (gregarious) environments in the northern range may have contributed to retain a high level of genetic variance on this trait (Siepielski *et al.* 2009; Pélissié *et al.* 2016). Second, the southern colonization was preceded by a prolonged and severe contraction of northern deserts, providing ecological conditions favorable for the evolution of a solitarious phase in the native environment that may have facilitated adaptation in the novel southern range of the species.

Hundreds to thousands of genes have been previously identified as differentially expressed between isolated (solitarious) and crowded (gregarious) phases of the desert locust but the challenge of targeting those relevant to the polyphenetic switch is daunting (Badisco *et al.* 2011, Bakkali and Martín-Blázquez 2018). In this context, a promising investigation axis to identify key genes (or transcripts) is to use population genomics (or transcriptomics) approaches comparing highly polyphenic *S. g. gregaria* populations and less polyphenic *S. g. flaviventris*. In particular, genomics studies based on genome scans (reviewed in Vitti et al. 2013) use population samples to measure genetic diversity and differentiation at many loci, with the goal of detecting loci under divergent selection. Since the variance in differentiation estimates across loci is expected to be lower in poorly differentiated populations (Hoban et al. 2016), the recent divergence between desert locust lineages should ease the detection of signatures of natural selection. Genome scans can lead to misleading signals of selection if the effects of geographical, temporal and demographic factors are not properly accounted for (Li *et al.* 2012; Vitti *et al.* 2013). For example, bottlenecks may create spurious signatures that mimic those left by positive selection. Future genome scan studies will therefore greatly benefit from the historical and demographic parameters inferred in the present study, as they could be explicitly included in the analytical process (*e.g.* Vitalis *et al.* 2001; Nielsen *et al.* 2009).

## Materials & Methods

### Formalization of evolutionary scenarios

To help formalize the evolutionary scenarios to be compared, we relied on maps of vegetation cover in Africa from the Quaternary Environment Network Atlas (Adams and Faure 1997), considering more specifically the periods representative of arid maximums (LGM and YD; Fig.1E-F), humid maximums (HCO; Fig.1D), and present-day arid conditions (Fig.1C). Desert and xeric shrubland cover fits well with the present-day species range during remission periods. Tropical and Mediterranean grasslands were added separately to the desert locust predicted range since the species inhabits such environments during outbreak periods only. The congruence between present maps of species distribution (Fig.1A) and of open vegetation habitats (Fig.1C) suggests that vegetation maps for more ancient periods could be considered as good approximations of the potential range of the desert locust in the past. Maps of vegetation cover during ice ages (Figs. 1E and 1F) show an expansion of open vegetation habitats (*i.e.*, grasslands in the tropics and deserts in both the North and South of Africa) sufficient to make the potential range of the species continuous from the Horn of Africa in the North-East to the Cape of Good Hope in the South.

Based on the above climatic and paleo-vegetation map reconstructions, we considered a set of alternative biogeographic hypotheses formulated into different types of evolutionary scenarios. First, we considered scenarios involving a more or less continuous colonization of southern Africa by the ancestral population from a northern origin. In this type of scenario, effective population sizes were allowed to change after the divergence event, without requiring any bottleneck event (*i.e.*, without any abrupt and strong reduction of population size) right after divergence. Second, we considered the situation where the colonization of southern Africa occurred through a single (or a few) long-distance migration event(s) of a small fraction of the ancestral population. This situation was formalized through scenarios that differed from the former ones by the occurrence of a bottleneck event in the newly founded population. The bottleneck event occurred in *S. g. flaviventris* right after divergence and was modelled through a limited number of founders during a short period.

Because the last Quaternary cycle includes several arid climatic periods, including the intense punctuation of the Younger Dryas (YD) and the last glacial maximum (LGM), we also considered scenarios that incorporated the possibility of secondary contact with asymmetrical genetic admixture from *S. g. gregaria* into *S. g. flaviventris*. Since previous tests based on simulated data showed a poor power to discriminate between a single versus several admixture events (results not shown), we considered only models including a single admixture event.

Finally, at interglacial humid maximums, the map of vegetation cover showed a severe contraction of deserts, which were nearly completely vegetated with annual grasses and shrubs and supported numerous perennial lakes (Fig.1D; deMenocal *et al.* 2000). We thus envisaged the possibility that climate-induced contractions of population sizes pre-dated the separation of the two subspecies. Hence, whereas the scenarios described so far involved a constant effective population size in the ancestral population, we formalized alternative scenarios in which we assumed that a long population size contraction event occurred in the ancestral population at a time *t*_{ca}, with an effective population size *N*_{ca} for a duration *d*_{ca}.

Combining the presence or absence of the three above-mentioned key evolutionary events (a bottleneck in *S. g. flaviventris*, an asymmetrical genetic admixture from *S. g. gregaria* into *S. g. flaviventris*, and a population size contraction in the ancestral population) allowed defining a total of eight scenarios, which we compared using ABC-RF. The eight scenarios with their historical and demographic parameters are graphically depicted in Figure 2. All scenarios assumed a northern origin for the common ancestor of the two subspecies and a subsequent southern colonization of Africa. This assumption is supported by recent mitochondrial DNA data showing that *S. g. gregaria* has higher levels of genetic diversity and diagnostic bases shared with outgroup and congeneric species, whereas the *S. g. flaviventris* clade was placed at the apical tip within the species tree (Chapuis *et al.* 2016). All scenarios considered three populations of current effective population sizes *N*_{f} for *S. g. flaviventris*, *N*_{g} for *S. g. gregaria*, and *N*_{a} for the ancestral population, with *S. g. flaviventris* and *S. g. gregaria* diverging *t*_{div} generations ago from the ancestral population. The bottleneck event which potentially occurred in *S. g. flaviventris* was modelled through a limited number of founders *N*_{bf} during a short period *d*_{bf}. The potential asymmetrical genetic admixture from *S. g. gregaria* into *S. g. flaviventris* occurred at a time *t*_{sc}, with an effective population size *N*_{ca} and a proportion *r*_{g} of genes of *S. g. gregaria* origin. The potential population size contraction event occurred in the ancestral population at a time *t*_{ca}, with an effective population size *N*_{ca} for a duration *d*_{ca}.

### Prior setting for historical and demographical parameters

Prior values for time periods between sampling and secondary contact, divergence and/or ancestral population size contraction events (*t*_{sc}, *t*_{div} and *t*_{ca}, respectively) were drawn from log-uniform distributions bounded between 100 and 500,000 generations, with *t*_{ca} > *t*_{div} > *t*_{sc}. Assuming an average of three generations per year (Roffey and Magor 2003), this prior setting corresponds to a time period that goes back to the second-to-latest glacial maximum (150 Ky ago) (de Vivo and Carmignotto 2004; deMenocal *et al.* 2000). Preliminary analyses showed that assuming a uniform prior shape for all time periods (instead of log-uniform distributions) does not change scenario choice results, with posterior probabilities only moderately affected, and this despite a substantial increase of out-of-bag prior error rates (*e.g.*, + 50% when considering the eight scenarios separately; Table S5.1, Supplementary Material S5). Analyses of simulated pseudo-observed datasets (pods) showed that assuming a uniform prior rather than a log-uniform prior for time period parameters would have also biased positively the median estimate of the divergence time and substantially increased its 90% credibility interval (Figure S5.1 and Table S5.2, Supplementary Material S5).

We used uniform prior distributions bounded between 1×10^{4} and 1×10^{6} diploid individuals for the different stable effective population sizes *N*_{f}, *N*_{g} and *N*_{a} (Chapuis *et al.* 2014). The admixture rate (*r*_{g}; *i.e.*, the proportion of *S. g. gregaria* genes entering into the *S. g. flaviventris* population) was drawn from a uniform prior distribution bounded between 0.05 and 0.5. We used uniform prior distributions bounded between 2 and 100 for both the numbers of founders (in diploid individuals) and durations of bottleneck events (in number of generations). For the contraction event, we used uniform prior distributions bounded between 100 and 10,000 for both the population size *N*_{ca} (in diploid individuals) and duration *d*_{ca} (in number of generations). Assuming an average of three generations per year (Roffey and Magor 2003), such prior choice allowed a reduction in population size for a short to a relatively long period, similar for instance to the whole duration of the HCO (from 9 to 5.5 Ky ago) which was characterized by a severe contraction of deserts.
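The prior setting described in this section can be illustrated with a minimal Python sketch. It is not the DIYABC implementation: the function names are hypothetical, the ordering condition on event times is enforced here by simple redrawing (an assumption), and all quantities are drawn as real numbers rather than integers.

```python
import math
import random


def log_uniform(rng, low, high):
    """Draw one value whose logarithm is uniform on [log(low), log(high)]."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


def draw_priors(rng):
    """Draw one parameter set for the scenario that includes all three events."""
    # Event times in generations: redraw until t_ca > t_div > t_sc holds.
    # Redrawing is an assumption about how the ordering condition is enforced.
    while True:
        t_sc, t_div, t_ca = (log_uniform(rng, 100, 500_000) for _ in range(3))
        if t_ca > t_div > t_sc:
            break
    return {
        "t_sc": t_sc, "t_div": t_div, "t_ca": t_ca,
        # Stable effective population sizes (diploid individuals).
        "N_f": rng.uniform(1e4, 1e6),
        "N_g": rng.uniform(1e4, 1e6),
        "N_a": rng.uniform(1e4, 1e6),
        # Admixture rate from S. g. gregaria into S. g. flaviventris.
        "r_g": rng.uniform(0.05, 0.5),
        # Bottleneck: number of founders and duration (generations).
        "N_bf": rng.uniform(2, 100),
        "d_bf": rng.uniform(2, 100),
        # Ancestral contraction: population size and duration (generations).
        "N_ca": rng.uniform(100, 10_000),
        "d_ca": rng.uniform(100, 10_000),
    }


params = draw_priors(random.Random(42))
print({name: round(value, 2) for name, value in params.items()})
```

Under the log-uniform prior, each log-interval of the time range (for instance 100 to 1,000 and 10,000 to 100,000 generations) receives the same prior probability, which is the property discussed above for time period parameters.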

### Microsatellite dataset, mutation rate and mutational model

We carried out our statistical inference on the microsatellite datasets previously published in Chapuis *et al.* (2016). The 23 microsatellite loci genotyped in such datasets were derived from either genomic DNA (14 loci) or messenger RNA (9 loci) resources, and are hereafter referred to as untranscribed and transcribed microsatellite markers (following Blondin *et al.* 2013). These microsatellites were shown to be genetically independent, free of null alleles and at selective neutrality (Chapuis *et al.* 2016). Previous levels of *F*_{ST} (Weir 1996) and Bayesian clustering analyses (Pritchard *et al.* 2000) among populations showed a weak genetic structuring within each subspecies (Chapuis *et al.* 2014, 2017). For each subspecies, we selected and pooled three population samples to ensure both a large sample size (*i.e.*, 80 and 90 individuals for *S. g. gregaria* and *S. g. flaviventris*, respectively) and a non-significant genetic structure within each subspecies pooled sample, as indicated by non-significant (*i.e.*, p-value > 0.05; Genepop 4.0; Rousset 2008) (i) Fisher’s exact tests of genotypic differentiation among the three initial population samples within subspecies and (ii) exact tests of Hardy-Weinberg equilibrium for each subspecies pooled sample. More precisely, the *S. g. gregaria* sample consisted of the pooled population samples 8, 15 and 22 of Chapuis *et al.* (2014) and the *S. g. flaviventris* sample included the population samples 1, 2 and 6 of Chapuis *et al.* (2017).

Mutations occurring in the repeat region of each microsatellite locus were assumed to follow a symmetric generalized stepwise mutation model (GSM; Zhivotovsky *et al.* 1997; Estoup *et al.* 2002). Prior values for any mutation model settings were drawn independently for untranscribed and transcribed microsatellites in specific distributions. The informed mutational prior setting was defined as follows. Because allele size constraints exist at microsatellite markers, we informed for each microsatellite locus their lower and upper allele size bounds using values estimated in Chapuis *et al.* (2015), following the approach of Pollock *et al.* (1998) and microsatellite data from several species closely related to *S. gregaria* (Blondin *et al.* 2013). Prior values for the mean mutation rates were set to the empirical estimates inferred from observation of germline mutations in Chapuis *et al.* (2015), *i.e.*, 2.8×10^{-4} and 9.1×10^{-5} for untranscribed and transcribed microsatellites, respectively. The mutation rates for individual microsatellites were then drawn from a Gamma distribution with mean equal to the mean mutation rate and shape = 0.7 (Estoup *et al.* 2001) for both types of microsatellites. We ensured that the chosen value of shape parameter generated the same inter-loci variance as estimated in Sun *et al.* (2012) from direct observations of thousands of human microsatellites. Prior values for the mean parameters of the geometric distributions of the length in number of repeats of mutation events (mean *P*) were set to the proportions of multistep germline mutations observed in Chapuis *et al.* (2015), *i.e.*, 0.14 and 0.67 for untranscribed and transcribed microsatellites, respectively. The *P* parameters for individual loci were then standardly drawn from a Gamma distribution (mean equal to the mean *P* and shape = 2). We also considered mutations that insert or delete a single nucleotide to the microsatellite sequence. To model this mutational feature, we used the DIYABC default setting values (*i.e.*, a uniform distribution bounded between [10^{-8}, 10^{-5}] for the mean parameter and a Gamma distribution (mean equal to the mean parameter and shape = 2) for individual loci parameters; Cornuet *et al.* 2010; see also DIYABC user manual p. 13, http://www1.montpellier.inra.fr/CBGP/diyabc/).
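As an illustration of the informed mutational prior setting, the following Python sketch draws per-locus mutation rates from a Gamma distribution parameterized by its mean and shape, and draws the size of one mutation step under a symmetric GSM. The function names are hypothetical, allele size bounds are deliberately not handled, and the sketch is not the DIYABC code.

```python
import random


def draw_locus_rates(rng, mean_rate, shape, n_loci):
    """Draw per-locus mutation rates from a Gamma with the given mean and shape."""
    # random.gammavariate takes (shape, scale); the mean of a Gamma is shape * scale.
    scale = mean_rate / shape
    return [rng.gammavariate(shape, scale) for _ in range(n_loci)]


def gsm_step(rng, p):
    """Draw a signed change in repeat number under a symmetric GSM.

    The number of repeats gained or lost is geometric: exactly one repeat with
    probability 1 - p, and one additional repeat with probability p each time.
    """
    size = 1
    while rng.random() < p:
        size += 1
    # Symmetric model: gains and losses are equally likely.
    return size if rng.random() < 0.5 else -size


rng = random.Random(2020)
untranscribed = draw_locus_rates(rng, 2.8e-4, 0.7, 14)  # 14 genomic loci
transcribed = draw_locus_rates(rng, 9.1e-5, 0.7, 9)     # 9 mRNA-derived loci
print(min(untranscribed), max(untranscribed))
print([gsm_step(rng, 0.14) for _ in range(10)])
```

With this geometric parameterization, the probability that a mutation involves more than one repeat equals *P*, which is consistent with setting the mean *P* to the observed proportion of multistep germline mutations.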

We evaluated how the incorporation of independent information on prior distributions for mutational parameters affected both the posterior probabilities of scenarios and the posterior parameter estimation under our inferential framework. To this aim, we re-processed our inferences using a naive mutational prior setting, as used in many ABC microsatellite studies (*e.g.*, Estoup *et al.* 2002). In this case, prior values for mean mutation parameters were drawn from uniform distributions instead of being set to a fixed value as in the informed mutational prior setting. For each set of untranscribed or transcribed microsatellites, all loci were free of allele size constraints (*i.e.*, allele size bounds were fixed to very distant values such as 2 and 500 for the lower and upper bounds, respectively), prior values for the mean mutation rate were drawn from a uniform distribution bounded between 10^{-5} and 10^{-3}, and prior values for the mean *P* parameter were drawn from a uniform distribution bounded between 0.1 and 0.3. Finally, the mean rate of single nucleotide indel mutations and all parameters for individual loci were set to the DIYABC default values (Chapuis *et al.* 2014; 2015).

### Analyses using ABC Random Forest

We used the software DIYABC v.2.1.0 (Cornuet *et al.* 2014) to simulate datasets constituting the so-called reference tables (*i.e.*, records of a given number of datasets simulated using the scenario ID and the evolutionary parameter values sampled from prior distributions and summarized with a pool of statistics). Random-forest computations were then performed using a new version of the R library ABCRF (version 1.8) available on CRAN. This version includes all ABC-RF algorithms detailed in Pudlo *et al.* (2016), Raynal *et al.* (2019) and Estoup *et al.* (2018a) for scenario choice and parameter estimation, as well as several statistical novelties making it possible to compute error rates in scenario choice and accuracy measures for parameter estimation (see details below).

For scenario choice, the outcome of the first step of the ABC-RF statistical treatment applied to a given target dataset is a classification vote for each scenario which represents the number of times a scenario is selected in a forest of *n* trees. The scenario with the highest classification vote corresponds to the scenario best suited to the target dataset among the set of compared scenarios. This step also provides an error rate relevant to the entire prior sampling space, the global prior error. See the section *Global prior errors* in Supplementary Material S1 for details. The second RF analytical step provides a reliable estimation of the posterior probability of the best supported scenario. One minus such posterior probability yields the local posterior error associated with the observed dataset (see the section *Local posterior errors* in Supplementary Material S1). In practice, ABC-RF analyses were processed by drawing parameter values into the prior distributions described in the two previous sections and by summarizing microsatellite data using a set of 32 statistics (see Table S6.1, Supplementary Material S6, for details about such summary statistics as well as their values obtained from the observed dataset) and the one LDA axis or seven LDA axes (*i.e.*, number of scenarios minus 1; Pudlo *et al.* 2016) computed when considering pairwise groups of scenarios or individual scenarios, respectively. We processed ABC-RF treatments on reference tables including 100,000 simulated datasets (*i.e.*, 12,500 per scenario). Following Pudlo *et al.* (2016), we checked that 100,000 datasets were sufficient by evaluating the stability of prior error rates and posterior probability estimations of the best scenario on 50,000, 80,000, 90,000 and 100,000 simulated datasets (Table S6.2, Supplementary Material S6). The number of trees in the constructed random forests was fixed to *n* = 3,000, as this number turned out to be large enough to ensure a stable estimation of the prior error rate (Figure S6.1, Supplementary Material S6). We predicted the best scenario and estimated its posterior probability and prior error rate over ten replicate analyses based on ten different reference tables.
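The classification vote described above can be sketched in a few lines of Python. This is only a tally of per-tree predictions under assumed inputs (the vote counts below are invented for illustration and are not results of this study); it does not train a forest and does not reproduce the ABCRF library.

```python
from collections import Counter


def classification_votes(tree_predictions):
    """Count how many trees of the forest selected each scenario."""
    return Counter(tree_predictions)


def best_scenario(tree_predictions):
    """Return the scenario with the highest vote and its share of the votes."""
    votes = classification_votes(tree_predictions)
    scenario, count = votes.most_common(1)[0]
    return scenario, count / len(tree_predictions)


# Hypothetical predictions of a forest of n = 3,000 trees for one target dataset.
predictions = [4] * 1800 + [2] * 900 + [8] * 300
print(classification_votes(predictions))
print(best_scenario(predictions))
```

The vote share is not itself the posterior probability of the selected scenario; as stated above, that probability is obtained in a second RF step.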

In order to decipher the main evolutionary events that occurred during the evolutionary history of the two desert locust subspecies, we first conducted ABC-RF treatments on three pairwise groups of scenario (with four scenarios per group): groups of scenarios with *vs.* without a bottleneck in *S. g. flaviventris*, groups with *vs.* without a population size contraction in the ancestral population, and groups with *vs.* without a secondary contact with asymmetrical genetic admixture from *S. g. gregaria* into *S. g. flaviventris*. We then conducted ABC-RF treatments on the eight scenarios considered separately.

For parameter estimation, we constructed ten independent replicate RF treatments based on ten different reference tables for each parameter of interest (Raynal *et al.* 2019): the time since divergence, the ratio of the time of the contraction event in the ancestral population to the time since divergence, the intensity of the bottleneck event in the sampled *S. g. flaviventris* population (defined as the ratio of the bottleneck duration *d*_{bf} to the effective population size *N*_{bf}) and the ratio of the stable effective population sizes of the two sampled populations. For each RF treatment, we simulated a total of 100,000 datasets for the selected scenario (drawing parameter values into the prior distributions described in the two previous sections and using the same 32 summary statistics). Following Raynal *et al.* (2019), we checked that 100,000 datasets were sufficient by evaluating the stability of the measure of accuracy on divergence time estimation using 50,000, 80,000 and 90,000 simulated datasets (Table S6.3, Supplementary Material S6). The number of trees in the constructed random forests was fixed to *n* = 2,000, as such number turned out to be large enough to ensure a stable estimation of the measure of divergence time estimation accuracy (Figure S6.2, Supplementary Material S6). For each RF treatment, we estimated the median value and the 5% and 95% quantiles of the posterior distributions. It is worth noting that we considered median values as the latter provided more accurate estimations (according to out-of-bag predictions) than mean values (results not shown). Accuracy of parameter estimation was measured using out-of-bag predictions and the normalized mean absolute error (NMAE). NMAE corresponds to the mean of the absolute difference between the point estimate (here the median) and the (true) simulated value divided by the simulated value (formula detailed in Supplementary Material S1).
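The NMAE defined in words above can be written as a short Python function. This is a minimal sketch with a hypothetical function name and invented example values; it assumes strictly positive simulated values, as is the case for times and population sizes.

```python
def nmae(point_estimates, simulated_values):
    """Normalized mean absolute error: mean of |estimate - true| / true."""
    if len(point_estimates) != len(simulated_values):
        raise ValueError("both sequences must have the same length")
    total = 0.0
    for estimate, true_value in zip(point_estimates, simulated_values):
        # Each error is scaled by the simulated value it refers to.
        total += abs(estimate - true_value) / true_value
    return total / len(simulated_values)


# Hypothetical out-of-bag median estimates and simulated divergence times.
estimates = [110.0, 90.0]
simulated = [100.0, 100.0]
print(nmae(estimates, simulated))
```

Because each error is divided by the simulated value, NMAE is comparable across parameters whose values span several orders of magnitude, such as divergence times drawn from a log-uniform prior.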

Finally, because microsatellite markers tend to underestimate divergence time for large time scales due to allele size constraints, we evaluated how the accuracy of ABC-RF estimation of the time of divergence between the two subspecies was sensitive to the time scale. To this aim, we used DIYABC to produce pseudo-observed datasets assuming fixed divergence time values chosen to cover the prior interval (100; 250; 500; 1,000; 2,500; 5,000; 10,000; 25,000; 50,000; 100,000; 250,000 generations) and using the best scenario with either the informed or the naive mutational prior setting. We simulated 5,000 such test datasets for each of the eleven divergence time values. Each of these test datasets was treated using ABC-RF in the same way as the above target observed dataset. In addition, we computed for each test dataset the relative amplitude of parameter estimation, as the width of the 90% credibility interval divided by the (true) simulated value.

## Supporting Material

Additional supporting information may be found in the online version of this article.

## Supplementary material online for the manuscript: « A young age of subspecific divergence in the desert locust »

### Supplementary Material S1: Overview of the used ABC Random Forest (ABC-RF) methods

In this supplementary material, we provide readers with an overview of the Approximate Bayesian Computation Random Forest (hereafter ABC-RF) methods used in the present paper. We invite the reader to consult Pudlo et al. (2016), Estoup et al. (2018), and Raynal et al. (2018) for more in-depth explanations.

#### ABC framework

Let **y** denote the observed data and ***θ*** a vector of parameters associated to a statistical model whose likelihood is *f*(. | ***θ***). Under the Bayesian parametric paradigm the posterior distribution is of prime interest. It characterizes the distribution of ***θ*** given the observation **y** and can be interpreted as an update of the prior distribution *π*(***θ***) by the likelihood of **y**. The likelihood is hence pivotal, but unfortunately intractable in the evolutionary scenarios (models) we consider in the present study, as well as in many other evolutionary studies. As a matter of fact, the underlying Kingman’s coalescent process (Kingman, 1982) does not allow a closed-form expression for the likelihood because all the possible genealogies and mutational processes yielding **y** should be considered. To solve this issue, some likelihood-free methods have been developed using the fact that, even though the likelihood is not available, generating artificial (i.e. simulated) data for a given value of ***θ*** is much easier if not feasible (e.g. Beaumont, 2010). Approximate Bayesian computation (ABC) is one of them (Beaumont et al., 2002).

In a nutshell, ABC consists in generating parameters ***θ***′ and associated pseudo-data **z** from the scenario, and accepting ***θ***′ as a realization from an *approximated* posterior if **z** is similar to **y**. In standard ABC treatments, the notion of similarity is defined through the use of a distance *ρ* to compare *η*(**z**) and *η*(**y**), where *η*(.) is a projection of the data in a lower dimensional space of summary statistics. Only pseudo-data providing a distance lower than a threshold *ϵ* are retained. The choice of *ρ*, *η*(.) and *ϵ* is a major issue in ABC (Beaumont, 2010).
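The standard ABC rejection scheme summarized above can be sketched on a toy problem. The Gaussian model, the uniform prior, the choice of the sample mean as summary statistic and the threshold value are all assumptions made for illustration; they stand in for the coalescent scenarios and are unrelated to the analyses of this study.

```python
import random
import statistics


def simulate(theta, n, rng):
    """Toy generative model: n Gaussian observations with mean theta."""
    return [rng.gauss(theta, 1.0) for _ in range(n)]


def eta(data):
    """Summary statistic: projection of the data onto its sample mean."""
    return statistics.fmean(data)


def abc_rejection(y, n_draws, eps, rng):
    """Keep the prior draws whose pseudo-data fall within eps of the observed data."""
    s_obs = eta(y)
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(-5.0, 5.0)      # draw theta' from the prior
        z = simulate(theta, len(y), rng)     # generate pseudo-data z
        if abs(eta(z) - s_obs) < eps:        # distance rho between summaries
            accepted.append(theta)
    return accepted


rng = random.Random(0)
y = simulate(2.0, 50, rng)                   # "observed" data with true theta = 2
posterior_sample = abc_rejection(y, 5000, 0.1, rng)
print(len(posterior_sample), statistics.fmean(posterior_sample))
```

The three tuning choices discussed in the text appear explicitly here: the distance (absolute difference), the summary statistic (`eta`) and the threshold (`eps`).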

ABC-RF is a recently derived ABC approach based on the supervised machine learning tool named Random Forest (Breiman, 2001), whose major advantage is that it avoids the three above-mentioned difficulties. Initially introduced in Pudlo et al. (2016) for model choice and then extended to parameter inference in Raynal et al. (2018), ABC-RF relies on the use of random forests on a set of simulated pseudo-data according to the generative Bayesian models under consideration. Let us consider *M* Bayesian parametric models. For a given model index *m* ∈ {1, *…*, *M*}, a prior probability ℙ (ℳ = *m*) is defined, with *θ*_{m} its associated parameters and *f*_{m}(**y** | *θ*_{m}) its likelihood. The generation process of a reference table made of *H* elements is described in Algorithm 1.

The output takes the form of a matrix containing simulated model indexes, parameters and summary statistics, with one row per simulated dataset.
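The generation of a reference table, as described in the text, can be sketched as follows. This is not a transcription of Algorithm 1: the function and variable names are hypothetical, and the two toy models below merely stand in for the evolutionary scenarios simulated with DIYABC.

```python
import random
import statistics


def build_reference_table(model_prior, draw_params, simulate, summarise, H, rng):
    """Return H rows of (model index, parameter values, summary statistics)."""
    models = list(model_prior)
    weights = [model_prior[m] for m in models]
    table = []
    for _ in range(H):
        m = rng.choices(models, weights=weights, k=1)[0]  # model index from its prior
        theta = draw_params[m](rng)                        # parameters from the model's prior
        z = simulate[m](theta, rng)                        # pseudo-data from the model
        table.append((m, theta, summarise(z)))             # store summaries, not raw data
    return table


# Hypothetical toy models (Gaussian vs. exponential data).
model_prior = {1: 0.5, 2: 0.5}
draw_params = {
    1: lambda rng: (rng.uniform(0.0, 5.0),),
    2: lambda rng: (rng.uniform(0.2, 5.0),),
}
simulate = {
    1: lambda theta, rng: [rng.gauss(theta[0], 1.0) for _ in range(30)],
    2: lambda theta, rng: [rng.expovariate(1.0 / theta[0]) for _ in range(30)],
}


def summarise(z):
    """Summary statistics of one pseudo-dataset: mean and variance."""
    return (statistics.fmean(z), statistics.pvariance(z))


table = build_reference_table(model_prior, draw_params, simulate, summarise,
                              1000, random.Random(7))
print(table[0])
```

Each row plays the role of one element of the reference table; the random forest is then trained on the summary statistics, with the model index (or a parameter) as response.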

#### ABC-RF for model choice

The ABC-RF strategy for model choice is described in Algorithm 2. The output is the assignment of **y** to a model (scenario), this decision being made based on the majority class of the RF tree votes.

The selected scenario is the one with the highest number of votes in its favor. In addition to this majority vote, the posterior probability of the selected scenario can be computed as described in Algorithm 3.

#### ABC-RF computation of the posterior probability of the selected scenario

Such posterior probability provides a confidence measure of the previous prediction at the point of interest *η*(**y**). It relies on the building of a regression random forest designed to explain the model prediction error. More specifically, and as a first step, posterior probability computation makes use of out-of-bag predictions of the training dataset. Because each tree of the random forest is built on a bootstrap sampling of the *H* elements of the reference table (i.e. the training dataset), there is about one third of the reference table that remains unused per tree, and this ensemble of left aside datasets corresponds to the “out-of-bag”. Thus, for each pseudo-data of the reference table, one can obtain an out-of-bag prediction by aggregating all the classification trees in which the pseudo-data was out-of-bag. In a second step, the out-of-bag predictions are used to compute misclassification indicators, equal to 1 when the out-of-bag prediction of a pseudo-data differs from its true model index and to 0 otherwise. These 0 - 1 values are used as response variables for the regression random forest training, for which the explanatory variables are the summary statistics of the reference table. Predicting the observed data thanks to this forest allows the derivation of the posterior probability of the selected model (Algorithm 3). Note that using the out-of-bag procedure prevents over-fitting issues and is computationally parsimonious as it avoids the generation of a second reference table for the regression random forest training.
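The out-of-bag step can be illustrated with a small Python sketch. A bootstrap sample of *H* elements leaves each element out with probability (1 − 1/*H*)^*H*, which tends to e^{-1} ≈ 0.37, hence the "about one third" mentioned above. The sketch assumes that per-tree predictions and in-bag row sets are already available (the tiny inputs below are invented); it does not train trees and is not the ABCRF implementation.

```python
from collections import Counter


def oob_prediction(tree_preds, in_bag, h):
    """Majority vote for row h over the trees whose bootstrap sample excluded h."""
    votes = Counter(
        preds[h] for preds, bag in zip(tree_preds, in_bag) if h not in bag
    )
    return votes.most_common(1)[0][0] if votes else None


def oob_error_indicators(tree_preds, in_bag, true_models):
    """Return 1 where the out-of-bag prediction is wrong, 0 where it is right."""
    indicators = []
    for h, true_model in enumerate(true_models):
        predicted = oob_prediction(tree_preds, in_bag, h)
        if predicted is not None:  # skip rows that were never out-of-bag
            indicators.append(int(predicted != true_model))
    return indicators


# Hypothetical forest of 3 trees and reference table of 4 rows.
true_models = [1, 1, 2, 2]
in_bag = [{0, 1}, {2, 3}, {0, 3}]                        # rows used to build each tree
tree_preds = [[1, 1, 2, 1], [1, 1, 2, 2], [1, 1, 2, 2]]  # each tree's prediction per row

indicators = oob_error_indicators(tree_preds, in_bag, true_models)
print(indicators, sum(indicators) / len(indicators))
```

The 0 - 1 indicators are the response variable of the regression forest, and their mean over the reference table gives the prior error rate.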

#### Model grouping

A recent useful add-on to ABC-RF has been the model-grouping approach developed in Estoup et al. (2018), where pre-defined groups of scenarios are analysed using Algorithms 2 and 3. The model indexes used in the training reference table are modified in a preliminary step to match the corresponding groups, which are then used during the learning phase. When appropriate, unused scenarios are discarded from the reference table. This improvement is particularly useful when a high number of individual scenarios are considered and have been formalized through the absence or presence of some key evolutionary events (e.g. admixture, bottleneck, …). Such key evolutionary events allow defining and further considering groups of scenarios including or not such events. This grouping approach makes it possible to evaluate the power of ABC-RF to make inferences about evolutionary event(s) of interest over the entire prior space and to assess (and quantify) whether or not a particular evolutionary event is of prime importance to explain the observed dataset (see Estoup et al. (2018) for details and illustrations).

#### ABC-RF for parameter estimation

Once the selected (i.e. best) scenario has been identified, the next step is to estimate its parameters of interest. The ABC-RF parameter estimation strategy is described in Algorithm 4 and has a structure similar to that of Algorithm 2. The idea is to use a regression random forest for each dimension of the parameter space (i.e. for each parameter). For a given parameter of interest, the output of the algorithm is a vector of weights **w**_{y} that can be used to compute posterior quantities of interest such as expectation, variance and quantiles. **w**_{y} provides an empirical posterior distribution for *θ*_{m,k}; see Raynal et al. (2018) for more details.
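
To make the use of the weights concrete, the Python sketch below derives a posterior expectation, variance and quantiles from a weight vector and the simulated parameter values. It is a hedged illustration: the weights and values are invented toy numbers, the helper names are hypothetical, and the quantile rule (smallest value whose cumulative weight reaches the target) is one simple convention among several.

```python
def weighted_mean(values, weights):
    total = sum(weights)
    return sum(w * v for v, w in zip(values, weights)) / total


def weighted_variance(values, weights):
    mean = weighted_mean(values, weights)
    total = sum(weights)
    return sum(w * (v - mean) ** 2 for v, w in zip(values, weights)) / total


def weighted_quantile(values, weights, q):
    """Smallest value whose cumulative normalised weight reaches q."""
    total = sum(weights)
    cumulative = 0.0
    for value, weight in sorted(zip(values, weights)):
        cumulative += weight / total
        if cumulative >= q:
            return value
    return max(values)  # guards against rounding when q is close to 1


# Toy example: four simulated parameter values and their forest weights.
theta = [4.0, 1.0, 3.0, 2.0]
w_y = [0.4, 0.1, 0.3, 0.2]

post_mean = weighted_mean(theta, w_y)
post_var = weighted_variance(theta, w_y)
post_median = weighted_quantile(theta, w_y, 0.5)
ci_90 = (weighted_quantile(theta, w_y, 0.05), weighted_quantile(theta, w_y, 0.95))
print(post_mean, post_var, post_median, ci_90)
```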

#### Global prior errors

In both contexts, model choice or parameter estimation, a global quality of the predictor can be computed, which does not take the observed dataset (about which one wants to make inferences) into account. Random forests make it possible to compute such errors on the training reference table, using the out-of-bag predictions previously described in the section “ABC-RF for model choice”.

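For model choice, this type of error is called the prior error rate, which is the misclassification error rate computed over the entire multidimensional prior space. It can be computed as

$$\frac{1}{H}\sum_{h=1}^{H}\mathbb{1}\left\{\hat{m}^{(h)}_{\mathrm{oob}} \neq m^{(h)}\right\},$$

where $m^{(h)}$ is the model index of the $h$-th element of the reference table and $\hat{m}^{(h)}_{\mathrm{oob}}$ its out-of-bag prediction. For parameter estimation, the equivalent is the prior mean squared error (MSE) or the normalised mean absolute error (NMAE), the latter being less sensitive to extreme values. These errors are computed as

$$\mathrm{MSE} = \frac{1}{H}\sum_{h=1}^{H}\left(\theta^{(h)}_{m,k} - \hat{\theta}^{(h)}_{m,k}\right)^{2}, \qquad \mathrm{NMAE} = \frac{1}{H}\sum_{h=1}^{H}\left|\frac{\theta^{(h)}_{m,k} - \hat{\theta}^{(h)}_{m,k}}{\theta^{(h)}_{m,k}}\right|,$$

where $\theta^{(h)}_{m,k}$ is the parameter value used to simulate the $h$-th element of the reference table and $\hat{\theta}^{(h)}_{m,k}$ its out-of-bag estimate. They can be perceived as Monte Carlo approximations of expectations with respect to the prior distribution.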
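
As a worked illustration of these global errors, the following Python sketch computes a prior error rate, an MSE and an NMAE from out-of-bag predictions. The numbers are invented, the function names are hypothetical, and the NMAE is assumed here to be normalised by the simulated (true) parameter value.

```python
def prior_error_rate(true_models, oob_models):
    """Share of reference-table rows whose out-of-bag model call is wrong."""
    wrong = sum(1 for t, p in zip(true_models, oob_models) if t != p)
    return wrong / len(true_models)


def prior_mse(true_values, oob_estimates):
    """Mean squared out-of-bag error over the whole reference table."""
    n = len(true_values)
    return sum((t - e) ** 2 for t, e in zip(true_values, oob_estimates)) / n


def prior_nmae(true_values, oob_estimates):
    """Mean absolute out-of-bag error, each term divided by the true value."""
    n = len(true_values)
    return sum(abs(t - e) / abs(t) for t, e in zip(true_values, oob_estimates)) / n


# Toy out-of-bag results for model choice and for one parameter.
error_rate = prior_error_rate([1, 1, 2, 2, 3], [1, 2, 2, 2, 3])
mse = prior_mse([2.0, 4.0, 5.0, 10.0], [1.0, 5.0, 5.0, 8.0])
nmae = prior_nmae([2.0, 4.0, 5.0, 10.0], [1.0, 5.0, 5.0, 8.0])
print(error_rate, mse, nmae)
```

In this toy case the largest absolute error (2 units on a true value of 10) dominates the MSE but contributes only 0.2 to the NMAE sum, which illustrates why the latter is less sensitive to extreme values.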

#### Local posterior errors

In the present paper, we propose some posterior versions of errors, which target the quality of prediction with respect to the posterior distribution. As such errors take the observed dataset *η*(**y**) into account, we mention them as local posterior errors.

For model choice, the posterior probability provided by Algorithm 3 is a confidence measure of the selected scenario given the observation. Therefore, one minus this posterior probability directly yields the posterior error associated with the selected scenario.

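For parameter estimation, when trying to infer *θ*_{m,k}, a point-wise analogous measure of local error can be computed as the posterior expectations

$$\mathbb{E}\left(\left(\theta_{m,k} - \hat{\theta}_{m,k}\right)^{2} \,\Big|\, \eta(\mathbf{y})\right) \quad\text{and}\quad \mathbb{E}\left(\left|\frac{\theta_{m,k} - \hat{\theta}_{m,k}}{\theta_{m,k}}\right| \,\Big|\, \eta(\mathbf{y})\right). \tag{1}$$

We approximate these expectations by

$$\sum_{h=1}^{H} w_{y}^{(h)}\left(\theta^{(h)}_{m,k} - \hat{\theta}^{(h)}_{m,k}\right)^{2} \quad\text{and}\quad \sum_{h=1}^{H} w_{y}^{(h)}\left|\frac{\theta^{(h)}_{m,k} - \hat{\theta}^{(h)}_{m,k}}{\theta^{(h)}_{m,k}}\right|.$$

We again use the out-of-bag information to compute $\hat{\theta}^{(h)}_{m,k}$, hence avoiding the (time consuming) production of a second reference table, and assume that the weights **w**_{y} from the regression random forest are good enough to approximate any posterior expectations of functions of *θ*_{m,k}:

$$\mathbb{E}\left(g(\theta_{m,k}) \,\big|\, \eta(\mathbf{y})\right) \approx \sum_{h=1}^{H} w_{y}^{(h)}\, g\left(\theta^{(h)}_{m,k}\right).$$

Another more expensive strategy to evaluate the posterior expectations (1) is to construct new regression random forests using the out-of-bag vector of values

$$\left(\theta^{(h)}_{m,k} - \hat{\theta}^{(h)}_{m,k}\right)^{2} \quad\text{or}\quad \left|\frac{\theta^{(h)}_{m,k} - \hat{\theta}^{(h)}_{m,k}}{\theta^{(h)}_{m,k}}\right|, \qquad h = 1, \ldots, H,$$

depending on the targeted error. The observation *η*(**y**) is then given to the forests, targeting the expectations (1).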
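
The weight-based approximation of the local errors can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the authors' code: the function name and toy numbers are hypothetical, the weights stand in for **w**_{y}, and the absolute error is assumed to be normalised by the simulated parameter value.

```python
def local_posterior_errors(weights, true_values, oob_estimates):
    """Weighted squared and normalised absolute out-of-bag errors.

    Each reference-table row contributes in proportion to its forest weight,
    so rows that resemble the observed dataset dominate the result.
    """
    total = sum(weights)
    rows = list(zip(weights, true_values, oob_estimates))
    local_mse = sum(w * (t - e) ** 2 for w, t, e in rows) / total
    local_nmae = sum(w * abs(t - e) / abs(t) for w, t, e in rows) / total
    return local_mse, local_nmae


# Toy example: the observation only "sees" the first two simulations.
w_obs = [0.5, 0.5, 0.0, 0.0]
theta_sim = [2.0, 4.0, 5.0, 10.0]
theta_oob = [1.0, 5.0, 5.0, 8.0]

local_mse, local_nmae = local_posterior_errors(w_obs, theta_sim, theta_oob)
print(local_mse, local_nmae)
```

With uniform weights the same function returns unweighted averages over the reference table, which makes explicit that the local errors differ from table-wide averages only through the weighting attached to the observation.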

Note that the point estimates of the parameter used in the previous formulas can be either the approximated posterior expectations or the posterior medians, again computed using the out-of-bag information, to provide the local posterior errors. We found that both in the present paper (see main text, Materials and Methods section) and for various tests that we carried out on different inferential setups and datasets (results not shown), the posterior median provides a better accuracy of parameter estimation than the posterior expectation (aka posterior mean). This trend also holds for global prior errors, which can be computed using either the mean or the median as point estimates.

As a final comment, it is worth noting that so far a common practice has consisted in evaluating the quality of prediction (for model choice or parameter estimation) in the neighborhood of the observed dataset, that is around *η*(**y**) and not exactly at *η*(**y**). For model choice, Estoup et al. (2018) use the so-called posterior predictive error rate, which is an error of this type. In this case, some simulated datasets of the reference table close to the observation are selected using a Euclidean distance, then new pseudo-observed datasets are simulated using similar parameters, and the error is computed on these datasets (see also Lippens et al., 2017, for a similar approach in a standard ABC framework). However, the main problem with this procedure is the difficulty of specifying the size of the area around the observation, especially when the number of summary statistics is large. We therefore no longer recommend the use of such a “neighborhood” error, but rather recommend computing the local posterior errors detailed above, as the latter measure prediction quality exactly at the position of interest *η*(**y**).

## Supplementary material S4. On the influence of climatic cycles on the potential range variation of the desert locust *Schistocerca gregaria.*

It may appear surprising, at least at first sight, that the southern colonization of the desert locust did not occur during one of the major glacial episodes of the last Quaternary cycle, since these periods are characterized by a more continuous range of the desert locust (see paleo-vegetation maps in Figures 1E and 1F in the main text). In particular, during the last glacial maximum (LGM, −14.8 Ky to −26 Ky), the Sahara desert extended hundreds of km further south than at present and annual precipitation was lower (i.e. ∼200–1,000 mm/year). Several hypotheses may explain why our evolutionary scenario choice procedure provided low support for a birth of the locust subspecies *S. g. flaviventris* at older periods. First, we cannot exclude that our microsatellite genetic data allow making inferences about the last colonization event only. The probabilities of choosing scenarios including a genetic admixture event after the split were the lowest, with a posterior predictive error of 16.1% (see Table 1 in the main text). The recent North-to-South colonization event selected by our ABC-RF treatment may hence have blurred traces of older colonization events.

Second, while there is ample evidence that much of Africa was drier during the last glacial phase, this remains debated for southwestern Africa (see the gray coloration in Figure 3B in the main text). Some climate models show that at least some parts of this region, such as the Kalahari Desert, may have experienced higher rainfall than at present (Cockcroft 1987; Ganopolski *et al.* 1998; Chase and Meadows 2007). Such regional responses to glacial cycles may have persisted until the middle Holocene. In particular, the northern Younger Dryas (i.e. −12.9 to −11.7 Ky) can be correlated only partly with an arid period in the southern hemisphere (i.e. −14.4 to −12.5 Ky). Such older climate episodes in antiphase between hemispheres (see the sandy brown coloration in Figure 3B in the main text) may have prevented either a successful North-to-South migration event or a successful establishment and spread in the new southern range.

Third, although semi-desert and desert biomes were more extensive than at present during the LGM, extreme aridity and lowered temperatures may actually have been unfavorable to the species. For example, mean temperatures were lower by 5 to 6°C in both southwestern Africa (Stute and Talma 1997) and Central Sahara (Edmunds *et al.* 1999). The maintenance of desert locust populations depends on the proximity of areas with rainfall at different seasons or with the capacity to capture and release water. For instance, in the African northern range, breeding success of locust populations relies on seasonal movements between the Sahel-Saharan zones of inter-tropical convergence, where the incidence of rain is high in summer, and the Mediterranean-Saharan transition zone, with a winter rainfall regime (Rainey and Waloff 1951). In addition, adult migration and nymphal growth of the desert locust are dependent upon high temperature (Roffey and Magor 2003). It is hence possible that the conjunction of hyper-aridity with intense cold could not easily support populations of the desert locust, despite the large extent of their migrations.

While ABC-RF analyses did not support the hypothesis that the Quaternary climatic history explained the subspecific divergence in the desert locust, they provided evidence for the occurrence of a large contraction of the size of the ancestral population preceding the divergence. Using the median as a point estimate, we estimated that the population size contraction in the ancestor could have occurred at a time about three-fold older than the divergence time between the subspecies. This corresponds to the African humid period in the early and middle stages of the Holocene, though the large credibility interval also included the last interglacial period of the Pleistocene (Figure 3B in the main text). Such a population size contraction was likely induced by the severe contraction(s) of deserts that prevailed prior to the estimated divergence between the two subspecies. Interestingly, these humid periods were more intense and prolonged in northern Africa, which corresponds to the presumed center of origin of the most recent common ancestor (Scott 1993; Partridge 1997; Shi *et al.* 1998).

## Acknowledgements

This work was supported by research funds from the French Agricultural Research Centre for International Development (CIRAD), the project ANR-16-CE02-0015-01 (SWING), the INRA scientific department SPE (AAP-SPE 2016), and the Labex NUMEV (NUMEV, ANR10-LABX-20). The data used in this work were partly produced through the technical facilities of the Centre Méditerranéen Environnement Biodiversité, Montpellier. We thank Christine N. Meynard for careful English language editing and insightful discussions on past climate models for Africa, Pierre-Emmanuel Gay for assistance with maps, Antoine Foucart, Gauthier Dobigny and Jean-Yves Rasplus for fruitful discussions, and Renaud Vitalis for constructive comments on an earlier version of the manuscript.