Revisiting biogeography of livestock animal domestication

Human society relies on four main livestock animals – sheep, goat, pig and cattle, which all were domesticated at nearly the same time and place. Many arguments have been put forward to explain why these animals, place and time were suitable for domestication, but the question – why only these, but not other animals, still does not have a clear answer. Here we offer a biogeographical perspective: we survey global occurrence of large mammalian herbivore genera around 15 000 – 5 000 years before present and compile a dataset characterising their ecology, habitats and dental traits. Using predictive modelling we extract patterns from this data to highlight ecological differences between domesticated and nondomesticated genera. The most suitable for domestication appear to be generalists adapted to persistence in marginal environments of low productivity, largely corresponding to cold semi-arid climate zones. Our biogeographic analysis shows that even though domestication rates varied across continents, potentially suitable candidate animals were rather uniformly distributed across continents. We interpret this pattern as a result of an interface between cold semi-arid and hot semi-arid climatic zones. We argue that hot Semi-arid climate was most suitable for plant domestication, cold Semi-arid climate selected for animals most suitable for domestication as livestock. We propose that the rates of domestication across biogeographic realms largely reflect how much intersection between hot and cold Semi-arid climatic zones was available at each continent.


INTRODUCTION
One of the fundamental questions in human history is why so relatively few animal and plant groups have been domesticated . Human society today relies on four main livestock animals (sheep, goat, pig and cattle), which all were domesticated at nearly the same time and place -starting about 10 500 years ago at the Fertile Crescent 1 . Even more intriguing is that those species did not evolve locally at or near the Fertile Crescent, the wild progenitors immigrated mostly from the Central Asia . Curiously, domestication of those livestock animals happened repeatedly at the same place, as "genetic analyses report multiple domestic lineages for each species" ). Many arguments have been put forward to explain why these animals, place and time were suitable for domestication (see e.g. , Larson and Fuller 2014, Zeder 2017, MacHugh et al 2017, Crosby 2006. But key questions -why animals from elsewhere were more suitable for domestication instead of native animals, why those animals have not been domesticated at their places of origin, and why animals local to the Fertile Crescent have not been domesticated instead or in addition, largely remain unanswered.
The main objective of this study is to investigate to what extent opportunities for early livestock domestication have been exhausted. To analyse options for animal domestication, we ask what potential candidate species were available around the time of first known domestications of livestock, given in Table 1. We analyse ecological, biogeographic and climatic context of large plant eating mammals contemporary to the time of those early domestications aiming to deduce common patterns of their physiology, ecology, dietary preferences and inferred behaviour. We computationally model the probability of domestication as a function of those factors. To the best of our knowledge this is the first quantitative analysis of domestication patterns at the global scale. We make the analysis datasets publicly available for research and development. Table 1. Domestication of livestock animals. Domestication time is an approximate time of transition between management of animals and morphological changes associated with domestication, rounded to the nearest thousand.
The list of animals is derived from , , .

MATERIALS AND METHODS
Our study region is the world at times of early livestock animal domestication, we consider a time range from around 15 000 to 5 000 years before present. Domestication has many definitions and interpretations (Decory 2019) that vary across groups of organisms and contexts of usage. Here we consider domesticated those species for which humans have significant degree of influence over the reproduction and care, and which have undergone significant genetic, behavioural and/or morphological changes since. Here we consider only livestock animals and do not cover birds.
Our study consists of two parts: analysis of ecological characteristics of candidate animals, and biogeographic context of their domestication. For the latter, we divide the world into six regions following designation of biogeographic realms at the present day (Olson et al 2001): Nearctic (mainly North America), Palearctic (mainly Eurasia), Afrotropic (mainly Africa), Indomalaya (South East Asia), Australasia and Neotropic (South and Central Americas). Mapping of those realms is given in the Supplement. The realms distinguish large terrestrial areas within which organisms have been evolving in relative isolation over long periods of time, separated by geographic features, such as oceans, broad deserts, or high mountain ranges. We therefore use realms as natural units for analysing and comparing domestication rates across the globe.

The dataset of candidate genera
We compiled the list of candidate genera for domestication following the lists of living mammalian species (Wilson andReeder, 2005, andNowak, 2018), assuming that genera that are alive today were present in the same biogeographic realms within the selected time frame in the past. We complemented our list of candidates by fossil genera from our own archives and the NOW database of fossil mammals (The NOW Community, 2018). We excluded genera that went extinct before 5000 B.P., assuming that they were rare at the times of possible first domestications (e.g. Hippidion in South America). Even though taxa could have been domesticated just before they went extinct, while going towards extinction rarity of individuals increases (Zliobaite et al 2017) and perhaps persistence in native environments is becoming more challenging, even if domestication is attempted. Our analysis of is at the level of genera rather than species, since genera is a more robust unit of analysis for the past (Eronen et al 2011) and cogeneric species rarely co-occur anyway (Levin et al 2012).
We chose to compile a list from scratch rather than building on existing regional list aiming for consistency of treatment and due to lack of existing options. A regional compilation of candidate species for South-West Asia by Garrad (1984) exists, which would have covered only a small part of our scope.  compiled a global list, which he used for computing domestication rates at continents, but the list of candidate taxa was not made available in his publication.
Our candidate list included herbivorous genera within the body mass range from 40 to 1000 kg, which covers the body mass classes of early domesticated livestock animals. Having livestock animals large enough saves the need to herd countless numbers of individuals. Yet domestic animals cannot be too large, since otherwise it would be too challenging to handle them and it would take too much time and effort to raise them to maturity. Therefore, neither small mammals, nor megaherbivores, such as elephants, are particularly practical for domestication. We did not include carnivores, but included omnivores. In retrospect, the most suitable seem to be large herbivores, from about human size to at most an order of magnitude larger. This goes in line with carnivore-prey body size relationships in mammals, where preferred prey is from about half to about twice size of the carnivore (Owen-Smith and Mills 2008). We assume that the lower bound for humans is the same as for predators hunting in the wild, but the upper bound for humans can go higher because of tools and technology available for humans for managing and killing livestock animals.
We characterized each candidate taxa by their habitat, dietary characteristics, and functional dental traits, which made 15 features in total aiming to capture their ways of life. Habitat and dietary characteristics were collected from numerous academic sources and animal databases. We did not record individual sources for the habitat and dietary variables, and cannot provide referencing for each individual data point thereof. Given our choice to work at the resolution of genera, habitat and dietary variables can only be approximate, since a genus may include species that have different diets or habitats, or their ways of life may have changed between 15 000 years before present and now. We resolved within genus variations by assigning characteristics of the most common species, or, in rare cases of equal balance of the opposite characteristics, we assigned a half value. If information about the behaviour now and in the past diverged, we assigned the value for the past. We characterized dental traits following the functional crown type scoring scheme described by  with one modification allowing selenodonts have acute lophs, following the reasoning of Oksanen et al (2019).
Each candidate genus was described by 15 binary or ordinal variables, given in Table 2. All the variables except for hypsodonty were encoded in the scale from 0 to 1. We kept hypsodonty in the original ordinated scale (Fortelius et al., 2002), in order not to introduce extra complications in interpretations. A detailed description of the scoring scheme is given in the Supplement. A full dataset along with the list of genera excluded from the candidate list and reasons thereof are given in the Supplement as well. 1 acute lophs present in molars, 0 absent. 11. Obtuse lophs (OL) 1 obtuse lophs present, 0 absent. 12. Fortification (SF) 1 structural fortification of molar cusps, 0 no fortification. 13. Flatness (OT) 1 occlusal topography flat, 0 non-flat. 14. Goat-like (GO) 1 obtuse lophs are present, but acute lophs, fortification or flatness are absent. 15. Bunodont (BU) 1 no lophs, 0 lophs of any type are present.
As a summary of the dataset and a sanity-check, Figure 1 visualizes the first two (scaled) principal components of the dataset. We can see from the plot and the rotation matrix (given in the Supplement) that the primary axis is mostly hypsodonts (open habitats or grazing taxa) versus brachydonts (primarily forest dwellers). The second axis is primarily about arid environment vs. humid (such as swamps). Domesticated animals occur by and large in all spaces of this visualization, but the density of coverage is not uniform. Visually, the highest density of domesticated animals is at the arid (bottom-left) end, where camelids and goats are positioned.

Computational methods for predictive modelling
The main computational task is to learn predictive models for explaining patterns of domestication. In the machine learning terms, we need a probabilistic classifier of a model form where contributions of each variable would be easy to interpret. The inputs for the model can be any or all variables given in Table 2.
The target variable is a binary class label indicating whether the taxon has been domesticated or not. One of the challenges for this predictive modelling is a relatively small sample size with a high imbalance of classes (13 domesticated taxa vs.55 not domesticated genera).  Table 2). Domestication status is not included in the inputs for the projection.
We used two types of models: logistic regression and decision tree (see e.g. Witten et al 2016 for an introductory text to machine learning). Logistic regression models a linear relationship between the input variables and the class label. The output is passed through an s-shaped function to make sure that the prediction falls between zero and one, and in our case, can be approximately interpreted as the probability of domestication. A number of alternative procedures for estimating the model parameters exist. We chose a combination of LASSO and Ridge optimization (alpha = 0.5, see e.g. Hastie et al 2009 for details). This selection keeps the regression weights from overgrowing and at the same time minimizes the number of variables selected to include into the final model. The number of variables included depends on internal statistical assessment of variance through the course of model fitting.
Decision tree is a non-linear model, which, conceptually similarly to LASSO, first selects the variable that explains the class the best, then second best and onwards. For fitting a tree, we used the standard CART algorithm with the Gini coefficient as the splitting criteria (see e.g. Witten et al 2011 for background technical information). We produced two decision tree models: one where all 15 variables are candidates, and the other without variables describing environment and habitat, the latter aimed to capture domestication patterns only from the organismal perspective. Our modelling choices are summarized in Table 3. Modelling was done in R software, using packages glmnet (for LASSO) and rpart (for tree fitting). The code used for the analysis is available on GitHub repository 2 , although not very well documented.

SUITABILITY FOR DOMESTICATION AND DOMESTICATION RATES
The purpose of our computational analysis is twofold: to identify which ecological and environmental characteristics best explain domestication of livestock animals, and to quantitatively describe each biogeographic realm in terms of estimated suitability of their candidate animals for livestock domestication 15 000 -5 000 years ago.

Which ecological and environmental characteristics best explain domestication
Two inferred decision tree models are plotted in Figure 2. The first tree internally selected most informative variables from all 15 candidate variables in Table 2. The second tree was given only 10 candidate variables describing animals themselves (Grazer, Specialized, Water dependent and dental characteristics), excluding the variables describing environment or habitat (Open habitat, Highland habitat, Migration, Highly endemic, Tropical).  Table 2 as candidates. (b) A model built excluding environmental variables. The trees are read from top to the bottom: given a candidate animal, each yellow node indicates a variable to look at, and the values for a particular animal indicates which path to take until a terminal square node is reach, which gives an estimate of whether the animal is suitable for domestication. The number in the terminal nodes indicate how many animals with these characteristics have been domesticated and how many animals with these characteristics are in total in our dataset.] The first tree can explain the majority of early livestock animal domestication cases (8 out of 13) by only two variables: coming not from tropical environments and being moderately migratory. This selection suggests that environmental criteria dominate over physiological in explanatory power. As the principle component analysis already hinted, domesticated animals appear through most of the space described by physiological traits, but the counts of successful domestication vary across that space. The highest densities of successful domestication, as the principal component and the tree analysis together suggest, must be the arid end of non-tropical environments. The fact that moderate migration has such a high explanatory power suggests a potential impact of cold and seasonality.
The second tree, which was not permitted to use environmental characteristics, gives a more entangled, but interesting pattern. Domesticable animals described by this tree are those having high hypsodonty, but not exclusively grazers. They are expected to be moderately or highly dependent on water sources, and not highly specialized (e.g. not aquatic animals). Yet, only half of the animals at the final node "Maybe" (9/20) were actually domesticated, which suggests that physiological characteristics alone, at least those included in our analysis, are not sufficient to explain the domestication patterns precisely. Realized domestication of livestock animals, seems clearly to be a combination of biogeographic and ecological factors.
Decision trees work well for visual and structural explanation, but without further modifications they are too coarse for probabilistic assessment of suitability of candidate animals for domestication. We need to assign suitability score for each candidate animal, such that we can rank them and reason about realized vs. potential domestication across biogeographic realms.
The inferred logistic regression model for estimating the probability of domestication is P(Domestication) = 1/ (1 + exp(-Z)), where Z = 0.42 HYP -0.35 AL + 0.50 Water + 0.40 Highlands + 0.14 Open -0.70 Endemic -2.92. (1) The model automatically selected the most informative variables from 14 candidate variables (all variables in Table 2  Judging from all three models the dominant, but not exclusive characteristics that make animals domesticable are their adaptedness to abrasive but not exclusively grass food (hypsodonty) in combination with somewhat marginal seasonally variable environmental conditions. This points to non-tropical generalists to have the highest propensity for livestock domestication.

Estimated suitability for domestication of candidate taxa
The next interesting question is how the 68 candidate taxa rank according to their potential suitability for domestication. Are there any taxa that within the context of other domesticated animals must have been very unlikely to be domesticated? And does any of the non-domesticated taxa look particularly suitable? Table 4 gives a ranking of the candidates from the highest to the lowest probability of domestication. Probabilities are estimated using the logistic regression model given in Equation (1). For information, we also include scores given by both decision trees.
We can see that out of non-domesticated taxa Antilocapra and Ovibos come the highest on the list. Both are North American taxa, which hints that non-domestication of those animals might have had more to do with the place than the animals themselves. Quite close comes Saiga, which is a mysterious animal of semideserts, steppes and grasslands. Today they are critically endangered and found only in central Asian steppes. Saigas have been hunted for a long time, no idea why they have not been domesticated.
The next non-domesticated cohort includes Hemitragus, Pseudois, Ammotragus, Budorcas and Redunca. All except for Redunca are close taxonomic relatives of goats and sheep, and all are from Palearctic. Since goats and sheep have already been domesticated at the same region, perhaps there was no strong need to go for a similar wild option rather than choosing to breed already domesticated species.
Of those domesticated Bos and Bubalus come rather low in the ranking, maybe because they are relatively more specialized and sensitive to food quality than the higher cohort including sheep, goat, camelids and horses.
Sus has been domesticated by comes very low in the ranking. Sus is an omnivore having their diet relatively close to that of humans, and it is indeed somewhat surprising that multiple domestications of Sus happened.

Realized and potential rates of domestication across biogeographic realms
Now that we have estimates for suitability for domestication, next we analyse how potential suitability is distributed across biogeographic realms, and how does it relate to observed rates of domestication across the realms. Table 5 gives the rates for 10 domesticated genera. In addition, Figure 3 plots potential domestication rates across all the ranking of candidate genera, first assuming that the genus with the highest estimate is domesticated, then that two genera with the highest scores are domesticated and onwards. Table 5. Actual and estimated potential rates of domestication across biogeographic realms. In some realms several species or subspecies of the same genera were domesticated (e.g. Bos), in such a case a genus is counted only once. Given potential rate of domestication is assuming that top-10 genera with highest estimated propensity for domestication are domesticated. The last row (All) is not a sum of all other rows, because some genera occur in several realms.  From the vertical alignment of lines in Figure 3 we can see three groups emerging: Palearctic, Nearctic and Neotropic seem to host the largest share of suitable genera (the highest vertical position), followed by Afrotropic together with Indomalaya, and finally, Australasia hosts the least. This can be interpreted as no matter how many candidates each realm hosts, how suitable their candidates are on average. Interestingly, while the logistic regression model given in Equation (1) used for estimating suitability scores does not include latitudinal information ("tropical"), latitudinal patterns do emerge. Temperate realms seem to host proportionally roughly 4 times more suitable taxa than tropical. Neotropic realm includes is a mix of tropical and non-tropical environments, and appears in-between of the temperate and tropical groups. Australia has only one genus large enough as a candidate and it gets a low estimate, thus Australia lags quite behind from both groups.

Realm
Our analysis suggests that, Australia discarded, the potential suitability was surprisingly consistent across the biogeographic realms: around 30% in temperate, less than 10% in tropical, and around 20% in the Central and South America, which is a mix of temperate and tropical.
Most interesting is the relation between actual and potential domestication rates. In Figure 3 we see that Palearctic and Neotropic have their actual rates (circles) quite near to their potential rates, and so does Australasia and Africa. Two exceptions appear: Indomalaya seems to have higher actual rates than the potential, and Nearctic (North America) has zero actual rate, while the potential appears to be as high as that of Palearctic.  argued that uneven rates of domestication across the globe were primarily due to animal characteristics rather than geographic location, which from our analysis seems to be only partially true. It is noteworthy that  counted domestication at the species, and analysed continents putting together South and North America, and not distinguishing Indomalaya, which is climatically very different from the rest of Asia. He got 18% domestication rate in Eurasia, 4% in the Americas and 0 elsewhere. Our analysis taking into account biogeographic realms suggests that while the total number of candidates in South America was very small, South America (Neotropic) had very similar actual domestication rates to Eurasia (Palearctic). North America (Nearctic), on the other hand, had seemingly many candidates, but actual domestications did not happen.
The distribution of actual and potential rates of domestication across the realms suggest that while each realm consistently hosted a pool of suitable animals, other factors than just suitability of animals themselves must have been driving domestication rates, but at the moment we have no explanation for the large North America mismatch between actual and potential. As for availability for availability of potentially suitable livestock animals, we hypothesize that a large part of explanation might be biogeographic, requiring a particular intersection of cold and hot semi-arid climatic zones. In the following section we discuss evidence for this along with our interpretations.

BIOGEOGRAPHIC ANALYSIS
To explain suitability scores for domestication across biogeographic realms, we suggest an interplay between climatic zones primarily suitable for plant domestication and animal domestication. The argument goes as follows: hot Semi-arid climate was most suitable for plant domestication, cold Semi-arid climate selected for animals most suitable for domestication as livestock, the realized rates of domestication, with a major exception of North America, reflect the how much intersection of hot and cold Semi-arid climatic zones was available at each continent.

Biogeography of plant domestication
Sedentary lifestyle started with plant domestications at least half a millennium earlier than the first known livestock animal domestications . Many arguments have been put forward why domestication of plants started at this particular time. Early arguments, such as the Oasis hypothesis (Childe 1928), favoured climate change. Recently more emphasis has been put on socioanthropological causes, such as depletion of wild food resources . Undoubtedly reasons for the time and place were multiple, among which was the warm seasonal climate at the Fertile Crescent. This climate favoured a bimodal way of life for plants and animals: feasting during the rich seasons, and being dormant during the lean seasons. Such a bimodal operation requires effective storage of nutrients, as well as fast onset and growth at the start of the rain season. Plants can cope with seasonal aridity, among other ways, by producing large protein-rich seeds , that are nutritious and preserve well. Such seeds make a nutritious storable material, considered to be favourable for domestication.
Assuming that climate stress selects for domestication-friendly plant material, the next question is where in the world such climates could be found at the times of early domestication. Based on climate of the Fertile Crescent today,  argued that Mediterranean climate presents the most favourable climatic conditions for plant domestication. In the Koppen climate classification system (Kottek 2006) Mediterranean climate is characterized by dry summers and mild wet winters with the average temperature of the coolest month 0-22 o C and the cumulative precipitation in the driest summer month less than 30-40 mm. Figure 4a depicts the areas of Mediterranean climate today.  argued that out of all the Mediterranean zones the Fertile Crescent was favourable because it was the most inland.
Mediterranean climate zones today are ideal for wine production. This climate primarily generates scrubby and dense vegetation -broad-leaved evergreen shrubs, bushes and small trees. Plants of this kind are not ideal targets for plant domestication, not only because it takes several years for first fruits to appear. Such a vegetation cover would have been tiresome grounds for early agriculture, because the land would continuously have needed to be cleared of bushes and shrubs. Slightly harsher (dryer) climate, on the other hand, would have kept large parts of land clear from woody vegetation naturally.
Indeed, the first domesticated plants were grasses (wheat and barley). It is conceivable that the climate could have been harsher than the Mediterranean to select for grasses over woody shrubs. The next zone towards harshness is the Semi-arid zone, characterized by low and very seasonal annual precipitation. The precipitation threshold for Semi-arid zones depends on whether the main precipitation comes in winter or in summer and how it interplays with the temperature and the length of daylight. Vegetation in these zones is short or scrubby, dominated by grasses or small shrubs. The coverage is depicted in Figure 4b.
Visually, Semi-arid zones in Figure 4b coincide with many areas of early agriculture, drawn after Zeder   Several areas of early plant domestication are outside the proposed Semi-arid zone pattern, in particularrise domesticated in South China, bananas, yams and taro in New Guinea, millet, bean and burdock in Japan, as well as some parts of Americas. Of those, rice from China and bananas from New Guinea are perhaps most widely cultivated today. Both places have quite seasonal (New Guinea in highlands), but not particularly arid climate.
On the other hand, two hot Semi-arid areas, Kalahari in Africa and Australian coastlands, hosted no known plant domestications. Despite these open questions global match between Semi-arid climatic zones and areas of early plant domestication is quite striking, suggesting that hot Semi-arid zones denoted as BSh in the Koppen climate classification system (Kottek et al 2006) rather than Mediterranean zones (CSa), might have the primary climate for onset of agriculture.

Cold seasonal environments select for suitable livestock animals
Transitioning to a sedentary lifestyle and plant domestication required early humans to settle in places suitable for agriculture. In such circumstances humans who have settled for agriculture would naturally look for domesticable animals nearby.  pointed out a curious pattern: while the first livestock animals were domesticated at the Fertile Crescent, all of them presumably evolved in Central Asia, suggesting initial adaptation to much colder seasonal environments. And indeed, our computational analysis reported in the previous section hinted that seasonal semi-arid zones, such as cold steppe, are likely to host animals most suitable for livestock domestication.
Strong seasonality prescribes two modes of existence: the prosperous season during which the ecosystem is flourishing, and the lean season during which the ecosystem shuts down to bare essentials. While small mammals can sleep or hide (Liow et al 2009) through the bad season, large mammals living in seasonal environments have little alternatives but to migrate in order to survive the lean season. For successful migration, it is critical for the group to follow the leader (Guttal and Couzin 2010), as commonly observed, for example, in birds. Therefore, seasonal environments must select for follow-the-leader social structure, particularly in large mammals. Alternation between seasons requires accumulation of energy during the rich season in order to survive the lean season. Therefore, seasonal environments must also select for rapid growth of offsprings to be ready to realocate when time comes, and overall rapid growth of body mass in order to accumulate resources for the lean season.
Hypsodonty, which characterises durability of teeth and tolerance of abrasive food, appears to be one of the strongest explanatory variables for domestication in our analysis. Food for domestic animals needs to be common in human habitats and available in bulk, since specialized diets such as meat, fish, nuts, fruits or aquatic plants would take a lot of effort to collect and would not be economically viable. More importantly, the diet of domestic animals needs to be different from preferred human foods, since otherwise a more energetically efficient solution for humans would be to eat that food themselves. Grass is available in bulk and has low requirements for maintenance in semi-arid climatic zones and humans cannot utilize it directly due to lack of ability to digest cellulose. Not surprisingly, most of livestock animals appear to be mixedfeeders that given hypsodonty and digestive specializations can tolerate or even prefer grass.
Fast growth might be associated with seasonality and migration. Animals experience seasonality in both tropical and temperate environments, but in our analysis tropical habitats have a strong negative association with domestication. Indeed, a curious question is why some tropical groups, such as antelopes, have never been successfully domesticated. Since the tropical savanna animals have been hunted for a long time, they must be at least eatable for meat. Many African antelopes come from marginal and very seasonal environments and presumably have similar migratory and growth strategies for coping with the bad season, but hot rather than cold.
The main difference between the two environments is that in cold seasonal environments temperature is the limiting factor, while in hot seasonal environments seasonal lack of precipitation is the limit. Thus, cold seasonal environments are likely to select for fast growth of fat to help to withstand changing temperatures, while tropical seasonal savannas are likely to prioritize selection for minimum loss of water. Indeed, most of grazing antelopes native to such conditions, such as Oryx, can survive without drinking water for long periods 3 . Indeed, in our analysis water dependency is among the main explanatory variables for probability of domestication along with hypsodonty, migration, and non-tropical habitats.
For domestication to be successful animals should have an incentive to stay domesticated. An animal could escape predators for years, survive without food for weeks, but a water-dependent animal would hardly survive without water for more than a couple of days. Since humans are water-dependent themselves, they are always in reach of drinking water, and may have technology for finding and storing water. That could be an incentive for animals that regularly need drinking water to stay close to humans.
Some other variables in our analysis, such as migration, grazing, and highlands have the strongest association with domestication in the middle of their value range. Such animals can be characterized as "migratory, but not too much", "grazer, but not to an extreme", "highland habitats, but not too high up". This links back to observations from analyses of evolutionary processes, suggesting that the frontier for speciation, so called "species factory" (Bernor et al 1996, Fortelius et al 2014, is often not at the extreme ends of the environment, but right behind the extreme frontier. Semi-arid environments which have produced most of material for domestication (plants and animals) are somewhat like this -they are close to deserts, but not yet deserts.
Following our computational analysis we propose that cold seasonal habitats, particularly alpine environments, are the highly suitable for selecting valuable candidates for potential domestication as livestock. Our analysis also showed that animals from all over the trait space have been domesticated, but those of marginal environments (such as goats, llamas, camel, sheep) were the most densely representing domestication in our character space. Animals from marginal environments do not have luxury to be highly specialized, they can efficiently process low quality vegetation of various kinds appear to be the most valuable, since they help to convert low quality fibrous vegetation, which humans otherwise would not utilize, into energy that humans can use.

Crossroads at the Fertile Crescent
While hot Semi-arid climate zones in Figure 4b appear to be the most common for plant domestication, cold Semi-arid zones highlighted in the same figure appear to be common to produce animals highly suited for domestication. We propose that places where hot semi-arid and cold semi-arid zones meet have been the most suitable for early sedentary life. Indeed, the Fertile Crescent is one of several such zones. While the match in Figure 4b is based on Semi-arid climatic zones today, the final question for our analysis is how such zones were distributed around 10 000 years ago at the time of first domestications, and whether their global distribution can potentially explain different rates of domestication across continents.
The best currently accessible global climatic map of the past depicts vegetation at the Last Glacial Maximum, that is 15 000 -25 000 years before present (Ray and Adams 2001). Figure 5a depicts our projection of Ray and Adams (2001) climate reconstruction to the Koppen climatic classes, and Figure 5b depicts Semi-arid zones today (Kottek et al 2006) along with deserts for a direct comparison. At the time of early domestication of plants and animals, about 10 000 years before present, geographic distribution of the climatic zones must have looked as an interpolation of Figures 5a and 5b. Judging from these maps, the most notable changes in the hot Semi-arid areas have occurred in the Fertile Crescent, North-East Africa and Central America. In those locations hot Semi-arid areas were more widely spread than today. Two other major changes of Semi-arid areas have occurred in South America and Australia. In South America cold Semi-arid areas have shrunk, and in Australia hot Semi-Arid areas have expanded.
Looking at the distribution of zones around 20 000 years ago (Figure 5b), only two zones, marked by blue ellipses, appear to provide large enough interfaces between hot and cold marginal environments. By and large it seems that the Fertile Crescent and Central South America areas were the largest interfaces of suitable environments, and those are the continents where the highest rates of domestication occurred. Climatically speaking, temperate desert is mainly formed in Central Europe and North America due to topographic effect and monsoon climate 4 . Monsoon climate makes the summer drier in Central Eurasia, Arabian Peninsula and Mediterranean. These regions are mostly covered by cold steppe climatic zone (Bsk). North America does not have much land in sub-tropical region, so there is little room for subtropical desert there. Only Eurasia and Africa have large areas of subtropical arid area. As a result, Arabia becomes the major region in which the two climatic zones meet.
North America also seems to have had a reasonably large suitable interface, but at least visually the extreme frontier (hot or cold deserts) are lacking in Figure 5a, making the North American Semi-arid areas (highlighted in green). Yet, at the present day extreme areas are present, thus, lack of domestication in North America remains an open question, particularly given that our analysis has identified about the same average suitability of species as in Eurasia (Palearctic). Indomalaya domestications are few, and do not quite fit the common pattern of cold environments. Perhaps humid and hot tropical settings required locally adapted animals, and the two domesticated ones --Bubalus and Bos (zebu), were the most suitable locally.

CONCLUDING REMARKS
Human society today relies on a small number of livestock animals, and we keep wondering what made those taxa particularly suitable, was it about the animals themselves, the environments, the geography, the timing, or perhaps an interplay of all. As Alfred Crosby (2006) phrased it, domestication was often discontinuous and did not always work. The patterns we have analysed are results of what has worked.
Our analysis showed that the most suitable animals for domestication were generalists adapted to persistence in marginal environments of low productivity, largely corresponding to cold semi-arid climate zones. Such animals would have inhabited temperate steppe or grasslands. At the time of first domestications the global distribution of such areas should have been somewhat interim version in between of map 5a and map 5b. Yet by far the largest, and almost the only area for such animals to come from is likely to have been Central Asia, and that is where the big four domestic animals are thought to have originated from. The Fertile Crescent has been and is the major intersection of hot and cold semi-arid climate zones. If hot semi-arid zones select for suitable plants, cold semi-arid zones select for suitable animals, an intersection of the two zones must be an ideal place to domesticate both kinds.
Many questions are still open, one of which is why animals from marginal hot environments apparently are unsuitable for domestication. Even if animals from cold environments might be preferable due to growth rates, growth rates do not vary that much given the body mass, and this should automatically disqualify animals from hot climates. The Sahara desert is not very likely to have been a barrier either. The difference must be related to adaptations to the tropical habitats. We have hypothesized that water dependency and predator density are important for the follow-the-leader social structure, but some of the hot arid climate grazing antelopes, like waterbuck, are water dependent, and some, like African buffalo, are quite resistant to predators. We hope for interesting studies to follow.

A.1 Domestication of the first known livestock animals
To the best knowledge today, goat and sheep were the first herbivorous animals to be domesticated about 11000 years ago in different parts of the Fertile Crescent

A.2 Scoring scheme for habitat, dietary characteristics and dental traits
All the variables except for hypsodonty are encoded as categorical in the scale from 0 to 1, such that the variables could potentially be used as numeric in the analysis. A detailed description of the scoring scheme is given in the Supplement.

Grazer
Encoding: 1 means grazer, 0.67 encodes graze-dominated mixed-feeder, 0.33 encodes mixed-feeder that takes a small portion of grass, for instance, browse-dominated mixed-feeders are encoded this way, 0 means non-grazer, and mostly includes browsers, also frugivores and omnivores may appear in this category.

Highly specialized
This variable is meant to distinguish regular grazing-browsing herbivores from specialized diets, such as aquatic plants or fruits. Encoding: 0 encodes regular grazing or browsing herbivore, 1 encodes highly specialized diets.

Water dependence
This variable primarily means dependence on drinking water. Encoding: 1 means the animal needs to drink daily, 0.5 means drinking every 2-3 days, 0 means the animal can go a week or more without drinking water. The last category includes aridity adapted animals that do not need to drink water at all (e.g. Oryx), they get their water with food. Category 1 also includes aquatic animals, such as Hippopotamus or Bubalus.

Highland habitat
This is meant to encode habitats at high elevation, mountainous or hilly terrains. Encoding: 1 mountains or hills, 0.5 foothills, 0 steppes or forests, more or less flat terrain.

Migration
This variable is meant to encode seasonal migration. Historical migration information was considered, where possible, for example, Bubalus or Bison have been migratory in the past and therefore are scored as migratory. Encoding: 1 regular seasonal migration to different geographic areas, 0.5 occasional seasonal migration or seasonal expansion/decline in range, 0 no migration.

Highly endemic
This variable is meant to describe genera occurring in very restricted geographic areas, typically islands. Encoding: 1 highly endemic, 0 wide distribution.

Tropical
This variable is meant to capture biogeographic distribution, following biogeographic realms ( Figure A1). Encoding: all genera from Afrotropic and Indomalaya realms are assigned to category 1 -tropical, all genera from Palearctic and Nearctic are assigned to category 0 -non-tropical, genera from Neotropic and Australasia are assigned to one or the other on case-by-case basis.
Dental trait scoring generally follows . We leave out horizodonty and cement, since they have shown relatively little global signal in the past analyses. Scoring is typically done on the second upper molar, except for suids, where functionally dominant tooth is the third upper molar. Dental trait scores come from publications , Galbrun et al 2018, Zliobaite et al 2018and Oksanen et al 2018. Pseudoryx teeth are scored by the nearest living relative (Bubalus or Syncerus, as per Bibi 2013).

Hypsodonty (HYP)
Hypsodonty refers to ordinated molar crown height: 3 is for hypsodonty, where the tooth height-to-length ration is roughly higher than 1.2, 2 is for mesodonty, where the height-to-length ratio is 0.8-1.2, and 1 is for brachydonty, where the ratio is less than 0.8. Figure A.1 gives examples of brachydont, mesodont and hypsodont traits.

Acute lophs (AL)
This variable indicates presence (or absence) of sharp-bladed cutting structures, typically extending across two or more cusps. A cusp is one developmental unit of a tooth, it typically has a pointy shape. Encoding: 1 acute lophs are present, 0 absent.

Structural fortification (SF)
This variable encodes differential thickening of enamel that results in cusps remaining to stand out throughout tooth wear. Encoding: 1 structural fortification of cusps is present, 0 absent.

Goat-like morphology (GO)
The purpose of this variable is to capture non-specialized herbivore dental morphology, where only obtuse lophs are present, but other specialized loph features, like acute lophs, structural fortification or flat occlusal topography are absent. This variable does not need to be scored, it can be computed from OL, AL, SF and OT. Encoding: 1 if OL =1 and AL=0 and SF = 0 and OT = 0, else 0. Figure A5. Examples of different GO scores: (a) present (Antidorcas), (b) absent due to no lophs and flat occlusal surface OT=1 (Phacochoerus), (c) absent due to acute lophs AL = 1 (Anchitherium), (d) absent due to flat occlusal surface OT = 1 (Equus), (e) absent due to structural fortification SF = 1 (Redunca), (f) absent due to flat occlusal surface OT = 1 (Alcelaphus).

Bunodont (BU)
This variable encodes teeth that have no lophs, for example, human teeth. This variable does not need to be scored, it can be computed from OL, AL, SF, OT. Encoding: 1 if OL = 0 and AL = 0 and SF = 0 and OT = 0, otherwise 0.
The status of domestication is assigned following , , . Even though domestication attempts on other taxa might have happen in the past, we label animals as domesticated based on the present day status, that indicates success in domestication of a given taxa.
(A1) Figure A7 maps biogeographic realms, which we use as units of analysis for rates of domestication. Figure A8 gives the linear correlations between the explanatory variables describing candidates for domestication. Table A2 gives the dataset of candidate genera, and Table A3 lists genera that have been excluded from the candidate list. Table A4 gives the first columns of the rotation matrix of the principle component analysis.