Power laws from the bird species abundance distribution

A recent study found that bird species with few individuals are numerous, whereas species with very large populations are rare. We show that these new data strongly suggest a power-law distribution rather than the more widely accepted log-normal. Moreover, we discuss extinction risk across the bird phylogeny and future conservation efforts in light of the hierarchical structure revealed by the new data.


Introduction
The species abundance distribution is one of the few universal patterns in ecology (Morlon et al. 2009). A new study found that bird species with few individuals are numerous, whereas species with very large populations are rare (Callaghan et al. 2021). The researchers used the recent influx of citizen science data to estimate the number of individuals (abundance) for 9,700 species, about 92 percent of all birds. They combined data for 724 well-studied species with counts from the app eBird (https://ebird.org/home), where people submit bird sightings. Then, they used an algorithm to extrapolate estimates to the rest of the sample. As a result, they discovered many species with small populations isolated in niche habitats and relatively few species spread over a vast territory. The study provides a methodological blueprint for quantifying abundance in detail. Here, we offer a further quantification of these dynamics and show that they belong to the class of hierarchical system dynamics. In particular, we find power laws across the range from the most to the least abundant bird species. Figure 1a shows the data (pnas.2023170118.sd01). Out of 50 billion birds, there are four undomesticated species with a billion or more individuals each: house sparrows (Passer domesticus), European starlings (Sturnus vulgaris), ring-billed gulls (Larus delawarensis), and barn swallows (Hirundo rustica). By contrast, 5,022 species number fewer than 500,000 birds each. Figure 1a provides a glimpse of the power-law pattern we find and immediately draws attention to the group of 1,178 species that visibly depart from the pattern. Many ecological processes structure the abundance of a species (McGill et al. 2007). For example, the numerous rarer species may have evolved to occupy a single island, but human activities such as deforestation may further explain the deviation from the pattern in Figure 1a. Possibly, some species are rare because of human interference. This issue calls for further investigation from the perspective of conservation efforts.

Materials and methods
A power law governs a quantity when the probability of getting a given value varies inversely with a power of that value (Newman 2005). Thus, a power law is a relationship between two quantities in which a relative change in one leads to a proportional relative change in the other. Moreover, this holds regardless of the initial values. Power laws are common in hierarchical system dynamics arising in physics, economics, and biology.
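The scale-free property stated above can be checked in one line. This is a minimal sketch that assumes a generic power-law form; the symbols $f$, $C$, $\gamma$, and $c$ are introduced here for illustration only and are not the notation of the paper.

```latex
f(x) = C\,x^{-\gamma}
\quad\Longrightarrow\quad
\frac{f(cx)}{f(x)} = \frac{C\,(cx)^{-\gamma}}{C\,x^{-\gamma}} = c^{-\gamma}
\qquad \text{for any } x > 0,\ c > 0 .
```

Rescaling $x$ by a factor $c$ multiplies $f$ by $c^{-\gamma}$ whatever the starting value of $x$, which is the sense in which the relationship holds regardless of the initial values.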
A power law geometrically reveals itself as a straight line in a plot of the logarithm of rank against the logarithm of the quantity at hand. After ranking the bird species from most to least abundant, three power laws emerge. We show this in Figure 1b, which plots log rank against log abundance. We quantify the power laws in Figure 1b by computing their Pareto exponents, which are related to the slopes of the straight lines.
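The ranking step can be sketched as follows. This is an illustrative example and not the authors' code: the function name `rank_abundance` is hypothetical, and the abundances are synthetic values built to follow an exact rank-size rule, not the Callaghan et al. (2021) estimates.

```python
import numpy as np

def rank_abundance(abundances):
    """Return (log10 rank, log10 abundance), ranked from most to least abundant."""
    x = np.sort(np.asarray(abundances, dtype=float))[::-1]  # descending order
    ranks = np.arange(1, x.size + 1)                         # rank 1 = most abundant
    return np.log10(ranks), np.log10(x)

# Synthetic abundances following an exact rank-size rule x_r = C * r**(-1/alpha).
# Illustrative numbers only; they are NOT the published abundance estimates.
alpha_true = 1.5
C = 1e9
r = np.arange(1, 201)
synthetic = C * r ** (-1.0 / alpha_true)

log_rank, log_abund = rank_abundance(synthetic)

# On an exact power law the log-log points lie on one straight line,
# so the slope between consecutive points is the same everywhere.
slopes = np.diff(log_abund) / np.diff(log_rank)
print(slopes.min(), slopes.max())
```

With real data the points would only approximate straight segments, and breaks between segments would mark the boundaries between groups of species.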
One standard method is to take the survival function $S(x)$ in a Pareto type I model (Jenkins 2017). The fraction of bird species with abundance greater than $x$, that is, one minus the cumulative distribution function $P(x)$, is given by

$$S(x) = 1 - P(x) = \left(\frac{x}{x_m}\right)^{-\alpha}, \tag{1}$$

where $x_m$ is the lower bound on abundance, and $0 < x_m \le x$. The shape parameter $\alpha$ is the Pareto exponent (tail index), describing the heaviness of the right tail, with smaller values meaning a heavier tail.
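Taking logarithms of the Pareto type I survival function, $S(x) = (x/x_m)^{-\alpha}$ for $x \ge x_m$, shows why the exponent can be read off a straight line. The second line below rests on an assumption made for this sketch: that the empirical survival function at the species of rank $r$ (out of $n$ species) is approximated by $r/n$.

```latex
\log S(x) = \alpha \log x_m - \alpha \log x , \qquad x \ge x_m ,
\\[4pt]
S\bigl(x_{(r)}\bigr) \approx \frac{r}{n}
\quad\Longrightarrow\quad
\log r \approx \log n + \alpha \log x_m - \alpha \log x_{(r)} .
```

Under that approximation, a plot of log rank against log abundance is linear with slope $-\alpha$, and the lower bound $x_m$ only shifts the intercept.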

Results
We can estimate the three Pareto exponents associated with the rare and abundant species by ordinary least squares regressions of the log of the survivor function on the log of abundance and a constant term (Jenkins 2017). These generate straight-line segments of slope $-\alpha$ (Figure 1b). In particular, to determine the group of the $n_A$ most abundant species (apart from the top four), we run linear regressions varying $n_A$ from 10 to 5000 and record the corresponding values of the coefficient of determination $R^2$. Then, we do the same for the $n_R$ rarest species, also varying $n_R$ from 10 to 5000. Figure 1c shows the results. For the most abundant species, model (1) fits well for the $n_A = 534$ most abundant species, giving $R^2 = 0.998$. As for the rarest species, the best fit gives $R^2 = 0.966$ for the $n_R = 89$ rarest species. Table 1 shows the Pareto exponents for the three subsamples. The value of $\alpha$ very close to zero for rare species indicates a heavier tail. Thus, these species are in a zone of high uncertainty because there is no finite expected value for the abundance $X$ of rare species. Therefore, such species are in a non-equilibrium state in which they are heading either toward extinction ($X = 0$) or toward becoming less rare. On the other hand, the distribution of the most abundant species apart from the top four has a finite expected value of abundance $X$. However, this distribution does not have a finite variance (as $\alpha < 2$), which means this group is still subject to abrupt shifts in hierarchical position. Finally, the group of the top four has the lightest tail. The literature already points out that extinction risk is unequally spread across the bird phylogeny (Purvis 2008), but we provide detailed information of interest for future conservation efforts.
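The regression scan described above can be sketched as follows. This is a hedged illustration rather than the authors' procedure: the function names `survival_regression` and `scan_top_group` are hypothetical, the empirical survival function is assumed to be global rank divided by the total number of species, and the data are synthetic values constructed to follow an exact Pareto law.

```python
import numpy as np

def survival_regression(x_desc, start, stop):
    """OLS of log S(x) on log x over ranks start+1..stop of globally ranked data.

    x_desc must be sorted from most to least abundant. The empirical survival
    function is taken as rank / n_total (an assumption of this sketch).
    Returns (alpha_hat, r_squared).
    """
    x_desc = np.asarray(x_desc, dtype=float)
    n_total = x_desc.size
    ranks = np.arange(1, n_total + 1)
    log_x = np.log(x_desc[start:stop])
    log_s = np.log(ranks[start:stop] / n_total)
    slope, intercept = np.polyfit(log_x, log_s, 1)   # straight-line fit
    fitted = slope * log_x + intercept
    ss_res = np.sum((log_s - fitted) ** 2)
    ss_tot = np.sum((log_s - log_s.mean()) ** 2)
    return -slope, 1.0 - ss_res / ss_tot             # slope of the line is -alpha

def scan_top_group(x_desc, skip, sizes):
    """Fit the n most abundant species after skipping the first `skip` ranks."""
    return [(n,) + survival_regression(x_desc, skip, skip + n) for n in sizes]

# Synthetic, exactly Pareto-distributed abundances (illustrative only).
alpha_true = 1.5
n_species = 1000
rank = np.arange(1, n_species + 1)
x = 1e3 * (rank / n_species) ** (-1.0 / alpha_true)

# Skip the top four, vary the group size, and keep the size with the highest R^2.
results = scan_top_group(x, skip=4, sizes=range(10, 501, 10))
best_n, best_alpha, best_r2 = max(results, key=lambda t: t[2])
print(best_n, round(best_alpha, 3), round(best_r2, 3))
```

The same function can be applied to the rarest species by passing the last `n` positions of the ranked array as `start` and `stop`. On exact synthetic data every group size fits almost perfectly, so the selected size is not meaningful here; with real data the $R^2$ profile is what identifies the group boundary.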

Discussion
Abundance estimates of bird species are critical for ecology, evolutionary biology, and conservation, and improvements in quantifying abundance are welcome. A log-normal is the most widely accepted form for abundance distributions in the ecology literature (Chisholm 2007), but the new data we analyze here strongly suggest that a power-law distribution is more appropriate. This insight comes from the hierarchical structure revealed by the data. This article shows that power laws summarize the finding that the smaller a species' population, the more numerous such species are. This means that results from the literature on hierarchical system dynamics can now be brought in to enrich the discussion. Figure 2 shows a hypothetical evolution of the Pareto exponent for the top four species as we progressively remove the k most abundant species. In the benchmark of Figure 1b, no abundant species is removed, so k = 0. Removing the single most abundant species corresponds to k = 1, and the impact on the Pareto exponent of the remaining top four is displayed in Figure 2. Then, we compute the Pareto exponents after removing the two most abundant species (k = 2), the three most abundant (k = 3), and so on. The blue line in Figure 2 is a Loess smooth curve. The shaded region represents a 95 percent confidence interval for $\alpha$.
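The removal exercise can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `top_group_exponent` is hypothetical, species are assumed to be re-ranked after each removal, the survival function is taken as rank divided by the number of remaining species, and the data are synthetic. The Loess smoothing and the confidence band of Figure 2 are omitted.

```python
import numpy as np

def top_group_exponent(x_desc, k, group_size=4):
    """Pareto exponent of the new top group after removing the k most abundant.

    x_desc must be sorted from most to least abundant. After removal the
    remaining species are re-ranked from 1 (an assumption of this sketch).
    """
    remaining = np.asarray(x_desc, dtype=float)[k:]     # drop the k most abundant
    n = remaining.size
    log_x = np.log(remaining[:group_size])              # the new top group
    log_s = np.log(np.arange(1, group_size + 1) / n)    # empirical survival function
    slope, _ = np.polyfit(log_x, log_s, 1)
    return -slope                                       # slope of the line is -alpha

# Synthetic, exactly Pareto-distributed abundances (illustrative only).
alpha_true = 1.5
n_species = 1000
rank = np.arange(1, n_species + 1)
x = 1e3 * (rank / n_species) ** (-1.0 / alpha_true)

# Exponent of the top four for k = 0 (benchmark), 1, 2, ..., 49 removals.
alphas = [top_group_exponent(x, k) for k in range(0, 50)]
print(round(alphas[0], 3))
```

The synthetic series is not meant to reproduce Figure 2; it only shows the mechanics of recomputing the exponent of the top group for each value of k.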
The results of this exercise reveal a pattern of interest regarding the impact of the extinction of the most abundant species on the remaining (top four) species. Notably, the Pareto exponents become unstable. The values of $\alpha$ oscillate between light-tailed ($\alpha > 2$) and heavy-tailed regimes. Of note, when $1 < \alpha \le 2$ the Pareto distribution has no finite variance, and when $\alpha \le 1$ it has no finite mean either. Therefore, the extinction of the big four would dramatically increase the overall uncertainty for the other species.
Callaghan et al. (2021) correctly observe that many species in their data have very small population estimates, and relatively few species are very abundant. There are about 200 more species than expected if bird species abundance followed a truly log-normal distribution. Nevertheless, they settle for a log left-skewed distribution (Callaghan et al. 2021) rather than the power-law distribution we consider here. The flexibility of the log left-skewed distribution accommodates the data well, but the Pareto seems more compelling for our piecewise approach. Still, we do not dismiss the log left-skewed distribution; we consider the Pareto limit because of our focus on tail behavior.
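For reference, the thresholds on the Pareto exponent follow from the standard moments of a Pareto type I variable $X$ with lower bound $x_m > 0$ and exponent $\alpha$:

```latex
\mathbb{E}[X] =
\begin{cases}
\dfrac{\alpha\, x_m}{\alpha - 1}, & \alpha > 1, \\[8pt]
\infty, & \alpha \le 1,
\end{cases}
\qquad
\operatorname{Var}(X) =
\begin{cases}
\dfrac{\alpha\, x_m^{2}}{(\alpha - 1)^{2}(\alpha - 2)}, & \alpha > 2, \\[8pt]
\infty, & 1 < \alpha \le 2 .
\end{cases}
```

An estimate that moves across $\alpha = 2$ or $\alpha = 1$ therefore changes whether the variance, or even the mean, of abundance is finite.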

Concluding remarks
One weakness of the data is that, implausibly, many of the ten most abundant species have a Nearctic distribution, including the ring-billed gull, the third most numerous. This limitation prevents reliable estimates using the eBird database for most species that do not occur widely in the Americas. Nevertheless, we believe more reliable estimates will not overturn our finding of the hierarchical nature of the bird species abundance distribution because power laws reveal themselves even when data are scant.