## Abstract

Two recent meta-analyses of local-scale biodiversity change over time, by the authors of the present paper, have been subject to a harsh critique. Here we use new data and analyses to respond to the main points of this critique. First, a central argument of the critique was that short-term time series lead to biased estimates of long-term biodiversity change. However, we show here that this conclusion was based entirely on two fundamental mistakes in the simulations used to support it. Second, we show that the critic's conclusion that there are negative relationships between temporal biodiversity change and study duration is entirely dependent on: (i) the unrealistic assumption that biodiversity change can be positive when study duration = 0; (ii) the use of only a subset of the available data; (iii) inclusion of a single outlier data point in a single study (out of 100 in this case); and/or (iv) a choice to use log ratios rather than slopes (when available) as the metric of temporal biodiversity change. In short, the evidence does not support the conclusion that studies of longer duration tend to find local diversity decline. Finally, the critique highlighted the obviously true fact that studies in the ecological literature represent a geographically biased sample of locations on Earth; this issue was noted in both original papers, and is relevant for all ecological data syntheses. This fact was used by the critics to cast doubt on our conclusion that, outside of areas converted to croplands or asphalt, the distribution of temporal biodiversity trends is centered on zero. As a scientific rule, future studies based on more or different data may cause us to modify our conclusion, but at present, alternative conclusions based on the geographic-bias argument rely entirely on speculation. In sum, the critique raises points of uncertainty typical of all ecological studies, but it falls far short of providing an evidence-based alternative interpretation for our results, which are now supported by syntheses of hundreds of individual data sets of temporal biodiversity change.

## Introduction

Gonzalez et al. (in press) have raised concerns about two papers that collectively analyzed >250 individual datasets on biodiversity change through time from many parts of the world (Vellend et al. 2013, Dornelas et al. 2014). Both of these studies found that the average magnitude of temporal change across studies was indistinguishable from zero. The concerns of Gonzalez et al. are for the most part typical of those that could be directed at any ecological meta-analysis: different results might obtain in different places (underrepresented regions) or times (before people collected data of this nature), and it is possible to find data subsets that deviate from the overall pattern. These concerns were articulated by Gonzalez et al. in dramatic fashion so as to call into question our conclusions. Some aspects of the Gonzalez et al. critique were based on selective use of data and methods of analysis, while others focused on the nature of the data themselves and accompanying interpretations.

Here we present analyses, as well as new data, to support the following conclusions: (1) Shortterm time series do not provide biased estimates of long-term trends. The opposite conclusion presented by Gonzalez et al. was based on two basic errors in their simulation model and calculations. (2) There is no compelling evidence that studies of longer temporal duration show greater biodiversity decline. On this point, the analyses presented by Gonzalez et al. were fragile in the extreme with respect to single outlier data points, to assumptions about model structure, and to the inclusion of additional data. (3) The ecological literature is indeed geographically biased, a fact discussed explicitly in both Vellend et al. (2013) and Dornelas et al. (2014). The sophisticated analysis of Gonzalez et al. on this issue serves only to make the self-evident point that new data (in this case from underrepresented regions) might modify conclusions.

### (1) Short-term time series do not provide biased estimates of long-term trends

One key component of the Gonzalez et al. critique is in factual error (i.e., not a matter of selective interpretation). Simulations of species richness (S) over 50-year periods and subsequent calculations of log ratios (log(S_{after}/S_{before})) or slopes of richness on time during shorter time intervals (5, 10, 20 years) were used to argue that “Estimates of biodiversity change are systematically biased when syntheses are based on datasets composed primarily of short time series”. Gonzalez et al. made two different errors, the first of which applies only to log ratios, the second of which applies to both log ratios and slopes:

When calculating a mean effect size for “short” windows of time, Gonzalez et al. did not take into account the fact that a log ratio for, let’s say, a 10-year period of time is only expected (mathematically) to capture one fifth of the amount of change that occurs over 50 years. In other words, they did not multiply the average of 10-year windows by 5 before comparing with the 50-year effect size. This is equivalent to the fallacious argument that, hypothetically, temperature only went up by 0.5C per decade, so the estimate of the “real” increase of 2.5C over 50 years is biased. This is obviously incorrect.

The second problem is less obvious, but no less important, and it accounts for apparent diversity increases in medium-sized time windows (e.g., 20 years) when a 50-year period shows a richness decline initially, followed by an increase, and then a leveling off (see Fig. 1A-C). The problem is that with a bounded range of 50 years, “randomly” chosen segments of 20 years severely over-represent the middle portion of the time series. In another well-known ecological context, this is called the mid-domain effect to explain peak species richness at central latitudes or altitudes (Colwell and Lees 2000). However, whereas the boundaries in space are real, the temporal boundaries are not, as time is infinite in both directions. The first point in the time series, for example, is only part of one 20-year segment in the “population” from which the Gonzalez et al. simulations sample, 0:20. The second time point is part of two segments, 0:20 and 1:21, and so on. Time points 20-30, on the other hand, are each part of 20 different segments. So, with the decline in richness happening early during the 50-year time span, seemingly random samples of 20 years mostly miss the decline, while “detecting” a transient increase only because it happens to occur in the middle portion of the time series. The apparent bias detected by Gonzalez et al. is an artefact of their simulation analysis focusing on an arbitrary bounded time interval (Fig. 1).

If one examines sequential, non-overlapping portions of any length of a given time series, the average log ratio captures precisely the rate of change over the entire time series. Simulations are not required to demonstrate this point, although we provide one corrected example from Gonzalez et al. (Fig. 1), in addition to the following explanation from first principles. Imagine we have a species-richness (S) time series of five points, t_{0}:t_{4}, and thus four year-to-year transitions. The log ratio from beginning to end is log(S_{4}/S_{0}). The average of one-year intervals is:

So, as long as we account for the fact that the one-year intervals cover only one quarter of the full time series (i.e., we multiply this by four), we recover the original “target” log ratio for the full time series precisely (see also Fig. 1G). The same result will hold for two-year intervals in this time series, 10-year intervals of a 50-year time series, or any other combination. The same precise mathematical equivalence does not hold for slopes, but it is equally true that there is no systematic bias introduced by the fact of sampling a subset of a longer time series. An incomplete sample of the portions of the longer time series will introduce variance (as is always the case with sampling), but not systematic bias (Fig. 1). The conclusion, based on simulations, “that short time series can provide unreliable estimates of a known trend” (Gonzalez et al. in press) is simply incorrect.

### (2) Local biodiversity trends in studies of different duration

Gonzalez et al. used their incorrect argument that short-term time series bias estimates of temporal biodiversity trends as a springboard to asking whether longer duration studies tend to show biodiversity declines. In this section, we address this issue for the two original studies in turn.

Using the data from Vellend et al. (2013), Gonzalez et al. modeled the log ratio of species richness at the end and start of a study (see previous section) as a function of the duration of that study, finding a statistically significant (p = 0.04) but weak relationship (Fig. 2A). They emphasized the conclusion that longer-duration studies tend to show richness declines, although by allowing for a non-zero intercept, their results also require explaining a nonsensical positive biodiversity trend in studies that last zero years. If one makes the ecologically realistic assumption that the log ratio must be zero at duration = 0 (i.e., a zero intercept), not only is the slope not significant, but its raw value is actually positive rather than negative (Fig. 2B). This illustrates the potentially major influence of assumptions about model structure on the spurious detection of weak statistical relationships.

Given the controversy sparked by Vellend et al. (2013), we have since expanded the data set by 37% to include studies published through the end of 2014 (the original paper had studies published up to July 2012; https://github.com/lbaeten/DivChangeResponse). The methods were identical to those in Vellend et al. (2013), except that we did not additionally read through the references of all new papers to find additional data sets. With the larger data set of 212 studies (the 2013 paper had 155), there is no significant relationship between local richness change and study duration, regardless of whether one allows for a non-zero intercept (Fig. 2C,D).

The data in Dornelas et al. (2014) includes studies with diversity estimates for at least three time points, thus allowing the estimation of slopes of diversity vs. time, rather than only before-after log ratios. There is no significant relationship between the diversity-time slope and study duration (Fig. 3A,B). Gonzalez et al. chose instead to calculate log ratios using the data in Dornelas et al. (2014; see Dataset S1 in that paper), and reported a significant negative relationship between log ratios and study duration (Fig. 3C). Again their analysis allowed for a non-zero intercept, and if the intercept is fixed at zero the relationship is not significant (Fig. 3D). In addition, the Gonzalez et al. result is highly sensitive to one outlier, depending not only on a single study (reference 90 in Dornelas et al. 2014), but on a single data point in that study (species richness = 43 in 1911, and <20 for the next 90 years). In the absence of that one data point, the relationship is not statistically significant, regardless of whether one assumes a zero or non-zero intercept (Fig. 3E,F).

In sum, the evidence provided by Gonzalez et al. to support their claim that longer-duration studies tend to show biodiversity decline is exceedingly weak at best, depending on specific and unrealistic assumptions, and even then provides negligible predictive value. Whether using the realistic assumption of zero biodiversity change at duration=0, using a larger data set (always preferable), taking account of an outlier, or analyzing slopes instead of log ratios, we find no convincing evidence that estimates of biodiversity change depend on study duration. In any given time series, it is clearly possible and indeed likely that trend detection will depend on the particular period of time analyzed, but with observed trends so evenly spread above and below zero for the range of durations with lots of data (<50 years or so), there is at present no evidence to support Gonzalez et al.’s conclusion that longer-duration studies systematically show average local biodiversity declines.

### (3) The ecological literature is indeed geographically biased

Ecological studies of all kinds have been conducted far more often in Europe and North America, and nearby waters, than elsewhere. In the case of our meta-analyses, we are unable at present to estimate with confidence how local biodiversity has changed across the continent of Africa or the Indian Ocean, for example. However, while any given subset of data might deviate slightly from the overall pattern, there was no obvious signal that geographic bias led to bias against finding biodiversity decline. For example, in Vellend et al. (2013), the estimated mean log ratios of species richness change over time for South America (N = 12), Asia (N = 9), Australia (N = 5), and Africa (N = 2) were all positive. Following the selective data subsetting approach of Gonzalez et al., one could choose to conduct an analysis giving greater weight to these understudied regions: this would shift the estimated central tendency towards biodiversity increases rather than decreases. Ultimately, only new data from underrepresented regions can speak directly to what is happening in those regions, and thus prompt a potential re-assessment of conclusions. Local biodiversity change is very much dependent on specific, local circumstances, and new and interesting results from poorly known regions may well emerge in the future. Improving the spatial representation of these regions is a high priority in obtaining better estimates of local biodiversity change.

In sum, Gonzalez et al. present a sophisticated analysis to demonstrate the obvious point (noted in both original papers) that the data are geographically biased. We note that precisely the same limitation applies to most ecological synthesis and meta-analysis papers, including many by these authors (e.g., Cardinale et al. 2012, Hooper et al. 2012, Elahi et al. 2015, Haddad et al. 2015) in which there was no such vigorous effort to quantify geographic bias and its attendant consequences for limiting the scope of conclusions. In the meantime, we are working with the best data available to directly document temporal biodiversity change at the local scale. Converting natural ecosystems to croplands or parking lots causes a local loss of biodiversity (Newbold et al. 2015), but otherwise there is a great deal of variation but no clear tendency for the net temporal local biodiversity trend to be different from zero across the sites in the available data (Vellend et al. 2013, Dornelas et al. 2014, Elahi et al. 2015).

## To conclude

One point on which we agree with Gonzalez et al. concerns the need for better biodiversity monitoring in the future. Our knowledge of a great many places on earth is quite limited, and many drivers of biodiversity change are expected to push in opposite directions. For example, non-native species introductions typically increase regional-scale species richness (Sax and Gaines 2003, Winter et al. 2009), and in areas that are currently cold and humid (e.g., temperate-zone mountain tops), species richness is also expected to increase due to climate warming (Pauli et al. 2012). On the other hand, nitrogen deposition often causes plant diversity to decline (Simkin et al. 2016), and for some taxa habitat fragmentation can do the same (Haddad et al. 2015). In all of these cases, we can expect major changes in species composition-that is, species turnover-with important implications for biodiversity conservation efforts (Dornelas et al. 2014, Magurran 2016). How different forces balance out in the future can best be determined by systematic, long-term monitoring-a major priority for future research in ecology and conservation.

## Appendix A

Gonzalez et al. (in press) analysed the effect of study duration on the temporal change in biodiversity. Here we started with the same statistical models to analyse the original and updated data, and subsequently modified the models to correspond to the assumption of no diversity change for studies that last zero years (see main text).

For the Vellend et al. (2013) data, multiple data sets were recorded in many studies, which were not independent of one another. This dependency was accounted for by using a multilevel model to relate the log ratio effect sizes *y _{i}* to the study duration

*X*, that is,

_{i}*y*=

_{i}*α*

_{j[i]}+

*β*

_{j[i]}

*X*+ ɛ

_{i}_{i}(Gelman and Hill 2007). The intercepts and slopes varied between the studies according to a normal distribution with mean

**μ**and standard deviation σ, so and . Independent within-study errors also followed a normal distribution

*ε*(0,

_{i}~N*σ*

^{2}). Removing the study-level intercepts

*α*led to a model assuming no change in diversity for studies lasting zero years.

_{j}For the Dornelas et al. (2014) dataset, the effect sizes *y _{i}* (slope or log ratio) were regressed on the duration

*X*with an ordinary linear model, that is,

_{i}*y*=

_{i}*α*+

*β X*+ ɛ

_{i}_{i}, with an intercept α, slope β and normally distributed errors

*ε*. Again, the assumption of no diversity change for studies that last zero years was implemented by removing the intercept parameter.

_{i}~N(0,σ^{2})The models were run in a Bayesian framework using the Stan probabilistic modelling language, called from R using the ‘*rstan*’ package (Stan Development Team 2015a). We used normal priors for the regression coefficients (α,β) and location parameters (μ_{α},μ_{β}) and half-Cauchy priors for the scale parameters (σ’s) (Stan Development Team 2015b). The models were run for 5000 iterations of both warm up and sampling. Model convergence was assessed by running four chains with different starting values and by checking the trace plots and Rhat statistic. Analyses were performed in R version 3.2.4. All data and reproducible code for the analyses and graphs is available through https://github.com/lbaeten/DivChangeResponse