The prevalence and impact of transient species in ecological communities

Transient species occur infrequently in a community over time and do not maintain viable local populations. Because transient species interact differently than non-transients with their biotic and abiotic environment, it is important to characterize the prevalence of these species and how they affect our understanding of ecological systems. We quantified the prevalence and impact of transient species in communities using data on over 17,000 community time series spanning an array of ecosystems, taxonomic groups, and spatial scales. We found that transient species are a general feature of communities regardless of taxa or ecosystem. The proportion of these species decreases with spatial scale, so comparative work needs to control for scale. Removing transient species from analyses influences the form of a suite of commonly studied ecological patterns including species-abundance distributions, species-energy relationships, species-area relationships, and temporal turnover. Careful consideration should be given to whether transient species are included in analyses, depending on the theoretical and practical relevance of these species for the question being studied.


Introduction
Ecologists frequently conduct taxonomic surveys to characterize the diversity and composition of ecological assemblages. While many of the species observed in these surveys represent local populations, some may be irregular visitors that do not maintain viable local populations, are poorly suited to the local conditions, and rarely interact with other members of the community. Grinnell (1922) coined the term "accidental" to refer to this kind of species, which is observed inconsistently at a site over time in contrast to the more regular and predictable members of an assemblage. This group of species has also been referred to as "occasional", "vagrant", "transient", and "tourist" (Southwood et al. 1982; Costello and Myers 1996; Novotný and Basset 2000; Magurran and Henderson 2003; Ulrich and Ollik 2004; Dolan et al. 2009; Coyle et al. 2013; Petersen et al. 2015; Supp et al. 2015). Regardless of the name applied, these species (hereafter "transients") have generally been identified based on the low frequency of observations recorded in samples or surveys over time at a given location (i.e., low temporal occupancy). Several studies ranging from birds to fish to amphipods have found that temporal occupancy is frequently bimodally distributed within communities, with one distinct mode at very low occupancy reflecting transient species, and another mode at high occupancy reflecting temporally persistent "core" species (Figure 1A; Costello and Myers 1996; Magurran and Henderson 2003; Coyle et al. 2013; Umaña et al. 2017). The existence of a mode at low occupancy indicates that transient species may make up a larger proportion of ecological assemblages than has typically been acknowledged.
Transient species are expected to interact with their biotic and abiotic environments differently than core species since by definition they do not maintain viable local populations and are not necessarily well adapted to the local environments in which they are found (Magurran and Henderson 2003; Coyle et al. 2013; Umaña et al. 2017). Previous studies found that core
Coyle et al. (2013) illustrating one mode of "core" species observed consistently at sites and a mode of low occupancy "transient" species observed irregularly. (B) Core and transient species exhibit different species abundance distributions for the Hinkley Point fish assemblage (Magurran and Henderson 2003). (C) Four contiguous quadrats in which species (different shapes) may be core (shaded) or transient (open). (D) The species-area relationships for (C) depending on whether transient species are excluded or not, using the lower right panel to represent the smallest area. Because every species is core in at least one quadrat, species richness at the largest scale is the same for the two relationships. (E) Temporal turnover (the Jaccard index of dissimilarity) is much lower when transient species are excluded from the calculation, since they are the species most driving turnover.
sampling in either space or time (e.g., variable numbers of surveys per year, or variable numbers of spatial units per survey), we standardized the level of spatial or temporal subsampling for that site in each year of the time series (see details in Appendix, Figure A1).
if it was observed in 33% or fewer of the temporal sampling intervals, and assessed the prevalence of transients as the proportion of species in the assemblage below this threshold (Figure 1A). We also evaluated more restrictive definitions using maximum temporal occupancy thresholds of 10% and 25% to evaluate the impact of this decision.
Results were qualitatively similar for the three different thresholds (Figures A2-A6).
Although many authors have used the bimodality of temporal occupancy distributions (e.g., Figure 1A) to identify transient species in this way (Magurran and Henderson 2003; Dolan et al. 2009; Coyle et al. 2013), some species will be incorrectly classified due to imperfect detectability. Species with low detectability due to low density or traits or behaviors that make them difficult to detect may be persistent at a site but only detected in a small proportion of samples (MacKenzie et al. 2006). As such, estimates of the proportion of transient species based on observed temporal occupancy are likely higher than the true numbers. A full exploration of the detailed influence of imperfect detection is beyond the scope of this paper, but we are developing simulation-based approaches to understand precisely how it influences estimates of the proportion of transients as well as the identification of individual species (Hurlbert unpublished data).
While imperfect detection is clearly a concern for analyses of this type, there is also evidence that using observed occupancy provides a reasonable first approximation of transient status. Magurran and Henderson (2003) showed that using occupancy to identify species as transient is consistent with using habitat preferences. In an examination of nearly 500 bird communities, Coyle et al. (2013) showed that transient species richness was correlated with regional habitat heterogeneity, as would be expected of true transients, while it was not positively correlated with vegetation, which would be expected to impede species detections. In addition, similar studies using habitat preference-based transient designations (Belmaker 2009) have yielded similar conclusions to those using occupancy-based approaches (Coyle et al. 2013).
Finally, the results in this paper are similar for species that are comprehensively surveyed and those that are less thoroughly sampled (see Results and Considerations). So, while there is no doubt that misclassifications will occur, for large data compilations like this one that lack both detailed habitat preference data for species and the necessary sampling methods to estimate detection probabilities, occupancy-based approaches appear to provide a reasonable approximate classification. We address these issues further in the Considerations section of the Discussion.
We evaluated the effect of spatial scale on the perceived prevalence of transient species using the subset of datasets that included sampling at hierarchically nested spatial scales. We used a linear mixed model to quantify how the proportion of transient species in an assemblage varied with the spatial scale over which the assemblage was characterized. The model included taxonomic group as a fixed effect and dataset as a random effect, with both variables having the potential to influence both the slope and intercept of the relationship. Area was log-transformed for analysis. Because scale will be perceived differently by organisms of different size (e.g., a 1 ha quadrat is effectively much larger for ants than for birds), area may not allow for direct comparisons of "scale" among taxonomic groups. As such, we also built a similar mixed model using the median community size for all assemblages (i.e., the total number of individuals sampled in an assemblage, median = 102) as an alternative, potentially more generalizable, measure of scale.
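As a concrete illustration of the occupancy-based classification described in the Methods, the following is a minimal Python sketch, not the authors' R implementation. The function names, the `(year, species)` record layout, and the toy data are all hypothetical. It assumes one temporal sampling interval per year and that every sampled year contributes at least one record, so that the number of sampling intervals can be read from the records themselves.

```python
from collections import defaultdict


def temporal_occupancy(records):
    """Return {species: fraction of sampled years in which it was observed}.

    `records` is an iterable of (year, species) pairs for a single site.
    Assumption: every sampled year appears in at least one record.
    """
    years = set()
    years_seen = defaultdict(set)
    for year, species in records:
        years.add(year)
        years_seen[species].add(year)
    if not years:
        raise ValueError("no records supplied")
    n_years = len(years)
    return {sp: len(ys) / n_years for sp, ys in years_seen.items()}


def proportion_transient(occupancy, threshold=0.33):
    """Proportion of species whose temporal occupancy is <= threshold."""
    if not occupancy:
        raise ValueError("empty assemblage")
    n_transient = sum(1 for occ in occupancy.values() if occ <= threshold)
    return n_transient / len(occupancy)


# Hypothetical 10-year time series with four species.
records = (
    [(y, "A") for y in range(2000, 2010)]      # observed every year
    + [(y, "B") for y in (2000, 2003, 2007)]   # observed in 3 of 10 years
    + [(2005, "C")]                            # observed in 1 of 10 years
    + [(y, "D") for y in range(2000, 2005)]    # observed in 5 of 10 years
)
occ = temporal_occupancy(records)

# Compare the main threshold with the two stricter alternatives.
for threshold in (0.33, 0.25, 0.10):
    print(threshold, proportion_transient(occ, threshold))
```

Keeping the threshold as a parameter makes it straightforward to repeat the classification at the 10% and 25% thresholds, mirroring the sensitivity check described in the text.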
To explore the influence of habitat heterogeneity on the prevalence of transients we used a linear mixed model to predict the proportion of transients as a function of elevational heterogeneity (the variance in elevation within a 5 km radius of the site), with spatial scale (using community size as a proxy) as a covariate and taxonomic group as a random effect. P-values were estimated from the t-statistics using a normal approximation. All terrestrial datasets with geographic coordinates were used to fit the model. We used a 30 arc-second digital elevation model (DEM) of North America (GTOPO30), acquired from the USGS Earth Resources Observation and Science Center (EROS), to calculate the variance of elevation. We calculated a pseudo R² for each mixed model based on the fit between predicted and observed values.
Finally, we quantified the influence of transient species on a suite of commonly studied ecological patterns including species-abundance distributions, species-area relationships, temporal turnover, and correlates of species richness. We did this by comparing the form of these patterns when using data on the entire community to the same pattern generated after excluding species that were identified as transients (i.e., those species with temporal occupancy ≤ 33%). We fit two species-abundance distributions, the logseries and the Poisson lognormal, to the combined abundance data across years for each time series. Magurran and Henderson (2003) proposed that transient species should be better fit by the logseries and core species by the lognormal, meaning that excluding transient species should result in improved fits by the lognormal. We compared the fits of the two distributions based on AICc model weights. Analysis of species-area relationships was restricted to datasets with hierarchical spatial sampling.
Power function relationships were fit to each assemblage using linear regression on log-transformed data (Xiao et al. 2011) to predict the number of species observed from the area sampled. The fitted exponents of the relationships were compared. Mean temporal turnover was calculated as the mean of the Jaccard dissimilarity index (Krebs 1999; Figure 1E) between all adjacent time samples in each community time series. Analyses of the drivers of species richness were restricted to data from the Breeding Bird Survey of North America since it was the only dataset that employed consistent sampling across large spatial scales with a large number of replicates. For this last set of analyses we used two environmental correlates that are known to be important for determining richness in this dataset, the Normalized Difference Vegetation Index (NDVI), a remotely sensed estimate of productivity, and elevation (White and Hurlbert 2010). We calculated correlation coefficients between each environmental variable and species richness (including or excluding transient species), as well as correlation coefficients for transient species richness alone to further illuminate differences.
The complete set of R scripts for data cleaning and processing are available on GitHub
Figure 3A and
(Table 1). There was no evidence for an interaction between elevational heterogeneity and community size (p = 0.98).
Species turnover was always higher when transient species were included than when they were excluded, with an average deviation of 0.11 (Figure 5C). Finally, the exponent of the species-area relationship was typically higher when excluding transients (average deviation = 0.07; Figure 5D). All results were similar using alternative occupancy thresholds to define transient species.
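The temporal turnover and species-area calculations described in the Methods can be sketched as follows. This is a simplified Python illustration under stated assumptions rather than the authors' R code: the function names and toy data are hypothetical, each time sample is represented as a set of species names, and the species-area exponent is taken as the ordinary least-squares slope of log richness on log area.

```python
import math


def jaccard_dissimilarity(a, b):
    """Jaccard dissimilarity: 1 - |A and B| / |A or B| for two species sets."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # two empty samples are treated as identical
    return 1.0 - len(a & b) / len(union)


def mean_turnover(samples):
    """Mean Jaccard dissimilarity between all adjacent time samples."""
    pairs = list(zip(samples, samples[1:]))
    if not pairs:
        raise ValueError("need at least two time samples")
    return sum(jaccard_dissimilarity(x, y) for x, y in pairs) / len(pairs)


def sar_exponent(areas, richness):
    """Slope of the least-squares fit of log(richness) on log(area)."""
    xs = [math.log(a) for a in areas]
    ys = [math.log(s) for s in richness]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical three-year series: "a" and "b" are core, "t1" and "t2" transient.
all_species = [{"a", "b", "t1"}, {"a", "b"}, {"a", "b", "t2"}]
transients = {"t1", "t2"}
core_only = [sample - transients for sample in all_species]

turnover_all = mean_turnover(all_species)
turnover_core = mean_turnover(core_only)

# Hypothetical nested areas with richness following S = A ** 0.5.
z = sar_exponent([1, 4, 16], [1, 2, 4])
print(turnover_all, turnover_core, z)
```

In this toy series the core species never change, so all of the turnover comes from the two transient species, which is the qualitative effect illustrated in Figure 1E.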

Discussion
We quantified the prevalence and impact of transient species in ecological communities using data on over 17,000 community time series spanning multiple ecosystems, taxonomic groups, and spatial scales. Transient species were a common element of communities in all taxa
community is related to spatial scale. For communities sampled at multiple spatial scales, the proportion of transient species decreased with increasing scale, as species were more likely to be observed and actually persist over larger sampling areas. As a result, comparisons of the prevalence of transient species between studies should account for scale. However, area per se may not be directly comparable between communities that differ substantially in body size or otherwise use space differently. An alternative measure of scale, community size, effectively controls for differences in area usage between taxonomic groups by integrating the influence of each species' distinct life history traits and home range sizes. Correcting for scale in this way, we found that the proportion of transient species did not vary with ecosystem type, whereas ignoring scale would have led to the conclusion that transient species were much more common in freshwater than terrestrial ecosystems. Similarly, correcting for scale led to a more even distribution of the proportion of transient species across taxonomic groups, and some groups that would otherwise have been inferred to differ substantially in the prevalence of transients were actually found to be comparable.
Differences in the prevalence of transient species were evident among taxonomic groups even when controlling for spatial scale. Invertebrate, plant, and bird communities had the highest proportion of transient species while plankton and mammal communities had the lowest. These taxonomic groups differ in many respects, precluding a rigorous analysis, but we speculate that traits such as dispersal ability and habitat specialization may increase the likelihood of species being temporarily observed in areas where they are not well adapted and hence being recorded as transients. For example, birds have strong dispersal ability relative to the other taxonomic groups
grounds, many SAD models may be more appropriately applied to all species observed, or only to the set of species that strongly interact and maintain viable populations. For example, neutral theory applies to all species, as it explicitly allows for rare immigration or speciation events (Hubbell 2001), whereas resource allocation based niche apportionment models (MacArthur 1957; Tokeshi 1990) are likely more appropriately applied only to non-transient species. While the SAD may not be sufficient on its own to infer community structuring processes (Cohen 1968; Volkov et al. 2005; Baldridge et al. 2016; but see Connolly et al. 2014), it is one of several ecological patterns that may collectively shed light on such mechanisms (McGill et al. 2007; Blonder et al. 2014). As such, consideration of transient species has the potential to influence our understanding of local community structure.
In addition to influencing measures of local community structure, the inclusion of transient species also affected measures of how ecological systems turn over and change with scale. Estimates of temporal turnover were always higher when transients were included in assemblages.
This occurs because transient species are only present over a small fraction of a time series, resulting in higher turnover in species composition within a community over time (see also Magurran and Henderson 2010). Conversely, the inclusion of transient species led to lower estimates of spatial turnover as reflected in the slope of species-area relationships. This is because a greater proportion of the species list at small spatial scales is identified as transient compared to at a larger scale. As such, including transient species increases richness more at small scales than large, resulting in a shallower species-area relationship and lower spatial turnover (Figure 1D). Turnover and associated scaling relationships have implications for assessment of community responses to global change (Brown et al. 1997; Suding et al. 2008
for understanding local to regional scale ecological systems.
Given the impact on a wide range of ecological patterns, the decision to include or exclude transient species in a community analysis is an important one that should be made by explicitly considering the nature of the conceptual framework or theory being investigated. In some cases, it will be necessary to remove these species from analyses or risk making improper inferences.

Considerations
Conceptually, transient species are those that do not maintain persistent populations over time and therefore only appear infrequently during surveys. The bimodality of temporal occupancy distributions (e.g., Figure 1A) has led many authors to suggest that temporal occupancy can be
However, for many groups detailed information on habitat preferences or estimates of true population persistence is not readily available, and a definition based on a universal occupancy threshold is currently the most feasible option for analyzing hundreds or thousands of assemblages for cross-taxon comparisons like those presented here.
As described in the Methods, there is evidence that occupancy based thresholds provide reasonable identifications of transient species (Magurran and Henderson 2003; Belmaker 2009; Coyle et al. 2013). There is additional evidence from our results that using this raw occupancy based approach provides a reasonable approximate classification. First, the misclassification rate should presumably be lower when defining transient species using stricter occupancy thresholds, and so the consistency of our results across multiple thresholds lends some confidence to this approach. Second, for certain communities the taxonomic group and mode of data collection provide nearly complete censuses of all individuals within a static sample (e.g., plant stems within a quadrat or fish in a seine net). In these communities, imperfect detection should have little influence on estimates of occupancy (at the scale of sampling).
The similarity of results in this study across groups that tend to be thoroughly surveyed (e.g., plants and fish) and those that are less intensively sampled (e.g., birds and butterflies) suggests that our results are not driven heavily
Kasumigaura: The Lake Kasumigaura database, Table 10 Phytoplankton density, Lake Kasumigaura database,
Table A1.

Figure A2. The impact of different thresholds on the proportion of transient species in assemblages from different taxonomic groups. Taxon symbols as in Figure 2.
Figure A3. The impact of scale on the proportion of transient species as displayed in Figure 4 in the main text, but where transient species are defined as those with temporal occupancy ≤ 10%.
Figure A4. The impact of scale on the proportion of transient species as displayed in Figure 4 in the main text, but where transient species are defined as those with temporal occupancy ≤ 25%.
Figure A5. Impact of excluding transient species on four ecological patterns as displayed in Figure 5 in the main text, but where transient species are defined as those with temporal occupancy ≤ 10%.
Figure A6. Impact of excluding transient species on four ecological patterns as displayed in Figure 5 in the main text, but where transient species are defined as those with temporal occupancy ≤ 25%.