Stochastic dispersal assembly dominates tropical invertebrate community assembly regardless of land use intensity

Abstract


Introduction
Community assembly processes drive biodiversity patterns, and a key goal in community ecology is to quantify the relative importance of different community assembly processes. Simultaneously, it is well documented that land use changes impact biodiversity [1], but it is much less well known that they can also affect community assembly mechanisms [2]. Currently, therefore, we have a strong awareness of the patterns of biodiversity change that are generated by land use change [1], but little understanding of the extent to which the fundamental community assembly processes that create that change are impacted. Moreover, the specific assembly processes generating ecological communities may differ between taxa due to differences in trait evolution [3], meaning studies examining land use impacts on the assembly processes of mammals (e.g. [2]) may provide little insight into the impacts on other taxa. Attempts to rely on natural ecological processes to restore biodiversity rely, by definition, on naturally occurring community assembly processes [4,5]. It is therefore of fundamental importance that we gain a deeper understanding of whether those assembly processes in modified habitats are the same as or different from those observed in primary habitats.
It is generally accepted that both stochastic (neutral, random) and deterministic (niche, non-random) processes are simultaneously involved in structuring ecological communities [6], with the two sets of processes sitting at opposite ends of a continuum. Real-world communities exist somewhere between these two extremes [7,8]. Ecological stochasticity, henceforth referred to as 'stochasticity', can be defined as random changes in community composition that operate independently of taxonomic identity, and forms the basis of neutral theory [7,9,10]. Neutral theory makes the extreme assumption that all individuals within a particular feeding guild are ecologically equivalent, and emphasises the importance of stochastic processes such as ecological drift (random fluctuations in birth, death, immigration, and speciation rates) and dispersal that operate independently of taxonomic identity [11]. By contrast, determinism can be defined as non-random changes in community composition that can be predicted based on species' identities and niches [12], and is broadly described as niche theory, which assumes that deterministic processes, such as selection, play a fundamental role in driving community assembly [13].
There has been long-standing debate over the relative influence of stochastic and deterministic processes on community assembly [14]. Numerous studies have investigated the relative importance of stochasticity and determinism in different environments [e.g. 9, 15–18]. However, an understanding of the effects of land use change on the relative importance of stochastic and deterministic community assembly processes is largely absent, despite its potential importance. If community assembly processes exhibit hysteresis, then recovering an ecological community in a modified habitat might rely on different assembly processes than those that exist in primary habitats [19,20]. Therefore, our understanding of primary community assembly may give information that at best is irrelevant, or at worst directly misleading, when it comes to planning the restoration of modified communities. We therefore need to characterise how community assembly mechanisms might change across gradients of habitat degradation.
Logging is a major driver of habitat degradation across many of the world's most productive and biodiverse tropical forest ecosystems [21]. The tropical forests of Borneo have been subject to rapid and widespread logging since the early 1970s. Between 1973 and 2010, there was an estimated 30 % decline in the extent of Borneo's intact forests [22]. Logging often results in a heterogeneous landscape, with habitat patches connected spatially but affected by different logging intensities [23]. This can result in gradients of disturbance intensity, which are a frequent consequence of land use change in the tropics [2].
There is uncertainty over how logging might affect stochastic and deterministic community assembly processes, with little in the way of direct evidence. In primary forests, stochastic turnover in plant species can drive deterministic changes in the niche dimensions available for moths, leading to a distance-turnover relationship. However, such distance-turnover relationships have been shown to be absent in moth assemblages in logged forests [24]. The effect of logging on the balance between stochastic and deterministic processes has seldom been investigated across a variety of taxa. Döbert et al. (2017) showed that understorey plant communities in tropical Bornean forests tend to be more stochastically assembled at higher logging intensities. By contrast, Wearn et al. (2019) showed that, as logging intensity increases, environmental control on the community assembly of mammals becomes more important and spatial control on community assembly becomes less important. This implies that deterministic niche assembly becomes more important with logging. There is broader conceptual and empirical support for determinism having higher relative importance in logged forests, which tend to have harsher environmental conditions compared to primary forests. Harsher environmental conditions can impose a deterministic filter on community assembly, thus increasing the importance of deterministic processes, which has been demonstrated across a variety of taxa and regions [2, 26–31].
There is further uncertainty over how taxonomic groups might differ in their community assembly processes. For example, Aslani et al. (2022) documented variation in the relative importance of ecological processes within groups of animals, fungi and protists in the soil eukaryome, which could be due to differences in body size or niche breadth [32]. These differences in assembly processes may also exist among different invertebrate taxa. For example, Sattler et al. (2010) demonstrated that deterministic environmental variables explain <6 % of variance in bee communities and ~10 % of variance in spider communities, suggesting that stochastic processes are highly important but that their exact importance may vary depending on which taxa are investigated [33]. However, other studies have explained higher percentages of variance in spider communities [34–36], which may be attributable to working in different environments. We therefore need to consider how different taxonomic groups might differ in the balance of stochastic and deterministic assembly processes. In microbial communities, generalist taxa have been shown to contribute more to stochastic processes, and specialist taxa to contribute more to deterministic processes, because specialists tend to have narrower tolerance to environmental changes [37,38]. Similarly, studies on microbial communities have shown that rare taxa, which have narrower niche breadths, are more deterministically assembled while abundant taxa with wider niche breadths are more stochastically assembled [39,40]. There has not, to our knowledge, been a similar study directly investigating among-taxa variation in stochastic and deterministic assembly processes for invertebrate communities.
There is a long-standing need to evaluate the relative importance of stochastic and deterministic processes along environmental gradients and among taxa [3]. Here, we address that knowledge gap by examining the balance of stochastic and deterministic contributions to community assembly, as well as the contribution of a suite of individual community assembly processes, for seven invertebrate taxa across a gradient of logging intensity in Malaysia. Our data encompass a comprehensive gradient of logging intensity, from areas that have never experienced logging to areas that have been salvage logged. We quantified community assembly mechanisms for a range of invertebrate taxa including three groups of Coleoptera (beetles), along with Formicidae (ants), Lepidoptera (moths), Orthoptera and Araneae (spiders). Together, these taxonomic groups encompass a range of feeding guilds and are of immense ecological importance [41–45]. We use our data to test two hypotheses: (1) Stochastic processes will decrease in importance as logging intensity increases, as logged forest should have harsher deterministic niche constraints than primary forest; and (2) Specialist taxa will be less stochastic than generalist taxa, as specialists are expected to be more strongly affected by deterministic environmental filtering. Finally, we investigate whether the relative importance of different community assembly processes varies across a gradient of logging intensity for different invertebrate taxa.

The study sites were located within the Stability of Altered Forest Ecosystems (SAFE) project (4° 38′ N to 4° 46′ N, 116° 57′ to 117° 42′ E), a large-scale ecological experiment encompassing a gradient of land use intensities in the lowland tropical forests of Sabah, Malaysian Borneo [46]. We used data from 14 of the 17 experimental sampling blocks at SAFE [46], excluding three blocks located in oil palm plantation. Ten sampling blocks were located in twice-logged forests and four were located in protected areas. Two of these protected area blocks were in the Maliau Basin Conservation Area and have never experienced logging, while the other two had experienced light logging through both legal and illegal processes. Each sampling block comprised a set of 4–43 sampling sites (mean = 19). We grouped invertebrate samples collected within each block and treated each block as one local community for analysis. The aggregation of all local communities across all sampling blocks was considered to represent the metacommunity.
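The pooling of site-level samples into block-level local communities, and of those into a metacommunity, can be illustrated with a minimal sketch. The function name, block labels, species labels and counts below are hypothetical and serve only to show the data structure; they are not taken from the study's data or code.

```python
from collections import Counter, defaultdict


def pool_by_block(site_samples):
    """Pool site-level abundance counts into local communities and a metacommunity.

    site_samples: iterable of (block_id, {species: count}) pairs, one per site.
    Returns (local_communities, metacommunity), where each local community is
    the summed abundance per species within one sampling block and the
    metacommunity is the sum over all blocks.
    """
    local_communities = defaultdict(Counter)
    for block_id, counts in site_samples:
        # Counter.update adds counts rather than overwriting them
        local_communities[block_id].update(counts)

    metacommunity = Counter()
    for block_counts in local_communities.values():
        metacommunity.update(block_counts)
    return dict(local_communities), metacommunity


# Hypothetical example: three sites in two sampling blocks
samples = [
    ("block_A", {"sp1": 3, "sp2": 1}),
    ("block_A", {"sp1": 2}),
    ("block_B", {"sp2": 4, "sp3": 1}),
]
local_communities, metacommunity = pool_by_block(samples)
```

Keeping the block identifier alongside each site sample makes it straightforward to repeat the pooling separately for each taxon.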
Above-ground carbon density (ACD), calculated from LiDAR surveys and summarized at 1 hectare resolution in 2014, was used to quantify logging intensity [47,48]: a higher above-ground carbon density corresponds to a lower logging intensity. ACD was log-transformed to generate a more uniform spread of logging intensity values. The sampling blocks covered a wide range of logging intensities, with average above-ground carbon densities ranging from around 15 t C ha⁻¹ in heavily logged locations to over 200 t C ha⁻¹ in the protected areas. Not all invertebrate taxa were sampled at the same subset of sampling points per block. For analysis, we therefore calculated the average ACD per sampling block separately for each taxon, taking as inputs the ACD values for the specific subset of sampling sites where that taxon was collected.
We combined community composition data collected from seven invertebrate taxa: three groups of beetles, plus ants, moths, spiders and Orthoptera. Different groups had different sample sizes and not all groups were sampled in all 14 sampling blocks (Table S1, Figure S1). Beetles were sampled between 2011 and 2013 using combination pitfall-malaise traps in all 14 sampling blocks. Three different groups of beetles were sampled: Curculionoidea (weevils), Staphylinidae (rove beetles) and Scarabaeoidea (scarabs) [49,50]. Because of differences in their feeding guilds, each group was considered a separate taxon and was analysed separately: weevils are predominantly herbivorous; most scarabs in our dataset are dung-feeders; and rove beetles can belong to several feeding guilds. Beetles were identified primarily to morphospecies, except some scarabs which were identified to species. Ants were sampled between December 2011 and June 2012 in 12 sampling blocks using 12 cm × 14 cm plastic cards, which were laid flat in the leaf litter and baited with 30 compressed dried earthworm pellets. The number of ants entering each card was observed and recorded for 40 minutes, and individuals were identified to morphospecies [51]. Moths were sampled in 2014 using UV light traps which were run overnight in 8 sampling blocks. Moths were identified where possible and separated into morphospecies using morphology [52]. Spider abundance data were collected in 2015 in 10 sampling blocks by beating plant foliage for 20 minutes at each site. Spiders were identified to family, then separated into morphospecies by genitalia dissections and DNA barcoding using the CO1 gene [53]. Finally, Orthoptera were sampled in 2015 by sweep netting along 100 m transects in 6 sampling blocks. Orthoptera were identified to family, then separated into morphospecies using identification guides [54].
Stochasticity metrics
There are different ways for communities to express stochasticity, so there is value in assessing multiple metrics of stochasticity on the same communities [12]. We chose to quantify stochasticity using three different null-model based mathematical frameworks that summarise stochasticity both at the community level and at the level of individual species.
First, we employed the normalised stochasticity ratio (NST) [55]. NST values are normalised on a scale from 0 to 1, with 0.5 as the boundary between more stochastic (>0.5) and more deterministic (<0.5) community assembly. NST is based on pairwise community dissimilarity measures, for which there are many competing metrics. We used Ružička dissimilarity, which was shown to have the highest accuracy and precision when the NST was developed [55]. NST compares the observed dissimilarity of the real community with the null expected dissimilarity for 1,000 randomised communities [55]. To generate the random metacommunities for the null expectation, the default null model algorithm was used. This algorithm is based on fixed taxa richness and proportional taxa occurrence frequency [55].
[...] occurrence frequencies that could be predicted by Sloan's neutral model [59,60], which quantifies the relationship between the occurrence frequencies of taxa in a set of local communities and their abundances across the metacommunity [58]. 'Neutral taxa' are those whose observed occurrence frequencies are within one confidence interval of that which would be expected by Sloan's neutral model. The proportion of neutral taxa was weighted according to the abundance of individuals in each taxon, and was used to assess stochasticity at the level of individual species [58].
All three stochasticity metrics were calculated from a separate community composition (site × species) matrix for each of the seven taxa. We calculated stochasticity metrics for all sites in the composition matrix, and grouped sites together by sampling block to calculate the mean of each stochasticity metric for each sampling block.
One-sample t-tests were used to compare the overall unweighted mean NST, |RC| and NTP across all taxa to 0.5, which we used as an arbitrary boundary point separating stochastic (>0.5) from deterministic (<0.5) community assembly processes. We also used ANOVA to test for significant differences in NST, |RC| and NTP among taxa.
To test the hypothesis that generalists (ants and rove beetles) would be more stochastic than specialists (moths, Orthoptera, spiders, scarabs and weevils), we used beta regression with a single categorical predictor variable that describes whether the taxon is considered generalist or specialist. We fitted three separate beta regression models, each with a stochasticity metric as the response variable. We broadly categorised each taxonomic group into generalists or specialists based on their feeding habits. Ants were considered generalists because their diets can range from almost herbivorous to omnivorous and fully predatory [66]. Many rove beetles are generalist predators, though some can belong to other feeding guilds [67], so we also classified rove beetles as generalists. The Orthoptera and weevils included in our data were mainly herbivorous [50,54], so we classified them as specialists. Spiders were all predatory [68], scarabs were primarily coprophagous [49] and moths were herbivorous as larvae and nectarivorous as adults [69], so we classified these groups as specialists for our analysis.
To investigate the effect of logging on each stochasticity metric we used beta regression, with the mean ACD of each block as the predictor. To see how this relationship varies among taxa, we fitted another beta regression model for each stochasticity metric with taxon, ACD and their interaction as predictors of stochasticity. A similar analysis was conducted to investigate the effect of logging on the relative importance of each community assembly process; however, Dirichlet regression was used instead of beta regression, as Dirichlet regression is appropriate for proportions with more than two categories [70].
To gain a taxa-independent metric reflecting the overall change in the relative importance of the five community assembly processes across the logging gradient, we combined the slopes for all taxa to give a weighted summary mean of the slopes for each process, using a method similar to a fixed effects meta-analysis, adapted from [71].
We also combined the mean relative importance of each of the five processes and each of the three stochasticity metrics to give a weighted summary mean for all taxa using the following method adapted from [71]:
For each group of taxa ($i$), the weight ($w_i$) is the inverse of the variance for that group of taxa, $w_i = 1 / v_i$. The variance ($v_i$) is the range of the 95 % confidence intervals. $M$ is the overall weighted mean for all groups of taxa combined; it is analogous to the combined effect size in a meta-analysis. $M$ is calculated as
$$M = \frac{\sum_{i=1}^{k} w_i y_i}{\sum_{i=1}^{k} w_i},$$
where $y_i$ is the mean relative importance or slope for each group of taxa, analogous to the effect size of each study in a meta-analysis, $w_i$ is the weight assigned to group $i$, and $k$ is the number of groups of taxa ($k = 6$).
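As an illustration of the weighting scheme described above, the following is a minimal sketch of an inverse-variance weighted summary mean. It assumes, following the text, that each group's "variance" is taken as the width of its 95 % confidence interval. The function name and the example values are hypothetical and are not the study's data or code.

```python
def weighted_summary_mean(estimates, ci_lower, ci_upper):
    """Inverse-variance weighted mean across groups of taxa.

    estimates: per-group mean relative importance or slope.
    ci_lower, ci_upper: bounds of each group's 95 % confidence interval.
    The weight of each group is 1 / (ci_upper - ci_lower), i.e. the inverse
    of the confidence interval range used here as the variance term.
    """
    if not (len(estimates) == len(ci_lower) == len(ci_upper)):
        raise ValueError("all inputs must have the same length")
    if not estimates:
        raise ValueError("at least one group is required")

    variances = [hi - lo for lo, hi in zip(ci_lower, ci_upper)]
    if any(v <= 0 for v in variances):
        raise ValueError("each confidence interval must have positive width")

    weights = [1.0 / v for v in variances]
    weighted_sum = sum(w * y for w, y in zip(weights, estimates))
    return weighted_sum / sum(weights)


# Hypothetical slopes for three groups of taxa; the third has a wide interval
slopes = [0.10, -0.05, 0.02]
lower = [0.00, -0.15, -0.38]
upper = [0.20, 0.05, 0.42]
summary = weighted_summary_mean(slopes, lower, upper)
```

Groups with narrow confidence intervals contribute most to the summary mean, so a single imprecisely estimated taxon has limited influence on the combined estimate.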

Across all datasets, our analysis included 32,294 individuals belonging to 1,645 species or morphospecies (Table S1). In general, community similarity between sampling blocks was low. Each of the three beetle taxa showed a decrease in community similarity compared to old growth forest as logging intensity increased, whereas the remaining three taxa did not (Figure S2) [...]
There were significant differences in the role of stochasticity among taxa (ANOVA, NST: F70, 6 = 19.7, p < 0.0001; |RC|: F71, 6 = 68.7, p < 0.0001; NTP: F58, 5 = 2.6, p = 0.04). On average, ants had the highest NST, whereas scarabs had the lowest NST. However, scarabs showed the highest stochasticity in terms of |RC| and NTP (Table S3) [...]
However, all other slopes were not significant, suggesting that these metrics of stochasticity are generally not affected by logging for the taxa investigated. Sloan's neutral model could not be fitted to the scarabs dataset, so scarabs were excluded from the NTP analysis (Figure 1C).
[...] was estimated to be underpinned by drift, while this was estimated to be 74 % for Orthoptera (95 % quantiles = 60–82) and 67 % for moths (95 % quantiles = 62–75) (Figure 2A; Table S4).
When comparing the relative importance of community assembly processes among sampling blocks, each of which had a different level of logging intensity, the relative importance of the five community assembly processes was not significantly affected by logging intensity for most taxa (Figure 2b, Table S5). When all groups of taxa were combined, the weighted summary mean slopes for all processes were not significantly different from zero. The strongest effects were a decrease in the importance of dispersal limitation as [...]
We found that ecological stochasticity was highly important in underpinning the assembly of all invertebrate communities studied. At a finer resolution, dispersal limitation was the dominant community assembly process overall. In general, land use change had little impact on the relative importance of ecological stochasticity, or on the relative importance of a suite of community assembly processes. Together, this suggests that stochastic dispersal assembly is the main driver of invertebrate community assembly at our tropical rainforest site, regardless of the extent to which that habitat has been modified by logging. While logging has profoundly negative impacts on biodiversity, the balance between stochastic and deterministic processes appears robust to changes in logging intensity for the invertebrate taxa studied.
Overall, there was at best a very weak effect of land use intensity on the role of ecological stochasticity in structuring invertebrate communities. Different taxa showed different directions of the relationship between logging and the three stochasticity metrics, but these relationships were generally not significant. The relative importance of different community assembly mechanisms was also not significantly affected by logging intensity for six out of seven taxonomic groups (Figure 2b). Moths, however, were the exception, showing changes in community assembly processes (Figure 1; 2b) and turnover in species composition (Figure S2) with logging. For the six taxa that show no change in community assembly with logging, we might expect to find little evidence of a change in the community assembly processes governing invertebrate communities if those communities do not exhibit turnover in species composition across the logging gradient. For three of the taxa (ants, Orthoptera and spiders), this assumption holds true: we found no evidence of changing taxonomic identities across the logging gradient (Figure S2), which aligns well with a lack of change in the assembly processes governing those taxa (Figure 2b). However, all three beetle taxa (rove beetles, scarabs and weevils) did exhibit significant turnover in the identity of species as logging intensity increased (Figure S2), which is consistent with previous studies [50,72,73]. Yet for two of those taxa (scarabs and weevils), the logging-related turnover in species identities appeared to be generated by the same ecological processes, regardless of logging intensity. Even for the rove beetles, which did exhibit a small change from deterministic homogeneous selection to stochastic dispersal limitation as logging intensity increased (Figure 2b), dispersal limitation remained the dominant process across the entire gradient (Figure 2a). This leads to the general conclusion that, regardless of whether or not species identities change, community assembly processes remain robust to changes in above-ground carbon density.
One possible explanation for why community assembly processes appear to be strongly conserved across the gradient of logging intensity is that our data were collected after logging had taken place. While our study landscape encompasses a very wide range of historic logging intensity, the time delay between the logging event itself and our description of the invertebrate communities means any transitory impacts of logging on community assembly processes would not have been detectable. Mahayani et al. (2020) showed that phylogenetic diversity and community structure of tree communities had recovered 10 years after a single logging cycle in Bornean tropical forest. It is therefore possible that the ecological communities we sampled had recovered their basic, pre-logging structures, so that the community assembly mechanisms in logged forests now resemble those of unlogged forests.
Two out of the three stochasticity metrics (NST and |RC|) were not significantly different between generalists and specialists. Only the NTP metric supported our hypothesis that generalist taxa would be more stochastic than specialists. When comparing taxa at a finer resolution, our results show that dispersal limitation was the most important driver of community assembly for ants and beetles, whereas drift was the main driver of assembly for spiders, Orthoptera and moths (Figure 2a). Spiders, moths and Orthoptera were sampled at fewer sites than ants and beetles (Table S1), so we cannot definitively rule out the possibility that this result may represent a sampling effect. We do note, however, that the scarab community had a lower total number of individuals and a higher number of species than the Orthoptera, suggesting that an undersampling-driven effect should have exerted a greater impact on them than on the Orthoptera. Further, when we conducted a sensitivity analysis by grouping sites within the same block, drift was still the dominant community assembly process for spiders and Orthoptera, and the importance of drift was even slightly increased for moths (Text S1).
We have explicitly considered dispersal limitation to be a stochastic process, although we recognise that in some cases it can be generated by deterministic processes, or a combination of both [10,75]. However, in neutral models and in empirical studies, dispersal is most often assumed to be stochastic, partly due to difficulties associated with studying dispersal traits [75]. Moreover, our results, which demonstrate that invertebrate communities are predominantly stochastically structured and that the main assembly process is dispersal limitation, are consistent with the idea that dispersal limitation is a stochastic process.
The overall importance of dispersal limitation as the dominant community assembly process, especially for ants and beetles, highlights the importance of maintaining and, if necessary, restoring landscape connectivity in logged forests [2]. Dispersal limitation can result in high spatial turnover in community composition due to low levels of exchange of organisms among local communities [62]. The slight increase in dispersal limitation we detected as logging intensity increased, though not significant, is in a direction that is consistent with previous studies suggesting that animal communities in logged tropical forests may experience lower levels of dispersal compared to primary forests [76–78]. Therefore, local communities would be expected to exhibit more spatial turnover in community composition as logging intensity increases and could become more isolated from each other.
We found that invertebrate community assembly processes were robust to a gradient of logging intensity. Stochastic dispersal limitation remained the dominant driver of community assembly at all levels of habitat degradation in our tropical rainforest site. This suggests that knowledge of primary community assembly can be useful in planning the restoration of modified communities as, for six out of seven taxonomic groups, there were no significant changes in assembly processes despite changes in land use.
This study has quantified community assembly mechanisms across a gradient of logging intensity for seven groups of invertebrate taxa in Bornean rainforests. The effect of logging on stochasticity, and on different community assembly processes, varied only minimally among the different taxa and different metrics of stochasticity, and painted a picture in which the dominant community assembly mechanisms are not impacted by logging disturbance. Although logging did not alter the relative importance of stochastic and deterministic community assembly processes in our study, we emphasize that logging, and in particular severe logging, profoundly reduces species richness and changes community composition [1]