A comparison of asymmetric before-after control impact (aBACI) and staircase experimental designs for testing the effectiveness of stream restoration

Before-after-control-impact (BACI) experimental designs are commonly used in large-scale experiments to test for environmental impacts. However, high natural variability of environmental conditions and populations, and low replication in both treatment and control areas in time and space, hamper detection of responses. We compare the power of two asymmetric BACI (aBACI) designs to two staircase designs for detecting changes in juvenile steelhead (Oncorhynchus mykiss) abundance associated with a watershed-scale stream restoration experiment. We performed a simulation study to estimate the effect of a 25% increase in steelhead abundance using spatial and temporal estimates of variance from an ongoing study, and determined the power of each design. Experimental designs were then applied to three streams, each composed of three 4 km long sections. We compared the power of a single treatment section in one stream (BACI-1), three simultaneous treatments of all sections in one stream (BACI-3), three sequential treatments in one stream (STAIRCASE-1), and three sequential treatments in one section in each stream (STAIRCASE-3). All designs had ≥ 94% power to detect a 25% increase in abundance assuming average variance. Under worst-case variance (i.e., upper 95% confidence limits of historical variance estimates), the STAIRCASE-3 design outperformed the BACI-1, BACI-3, and STAIRCASE-1 designs (i.e., 77%, 41%, 8%, and 33% power, respectively). All the designs estimated the effect of the simulated 25% abundance increase, but the confidence interval was much shorter for the STAIRCASE-3 design than for the other designs, which had confidence intervals 58-596% longer. The STAIRCASE-3 design continued to have high power (88%) to detect a 10% change in abundance, but the power of the other designs was much lower (range 34-56%). Our study demonstrates that staircase designs can have significant advantages over BACI designs and therefore should be more widely used for testing environmental impacts.



Impacts from manipulations to ecosystems, such as the extraction of natural resources, must be identified and quantified to develop strategies to reduce our footprint on the environment and manage resources sustainably. Ecosystem experiments using impacts (e.g., perturbations such as logging, addition of nutrients) have led to a greater understanding of the influence of management actions on ecosystem processes and biological populations [1][2][3]. Ecosystem experiments have provided a wealth of information because they were appropriately scaled: the impacts were large (e.g., often whole watersheds) and the monitoring was intensive (e.g., monitoring multiple scales and for many years or [...]

[...] uncertainty that was sometimes substantial. We assumed that the estimated variance represented the "expected" variability present, and the lower and upper limit of the confidence intervals represented the "best case" and "worst case" for variability respectively ([...]

4 Assumed to be ½ × fish site because changes in differences between fish sites across years are likely to be smaller than the original differences.

We also checked our models for serial correlation across years on the measurements made within stream sections using an autoregressive process [AR(1); 29]. We found a statistically significant autocorrelation coefficient (r = 0.42, 95% confidence interval 0.10-0.74) that we could then build into our simulations (see below).
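As an illustration of how a lag-1 autocorrelation of this size could be built into simulated data, the following is a minimal sketch and not the authors' simulation code. The function names, the unit standard deviation, and the long series length are assumptions chosen only so that the recovered autocorrelation is stable; only the value 0.42 comes from the text.

```python
import numpy as np

def simulate_ar1(n_years, rho, sigma, rng):
    """Simulate a stationary AR(1) error series of length n_years.

    rho   : lag-1 autocorrelation (0.42 is the estimate reported in the text)
    sigma : marginal standard deviation (hypothetical value, not from the study)
    """
    # Innovation SD chosen so the marginal variance stays sigma**2.
    innov_sd = sigma * np.sqrt(1.0 - rho**2)
    e = np.empty(n_years)
    e[0] = rng.normal(0.0, sigma)  # first year drawn from the stationary distribution
    for t in range(1, n_years):
        e[t] = rho * e[t - 1] + rng.normal(0.0, innov_sd)
    return e

def lag1_autocorr(x):
    """Sample lag-1 autocorrelation of a 1-D series."""
    x = x - x.mean()
    return float(np.dot(x[:-1], x[1:]) / np.dot(x, x))

rng = np.random.default_rng(1)
# A long series is used here only to check that rho is recovered;
# a real design would simulate the actual number of study years.
series = simulate_ar1(n_years=5000, rho=0.42, sigma=1.0, rng=rng)
print(round(lag1_autocorr(series), 2))
```

In a power simulation, one such series would be generated per stream section and added to the other variance components before fitting the model.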

We simulated four experimental designs based on the layout of the Asotin Creek IMW study defined above (Fig 2). First, we simulated an aBACI design in which one section was restored with LWD at the start of Year 7, and the eight other sections were controls (BACI-1; Fig 3). Second, we simulated an aBACI design where three sections were restored simultaneously with LWD at the start of Year 7, and the remaining six sections were maintained as controls (BACI-3; Fig 3). Third, we simulated a staircase [...]

[...] to detect a 25% increase in fish abundance at only 8%. All designs detected the simulated 25% increase in abundance accurately under expected and worst-case variability (Fig 4a and b). Confidence interval lengths were similar for all designs under expected variance (Fig 4a); however, BACI-3 had larger confidence interval lengths under both expected and worst-case variability and only STAIRCASE-3 had confidence intervals that did not cross 0 under worst-case variability (Fig 4b).
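The four layouts can be written down as treatment indicator arrays, which is a convenient starting point for simulating any of them. This is a sketch under stated assumptions: the Year 7 start for the aBACI designs comes from the text, and the placement of BACI-3 in a single stream follows the abstract, but the 12-year study length, the staggered staircase start years (7, 8, 9), the choice of which stream and section is treated, and all names are hypothetical.

```python
import numpy as np

N_STREAMS, N_SECTIONS = 3, 3   # three streams with three sections each (from the text)
N_YEARS = 12                   # assumed study length, not stated in this excerpt

def treatment_matrix(start_years):
    """Return a (stream, section, year) 0/1 array; 1 = section is restored.

    start_years maps (stream, section) -> first treated year (1-indexed).
    Sections absent from the mapping remain controls throughout.
    """
    trt = np.zeros((N_STREAMS, N_SECTIONS, N_YEARS), dtype=int)
    for (stream, section), year in start_years.items():
        trt[stream, section, year - 1:] = 1
    return trt

# aBACI layouts: restoration at the start of Year 7 (from the text).
baci_1 = treatment_matrix({(0, 0): 7})
baci_3 = treatment_matrix({(0, s): 7 for s in range(N_SECTIONS)})

# Staircase layouts: the stagger of one year per treatment is an assumption.
staircase_1 = treatment_matrix({(0, s): 7 + s for s in range(N_SECTIONS)})
staircase_3 = treatment_matrix({(k, 0): 7 + k for k in range(N_STREAMS)})

designs = {"BACI-1": baci_1, "BACI-3": baci_3,
           "STAIRCASE-1": staircase_1, "STAIRCASE-3": staircase_3}
for name, d in designs.items():
    n_treated = int(d[:, :, -1].sum())  # sections treated by the final year
    print(name, "treated:", n_treated, "controls:", N_STREAMS * N_SECTIONS - n_treated)
```

Written this way, BACI-3 and STAIRCASE-1 leave no untreated section in the treated stream by the final year, whereas BACI-1 and STAIRCASE-3 always keep within-stream controls.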

The low power of the BACI-3 design occurs in part because we have assumed that a stream-by-year interaction exists, as supported by our analysis of existing data. That is, the fish populations in different streams within our watershed do not change identically between years, even though they all experience similar out-of-watershed effects (e.g., the effects of the ocean, fishing, and the Columbia and Snake hydropower system) and localized climate effects. While using multiple streams can greatly reduce the noise caused by these factors, noise created by asynchrony between streams is likely to always exist.
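The effect of a stream-by-year interaction on a treatment contrast can be illustrated with a small simulation. This is a hedged sketch, not the study's model: the three standard deviations are hypothetical values on an arbitrary log-abundance scale, and the large number of simulated years is used only to make the estimated spread stable.

```python
import numpy as np

rng = np.random.default_rng(7)
n_streams, n_sections, n_years = 3, 3, 2000  # many "years" only to stabilise the estimates

# Hypothetical standard deviations (not the study's variance estimates).
sd_year, sd_stream_year, sd_resid = 0.30, 0.20, 0.10

# Year effect shared by every stream (e.g., out-of-watershed conditions).
year = rng.normal(0.0, sd_year, size=(1, 1, n_years))
# Stream-by-year interaction: each stream deviates from the shared year effect.
stream_year = rng.normal(0.0, sd_stream_year, size=(n_streams, 1, n_years))
# Section-level residual noise.
resid = rng.normal(0.0, sd_resid, size=(n_streams, n_sections, n_years))

log_abund = year + stream_year + resid  # broadcasts to (stream, section, year)

# Contrast of one section against a control section in the same stream:
within = log_abund[0, 0] - log_abund[0, 1]
# Contrast of the same section against a control section in a different stream:
between = log_abund[0, 0] - log_abund[1, 0]

print("SD of within-stream contrast :", round(float(within.std(ddof=1)), 3))
print("SD of between-stream contrast:", round(float(between.std(ddof=1)), 3))
```

Under this model the shared year effect cancels in both contrasts, but the stream-by-year term cancels only in the within-stream contrast, so the between-stream contrast carries the extra asynchrony noise.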

Keeping that extraneous noise out of the treatment comparisons is one of the key goals of an experimental design. In much the same way as blocking is used in typical experimental designs to improve precision of treatment comparisons, maintaining both control and treated sections in streams improves the power for our IMW designs (Underwood 1994