To composite or replicate: how sampling method and protocol differences alter stream bioassessment metrics

Aquatic invertebrates are excellent indicators of ecosystem quality; however, choosing a sampling method can be difficult. Each method and associated protocol has advantages and disadvantages, and finding the approach that minimizes biases yet fulfills management objectives is crucial. To test the effects of both sampling methods and sample handling – i.e., to composite samples or leave them as replicates – we collected aquatic invertebrates from the Niobrara River at Agate Fossil Beds National Monument, Nebraska using three methods and two sample handling protocols. We compared aquatic invertebrate assemblages collected with a Hester-Dendy multi-plate sampler, a Hess sampler and a D-frame dipnet. We calculated six common bioassessment metrics from composite (combined) and replicate (separate) samples. Hess samples contained the highest taxonomic richness (capturing 77% of all taxa observed) and dipnet samples the lowest (47%). Hester-Dendy samples had the greatest proportion of Ephemeroptera, and of Ephemeroptera, Plecoptera and Trichoptera (EPT). Dipnet samples had the lowest evenness values. In terms of sample handling, composite samples had inflated richness, diversity and evenness compared to replicate samples, but bioassessment metrics calculated from proportions or averages (i.e., Hilsenhoff’s Biotic Index and the proportion of EPT taxa) did not differ between them. The proportions of invertebrate groups from composite samples were not statistically different among sampling methods, but several groups differed between replicate samples collected by different methods. Ultimately, we recommend collecting replicate samples with a Hess sampler when the goal of the study is to detect ecosystem change, differences among locations or differences in variables of interest.


Introduction
Aquatic invertebrates have been used to monitor ecosystem quality for over 150 years (Cairns and Pratt 1993), largely because they have several characteristics that make them ideal for the task. Aquatic invertebrates are relatively long lived (weeks to >100 years, Rosenberg and Resh 1993a) and, unlike water samples that are collected periodically, invertebrates are permanent stream residents and therefore their presence or absence reflects long-term conditions at a site. For instance, water samples may miss discrete, short-lived discharges of pollution, but aquatic invertebrate communities will respond to such an event (Rosenberg and Resh 1993b). Furthermore, aquatic invertebrates are relatively sedentary, diverse and inexpensive to collect and identify. Most importantly, lower ecosystem quality in a stream can increase mortality and decrease reproduction, survival and fitness of sensitive aquatic invertebrates (e.g., Ephemeroptera) while others are more tolerant to disturbances (e.g., Diptera; Johnson et al. 1993; Barbour et al. 1999). Changes in the diversity or assemblage structure of aquatic invertebrates can inform managers of stream ecosystem quality (Rosenberg and Resh 1993b).

Choosing a sampling method for aquatic invertebrate monitoring is difficult and depends on many variables. All approaches have advantages and disadvantages (e.g., cost to implement, time, bias towards specific taxa or life histories; e.g., Macanowicz et al. 2013, Tronstad and Hotaling 2017). Therefore, identifying a method that is cost-effective, minimizes bias and fulfills management objectives is critical. Bioassessment studies use a variety of sampling methods, including kicknets, fixed-area samplers (e.g., Hess sampler), artificial substrates (e.g., Hester-Dendy samplers) and dipnets (Carter and Resh 2001). However, some sampling methods are not well-suited to all stream habitats. For example, artificial substrates (e.g., Hester-Dendy plates) are ideal for large, deep rivers that are otherwise difficult to sample (De Pauw et al. 1986). However, artificial substrates rely on colonization and therefore do not represent natural assemblages or densities and can be biased towards certain insect orders (Letovsky et al. 2012). The type of information being collected also matters. For example, qualitative data may be sufficient if the study is estimating ecosystem health to meet federal standards, but more rigorous quantitative sampling is needed to assess change over time (e.g., Slavik et al. 2004). Qualitative samples only report proportional data, while fixed-area samplers provide quantitative information on the density and biomass for each taxon in the assemblage.

Laboratory protocols can alter the taxa identified and the bioassessment metrics calculated. Previous studies (e.g., Vinson & Hawkins 1996) have investigated what type of subsampling method is best for bioassessment studies to minimize cost and produce reliable results. The two main types of subsampling – fixed area (e.g., 25% of sample) and fixed count (e.g., 300 individuals; e.g., King and Richardson 2002) – have been compared for many data types (e.g., Vinson & Hawkins 1996

The headwaters of the Niobrara River are located near Lusk, Wyoming and the river flows eastward into Nebraska and eventually into the Missouri River near Niobrara, Nebraska (Fig. 1).

The Niobrara River Basin covers 32,600 km² of which the majority is grassland in northern Nebraska (Galat et al. 2005

an overstory of plains cottonwood (Populus deltoides) and cattails are more abundant than iris. The central site, Agate Middle, lacks an overstory and has gravel substrate with abundant iris and cattails surrounding the river. Finally, Agate East is located before the Niobrara River flows out of the park and is the deepest site with riparian vegetation dominated by iris and a few willows (Salix spp.).

General measurements
To assess general environmental characteristics of our study sites, we measured a number of standard variables (e.g., temperature), as well as water quality and clarity, sediment composition, water depth and discharge. We measured dissolved oxygen (percent saturation and mg/L), pH, water temperature, specific conductivity and oxidation-reduction potential using a Yellow Springs Instruments (YSI) Professional Plus. The YSI was calibrated on-site before use. We measured water clarity by estimating the depth at which a Secchi disk disappeared from sight. The dominant substrate was recorded in the main channel of all sites and where each Hess sample was taken using soil texture tests (Thien 1979

Hester-Dendy sample collection
We deployed seven Hester-Dendy samplers (76 mm x 76 mm, 9 plates, Wildlife Supply Company) at each site. For each sampler, we strung a rope across the stream between two fixed posts with evenly spaced loops to separate the Hester-Dendy multiplate samplers. The Hester-Dendy samplers were suspended in the water column at least 15 cm above the substrate. Debris dams were cleared weekly and we retrieved the samplers after 30 days of colonization by approaching the site from downstream, placing a dipnet (150 µm mesh) under each sampler and cutting the rope. Hester-Dendy samplers were immediately placed in a container with ~80% ethanol and any organisms in the dipnet were removed and placed in the same container. In the laboratory, we dismantled and scrubbed the Hester-Dendy samplers to remove invertebrates that colonized the plates, then we rinsed the samplers through a 212 µm sieve and preserved all specimens in ~80% ethanol. The middle five Hester-Dendy samples were used for analysis except when one of the samplers was compromised (e.g., touching the bottom).

Hess sample collection
We collected five Hess samples (500 µm mesh, 860 cm² sampling area, Wildlife Supply Company) at each site. Samples were taken along the shallower margins of the stream where emergent vegetation is abundant. We placed the Hess sampler over vegetation to collect invertebrates living on it and in the surrounding benthic sediment. The vegetation and sediment were vigorously agitated and invertebrates were captured in the net. Samples were preserved in 80% ethanol and returned to the laboratory for analysis.

Dipnet sample collection
We collected dipnet samples along a reach that was 40x the wetted stream width following standard methods for sampling aquatic invertebrates in wadeable streams (US EPA 2013). We measured the wetted width at five representative points along the stream and averaged values to the nearest meter. The average width of the Niobrara River was less than 4 m, so we used a minimum reach length of 150 m. We sampled invertebrates along 11 evenly-spaced transects that were 15 m apart using a D-frame net (243 µm mesh, 30.5 x 25.4 cm opening, Wildlife Supply Company). At each transect, we sampled the right, left and center of the stream systematically. Multiple habitats were sampled including benthic substrate, woody debris, macrophytes and leaf packs. All samples were composited and preserved in the field with 95% ethanol.

For dipnet sampling, we classified streams into riffle/run or pool/glide habitat and adjusted our methods for each. We defined a habitat as riffle/run if the current fully extended the net or as pool/glide if the net did not fully extend. For riffle/run habitats, we placed the net on the bottom of the stream with the opening facing upstream. We visually defined a sampling area as one net width wide and long upstream of the opening (~30 x 25 cm). We first removed any large organisms (e.g., snails, mussels) from the sampling area and placed them into the net. Next, we scrubbed all rocks that were golf ball sized (~4 cm) or larger to dislodge organisms, washed them into the net and placed the scrubbed rocks outside of the sampling area. Finally, we held the net below the sampling area and disturbed the remaining finer substrate for 30 seconds while the drift washed into the net. Pool/glide habitats were sampled the same as riffle/run except the net was repeatedly pulled through the disturbed water just above the substrate to capture organisms and continuously moved throughout sampling to ensure no organisms escaped the net.

After we sampled a transect, we transferred the sample to a sieve bucket (500 µm mesh). We removed as much gravel as possible and inspected the net for any residual organisms. We inspected each large object (e.g., rocks or sticks), removed organisms that were attached to them and discarded the object. For each sampled area, we recorded the dominant substrate size (e.g., fine/sand, gravel, coarse, other) and the habitat type (riffle/run or pool/glide).
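The reach layout described for dipnet sampling (40x the mean wetted width, a 150 m minimum and 11 evenly spaced transects) can be checked with a few lines of arithmetic. The following is a minimal Python sketch and not part of the field protocol itself; the function name `reach_layout` and the example widths are hypothetical and chosen only for illustration.

```python
from statistics import mean


def reach_layout(wetted_widths_m, n_transects=11, multiplier=40, min_reach_m=150.0):
    """Return (reach length, transect spacing, transect positions), all in metres.

    Assumes the reach is `multiplier` times the mean wetted width (rounded to
    the nearest metre), never shorter than `min_reach_m`, with transects
    evenly spaced from the start to the end of the reach.
    """
    mean_width = round(mean(wetted_widths_m))          # nearest metre
    reach = max(multiplier * mean_width, min_reach_m)  # enforce the minimum reach
    spacing = reach / (n_transects - 1)                # 11 transects give 10 intervals
    positions = [i * spacing for i in range(n_transects)]
    return reach, spacing, positions


# Hypothetical widths for a stream narrower than 4 m (not measured values).
widths = [3.2, 3.8, 2.9, 3.5, 3.1]
reach, spacing, positions = reach_layout(widths)
print(reach, spacing, positions)
```

With a mean width below 4 m, 40x the width falls short of 150 m, so the minimum reach applies and the 11 transects sit 15 m apart, which matches the spacing reported above.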

Sample processing – Hester-Dendy and Hess
Invertebrates collected with Hester-Dendy and Hess samplers were sorted from debris in white trays and identified under a dissecting microscope. We rinsed all samples through a 2 mm sieve followed by 212 µm (Hester-Dendy) or 500 µm (Hess) sieves to separate larger and smaller invertebrates. All large invertebrates (> 2 mm) were identified. If invertebrates were visually numerous in the smaller sieve, we subsampled the contents using the record player method (Waters 1969

values were assigned to each taxon from Barbour et al. (1999).
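Per-taxon tolerance values of this kind are the input to Hilsenhoff's Biotic Index (HBI), which is commonly computed as the abundance-weighted mean tolerance of the taxa in a sample. The Python sketch below illustrates that standard formula only; the taxon names, counts and tolerance values are placeholders, not values from Barbour et al. (1999) or from this study, and skipping taxa without a tolerance value is an assumption of the sketch.

```python
def hilsenhoff_biotic_index(counts, tolerance):
    """Abundance-weighted mean tolerance value for one sample.

    counts:    dict mapping taxon -> number of individuals
    tolerance: dict mapping taxon -> tolerance value (0 = sensitive, 10 = tolerant)
    Taxa without a tolerance value are left out of the calculation (assumption).
    """
    scored = {taxon: n for taxon, n in counts.items() if taxon in tolerance}
    total = sum(scored.values())
    if total == 0:
        raise ValueError("no individuals with a tolerance value")
    return sum(n * tolerance[taxon] for taxon, n in scored.items()) / total


# Placeholder data for illustration only.
counts = {"taxon_a": 20, "taxon_b": 10, "taxon_c": 70}
tolerance = {"taxon_a": 2, "taxon_b": 8, "taxon_c": 6}
hbi = hilsenhoff_biotic_index(counts, tolerance)
print(round(hbi, 2))
```

Because the index is a weighted average, it depends on the relative rather than absolute abundance of each taxon.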

Sample processing – Dipnet
We processed dipnet samples following the official EPA protocol (US EPA 2013). We elutriated all dipnet samples to remove inorganic substrate with a 500 µm mesh sieve. In the laboratory, we spread the sample evenly over a 30 x 36 cm sorting tray that was divided into 30 numbered grids (6 x 6 cm each). Using a random number generator in R (R Development Core Team 2013), we selected six of the 30 grids, removed the invertebrates and counted them. If the first six grids did not contain a minimum of 500 individuals, we randomly selected additional grids until the minimum threshold was reached. We removed and identified large or rare invertebrates defined as longer than 1.2 cm (Vinson and Hawkins 1996). All invertebrates were identified to the lowest taxonomic level possible, typically genus, and we normalized our abundance estimates for each site based upon the number of grids that were counted.

Statistical analyses
We

We evaluated differences in the aquatic invertebrate assemblage across sites and sampling method with non-metric multidimensional scaling (NMDS) implemented in the R package vegan (Oksanen et al. 2013). NMDS provides an ordination-based approach to rank distances between objects and has been shown to perform well with non-normally distributed data (Legendre and Legendre 1998). To prepare our data for NMDS analysis, we removed rare taxa (defined as any taxon that was unique to a single site+method combination). Next, we calculated the mean and standard deviation (SD) for each taxon and removed two species that were present at more than two SDs above the mean. Finally, we removed any taxon present at less than 0.1% of the overall abundance (after the first two filtering steps were completed). NMDS analyses were performed using Bray-Curtis distances on composite samples with default settings.
To test whether the assemblages recovered were different depending on sampling method or site, we performed an analysis of similarities (ANOSIM) with default settings (including 999 permutations). Next, we investigated differences in multivariate dispersion for each method by calculating the mean distance of each sample to the group's centroid in multivariate space with the function betadisper. We assessed pair-wise differences in dispersion with Tukey's HSD test. To better visualize taxonomic differences in invertebrate assemblages collected with each sampling method, we constructed a ternary plot using the R package ggtern (Hamilton 2015). For ternary plot construction, we only removed rare taxa (as described above) before averaging the abundances of each taxon in composite samples across sites for each method.

Results
Environmental variation
Sites were environmentally similar to one another with little variation between our July and August sampling dates (

Hester-Dendy samplers, 10 taxa not present in Hess samples and 8 taxa unique to dipnet samples.

When composited, proportions of insects (Fig. 2a; F = 0.3, df = 1, p = 0.75) and non-insects (Fig. 2b; F = 0.3, df = 1, p = 0.75) did not differ among sampling methods. Proportions of Annelida, Crustacea, Coleoptera, Diptera, Ephemeroptera, Hemiptera, Mollusca, Odonata and Trichoptera also did not differ when composited (p ≥ 0.25; Fig. 2). Conversely, when treated as replicates, the proportion of insects (

did not differ between replicate Hester-Dendy and Hess samples.

Additionally, NMDS analyses indicated that the sampling methods collected different aquatic invertebrate assemblages (ANOSIM, p = 0.008; Fig. 3a), but that overall, assemblages did not differ among sites (ANOSIM, p = 0.408; Fig. 3b). While different sampling methods yielded distinct assemblages, the amount of multivariate space occupied by each method did not differ (Tukey's HSD, p ≥ 0.94). Visualization of the assemblage recovered by each method via ternary plot highlighted the strong bias towards Hess and Hester-Dendy sampling in terms of unique taxa (Fig. 4). After filtering rare taxa as described above, only one taxon, Ceratopogon, a genus of Ceratopogonidae, was observed in dipnet samples yet was largely absent elsewhere. Both Hess (13 taxa) and Hester-Dendy (7 taxa) sampling recovered a number of taxa that were either rare or completely absent in the results of the other methods. However, some taxa were relatively equally represented across all three methods including Anax, Collembola, Hyalella and Lymnaeidae (Fig. 4).
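The NMDS and ANOSIM comparisons above rest on Bray-Curtis distances between samples. As an illustration of what that distance measures, the Python sketch below applies the standard Bray-Curtis definition (summed absolute differences in abundance divided by summed total abundance) to two abundance tables. It is not the authors' R workflow, and the taxon names and counts are invented for the example.

```python
def bray_curtis(sample_a, sample_b):
    """Bray-Curtis dissimilarity between two samples.

    Each sample is a dict mapping taxon -> abundance; a taxon missing from
    one sample counts as zero. Returns 0.0 for identical samples and 1.0
    for samples that share no taxa.
    """
    taxa = set(sample_a) | set(sample_b)
    difference = sum(abs(sample_a.get(t, 0) - sample_b.get(t, 0)) for t in taxa)
    total = sum(sample_a.get(t, 0) + sample_b.get(t, 0) for t in taxa)
    if total == 0:
        raise ValueError("both samples are empty")
    return difference / total


# Invented abundances for two methods at one site.
method_1 = {"taxon_x": 10, "taxon_y": 5}
method_2 = {"taxon_x": 5, "taxon_y": 5, "taxon_w": 10}
d = bray_curtis(method_1, method_2)
print(round(d, 3))
```

Taxa recovered by only one method contribute their full abundance to the numerator, which is why methods with many unique taxa separate clearly in ordination space.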

Bioassessment metrics
When calculated from composite samples, bioassessment metrics differed among sampling methods, but most comparisons were not significant without incorporating replicates. Taxonomic

Most bioassessment metrics calculated from electronically composited samples were higher than those estimated from replicate samples. When composited, 40% and 80% more taxa were observed in Hester-Dendy and Hess samples, respectively, versus replicate samples (Table 2). Similarly, EPT richness was 43% and 83% higher in composited Hester-Dendy and Hess samples, respectively, versus replicates. Taxonomic diversity was also 82% higher in composited Hester-Dendy samples and 63% higher in composited Hess samples.

Fig. 1 We sampled three sites along the Niobrara River at Agate Fossil Beds National Monument in Nebraska, USA. The black line is the Monument boundary and the transparent white areas are private land within the Monument. The inset shows the location of Agate Fossil Beds National Monument in Nebraska (star).
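The contrast between composite and replicate metrics reported in the bioassessment results can be illustrated with a toy calculation. The Python sketch below pools replicate samples electronically and compares taxonomic richness and the proportion of EPT individuals under both handling protocols. The replicate data are contrived so that each replicate holds one shared and one unique taxon; they are not data from this study, and the function names are hypothetical.

```python
from collections import Counter
from statistics import mean


def richness(sample):
    """Number of taxa with at least one individual."""
    return sum(1 for n in sample.values() if n > 0)


def composite(replicates):
    """Electronically pool replicate samples by summing counts per taxon."""
    pooled = Counter()
    for rep in replicates:
        pooled.update(rep)
    return dict(pooled)


def proportion(sample, group):
    """Fraction of individuals belonging to the taxa in `group`."""
    total = sum(sample.values())
    return sum(n for taxon, n in sample.items() if taxon in group) / total


# Contrived replicates: one shared taxon plus one unique taxon each.
replicates = [
    {"mayfly_a": 10, "amphipod_a": 10},
    {"mayfly_a": 10, "snail_a": 10},
    {"mayfly_a": 10, "midge_a": 10},
]
ept = {"mayfly_a"}

pooled = composite(replicates)
mean_replicate_richness = mean(richness(r) for r in replicates)
composite_richness = richness(pooled)
mean_replicate_ept = mean(proportion(r, ept) for r in replicates)
composite_ept = proportion(pooled, ept)
print(mean_replicate_richness, composite_richness, mean_replicate_ept, composite_ept)
```

In this example pooling doubles richness because unique taxa accumulate across replicates, while the EPT proportion is unchanged. The proportions match exactly here only because every replicate has the same total count; with unequal totals the two values would be close but not identical.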