Abstract
Accurately constraining the transport of environmental DNA (eDNA) after it is shed from an animal is vital to appropriately geolocate species detections and biodiversity measurements from eDNA sequencing data. Modeling studies predict the horizontal transport of eDNA at concentrations detectable using quantitative PCR over scales of tens of kilometers, but more limited vertical transport. Field studies routinely find that eDNA metabarcoding data distinguish biological communities at small spatial scales, on the order of tens of meters. Here, we leverage the unique bathymetry of an offshore, mesophotic bank and the benthic invertebrate community that it supports to determine the extent to which vertical and horizontal eDNA transport may affect the interpretation of species detections from eDNA metabarcoding data. We found that in a stratified water column, eDNA from benthic invertebrates was vertically constrained to depths close to the seafloor in the thermocline rather than extending into the surface mixed layer above. However, when using primers that are taxonomically specific to corals, we found evidence for the horizontal transport of coral eDNA to distances of at least ∼1.5 km from where corals can be reasonably expected to occur. In contrast, there was minimal evidence for horizontal transport of benthic eDNA in data generated using primers that broadly targeted sequences from eukaryotes. These results highlight the importance of considering horizontal transport as well as methodological details, such as the taxonomic specificity of PCR primers, when interpreting eDNA sequencing data.
1. Introduction
The analysis of environmental DNA (eDNA) from water samples is a powerful tool for animal species detection and biodiversity assessment in marine ecosystems, complementing conventional approaches by detecting taxa that would be otherwise missed (e.g. Govindarajan et al., 2021; Govindarajan et al., 2023; West et al., 2024). While the spatial resolution of biodiversity data generated using conventional methods, including trawls, net tows, and submersible observations, is well-defined, the spatial resolution of data generated through eDNA sequencing is less clear. Aqueous eDNA is subject to transport by currents until it either settles to the benthos or degrades into fragments that are too short to amplify using PCR-based methods (reviewed in Harrison et al., 2019). Uncertainty in the distances that eDNA can be transported after it is shed from an animal complicates the integration of species detections and biodiversity estimates using eDNA sequencing with those from conventional methods, due to the possible discrepancies in the spatial resolution of these two data types. Geolocating species detections and biodiversity measurements from eDNA sequencing data depends on accurately constraining the scale of eDNA transport from its source, a vital component of what has been referred to as the “ecology of eDNA” (Barnes & Turner, 2016).
In lotic environments, eDNA may be transported up to tens of kilometers downstream from its source, and transport distance downstream depends not only on flow but also on the river substrate (Deiner et al., 2014; Jerde et al., 2016; Shogren et al., 2016; Shogren et al., 2017; Snyder et al., 2023). These findings are derived from experiments in artificial streams or from experiments in which a novel animal is introduced to the environment and the distance downstream that its eDNA is transported is quantified. Hereafter we refer to the latter as “caged fish” experiments for brevity, though some have used invertebrates, such as freshwater mussels, as eDNA sources (e.g. Sansom & Sassoubre, 2017). While fewer of these studies have been conducted in marine ecosystems, their results are enlightening. For example, Ely et al. (2021) found that eDNA concentrations decreased rapidly with distance from the source and were undetectable in many samples taken within hundreds of meters. In contrast, Murakami et al. (2019) introduced a caged fish for 48 hours in a harbor and detected its eDNA up to 1 km from its location, albeit at low concentrations.
Most other inquiries into the transport distances of eDNA in marine and lacustrine habitats can be generally categorized into two main approaches: 1) modeling eDNA transport through Lagrangian particle tracking, or 2) conducting eDNA field sampling from different habitat types with distinct communities at known distances apart from each other and/or at different depths.
Lagrangian particle tracking studies have been especially useful for modeling eDNA transport over the large spatial scales that are relevant to studying biology in the ocean. By pairing particle tracking model simulations with the quantification of eDNA from field samples, these studies have predicted that detectable concentrations of eDNA may be transported up to tens of kilometers from its source. However, these models also predict that eDNA will be found in higher concentrations and detected with a higher probability closer to its source (Andruszkiewicz et al., 2019; Kutti et al., 2020). These predictions are consistent with the exponential decay of eDNA that these particle tracking models rely on. In the vertical dimension, results from modeling studies suggest that scales of transport are small (on the order of tens of meters) and depend on the rate of eDNA settling and water column stratification (Andruszkiewicz et al., 2019; Allan et al., 2021). Modeling studies rely on three inputs that must be derived independently to accurately model eDNA particle transport: 1) a model of ocean circulation with a fine enough spatial resolution to model particle transport at relevant spatial scales, 2) one or more rates of eDNA degradation, and 3) one or more rates of eDNA settling. Results from eDNA persistence experiments now permit the accurate estimation of eDNA degradation rates across ranges of relevant physicochemical factors, namely temperature (Andruszkiewicz Allan et al., 2021; McCartin et al., 2022). However, rates of eDNA settling are largely unknown, and their estimation is hindered by a lack of understanding of the predominant form of eDNA in the environment, e.g. dissolved DNA molecules versus cells or particulates, which may differ depending on environmental conditions (Barnes et al., 2021).
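As a rough illustration of the exponential (first-order) decay assumption mentioned above, the following Python sketch computes how long a parcel of eDNA would take to decay to a given fraction of its starting concentration and how far a steady current would carry it in that time. It is not taken from any of the cited models: the function names, the decay constant, and the current speed are hypothetical, and the sketch ignores dilution, dispersion, and settling, all of which the particle tracking models discussed here would need to represent.

```python
import math


def remaining_fraction(k_per_hour: float, hours: float) -> float:
    """Fraction of the initial eDNA concentration left after first-order decay."""
    return math.exp(-k_per_hour * hours)


def time_to_threshold(k_per_hour: float, fraction: float) -> float:
    """Hours until the concentration falls to `fraction` of its initial value."""
    return -math.log(fraction) / k_per_hour


def advective_distance_km(speed_m_per_s: float, hours: float) -> float:
    """Straight-line distance travelled at a constant current speed."""
    return speed_m_per_s * 3600.0 * hours / 1000.0


# Hypothetical inputs, chosen only to show the arithmetic.
decay_constant = 0.05   # per hour (assumed)
current_speed = 0.1     # meters per second (assumed)

hours_to_1_percent = time_to_threshold(decay_constant, 0.01)
distance = advective_distance_km(current_speed, hours_to_1_percent)
print(f"time to 1% of initial concentration: {hours_to_1_percent:.1f} h")
print(f"distance advected in that time: {distance:.1f} km")
```

With these made-up inputs, the 1% threshold is reached after roughly 92 hours, which corresponds to about 33 km of straight-line advection. This is in line with the order of magnitude of transport discussed above, but it says nothing about whether eDNA at that distance would still be detectable once dilution is accounted for.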
Because particle tracking models depend on the accurate estimation of these parameters, field studies are vital to validate model predictions. While modeling studies predict the horizontal transport of detectable eDNA concentrations over tens of kilometers, metabarcoding eDNA from water samples routinely differentiates biological communities across habitats connected by nearshore currents and/or tidal movement at scales of tens of meters to a few kilometers (Port et al., 2016; Jeunen et al., 2019; West et al., 2020; Dugal et al., 2023; Shea & Boehm, 2024). These results suggest that when considering the entire pool of eDNA from multiple taxa in a sample, biological communities can be differentiated based on the eDNA in seawater that is present at the highest concentrations and has the highest detection probability; theoretically, this should be eDNA from taxa close to where the sample is taken. This inference is consistent with the predictions from particle tracking studies that eDNA concentrations and probabilities of detection will be highest closest to the source organism.
Horizontal transport of eDNA may still result in the detection of eDNA that has originated far from the sampling location. In one recent study, Shea and Boehm (2024) leveraged the natural tidal cycle and demonstrated that eDNA “communities” in tide pools separated by just 40 meters were more distinct from one another at the lowest stage of the tide. At higher stages of the tide, and thus just after flow between the tide pools had ceased, eDNA communities in the tide pools were more similar. Their study elegantly demonstrates that eDNA shedding from local fauna and the degradation of eDNA from distant fauna leads to distinct eDNA signals between locations, but that water flow between locations can blur these distinctions (Shea and Boehm, 2024).
eDNA metabarcoding studies have validated model predictions that the vertical transport of eDNA is weak in a stratified water column, across both very narrow depth ranges (meters) and broad depth ranges (hundreds to thousands of meters). Strong haloclines form in some fjords, where a freshwater layer exists at the very surface, and eDNA from these different layers is readily distinguished across depths of just a few meters (Jeunen et al., 2019; Robinson et al., 2023). Monuki et al. (2021) also found that samples from the surface were distinct from samples taken just 10 meters below the surface in a coastal kelp forest ecosystem.
Govindarajan et al. (2021) found differences in eukaryotic eDNA between water masses in the offshore Northwest Atlantic across depth ranges of tens to hundreds of meters. In one study, a lack of differentiation in fish eDNA across the pycnocline may be attributed to upwelling where the samples were collected, off the coast of California (Closek et al., 2019).
Despite these substantial efforts, the transport distances of marine eDNA that is detectable in metabarcoding data remain largely unconstrained. No field studies have combined metabarcoding with known point sources of eDNA over multiple kilometers in the horizontal dimension and across substantial depth ranges. In this study, we follow the principles of “caged fish” experiments by leveraging an isolated benthic invertebrate community that occurs at a bank offshore in the northwestern Gulf of Mexico, where diapiric intrusions of salt domes have formed steep-sloped carbonate banks along the edge of the continental shelf (Rezak et al., 1985). These banks support dense and diverse communities of benthic invertebrates at their peaks and slopes and provide habitat for reef-associated and commercially/recreationally important fish species (e.g. Red Snapper, Lutjanus campechanus, and groupers, Mycteroperca spp.) (Dennis & Bright, 1988). Across the depths at which these banks occur (as shallow as 17 meters to surrounding depths of over 200 meters), environmental conditions, including temperature, vary drastically. During the summer, temperature decreases rapidly with depth, resulting in shoaling of the surface mixed layer and the stratification of depth layers in the strong thermocline below (Lugo-Fernández, 1998). At these banks, benthic invertebrate communities and the demersal organisms associated with them turn over rapidly with depth, resulting in unique benthic invertebrate communities in each depth zone (Rezak et al., 1985; Sammarco et al., 2016). As a result of this structure by depth and the dearth of available hard bottom habitat elsewhere on the continental shelf, benthic communities at the banks are akin to isolated “islands” of biodiversity that do not occur elsewhere except many kilometers away at the nearest bank that rises to the same depths above the sedimented, turbid shelf edge.
We sampled eDNA systematically at distances away (up to ∼5.5 kilometers) from one of these mesophotic banks, Bright Bank, and at different depths (from 20 to ∼100 meters below sea level), to determine the extent of eDNA transport in the horizontal and vertical dimensions.
Additionally, we compared data generated using PCR primers that broadly amplify a barcode in the 18S rRNA gene from all eukaryotes with data generated using primers that specifically enrich 28S in anthozoan corals to investigate the influence of this methodological choice on inferring the transport of eDNA from metabarcoding data.
2. Methods
2.1 Study site
Bright Bank is a carbonate bank within the Flower Garden Banks National Marine Sanctuary (FGBNMS) that is situated on the edge of the continental shelf ∼120 miles south of the borders of the U.S. states of Louisiana and Texas. Bright Bank protrudes from a background depth of approximately 120 meters to 33 meters. At its shallowest point, there is a small mesophotic stony coral reef community (Hickerson et al., 2008). Most of the top of the bank lies at depths between 40 and 80 meters. The closest seafloor at depths shallower than 80 meters in the region is at Rankin & 28 Fathom Banks ∼8 nautical miles (∼15 km) to the west and Geyer Bank ∼12 nautical miles (∼22 km) to the east (Figure 1A).
At Bright Bank, the benthos is characterized by abundant carbonate encrusted with crustose coralline algae and interspersed sediments. At the top of the bank at 67 meters, benthic invertebrate morphotaxa observed during remotely operated vehicle (ROV) dives include sponges, including Xestospongia muta and Siphonodictyon sp.; giant Caribbean Sea anemone, Condylactis gigantea; black corals, antipathids and Plumapathes cf. pennacea; octocorals, Diodogorgia nodulifera, Nicella, and Ellisella; and other, smaller benthic invertebrates associated with the benthos and with corals. At depths of ∼85 meters near the eastern edge of the bank at site “Swiftia”, the diversity of benthic invertebrate morphotaxa observed is lower than at the top of the bank (11 versus 18 unique morphotaxa), but black corals and octocorals, including those not observed at shallower depths, Swiftia exserta and Chironephthya sp., occur. From the 80-meter isobath, the bathymetry slopes steeply to the north, east and south, and more gently to the west where it gradually connects to Rankin and 28 Fathom Banks. To the east of Bright at depths below 100 meters, the coral community is different from that at shallower depths on the bank and includes black corals and octocorals such as Elatopathes abietina, Aphanipathes pedata, Acanthopathes thyoides, and Callogorgia gracilis that have not been observed at shallower depths during ROV dives at Bright.
2.2 Study design
To determine the distribution of benthic invertebrate eDNA from Bright Bank, we conducted a series of Niskin bottle rosette casts over the course of three days from September 24-26, 2019, to sample eDNA extensively at the center of Bright Bank and in the water column nearby. Casts were conducted at stations corresponding to the center of the bank, and then in each ordinal direction at distances of 1.5 and 3 nautical miles (NM) from the bank’s center to capture potential eDNA transport from the bank’s center in all directions in the horizontal dimension (Figure 1B). An additional cast was conducted 1 NM east of the center of the bank, and two additional casts were conducted ∼2 NM to the NNW (station “mooring”) and NNE (station “Swiftia”) where remotely operated vehicle (ROV) dives had previously occurred.
Casts were conducted during daylight hours, as early as 7:00 CST (local time) and as late as 19:05 CST. The Niskin bottle rosette was outfitted with twelve ∼2L Niskin bottles and a Sea-Bird SBE 19plus V2 Conductivity Temperature Depth (CTD) unit (Sea-Bird Scientific, Bellevue, Washington) that included sensors for pressure, conductivity, and temperature, as well as a WET Labs ECO-AFL/FL fluorescence sensor, a WET Labs ECO turbidity sensor, and an SBE 43 dissolved oxygen sensor. Downcast profiles were generated from .cnv files exported from the raw data using Seasoft software (Sea-Bird Scientific) and plotted using the R package ggplot2 in the tidyverse (Wickham et al., 2019) in R (version 4.3.1 Beagle Scouts) to visualize conditions with depth at each station.
2.3 eDNA Sampling
Water samples were taken with Niskin bottles during upcasts at all stations 0, 1, 1.5 and 3 NM from the center of Bright Bank at depths from ∼100 meters to 20 meters. At the “Center” station (0 NM), the Niskin bottle rosette was lowered to a depth as close to the seafloor as possible (58 meters). Here, quadruplicate samples were taken near the seafloor, and at 40 meters and 20 meters depth in the water column. At stations 1.0, 1.5, and 3.0 NM from the center of Bright, triplicate samples were taken in the water column at 100 meters depth or as close to the seafloor as possible (90 to 101 meters depth), and at 80 meters, 60 meters, and 40 meters in the water column. At the two stations “mooring” and “Swiftia”, all twelve Niskin bottles were triggered as close to the seafloor as possible at depths of 83 and 84 meters, respectively.
Water from the Niskin bottles was either first transferred to sterile 2L Whirl-Pak bags (Nasco Sampling, Madison, WI, USA) or filtered directly from the Niskin bottles. Filtration was conducted using peristaltic pump tubing (Masterflex, Vernon Hills, IL, USA) sterilized with a 10% solution of household bleach, Masterflex peristaltic pumps, and polyethersulfone filters. For all Niskin bottle rosette casts, except those conducted at the “mooring” and “Swiftia” stations, water from each Niskin bottle was filtered over separate 0.22 µm polyethersulfone (PES) Sterivex filters (MilliporeSigma, Burlington, MA, USA). The average volume filtered per sample was 2.20 ± 0.21 L (1 SD). For the Niskin bottle rosette casts conducted at the “mooring” and “Swiftia” stations, the entire volume from all 12 Niskin bottles was filtered over a single 0.20 µm PES membrane with an outer pre-filter mesh in a Mini Kleenpak capsule (Pall Corporation, Port Washington, New York, USA). The volumes filtered for these two samples were 28.2 and 30 liters, respectively. After each Niskin bottle rosette cast, an average of 2.37 ± 0.14 L of MilliQ water (the same water used to rinse tubing following bleach sterilization) was filtered over a Sterivex to serve as a sampling negative control for each set of samples from each Niskin bottle rosette cast. Additionally, 22 liters of MilliQ water was filtered over a Mini Kleenpak to serve as a sampling negative control for the Niskin bottle rosette casts conducted at the “mooring” and “Swiftia” stations. Detailed descriptions of the measures taken to prevent contamination are reported in Govindarajan et al. (2022). In that study, eDNA samples were included from Niskin bottle rosette casts conducted during the same cruise for a separate analysis.
2.4 DNA extraction, metabarcoding library preparation, and sequencing
DNA was extracted from the Sterivex and Mini Kleenpak filters using the Qiagen DNeasy Blood & Tissue Kit (Qiagen, Hilden, Germany) and modified protocols for each filter type as described in Govindarajan et al. (2022). The protocol for extraction from Sterivex filters is adapted from the protocol originally described in Spens et al. (2017) and involves extracting directly from the filter capsules using larger volumes of the kit reagents. For the Sterivex samples, DNA was eluted in 80 µl of molecular-grade water and stored at -20°C. For the Mini Kleenpak samples, DNA was eluted in 80 µl of Qiagen Buffer AE and stored at -20°C. DNA extractions from separate pieces of the Mini Kleenpak filters were pooled at equal volumes for downstream processing. For 18S library preparation, DNA from the inner PES filter and outer filter mesh of the Mini Kleenpak filters was processed separately in two separate libraries. For 28S library preparation, DNA from these two filter layers was processed together in a single library.
Metabarcoding libraries were prepared using primers that amplify a barcode in the V9 region of the 18S rRNA gene in many eukaryotic taxa including marine invertebrates (Amaral-Zettler et al., 2009) and separately using primers designed to amplify a barcode in the 28S rRNA gene in anthozoan corals. The latter also amplify 28S in ctenophores, medusozoans, and sponges (McCartin et al., 2023). The primer sequences for 18S are 5’-CCCTGCCHTTTGTACACAC-3’ (forward) and 5’-CCTTCYGCAGGTTCACCTAC-3’ (reverse), and the primer sequences for 28S are 5’-CGTGAAACCGYTRRAAGGG-3’ (forward) and 5’-TTGGTCCGTGTTTCAAGACG-3’ (reverse). Library preparation for the 18S marker is described in detail in Govindarajan et al. (2022), and library preparation for the 28S marker was conducted following McCartin et al. (2023). Here, we describe the details that differed between library preparation using the two markers. For the 18S libraries, DNA extractions were diluted 1:10 in molecular grade water prior to PCR amplification, whereas for the 28S libraries samples were not diluted. KAPA HiFi HotStart ReadyMix (Kapa Biosystems, Wilmington, MA, USA) was used for the amplification of 18S, whereas Platinum SuperFi II Master Mix (Thermo Fisher Scientific, Waltham, MA, USA) was used for the amplification of 28S. Primer concentrations for the two primer sets also differed, with final concentrations of 200 nM for 18S and 1,000 nM for 28S, and cycling conditions differed depending on the manufacturer’s recommended protocols for the different master mixes. 25 PCR cycles were conducted for the 18S libraries, and 40 cycles of PCR were conducted for the 28S libraries. The protocols differed between the two primer sets to optimize the PCR for 28S to make the reaction as sensitive as possible for the detection of coral eDNA according to McCartin et al. (2023).
Duplicate PCR reactions were conducted for each sample with each primer set, and the duplicate products were pooled after their visualization on a 1% agarose gel in TBE buffer stained with GelRed (Biotium, Fremont, California, USA). Each of the pooled PCR products using the 28S marker was size selected using a 0.8X ratio of KAPA Pure magnetic beads (Kapa Biosystems, Wilmington, MA, USA) and eluted in 10 mM Tris-HCl to remove primer-dimers. PCR negative controls were included on each plate and processed in the same manner along with the sampling negative controls to monitor for contamination during library preparation.
The 18S and 28S primers included CS1 and CS2 adapter overhangs for sample barcoding using Access Array barcodes (Standard BioTools, San Francisco, CA). Sample barcoding using a second PCR reaction and subsequent sequencing were conducted at the University of Illinois Chicago’s Genome Research Core for the 18S libraries and Rush University Medical Center’s Genomics and Microbiome Core Facility for the 28S libraries. For both 18S and 28S libraries, a preliminary sequencing run with paired-end 150 bp reads on an Illumina MiniSeq was conducted using equal, pooled volumes of each barcoded library. The pooled volumes for each sample were adjusted based on the number of reads produced in the MiniSeq runs for subsequent sequencing on an Illumina MiSeq, to obtain even depths of coverage across samples. For negative control samples (including sampling negative controls and PCR negative controls) that did not amplify (as visualized with gel electrophoresis), 1 µl of each was pooled. The volumes of negative control samples that did amplify were adjusted in the same manner as for the Niskin bottle samples.
Prior to sequencing, the pooled 18S libraries were cleaned with a 1X ratio of AMPure magnetic beads (Beckman Coulter, Indianapolis, IN, USA). Pooled 18S libraries were sequenced with three MiSeq runs (all samples were included in each run) using V2 chemistry with paired-end 250 bp reads. Pooled 28S libraries were sequenced with one MiSeq run using V3 chemistry with paired-end 300 bp reads.
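The primer sequences listed above contain IUPAC degenerate bases (H, Y, R), and primer trimming typically also needs their reverse complements. The following Python sketch shows how the degeneracy (the number of distinct oligonucleotides a primer represents) and the reverse complement can be derived. The helper names are hypothetical and this is not part of the published workflow; only the four primer sequences come from the text.

```python
# IUPAC nucleotide codes mapped to the bases each one can represent.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Complement of each IUPAC code (e.g. Y = C/T pairs with R = G/A).
COMPLEMENT = {
    "A": "T", "C": "G", "G": "C", "T": "A",
    "R": "Y", "Y": "R", "S": "S", "W": "W", "K": "M", "M": "K",
    "B": "V", "V": "B", "D": "H", "H": "D", "N": "N",
}


def degeneracy(primer: str) -> int:
    """Number of distinct non-degenerate sequences encoded by a primer."""
    total = 1
    for base in primer:
        total *= len(IUPAC[base])
    return total


def reverse_complement(primer: str) -> str:
    """Reverse complement that preserves IUPAC degenerate codes."""
    return "".join(COMPLEMENT[base] for base in reversed(primer))


# Primer sequences as given in the text (5' to 3').
PRIMERS = {
    "18S_forward": "CCCTGCCHTTTGTACACAC",
    "18S_reverse": "CCTTCYGCAGGTTCACCTAC",
    "28S_forward": "CGTGAAACCGYTRRAAGGG",
    "28S_reverse": "TTGGTCCGTGTTTCAAGACG",
}

for name, seq in PRIMERS.items():
    print(name, degeneracy(seq), reverse_complement(seq))
```

For example, the 28S forward primer contains one Y and two R positions and therefore represents 2 × 2 × 2 = 8 sequences, whereas the 28S reverse primer is non-degenerate.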
2.5 Bioinformatic processing
Primer sequences were trimmed from forward and reverse sequencing reads using cutadapt (Martin, 2011) with the following settings: forward and reverse primer sequences were anchored to the beginning of the forward and reverse reads, respectively; reverse complements of the reverse and forward primers were optionally trimmed from the forward and reverse reads, respectively, if they were identified; the minimum overlap length was set to five nucleotides (-O 5); the number of primer sequences matched was set to one (-n 1); the minimum length of trimmed reads retained was set to five nucleotides (--minimum-length 5); and untrimmed read pairs were discarded if either the forward or reverse primer sequences were not matched and trimmed from the forward and reverse reads, respectively. Read pairs were quality filtered, denoised to infer error-corrected amplicon sequence variants (ASVs), and merged using the R package DADA2 (Callahan et al., 2016). 18S read pairs were filtered using the filterAndTrim function with a maximum of two expected errors allowed in either the forward or reverse read, and forward and reverse reads were truncated to 125 bp. 28S read pairs were filtered using the filterAndTrim function with a maximum of two expected errors allowed in either the forward or reverse read, and forward and reverse reads were truncated to 250 and 200 bp, respectively. Error profiles were learned, and read pairs were denoised and merged using the default settings, except that the pseudo-pool option was used for denoising with the dada function (pool = “pseudo”). Chimeras were removed from the resulting ASV tables using the removeBimeraDenovo function and the default settings. For the 18S data, each sequencing run was processed separately, and then ASV tables were merged using the gather and summarise functions from the tidyverse (Wickham et al., 2019).
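To make the quality-filtering criterion above concrete, the following Python sketch computes the expected number of errors in a read from its Phred quality scores and applies a truncate-then-filter rule to a read pair. It is an illustration of the general idea behind a maximum-expected-errors filter, not a reimplementation of DADA2; the function names are hypothetical, and the only values taken from the text are the threshold of two expected errors and the 18S truncation length of 125 bp.

```python
def expected_errors(phred_scores):
    """Sum of per-base error probabilities, where P(error) = 10^(-Q/10)."""
    return sum(10 ** (-q / 10) for q in phred_scores)


def pair_passes_filter(fwd_quals, rev_quals, trunc_fwd, trunc_rev, max_ee=2.0):
    """Truncate both reads, then keep the pair only if each read passes.

    Reads shorter than the truncation length are discarded, and a read is
    rejected when its expected errors after truncation exceed `max_ee`.
    """
    if len(fwd_quals) < trunc_fwd or len(rev_quals) < trunc_rev:
        return False
    fwd = fwd_quals[:trunc_fwd]
    rev = rev_quals[:trunc_rev]
    return expected_errors(fwd) <= max_ee and expected_errors(rev) <= max_ee


# Toy quality strings (already decoded to integer Phred scores).
good_read = [30] * 150   # Q30 everywhere: 0.001 expected errors per base
poor_read = [10] * 150   # Q10 everywhere: 0.1 expected errors per base

print(expected_errors(good_read[:125]))                      # about 0.125
print(pair_passes_filter(good_read, good_read, 125, 125))    # kept
print(pair_passes_filter(good_read, poor_read, 125, 125))    # discarded
```

Truncating before computing expected errors matters because read quality usually declines toward the 3' end, so a shorter truncation length lets more read pairs pass the same threshold.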
Taxonomic classification of ASVs was conducted using the assignTaxonomy function in DADA2, which implements a naïve Bayesian classifier (Wang et al., 2007). Custom reference database files were created from 18S and 28S data downloaded from MetaZooGene (https://metazoogene.org/database/) (O’Brien et al., 2024). For 18S, global data for all taxa were downloaded to form the reference database on April 25, 2024. For 28S, data from genera and/or species known from the North Atlantic were downloaded for all taxa other than anthozoan corals (Orders: Antipatharia, Scleractinia, Malacalcyonacea, and Scleralcyonacea) on June 20, 2024. A separate, custom reference database with sequences from these anthozoan coral taxa and their most up-to-date taxonomic nomenclature (McCartin et al., 2023) was concatenated to the 28S data from MetaZooGene to build a comprehensive taxonomic reference database for all animals. For taxonomic classification, the default settings were used, which include a minimum bootstrap cutoff score of 50. Following taxonomic classification, separate phyloseq (McMurdie and Holmes, 2013) objects were created from the ASV tables, taxonomic classification tables, and sampling metadata files for the 18S and 28S data. The resulting phyloseq ASV tables were “decontaminated” of ASVs that were more prevalent in the negative control samples (both sampling and PCR negative controls) than in the field samples using the R package decontam and the “prevalence” method with the default settings (Davis et al., 2018).
2.6 Correcting for cross contamination between samples
In the 18S data, >100,000 sequencing reads were produced in a library prepared from the sampling negative control corresponding to the Niskin bottle rosette cast 1.0 NM to the east of Bright Bank’s center and in a library prepared from a PCR negative control from the same PCR plate as this negative sampling control. These libraries produced 117,894 and 116,000 reads, respectively, which were comparable numbers to libraries prepared from field samples. Cross contamination to the same degree was not apparent in the 28S library prepared from the same sampling negative control (602 reads), nor from the 18S library prepared from the other sampling negative control on the same PCR plate and just one column away (7 reads). The maximum number of reads in any of the other sampling negative controls or PCR negative controls was 334 reads. We assumed that cross contamination of the negative sampling control corresponding to the cast east of Bright Bank occurred during 18S library preparation, and we removed all samples from this cast for the analysis of the 18S data out of caution.
In the 28S data, >100 sequencing reads were produced from libraries prepared from several sampling negative controls. We detected 279 total sequencing reads in the sampling negative control corresponding to the cast 1.5 NM to the NW of Bright; animal ASVs detected in this sample were classified to the black coral genus Antipathes. We detected 102 total sequencing reads in the sampling negative control corresponding to the cast 3.0 NM to the NW of Bright; animal ASVs detected in this sample were classified to Antipathes and the trachymedusan genus Aglaura. We also recovered 477, 602, and 5,077 total sequencing reads in the sampling negative controls corresponding to the CTD casts 1.5 NM to the NE, 1.0 NM to the E, and 1.5 NM to the SW of Bright Bank, respectively. Animal ASVs in these samples were classified to the octocoral genera Muricea and Eunicea (Family: Plexauridae).
The prevalence of these sequences in the sampling negative controls was concerning. However, we chose to retain sequences from these taxa for data analysis, because the detections of these taxa across samples very likely represented true detections, and false positive detections of these taxa would not have meaningfully influenced our broader interpretation of the data. Specifically, eDNA sequences from Muricea and Eunicea, which were the most numerous among those taxa detected in the sampling negative controls, were largely constrained to samples from one depth in the CTD cast corresponding to the sampling negative control where they were detected. We infer that the negative control sample was likely contaminated from these few samples, which represent valid detections of these genera. Antipathes was the most common coral genus across samples, based on both the number of sample replicates in which it was detected (52 of 115) and its average relative abundance across these samples (3.2%). Therefore, its presence in any sample below the surface mixed layer (it was not detected in any sample in the surface mixed layer) was plausible. False positive detections of these three coral taxa in the 28S data, the trachymedusan Aglaura in the 28S data, and any taxa detected in sampling negative controls at low sequence abundances in the 18S data (excluding the samples that were removed) may have marginally influenced some analyses that included all the data, for example ordination. However, the broader interpretation of patterns in the distribution of eDNA with depth and with distance from the center of Bright Bank would not be changed. Nevertheless, given the detection of eDNA in sampling negative controls, we interpreted with caution any detections of taxa across depths and sites that occurred in few sample replicates and at low relative abundances.
2.7 Data analysis
All downstream data analyses were performed in R (version 4.3.1 Beagle Scouts) using functions in phyloseq (McMurdie & Holmes, 2013) unless otherwise mentioned. Rarefied ASV richness was estimated from the ASV tables using rarecurve from vegan (Oksanen et al., 2007) with a step size of 2,500 reads per random sample. Resulting rarefaction curves were plotted using ggplot2 to visualize whether the depth of sequencing was sufficient to capture ASV diversity in each sample. To analyze β-diversity across sampling depths, the 18S and 28S ASV tables were first subset to ASVs classified to the Animal kingdom that were further classified to the Phylum level. We filtered out animal ASVs that were not further classified to the Phylum level to avoid including ASVs that were erroneously classified as animals, and because classifications only to the Animal level were not relevant to our analyses of taxonomic composition across samples. Sequence read abundances for each ASV detected from replicate samples at each sampling depth and station were summed using merge_samples2 from the speedyseq package.
Summed read counts were then converted to relative abundances using transform_sample_counts and were calculated as the proportion of sequencing reads represented by each ASV at each station and depth out of the total number of sequencing reads from ASVs assigned a taxonomic classification to an animal phylum. Bray-Curtis dissimilarities were calculated both from these relative abundance matrices and from matrices of presence/absence of the animal ASVs.
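The rarefaction step in this paragraph can be illustrated with the classical analytical formula for the expected number of ASVs in a random subsample of n reads drawn without replacement. The Python sketch below is a minimal version under that assumption; the function names are hypothetical, the subsample sizes it generates are not guaranteed to match those used by vegan, and the toy counts are invented.

```python
from math import comb


def rarefied_richness(counts, n):
    """Expected number of ASVs observed in n reads sampled without replacement.

    For each ASV with count c out of `total` reads, the probability that it is
    missed entirely is C(total - c, n) / C(total, n).
    """
    counts = [c for c in counts if c > 0]
    total = sum(counts)
    if n > total:
        raise ValueError("subsample size exceeds the number of reads")
    denominator = comb(total, n)
    return sum(1 - comb(total - c, n) / denominator for c in counts)


def rarefaction_curve(counts, step=2500):
    """(subsample size, expected richness) pairs at regular steps up to all reads."""
    total = sum(counts)
    sizes = list(range(step, total, step)) + [total]
    return [(n, rarefied_richness(counts, n)) for n in sizes]


# Toy ASV read counts for a single sample (invented numbers).
sample_counts = [6000, 3000, 900, 80, 15, 4, 1]
for size, richness in rarefaction_curve(sample_counts, step=2500):
    print(size, round(richness, 2))
```

A curve that flattens well before the full read count suggests that additional sequencing would add few new ASVs to that sample.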
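The sequence of steps described here (summing replicates, converting to relative abundances, and computing Bray-Curtis dissimilarity on abundance and on presence/absence data) can be sketched in plain Python as follows. This is an illustrative stand-in for the phyloseq and speedyseq functions named in the text, with hypothetical helper names and invented read counts.

```python
def sum_replicates(replicates):
    """Sum ASV read counts across replicate samples (dicts of ASV -> reads)."""
    summed = {}
    for replicate in replicates:
        for asv, reads in replicate.items():
            summed[asv] = summed.get(asv, 0) + reads
    return summed


def relative_abundance(counts):
    """Convert read counts to proportions of the sample total."""
    total = sum(counts.values())
    return {asv: reads / total for asv, reads in counts.items()}


def presence_absence(counts):
    """Recode read counts as 1 (detected) or 0 (not detected)."""
    return {asv: 1 if reads > 0 else 0 for asv, reads in counts.items()}


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity: sum|a_i - b_i| / sum(a_i + b_i)."""
    asvs = set(a) | set(b)
    numerator = sum(abs(a.get(k, 0) - b.get(k, 0)) for k in asvs)
    denominator = sum(a.get(k, 0) + b.get(k, 0) for k in asvs)
    return numerator / denominator


# Invented replicate counts for two station/depth combinations.
station_a = sum_replicates([{"asv1": 20, "asv2": 4}, {"asv1": 10, "asv2": 6}])
station_b = sum_replicates([{"asv2": 12, "asv3": 5}, {"asv2": 8, "asv3": 15}])

print(bray_curtis(relative_abundance(station_a), relative_abundance(station_b)))
print(bray_curtis(presence_absence(station_a), presence_absence(station_b)))
```

In this toy case the abundance-based dissimilarity is 0.75 and the presence/absence version is 0.5, showing how the two matrices can weight the same shared ASV differently.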
Ordination was performed on the resulting dissimilarity matrices with Principal Coordinates Analysis (PCoA) using the ordinate function, and the positions of each sample in the first two principal coordinate axes were plotted using the plot_ordination function. Distance-based redundancy analyses (dbRDA) from the dissimilarity matrices were conducted using the dbrda function from vegan to determine if there were significant correlations between differences in sampling depth and location and Bray-Curtis dissimilarities. We fit a model that included sampling depth (in meters), latitude, longitude, and proximity to the seafloor as constraining variables. Proximity to the seafloor was scored as a categorical, binary variable depending on whether the sample was taken as close to the seafloor as possible, within 6 meters as calculated from the sampling depth indicated using the CTD data and bathymetric data of Bright Bank.
Permutational analyses of variance implemented using the anova function in vegan were conducted with 999 permutations to determine if differences in these factors were significantly correlated with Bray-Curtis dissimilarities between samples calculated from the relative abundances of ASVs.
To visualize the taxonomic composition of eDNA sample replicates using the 18S and 28S markers with sampling depth and distance from the center of Bright Bank, the relative abundances of animal ASVs in each sample replicate were calculated as previously described and visualized using the plot_bar function. The differential abundances of different taxonomic groups across eDNA samples were modeled using DESeq2 (Love et al., 2014) and the extension for phyloseq objects as described in the online vignette (https://joey711.github.io/phyloseq-extensions/DESeq2.html). First, the differential abundance of sequencing reads in the 18S and 28S data classified to each Order were separately modeled with respect to sampling depth, to investigate the taxa driving trends overall in community dissimilarities across the thermocline. For this model, sampling depth was converted to a binary, categorical variable by grouping samples taken at depths of 20 and 40 meters (in the surface mixed layer) and samples taken at depths in the thermocline (60 to 100 meters). Next, the differential abundance of sequencing reads in the 18S and 28S data classified to each Family were separately modeled with respect to sampling depth and the distance of each sampling station from the center of Bright Bank in nautical miles. For these models, we subset the 18S and 28S data to only samples taken in the thermocline (depths of 60 meters or greater). Sampling depths and distances of the sampling stations from the center of Bright Bank were scaled by dividing each value by the standard deviation of each measurement. The interaction term for these two variables was not included in the model. For significance testing, p-values were adjusted using the Benjamini and Hochberg method to control the false-discovery rate, and 0.05 was used as the significance threshold.
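Two of the numerical steps described above, scaling the predictors by their standard deviations and adjusting p-values with the Benjamini and Hochberg method, can be sketched as follows. This Python illustration is not the DESeq2 code used in this study. The function names and example values are hypothetical, and the use of the sample standard deviation (as returned by sd in R) is an assumption.

```python
from statistics import stdev

def scale_by_sd(values):
    """Divide each value by the sample standard deviation (assumed convention)."""
    sd = stdev(values)
    return [v / sd for v in values]

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Walk from the largest to the smallest p-value, enforcing monotonicity.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, i in enumerate(order):
        rank = m - position  # 1-based rank when sorted in ascending order
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical sampling depths (meters) and raw p-values for four taxa.
scaled_depths = scale_by_sd([60, 80, 100])
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
significant = [p < 0.05 for p in adjusted]
print(scaled_depths, adjusted, significant)
```

Scaling without centering places depth and distance on comparable units, so that their model coefficients can be compared directly without shifting their origins.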
We characterized invertebrate families as benthic, versus pelagic, according to whether these families have predominantly benthic adult forms (Brusca et al., 2022; Jumars et al., 2015). We did not consider hydrozoan families as benthic, because many hydrozoans in the orders Anthoathecata and Leptothecata detected in the data exist as pelagic medusae for a substantial period as adults. This resulted in the exclusion of some eDNA likely derived from the colonial, sessile adult forms of some hydroid families, including milleporid fire corals. We acknowledge that eDNA from the families distinguished as benthic may also be derived from pelagic larvae. Summary statistics, including ASV richness and relative abundance, were calculated and visualized using tidyverse R packages including tidyr, dplyr, and ggplot2 (Wickham et al., 2019). Bathymetric maps were made both in QGIS and in R with the package marmap (Pante & Simon-Bouhet, 2013). Distances from isobaths to sampling stations were measured using the measure tool in QGIS. High resolution bathymetry data of Bright Bank were taken from a publicly available multibeam mapping survey generated by the U.S. Geological Survey (https://cmgds.marine.usgs.gov/data/pacmaps/wg-index.html) (Gardner et al., 1998), and bathymetry data from the Gulf of Mexico more broadly were derived from the General Bathymetric Chart of the Oceans (GEBCO; https://www.gebco.net/). Hindcast current data were generated by the HYCOM+NCODA Gulf of Mexico 1/25° analysis (https://www.hycom.org/data/gomu0pt04/expt-90pt1m000) and analyzed/visualized in R using the packages RNetCDF (Michna & Woods, 2013), marmap, and lubridate and ggplot2 from the tidyverse.
3. Results
3.1 Water column profile
Consistently over the course of the three sampling days, there was a strong thermocline from approximately 45 to 60 meters depth that persisted below these depths with a less drastic rate of change (Figure 1C). This strong thermocline was expected and typical for the northwestern Gulf of Mexico in late summer, and coincided with consistent decreases in salinity, decreases in dissolved oxygen concentration, peaks in fluorescence, and slight increases in turbidity across CTD casts (Figure S1). From these data, we interpret that the shallowest depth of this strong thermocline corresponds to the chlorophyll maximum which occurs at the transition between the surface mixed layer and deeper, nutrient rich waters.
3.2 Amplicon sequence diversity and sequencing depth
After primer trimming, quality filtering, denoising, merging, and chimera removal, the average depth of sequencing per sample from the ∼2L Niskin bottle samples was 118,676 ± 26,704 reads (1 SD) for the 18S data and 45,971 ± 15,571 reads for the 28S data. The average number of total ASVs per sample in the 18S data (891 ± 217 ASVs) was over three-fold higher than in the 28S data (254 ± 70 ASVs). Visualization of rarefaction curves demonstrates that these sequencing depths were sufficient to capture the majority of ASV richness in both the 18S and 28S libraries; however, sequencing was more comprehensive with greater sequencing depth using the 18S barcode (i.e. rarefaction curves reached a true asymptote) (Figure 2).
3.3 Taxonomic composition of eDNA in the 18S and 28S metabarcoding data
The percentage of ASVs classified as animals and assigned a taxonomic classification to the phylum level was higher in the 28S data (25.2 ± 9.3%) (1 SD) than in the 18S data (5.2 ± 1.4%). Hereafter, these ASVs are referred to succinctly as “animal ASVs”. While the percentages of animal ASVs in the 18S and 28S data were quite low, on average animal ASVs were represented by more substantial percentages of the number of total sequencing reads in the 28S (68.0 ± 19.6%) and 18S (43.5 ± 20.7%) data across samples. Hereafter, these sequencing reads are referred to as “animal sequencing reads” and “relative abundances” refer to the proportion of reads with respect to the total number of animal sequencing reads in a sample.
In most samples in the 18S data (104 of 107), Hexanauplia (Phylum: Arthropoda) was the most abundant animal class proportionally with a mean relative abundance of 86.8 ± 14.4% (1 SD) across samples (Figure 3). ASVs classified to Hexanauplia were predominantly classified as calanoid and cyclopoid copepods. The second most abundant animal class was Hydrozoa (Phylum: Cnidaria) (mean relative abundance 9.0 ± 13.9%), which were predominantly siphonophores. Other classes that comprised at least 5% of the animal sequencing reads in any individual sample were Appendicularia and Thaliacea (Phylum: Chordata); Phascolosomatidea and Polychaeta (Phylum: Annelida); sagittoid chaetognaths; Tentaculata (Phylum: Ctenophora); and demosponges (Phylum: Porifera). On average, polychaetes comprised 1.2 ± 3.7% of the animal sequencing reads across samples, while the remaining classes comprised < 1% of the animal sequencing reads on average. The following animal phyla were also detected in the 18S data at relative abundances < 5% in every sample: Mollusca, Bryozoa, Echinodermata, Platyhelminthes, Nemertea and Rotifera.
In the 28S data, Hydrozoa was the most abundant animal class proportionally, comprising 78.7 ± 21.2% of the animal sequencing reads on average across samples. The second most abundant animal class in the 28S data was Tentaculata (Phylum: Ctenophora) (mean relative abundance 13.0 ± 16.3%). Anthozoans and calcareous sponges comprised just 3.6 ± 6.6% and 3.6 ± 10.1% of the animal sequencing reads on average across samples, respectively. However, in 20 and 9 of 117 samples, respectively, these taxa comprised at least 10% of the animal sequencing reads. The remaining classes that comprised at least 5% of the animal sequencing reads in any individual sample were Nuda (Phylum: Ctenophora), Scyphozoa (Phylum: Cnidaria), and Cephalopoda (Phylum: Mollusca). On average these taxa comprised less than 1% of the animal sequencing reads across samples. The following animal phyla were also detected in the 28S data at relative abundances < 5% in every sample: Annelida, Chordata, and Arthropoda.
3.4 β-diversity analysis across sampling stations and depths
Principal coordinates and distance-based redundancy analyses consistently revealed that Bray-Curtis dissimilarities calculated from the relative abundances of 18S animal ASVs between eDNA samples were strongly correlated with differences in sampling depth, specifically whether or not samples were taken in (20 and 40 meters) versus below (60 meters and deeper) the surface mixed layer. Samples taken at depths in the surface mixed layer trended towards higher values on the first PCoA axis, which accounted for 30.4% of the variation in Bray-Curtis dissimilarities in animal eDNA relative abundances between samples (Figure 4). Distance-based redundancy analysis found that 29.1% of the variation in squared Bray-Curtis dissimilarities calculated from relative abundances of animal ASVs were constrained by sampling depth, latitude, longitude, and proximity to the seafloor. Bray-Curtis dissimilarities between samples were significantly correlated with differences in sampling depth (P-Value = 0.001), but not with differences in latitude (P-Value = 0.653), longitude (P-Value = 0.598), or proximity to the seafloor (P-Value = 0.154).
Distance-based redundancy analyses based on the presence/absence of 18S animal ASVs recapitulated the correlation between sampling depth and sample dissimilarities, finding that 28.0% of the variation in squared Bray-Curtis dissimilarities were constrained by the explanatory variables. Bray-Curtis dissimilarities between samples were significantly correlated not only with differences in sampling depth (P-Value = 0.001), but also with proximity to the seafloor (P-Value = 0.018). However, they were not correlated with differences in latitude (P-Value = 0.306) or longitude (P-Value = 0.189).
Differences in the relative abundances of 28S animal ASVs in samples taken in versus below the surface mixed layer were not as apparent from the PCoA. On the first two PCoA axes, samples collected from depths above the thermocline were not apparently more similar to one another than to samples taken at depths in the thermocline. Distance-based redundancy analysis found that 13.4% of the variation in Bray-Curtis dissimilarities between samples was constrained by depth, latitude, longitude and proximity to the seafloor. Bray-Curtis dissimilarities between samples were significantly correlated with differences in sampling depth (P-Value = 0.017), but not with differences in latitude (P-Value = 0.438), longitude (P-Value = 0.407), or proximity to the seafloor (P-Value = 0.217).
In contrast to analyses based on the relative abundances of 28S animal ASVs, ordination of Bray-Curtis dissimilarities calculated from the presence/absence matrix of 28S animal ASVs more clearly revealed that depth explained variation in community dissimilarities. Samples collected at shallower depths trended towards more positive values on the first PCoA axis, which explained 14.4% of the variation in Bray-Curtis dissimilarities. Further, distance-based redundancy analysis found that 21.6% of the variation in squared Bray-Curtis dissimilarities was constrained by the explanatory variables, and that dissimilarities between samples were significantly correlated not only with differences in sampling depth (P-Value = 0.001), but also with differences in latitude (P-Value = 0.038), and marginally with proximity to the seafloor (P-Value = 0.085). Bray-Curtis dissimilarities between samples were not significantly correlated with differences in longitude (P-Value = 0.357).
3.5 Spatial distribution of invertebrate eDNA across depth layers
In the 18S data, the sequence abundances of 6 invertebrate orders were significantly correlated with sampling depths in the surface mixed layer. Sequence abundances of eDNA from cyclopoid copepods were significantly higher in the surface mixed layer than at sampling depths below 40 meters (log2 fold change = 1.09 ± 0.28) (1 SE) (adjusted P-Value = 0.001). The sequence abundances of the remaining orders were significantly lower in the surface mixed layer. These orders included several benthic taxa: black corals (Antipatharia) (Class: Anthozoa) (log2 fold change = -5.57 ± 1.73, adjusted P-Value = 0.008), Sabellida (Class: Polychaeta) (log2 fold change = -5.86 ± 1.46, adjusted P-Value = 0.00006), Spionida (Class: Polychaeta) (log2 fold change = -24.6 ± 2.92, adjusted P-Value < 0.00001), and Terebellida (Class: Polychaeta) (log2 fold change = -4.05 ± 1.16, adjusted P-Value = 0.003). Sequence abundances of siphonophores were also significantly negatively correlated with sampling depths in the surface mixed layer (log2 fold change = -2.26 ± 0.39, adjusted P-Value < 0.00001).
The sequence abundances of several invertebrate orders were also significantly correlated with sampling depths in the surface mixed layer in the 28S data. These include sessile, benthic taxa as well as zooplankton. The sequence abundances of all four orders of anthozoan corals were negatively correlated with sampling depths in the surface mixed layer: Antipatharia (log2 fold change = -7.46 ± 0.77, adjusted P-Value < 0.00001), Scleractinia (log2 fold change = -25.2 ± 1.96, adjusted P-Value < 0.00001), Malacalcyonacea (log2 fold change = -25.4 ± 1.25, adjusted P-Value < 0.00001), and Scleralcyonacea (log2 fold change = -7.45 ± 1.16, adjusted P-Value < 0.00001). The sequence abundances of sea anemones (Order: Actiniaria) were also significantly, negatively correlated with sampling depths in the surface mixed layer (log2 fold change = -5.63 ± 2.25, adjusted P-Value = 0.03). Sequence abundances of two orders of calcareous sponges were also significantly, negatively correlated with sampling depths in the surface mixed layer: Leucosolenida (log2 fold change = -10.6 ± 0.98, adjusted P-Value < 0.00001) and Clathrinida (log2 fold change = -10.1 ± 1.09, adjusted P-Value < 0.00001). The sequence abundances of eDNA from Lobata (Phylum: Ctenophora) and Narcomedusae (Class: Hydrozoa) were significantly, positively correlated with sampling depths in the surface mixed layer: Lobata (log2 fold change = 1.25 ± 0.33, adjusted P-Value = 0.0003), and Narcomedusae (log2 fold change = 2.33 ± 0.48, adjusted P-Value < 0.00001). As in the 18S data, the sequence abundances of siphonophores were significantly, negatively correlated with sampling depths in the surface mixed layer (log2 fold change = -1.31 ± 0.35, adjusted P-Value = 0.0005). Additionally, sequence abundances of eDNA classified as belonging to cuttlefish (Order: Sepiida) were significantly, negatively correlated with sampling depths in the surface mixed layer (log2 fold change = -7.51 ± 0.65, adjusted P-Value < 0.00001).
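Because the effect sizes above are reported on a log2 scale, it can help to convert them back to ratios of sequence abundance in the surface mixed layer relative to the thermocline. The following minimal Python sketch does so for two values reported in Section 3.5. The function names are hypothetical, and the approximate interval assumes a normal approximation of ± 1.96 standard errors, which is an illustrative assumption rather than a result reported in this study.

```python
def fold_ratio(log2_fold_change):
    """Convert a log2 fold change into a ratio (surface mixed layer / thermocline)."""
    return 2.0 ** log2_fold_change

def approximate_ratio_interval(log2_fold_change, standard_error, z=1.96):
    """Approximate interval on the ratio scale, assuming normality on the log2 scale."""
    lower = fold_ratio(log2_fold_change - z * standard_error)
    upper = fold_ratio(log2_fold_change + z * standard_error)
    return lower, upper

# Values reported for the 18S data: cyclopoid copepods and siphonophores.
cyclopoid_ratio = fold_ratio(1.09)      # about 2.1 times higher in the surface mixed layer
siphonophore_ratio = fold_ratio(-2.26)  # about 0.2, i.e. roughly 5 times lower
print(cyclopoid_ratio, siphonophore_ratio)
print(approximate_ratio_interval(1.09, 0.28))
```

On this scale, log2 fold changes with magnitudes near 25 correspond to ratios on the order of 10^-8, which is consistent with taxa that were essentially undetected in the surface mixed layer.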
3.6 Vertical and horizontal distribution of benthic invertebrate eDNA
As indicated by the negative correlations of the sequence abundances of several benthic orders with depths in the surface mixed layer, eDNA from benthic invertebrate families was largely constrained to depths in the thermocline, below the surface mixed layer (Figure 5). The average number of benthic invertebrate families detected in the 18S data was highest at the center of Bright Bank at depths of 60 meters close to the seafloor; on average 26.8 ± 4.5 (1 SD) families were detected across the four sample replicates. In the 28S data, the number of benthic invertebrate families detected was also highest in samples taken at 60 meters in the center of the bank; on average 14.8 ± 0.5 benthic families were detected. Anthozoan coral eDNA was nearly entirely restricted to depths in the thermocline, except for the detection of octocoral eDNA in one sample replicate at a depth of 40 meters.
Considering samples taken only at depths in the thermocline, the sequence abundances of 4 and 17 benthic invertebrate families were significantly correlated with sampling depth in the 18S and 28S data, respectively (Figure 6A). In the 18S data, these taxa included 1 family of demosponges, 1 family of gastropods, and 1 family of bivalves with sequence abundances negatively correlated with sampling depth (i.e. more abundant at shallower depths below the surface mixed layer). The sequence abundance of 1 family of gastropods was significantly, positively correlated with sampling depth. In the 28S data, the sequence abundances of 13 families of anthozoans and 7 families of sponges (including calcareous sponges and demosponges) were correlated with sampling depth. The sequence abundances of the majority of differentially abundant scleractinian coral families (3 of 4: Agariciidae, Astrocoeniidae, and Montastraeidae) were significantly, negatively correlated with greater sampling depths below the surface mixed layer. On the contrary, the sequence abundances of all seven differentially abundant octocoral families were significantly, positively correlated with greater sampling depth. Black corals were evenly split; sequence abundances of antipathids were negatively correlated with greater sampling depths while sequence abundances of aphanipathids were positively correlated with greater sampling depths. All differentially abundant families of sponges exhibited sequence abundances negatively correlated with greater sampling depth, as did sea anemones in the family Sagartiidae.
At depths in the thermocline, eDNA from benthic invertebrate families also decreased in diversity and relative abundance with distance from the center of Bright Bank. In samples taken away from the center of the bank and in the water column at 60 meters depth, the average number of benthic invertebrate families detected in the 18S data was just 3.2 ± 1.6 at distances 1.5 NM from the center and 1.5 ± 1.2 at distances 3 NM from the center (Figure 7). The sequence abundances of 25 benthic invertebrate families were significantly, negatively correlated with distance from the center of the bank (Figure 6B). The sequence abundances of just 1 family of polychaetes were significantly, positively correlated with distance from the center of Bright Bank.
As in the 18S data, the number of benthic families detected in the 28S data at depths of 60 meters also decreased with increasing distance from the center of Bright Bank. At distances 1 NM from the center of the bank, on average 9 ± 1 families were detected; at distances 1.5 NM from the center of the bank on average 4.8 ± 1.9 families were detected, and 3.0 NM from the center of the bank on average 1 ± 0.7 families were detected (Figure 7). The number of anthozoan coral genera detected, which comprised a large proportion of sequencing reads from benthic families in the 28S data, also decreased with distance from the center of Bright Bank in samples collected at 60 meters. At the center of the bank on average 6 ± 0.8 genera were detected; at distances 1 NM from the center of the bank on average 4.3 ± 0.6 genera were detected; at distances 1.5 NM from the center of the bank on average 2 ± 1.0 genera were detected; and at distances 3.0 NM from the center of the bank on average just 0.3 ± 0.5 genera were detected. In the 28S data the sequence abundances of 17 benthic invertebrate families, exclusively anthozoan corals and sponges, were significantly, negatively correlated with distance from the center of Bright Bank (Figure 6B).
3.7 Horizontal transport distances of coral eDNA from Bright Bank
In the 28S data, but not in the 18S data, eDNA from black corals was detected in all three sample replicates taken at 60 meters depth in the water column 1.0 and 1.5 NM to the east, southeast and southwest of Bright Bank (Figure 8). In fact, in samples taken 1.5 NM from the center of Bright Bank to the southeast, the relative abundances of black coral eDNA (20.0 ± 3.1%) were higher than in samples taken at the center of Bright Bank near the seafloor, where black corals occur (16.1 ± 6.8%). Repeated detections of coral eDNA in samples taken in the water column were not limited to depths of 60 meters. At 80 meters depth, 1.5 NM to the southeast of Bright Bank, we also detected eDNA from black corals and octocorals in the family Ellisellidae (genera Ellisella and Nicella) in all three sample replicates and at substantial total relative abundances (14.7 ± 10.1%).
Assuming that vertical transport was negligible due to water column stratification, we measured the minimum horizontal transport distances of coral eDNA to these three stations where coral eDNA was repeatedly detected. In samples taken 1.0 NM to the east at 60 meters, the 60 meter isobath is at minimum 490 meters away; in samples taken 1.5 NM to the southeast the 60 meter isobath is at minimum 1,460 meters away; in samples taken 1.5 NM to the southwest the 60 meter isobath is at minimum 1,110 meters away; and in samples taken 1.5 NM to the southeast at 80 meters, the 80 meter isobath is at minimum 480 meters away.
4. Discussion
4.1 Vertical eDNA transport is limited in a stratified water column
Here we present further evidence supporting the hypothesis that physical stratification in the water column hinders the vertical transport of eDNA between depth layers (Jeunen et al., 2019; Allan et al., 2021; Govindarajan et al., 2021; Monuki et al., 2021; Hoban et al., 2023; Robinson et al., 2023). In the 18S and 28S data, eDNA in samples collected in versus below the surface mixed layer were highly dissimilar based on the presence and absence of animal ASVs. Furthermore, eDNA from zooplankton, including copepods and siphonophores, and numerous benthic invertebrates, like corals and sponges, was significantly, differentially abundant between the surface mixed layer and thermocline below.
In both the 18S and 28S data, we detected eDNA from benthic families in samples taken in the surface mixed layer, but these detections were largely spurious and comprised a small proportion of the total sequencing reads in these samples. In the 18S data, the most abundant of these benthic families detected in the surface mixed layer were opheliid polychaetes (at maximum nearly 10% of the sequencing reads across 7 samples at 40 meters deep), but this detection may plausibly reflect spawning aggregations of epitokes (Maciolek & Blank, 2006). Otherwise, the remaining benthic families detected in the surface mixed layer in the 18S data were present in few sample replicates and at low relative abundances. These families include capitellid polychaetes, fireworms (Family: Amphinomidae), nudibranchs (Family: Chromodorididae), file shells (Family: Limidae), and nemerteans (Family: Lineidae). In the 28S data, eDNA from anthozoans was detected in three sample replicates taken in the surface mixed layer, from a zoanthid (Family: Parazoanthidae), an anemone, and an octocoral. Surprisingly, the zoanthid was detected at a high relative abundance, more than 20% of the sequencing reads in the sample.
Detections of benthic invertebrates in the surface mixed layer could represent the transport of eDNA either as waste from predators that feed near the benthos and then traverse the thermocline (e.g. fish) or as swimming larvae. The latter possibility remains a notable point of uncertainty and raises the question of whether larvae should be considered eDNA and can be practically distinguished from other forms of “real” eDNA. Size fractionation, by using a prefilter of a large pore size to remove larger eDNA particles including larvae, seems to be the most reasonable approach to remove larvae from water samples. However, using a prefilter would potentially come at the cost of losing eDNA in the form of particulates or other larger size fractions. Currently, the potential presence of larvae in marine eDNA sequencing data is one factor that may complicate the interpretation of species occurrence records using eDNA analysis.
Below the surface mixed layer, we found evidence that water column stratification further limited vertical transport; the sequence abundances of benthic invertebrate families were correlated with depth in samples taken in the thermocline. Among these families in the 28S data were several families of octocorals as well as black corals in the family Aphanipathidae, which were positively correlated with greater sampling depth. The differential abundances of these families were reflected in the detection of a high diversity of octocorals and the detection of aphanipathid black corals in samples taken at depths > 90 meters and close to the seafloor. Coral genera that were not detected at shallower depths included Callogorgia (Order: Scleralcyonacea), as well as Elatopathes and Aphanipathes (Family: Aphanipathidae). In video collected during ROV dives, these three genera have only been observed at depths greater than 100 meters, reflecting the depths where they were detected in the eDNA data.
4.2 Evidence for the horizontal transport of coral eDNA over substantial distances in the 28S data
The distribution of eDNA shed from benthic invertebrate families was not only constrained by depth, but also by distance from the center of Bright Bank, where measurements of the diversity of benthic families were highest. At the center of Bright Bank near the seafloor, a diverse community of benthic invertebrates including polychaete worms, bryozoans, ascidian tunicates, anthozoan corals, molluscs, and sponges was detected in all four sample replicates in the 18S data. With increasing distance from the center of Bright Bank, the diversity of benthic invertebrate families detected in the water column at depths equivalent to the seafloor (∼60 meters) decreased precipitously. The average number of benthic families detected decreased ∼8-fold within 1.5 NM of the center of the bank and ∼18-fold within 3 NM of the bank center, coinciding with significant decreases in the abundance of eDNA from several benthic invertebrate families. In the 28S data, the diversity of benthic invertebrates was comparably low (∼15-fold less) at distances 3.0 NM from the center of Bright Bank, and several calcareous sponges and anthozoan corals were differentially abundant with distance from the center of the bank. However, the decrease in diversity over distances of 1.5 NM was less drastic than in the 18S data; the average number of benthic families detected decreased just ∼3-fold within 1.5 NM of the center of the bank.
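The fold decreases quoted above follow directly from the mean numbers of benthic families reported in Section 3.6 for samples taken at 60 meters. A minimal Python sketch of this arithmetic is given below. The variable and function names are hypothetical; the mean values are those reported in the Results.

```python
# Mean number of benthic families detected at 60 meters, keyed by distance (NM)
# from the center of Bright Bank, as reported in Section 3.6.
mean_families = {
    "18S": {0.0: 26.8, 1.5: 3.2, 3.0: 1.5},
    "28S": {0.0: 14.8, 1.0: 9.0, 1.5: 4.8, 3.0: 1.0},
}

def fold_decrease(center_mean, distant_mean):
    """Ratio of the mean at the bank center to the mean at a distant station."""
    if distant_mean <= 0:
        raise ValueError("distant mean must be positive")
    return center_mean / distant_mean

fold_decreases = {
    marker: {
        distance: fold_decrease(means[0.0], mean)
        for distance, mean in means.items()
        if distance > 0
    }
    for marker, means in mean_families.items()
}

for marker, by_distance in fold_decreases.items():
    for distance, value in by_distance.items():
        print(marker, distance, round(value, 1))
```

These ratios compare means only and do not account for the variation among sample replicates.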
Detections of coral eDNA in substantial relative abundances in the water column at distances up to 1.5 NM from the center of the bank most likely reflect the horizontal transport of eDNA from corals at Bright Bank in the east, southeast, and southwest directions. Vertical transport does not easily explain these data. The bottom depth at these stations 1 and 1.5 NM from Bright Bank exceeds 90 meters (30 meters deeper than where these samples were taken), and neither these data nor data from our previous work have demonstrated the vertical transport of coral eDNA upwards against the thermocline over tens of meters in depth (McCartin et al., 2023). It is also not surprising that putatively transported eDNA would be shed from black corals in the family Antipathidae (Stichopathes spp. and Antipathes spp.). In eDNA samples taken from ROV dives at the seafloor at Bright Bank, and more broadly across the northern Gulf of Mexico, eDNA from Stichopathes spp. and Antipathes spp. is consistently more abundant than eDNA from any other coral taxon at mesophotic depths (McCartin et al., 2023). If these corals shed the highest concentrations of eDNA as compared to other corals, it follows that eDNA from these corals would be detected the farthest away from their sources. In samples taken 1.5 NM to the southeast of Bright Bank at 80 meters depth we also detected eDNA from ellisellid octocorals (Ellisella spp. and Nicella spp.). The detection of ellisellid coral eDNA at this depth is also not surprising. In the large volume sample taken near the seafloor at a depth of 84 meters (station “Swiftia”), eDNA from ellisellid octocorals was also detected in substantial relative abundances in addition to eDNA from black corals. Thus, it is likely that coral eDNA detected in samples at 80 meters deep 1.5 NM southeast of Bright Bank originated from similar depths at the bank.
Under the assumption that vertical transport was negligible, we conservatively estimate that coral eDNA must have been transported up to ∼1.5 km from the seafloor at Bright Bank to reach these sampling locations. These measurements are in line with those from a previous study that quantified the distribution of eDNA from a caged fish in two directions using qPCR (Murakami et al., 2021). In that study, water temperature was nearly the same as in our study (∼ 20°C), meaning that eDNA degradation rates were likely also similar (McCartin et al., 2022).
One substantial difference between their study and ours is that they detected exponentially lower concentrations of eDNA 1 km from the caged fish, while we detected coral eDNA at higher relative abundances at distances 1.5 NM from the center of Bright Bank. However, this discrepancy can be explained by differences between qPCR and metabarcoding. eDNA metabarcoding is not quantitative like qPCR, and the relative abundances of different taxa in a sample are dependent on other taxa in that same sample. Thus, while the detection of eDNA from a taxon in multiple sample replicates at substantial relative abundances can be interpreted as having a high probability of being a real detection, uncorrected relative abundances in metabarcoding data should not be conflated with eDNA concentrations measured using qPCR.
Considering eDNA transport at the kilometer scale in metabarcoding data may be surprising. However, a back of the envelope calculation using eDNA persistence times from experimental studies and current speeds in the area reveals that the transport distances we observed are in fact entirely plausible. Using the model in McCartin et al. (2022), at 22°C, the temperature at 60 meters at Bright Bank, we expect eDNA concentrations to decrease by three orders of magnitude in four days (3.99 exactly). A decrease of three orders of magnitude is typical of the decrease from starting concentrations of experiments to concentrations near the limits of detection of qPCR. Thus, the following estimate assumes shed concentrations of eDNA from black corals are three orders of magnitude higher than the limit of detection of our 28S metabarcoding protocol, which is admittedly unknown. Hindcasted currents at 60 meters depth at Bright Bank over the course of sampling during our study exceeded speeds of 0.1 m/s or 0.36 km/hour. Considering the underlying assumptions and using this value as a conservative estimate for current speed, eDNA may be transported up to 8.64 kilometers per day in a straight-line distance. Thus, over four days it may be transported nearly 35 kilometers at detectable concentrations. Of course, these straight-line distances, which consider neither diffusion nor meandering current patterns, are overestimates; however, these calculations demonstrate that eDNA transport over the 1.5 km distance that we observed should not come as a surprise.
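The arithmetic behind this estimate can be reproduced with the short Python sketch below. The function names are hypothetical. The current speed (0.1 m/s), persistence time (four days), and maximum observed minimum transport distance (1,460 meters) are taken from the text, and the calculation carries the same simplifying assumption of straight-line advection at constant speed.

```python
def straight_line_transport_km(speed_m_per_s, days):
    """Distance advected in a straight line at constant speed, in kilometers."""
    km_per_hour = speed_m_per_s * 3600 / 1000
    return km_per_hour * 24 * days

def hours_to_travel(distance_km, speed_m_per_s):
    """Time needed to cover a distance at constant speed, in hours."""
    return distance_km * 1000 / speed_m_per_s / 3600

current_speed = 0.1     # m/s, hindcast speed at 60 meters depth
persistence_days = 4    # time for a three-orders-of-magnitude decrease at 22 °C

per_day = straight_line_transport_km(current_speed, 1)                  # 8.64 km
maximum = straight_line_transport_km(current_speed, persistence_days)   # 34.56 km
travel_time = hours_to_travel(1.46, current_speed)                      # about 4 hours

print(round(per_day, 2), round(maximum, 2), round(travel_time, 1))
```

Under these assumptions, the longest minimum transport distance we measured would be covered in roughly four hours, a small fraction of the assumed four-day persistence time.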
4.3 Comparison between expected current patterns and directions of coral eDNA transport
In the Gulf of Mexico, circulation patterns are driven by the protrusion and contraction of the Loop Current, riverine input, and the associated formation of mesoscale (twenty to hundreds of km in diameter) and submesoscale (< 10 km in diameter) eddies (Lugo-Fernández, 1998; Bracco et al., 2019). Typically, in September, currents in the upper 100 meters at East and West Flower Garden Banks (shelf-edge banks ∼30 and ∼50 km west of Bright, respectively) flow predominantly westward along the shelf edge in the absence of eddies (Lugo-Fernández, 1998). Hindcasted currents at 60 and 80 meters depth at Bright Bank over the course of sampling during our study were consistently to the west and south and weakened over time (Figure S2).
We observed horizontal eDNA transport at distances up to 1.5 NM to the south and west, which is in line with these expected current patterns. On the other hand, detection of eDNA at the sampling site 1.0 NM east of Bright Bank is not immediately intuitive. However, there is a pinnacle as shallow as 60 meters to the north of our sampling location 1.0 NM to the east of the center of Bright Bank. Thus, it is plausible that the benthic eDNA detected there did not originate from the center of Bright Bank, but rather from this shallow feature. Discrepancies between the exact direction of eDNA transport and hindcasted current patterns likely exist due to the low resolution of the HYCOM data (1/25°). Modeling circulation at a sufficient resolution is a computationally intensive problem that is beyond the scope of this study (A. Bracco, personal communication). Complex interactions with the bathymetry of Bright Bank at small spatial scales may drive current patterns in ways that go undetected in models at coarse resolutions. Making a probabilistic map of hindcasted eDNA source locations at spatial scales of tens to hundreds of meters is impossible with currently available ocean circulation models. Thus, accurately constraining the position of eDNA sources at the same resolution as data from conventional methods like ROV surveys (meters to tens of meters with good navigation fixes) is a tall task. Rather than expecting eDNA to replace conventional survey techniques, leveraging the complementary natures of eDNA sequencing and conventional approaches while considering the inherent limitations of each (in the case of eDNA sequencing, constraining the exact position of source organisms) will prove to be a more fruitful approach.
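To put the 1/25° model resolution in physical terms, the following sketch converts it to an approximate grid spacing in kilometers. It assumes a spherical Earth with a mean radius of 6371 km and a latitude of about 28°N, which is our rough assumption for the shelf-edge banks and is not a value reported above.

```python
import math

EARTH_RADIUS_KM = 6371.0       # mean Earth radius (spherical approximation)
KM_PER_NAUTICAL_MILE = 1.852   # exact by definition


def grid_spacing_km(resolution_deg, latitude_deg):
    """Approximate north-south and east-west grid spacing for a lat/lon grid."""
    km_per_degree = math.pi * EARTH_RADIUS_KM / 180.0
    north_south = resolution_deg * km_per_degree
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


# 1/25 degree resolution (from the text); 28 N is an assumed latitude.
ns_km, ew_km = grid_spacing_km(1 / 25, 28.0)
transport_km = 1.5 * KM_PER_NAUTICAL_MILE

print(f"grid spacing: {ns_km:.2f} km (N-S), {ew_km:.2f} km (E-W)")
print(f"1.5 NM = {transport_km:.2f} km")
```

Under these assumptions a single grid cell spans roughly 4 km on a side, which is larger than the 1.5 NM (about 2.8 km) transport distance discussed above. Directional differences at the scale of the sampling design would therefore fall within one or two model cells.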
4.4 eDNA sequencing with taxonomically specific primers is sensitive and complements data generated using primers that broadly amplify marine invertebrates
In this study, we compare data generated through eDNA metabarcoding using a primer set designed to amplify a barcode in the V9 region of 18S in all animals, and more broadly eukaryotes, and data generated using a primer set designed to specifically amplify a barcode in 28S in anthozoan corals that is also complementary to several other marine invertebrate taxa. These two datasets provide complementary views of the pelagic and benthic invertebrate community.
The 18S data are composed of eDNA from a diverse suite of 86 invertebrate orders distributed among 14 different phyla. Of these taxa, just 17 orders and 7 phyla are also included in the 28S data. However, the 28S primers detected eDNA from 10 orders not detected in the 18S dataset, including octocorals (Orders Malacalcyonacea and Scleralcyonacea). Further, using the 18S primer set, the relative abundance of eDNA from anthozoan corals was extremely low; at most, less than 0.2% of the animal eDNA in any 2 L eDNA sample was classified to Anthozoa. Even in the ∼30 L samples taken close to the seafloor, the relative abundance of coral eDNA did not exceed 3%. By design, the 28S libraries were comparatively enriched in eDNA from anthozoan corals; in the 2 L samples, at most nearly one fourth (24.6%) of the animal eDNA in any sample was classified to Anthozoa. In the ∼30 L samples taken close to the seafloor, the maximum relative abundance of coral eDNA was nearly 50%. The prevalence of eDNA from calcareous sponges in the 28S data (more than half of the eDNA in a sample) was not expected, but is perhaps not surprising given that the primers match 28S sequences from some sponges (McCartin et al., 2023).
By process of elimination, we propose that the low prevalence of coral eDNA across samples using the general 18S primers is due to competition with DNA templates from other, more abundant taxa. We aligned 18S sequences of anthozoans in the MetaZooGene database and found that of the 91 anthozoan sequences that contained the amplified V9 barcode, the primers were complementary, with two or fewer mismatches, to 84 of the sequences, including those of black corals, scleractinians and octocorals. Therefore, it is unlikely that we failed to detect coral eDNA using this marker because the primers did not complement coral eDNA templates in our samples or because we did not have reference 18S barcodes. We fully captured sequence diversity in the libraries generated using the 18S primers by conducting three separate runs on an Illumina MiSeq with all samples included in each run. Thus, it is also unlikely that the failure to detect anthozoan ASVs was due to insufficient sequencing depth. Competition with other eDNA templates and the resulting failure to detect the less abundant coral eDNA is likely due to two processes that work in tandem: 1) a biological reality in which coral eDNA is scarcer than eDNA from other taxa in the sample and 2) amplification bias, wherein over the course of the PCR reaction the proportion of eDNA from more abundant taxa increases with each cycle, since eDNA from abundant taxa is more likely to amplify during earlier cycles (Kelly et al., 2019).
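One way to see how amplification bias can compound an initial scarcity is a simple exponential amplification model, in which the number of amplicons for a taxon is proportional to its starting template multiplied by (1 + efficiency) raised to the number of cycles. This is a simplified sketch and not the analysis performed in this study. The starting proportions, per-cycle efficiencies, and cycle number below are hypothetical and were not measured here.

```python
def amplicon_proportions(start_proportions, efficiencies, cycles):
    """Proportions after PCR under a simple exponential amplification model."""
    amplified = {
        taxon: start_proportions[taxon] * (1.0 + efficiencies[taxon]) ** cycles
        for taxon in start_proportions
    }
    total = sum(amplified.values())
    return {taxon: value / total for taxon, value in amplified.items()}


# Hypothetical inputs (illustrative only): coral is rare to begin with and
# amplifies slightly less efficiently than the rest of the template pool.
start = {"coral": 0.05, "other_taxa": 0.95}
efficiency = {"coral": 0.80, "other_taxa": 0.90}

after_pcr = amplicon_proportions(start, efficiency, cycles=35)

print(f"coral before PCR: {start['coral']:.2%}")
print(f"coral after PCR: {after_pcr['coral']:.2%}")
```

With these invented values, a template that starts at 5% of the pool falls below 1% of the amplicons after 35 cycles. If efficiencies were equal, the proportions would be unchanged under this model, so the sketch only illustrates how a modest per-cycle disadvantage can further reduce an already scarce template.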
4.5 The spatial resolution of biodiversity measurements and species detections from eDNA metabarcoding data depends on PCR primer choice
Understanding the expected distances of horizontal and vertical eDNA transport in the ocean is complicated yet necessary for constraining the spatial resolution of eDNA metabarcoding data. Here, we leveraged an isolated source of offshore benthic invertebrate eDNA in the Gulf of Mexico to constrain the expected vertical and horizontal distribution of marine eDNA. In agreement with several other studies, we found that vertical stratification in the water column limits transport across depth layers. In the horizontal dimension, while the diversity of eDNA from benthic invertebrates and the modeled abundances of eDNA sequences decreased with distance from the source, we also found strong evidence for the horizontal transport of eDNA over substantial distances, up to nearly 1.5 km. From a management perspective, this transport resulted in the detection, outside of sanctuary boundaries, of eDNA from corals that likely originated inside the Flower Garden Banks National Marine Sanctuary.
We confidently identified this horizontal transport of coral eDNA at substantial distances using primers that are taxonomically specific to coral 28S but not in data generated using primers that broadly target eukaryotes. Using taxonomically specific primer sets, such as the MiFish 12S primers and the 28S-Anth primers used in this study, is a sensitive approach that also permits the accurate classification of eDNA to the genus and species levels (Miya et al., 2015; McCartin et al., 2023). Further, species-specific quantitative assays (e.g. qPCR and droplet digital PCR) permit the accurate quantification of very low concentrations of eDNA with high specificity.
These characteristics are inherently beneficial for the respective purposes of these methods. However, here we show that the taxonomic specificity of PCR primers has a demonstrable effect on the spatial distribution inferred from eDNA metabarcoding data. With increasing taxonomic specificity of PCR primers, sensitivity of detection and certainty in taxonomic identity are higher, but inferring the exact location of an animal detected in data generated using specific primers is more challenging. Pairing primers that target barcodes in a diversity of taxa (e.g. the 18S primers herein or the Leray primers to COI; Leray et al., 2013) with taxonomically specific primers for taxa of interest (e.g. fish, corals) is a comprehensive approach for biodiversity characterization at the community scale. However, inherent differences in the spatial resolution of biodiversity measurements and species detections inferred from data using each primer set must be considered to integrate these data for conservation and natural resource management.
Acknowledgements
We thank the expedition science team: Dr. Dana Yoerger, Justin Fujii, Eric Glidden, Rene Francolini and Katie Foley as well as the captain and crew of R/V Manta. We specifically thank Jimmy MacMillan for his assistance with Niskin bottle rosette deployments. We thank Weihua Wang from the Genomics Research Core at the University of Illinois Chicago, and Dr. Stefan Green, Ashley Wu, and Cecilia Chau from the Genomics and Microbiome Core Facility at Rush Medical University for library preparation and sequencing. We thank Dr. Wynn Meyer for her helpful comments on a draft of this manuscript. We also thank Dr. Allen Collins and Annemarie Wood for their helpful discussion regarding the interpretation of the zooplankton detections in the 28S data.
This research was funded by the National Oceanic and Atmospheric Administration’s Oceanic and Atmospheric Research, Office of Ocean Exploration and Research, under award NA18OAR0110289 to SH and JMM at Lehigh University. This research is part of the Woods Hole Oceanographic Institution’s Ocean Twilight Zone Project, funded as part of The Audacious Project housed at TED.