Modelling marine larval dispersal: a cautionary deep-sea tale for ecology and conservation

Larval dispersal data are increasingly sought after in ecology and marine conservation, the latter often requiring information under time-limited circumstances. Basic estimates of dispersal are often used in these situations, acknowledging their oversimplified nature. Larval dispersal models (LDMs) are now becoming more popular and may be a tempting way of refining predictions, but prior to targeted groundtruthing their predictions are of unknown worth. This case study uses deep-sea LDMs to compare predictions of dispersal. Two LDMs driven by different example hydrodynamic models are compared, along with an informed estimate based on mean current speed and planktonic larval duration (PLD), to provide insight into predictive variability. LDMs were found to be more conservative in dispersal distance than an estimate. This difference increased with PLD, which may result in a bigger disparity for deep-sea species predictions. Although LDMs were more spatially targeted than an estimate, the two LDM predictions were also significantly different from each other and would result in contrasting advice for marine conservation. These results show a greater potential for model variability than previously appreciated by ecologists and strongly advocate groundtruthing predictions before use in management. Advice is offered for improved model selection and interpretation of predictions.


Introduction
Larval dispersal is an important ecological process. Many benthic animals rely upon this phase as their only means to colonise a new area, making the process pivotal in individual survival as well as in population dynamics and persistence.

Existing global efforts to establish networks of Marine Protected Areas (MPAs) are hampered without knowledge of larval dispersal. An effective self-sustaining network needs each MPA to supply larvae to both itself and another for protected populations to persist [1], something that will only be achieved by chance without dispersal data to base informed decisions upon. It is therefore imperative that we gather information on larval dispersal as soon as possible.

The most basic way to fulfil this need is to estimate larval dispersal using a distance/speed/time calculation based on average current speeds and planktonic larval durations (PLDs). This technique, hereafter termed "an estimate", while highly simplistic, takes very little time, money, effort, or expertise to produce. Consequently, estimates have been used both in ecology (e.g. [2]) and conservation (e.g. [3]), although always acknowledging their oversimplified nature and the need for more detailed study.
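The distance/speed/time calculation can be sketched in a few lines. This is only an illustration of the arithmetic: the function name and the PLD values are hypothetical, and the 0.1 m s⁻¹ speed is the averaged current speed this study adopts for its estimate.

```python
SECONDS_PER_DAY = 86_400


def estimate_dispersal_km(mean_speed_m_s: float, pld_days: float) -> float:
    """Distance covered at a constant speed and heading over the whole PLD."""
    return mean_speed_m_s * pld_days * SECONDS_PER_DAY / 1000.0


# Hypothetical PLDs (days); 0.1 m/s is the speed used for the estimate in this study.
for pld_days in (10, 30, 100):
    distance_km = estimate_dispersal_km(0.1, pld_days)
    print(f"PLD {pld_days:>3} d -> {distance_km:.1f} km")
```

Because the distance scales linearly with PLD and assumes the current never changes direction, the result is best read as an upper bound on displacement rather than a likely dispersal distance.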

Among the more advanced methods that exist for identifying dispersal patterns [4], larval dispersal models (LDMs) are gaining popularity in ecology and conservation (e.g. [5,6]). An LDM is a simulation of dispersal driven by a numerical hydrodynamic model to produce maps predicting which populations may be linked. As a simulation it doesn't require expensive and difficult-to-obtain biological samples beyond knowing initial positional information, but it can integrate other biological data (e.g. larval behaviour, mortality, or buoyancy) should it be available (see [7]). Furthermore, the LDM can later be assessed and improved by groundtruthing with other sample-requiring methods should they become available (e.g. geochemical tracers and population genetics [4]). The ability to "provide an answer now" without requiring the time, money, and effort for additional sampling makes the LDM method particularly attractive for marine conservation's urgent needs, especially in the deep sea [8].

However, it is well acknowledged that despite the specialist skills needed to produce an LDM, their quality and accuracy may be highly variable. Poor bathymetry, temporal and spatial averaging, a lack of sub-mesoscale processes, and unknown or estimated biological parameters can all add to the error included within an LDM [9]. The true extent of such (often unavoidable) predictive inaccuracies will always remain elusive until groundtruthing and validation can take place: essential steps in any modelling process.

Once groundtruthed, the worth of these models can be quantified, but there remains a question as to how useful un-groundtruthed LDM predictions are, and whether they should be used in preliminary management decisions. If the errors in such un-groundtruthed predictions may be large, then perhaps the crude, but fast and less expertise-demanding, back-of-the-envelope estimates may be just as useful.

Shanks [10] did examine the difference between estimated and modelled predictions of dispersal distance while exploring the influence of PLD on dispersal. He found the estimate to be the least conservative prediction (an overestimate), with an LDM being up to an order of magnitude more conservative. However, the LDM also overestimated the predicted distance of dispersal when compared with those approximated from genetic data.

Shanks's study focused on shallow-water and coastal species, which are concentrated in areas of arguably more complex hydrodynamics and faster current speeds than the deep sea. There is therefore potential for a greater similarity between estimated and modelled dispersal predictions if a similar study were focused on deep water.

When assessing the stability of model predictions without new sampled validation data, one approach often used in other ecological modelling disciplines is a model comparison (e.g. [11,12]). It stands to reason that if all different models are trying to represent reality, there should be some similarity in their predictions, provided that their assumptions are suited to the task at hand.

Exploring the differences and similarities between models promotes a greater understanding of which variables control predictions and where previously unexplored sources of error may lie.

As an ecologist running an LDM, the selection of a hydrodynamic model to power simulations is the

Despite the glut of options, model choice will be restricted first by study location and finding suitable parameterisation (e.g. see advice from [9,13,14]), but also by access (e.g. proprietary issues). Deep-sea studies, for example, due to the distance from shore and large spatial scales, are likely to be limited to global circulation models (GCMs), shelf models, and occasional custom-built models from local observations (which carry their own limitations, see [14]).

At the end of the model selection process you may be faced with only a couple of imperfect but differently (potentially) suitable models that are hard to choose between. Allowing for parameterisation differences, a comparison of the dispersal predictions obtained from two such hydrodynamic models must logically display some difference. The question is whether the difference is negligible, and therefore potentially cross-validating, or substantial, making groundtruthing absolutely necessary before either model prediction has value.

The need to source additional data to confirm or reject model predictive ability should be considered mandatory regardless of the result, but if models are found to broadly agree they would provide a first level of validation for each other and therefore allow meaningful research output before additional (in the deep sea, potentially considerable) groundtruthing costs are outlaid.

This study will therefore investigate:
2) The difference between the predictions of LDMs driven by two different hydrodynamic models, each selected as potentially suited to larval dispersal simulations in the study area.

By doing so this study aims to approximate the value of an un-validated LDM in relation to an estimate and discover whether a multi-modelling approach has the potential for cross-validating reassurance in predictions prior to formal groundtruthing. Note that this study does not offer a formal validation or criticism of either model, nor does it seek to recommend one over the other, instead it aims to highlight the difference and similarity between two example models to offer

The results of this study should be beneficial to both ecologists and marine managers in all marine settings. The hope is to better inform those looking to use LDMs in the future and to enable a responsible interpretation of their predictions.

Study Area
This study was conducted in the NE Atlantic in offshore deep water west of the UK and Ireland (Fig. 1). The Rockall Trough is one of the best studied areas of deep sea in the world, providing historic datasets for at least a qualitative groundtruthing of predictions [15,16]. Arguably this area is not typical of the deep sea due to the rapidly changing bathymetry in the presence of banks and seamounts, something which can cause greater uncertainty in hydrodynamic model predictions than a flat abyssal plain [9]. This could however make for a fairer comparison to complex shallow-water and coastal hydrodynamics and also promotes a greater similarity to estimate predictions, which represent a null model of maximal uncertainty and spreading of larvae.

[17]. This study will use the lower averaged current speed (0.1 m s⁻¹) as the estimate, after Ellett,

Two LDMs were run in this study, each consisting of a single particle simulator paired with one of the two hydrodynamic models; additional details on all model algorithms and parameterisations are available in Supplementary Material S1.

Particle simulator: Connectivity Modeling System (CMS)
The CMS was used as the particle simulator (hereafter "simulator") for both LDMs. There are many

While it is easy to integrate biological data, this study uses the simulator in its simplest configuration, simulating passive dispersal, for the cleanest comparison. An hourly particle tracing timestep was used as decided by model sensitivity testing [23], and positional outputs were recorded daily.
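To make the passive configuration concrete, the following is a minimal sketch of a particle advected with an hourly timestep and daily recorded positions. It is not the CMS algorithm: the forward Euler step, the flat Cartesian coordinates in metres, and the uniform velocity field are all simplifying assumptions, and every name is hypothetical.

```python
def track_particle(x0_m, y0_m, velocity, days, dt_s=3600.0, record_every_s=86_400.0):
    """Advect one passive particle and return its recorded (x, y) positions.

    velocity(x, y, t) must return (u, v) in m/s; this stands in for the
    hydrodynamic model output that a real simulator would interpolate.
    """
    x, y = x0_m, y0_m
    positions = [(x, y)]  # release position
    steps_per_record = int(round(record_every_s / dt_s))  # 24 hourly steps per day
    for step in range(1, int(days * steps_per_record) + 1):
        u, v = velocity(x, y, (step - 1) * dt_s)
        x += u * dt_s  # forward Euler step (assumed here for brevity)
        y += v * dt_s
        if step % steps_per_record == 0:
            positions.append((x, y))  # daily positional output
    return positions


def uniform_flow(x, y, t):
    """Hypothetical steady eastward current of 0.1 m/s."""
    return 0.1, 0.0


track = track_particle(0.0, 0.0, uniform_flow, days=2)
print(track)
```

With no diffusivity or behaviour added, the track is fully determined by the release position and time, so two particles released together at the same point would trace the same path.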

- a fact which might recommend it above other models in this area [24,25] and has been extensively validated over the UK surrounding waters [26]. The 1/6° x 1/9° (c. 12 km²) resolution offers an eddy-resolving solution, however it can only capture major eddies (c. 64 km in size based on needing six or more data points to adequately resolve an eddy [27]), making this the coarser of the two models trialled. The model was run with 40 terrain-following depth layers (sigma-levels), although outputs were interpolated to a z-level format (a list of set depth levels) using Matlab (v.R2013a) in order to make them compatible with the CMS. POLCOMS has been used in several dispersal studies to date (e.g. [28,29]).

HYCOM is a freely available global hydrodynamic model developed by the US Navy (www.hycom.org, [30]). It is uniquely set up to use a hybrid of water mass following, terrain following, and depth specific vertical layers, changing with the underlying topography, which may make it well suited to

therefore not validate so well on a local scale [14]. HYCOM has already been used in multiple dispersal studies [30,31,32,33], including in the deep sea [34,20,23,35].

the study undertaken by Shanks [36] in comparison to Siegel et al. [37]. As a result of excluding diffusivity, only one particle is released per day, as simultaneous releases will follow identical tracks.

The estimate was mapped as a sphere of influence buffer zone prediction with radius equal to the average predicted dispersal distance. The major limitation of an estimate prediction is that it cannot easily be extrapolated into a probabilistic spatial prediction without a method to quantify the error caused by assuming a constant current direction. All areas within the buffer zone were therefore considered as a presence-only record. The estimate prediction was therefore always equivalent to the 2D maximal possible area of occupancy.
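The presence-only buffer can be illustrated as a simple distance test. This is a sketch rather than the mapping workflow used in the study: the haversine formula on a spherical Earth, the function names, and the coordinates are all assumptions made for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def in_estimate_buffer(release, site, radius_km):
    """Presence-only rule: any site within the buffer radius counts as reachable."""
    return great_circle_km(*release, *site) <= radius_km


release = (57.0, -12.0)  # hypothetical release point (lat, lon)
radius_km = 86.4         # 0.1 m/s sustained for a hypothetical 10-day PLD
print(in_estimate_buffer(release, (57.5, -12.0), radius_km))
print(in_estimate_buffer(release, (58.0, -12.0), radius_km))
```

The rule returns only inside or outside, which is why the estimate cannot express that some directions are more likely than others.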

The prediction from each LDM was mapped as a percentage track density per grid cell occupancy in

Both in terms of distance and spatial dispersal patterns, the estimate prediction was the least conservative and specific, with both modelled predictions being considerably more retentive and spatially targeted (Fig. 1 and Fig. 2a,b).

(Fig. 1).

The difference between the estimate and the POLCOMS LDM is the most pronounced. Even though

Rosemary Bank (Fig. 3), Anton Dohrn Seamount (Fig. 4) and Porcupine Bank (Fig. 5).

Simulations from HYCOM and POLCOMS models are displayed as track densities delineating

Spatial - correlation
The correlation between the track density maps of each LDM was generally low (

Spatial - qualitative
Generally, Rosemary Bank simulations were the most dissimilar (Fig. 3). For example, while the 1,500 m Rosemary Bank simulations in HYCOM suggest connection southwards to most of the eastern flank of Rockall Bank, POLCOMS predicts a relatively small dispersal range, suggesting there may be no connection to Rockall Bank at all.

Of most concern is when the two models disagree in the direction of dispersal. In Rosemary Bank

By contrast, the results from Anton Dohrn Seamount (Fig. 4) are more similar, with all "highways" generally extending north-east towards Rosemary Bank in both HYCOM and POLCOMS simulations.

Yet if a marine manager were to ask whether larvae from Anton Dohrn reach the Darwin Mounds to the north-east, HYCOM would say 'yes' and POLCOMS would say 'no'.

Simulations from Porcupine Bank (Fig. 5) might indicate a broad agreement that larvae will eventually reach the southern Rockall Bank, but the less direct "highways" in the POLCOMS model might reduce chances of larvae getting that far.

This study explored the value of larval dispersal predictions from LDMs by considering two questions.

1) Will LDMs give a notably different result to an (informed) estimate?
Our results agree with Shanks [10] and suggest that yes, there can be a large difference in the predicted distance, area, and specificity of estimated and modelled dispersal patterns. There could therefore be a distinct advantage in going to the effort of modelling predictions, provided that models are shown to adequately approximate realistic distances better than the estimate.

This study may also suggest that for deep-sea species the difference may be even more pronounced than in shallow water. As the difference in predicted distance of dispersal increased exponentially with tracking time (up to a 12-fold difference), species with longer PLDs, such as those from the deep sea [8], may show even greater disparity between estimated and modelled predictions.

The complex topography of the Rockall Trough induces a lot of mesoscale activity [46]

whether the model is fit for purpose and recommends that study specific validation is vital, starting with a comparison to observational oceanography in the area [33]. In this study, for example, the

[48]. Note that both models will suffer from many other errors including (but not limited to) currents that are too fast due to the exclusion of tides [49], coarse bathymetry that may exclude hydrographically influential features [50], and no representation of possible benthic storms which may divert dispersal pathways [51]. Only targeted groundtruthing can quantify the error margins and clarify whether one model is more representative than the other for this purpose, and indeed they may each prove to have areas of accuracy at different depths or locations [33].

Second, spatial and temporal resolution has been shown to make a great difference in whether a

Groundtruthing
Groundtruthing should be regarded by all modellers as essential, and were the models found to be similar it would not have supplanted this necessity but could have lent some credence to modelled outputs before groundtruthing data became available. As it stands, however, the tested models could not be used interchangeably without consequence to ecological or marine management conclusions (e.g. whether Rosemary Bank was connected to south-east Rockall Bank at 1,300 m (Fig. 3)). Hence the next step must be to identify whether one model is more accurate than the other.

The variability between models also advocates the interpretation of LDM results as probabilistic (i.e. possible) rather than deterministic (i.e. true). Practically, this may be translated as looking at the high density "highways of dispersal" which had some localised consensus between models: these could be interpreted as the more likely pathways of dispersal, with lower density predictions being thought of as uncertain.
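One way to picture this reading of the outputs is to label each grid cell by how many models place a high-density pathway through it. The sketch below assumes two track density grids (in percent) on a shared grid and a cut-off for "high density"; the 1% threshold, the labels, and the toy values are hypothetical and are not taken from this study.

```python
def classify_cells(density_a, density_b, threshold_pct):
    """Label each cell by how many of the two models reach the density threshold."""
    labels = []
    for row_a, row_b in zip(density_a, density_b):
        row = []
        for a, b in zip(row_a, row_b):
            hits = (a >= threshold_pct) + (b >= threshold_pct)  # 0, 1 or 2 models
            row.append(("none", "uncertain", "consensus")[hits])
        labels.append(row)
    return labels


# Toy 2 x 2 track density grids (percent) standing in for two LDM outputs.
model_a = [[5.0, 0.5], [0.0, 2.0]]
model_b = [[4.0, 3.0], [0.0, 0.1]]
for row in classify_cells(model_a, model_b, threshold_pct=1.0):
    print(row)
```

Cells labelled "consensus" correspond to pathways both models support, while "uncertain" cells are supported by only one model and would be the first candidates for groundtruthing.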

In summary, LDMs will have a place in marine ecology and conservation and offer a great improvement on informed estimates of dispersal potential; however, the hydrodynamic models they

but future comparison to population genetics, geochemical isotope tracers, or study-targeted groundtruthing data must still be considered essential.