Removal modelling in ecology

Removal models were proposed over 80 years ago as a tool to estimate unknown population size. Although the models have evolved over time, in essence, the protocol for data collection has remained similar: at each sampling occasion attempts are made to capture and remove individuals from the study area. Within this paper we review the literature of removal modelling and highlight the methodological developments for the analysis of removal data, in order to provide a unified resource for ecologists wishing to implement these approaches. Models for removal data have developed to better accommodate important features of the data and we discuss the shift in the assumptions required to implement the models. The relative simplicity of this type of data and the associated models means that the method remains attractive, and we discuss the potential future role of this technique.

Author summary

Since the introduction of the removal method in 1939, it has been extensively used by ecologists to estimate population size. Although the models have evolved over time, in essence, the protocol for data collection has remained similar: at each sampling occasion attempts are made to capture and remove individuals from the study area. Here, we introduce the method and describe how it has been applied and how it has evolved over time. Our study provides a literature review of the methods and applications followed by a review of available software. We conclude with a discussion about the opportunities for this model in the future.

Removal models are ideally suited to estimating the abundance of invasive species as they coincide with desirable management (i.e. the reduction or eradication of populations) [20], and the method has recently been adopted as a conservation management tool, for example for mitigation translocations [12,21]. Models that use data from management actions need to account for variation in removal effort, as these data are unlikely to be standardised across events [20]. [22] showed that removal models that account for removal effort are effective at estimating abundance, particularly when removal rates are high.

The classic removal model was introduced by [23] and [24], motivated by a theory developed by [1]. This model relies on the assumptions of population closure and constant detection probability, meaning that there are no births, deaths or migration during the study and that the animals are assumed to be available for capture with the same probability throughout the study. The basic removal model results in a geometric decline in the expected number of captured individuals over time. This classic removal model is a special case of model $M_b$ for closed populations, which allows for a behavioural response to initial trapping [5].

Overview of the paper

Within this paper we have conducted a systematic literature review of removal modelling in ecology. We describe the methods applied in the systematic review and the aspects of interest. We present the results obtained from the literature analyses, highlighting the key methodological advances which have been made in this field, and a review of software which has been used to fit removal models. The paper concludes with a discussion about the future role of removal modelling in ecology.
Materials and methods

Literature Search

This systematic review followed the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) statement as a guide [25]. The bibliographic search was performed using the SciVerse Scopus (https://scopus.com), ISI Web of Science (https://webofknowledge.com), and Google Scholar (https://scholar.google.com) databases. Papers published between 1939 and the cut-off date 01 July 2019 with the terms "Removal model" or "Removal method" and "population" in the title, keywords, or abstract were included. Non-English publications, and papers reporting removal models or removal study design were retained for the analysis (Fig 1).

Features and parameters of each study were categorised and compiled into a database of removal model publications. We identified the details that were reported in the 104 reviewed publications and provided a descriptive summary of the essential details that need to be reported in published removal models (see Table 1).

February 18, 2020 2/20

Table 1. Parameters used to categorise removal models included within the database.

Parameter             Description
Year of publication   Year of publication as it appears in the final print
Reference             Author-year citation style
Species               Scientific name of the species included in the analysis
Taxonomic Group       Taxonomic group of the species included in the analysis (Class and Order)
Research goal         Aim of the study (practical or methodological)
Journal               Name of the journal where the study was published
Subject area          Research area of the journal

Synthesised Findings

The reviewed literature was published from 1939 to 2019 and, interestingly, there have historically been long gaps in publications on this topic. However, in recent years there has been a more constant stream of published papers, suggesting a resurgence of interest in the method (Fig 2). This is potentially an indication of the role of removal modelling in studies of reintroduction, especially when translocated individuals are removed from endangered populations [26], and of the adaptation of data collection protocols to adopt a "removal" design for other data types such as occupancy and distance sampling, as discussed in the section "Adapting sampling schemes using removal theory".

Early model developments

The basic principle of the removal method is that a constant sampling effort will remove a constant proportion of the population present at the time of sampling. Thus, if the total population size is $N$ and $p$ denotes the probability of capture of an individual, the expected number of captures will be given by $pN$, $p(N - pN)$ and $p[N - pN - p(N - pN)]$ for the first, second and third sampling occasions, respectively. Population estimates can be obtained either by plotting catch per unit effort of collection as a function of total previous catch (see for example [1,27,28]) or by obtaining maximum likelihood estimates [23,24,29,30].

This model can be formalised by defining the corresponding likelihood function. Using a binomial formulation, if $N_t$ individuals are still in the study at sampling occasion $t$, we can define the probability of removing $x_t$ individuals by

$$\Pr(x_t \mid N_t, p) = \binom{N_t}{x_t} p^{x_t} (1 - p)^{N_t - x_t},$$

with $N_1 = N$ and $N_t = N - \sum_{k=1}^{t-1} x_k$ for $t \geq 2$, so that the likelihood is the product of these terms over the $T$ sampling occasions:

$$L(N, p; x) = \prod_{t=1}^{T} \binom{N_t}{x_t} p^{x_t} (1 - p)^{N_t - x_t}. \tag{2}$$

Alternatively, we could specify that the $N$ individuals within the population belong to one of $T + 1$ categories: either they are captured on one of the occasions $1, \ldots, T$, or they are never captured. Let $\pi_t$ denote the probability that an individual is captured at occasion $t$,

$$\pi_t = (1 - p)^{t-1} p,$$

and let $\pi_0$ denote the probability that an individual is never captured:

$$\pi_0 = (1 - p)^{T}.$$

Let $n$ denote the number of individuals never captured, which is given by

$$n = N - \sum_{t=1}^{T} x_t;$$

then the likelihood can be expressed as:

$$L(N, p; x) = \frac{N!}{n! \prod_{t=1}^{T} x_t!} \, \pi_0^{n} \prod_{t=1}^{T} \pi_t^{x_t}.$$

Early developments of removal models were methodological in nature, to overcome issues which these days are simple to deal with because of computing power. [24] formalised the conditional binomial removal model of [23], providing an asymptotic variance of the abundance estimator. Further, they demonstrated how graphical methods can be used to estimate the parameters of capture and abundance. The likelihood of [23] was weighted with a beta prior by [31], which results in estimators with lower bias and variance.
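As an illustration of the binomial formulation above, the following Python sketch evaluates the expected catches and the removal log-likelihood, and finds maximum likelihood estimates by a grid search over integer values of N. The function names, the grid bound `max_extra` and the example catches are illustrative assumptions and are not taken from the reviewed papers. For a fixed N the sketch uses the closed-form maximiser of p, namely the total catch divided by the summed number of individuals still present at each occasion.

```python
import math


def expected_catches(N, p, T):
    """Expected catch on occasions 1..T under closure and constant p."""
    return [p * N * (1 - p) ** (t - 1) for t in range(1, T + 1)]


def removal_loglik(N, p, catches):
    """Log-likelihood of the binomial removal model for integer N, 0 < p < 1."""
    remaining, loglik = N, 0.0
    for x in catches:
        if x > remaining:
            return -math.inf  # cannot remove more animals than remain
        loglik += (
            math.lgamma(remaining + 1) - math.lgamma(x + 1)
            - math.lgamma(remaining - x + 1)
            + x * math.log(p) + (remaining - x) * math.log(1 - p)
        )
        remaining -= x
    return loglik


def fit_removal(catches, max_extra=10_000):
    """Grid search over integer N, profiling out p; returns (N_hat, p_hat)."""
    total = sum(catches)
    best_ll, best_N, best_p = -math.inf, None, None
    for N in range(total, total + max_extra + 1):
        remaining, exposures = N, 0
        for x in catches:
            exposures += remaining  # animals available on this occasion
            remaining -= x
        p_hat = total / exposures if exposures > 0 else 0.0
        if not 0.0 < p_hat < 1.0:
            continue  # boundary case, log-likelihood not evaluated here
        ll = removal_loglik(N, p_hat, catches)
        if ll > best_ll:
            best_ll, best_N, best_p = ll, N, p_hat
    return best_N, best_p


catches = [50, 25, 13]  # hypothetical three-pass removal data
N_hat, p_hat = fit_removal(catches)
print(N_hat, round(p_hat, 3))
```

When catches do not decline over the occasions, the likelihood may keep increasing with N and the search will stop at its upper bound, so an estimate equal to the bound should be treated as a warning rather than a result.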
[32] proposed an improved confidence interval for abundance for small populations, whilst [33] proved that the profile log-likelihood for the removal model is unimodal and demonstrated that the likelihood-ratio confidence interval for the population size has acceptable small-sample coverage properties. Similarly, [34] proposed a profile likelihood approach for estimating confidence intervals which showed improved performance.

Validity of model assumptions

The model assumptions required for these early models were very restrictive. Specifically, capture probability, $p$, is assumed to be constant both across all individuals and for each sampling period. [23,24,29,30] [...] capture-recapture models, however it is not possible to fit many of these to removal data, which have only one occasion of capture. [35], noting that the assumptions of the removal model are often violated, proposed the non-parametric jackknife estimator as an alternative to the removal model.

[31] proposed a standard test that combines testing for additions or deletions to the population with testing for equal catchability. The test entails examining trends in the catch vectors: "When the expected third catch as determined from the first two catches is larger than the observed third catch, emigration or a decreasing probability of capture is indicated. When the opposite condition exists, immigration or an increase in the [...]"

[36] indicates that unequal catchability tends to be the rule in biological populations. Hence testing for equal catchability is crucial unless one adapts the model. Assuming that equal catchability exists when it does not leads to underestimation of the population size. One procedure that avoids this bias is to identify subsets of the population that are equally catchable and to obtain separate population estimates for each subset [37]. However, because such subdivision of the data greatly decreases the precision of the estimate of the total population size, one should not divide the population unnecessarily if the assumption of equal catchability of all the individuals is met. Therefore, employing a test of equal catchability is a crucial step in any population analysis, even if failure to reject the null hypothesis of equal catchability is an ambiguous result [38].

Two aspects of equal catchability are important for the removal method: equal catchability among groups and equal catchability in all sampling occasions. The first is tested analogously to the test for marked and unmarked captures. If groups with different catchability are identified, separate population size estimates are made for each such group. Individual differences in catchability unrelated to a particular group membership are still possible, but [32] showed that unless these differences are great, their effect on the population estimate is small.
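The direction of the catch-trend check quoted earlier in this section can be illustrated with a short Python sketch. Under closure and constant capture probability the expected catches decline geometrically, so matching the first two catches to their expectations gives an expected third catch of x2 squared divided by x1. This moment-matching step, the function names and the example numbers are assumptions made for illustration; the sketch only compares observed and expected values and is not the formal test of [31].

```python
def expected_third_catch(x1, x2):
    """Expected third catch implied by the first two catches.

    With E[x1] = pN and E[x2] = p(1 - p)N, the ratio x2 / x1 estimates
    (1 - p), so E[x3] = p(1 - p)^2 N is estimated by x2 * (x2 / x1).
    """
    if x1 <= 0:
        raise ValueError("first catch must be positive")
    return x2 * x2 / x1


def catch_trend(x1, x2, x3):
    """Compare the observed third catch with its expectation."""
    expected = expected_third_catch(x1, x2)
    if expected > x3:
        # case described in the quoted text: emigration or a decreasing
        # probability of capture is indicated
        return "expected > observed"
    if expected < x3:
        # the opposite condition in the quoted text
        return "expected < observed"
    return "expected = observed"


print(expected_third_catch(50, 25))  # 12.5
print(catch_trend(50, 25, 5))
```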
[39] used simulation to investigate the robustness of the removal model to varying behaviours exhibited by fish.

The second assumption, that catchability remains constant in all sampling periods, can be tested by the chi-square test given by [24] or by the further tests given by [5] or [40]. Conclusions drawn from any of these tests will be accurate only if the population remains closed during sampling. Use of a barrier, if possible, is again desirable, or independent verification by sampling of marked animals [37].

The closure assumption of the removal model has been relaxed in [41], where [...] a number of groups. Within this paper, however, it is assumed that any emigration from the population is permanent; this assumption has been relaxed in [12], which presents a robust design, multilevel structure for removal data using maximum likelihood inference. The implemented hidden Markov model framework [42] allows individuals to enter and leave the population between secondary samples. A Bayesian counterpart to this robust design model is presented in [43].

Change in ratio, index-removal and catch-effort models

Change in ratio models for population size estimation are closely related to removal models [44,45]. The model requires that the population can be sub-divided into distinct population classes and that the removals are performed in such a way that the ratio of removals of the sub-populations is the same as the underlying ratio of the sub-classes within the population.

We can generalise the basic removal model likelihood function of Eq (2) by extending the definition of the parameters and the summary statistics. Specifically, suppose the population is sub-divided into $G$ mutually exclusive and exhaustive groups, and let $N(g)$ denote the unknown abundance of sub-population $g = 1, \ldots, G$. We now record $x_t(g)$ individuals of sub-population $g$ being removed at occasion $t$.
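The likelihood becomes the product, over the groups $g = 1, \ldots, G$ and the occasions $t = 1, \ldots, T$, of binomial terms of the same form as in Eq (2), with $N_t$ and $x_t$ replaced by $N_t(g)$ and $x_t(g)$, where $N_1(g) = N(g)$ and $N_t(g) = N(g) - \sum_{k=1}^{t-1} x_k(g)$ for $t \geq 2$.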
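A minimal Python sketch of a multi-group removal log-likelihood follows, assuming that removals in the G groups are independent so that the log-likelihood is a sum of group-specific binomial removal terms. Allowing a separate capture probability for each group is an assumption of this sketch (a common probability is obtained by repeating the same value), and the function name and example data are hypothetical.

```python
import math


def group_removal_loglik(N, p, catches):
    """Sum of binomial removal log-likelihoods over groups.

    N[g]          abundance of group g (integer)
    p[g]          capture probability assumed for group g (0 < p < 1)
    catches[g][t] number of group-g individuals removed on occasion t
    """
    loglik = 0.0
    for N_g, p_g, x_g in zip(N, p, catches):
        remaining = N_g  # N_1(g) = N(g)
        for x in x_g:
            if x > remaining:
                return -math.inf  # more removals than animals remaining
            loglik += (
                math.lgamma(remaining + 1) - math.lgamma(x + 1)
                - math.lgamma(remaining - x + 1)
                + x * math.log(p_g) + (remaining - x) * math.log(1 - p_g)
            )
            remaining -= x  # N_{t+1}(g) = N_t(g) - x_t(g)
    return loglik


# hypothetical data: two groups, three removal occasions
catches = [[30, 15, 8], [12, 9, 7]]
print(group_removal_loglik([60, 50], [0.5, 0.25], catches))
```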

Catch-effort models are a straightforward extension of basic removal models which allow capture probability to be related to sampling effort. If catch per unit effort declines with time, then regressing catch per unit effort on accumulated removals allows the starting population to be estimated. This approach, however, relies strongly on the assumption that if more effort is put into capturing the individuals then a higher proportion of the population will be caught; if this is not satisfied, estimators might be appreciably biased [46]. More generally, we can extend the removal likelihood of Equation 2 to define a functional form of capture probability:

$$L(N, p_1, \ldots, p_T; x) = \prod_{t=1}^{T} \binom{N_t}{x_t} p_t^{x_t} (1 - p_t)^{N_t - x_t},$$

where $p_t$ is the capture probability at occasion $t$, which can be linked to a recorded covariate of survey effort, denoted by $w_t$. Possible functional forms include $\text{logit}(p_t) = \alpha + \beta w_t$, where $\alpha$ and $\beta$ are parameters to be estimated, or a form with a single parameter $\theta$ to be estimated, which has been used for fisheries applications where $w_t$ denotes the amount of time spent fishing. Alternatively, if $w_t$ denotes the number of traps on occasion $t$ and each animal is assumed to be caught in any trap with probability $\theta$, then $p_t = 1 - (1 - \theta)^{w_t}$ [4]. Indeed, the logistic form of time-dependent capture probability can also be used to model time-variation in capture probability as a function of climatic conditions; see for example [47].
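The logit and trap-based forms of capture probability described above can be sketched in Python, together with a removal log-likelihood that accepts occasion-specific capture probabilities. The function names, parameter values and effort data below are hypothetical and chosen only for illustration.

```python
import math


def p_logit(w, alpha, beta):
    """Capture probability with logit(p_t) = alpha + beta * w_t."""
    return 1.0 / (1.0 + math.exp(-(alpha + beta * w)))


def p_traps(w, theta):
    """Capture probability with w_t traps, each catching with probability theta."""
    return 1.0 - (1.0 - theta) ** w


def catch_effort_loglik(N, p_by_occasion, catches):
    """Binomial removal log-likelihood with one capture probability per occasion."""
    remaining, loglik = N, 0.0
    for p_t, x in zip(p_by_occasion, catches):
        if x > remaining:
            return -math.inf  # more removals than animals remaining
        loglik += (
            math.lgamma(remaining + 1) - math.lgamma(x + 1)
            - math.lgamma(remaining - x + 1)
            + x * math.log(p_t) + (remaining - x) * math.log(1 - p_t)
        )
        remaining -= x
    return loglik


traps = [10, 20, 20]    # hypothetical number of traps per occasion
catches = [18, 27, 16]  # hypothetical removals per occasion
p_t = [p_traps(w, 0.02) for w in traps]
print([round(p, 3) for p in p_t])
print(catch_effort_loglik(100, p_t, catches))
```

In practice the effort parameters would be estimated jointly with N, for example by maximising this log-likelihood numerically over N and the parameters of the chosen functional form.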

When sampling is with replacement and the sampling efforts are known, [48] modelled the survey sampling process as a Poisson point process where each animal is counted at random with respect to increments of sampling effort and it is assumed that the encounter probabilities for each individual are independent. [49] propose a class of catch-effort models which allow for heterogeneous capture probabilities.

The index-removal method makes use of the decline in a measure of relative abundance due to a known removal. The relative abundance is measured in surveys before and after the removal [50]. [51] proposed an index-removal estimator which accounts for seasonal variation in detection. [...] known ratios of sub-populations, thus generalising the change-in-ratio approach.

The theory of analysing multiple types of data in an integrated model within ecology has gained traction in recent years; see for example [58]. Early ideas of combining data types can be found in the removal literature. For example, [37] proposed combining capture-recapture and removal methods for fish removals when sampling is over a limited study period, and [59] showed how the proportional trapping model can be extended to include data on non-target species.

Removal models have been presented as a class of hierarchical models; for example, [16] present a hierarchical removal model where the sites are assumed to have several distinct sub-sites located spatially. Suppose removals occur at $S$ sites, and let $x_{s,t}$ denote the number of individuals removed at site $s$ on occasion $t$; then

$$\Pr(x_{s,1}, \ldots, x_{s,T} \mid N_s, p_s) = \frac{N_s!}{n_s! \prod_{t=1}^{T} x_{s,t}!} \, \{(1 - p_s)^{T}\}^{n_s} \prod_{t=1}^{T} \{(1 - p_s)^{t-1} p_s\}^{x_{s,t}},$$

where $N_s$ denotes the abundance at site $s$, $p_s$ denotes the capture probability at site $s$ and $n_s = N_s - \sum_{t=1}^{T} x_{s,t}$. The likelihood function is then the product over the observations from the $S$ sites,

$$L = \prod_{s=1}^{S} \sum_{N_s = \sum_{t} x_{s,t}}^{\infty} \Pr(x_{s,1}, \ldots, x_{s,T} \mid N_s, p_s) \, f(N_s), \tag{8}$$

where $f(N_s)$ denotes the mixing distribution assumed for site abundance. This model is in fact a multinomial N-mixture model [60], and it has been shown using simulation that the removal N-mixture model outperforms the standard N-mixture model [61]. In practice, a value $K$ has been used in place of the infinite upper limit of the sum in Eq (8) when evaluating the likelihood; however, the appropriate value of $K$ is subjective, and it has been shown that there can be problems with proposed values [62]. [63] has demonstrated that the multinomial N-mixture model for removal data, with a negative binomial mixing distribution, has a closed-form likelihood and therefore no numerical approximations are required to fit the model.
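The effect of replacing the infinite sum by a finite upper limit K can be explored with a small Python sketch. It assumes a Poisson mixing distribution with mean `lam` for site abundance, which is a choice made here for illustration rather than one stated in the text, and the function names and data are hypothetical. Under this Poisson assumption the catches on different occasions are independent Poisson counts, which gives a closed form against which the truncated sum can be checked.

```python
import math


def site_loglik_truncated(catches, p, lam, K):
    """Log marginal likelihood for one site, summing abundance N up to K."""
    T, total = len(catches), sum(catches)
    if K < total:
        raise ValueError("K must be at least the total catch at the site")
    # part of the multinomial term that does not depend on N
    log_const = sum(
        x * (math.log(p) + t * math.log(1 - p)) - math.lgamma(x + 1)
        for t, x in enumerate(catches)  # t = 0 is the first occasion
    )
    log_terms = []
    for N in range(total, K + 1):
        n = N - total  # individuals never captured
        log_multinomial = (math.lgamma(N + 1) - math.lgamma(n + 1)
                           + n * T * math.log(1 - p))
        log_poisson = -lam + N * math.log(lam) - math.lgamma(N + 1)
        log_terms.append(log_multinomial + log_poisson)
    m = max(log_terms)  # log-sum-exp for numerical stability
    return log_const + m + math.log(sum(math.exp(v - m) for v in log_terms))


def site_loglik_closed_form(catches, p, lam):
    """Closed form under Poisson mixing: x_t ~ Poisson(lam * (1 - p)^(t-1) * p)."""
    loglik = 0.0
    for t, x in enumerate(catches):
        mean = lam * (1 - p) ** t * p
        loglik += -mean + x * math.log(mean) - math.lgamma(x + 1)
    return loglik


catches = [5, 3, 1]  # hypothetical removals at one site
for K in (9, 15, 50):
    print(K, site_loglik_truncated(catches, 0.4, 12.0, K))
print("closed form", site_loglik_closed_form(catches, 0.4, 12.0))
```

For several sites the site-level log-likelihoods are summed. Comparing the truncated values for increasing K with the closed form shows how a value of K that is too small understates the likelihood.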

Adapting sampling schemes using removal theory

Little work has been found which investigates study design for removal surveys. [64] explored how to optimally allocate total sampling effort for multiple removal sites by maximising the Fisher information of the constant capture probability in the classic removal model. This approach was extended by [65] to allocate effort between primary and secondary sampling occasions within the robust design removal model [12].

Survey designs for various data types have been adapted using the concept of removal studies. For example, [66] described a time-removal model that treats subsets of the survey period as independent replicates in which birds are 'captured' and mentally removed from the population during later sub-periods. This method has been implemented in many subsequent papers; see for example [67,68]. Further, the removal study design has been proposed for occupancy surveys [69], whereby once a site has been observed as occupied by a species no further surveys are required [70]. [71] developed a spatially explicit temporary emigration model permitting the estimation of population density for point count data such as removal sampling, double-observer sampling, and distance sampling.

From the papers we reviewed, three ecosystems were identified (Figure 2) based on the species analysed. Almost half of the applied studies were focused on aquatic ecosystems (marine and fresh water), followed by flying species (n=24), with the rest of the applied studies analysing terrestrial ecosystems. However, partitioning the papers by taxonomic group shows that the most common group is birds, with 23 papers identified. It should be noted that many of these applications use other data types but with an adapted sampling design, as described earlier [14,66, ...]. Removal methods are clearly important in fisheries research and applications are presented in [93-110]. The papers analysing data from mammals are [1,5,20,111-121]. There are a further six papers analysing data from amphibians [122-127], and [51] analysed crustaceans. Insects have been analysed in three papers [128-130], and the less common applications included annelids [131] and Holothuroidea [132]. Three papers presented applications about human disease [17-19] and we included these in our [...]

[...] Bayesian framework [136]. [137] demonstrate how removal models can be fitted using Program Mark for fisheries data. RMark [138] is a software package for the R computing environment that was designed as an alternative interface to Program Mark's graphical user interface, allowing models to be described with a typed formula so that they do not need to be defined manually through the design matrix. At the time of writing, RMark supports fitting 97 of the 155 models available in Program Mark.
R packages marked [139] and unmarked [140] can also be used to fit the standard removal model and multinomial N-mixture removal model, respectively; see [61]. There is also more specialised software that has arisen for specific applications. Removal Sampling v2 [141] was designed to estimate population size from removal data, and [105] apply this software in order to analyse the effectiveness of stream sampling methods for capturing invasive crayfish.

In addition, there is of course well-known software which accommodates the removal study design when fitting models to other data types, in particular Presence and RPresence [142] for occupancy surveys and Distance [143] for distance sampling.

Early removal models were simplistic and did not adequately account for potential variability exhibited by the underlying population; however, computational and methodological advances make more complex models possible, increasing the opportunities, scenarios and accuracy in population estimation. The framework we have presented here is designed to bring together the uses of removal models in order to assist future practitioners in applying them effectively.

In our review we have not only presented a list of published papers, differentiating applications and methodological advances, but have also explained the evolution of the model. We have shown how the model has developed since [1] presented the first case, with the likelihood function evolving from the basic form to, for example, that proposed by [49], accounting for heterogeneous capture probability, and more recently the work of [63], theoretically developing the multinomial N-mixture model for removal data.

We have shown how the model assumptions have been adapted to fit the model to different scenarios such as unequal catchability [36], non-closure of the population [41], heterogeneity across sites [16] and temporary migration [12,43].

Software development means that even the complex models described in this paper are accessible to ecologists, so that maximum utility can be obtained from removal data.

There are several advantages for non-specialists who wish to apply removal methods: there is a vast array of available models for removal data, with the possibility of selecting the approach whose model assumptions best align with their particular study; there is no restriction to frequentist or Bayesian paradigms; and there are several software packages, and R code accompanying publications of more recent developments, to investigate where model assumptions might fall short.

In our research we noted that earlier papers included a thorough assessment of the effects of violating model assumptions, but this rigour was not found in later methodological papers, except in some cases as part of a simulation. New methods developed in this research field have been motivated by unique aspects of particular data sets, and therefore nuances of a case study should be embraced rather than avoided in order to encourage methodological advances.

There is a worldwide interest in identifying tools for effective estimation of species population size and removal models show great potential for application in a wide range of situations, such as species relocation projects.

The potential of removal models to facilitate the estimation of population size in the source population whilst also obtaining a pool of individuals to translocate/reintroduce means that such models will remain important and will likely be further developed.

Species relocations are becoming more prevalent in conservation worldwide [144-146]. They are performed in several countries on an extensive range of species including plants [146], amphibians and reptiles [147,148]. There are many studies of reintroduced populations [149-154]. However, there is less information regarding the impact of translocations on the source or donor population [155,156]. These impacts can dramatically affect community stability, which is especially important when translocated individuals are from endangered populations [26]. The main components that can affect the stability of a population are: resistance, that is, the ability to maintain its current state when subjected to a perturbation [157]; amplitude, which determines whether, after some alteration, it will return to its original state [158]; and elasticity, the property that determines the rate of return to its initial configuration when the perturbation exceeds the resistance of a community, but not its amplitude [158]. Removal data and removal models may be a powerful tool for understanding and managing these populations.