Estimating population size when individuals are asynchronous: A model illustrated with northern elephant seal breeding colonies

Our aim was to develop a method for estimating the number of animals using a single site in an asynchronous species, meaning that not all animals are present at once so that no one count captures the entire population. This is a common problem in seasonal breeders, and in northern elephant seals, we have a model for quantifying asynchrony at the Año Nuevo colony. Here we test the model at several additional colonies having many years of observations and demonstrate how it can account for animals not present on any one day. This leads to correction factors that yield total population from any single count throughout a season. At seven colonies in California for which we had many years of counts of northern elephant seals, we found that female arrival date varied < 2 days between years within sites and by < 5 days between sites. As a result, the correction factor for any one day was consistent, and at each colony, multiplying a female count between 26 and 30 Jan by 1.15 yielded an estimate of total population size that minimized error. This provides a method for estimating the female population size at colonies not yet studied. Our method can produce population estimates with minimal expenditure of time and resources and will be applicable to many seasonal species with asynchronous breeding phenology, particularly colonial birds and other pinnipeds. In elephant seals, it will facilitate monitoring the population over its entire range.

reidentified on several surveys. Since this is a substantial hindrance in many 43 circumstances, we present here an alternative method for estimating total population in 44 asynchronous species from single counts. Our approach is based on some prior 45 knowledge of individual behavior, but once that is collected, the model works with 46 counts but requires no marked individuals. We first developed the model [4] using 47 observations of colonies of the northern elephant seal (Mirounga angustirostris). 48 Elephant seals are large marine predators that aggregate on beaches to reproduce at 49 predictable times each year [5,6]. Females come ashore to raise pups and to mate, and 50 their dense groups can be approached and easily counted. Males are also present, but 51 because females each produce a single pup, the most important index of population size 52 is the number of breeding females. Moreover, because most of the population is found in 53 just eight colonies [7], only a few sites need to be counted to generate estimates of the 54 entire population. The problem of asynchrony arises, however, because there is never a 55 day when all breeding individuals are ashore. While the entire breeding season lasts 56 nearly three months, individual females are on shore only one month: some depart 57 before others have arrived. Our goal is determining what fraction of the females are 58 missed in any one count. To do so, we make use of a published model whose purpose 59 was to quantify asynchrony in elephant seals [4]. 60 The model uses observed daily counts to estimate the timing of arrival and 61 departure of females at a colony [4,8,9]. This quantifies female asynchrony and leads to 62 an estimate of the number of animals missed during any one count. The present 63 objective is to apply the model to several of the largest northern elephant seal colonies 64 where females have been counted repeatedly. We quantify consistency of the timing of 65 female arrival, over many years and between sites, and use the model's estimate of the 66 fraction of females onshore on any one day to generate correction factors that convert a 67 single count to total female population. We also provide an online software tool that 68 applies the model to any input data, allowing the method to be tested in other 69 asynchronous species such as colonial birds or pinnipeds.  In what follows, all data are to counts of all females onshore at one colony on one day.

84
At Año Nuevo, we usually counted from high points on dunes near the females [6], but 85 counts from airplanes or drones were used to support ground counts on a few days when 86 numbers were at their highest. Aerial counts confirmed that ground counts had no consistent bias. Most counts at Point Reyes were done from high cliffs overlooking the 88 animals, allowing excellent visibility, but some were done on beaches immediately 89 adjacent to the females [10]. Counts at King Range were done from close proximity to 90 the colony, always within 100 m of the animals [11]. At the Channel Islands, all counts 91 were made from aerial photographs and have been verified elsewhere [12,13]. These daily 92 female counts were our basis for verifying the model and calculating correction factors. 93 Census model 94 We define the census curve, C(t), as the count through time during one season at one 95 colony. In elephant seals, it is bell-shaped, as in other seasonal breeders (eg [14,15]); at 96 Año Nuevo, it starts from zero in mid-December, peaks near the end of January, then 97 declines back to zero by early March (ref [4]; Fig 2). Define arrival a(t) and departure 98 d(t) as the total number of animals arriving (departing) on each day t, then cumulative 99 arrival A(t) = that arrived (departed) through day t. The estimated census curve on day t is 101 Our model describes a(t) and d(t) as Gaussian functions of t, and the parameters 102 describing those curves (Table 1) are estimated by fitting C(t) as closely as possible to 103 the observed daily counts (Fig 2). In northern elephant seals, the model generated 104 census curves C that closely fit observed counts [4], and the same approach has been 105 applied in southern elephant seals [8,9]. Both cumulative arrival and departure curves, 106 A(t) and D(t), reach an asymptote at the total population, N , so once parameters for a 107 and d are estimated, the fraction of animals missed in any one count C can be 108 calculated. Further details are presented in Condit et al. [4].

109
Here we add one substantial improvement. In the previous study [4], we estimated a 110 set of parameters separately in each year. Here, we fit the entire ensemble of counts at 111 one colony in a single hierarchical model, with year as a random effect [16]. Taking the 112 mean arrival date,â, as an example, each year i has an estimated a i , and those dates 113 across years are fitted to a Gaussian distribution known as the hyper-distribution, 114 described by a hyper-mean arrival date, µ a , and a hyper-standard-deviation, σ a (Table 115 1). Because we hypothesize some year-to-year consistency in the timing of breeding, the 116 5 parameters describing timing were constrained by a Gaussian hyper-distribution 117 ( Table 1), but we have no expectation of consistency in population size, so N i was not 118 constrained. The advantage of the hierarchical method arises in years having too few 119 counts to fit the full arrival-departure model. In previous papers, we omitted those 120 years from calculations [4,6]. With a multi-year model, they can be included because 121 the arrival and departure curves from other years constrain the arrival timing (Fig 2). 122 The multi-year hierarchical model was fitted to five sets of data: Año Nuevo Island, 123 Año Nuevo Mainland, Point Reyes, King Range, and the three Channel Islands 124 combined; data were insufficient to model the Channel Islands separately. We did not 125 attempt to create a single grand model with both colony and year levels, since there 126 were too few colonies to support a random effect. All parameters of the model (Table 1) 127 for those five datasets were estimated with a Bayesian Monte Carlo procedure, 128 producing a posterior distribution and thus 95% credible intervals for parameters and 129 the statistics calculated from parameters; details on parameter-fitting are given in 130 [4,17,18]. To compare colonies and to evaluate consistency of the census timing across 131 years, we used the mean arrival date each year, a i , as fitted by the model, and the date 132 of the maximum female census calculated from the fitted census curve, C(t), in each 133 year. Non-overlapping 95% credible intervals were used to infer statistically significant 134 differences.

135
The correction factor 136 Consider site s in year i and define the number we seek as N is , the total number of 137 breeding females using the site in that year. C is (t) is the census curve at site s in year i: 138 the estimated daily female population on each day t (Eq 1, Fig 2). Then C is (t)/N is is 139 the proportion of females on the colony on day t. The inverse is the correction factor, a 140 multiplier for the female count on day t that yields the total female population, The correction factor was calculated separately in every year at a site, but to be useful 142 in future years, we need the average across all years for site s, m is (t). To generate 143 m is (t), census curves for each site s were simulated using random draws of parameters 144 from the joint posterior distributions at that site; the draws included the 5 parameters 145 Table 1. Six parameters for the census model in a single year and their hyper-parameters across years. The six main parameters are subscripted since there is one for every year i; each colony has a separate set. The population size had no hyper-parameters, meaning every year's population was independent, and the two parameters describing tenure had hyper-parameters fixed in advance as narrow priors (µ d = 31.06, σ d = 0.27, µv = 0.1212, σv = 0.0062), a requirement for fitting the model [4]. The remaining parameters had Gaussian hyper-distributions estimated from the data. In a model covering Y years of censuses, 6Y + 10 parameters are needed: 6Y main parameters (6 per year) plus 10 hyper-parameters.

Hyper-parameter Parameter
Annual  Complete census data are published as a Dryad Data Archive [19]. The software is 157 available online, where it can be applied to any input data [20], and source code is 158 posted at Github [21].

160
Census timing 161 Our model converged on estimates of total population size and arrival behavior at all 162 northern elephant seal colonies we studied in every season. Observed counts and fitted 163 census curves followed a bell-shaped curve from late December to early March at all 164 sites (Fig 2). The model quantified year-to-year variation in arrival dates and thus the 165 day of the maximum number of animals present (Fig 3), and it generated consistent 166 census curves in several years having few counts (Fig 2).

167
Comparison of arrival date and peak census date across years and across sites 168 illustrates the degree of consistency (Fig 3). In early years at Año Nuevo Mainland, 169 arrival dates were around 17 Jan, but were consistently near 14 Jan after 1990. Arrival 170 dates at Point Reyes and Año Nuevo Island also started relatively late, then stabilized 171 3-4 days earlier (Fig 3A). The standard deviation in arrival date between years (σ a ) was 172 1.8-1.9 days at Año Nuevo Island, Año Nuevo Mainland, and Point Reyes, the three Figure 2. Daily female counts and fitted census curves in sample years. Black points are observed counts on single days. The black, bell-shaped curve is the fitted census curve, C(t); the blue dotted curve is the estimated cumulative arrival curve, A(t); and the dotted red curve the cumulative departure curve, D(t). A(t) and D(t) reach an asymptote at the total population of females, at Año Nuevo Island, 14 Jan at Point Reyes, and 15 Jan at Año Nuevo Mainland and 176 King Range (Fig 3A). The long-term mean date of the peak census was 15-17 days after 177 mean arrival: 25 Jan at the Channel Islands, 26 Jan at Año Nuevo Island, 28 Jan at 178 Point Reyes, 30 Jan at King Range, and 31 Jan at Año Nuevo Mainland. Variation in 179 the peak mirrored variation in arrival (Fig 3B). 180 Besides year-to-year fluctuations in arrival date or peak census date of < 2 days,

181
there were a few large outliers (Fig 3). Some of the outliers were associated with   Tables A1-A5). 196 A count at the peak had the narrowest credible intervals. Within ±2 days of the peak, 197 correction factors and credible intervals barely changed, but beyond ±5 days, they Channel Islands compared to the other three locations, corresponding to differences in 200 female arrival and thus census timing. At all five sites, the multiplier was between 1.13 201 and 1.20 from 26-30 Jan (Fig 4, Table 2).
202 Table 2. Correction factors for estimating the total breeding female population at five different sites (the Channel Islands site is three colonies combined). The total population using the colony in a year is estimated by multiplying a count on any of the dates by the given factor. Credible intervals and a longer series of dates are provided in Appendix S1. ANI = Año Nuevo Island, ANML = Año Nuevo Mainland, PR = Point Reyes, CI = Channel Islands, KR = King Range.

203
We provide a method for estimating total population size in species whose presence in a 204 study area is asynchronous. In these cases, the entire population is not present at the 205 same time, so no direct count includes all animals [1]. It is common in pinnipeds and 206 colonial birds. Here we demonstrated the method's effectiveness at describing 207 asynchrony and estimating correction factors that account for animals not present 208 during any one count. We found consistent correction factors, with the number of seals 209 missed in the peak count close to 15% at all colonies. In other species, the fraction 210 missed can be much higher [1,3], but our approach should work regardless of the degree 211 of synchrony. We provide software for testing it elsewhere.

212
The complete population using a site is known as the superpopulation, and existing 213 estimators are based on mark-recapture analysis [1,3,24]. These approaches are 214 complemented by our method because the mark-recapture data can describe arrival and 215 departure behavior and thus estimate the length of time individuals remain on site.

216
Once site tenure is quantified, our model can be applied without any marks. The 217 requirement for knowledge about tenure arises because two of the six parameters in our 218 model must have prior estimates, otherwise the system is over-parameterized and 219 abundance cannot be estimated. In elephant seals, we used marked animals to 220 demonstrate that females remain on the colony for a mean of 31 days with a standard 221 deviation of 4 days. Alternatively, the model could be adapted to cases where arrival or 222 departure dates are known in advance but tenure is not. An extra detail in our model, a 223 correlation between arrival and tenure, might be omitted in other species, since it may 224 have little impact on population estimates. 225 Unlike the consistent timing in elephant seals, birth date in gray seals (Halichoerus 226 grypus) shifted earlier by 18 days over 25 years [15]. Our model, however, can handle 227 any variation in timing, indeed we demonstrated a shift toward earlier birth dates over 228 the first few years at Año Nuevo and Point Reyes, though by only 4-5 days. Gray seals 229 evidently responded to climatic variaton by pupping earlier [15], but we see no indication 230 of climatic response in elephant seals. Instead, the slightly later breeding dates in 231 recently-founded colonies are caused by animals assorting themselves. Timing is delayed 232 in new colonies because late-arrivers at established colonies encounter high density and 233 are likely to emigrate to new sites, especially during storm surge [22,[25][26][27]. It does not 234 involve population-wide change in timing, such as must be happening in gray seals.

235
Had elephant seals shifted their timing as much as gray seals, correction factors 236 would have to change through the years. But consistency through time and across 237 colonies meant that 26-30 Jan were close to the optimal day for a count at all sites, and 238 remained so for 30 years where long time series were available. Moreover, the correction 239 factors at all sites were 1.13-1.20, with credible intervals 1.1-1.3. We thus recommend 240 that counts at new colonies, where long-term data are lacking, should be 26-30 Jan, and 241 would lead to a population estimate within 10% error, excellent precision for estimating 242 the population size of a highly pelagic marine predator.

243
In the closely related southern elephant seal, the census curve is similar [8], and the 244 date of the female count is consistent across colonies at the core of the latitudinal range 245 [9]. The far northern Peninsula Valdes colony, however, is 10 days earlier [28], and it is 246 thus clear that a separate census correction for Valdes would be needed. We will need to 247 address this issue with the northern elephant seal because the Mexican colonies breed 248 earlier than the ones we studied in central California, and the optimal census in Mexico 249 is probably 15-20 January [29].

250
The colonies we examined varied greatly in size. The Channel Islands each had over 251 10,000 females spread across many beaches, but at Año Nuevo, we separated the single 252 beach on the island from the nearby mainland groups, and the King Range colony had 253 only 100 animals. Despite these differences, the correction factors barely differed. We We thank Marshall Sylvan, who stated then solved the asynchrony problem in the 1970s 262 at Año Nuevo Island. We also thank numerous scientists for field work and advice,  Table A1. Correction factors for converting daily counts of female northern elephant seals into the total breeding female population at Año Nuevo Island. The total population using the colony in a year is estimated by multiplying a count on any date by the given Multiplier. Lower and upper credible intervals (95%) are derived using the columns Lower and Upper.

Date
Multiplier