Abstract
Resource-selection and step-selection analyses allow researchers to link animals to their environment and are commonly used to address questions related to wildlife management and conservation efforts. Step-selection analyses that incorporate movement characteristics, referred to as integrated step-selection analyses, are particularly appealing because they allow modeling of both movement and habitat-selection processes.
Despite their popularity, many users struggle with interpreting parameters in resource-selection and step-selection functions. Integrated step-selection analyses also require several additional steps to translate model parameters into a full-fledged movement model, and the mathematics supporting this approach can be challenging for biologists to understand.
Using simple examples, we demonstrate how weighted distribution theory and the inhomogeneous Poisson point-process model can facilitate parameter interpretation in resource-selection and step-selection analyses. Further, we provide a “how to” guide illustrating the steps required to implement integrated step-selection analyses using the amt package.
By providing clear examples with open-source code, we hope to make resource-selection and integrated step-selection analyses more understandable and accessible to end users.
Introduction
New technologies (e.g., improved Global Positioning System [GPS] collars) and advances in remote sensing have made it possible to collect animal location data on unprecedented spatial and temporal scales (Kays, Crofoot, Jetz, & Wikelski, 2015; Robinson et al., 2020), which in turn has fueled the development of new methods for modeling animal movement and for linking individuals to their environments (Guisan, Thuiller, & Zimmermann, 2017; Hooten, Johnson, McClintock, & Morales, 2017). Two of the most popular approaches for analyzing telemetry data, resource-selection and step-selection analyses, compare environmental covariates at locations visited by an animal (“used locations”) to environmental covariates at a set of locations assumed available to the animal (“available locations”) using logistic and conditional logistic regression, respectively (Boyce & McDonald, 1999; Fortin et al., 2005; Thurfjell, Ciuti, & Boyce, 2014). These methods are widely available in most statistical software packages, and thus, they provide a robust and easy-to-implement framework for analyzing habitat-selection patterns; note, here and elsewhere, we have used the term habitat-selection rather than resource-selection to highlight our broader interest in modeling the effects of a diverse set of environmental variables (e.g., those capturing risks and environmental conditions in addition to resources), but we will often use these terms interchangeably. Despite their popularity, our collective experience has been that most users struggle to interpret parameters in these models. Further, it seems that papers attempting to address this issue have had limited success, and in some aspects may have increased confusion (see e.g., Keating & Cherry, 2004; Johnson, Nielsen, Merrill, McDonald, & Boyce, 2006; Lele, Merrill, Keim, & Boyce, 2013; Avgar, Lele, Keim, & Boyce, 2017; Chamaille-Jammes, 2019).
Here, we highlight how point-process models and weighted distribution theory provide simple and effective frameworks for interpreting regression parameters in resource-selection and step-selection analyses. In the sections that follow, we begin by reviewing recent research connecting resource-selection functions to point-process models and weighted distribution theory. Using these connections, we demonstrate correct interpretation of parameters using simple examples of models fit to GPS locations of fisher (Pekania pennanti) from upstate New York (LaPoint et al., 2013a, 2013b). We then provide a short review of step-selection analyses, including their history and methods for parameter estimation. Step-selection analyses are particularly appealing because: 1) they provide an objective method for defining habitat availability in terms of movement constraints; 2) they relax the assumption that locations are statistically independent; and 3) by including movement characteristics (e.g., functions of step length and turn angle) as predictors, they provide a means to model both movement and habitat-selection processes (termed an integrated step-selection analysis by Avgar, Potts, Lewis, & Boyce, 2016). Recognizing that many biologists may find the mathematics supporting integrated step-selection analyses intimidating, we aim to provide a “how to” guide demonstrating the steps required to implement the approach using the amt package (Signer, Fieberg, & Avgar, 2019). This demonstration is expanded upon using coded examples in the supplementary appendices, which we encourage the reader to explore. We end with a short discussion highlighting challenges related to statistical dependencies and model transferability.
Resource-Selection Analyses
Logistic Regression
Much of the confusion surrounding the interpretation of parameters in resource-selection analyses can be attributed to the use of logistic regression in a non-standard way. Logistic regression is most easily understood as a model for binary random variables that can take on one of two values (0 or 1) with probability that depends on one or more explanatory variables (Hosmer, Lemeshow, & Sturdivant, 2013).
Consider, for example, a prospective study designed to infer how various environmental characteristics influence whether a habitat patch will be used by one or more animals. In this case, we may randomly select n habitat patches and monitor them to determine if they are used (yi = 1) or not (yi = 0) for i = 1, 2, …, n. Logistic regression allows us to model the probability that each patch will be used, P(yi = 1) = pi, as a logit-linear function of patch-level predictors (Xi1, …, Xip) and regression parameters (β0, β1, …, βp):
$$\log\left(\frac{p_i}{1 - p_i}\right) = \beta_0 + \beta_1 X_{i1} + \ldots + \beta_p X_{ip}$$
After having fit a model, we can exponentiate the regression coefficients, exp(βk), to quantify how the odds of use, p/(1 − p), change as we increase the kth predictor by 1 unit while holding all other predictors constant. We can also use the inverse-logit transformation (eqn. (1)) to estimate the probability that a patch will be used, given its set of spatial predictors:
$$p_i = \frac{\exp(\beta_0 + \beta_1 X_{i1} + \ldots + \beta_p X_{ip})}{1 + \exp(\beta_0 + \beta_1 X_{i1} + \ldots + \beta_p X_{ip})} \tag{1}$$
The logit transformation ensures that p will be constrained between 0 and 1 for all values of the predictor variables.
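The prospective use of logistic regression described above can be illustrated with a minimal sketch in R. The data are simulated, and the predictor name `forest_cover` and the coefficient values are hypothetical choices made only for illustration.

```r
# Simulated prospective patch study (all names and values are hypothetical)
set.seed(1)
n <- 2000
forest_cover <- rnorm(n)                       # patch-level predictor
p_true <- plogis(-0.5 + 0.8 * forest_cover)    # inverse-logit of the linear predictor
used <- rbinom(n, size = 1, prob = p_true)     # 1 = patch used, 0 = patch not used

fit <- glm(used ~ forest_cover, family = binomial())
b <- coef(fit)

# Odds ratio: multiplicative change in p / (1 - p) per 1-unit increase in the predictor
odds_ratio <- exp(b[["forest_cover"]])

# Probability of use for a patch with forest_cover = 1, computed two equivalent ways
eta <- b[["(Intercept)"]] + b[["forest_cover"]] * 1
p_manual <- exp(eta) / (1 + exp(eta))          # inverse-logit, eqn. (1)
p_predict <- unname(predict(fit, newdata = data.frame(forest_cover = 1),
                            type = "response"))
```

Here `p_manual` and `p_predict` agree because `predict(..., type = "response")` applies the same inverse-logit transformation to the fitted linear predictor.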
Contrast this approach with how logistic regression is used to study habitat selection. In a typical habitat-selection study, logistic regression models are fit to separate samples of used and available sample units, groups that are not mutually exclusive (i.e., available habitat patches may also be used). We will refer to the combined locations as use-availability data. In this case, yi is no longer a Bernoulli random variable since pi depends on the ratio of used to available points (which is under control of the analyst). Further, most analyses of telemetry data involve point-level sampling in continuous space rather than discrete sample units. In this case, the probability associated with any point is necessarily 0 to ensure the distribution integrates to 1 over all of available space. Thus, it is perhaps not surprising that there has been considerable confusion and controversy surrounding the use of logistic regression with use-availability data (e.g., Keating & Cherry, 2004; Johnson et al., 2006; Chamaille-Jammes, 2019).
Various arguments have been constructed to justify the use of logistic regression when analyzing use-availability data (Manly, McDonald, Thomas, McDonald, & Erickson, 2002; Johnson et al., 2006; Aarts, MacKenzie, McConnell, Fedak, & Matthiopoulos, 2008), but a significant breakthrough came when Warton & Shepherd (2010) made a connection between logistic-regression and an inhomogeneous Poisson point-process (IPP). An IPP is a model for random locations or events in space, where the expected spatial density of the locations depends on spatial predictors (see next section, Inhomogeneous Poisson Point-process Model). Warton & Shepherd (2010) showed that as the number of available points is increased towards infinity, the slope parameters in logistic regression models will converge to the slope parameters in an IPP model. Interestingly, several other popular approaches for analyzing species distribution data, including MaxEnt (Phillips & Dudík, 2008; Elith et al., 2011), weighted distribution theory with an exponential link function (Lele & Keim, 2006), and resource utilization functions (Millspaugh et al., 2006), have been shown to be equivalent to fitting an inhomogeneous Poisson point-process model (Warton & Shepherd, 2010; Aarts, Fieberg, & Matthiopoulos, 2012; Fithian & Hastie, 2013; Hooten, Hanks, Johnson, & Alldredge, 2013; Renner et al., 2015).
Instead of focusing on pi, as is typical in prospective studies, logistic regression applied to use-availability data should simply be viewed as a convenient tool for estimating coefficients in a resource-selection function, w(x; β) = exp(Xi1β1 + … + Xipβp) (Boyce & McDonald, 1999; Boyce et al., 2002). As we will see in the next section, this expression is equivalent to the intensity function of an IPP model, but with the intercept (the log of the baseline intensity) removed; the baseline intensity gives the expected density of points when all covariates are 0. Because resource-selection functions do not include this baseline intensity, they are said to measure “relative probabilities of use” or, alternatively, said to be “proportional to the probability of use” (Manly et al., 2002). Although the term probability of use sounds appealing, it is important to remember the challenges with defining probability at the point-level. Further, although probability of use is easily defined for discrete sample units (e.g. grid cells), these probabilities should increase with the size of the spatial unit and also with the study duration (Lele & Keim, 2006; Lele et al., 2013). Thus, with telemetry studies, it seems more natural to model spatial (or spatio-temporal) “hazards” or rates of use in continuous space (and time), from which “probability of use” can be determined by integrating these hazards over whatever spatial (and temporal) unit is deemed appropriate. Point-process models allow us to do just that!
Inhomogeneous Poisson Point-Process (IPP) Model
The IPP model provides a simple framework for modeling the density of points in space as a log-linear function of spatial predictors through a spatially-varying intensity function, λ(s):
$$\lambda(s) = \exp\left(\beta_0 + \beta_1 X_1(s) + \ldots + \beta_p X_p(s)\right) \tag{2}$$
where s is a location in geographic space, and X1(s), …, Xp(s) are spatial predictors associated with location s. The intercept, β0, determines the log-density of points (within a small homogeneous area around s) when all Xi(s) are 0, and the slopes, β1, …, βp, describe the effect of spatial covariates on the log density of locations in space. The IPP model can be understood by listing its key features and assumptions, namely:
The number of locations in an area G, nG, is given by a Poisson random variable with mean μG = ∫G λ(s)ds.
Locations are independent (any clustering can be explained by spatial covariates).
If all available spatial predictors are measured only at a coarse scale (e.g., at a set of gridded or rasterized cells), then fitting the IPP model is equivalent to fitting a Poisson regression model (Aarts et al., 2012). Specifically, one may treat the counts, yi, in n discrete spatial units (i = 1, …, n), as a set of independent Poisson random variables with means E[yi] = λ(si)|Gi|, where λ(si) is given by eqn. (2) and |Gi| is the area of grid cell i. Note that log(E[yi]) = log(λ(si)|Gi|) = log(λ(si)) + log(|Gi|); thus, the log-link used in Poisson regression implies that log(|Gi|) should be included as an offset (a predictor variable with regression coefficient fixed at a value of 1).
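The role of the area offset can be sketched with simulated gridded counts. The cell areas, covariate, and coefficient values below are hypothetical; the point is only that `offset(log(area))` lets the regression recover the intercept and slope of the intensity function rather than of the raw counts.

```r
# Simulated counts in grid cells of unequal area (all values hypothetical)
set.seed(2)
n_cells <- 500
area <- runif(n_cells, 0.5, 2)            # |G_i|, area of each cell
x <- rnorm(n_cells)                       # cell-level covariate
lambda <- exp(1 + 0.5 * x)                # intensity per unit area, as in eqn. (2)
y <- rpois(n_cells, lambda * area)        # counts with mean lambda(s_i) * |G_i|

# log(area) enters as an offset: a predictor whose coefficient is fixed at 1
fit_pois <- glm(y ~ x + offset(log(area)), family = poisson())
b_pois <- coef(fit_pois)
```

Omitting the offset would fold differences in cell size into the intercept and, if area were correlated with the covariate, into the slope as well.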
When spatial predictors are available at the point-level, as will be the case whenever constructing “distance to” predictors (e.g., distance to nearest road, water source, etc.), it will be advantageous to model the locations in continuous space. In telemetry studies, the absolute density of points will be determined by the frequency and duration of data collection. Thus, β0 will not be of biological interest, and it will be appropriate to focus efforts on estimating and interpreting the slope coefficients, β1, …, βp, which determine relationships between the spatial covariates and the relative density of locations throughout the study area (Fithian & Hastie, 2013). As is the case with linear and generalized linear models (e.g., Poisson regression), we can estimate parameters using maximum likelihood. This technique requires writing down an expression, called the likelihood, that captures the data generating mechanism in terms of one or more parameters. With telemetry data, it makes sense to work with the conditional likelihood of the IPP model (Aarts et al., 2012), i.e., the likelihood of the observed locations in space, conditional on there being n observed locations in total. The conditional likelihood is given by:
$$L(\beta) = \prod_{i=1}^{n} \frac{\lambda(s_i)}{\int_G \lambda(s)\, ds} \tag{3}$$
where the product is over the n observed locations, λ(si) is the intensity function evaluated at observation i, and the integral in the denominator evaluates the intensity function over the spatial domain of interest, G (Cressie, 1992; Aarts et al., 2012). If we plug λ(si) = exp(β0 + X1(si)β1 + … + Xp(si)βp) into eqn. (3), β0 will cancel from the numerator and denominator, leaving us with:
$$L(\beta) = \prod_{i=1}^{n} \frac{w(x(s_i); \beta)}{\int_G w(x(s); \beta)\, ds} \tag{4}$$
where w(x(s); β) = exp(β1x1(s) + … + βpxp(s)) is our resource-selection function.
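For readers who want to see the cancellation of β0 spelled out, the step from the conditional likelihood to the resource-selection form can be written for a single observed location as follows (a short derivation using only the definitions of λ(s) and w(x(s); β) given above):

```latex
\begin{aligned}
\frac{\lambda(s_i)}{\int_G \lambda(s)\, ds}
  &= \frac{\exp(\beta_0)\,\exp\left(\beta_1 X_1(s_i) + \ldots + \beta_p X_p(s_i)\right)}
          {\exp(\beta_0)\,\int_G \exp\left(\beta_1 X_1(s) + \ldots + \beta_p X_p(s)\right) ds} \\
  &= \frac{w(x(s_i); \beta)}{\int_G w(x(s); \beta)\, ds}
\end{aligned}
```

Because exp(β0) does not depend on s, it factors out of the integral and cancels, which is why the conditional likelihood carries no information about the baseline intensity.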
The binomial likelihood associated with logistic regression differs from eqn. (4), but Warton & Shepherd (2010) showed that logistic regression estimators of slope coefficients converge to those of the IPP model as the number of available points increases toward infinity. Thus, the connection to the IPP model addresses a common question that arises when estimating resource-selection functions, namely, “how many available points do I need?” The exact answer depends on how difficult it is to estimate the integral in the denominator of eqn. (4); the recommendation we offer is to increase the number of available points until the estimated slope coefficients no longer change much. Fithian & Hastie (2013) later showed that the convergence results of Warton & Shepherd (2010) hold only if the model is correctly specified, but assigning “infinite weights” to available points ensures the results hold more generally. Therefore, when fitting logistic regression or other binary response models (e.g., boosted regression trees) to use-availability data, we also suggest assigning a large weight (say 1000 or more) to each available location and a weight of 1 to all observed locations (larger values can be used to verify that results are robust to this choice). For a coded example in R (R Core Team, 2019), see section Interpreting Parameters in Resource-Selection Functions and Supplementary Appendix A.
Weighted Distributions
Weighted distribution theory provides another way to interpret parameters in resource-selection functions (Lele & Keim, 2006; Johnson, Thomas, Ver Hoef, & Christ, 2008). Let:
u(x) = the frequency distribution of habitat covariates, x, at locations used by our study animals.
a(x) = the frequency distribution of habitat covariates, x, at locations assumed to be available to our study animals.
We can think of the resource-selection function, w(x; β), as providing a set of weights that takes us from the distribution of available habitat to the distribution of used habitat:
$$u(x) = \frac{w(x; \beta)\, a(x)}{\int_{z \in E} w(z; \beta)\, a(z)\, dz} \tag{5}$$
The denominator of eqn. (5) ensures that the left-hand side integrates to 1 and thus, u(x) is a proper probability distribution; the variable z here is just a dummy variable used to allow integration over the frequency distribution of our environmental covariates. Because these distributions are written in terms of the habitat covariates, x, instead of geographical locations, we say that the model is parameterized in environmental space (E) (Hirzel & Le Lay, 2008; Elith & Leathwick, 2009).
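A small numerical sketch may help make the weighting in eqn. (5) concrete. It assumes a single covariate with a standard normal availability distribution and a hypothetical coefficient of 0.7; for that particular choice, the exponential weights shift the distribution of used habitat to a normal distribution with mean equal to the coefficient.

```r
# One covariate on a fine grid in environmental space (all values hypothetical)
dx <- 0.01
x <- seq(-8, 8, by = dx)
beta <- 0.7                              # selection coefficient

a <- dnorm(x, mean = 0, sd = 1)          # availability distribution a(x)
w <- exp(beta * x)                       # resource-selection function w(x; beta)
u <- w * a / sum(w * a * dx)             # eqn. (5); the integral is approximated by a sum

total_u <- sum(u * dx)                   # u(x) integrates to 1
mean_avail <- sum(x * a * dx)            # mean of x at available locations (about 0)
mean_used <- sum(x * u * dx)             # mean of x at used locations (about beta)
```

A positive coefficient therefore moves the distribution of used covariate values above the distribution of available values, which is the pattern one looks for when plotting used versus available covariates.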
To show that weighted distribution theory is consistent with the IPP formulation discussed above, we can rewrite eqn. (5) in geographic space (G):
$$u(s) = \frac{w(x(s); \beta)\, a(s)}{\int_{g \in G} w(x(g); \beta)\, a(g)\, dg} \tag{6}$$
where the denominator integrates over a geographic area, G, that is assumed to be available to the animal and g is a dummy variable for integration. Here u(s) is equivalent to the utilization distribution encountered in the literature on probabilistic estimators of animal home ranges (Van Winkle, 1975; Worton, 1989; Signer & Fieberg, 2020) and tells us how likely we are to find an individual at location s in geographic space. The utilization distribution, u(s), depends on the environmental covariates associated with location s, through w(x(s); β), and the distribution of available locations in geographic space, a(s). Typically, a(s) is assumed to be a uniform distribution within the geographical domain of availability, G (e.g., the individual’s home range, the population’s range, or the species range depending on the hierarchical level of habitat selection of interest; Johnson, 1980), and all areas within G are assumed to be equally available to the organism. Hence, a(s) is typically a constant, 1/|G|, that cancels from the numerator and denominator. Then, if we let w(x(s); β) = exp(xβ), we end up with the conditional likelihood of the Inhomogeneous Poisson process model (eqn. (4)) (Aarts et al., 2012).
Interpreting Parameters in Resource-Selection Functions
To demonstrate how the IPP and weighted distribution theory frameworks help with interpreting parameters in fitted resource-selection functions, we now consider a simple example using 3,004 locations of a fisher named Lupe tracked as part of a larger telemetry study (LaPoint et al., 2013a, 2013b). These data are publicly available and have been featured in a workshop highlighting Movebank’s Env-DATA system for annotating locations with environmental covariates (Dodge et al., 2013; Fieberg et al., 2018). The location data were combined with available points sampled randomly from within a minimum convex polygon (MCP) formed using Lupe’s locations. The used and available locations were then transformed to a projected coordinate reference system (NAD83 / Conus Albers) and annotated with environmental variables measuring population density (University & CIAT, 2005), elevation (U. S. / Japan ASTER Science Team, 2009), and landcover class (Defourny et al., 2009). The original landcover data were grouped to form a variable named landuseC with the following categories: forest, grass and wet (Fig. 1). We created centered (mean = 0) and scaled (SD = 1) variables labeled elevation and popden from the original elevation and population density variables. We also created an indicator variable, case_, taking on a value of 1 for all used points and 0 for all available points (later, we discuss how to choose the number of available points).
Figure 1. Distribution of used and available locations among different landscape cover classes for a fisher in upstate New York (LaPoint et al., 2013a, 2013b).
For ease of interpretation, we will begin by assuming the effects of elevation, population density, and landcover class are additive and linear on the log scale (eqn. (2)). Later, we will discuss how we can relax these assumptions using interactions to allow the effect of covariates to depend on the value of other habitat covariates and polynomials or splines to relax the assumption of linearity. We assign a weight of 5000 to the available locations and a weight of 1 to all observed locations (Fithian & Hastie, 2013). We can then fit a weighted logistic regression model using the glm function in R:
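A minimal sketch of such a call is given below. It uses simulated stand-in data rather than Lupe's locations: the variable names `case_`, `elevation`, `popden` and `landuseC` follow the text, whereas the weight column name `w`, the sample sizes, and the coefficients used to generate the data (`beta_sim`) are assumptions made only so that the example runs on its own.

```r
# Simulated stand-in for the use-availability data (not the fisher data)
set.seed(3)
n_pool <- 50000
pool <- data.frame(
  elevation = rnorm(n_pool),
  popden    = rnorm(n_pool),
  landuseC  = factor(sample(c("forest", "grass", "wet"), n_pool, replace = TRUE,
                            prob = c(0.957, 0.020, 0.023)))
)

# Hypothetical coefficients, used only to generate "used" points
X <- model.matrix(~ elevation + popden + landuseC, data = pool)
beta_sim <- c(0, 0.3, -0.2, -1.5, 0.25)
rsf_pool <- exp(drop(X %*% beta_sim))

n_used <- 3000
n_avail <- 30000
idx_used <- sample(n_pool, n_used, replace = TRUE, prob = rsf_pool)  # use proportional to w
idx_avail <- sample(n_pool, n_avail)                                 # uniform availability
dat <- pool[c(idx_used, idx_avail), ]
dat$case_ <- rep(c(1, 0), times = c(n_used, n_avail))

# Weight of 1 for used points and 5000 for available points
dat$w <- ifelse(dat$case_ == 1, 1, 5000)

m1 <- glm(case_ ~ elevation + popden + landuseC,
          family = binomial(), weights = w, data = dat)
round(coef(m1), 2)
```

Because `landuseC` is a factor whose first level is forest, the fitted model reports coefficients named `landuseCgrass` and `landuseCwet`, each a contrast with forest.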
Before interpreting the coefficients, it is important to make sure we have included a sufficient number of available points to allow parameter estimates to converge to stable values. To evaluate parameter stability, we fit logistic regression models to data sets with increasing numbers of available points (from 1 available point per used point to 100 available points per used point; see Supplementary Appendix A for the code). The intercept decreased as we increased the number of available points (as it is roughly proportional to the log difference between the numbers of used and available points), but the slope parameter estimates, on average, did not change much once we included at least 10 available points per used point (Fig. 2). Further, as expected, estimates varied less from sample to sample as we increased the number of available points. Thus, we conclude that, in this particular case, having 10 available points per used point is sufficient for interpreting the slope coefficients. Using more available points reduces Monte Carlo error, however, so we will proceed with 100 available points per used point.
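The stability check described above can be sketched on a simulated one-dimensional landscape. Everything in this example is hypothetical (a single covariate `x` that is uniformly available on (-2, 2), a true slope of 1, 500 used points, and 10 replicate samples of available points per ratio); the code for the fisher data is in Supplementary Appendix A.

```r
set.seed(4)
n_used <- 500
beta_true <- 1

# Used points: density proportional to exp(beta * x) on (-2, 2), by rejection sampling
cand <- runif(20000, -2, 2)
keep <- runif(20000) < exp(beta_true * cand) / exp(beta_true * 2)
used_x <- cand[keep][1:n_used]

# Fit a weighted logistic regression with a given number of available points per used point
fit_slope <- function(n_avail_per_used) {
  avail_x <- runif(n_used * n_avail_per_used, -2, 2)
  dat <- data.frame(case_ = rep(c(1, 0), times = c(n_used, length(avail_x))),
                    x = c(used_x, avail_x))
  dat$w <- ifelse(dat$case_ == 1, 1, 5000)
  fit <- glm(case_ ~ x, family = binomial(), weights = w, data = dat)
  unname(coef(fit)["x"])
}

ratios <- c(1, 10, 100)
slopes <- sapply(ratios, function(r) replicate(10, fit_slope(r)))
colnames(slopes) <- paste0(ratios, ":1")

slope_mean <- colMeans(slopes)      # average slope estimate for each ratio
slope_sd <- apply(slopes, 2, sd)    # sample-to-sample (Monte Carlo) variability
```

In this sketch the used points are held fixed, so the spread in `slope_sd` reflects only the random sample of available points and shrinks as the ratio grows.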
Figure 2. Estimated parameters in fitted resource-selection functions using increasing numbers of available points. Each dot represents an estimate from fitting a logistic regression model to 3,004 GPS telemetry locations combined with a random sample of available points, with sample size given by the x-axis (where 1 means 3,004 available points and 100 means 300,400 available points).
Let’s consider the interpretation of the continuous covariates reflecting elevation and population density (Table 1, Model 1). Qualitatively, we might infer from the positive coefficient for elevation and negative coefficient for popden that, all other things being equal, Lupe is likely to select locations at higher elevations and in areas of lower population density. But, how do we interpret these coefficients quantitatively? Consider the following two locations, both in the same landcover class and with the same associated population density, but differing by 1 unit in elevation (since we have scaled this variable, a difference of 1 implies that the two observations differ by 1 SD in the original units of elevation):
location s1: elevation = 3, popden = 1.5, landuseC = wet
location s2: elevation = 2, popden = 1.5, landuseC = wet
Table 1. Regression coefficients (SE) in fitted resource-selection functions fit to data from Lupe the fisher. Models 1 and 3 use forest as the reference level; Model 2 uses wet as the reference level. Model 3 includes interactions between elevation and landcover classes.
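Using eqn. (6), we can calculate the relative risk of Lupe using location 1 relative to location 2:
$$\frac{u(s_1)}{u(s_2)} = \frac{w(x(s_1); \beta)\, a(s_1)}{w(x(s_2); \beta)\, a(s_2)} = \frac{\exp(3\beta_{elevation} + 1.5\beta_{popden} + \beta_{wet})\, a(s_1)}{\exp(2\beta_{elevation} + 1.5\beta_{popden} + \beta_{wet})\, a(s_2)} \tag{7}$$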
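where we have dropped the integral from eqn. (6) because it appears in both the numerator and denominator (and thus, cancels out). Now, if both locations are equally available, then a(s1) = a(s2), and we have:
$$\frac{u(s_1)}{u(s_2)} = \frac{\exp(3\beta_{elevation})}{\exp(2\beta_{elevation})} = \exp(\beta_{elevation}) \tag{8}$$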
In epidemiology, exp(β) is referred to as a risk or hazard ratio. In the context of habitat-selection analyses, Avgar et al. (2017) refer to it as quantifying relative selection strength (RSS).
Note that we would arrive at the exact same expression if we chose any two locations that differed by 1 unit of elevation and had the same values for popden and landuseC. Thus, exp(βelevation) quantifies the risk (or hazard) ratio of two locations that differ by 1 SD unit of elevation but are otherwise equivalent (i.e., they are equally available and have the same values of all other habitat covariates). If Lupe were to be presented with two such hypothetical locations, the model suggests she would be 1.35 times more likely to choose the one with the higher elevation. A similar interpretation can be ascribed to popden. Given two observations that differ by 1 SD unit of popden but are otherwise equal, Lupe would be exp(−0.183) = 0.833 times as likely to choose the location with higher population density (or, equivalently, exp(0.183) = 1.20 times more likely to choose the location with the lower population density).
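The arithmetic behind these statements is easy to reproduce. In the sketch below, the popden coefficient (−0.183) is taken from the text; the elevation coefficient is back-calculated from the reported ratio of 1.35 and is therefore an assumption, not a value read from Table 1. The helper `rss()` is a hypothetical function written for this illustration.

```r
# Relative selection strength for two equally available locations that differ in
# one covariate only: exp(beta * (x1 - x2))
rss <- function(beta, x1, x2) exp(beta * (x1 - x2))

beta_popden <- -0.183                     # reported in the text (Model 1)
rss_higher_popden <- rss(beta_popden, 1, 0)
rss_lower_popden <- rss(beta_popden, 0, 1)
round(rss_higher_popden, 3)               # 0.833
round(rss_lower_popden, 2)                # 1.2

# Assumed value: back-calculated from the reported ratio of 1.35
beta_elevation <- log(1.35)

# The ratio depends only on the difference in elevation, not on the starting value
rss_a <- rss(beta_elevation, 3, 2)
rss_b <- rss(beta_elevation, 0, -1)
```

Both `rss_a` and `rss_b` equal 1.35, which is why the interpretation holds for any pair of locations that differ by 1 SD of elevation and are otherwise equivalent.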
What about the coefficients for the landcover categories? Looking again at the regression output (Table 1, Model 1), we see that grass has a negative coefficient and wet has a positive coefficient. It is tempting to infer that Lupe spends most of her time in wet areas and rarely spends time in grassy habitats. As Figure 1 makes it clear, however, these inferences are not exactly correct. First, it is important to understand how categorical predictors are encoded in regression models. There are a number of different ways to parameterize the effect of categorical variables and unfamiliar readers may want to work through an introductory regression text (e.g., Chapter 6 of Kéry (2010)). The default coding in R is to treat one of the levels (whichever comes first alphabetically) as a reference level and then to create a set of dummy variables that contrast the remaining levels of the categorical variable with this reference level. In our case, forest is the reference level. The coefficients associated with grass and wet represent contrasts between these land cover classes and the forest class.
Let’s again consider 2 locations, this time assuming they have the same elevation and population densities, but with one location in wet and the other location in forest:
location s1: elevation = 2, popden = 1.5, landuseC = wet
location s2: elevation = 2, popden = 1.5, landuseC = forest
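The relative risk of an animal using location 1 relative to location 2 is given by (eqn. (6)):
$$\frac{u(s_1)}{u(s_2)} = \frac{\exp(2\beta_{elevation} + 1.5\beta_{popden} + \beta_{wet})\, a(s_1)}{\exp(2\beta_{elevation} + 1.5\beta_{popden})\, a(s_2)} = \exp(\beta_{wet})\, \frac{a(s_1)}{a(s_2)} \tag{9}$$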
Thus, assuming the two locations are equally available, we might infer that Lupe would be exp(0.250) = 1.28 times more likely to choose the wet location than the location in forest. Of course, we know from Figure 1 that forest and wet are not equally available on the landscape. The higher availability of forest habitat implies that Lupe is more likely to be in forest than wet. We could attempt to correct for differences in availability within the MCP surrounding Lupe’s locations by multiplying our result by the ratio of habitat availability for wet relative to forest habitats (2.3% versus 95.7%; Fig. 1), giving us exp(0.250)(0.023)/(0.957) = 0.03. This calculation suggests we are (1/0.03 = 33) times more likely to find Lupe in forest than wet habitat. With this calculation we had to assume, perhaps naively, that the availability distributions for popden and elevation were the same in both wet and forest cover classes. In reality, if Lupe decides to move from forest to wet, it is likely that she will experience a change in elevation and popden too (i.e., these factors will not be held constant). To quantify the relative risk of finding Lupe in forest versus wet habitat, while also accounting for the effects of other environmental characteristics that are associated with these habitat types, we can use integrated hazards, i.e., we can integrate the spatial utilization distribution, u(s), over all forest and wet habitats:
$$\frac{\int_G u(s)\, I(s \in forest)\, ds}{\int_G u(s)\, I(s \in wet)\, ds} = \frac{\int_G w(x(s); \beta)\, a(s)\, I(s \in forest)\, ds}{\int_G w(x(s); \beta)\, a(s)\, I(s \in wet)\, ds} \tag{10}$$
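where I(s ∈ forest) and I(s ∈ wet) are indicator functions equal to 1 when location s is in forest or wet, respectively (and 0 otherwise). We can estimate this ratio using:
$$\frac{\sum_{i} w(x(s_i); \hat{\beta})\, I(s_i \in forest)}{\sum_{i} w(x(s_i); \hat{\beta})\, I(s_i \in wet)} \tag{11}$$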
where the sum is over the distribution of available locations.
This ratio is also equal to 33, which agrees with the observed data; Lupe was found in forest habitat 33 times more often than in wet habitat (see Supplementary Appendix A for code demonstrating how to calculate these quantities in R). Thus, we conclude Lupe is 33 times more likely to be found in forest than wet habitat, assuming she restricts her movements to the MCP surrounding her observed locations and all of this MCP is equally available to her.
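The difference between the naive and the integrated-hazards calculations can be sketched as follows. The first lines reproduce the naive arithmetic from the text. The second part uses simulated available points in which wet locations are assumed (hypothetically) to lie 1 SD lower in elevation than other locations; the elevation coefficient is back-calculated from the reported ratio of 1.35. The grass coefficient is left out because it does not enter the forest:wet ratio. The code for the fisher data is in Supplementary Appendix A.

```r
# Naive adjustment from the text: exp(beta_wet) times the ratio of availabilities
naive_wet_vs_forest <- exp(0.250) * 0.023 / 0.957
round(naive_wet_vs_forest, 2)             # 0.03

# Simulated available points (hypothetical landscape)
set.seed(5)
n_avail <- 100000
landuseC <- sample(c("forest", "grass", "wet"), n_avail, replace = TRUE,
                   prob = c(0.957, 0.020, 0.023))
elevation <- rnorm(n_avail, mean = ifelse(landuseC == "wet", -1, 0))  # assumption
popden <- rnorm(n_avail)

b_elev <- log(1.35)    # assumed, back-calculated from the reported ratio
b_popden <- -0.183     # reported in the text
b_wet <- 0.250         # reported in the text

# Resource-selection function evaluated at every available point
w <- exp(b_elev * elevation + b_popden * popden + b_wet * (landuseC == "wet"))

# Sum-based estimator: sums of w over available points in each cover class
integrated_forest_vs_wet <- sum(w[landuseC == "forest"]) / sum(w[landuseC == "wet"])

# Naive ratio for the same simulated landscape
naive_forest_vs_wet <- sum(landuseC == "forest") / (exp(b_wet) * sum(landuseC == "wet"))
```

In this simulated landscape the two ratios differ because elevation, which Lupe selects for, is distributed differently in wet and forest locations; the naive calculation ignores that difference.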
Before moving on, it is important to note that naively-adjusted (multiplying by availability of wet and forest habitats) and integrated-hazards (i.e., adjusted) risk ratios will not always agree. In fact, we find that they differ when comparing the risk of finding Lupe in wet versus grass habitat, with the integrated-hazards risk ratio better agreeing with the observed data (see Supplementary Appendix A). Somewhat related, Avgar et al. (2017) suggested calculating average effects for continuous predictors, X, by comparing the change in relative risk from increasing X by 1 unit (to X = x + 1) to the average value of w(x(s); β) for all locations s with x(s) = x. These average effects will also be influenced by cross-correlations among predictor variables included in the model.
Instead of integrating u(s) over discrete cover types, we could integrate over a specific geographic area. In addition, we could choose to change the area of interest (and thus, area of integration) from G to a different area. This approach makes it possible to use the same integrated hazards approach (i.e., eqn. (11)) to project how Lupe would spend her time in a novel environment (referred to as an “out-of-sample” prediction). Out-of-sample predictions often suffer from poor accuracy, especially when compared to “in sample” predictions, i.e., predictions for the same area and time frame from which the original data were collected (Torres et al., 2015; Yates et al., 2018). We return to this important point in the discussion section.
Let’s next consider what happens if we change the reference level of the land cover variable from forest to wet (Table 1, Model 2).
The coefficients for elevation and popden do not change. Note, however, that the coefficient for forest is negative despite Lupe using forest more than its availability (i.e., u(s, s ∈ forest) > a(s, s ∈ forest)) and Lupe spending more than 95% of her time in the forest! What is going on? Remember, the coefficients for categorical predictors reflect use:availability ratios for each level of the predictor relative to the use:availability ratio for the reference class. The coefficient for forest is negative because the use:availability ratio for forest is less than the use:availability ratio for the reference class, wet (see Fig. 1). Depending on the reference level, it is possible to have a positive (negative) coefficient even when that landcover class is used more (less) than its availability. Furthermore, it is possible for a landcover class to be used frequently but have a negative coefficient.
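The effect of the reference level can be sketched with a model that contains only the landcover variable. The counts below are hypothetical (they are not the fisher data) and were chosen so that forest is used slightly more than its availability while wet has the highest use:availability ratio; `landuseC_wetref` is a name introduced here for the releveled factor.

```r
# Hypothetical numbers of used and available points in each landcover class
used  <- c(forest = 2900, grass = 15,  wet = 89)
avail <- c(forest = 9570, grass = 200, wet = 230)

dat <- data.frame(
  case_ = rep(c(1, 0), times = c(sum(used), sum(avail))),
  landuseC = factor(c(rep(names(used), times = used),
                      rep(names(avail), times = avail)))
)

# Default coding: forest (first level alphabetically) is the reference level
m_forest_ref <- glm(case_ ~ landuseC, family = binomial(), data = dat)

# Same model with wet as the reference level
dat$landuseC_wetref <- relevel(dat$landuseC, ref = "wet")
m_wet_ref <- glm(case_ ~ landuseC_wetref, family = binomial(), data = dat)

b_wet_vs_forest <- coef(m_forest_ref)[["landuseCwet"]]
b_forest_vs_wet <- coef(m_wet_ref)[["landuseC_wetrefforest"]]

# Log of the ratio of use:availability ratios (forest relative to wet)
log_ratio <- log((used[["forest"]] / avail[["forest"]]) /
                 (used[["wet"]] / avail[["wet"]]))
```

In this sketch `b_forest_vs_wet` is negative and equal to `-b_wet_vs_forest`, even though forest accounts for more than 95% of the used points and is used more than its availability.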
We have seen many ecologists, including some that are very quantitatively skilled and familiar with habitat-selection models, make mistakes when interpreting coefficients associated with categorical predictors! This example also highlights the importance of plotting your data (e.g., Fig. 1) and considering habitat availability when interpreting regression coefficients. Plotting distributions of covariates for both used and available locations is one of the best ways to understand fitted habitat-selection models, and is a good strategy to use for both continuous and categorical predictors (Merow, Smith, & Silander, 2013; Fieberg, Forester, et al., 2018).
Interactions Between Environmental Predictors
Consider the distribution of elevation at used and available locations across the different habitat classes (Fig. 3). We see that there is a wider range of elevation in forest and wet habitat compared to grass habitat, and there is a clear association between elevation and landuseC, with higher median elevation at used locations in forest and grass habitat relative to wet habitat. Perhaps more importantly, we also see that values of elevation are higher, on average, for used locations (compared to available locations) in forest and grass, whereas the opposite is true in wet habitat. Although we should be skeptical of interactions that we discover while exploring our data (i.e., interactions that were not specified a priori), an analyst may be tempted to include an interaction between elevation and landuseC. In Model 3 (Table 1), we revert to having forest as the reference level and include the interaction between elevation and landuseC.
Figure 3. Distribution of elevation at used and available locations within each of 3 landcover types.
Using this syntax, R creates two new variables elevation:landuseCgrass equal to elevation when landuseC is grass and is 0 otherwise and elevation:landuseCwet equal to elevation when landuseC is wet and is 0 otherwise. The coefficients associated with these predictors quantify the change in slope (i.e., change in the effect of elevation) when the locations fall in grass or wet relative to the slope when the locations fall in forest. Starting from eqn. (6) and using the estimates for Model 3 in Table 1, we can easily derive that the relative risk of choosing between two equally available locations that differ by 1 SD unit of elevation is equal to exp(0.313) = 1.37 when the two locations are in forest, exp(0.313 + 0.112) = 1.53 when the locations are in grass, and exp(0.313 − 0.499) = 0.83 when the locations are in wet habitat. Thus, we might conclude that Lupe would select for higher elevations when in forest or grass, but avoid higher elevations when in wet. Alternatively, we can consider how elevation changes Lupe’s view of the different landcover categories, noting that βgrass = −1.471 + 0.112elevation and βwet = 0.183 − 0.499elevation. Thus, we see that Lupe’s relative avoidance of grass (relative to forest) and selection for wet (relative to forest) both decline with elevation, and Lupe’s inherent ranking of these 3 habitat types will change as elevation increases.
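The interaction columns and the calculations above can be checked with a few lines of R. The coefficient values are those quoted in the text for Model 3; the three-row data frame `toy` and the helper functions `beta_grass()` and `beta_wet()` are hypothetical and exist only for this illustration.

```r
# Columns that R creates for an elevation-by-landcover interaction
toy <- data.frame(elevation = c(-1, 0, 1),
                  landuseC = factor(c("forest", "grass", "wet")))
interaction_columns <- colnames(model.matrix(~ elevation * landuseC, data = toy))

# Model 3 estimates quoted in the text
b_elev <- 0.313
b_elev_grass <- 0.112
b_elev_wet <- -0.499
b_grass <- -1.471
b_wet <- 0.183

# Relative risk for a 1 SD increase in elevation within each landcover class
rss_elev <- exp(c(forest = b_elev,
                  grass  = b_elev + b_elev_grass,
                  wet    = b_elev + b_elev_wet))
round(rss_elev, 2)    # forest 1.37, grass 1.53, wet 0.83

# Landcover contrasts (relative to forest) as functions of elevation
beta_grass <- function(elevation) b_grass + b_elev_grass * elevation
beta_wet <- function(elevation) b_wet + b_elev_wet * elevation
```

Evaluating `beta_wet()` at increasing values of elevation shows the contrast with forest changing sign, which is the change in ranking of habitat types described above.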
Non-Linear Effects and Other Considerations
When building models, it is important to consider the functional relationships between different environmental characteristics and habitat use. For example, we may classify available predictors based on whether they represent resources (higher values are generally preferable), risks (lower values are generally preferable) or conditions (values that are not too high or too low are preferable) (e.g., Matthiopoulos et al., 2015). It is often useful to allow for non-linear effects of conditions by including quadratic terms or using a set of spline basis functions. In either case, we end up requiring multiple coefficients to capture how relative risk changes with the environmental predictor. Consider, for example, that we could include a quadratic term to model the effect of elevation. Estimating the relative risk for two locations, s1 and s2, that differ in their values of elevation but are otherwise equivalent would be straightforward using eqn. (6); we would just need to calculate hazard ratios using the coefficients for elevation and elevation²:
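As a sketch of the expression this sentence leads to, assume the selection function is log-linear with a linear and a quadratic elevation term and that all other covariates are equal at the two locations. The symbols for the two coefficients and the shorthand elev(s) are ours, not notation taken from eqn. (6):

```latex
\frac{w(x(s_1); \beta)}{w(x(s_2); \beta)}
  = \exp\left\{ \beta_{\text{elev}} \left[ \text{elev}(s_1) - \text{elev}(s_2) \right]
  + \beta_{\text{elev}^2} \left[ \text{elev}(s_1)^2 - \text{elev}(s_2)^2 \right] \right\}
```

Unlike the linear case, this ratio depends on the two elevations themselves and not only on their difference.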
Lastly, we note that Avgar et al. (2017) provide simple formulas for calculating risk or hazard ratios under a number of different scenarios (e.g., models with quadratic polynomials, log-transformed covariates, and models with interactions). The log_rss function in the amt package (Signer et al., 2019) relies on R’s generic predict function to aid the user in calculating the log of these hazard ratios for any combination of model structure and two alternative locations; its use is illustrated in Supplementary Appendix B. Understanding how these formulas are derived, however, helps build intuition and frees the user to construct estimators and estimation targets that capture relevant quantities of specific interest.
Statistical Independence
An important assumption of the IPP model, and hence, resource-selection functions, is that any clustering of spatial locations can be explained solely by spatial covariates. Strictly speaking, this assumption will almost never be met, particularly with modern-day telemetry studies that allow several locations to be collected on the same day. Telemetry observations close in time tend to also be close in space; i.e., telemetry observations exhibit serial dependence (Fleming et al., 2014). This serial dependence is likely to manifest itself in residual spatial autocorrelation that could be modeled using a spatial random effect or a spatial predictor constructed to account for the effects of movement constraints on habitat availability (Johnson, Hooten, & Kuhn, 2013). Models with spatial random effects are, however, more complicated and difficult to fit.
Alternatively, if telemetry observations are collected at regular time intervals, then the locations may be argued to provide a representative sample of habitat use from a specific observation window (Otis & White, 1999; Fieberg, 2007). In these cases, it may be helpful to view our estimates of the parameters in our resource-selection function, β, as useful summaries of habitat use for tagged individuals during these fixed time periods. Nevertheless, the assumption of independence of our locations is clearly problematic and will lead to estimates of uncertainty that are on average too small. If we are primarily interested in population-level inferences, then we may choose to ignore within-individual autocorrelation when estimating individual-specific coefficients but use a robust form of SE that treats individuals as independent when describing uncertainty in population-level parameters, for example, a bootstrap (Fieberg, Vitense, & Johnson, 2020) or a generalized estimating equations approach (e.g., Fieberg, Rieger, Zicus, & Schildcrout, 2009; Koper & Manseau, 2009; Fieberg, Matthiopoulos, Hebblewhite, Boyce, & Frair, 2010).
Step-Selection Functions
Step-selection functions were developed to deal with serial dependence as well as temporally varying availability distributions resulting from movement constraints (Fortin et al., 2005; Thurfjell et al., 2014). Rather than treat locations as independent and identically distributed (with availability that does not depend on time), step-selection analyses model transitions or “steps” connecting sequential locations (Δt units apart) in geographical space:
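A form of this model consistent with the definitions that follow is a normalized product of a selection function and a movement kernel. In this sketch, G (the spatial domain) and the dummy variable of integration are notation we assume:

```latex
u(s, t + \Delta t) \mid u(s', t) =
  \frac{w(x(s); \beta(\Delta t)) \, \phi(s, s'; \gamma(\Delta t))}
       {\int_{\tilde{s} \in G} w(x(\tilde{s}); \beta(\Delta t)) \, \phi(\tilde{s}, s'; \gamma(\Delta t)) \, d\tilde{s}}
```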
where u(s, t + Δt)|u(s′, t) gives the conditional probability of finding the individual at location s at time t + Δt given it was at location s′ at time t, w(x(s); β(Δt)) is referred to as a step-selection function, and ϕ(s, s′; γ(Δt)) is a selection-independent movement kernel that describes how the animal would move in homogeneous habitat or in the absence of habitat selection (i.e., when w(x(s); β(Δt)) = a constant for all s). Note that we represent the parameter vectors (β and γ) as functions of the step duration (Δt). This notation reflects the fact that step-selection parameters are scale dependent (i.e., different Δt’s will result in different estimates of β and γ; see Avgar et al. (2016) for more details). Thus, we generally require observations to be equally spaced in time, and care must be taken when comparing inference from models fitted at different temporal resolutions (but see Munden et al., 2020).
As with resource-selection analyses, it is typical to model w(x(s); β(Δt)) as a log-linear function of spatial covariates and regression parameters, w(x(s); β(Δt)) = exp(X1(s)β1 + … + Xp(s)βp). A key difference between resource-selection and step-selection analyses, however, is that the latter allow the available distribution to be time-dependent and equal to a(s, t + Δt) = ϕ(s, s′; γ(Δt)). Consequently, step-selection analyses allow explicit consideration of temporally dynamic environmental covariates, x(s′, t) and x(s, t + Δt) (and, possibly, environmental covariates measured along the path between these two locations). One option that often performs well and enhances interpretability is to include habitat covariates at the start of the movement step in the model for ϕ and habitat covariates at the end of the movement step in the model for w, resulting in a more general formulation: w(x(s, t + Δt); β(Δt))ϕ(s, s′; γ(Δt, x(s′, t))); we provide an example in Supplementary Appendix B.
Models for ϕ(s, s′; γ(Δt))
Step-selection approaches build on an early idea by Arthur, Manly, McDonald, & Garner (1996) to model time-dependent availability via a circular buffer with radius R centered on the previous location. Rhodes et al. (2015) showed that this model is equivalent to assuming:
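A kernel consistent with this description, assuming availability is uniform over the disc of radius R centered on s′ and zero outside it, is:

```latex
\phi(s, s'; \gamma(\Delta t)) =
\begin{cases}
  \dfrac{1}{\pi R^{2}} & \text{if } \lVert s - s' \rVert \le R, \\[6pt]
  0 & \text{otherwise,}
\end{cases}
```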
where ‖s − s′‖ is the Euclidean distance between locations s and s′, referred to as the step length. Rhodes et al. (2015) also demonstrated that circular buffers imply that individuals are more likely to move large distances than short distances since there is more area, and thus probability, associated with outer rings of the circle. Instead, they suggested using an exponential distribution to accommodate right-skewed step-length distributions and a tendency for animals to make shorter rather than longer movements:
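A kernel consistent with this suggestion, assuming step lengths follow an exponential distribution with rate λ and directions are uniform on the circle, is:

```latex
\phi(s, s'; \gamma(\Delta t)) =
  \frac{\lambda \exp\left( -\lambda \lVert s - s' \rVert \right)}{2 \pi \lVert s - s' \rVert}
```

The factor 2π‖s − s′‖ converts a density for step length into a density over locations in the plane. Under the circular buffer, the implied step-length density is 2d/R² for 0 ≤ d ≤ R, which increases with d; under the exponential kernel it is λ exp(−λd), which decreases with d.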
Rather than specify a model directly in terms of ϕ(s, s′; γ(Δt)), it is more common to see movement kernels specified in terms of the distribution of step lengths, d = ‖s − s′‖, and turn angles (changes in direction from the previous bearing), θ. In the sections that follow, we will let g(d; γd(Δt)) and f (θ; γθ(Δt)) represent step-length and turn-angle distributions, respectively. Step-selection analyses frequently use either an exponential or gamma distribution for g(d; γd(Δt)). Turn angles may be assumed to be uniformly distributed as in Arthur et al. (1996) and Rhodes et al. (2015). Alternatively, circular distributions, such as the von Mises distribution or wrapped Cauchy or Weibull distributions, allow for a mode at 0 and can thus accommodate correlated movements (i.e., sequential steps are assumed, on average, to follow in the same direction as the previous step).
Although step-length and turn-angle distributions are typically assumed to be independent, animals commonly exhibit a mix of temporally persistent movement behaviors, ranging between high-displacement movements (e.g., when traveling between habitat patches, migrating, or dispersing) and low-displacement movements (e.g., during foraging or resting bouts). If positional data are collected more frequently than the occurrence of behavioral switches, we might expect a negative cross-correlation between step lengths and turn angles (moving far is likely to coincide with moving straight) and a positive auto-correlation between the current and previous step lengths and turn angles. Moreover, as implied by the more flexible formulation, w(x(s, t + Δt); β(Δt))ϕ(s, s′; γ(Δt, x(s′, t))), both step-length and turn-angle distributions may shift as a function of spatial and/or temporal covariates such as habitat permeability (e.g., elevation ruggedness, snow depth, or vegetation density), time of day, season, and predation risk (Avgar, Mosser, Brown, & Fryxell, 2013; Avgar et al., 2016). Thus, although ϕ is a “selection-independent” movement kernel, it may still depend on environmental or temporal covariates, and hence, may vary through space and time, resulting in both auto and cross correlations in step attributes.
Cross-correlation between step lengths and turn angles is difficult to model with common statistical distributions, but could be accommodated using copulae (Durante & Sempi, 2010). Alternatively, one could resample (i.e., bootstrap) step length and turn angle pairs, (dt, θt), to preserve any correlation that is present in the data (Fortin et al., 2005). Although we generally find the bootstrap appealing (Fieberg et al., 2020), it has limitations in this context. In particular, the observed distribution of step lengths and turn angles will reflect both inherent movement characteristics of the species (captured by ϕ) as well as habitat selection (captured by w). Using the observed steps as a non-parametric model for ϕ without adjustment for the effect of w can result in biased estimates of β (Forester, Im, & Rathouz, 2009). We will return to this point in the next section. As mentioned previously (see Statistical Independence), and regardless of the source of correlation, it may be preferable to calculate robust SEs by treating individuals as the relevant sampling unit when performing population-level inference (e.g., Prima, Duchesne, & Fortin, 2017). Lastly, cross- and auto-correlations in step lengths and turn angles, as well as their dependencies on various temporal or environmental characteristics, could be modeled parametrically using an integrated step-selection function (Avgar et al., 2016). To do so, we need to include appropriate statistical interactions (e.g., between concurrent and previous step lengths/turn angles and between these step-attributes and environmental or temporal covariates). We discuss this process further below, and provide examples in Supplementary Appendix B. See also Prokopenko, Boyce, & Avgar (2017), Scrafford, Avgar, Heeres, & Boyce (2018), and Dickie, McNay, Sutherland, Cody, & Avgar (2020).
Estimation of Movement and Habitat-Selection Parameters
Although it is possible to simultaneously estimate movement (γ) and habitat selection (β) parameters using maximum likelihood (e.g., Rhodes et al., 2015) or Bayesian methods (e.g., Johnson et al., 2008), this is rarely done in practice as it would require custom-written code. Instead, it is common to use the following approach:
1. Estimate or approximate ϕ(s, s′; γ(Δt)) using observed step lengths and turn angles, giving a tentative estimate of the movement kernel.
2. Generate time-dependent available locations by simulating potential movements from the previously observed location, u(s′, t). Similar to applications of RSFs, it is up to the user to decide how many available locations to sample for each used location, and, due to similar considerations (properly approximating the availability domain, a(s, t + Δt) = ϕ(s, s′; γ(Δt))), the more points the merrier.
3. Estimate β using conditional logistic regression, with strata formed by combining time-dependent used and available locations.
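Steps 1 and 2 above can be sketched in a few lines. The example below is a minimal illustration in Python (the paper's own workflow uses the amt package in R), and it rests on simplifying assumptions of ours: step lengths are exponential, turn angles are uniform, and the track is hypothetical. All function and variable names are ours.

```python
import math
import random


def fit_exponential_rate(step_lengths):
    """Step 1: tentative rate of an exponential step-length distribution
    (maximum likelihood estimate = 1 / mean step length)."""
    return len(step_lengths) / sum(step_lengths)


def sample_available(start, rate, n_available, rng):
    """Step 2: simulate `n_available` end points from the start location,
    assuming exponential step lengths and uniform directions."""
    points = []
    for _ in range(n_available):
        d = rng.expovariate(rate)                  # step length
        bearing = rng.uniform(-math.pi, math.pi)   # uniform direction
        points.append((start[0] + d * math.cos(bearing),
                       start[1] + d * math.sin(bearing)))
    return points


# Hypothetical track (coordinates in metres), for illustration only.
track = [(0.0, 0.0), (30.0, 40.0), (90.0, 120.0), (90.0, 170.0)]
steps = list(zip(track[:-1], track[1:]))
observed_lengths = [math.dist(a, b) for a, b in steps]
rate_hat = fit_exponential_rate(observed_lengths)

rng = random.Random(1)
strata = []
for stratum_id, (start, end) in enumerate(steps):
    # One used end point per stratum ...
    strata.append({"id": stratum_id, "case": 1, "xy": end})
    # ... matched with 10 available end points from the same start.
    for xy in sample_available(start, rate_hat, 10, rng):
        strata.append({"id": stratum_id, "case": 0, "xy": xy})
```

Each stratum holds one used end point (case = 1) and its 10 available end points (case = 0). Step 3 would extract covariates at these points and pass the strata to conditional logistic regression software.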
If we knew ϕ(s, s′; γ(Δt)) and could simulate directly from it (skipping step 1), then this approach would provide unbiased estimates of β (Forester et al., 2009). However, as mentioned in the previous section, estimating ϕ(s, s′; γ(Δt)) from observed steps without adjusting for w(x(s); β(Δt)) can lead to biased estimates of γ and β.
Forester et al. (2009) considered the case where the step-length distribution, g(d; γd), is given by an exponential distribution with unknown parameter, λ. They showed that estimating λ directly from the observed distribution of step lengths (without adjusting for the effect of w(x(s); β)), and then proceeding with steps 2 and 3, results in a biased estimator of β, but that the bias (if g(d; γd) is given by an exponential distribution) is eliminated if the step length, dt, is included as a predictor in the model. Avgar et al. (2016) further showed that the coefficient associated with dt could be used to modify the initial estimate of λ, leading to an unbiased estimator of λ and thus, g(d; γd). In addition, they showed how similar adjustments could be used to obtain unbiased estimators of step-length (γd) and habitat-selection (β) parameters when the distribution of step lengths is given by a gamma, half-normal, or log-normal distribution. Similarly, Duchesne, Fortin, & Rivest (2015) showed that including cos(θ) as a predictor can lead to unbiased estimators of turn angle parameters (γθ) when the distribution of turn angles follows a von Mises distribution. These adjustments are available in the amt package for the exponential, gamma, and von Mises distributions (Signer et al., 2019). Avgar et al. (2016) coined the term integrated step-selection analysis to emphasize that these results provide new opportunities to model both movement and habitat selection via tried and true statistical software for fitting conditional logistic regression models.
In Supplementary Appendix B, we provide a “How to” guide for implementing an integrated step-selection analysis using the amt package in R (R Core Team, 2019; Signer et al., 2019). Conducting an integrated step-selection analysis requires, in addition to the 3 steps outlined in this section, that we add a fourth step that re-estimates the movement parameters in ϕ(s, s′; γ(Δt)) using regression coefficients associated with movement characteristics (e.g., log(dt), cos(θ)). This last step adjusts the parameters in ϕ(s, s′; γ(Δt)) to account for the effect of habitat selection when estimating the movement kernel (Avgar et al., 2016); this step is unnecessary if no inference about movement is being made. Importantly, interactions may be included between movement characteristics (e.g., log(dt), cos(θ)) and environmental covariates, x(s′, t), to allow the movement kernel to depend on the environment. When interactions are included, step 4 results in a movement kernel, ϕ(s, s′; γ(Δt, x(s′, t))), that depends on the habitat the animal is in at the start of the movement step (Fig. 4).
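The fourth step amounts to simple arithmetic once the conditional logistic regression has been fitted. The sketch below, in Python for illustration only, assumes a gamma step-length distribution and a von Mises turn-angle distribution, with step length, log step length, and cos(turn angle) included as predictors. The update rules follow from multiplying each tentative density by the exponentiated linear predictor. The function names and numerical values are hypothetical.

```python
def adjust_gamma(shape0, scale0, beta_sl, beta_log_sl):
    """Adjusted (shape, scale) of a gamma step-length distribution.

    Multiplying d**(shape0 - 1) * exp(-d / scale0) by
    exp(beta_sl * d + beta_log_sl * log(d)) gives another gamma kernel.
    """
    shape = shape0 + beta_log_sl
    rate = 1.0 / scale0 - beta_sl
    if shape <= 0 or rate <= 0:
        raise ValueError("adjusted parameters are outside the gamma family")
    return shape, 1.0 / rate


def adjust_von_mises(kappa0, beta_cos_ta):
    """Adjusted concentration of a von Mises turn-angle distribution
    with mean direction 0."""
    return kappa0 + beta_cos_ta


# Hypothetical tentative parameters (used to sample available steps)
# and hypothetical fitted coefficients for the movement predictors.
shape0, scale0, kappa0 = 2.0, 50.0, 0.5
beta_sl, beta_log_sl, beta_cos_ta = 0.004, 0.5, 0.3

shape, scale = adjust_gamma(shape0, scale0, beta_sl, beta_log_sl)
kappa = adjust_von_mises(kappa0, beta_cos_ta)
print(shape, round(scale, 2), kappa)
```

When the model includes interactions between movement predictors and habitat at the start of the step, the same rules apply with the coefficient replaced by its habitat-specific sum (main effect plus interaction).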
Figure 4. Step-length and turn-angle distributions from an integrated step-selection analysis applied to Lupe’s location data (see Supplementary Appendix B). The conditional logistic regression model included interactions between movement characteristics (step length, log step length, and cosine of the turn angle) and the landuse category Lupe was in at the start of the movement step. We see that Lupe tends to take larger, more directed steps when in grass and slower and more tortuous steps in wet habitat.
Interpretation of Parameters in an Integrated Step-Selection Analysis
The habitat-selection parameters can be interpreted in the same way as parameters in resource-selection functions (i.e., as spatial hazards, assuming locations are equally available and differing in terms of a single habitat covariate). Hence, the ln(RSS) expressions in Avgar et al. (2017), and the log_rss function in amt, are suitable for calculating and interpreting the effects of the various habitat covariates. However, it is important to recognize that the used and available distributions in step-selection analyses are dynamic and non-uniform in space. In particular, they depend on an individual’s current location and movement tendencies (as well as the observed time scale determined by Δt; Barnett & Moorcroft, 2008; Signer, Fieberg, & Avgar, 2017). Thus, questions that require integrating hazards over space (e.g., eqn. (10)) are more difficult to address, but may be computed using simulation modeling (Signer et al., 2017), by solving the master equation (formed by multiplying the right hand side of eqn. (13) by u(s′, t) and then integrating over G with respect to s′) for its steady state (Potts et al., 2014a, 2014b), or in some cases, by translating the fitted model into a partial differential equation model with analytical steady-state distribution (Potts & Schlägel, 2020). We also note that alternative modeling frameworks exist with parameters that directly describe long-term relative risk (e.g., Michelot et al., 2019a, 2019b; Michelot, Blackwell, Chamaillé-Jammes, & Matthiopoulos, 2020), but these methods are more computationally challenging to implement, and therefore, less likely to be widely used in applied settings. The amt package has a basic capacity to simulate the utilization distribution based on a parameterized integrated step-selection function (Signer et al., 2017), and we expect this approach to become more flexible in the near future.
Using an integrated step-selection approach (e.g., as in Fig. 4), it is also possible to draw ecological inference using the selection-free movement kernel. For example, the fitted step-length and turn-angle distributions can tell us how much more likely an animal is to take large versus small steps or to turn left or right relative to moving straight. We can also calculate moments of these distributions under different environmental conditions, which can be informative when our models include interactions between movement characteristics and environmental predictors. For example, we could calculate the expected selection-free displacement rates (and/or directionality) as a function of local snow depth (that is, if snow depth was included in our model as an interaction with step length). To calculate these expected values we must first adjust the ‘tentative’ parameters used to sample available steps (e.g., if we use a gamma distribution, the tentative shape and scale parameters) using the coefficient estimates obtained for step length (and/or its transformation) and cos(turn angle). The details of how to carry out these adjustments are provided in Supplementary Appendix C and in Avgar et al. (2016). Once the selection-free movement parameters are obtained, one can use them to calculate various aspects of the (theoretical) distributions of step lengths and turn angles, such as the mean, the median, or the 95% confidence bounds (see Supplementary Appendix B for examples).
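As an illustration of calculating a moment under different environmental conditions, the sketch below (Python, for illustration only) computes the mean of the selection-free step-length distribution at several snow depths. It assumes a gamma step-length distribution and a model that includes step length, log step length, and a step length by snow depth interaction. All parameter and coefficient values are invented for the example, and the function name is ours.

```python
# Hypothetical tentative gamma parameters used to sample available steps.
SHAPE0, SCALE0 = 2.0, 50.0

# Hypothetical fitted coefficients.
B_SL = 0.004         # step length (main effect)
B_LOG_SL = 0.5       # log step length
B_SL_SNOW = -0.0002  # step length x snow depth (per cm of snow)


def mean_step_length(snow_depth_cm):
    """Mean of the selection-free gamma step-length distribution
    at a given snow depth (mean = shape * scale)."""
    beta_sl = B_SL + B_SL_SNOW * snow_depth_cm  # depth-specific coefficient
    shape = SHAPE0 + B_LOG_SL
    rate = 1.0 / SCALE0 - beta_sl
    if shape <= 0 or rate <= 0:
        raise ValueError("adjusted parameters are outside the gamma family")
    return shape / rate


means = {depth: mean_step_length(depth) for depth in (0, 20, 40)}
for depth, value in means.items():
    print(f"snow depth {depth} cm: mean step length {value:.1f} m")
```

With these invented coefficients, the expected selection-free displacement declines as snow depth increases, which is the kind of pattern such an interaction is designed to capture.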
Discussion
Using simple examples, we have highlighted how connecting resource-selection functions to IPP models and weighted distribution theory helps with interpreting parameters in resource-selection functions. We have also reviewed step-selection analyses and demonstrated how to estimate movement and habitat-selection parameters when conducting an integrated step-selection analysis using the amt package. So far, we have focused on interpreting results when analyzing data from a single individual. We end with a brief discussion addressing statistical dependencies, particularly when analyzing data from multiple individuals, along with issues related to model transferability and parameter sensitivity to changes in habitat availability and species population density.
Statistical Dependencies
Earlier, we highlighted the importance of statistical independence as it applies to individual locations when estimating resource-selection functions. We also noted that step-selection analyses typically assume step lengths and turn angles are independent of each other and also over time, though it is possible to account for these correlations using appropriate interactions (e.g., between step length at time t and time t − 1, step length and turn angle both at time t). It would be useful to have multivariate distributions available that are capable of describing correlated step lengths and turn angles and any inherent autocorrelation. It is plausible, however, that models that allow movement parameters to vary by habitat type, using interactions between step length, turn angle, and habitat covariates, will be able to account for much of the autocorrelation and cross-correlation (between step lengths and turn angles) present in the data. Similarly, autocorrelation and cross-correlations may be accommodated by models that include a (possibly latent) behavioral state, with movement and habitat-selection parameters that are state-dependent (Nicosia, Duchesne, Rivest, & Fortin, 2017; Suraci et al., 2019).
In addition to cross-correlation between step lengths and turn angles and serial dependencies, individuals living in different environments may exhibit different habitat-selection patterns and thus, repeated observations on the same set of individuals will induce further statistical dependencies. A simple strategy for dealing with repeated measures when individuals can be assumed to be independent is to fit models to individual animals and then treat the resulting coefficients as data when inferring population-level patterns (Murtaugh, 2007; Fieberg et al., 2010). For example, sample means of the regression coefficients can be used to characterize average habitat-selection parameters. Estimating among-animal variability is trickier due to sampling error; naively ignoring sampling error will lead to a positive bias in estimates of among-animal variability, but more formal two-step methods can address this issue (Craiu, Duchesne, Fortin, & Baillargeon, 2011, 2016; Dickie et al., 2020). Alternatively, generalized linear mixed models with random coefficients can be used to quantify among-animal variability in resource-selection and step-selection analyses (Muff, Signer, & Fieberg, 2020).
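The two-stage idea can be sketched numerically. The example below (Python, for illustration only) uses hypothetical per-animal estimates of a single coefficient and their standard errors. The variance correction shown is a simple method-of-moments adjustment that we use to illustrate the bias, not the formal two-step estimators cited above.

```python
import statistics

# Hypothetical per-animal estimates of one habitat-selection coefficient
# and their standard errors (one entry per animal).
beta_hat = [0.42, 0.10, 0.35, 0.61, 0.22]
se = [0.10, 0.15, 0.12, 0.20, 0.08]

# Population-level mean coefficient.
pop_mean = statistics.mean(beta_hat)

# Naive among-animal variance: true among-animal variability plus
# sampling error, so it is biased upward.
naive_var = statistics.variance(beta_hat)

# Simple moment-based correction: subtract the average sampling variance.
mean_sampling_var = statistics.mean([s ** 2 for s in se])
corrected_var = max(0.0, naive_var - mean_sampling_var)

# Standard error of the population mean, treating animals as independent.
se_pop_mean = (naive_var / len(beta_hat)) ** 0.5

print(round(pop_mean, 3), round(naive_var, 5), round(corrected_var, 5))
```

The naive variance remains the appropriate quantity for the standard error of the population mean, because the spread of the estimates reflects both sources of variation.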
Although it is possible to conduct integrated step-selection analyses with hierarchical models containing random effects, we have much to learn about how these approaches perform in practice. For example, Muff et al. (2020) found that parameters describing among-animal variability in habitat-selection parameters were biased low when movement characteristics were included in the model. Mixed-effect models with random coefficients are also “parameter hungry”, requiring p(p + 1)/2 variance and covariance parameters to be estimated, where p is the number of random coefficients. Models that allow all coefficients to be animal-specific and to covary are thus likely to be computationally challenging to fit and problematic for small data sets containing only a few individuals. For this reason, Muff et al. (2020) assumed coefficients did not covary in their applied examples. In the context of our fisher analysis, this equates to assuming that knowing an individual’s coefficient for popden tells us nothing about that animal’s parameters for elevation or landuseC variables. For categorical variables, it is natural to expect parameters to have a negative covariance (since, for example, spending more time in forest must come at the expense of spending less time in other landuse categories). Research evaluating the performance of mixed-effect step-selection analyses under various data-generating scenarios would be helpful for evaluating robustness to assumption violations (e.g., those regarding the distribution of random parameters).
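To make the “parameter hungry” point concrete, the short sketch below (Python, for illustration only) counts variance and covariance parameters with and without covariances among random coefficients. The choice of p = 4 assumes random coefficients for popden, elevation, and two landuseC contrasts; that count is our assumption for the example.

```python
def n_variance_parameters(p, allow_covariance=True):
    """Number of variance (and covariance) parameters for p random
    coefficients: p variances plus p * (p - 1) / 2 covariances."""
    if allow_covariance:
        return p * (p + 1) // 2
    return p


# Assumed example: popden, elevation, and two landuseC contrasts.
p = 4
with_cov = n_variance_parameters(p)
without_cov = n_variance_parameters(p, allow_covariance=False)
print(with_cov, without_cov)
```

The count grows quadratically with p when covariances are estimated but only linearly when coefficients are assumed not to covary.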
Sensitivity of Selection Coefficients to Species Population Density and Habitat Availability
Before concluding, we feel it is important to briefly discuss the oft-observed pattern of density and availability dependence in habitat-selection inference (Mysterud & Ims, 1998; Matthiopoulos, Hebblewhite, Aarts, & Fieberg, 2011; Matthiopoulos et al., 2015). Density-dependent inference may be observed when the same analysis is applied to individuals or populations of the same species, under similar environmental conditions, but at different population densities. Availability dependence (also referred to as a “functional response”) may be observed when the same analysis is applied to individuals or populations of the same species, which experience different landscape-scale resource or habitat availabilities. For example, van Beest, McLoughlin, Mysterud, & Brook (2016) found that individual elk display availability-dependent resource-selection patterns (switching from selection to avoidance of certain habitats as a function of the availability of these habitats within their home range), but that the strength of this functional response depended on elk population density. Such context dependencies are in fact so common that we do not know of a single instance where researchers were looking for them and failed to find them. Recently, Avgar, Betini, & Fryxell (2020) showed that such context dependencies in habitat-selection patterns are expected to emerge even under the simplest theoretical model of an Ideal Free Distribution (Fretwell, 1969). Thus, habitat-selection models often have poor predictive capacity when transferred across different study areas, or even within the same area over time (e.g., Torres et al., 2015). Yet, these differences may also be exploited; modeling frameworks that leverage data from multiple environments and across a range of population densities can potentially increase predictive capabilities (Matthiopoulos, Field, & MacLeod, 2019).
As with any other attempt to model complex ecological data, critical evaluation of model performance for both within and out-of-sample data is essential (Fieberg, Forester, et al., 2018).
Authors’ Contributions
J.F. developed the idea for the review, led the writing of the manuscript, and drafted the initial version of Supplementary Appendix A; B.S. and J.S. drafted the initial version of Supplementary Appendix B; B.S. and T.A. drafted the initial version of Supplementary Appendix C. All authors contributed critically to the manuscript text and Additional Files, and gave final approval for publication.
Data Availability
All of the data used in this paper are available from within the amt package (Signer et al., 2019).
Supporting Information
Supplementary Appendix A: AppA_RSF_examples.html, a tutorial demonstrating how to fit and interpret parameters in resource-selection functions.
Supplementary Appendix B: AppB_SSF_examples.html, a tutorial demonstrating how to fit and interpret parameters and output when conducting an integrated step-selection analysis.
Supplementary Appendix C: AppC_iSSA_movement.html, a description of methods used to adjust ‘tentative’ parameters in step-length and turn-angle distributions for the effects of habitat selection.
Acknowledgements
We thank J.R. Potts for helpful comments on a previous draft. JF received partial salary support from the Minnesota Agricultural Experimental Station. TA received partial salary support from the Utah Agricultural Experimental Station.