Assessing bias and robustness of social network metrics using GPS based radio-telemetry data

Abstract

of the more commonly reported network metrics, as calculated on interactions data constructed from observations of individuals.

One of the fundamental requirements for performing social network analysis on animals is that a substantial portion of individuals in the population is uniquely identified and observed for a sufficient period (Farine and Whitehead, 2015). Recent  Therefore, data representing a large sample size is a significant limitation, especially while analysing social networks (He et al., 2022). This is a concern as the relations among the members  It is therefore crucial that animal social network studies consider the robustness of current methodological approaches to data rarefaction and randomization, as both the collected data and analytical methods are prone to biases introduced by specifics of sampling protocols or the species under study (Sosa et al., 2021a). Networks constructed using a random subset of the population are termed partial networks (Silk et al., 2015). The effect of using partial networks on the properties of individual metrics in animal social network analysis has so far received little attention (Croft  and twitter reply networks (Bliss et al., 2014)). Another simulation study (Silk et

Despite the importance of such findings, research in this field has not progressed since then. In particular, there is a need to understand the level of confidence, and associated bias, with which the current methods adopted to estimate partial networks from subsampled populations actually capture the structure of the real-world animal social network (Farine and Whitehead, 2015; Sosa et al., 2021b; Silk et al., 2015).

Our paper aims to present methods that can assess the sufficiency (i.e., based on estimated bias and uncertainty) of the available data sample to perform social network analysis and obtain a measure of accuracy for global and node-level network metrics (Farine and Whitehead, 2015). Our approach is particularly suited (but not limited) to telemetry relocations considering their au-

1. The first step is to determine if the network structure obtained from the available sample of GPS observations captures any non-random aspects of the association. For this, we generate null networks by permuting a pre-network data stream. If a specific network metric does not meet this requirement, it should be discarded by researchers in their specific study case.

We also assess uncertainty by obtaining confidence intervals around the values of observed network statistics with the help of bootstrapping, which is also critical when it comes to comparing networks (e.g., daily or seasonal changes in sociality, or between two populations of the same species).

4. The fourth and final step is to check how the node-level network metrics are affected by the proportion of individuals present in the sample. We use correlation and regression analyses to assess the robustness of node-level characteristics.

We conclude our paper by outlining the methods described above and provide a step-wise protocol for ecologists on the application of these to their datasets. We have recently published a companion R software package aniSNA (Kaur, 2023), which serves as a ready-made toolkit for ecologists to apply the methods described in this paper in their animal social network studies.

Data

We collated high-frequency GPS telemetry relocation datasets from five species of ungulates, namely caribou (Rangifer tarandus), elk (Cervus canadensis), mule deer (Odocoileus hemionus), pronghorn (Antilocapra americana), and roe deer (Capreolus capreolus) belonging to four different geographical regions (Table 1). These large datasets consist of observations from a proportion of individuals sampled from the population and contain a unique animal identity number, date, time, and spatial coordinates of the observations.

Table 1. Summary of the data available for five species of ungulates monitored using satellite telemetry in North America and Europe.

Identifying Associations from Raw GPS data

We obtained network structure from the raw data stream by identifying associations between each pair. We considered a pair of individuals in the sample to be associating if the two animals were observed within s metres from each other and within a time frame of t minutes. The value of the spatial threshold s can be chosen by applying a statistical approach to the observed data. He et al. (2022) suggest that one such approach could be to use the first mode from the distribution of inter-individual distances, as it likely represents socially associating individuals. The temporal threshold t is dictated by the fix rates in telemetry data. For example, GPS collars on animals send signals consisting of spatial coordinates after a predetermined time interval. These signals can be received a few seconds (up to a few minutes) before or after the expected time. Therefore, the temporal threshold should be chosen so that it accounts for this flexibility. Researchers should generally pick a threshold based on their device accuracy, species ecology, and research question.

To assess if the interactions captured by the observed sample were genuinely caused by social preferences, we generated null models. Null models were constructed to account for non-social factors that lead to the co-occurrence of animals. In animal social network analysis, null models are broadly classified in two ways: network permutations and pre-network permutations (Farine, 2017). Network permutations are performed after the network is generated from the data, whereas pre-network permutations are performed on the data stream before generating networks from it.
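The association rule described above (two fixes from different animals within s metres and t minutes of each other) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the aniSNA implementation: the function name find_associations, the tuple layout of a fix, the use of projected coordinates in metres and times in minutes, and the toy fixes are all hypothetical.

```python
from collections import Counter
from math import hypot

def find_associations(fixes, s_metres, t_minutes):
    """Count associations between pairs of animals.

    fixes: list of (animal_id, time_in_minutes, x_metres, y_metres).
    Coordinates are assumed to be in a projected (metric) system.
    Returns a Counter mapping frozenset({id_a, id_b}) to the number of
    fix pairs within s_metres and t_minutes of each other.
    """
    counts = Counter()
    ordered = sorted(fixes, key=lambda f: f[1])  # sort by time
    for i, (id_a, t_a, x_a, y_a) in enumerate(ordered):
        for id_b, t_b, x_b, y_b in ordered[i + 1:]:
            if t_b - t_a > t_minutes:
                break  # all later fixes are even further away in time
            if id_a != id_b and hypot(x_a - x_b, y_a - y_b) <= s_metres:
                counts[frozenset((id_a, id_b))] += 1
    return counts

# Hypothetical toy data: (animal, minutes since start, x, y)
fixes = [
    ("A", 0.0, 0.0, 0.0),
    ("B", 3.0, 6.0, 8.0),    # 10 m from A, 3 minutes later
    ("C", 5.0, 500.0, 0.0),  # far away from both A and B
    ("A", 60.0, 0.0, 0.0),
    ("C", 64.0, 3.0, 4.0),   # 5 m from A, 4 minutes later
]
edges = find_associations(fixes, s_metres=15, t_minutes=7)
```

The resulting counts can serve as edge weights of the association network. The pairwise scan is quadratic in the worst case, so a spatial index would likely be preferable for large telemetry datasets.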

GPS telemetry observations generate data in the form of autocorrelated streams. In the permuted versions of the data, we wanted to maintain this autocorrelation structure of each individual's movements but randomize the contacts. Therefore, we obtained pre-network datastream permu-

We randomly sub-sampled m nodes from the observed network of N nodes, where m < N, without replacement. All the associations among the sampled nodes were preserved, and the rest were dropped. This resulted in the network structure that would have been obtained if only these individuals had originally been tagged from the population. In this way, we drew 100 samples of size m, where the value of m ranged from 10% to 90% of the total nodes forming each network for five  that should be adopted for social network studies on the available samples.

We also applied a sub-sampling approach on the permuted networks to determine under what sampling level the observed networks start to resemble the random networks. We sub-sampled

In each replication, edges between two different resampled nodes were retained, whereas edges between the same node resampled twice were sampled uniformly at random from the set of all original edges. Therefore, each bootstrap replication network comprised the same number of nodes (animals) as the original network; however, some of the original nodes were absent, some were present once, and some more than once.
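The node-resampling bootstrap described above can be sketched in Python as follows. This is an illustrative sketch, not the aniSNA code. Several choices are assumptions: the names bootstrap_replicate, mean_strength and percentile_interval, the toy network, the use of a percentile interval, and in particular the reading of "the set of all original edges" as all original dyads (zero weights included).

```python
import random

def bootstrap_replicate(nodes, weights, rng):
    """Draw one bootstrap replicate by resampling nodes with replacement.

    nodes: list of node ids; weights: dict frozenset({a, b}) -> edge weight
    (dyads missing from the dict have weight 0).
    Returns (sampled, replicate): the resampled node ids and a dict mapping
    position pairs (i, j), i < j, to the weight between those two draws.
    """
    n = len(nodes)
    # Assumption: the pool for duplicate-node dyads holds every original
    # dyad weight, including zeros.
    dyad_pool = [weights.get(frozenset((a, b)), 0)
                 for i, a in enumerate(nodes) for b in nodes[i + 1:]]
    sampled = [rng.choice(nodes) for _ in range(n)]
    replicate = {}
    for i in range(n):
        for j in range(i + 1, n):
            a, b = sampled[i], sampled[j]
            if a != b:
                # Two different original nodes: keep their original edge.
                replicate[(i, j)] = weights.get(frozenset((a, b)), 0)
            else:
                # Same node drawn twice: draw a weight uniformly at random.
                replicate[(i, j)] = rng.choice(dyad_pool)
    return sampled, replicate

def mean_strength(n_nodes, dyad_weights):
    # Each dyad weight contributes to the strength of both of its endpoints.
    return 2 * sum(dyad_weights.values()) / n_nodes

def percentile_interval(values, level=0.95):
    ordered = sorted(values)
    last = len(ordered) - 1
    lower = ordered[int((1 - level) / 2 * last)]
    upper = ordered[round((1 + level) / 2 * last)]
    return lower, upper

# Hypothetical toy network with weighted edges.
nodes = ["A", "B", "C", "D", "E"]
weights = {frozenset(("A", "B")): 4, frozenset(("B", "C")): 2,
           frozenset(("C", "D")): 1, frozenset(("D", "E")): 3}
rng = random.Random(1)
sampled, replicate = bootstrap_replicate(nodes, weights, rng)
stats = [mean_strength(len(nodes), bootstrap_replicate(nodes, weights, rng)[1])
         for _ in range(1000)]
ci_low, ci_high = percentile_interval(stats)
```

Any global metric can be substituted for mean strength by computing it on each replicate before taking the interval.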

Bootstrapping has been used to infer uncertainty in animal social networks (see Lusseau et al. (2008), Whitehead (2008)); however, bootstrapping social network data should be used with care, as zero edges (which could result from unobserved associations rather than two animals not associating at all) are resampled as zeros across all replications (Farine and Carter, 2022). We, therefore, began by assessing whether such algorithms were appropriate for constructing confidence intervals for our chosen global network metrics. In particular, we wished to ensure that the confidence intervals were not too narrow, as this would lead to false positives in comparing networks, i.e. finding statistically significant differences where there may be none and therefore having an inflated Type 1 error rate. For example, consider a scenario in which two ecologists independently sample a random subset of individuals from the same population of animals. They then construct two social networks and compute network metrics, along with confidence intervals using some statistical methods. If they compare their results, obtain statistically significant differences (e.g., via a t-test), and conclude that the two social structures are different, they have made a Type 1 error because the two samples come from the same population and should therefore not appear statistically significantly different. Any differences are due to the random subsampling of the individuals, but the social structure of the population is the same.
We, therefore, wish to confirm that our proposed bootstrapping approach does not exhibit such problematic behaviour (i.e., that analyses based on two subsamples from the same population yield test statistics and p-values con-

To assess the accuracy of the node level metrics inferred from a given sample, we looked at the correlation between the values of the metrics in the observed sample and in a smaller sub-sample of the empirical data, as suggested by Silk et al. (2015). First, we calculated node-level metrics of degree, strength, betweenness, clustering coefficient, and eigenvector centrality for each node in the observed network. Then we sub-sampled nodes from the observed network at 10%, 30%, 50%, 70%, and 90% levels without replacement and calculated node level metrics for each sub-sample.  Lastly, we ran a regression analysis (Silk et al., 2015) to assess how the values of node-level metrics for partial networks relate to their values in the whole network (see Appendix 2: Regression analysis between node level metrics of sub-sampled and observed networks).
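The sub-sampling comparison of node-level metrics described above can be sketched in Python as follows. This is a minimal illustration, not the aniSNA implementation. It uses degree only and a Pearson correlation, both of which are assumptions made for brevity; the helper names and the toy network are hypothetical.

```python
import random
from math import sqrt

def degrees(nodes, edges):
    """Degree of each node in the subgraph induced by `nodes`."""
    keep = set(nodes)
    deg = {v: 0 for v in nodes}
    for a, b in edges:
        if a in keep and b in keep:
            deg[a] += 1
            deg[b] += 1
    return deg

def pearson(xs, ys):
    """Pearson correlation, or None if either input has zero variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)

def subsample_correlation(nodes, edges, proportion, rng):
    """Correlate degree in the full network with degree in a random
    sub-sample of nodes drawn without replacement."""
    k = max(2, round(proportion * len(nodes)))
    kept = rng.sample(nodes, k)
    full = degrees(nodes, edges)
    partial = degrees(kept, edges)  # only edges among the kept nodes survive
    return pearson([full[v] for v in kept], [partial[v] for v in kept])

# Hypothetical toy network (unweighted, undirected).
nodes = list("ABCDEFGHIJ")
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E"),
         ("E", "F"), ("F", "G"), ("G", "H"), ("H", "I"), ("I", "J"),
         ("A", "J")]
rng = random.Random(7)
full_r = subsample_correlation(nodes, edges, 1.0, rng)
half_runs = [subsample_correlation(nodes, edges, 0.5, rng) for _ in range(100)]
```

Repeating the call at each sampling level (10%, 30%, 50%, 70%, and 90%) and summarising the resulting correlations gives a rough picture of how quickly a node-level metric degrades as fewer individuals are sampled.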

The spatial threshold is set to 10 metres for the mule deer sample and 15 metres for the samples of the remaining species. The temporal threshold is arbitrarily chosen to be 7 minutes and accounts for delays in signal reception by the GPS devices. For example, if a GPS unit records a location at 09:57 AM, the observations recorded until 10:04 AM will be evaluated for potential interactions. Table 3 shows the values of network summary statistics for each of the five species. The mule deer sample has the highest mean degree and very high mean strength, suggesting a dense social network. The elk sample has the largest diameter, with a value of 9, which implies that it takes at most 9 steps to reach any individual from any other in the elk network. The pronghorn sample has the maximum transitivity and mean local clustering coefficient, indicating that any two associates of a pronghorn are likely to be associated with each other. Figure 1 shows the network structures obtained for all five species.  and transitivity tends to overlap the null distribution at 50% and 30% levels, respectively.

The correlation of all network metrics between the sub-sampled and observed network declined  fore, the inferences generated may not reflect true characteristics and can be highly sensitive to these decisions (Ferreira et al., 2020; Castles et al., 2014). Furthermore, the information available about the sampling protocols may be incomplete or may not be available at all. However, this does not imply that social network analysis should not be conducted on such data. It is essential to use statistical methods that help extract as much information as possible, along with details about the uncertainties due to partial data and sampling strategies. Performing permutations to randomize the autocorrelated GPS data stream (Farine, 2017; Spiegel et al., 2016) is a first step to distinguish the network metrics that best capture the non-random aspects of social interactions. Different network metrics capture different aspects of the network; some networks may have more non-random elements than others, depending on the species' sociality and the sampling strategies adopted to collect the data. The analyses helped highlight the network metrics that distinctively capture these non-random aspects. Based on these analyses, we recommend using the network metric mean strength as an assessment metric to identify if the captured interactions are significant enough to generate reliable analysis results. Apart from the four network metrics we chose to work with, it is helpful to run this analysis on multiple network metrics that seem suitable given  (Lusseau et al., 2008; Farine and Strandburg-Peshkin, 2015).

As a general rule, the smaller the sample size, the greater the uncertainty that can be expected  , 2008; Lusseau et al., 2008).
We presented bootstrapping as a powerful approach to evaluate confidence intervals around the  not be used with telemetry data, which is typically used to monitor a small proportion of the actual population. Also, we used data from multiple species of large herbivores with very different ecology and characteristics, including migratory/non-migratory and from very social to solitary species.

During the analysis, we deliberately disregarded the ecology of the species, because our goal was not to perform inter-species comparisons and make inferences on their respective social networks but to determine which social network metrics perform well or poorly across the species.

Despite our a priori disregard of the ecology of the five species for the reasons stated above, we found interesting differences among them which deserve to be discussed here. Firstly, the fact that the data collected from a more solitary species such as roe deer (see Table 3) better capture the non-randomness of the association compared to more gregarious species such as elk suggests that sample size (in proportion to the actual population size) should be higher in more gregarious species. In addition, sampling regimes can affect the social network patterns and related ecological inference. For example, the high density value of the roe deer network compared to the distribution of null networks could be due to the fact that the sampling was done across six spatially separated capture sites (within 10 x 10 km). This results in very low density values when the data is permuted across these six clusters (Figure 7). By comparison, the mule deer's initial locations (Figure 7) show that the network is already very dense. In the permuted versions of the raw data,  (Table 3). In contrast, the roe deer network has the lowest mean strength and mean degree, explained by their six spatially separated capture sites.  longer duration with low temporal resolution or a shorter duration with high temporal resolution.

In our analysis, sub-sampling on the observed samples is done randomly. However, this does not necessarily align with the sampling strategies adopted in practice. Smith and Morgan (2016) investigate the effects on estimates of key network statistics when central nodes are more or less likely to be missing. Application of our methods to determine how the network metrics scale when a different sampling strategy is adopted would be valuable (e.g., whether it is better to sample entire groups, or focus on greater sampling frequency of individuals). Another vital direction is to test the methods presented in this paper on GPS telemetry data of an entire population. If data on the whole population is available (e.g., a fenced one), it will be interesting to apply these methods to a subset of that data and test if the predictions align with the true values.

Along with all of the advantages it offers for understanding animal ecology, SNA presents certain challenges that hinder ecologists from using it to its full extent. We addressed a few of those challenges in this paper and introduced a four-step paradigm to assess the suitability of available data for SNA and extract information for further analysis. The methods are also provided as easy-to-use functions in an R package aniSNA (Kaur, 2023). This package allows ecologists to directly apply these statistical techniques and obtain easily interpretable plots to provide statistical evidence for choosing a par-