Generative Adversarial Networks (GAN) for the simulation of central-place foraging trajectories

Miniature electronic devices such as GPS tags have enabled ecologists to document relatively large amounts of animal trajectories. Modeling such trajectories may attempt (1) to explain the mechanisms underlying observed behaviors and (2) to elucidate ecological processes at the population scale by simulating multiple trajectories. Existing approaches to animal movement modeling have mainly addressed the first objective, and they quickly become limited when used for simulation. Individual-based models built on ad-hoc formulations and empirical parametrizations lack generalizability, while state-space models and stochastic differential equation models, although based on rigorous statistical inference, consist of first-order Markovian models calibrated at the local scale, which can lead to overly simplistic descriptions of trajectories. We introduce a 'state-of-the-art' tool from artificial intelligence, Generative Adversarial Networks (GANs), for the simulation of animal trajectories. A GAN consists of a pair of deep neural networks that aim at capturing the data distribution of some experimental dataset, and that enable the generation of new, statistically similar instances of data. In this study, we aim on the one hand to identify relevant deep network architectures for simulating central-place foraging trajectories, and on the other hand to evaluate the benefits of GANs over classical methods such as state-switching Hidden Markov Models (HMMs). We demonstrate the outstanding ability of GANs to simulate 'realistic' seabird foraging trajectories. In particular, we show that deep convolutional networks are more efficient than LSTM networks, and that GAN-derived synthetic trajectories reproduce the Fourier spectral density of observed trajectories better than those simulated using HMMs. As a consequence, unlike HMMs, GANs capture the variability of large-scale descriptive statistics such as foraging trip distance, duration and tortuosity.
GANs offer a relevant alternative to existing approaches to modeling animal movement, since they are calibrated to reproduce multiple scales simultaneously, thus freeing ecologists from the assumption of first-order Markovianity. GANs also provide an ultra-flexible and robust framework that could further take environmental conditions, social interactions or even bio-energetic models into account and tackle a wide range of key challenges in movement ecology.

1 Introduction

[...] the maximization of the likelihood of observed trajectories at local scales. It typically comes down to estimating the joint distribution of step distance and heading turning angle from GPS tracks sampled at regular time intervals.

A wide range of probabilistic models can be restated as the composition of a deterministic function G and of the sampler of a random latent variable. We may illustrate this point for the correlated random walk (CRW) as presented in Patterson et al. (2008). Let us denote by s_t and φ_t the step length and the heading at time t. The CRW can be written as:

s_t = F⁻¹(z_t^G; θ_F),    φ_t = φ_{t-1} + H⁻¹(z_t^H; θ_H)

where F and H are cumulative density functions with parameters θ_F and θ_H, often chosen as Log-Normal and Von Mises distributions, and where z_t^G and z_t^H are independent samples from the uniform distribution over [0,1].
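As a concrete illustration, the inverse-transform sampling behind the CRW (Log-Normal step lengths via F⁻¹, Von Mises turning angles via H⁻¹) can be sketched in a few lines. The parameter values below are arbitrary placeholders, not fitted values.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def simulate_crw(n_steps, mu=0.0, sigma=0.5, kappa=2.0):
    # Latent uniforms on [0, 1], one pair per time step
    z_g = rng.uniform(size=n_steps)
    z_h = rng.uniform(size=n_steps)
    # s_t = F^-1(z_g; theta_F): Log-Normal step lengths via the inverse CDF
    steps = stats.lognorm.ppf(z_g, s=sigma, scale=np.exp(mu))
    # phi_t = phi_{t-1} + H^-1(z_h; theta_H): Von Mises turning angles
    headings = np.cumsum(stats.vonmises.ppf(z_h, kappa))
    # Integrate steps and headings into (x, y) positions
    return np.cumsum(np.stack([steps * np.cos(headings),
                               steps * np.sin(headings)], axis=1), axis=0)

track = simulate_crw(200)
```

This is exactly the "deterministic function applied to uniform latent samples" view of the CRW: all randomness enters through `z_g` and `z_h`.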

The generative model in a GAN also relies on the application of a deterministic function G to random samples of a latent variable z drawn according to a predefined distribution. Function G is chosen within a parametric family of differentiable functions and implemented as a neural network for flexibility. The other major difference with the statistical inference approaches classically exploited in movement ecology lies in the calibration from data. Rather than stating the calibration as the maximization of a likelihood criterion, the calibration of the generator of a GAN involves the simultaneous training of another deep network D (referred to as the discriminator) that learns how to distinguish simulated data (i.e., G(z)) from real data. The architecture of a GAN is given in Fig. 1A. [...] A dense layer is applied to the output of the generator to output a time series of positions (see Fig. 2).
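As a minimal sketch of such a generator, the following PyTorch module maps a latent sequence z to a time series of (longitude, latitude) positions through an LSTM followed by a dense layer. Layer sizes are illustrative assumptions, not the settings used in the study.

```python
import torch
import torch.nn as nn

class LSTMGenerator(nn.Module):
    """Maps a latent sequence z to a trajectory of (lon, lat) positions."""
    def __init__(self, latent_dim=8, hidden_dim=64):
        super().__init__()
        self.lstm = nn.LSTM(latent_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, 2)  # dense layer -> (lon, lat) per step

    def forward(self, z):        # z: (batch, seq_len, latent_dim)
        h, _ = self.lstm(z)      # per-step hidden states
        return self.head(h)      # (batch, seq_len, 2)

G = LSTMGenerator()
fake = G(torch.randn(4, 100, 8))   # 4 synthetic trajectories of 100 positions
```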

We can also exploit an LSTM for the discriminator. Given a sequence of positions (longitude, latitude), the LSTM acts as an encoder of this sequence into some higher-dimensional latent space. A dense layer is then applied to assign a probability of being realistic at each position of the sequence. Overall, the output of the discriminator is the mean of these probabilities, used to assess the quality of the whole trajectory (see Fig. 2).
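A minimal PyTorch sketch of this LSTM discriminator, under the same caveat that the dimensions are illustrative assumptions:

```python
import torch
import torch.nn as nn

class LSTMDiscriminator(nn.Module):
    """Encodes a (lon, lat) sequence, scores each position, averages the scores."""
    def __init__(self, hidden_dim=64):
        super().__init__()
        self.lstm = nn.LSTM(2, hidden_dim, batch_first=True)
        self.head = nn.Sequential(nn.Linear(hidden_dim, 1), nn.Sigmoid())

    def forward(self, x):                 # x: (batch, seq_len, 2)
        h, _ = self.lstm(x)               # per-step latent encoding
        p = self.head(h).squeeze(-1)      # per-position probability of realism
        return p.mean(dim=1)              # mean probability for the whole trajectory

D = LSTMDiscriminator()
scores = D(torch.randn(4, 100, 2))        # one score per trajectory
```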

Regarding the CNN-based discriminator, we also applied successive strided convolutions in order to transform the initial trajectory into time series of decreasing length and increasing number of features, until we obtained a latent vector describing the whole trajectory. We used batch normalization and LeakyReLU activations after every strided convolution, as suggested by Radford et al. (2016). The last layer is a dense layer with a sigmoid activation that transforms the latent representation into a probability of the trajectory being realistic (see Fig. 2).
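A corresponding sketch of the CNN-based discriminator, with strided 1-D convolutions, batch normalization and LeakyReLU as described above; channel widths and depth are illustrative assumptions, not the study's exact configuration.

```python
import torch
import torch.nn as nn

class CNNDiscriminator(nn.Module):
    """Strided convolutions halve the sequence length while widening features,
    ending in a dense sigmoid layer that scores the whole trajectory."""
    def __init__(self, seq_len=128):
        super().__init__()
        layers, c_in = [], 2
        for c_out in (32, 64, 128, 256):
            layers += [nn.Conv1d(c_in, c_out, kernel_size=4, stride=2, padding=1),
                       nn.BatchNorm1d(c_out),
                       nn.LeakyReLU(0.2)]
            c_in = c_out
        self.conv = nn.Sequential(*layers)
        self.head = nn.Sequential(nn.Flatten(),
                                  nn.Linear(256 * (seq_len // 16), 1),
                                  nn.Sigmoid())

    def forward(self, x):       # x: (batch, 2, seq_len), trajectory as channels
        return self.head(self.conv(x)).squeeze(-1)

D = CNNDiscriminator()
p = D(torch.randn(4, 2, 128))   # one probability per trajectory
```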
Figure 2: LSTM vs CNN: Architecture of the LSTM and CNN networks used in this study.

Adversarial training and spectral regularization
For a given architecture, the networks' parameters are estimated using adversarial training, i.e., the two networks compete in the minimax two-player game given by Eq. 2. Discriminator D is trained to maximize the probability of assigning the correct label to both training examples and samples from G, i.e., to maximize log D(x) + log(1 − D(G(z))).
Numerically, we apply stochastic gradient descents over the discriminator and the generator successively, where at each iteration we compute the training losses for a randomly sampled subset of m trajectories within the training dataset:

L_D = −(1/m) Σ_{i=1}^m [log D(x_i) + log(1 − D(G(z_i)))]
L_G = (1/m) Σ_{i=1}^m log(1 − D(G(z_i)))

We may complement the training loss of the generator with additional terms, including application-specific ones (Ledig, 2020). We tested here a similar approach, adding a spectral loss L_spectral, which penalizes the discrepancy between the Fourier spectral densities of simulated and observed trajectories, to the generator's gradient descent in order to increase learning stability. Details on the structure are available in our github repository.
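The alternating gradient steps and the spectral penalty can be sketched as follows. The networks here are trivial stand-ins so the example runs end to end, and the spectral term is a generic power-spectrum discrepancy, not necessarily the exact form of Eq. 4.

```python
import torch
import torch.nn as nn

def fft_power(x):
    # Mean Fourier power spectrum over a batch of (batch, T, 2) trajectories
    return torch.fft.rfft(x, dim=1).abs().pow(2).mean(dim=0)

# Stand-in networks: any generator (m, T, latent) -> (m, T, 2) and
# discriminator (m, T, 2) -> (m, 1) could be plugged in instead.
torch.manual_seed(0)
G = nn.Sequential(nn.Linear(8, 32), nn.Tanh(), nn.Linear(32, 2))
D = nn.Sequential(nn.Flatten(), nn.Linear(100 * 2, 1), nn.Sigmoid())

def train_step(real, opt_g, opt_d, lam=1.0):
    m, T, _ = real.shape
    bce = nn.BCELoss()
    ones, zeros = torch.ones(m, 1), torch.zeros(m, 1)

    # Discriminator step: assign the correct label to real and generated batches
    fake = G(torch.randn(m, T, 8)).detach()
    loss_d = bce(D(real), ones) + bce(D(fake), zeros)
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # Generator step: fool D, plus a spectral penalty on the mean power spectrum
    fake = G(torch.randn(m, T, 8))
    loss_spec = (fft_power(fake) - fft_power(real)).abs().mean()
    loss_g = bce(D(fake), ones) + lam * loss_spec
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
    return loss_d.item(), loss_g.item()

real = torch.cumsum(0.1 * torch.randn(16, 100, 2), dim=1)  # toy "observed" tracks
ld, lg = train_step(real,
                    torch.optim.Adam(G.parameters(), lr=1e-3),
                    torch.optim.Adam(D.parameters(), lr=1e-3))
```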

HMM. For comparison, we fitted a 'state-of-the-art' state-switching HMM to the seabird CPF trajectories. We followed the methodology presented by Michelot et al. (2017), which relies on rigorous statistical inference.

Movements were described as a sequence of step lengths and turning angles, which we fitted with a gamma distribution and a von Mises distribution respectively. Three behavioural states were used for the Peruvian datasets, i.e., "searching", "foraging" and "inbound", while a fourth state, "resting", was added for the Brazilian dataset (Fig. 6). For the states "searching", "foraging" and "resting", we described movement as correlated random walks (CRW), while for the state "inbound" we used a biased random walk (BRW) with attraction toward the colony. In order to force the return to the colony, we fixed some terms of the transition matrix, thus ensuring that the sequence of states first alternates between "searching", "foraging" and "resting", and is then forced to stay in the state "inbound".
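The transition-matrix constraint can be illustrated with placeholder probabilities (not the fitted values): fixing the last row makes "inbound" an absorbing state, so a simulated state sequence can never leave it once the return leg has started.

```python
import numpy as np

# Illustrative 4-state transition matrix (searching, foraging, resting, inbound);
# probabilities are placeholders, only the structural zeros matter here.
states = ["searching", "foraging", "resting", "inbound"]
P = np.array([
    [0.80, 0.10, 0.05, 0.05],   # searching
    [0.15, 0.75, 0.05, 0.05],   # foraging
    [0.10, 0.10, 0.75, 0.05],   # resting
    [0.00, 0.00, 0.00, 1.00],   # inbound: fixed terms forbid leaving the state
])
assert np.allclose(P.sum(axis=1), 1.0)   # rows are probability distributions

rng = np.random.default_rng(1)
seq, s = [], 0
for _ in range(300):
    s = rng.choice(4, p=P[s])            # sample the next behavioural state
    seq.append(states[s])
```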

182
These state-switching HMMs were fitted to real data according to a maximum likelihood criterion. Fitted models were then used to simulate trajectories. The initial step was sampled from real data, and we iteratively sampled the next steps until the trajectory went back to the colony. In practice, we stopped the simulation once a location was simulated within a 1-km radius around the colony.

[...] (Table 2). GANs with LSTM-based discriminators seemed particularly unstable, with highly variable performance through epochs (Fig. 4). Importantly, only GANs with CNN-based generators managed to simulate looping trajectories. For instance, the 'LSTM-CNN' GAN generated relatively good trajectories, with a spectral error L_spectral lower than 3, yet without being able to loop (Fig. 3).

[...] (Table 1). The four different GANs correspond to every generator-discriminator pair for the considered LSTM and CNN architectures: e.g., we call 'LSTM-CNN' the GAN which has an LSTM network as generator and a CNN as discriminator.

[...] (Fig. 6). However, the spectral distribution of GAN-derived synthetic trajectories better matched the spectral distribution of real trajectories (Fig. 5). In particular, the mean spectral error L_spectral was about 4 times smaller using GANs than using HMMs (Table 3). This was particularly visible at the highest frequencies (Fig. 5). On the Peruvian dataset, the HMM failed to reproduce spectral distributions at both lower and higher frequencies (Fig. 5A), while on the Brazilian dataset it failed at the higher frequencies only (Fig. 5B). By contrast, HMMs outperformed GANs at sampling realistic step distributions (Fig. 7).
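The spectral comparison underlying these results can be sketched as follows; the distance below is a generic stand-in for Eq. 4, whose exact form is not reproduced in this excerpt.

```python
import numpy as np

def mean_spectrum(tracks):
    """Mean Fourier power spectrum over a dataset of (n, T, 2) trajectories."""
    spec = np.abs(np.fft.rfft(tracks, axis=1)) ** 2
    return spec.mean(axis=(0, 2))

def spectral_error(real, sim):
    """Illustrative distance between mean log-spectra of two trajectory datasets."""
    return np.abs(np.log10(mean_spectrum(real)) - np.log10(mean_spectrum(sim))).mean()

# Two toy datasets of 100 random-walk trajectories, standing in for
# observed and simulated tracks.
rng = np.random.default_rng(2)
real = np.cumsum(0.1 * rng.standard_normal((100, 128, 2)), axis=1)
sim = np.cumsum(0.1 * rng.standard_normal((100, 128, 2)), axis=1)
err = spectral_error(real, sim)
```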

Yet, GAN models better capture the real data distribution, as they are able to simulate a set of trajectories that shares global statistics with the reference dataset. For instance, our synthetic trajectories have consistent trip distance, trip duration and straightness index distributions (see Fig. 7). The straightness index of a trajectory is defined as two times the quotient of the maximum range to the colony by the trip total distance, and is a proxy for tortuosity (Benhamou, 2004). The trained GANs also capture spatial information, as they reproduce the position distributions of observed trajectories (Fig. 8 and Table 3). GAN-derived synthetic trajectories were indeed mainly heading toward [...] (Table 1) [...] that GANs are also great tools for the generation of other ecological data such as animal trajectories.

Figure 4: Convergence of GAN architectures over 5000 epochs: The four different GANs correspond to every generator-discriminator pair: e.g., we call 'LSTM-CNN' the GAN which has an LSTM network as generator and a CNN as discriminator. Distance to the true spectral density is computed with Eq. 4. In these plots, we computed the mean Fourier spectrum for datasets of 100 simulated trajectories.
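The trip-level statistics mentioned above (trip distance, duration, straightness index) can be computed directly from a trajectory. The sketch below assumes the colony is at the first recorded position and a fixed sampling interval; both are illustrative assumptions.

```python
import numpy as np

def trip_statistics(track, dt_hours=1.0):
    """Trip distance, duration and straightness index for one (T, 2) trajectory.
    Straightness = 2 * max range to colony / total trip distance (Benhamou, 2004);
    the colony is assumed to be the first position."""
    steps = np.linalg.norm(np.diff(track, axis=0), axis=1)
    total = steps.sum()
    max_range = np.linalg.norm(track - track[0], axis=1).max()
    return {
        "distance": total,
        "duration_h": (len(track) - 1) * dt_hours,
        "straightness": 2.0 * max_range / total,
    }

# A straight out-and-back trip along a line has straightness exactly 1.
out = np.linspace([0.0, 0.0], [10.0, 0.0], 11)
track = np.vstack([out, out[::-1][1:]])
stats = trip_statistics(track)
```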

GANs showed a great ability to capture the trajectory data distribution, except for the first-order step distribution.

In contrast, current state-of-the-art approaches such as multi-state HMMs are calibrated at a local scale and [...] (Table 1) to reproduce the step distribution well. This may be a general property of convolutional GAN architectures: for instance, GANs for image generation may render overall object appearance well while failing to realistically simulate fine-scale textures (Cao