Individual exploration and selective social learning: Balancing exploration-exploitation trade-offs in collective foraging

Search requires balancing exploring for more options and exploiting the ones previously found. Individuals foraging in a group face another trade-off: whether to engage in social learning to exploit the solutions found by others or to solitarily search for unexplored solutions. Social learning can better exploit learned information and decrease the costs of finding new resources, but excessive social learning can lead to over-exploitation and too little exploration for new solutions. We study how these two trade-offs interact to influence search efficiency in a model of collective foraging under conditions of varying resource abundance, resource density, and group size. We modeled individual search strategies as Lévy walks, where a power-law exponent (µ) controlled the trade-off between exploitative and explorative movements in individual search. We modulated the trade-off between individual search and social learning using a selectivity parameter that determined how agents responded to social cues in terms of distance and likely opportunity costs. Our results show that social learning is favored in rich and clustered environments, but also that the benefits of exploiting social information are maximized by engaging in high levels of individual exploration. We show that selective use of social information can modulate the disadvantages of excessive social learning, especially in larger groups and when individual exploration is limited. Finally, we found that the optimal combination of individual exploration and social learning gave rise to trajectories with µ ≈ 2 and provide support for the general optimality of such patterns in search. Our work sheds light on the interplay between individual search and social learning, and has broader implications for collective search and problem-solving.

We implemented social learning as the use of cues emitted by search agents when finding resources. This form of social learning (similar to stimulus or local enhancement (35)) is widely used to increase search efficiency in various species from bees (36) to primates (37). In our model, social cues attracted other agents with some probability to collectively exploit the information provided by finding resources. In this way, foragers followed a scrounger strategy when moving toward social cues, and a producer strategy when searching for resources individually according to a Lévy walk process. In our model, the value/reliability of social information or the expected pay-off from social learning decreased as distance to the cue increased because resources were likely to decrease or disappear entirely in the time needed to travel long distances (38; 39). Therefore, we operationalized selectivity in social learning or responsiveness to social cues through a parameter α, which modulated the probability of scrounging as a function of distance to social cues (40). Selectivity in the model represented social learning in naturalistic settings where organisms conditionally use social cues based on their reliance and costs of social learning (41). The parameter α also influenced the explore/exploit trade-off between individual foraging and social learning, where increased selectivity also increased the reliance on individual search. The extent of social learning was also affected by the frequency of social cues and the number of foragers pursuing them. We tested the effects of these factors on explore/exploit trade-offs by manipulating foraging group size and resource density, where larger groups with more resources produced more social cues and increased the frequency of social learning.
We measured group performance in terms of collective foraging efficiency, defined as the average rate of resource finding per agent and per unit distance moved. We manipulated two parameters, µ and α, that affected the explore/exploit trade-offs at the individual and social level, respectively. We also tested the advantage of selective social learning for efficient foraging (which avoids costly social cues) relative to more indiscriminate use of social learning for different conditions of µ. Finally, we tested how the degree of social learning affected the distribution of movement lengths and altered the original Lévy walk exponent.

Given that Lévy walks are random whereas social cues are informative, we can anticipate that responding to social cues will improve performance when resources are sufficiently clustered, but only up to a point depending on the individual search strategy and the degree of selectivity in social learning. Excessive exploitation of social cues may cause agents to overlap with each other more often and reduce exploration for new resources. This problem may be exaggerated in larger groups and avoided when the individual search is more explorative because agents are more likely to avoid overlap by "diffusing" away from each other to find unexploited resources at a faster rate. The agent-based model allowed us to examine the interplay of these factors in producing more or less efficient collective foraging behaviors.

We designed this model to simulate coarse-grained collective foraging in order to explore the fundamental dependencies between social learning and independent, individual search, and how they influence group performance. We did not simulate a specific system or organism; instead, we provide a basic framework that resembles many natural systems and which can be built upon to model a specific system and make explicit predictions about it (42).
Figure 1: A schematic of the model. Agents (blue triangles) decide between individual exploration and using social information based on P(s) = exp(−αd) to copy a resource location (green circles) found by another agent. For α > 0, P(s) will be higher for d_2 than d_1. The level of individual exploration is dependent on µ, where µ → 1.1 results in high levels of exploration (see Fig.S2 for actual trajectories and Fig.S3 for the relationship between α and distance).

The search space was a two-dimensional L×L grid, and simulations were run with periodic boundaries and continuous space. For each simulation, the space was populated with N_R resources, where N_R was varied to manipulate resource density, and resources did not regenerate after consumption (i.e., foraging was destructive). We manipulated the initial spatial clustering of resources (Fig.S1) using a power-law distribution growth model. The space was initialized with 20 seed resources placed in random locations. Additional resources were placed such that the probability of a resource appearing a distance d_r from previously placed resources was given by

$$P(d_r) = C\, d_r^{-\beta},$$

where $d_{\min} \le d_r \le L$, $d_{\min} = 10^{-3}$ is the minimum distance that an agent could move, and L = 1 is the normalized size of the grid. C is a normalization constant required to keep the total probability equal to unity, such that

$$\int_{d_{\min}}^{L} C\, d_r^{-\beta}\, \mathrm{d}d_r = 1.$$

β determined the spatial distribution of resources other than the resource seeds, such that β → 1 resembled a uniform distribution and β → 3 generated an environment where resources were tightly clustered.

[…] given moment, agents could tell which other agents were currently on resource patches across the whole environment.

In other words, we assume that the agents had a perceptual range limited to a radius r ($r = 10^{-3}$) for resources that did not emit signals other than direct visual cues, whereas social cues are assumed to be similar to acoustic signals or chemical gradients that can be perceived at long distances. This assumption models realistic foraging scenarios where social cues can substantially increase the perceptual range of a forager and improve prey detection or patch sensing over larger spatial scales (44). For example, birds can detect the pecking behavior of a conspecific from a greater distance than they can detect an individual seed, or scavenging birds can detect a conspecific circling a carcass from many kilometers away.

An agent $A_i$ detected the closest other agent currently on a resource, $A_j$. The probability of exploiting this social information and heading toward $A_j$ was given by

$$P_S = \exp(-\alpha\, d_{ij}),$$

where $d_{ij}$ was the distance between agents $A_i$ and $A_j$, and α was the social selectivity parameter that determined how […] And α → 1 corresponded with extreme social selectivity that resulted in no social learning or social information use, i.e., pure Lévy walks. An agent could truncate its movement before reaching its destination if it encountered a resource or another social cue. If an agent detected a social cue while already heading towards a previous one, then the agent only switched towards the new signal if the distance to the previous signal was less than that to the newly detected signal. While pursuing a social cue, an agent kept its target location fixed, even if the agent that emitted the cue moved to another location.
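The distance-dependent decision described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names (`periodic_distance`, `scrounge_probability`, `follows_cue`) are hypothetical, the exponential form follows the expression P(s) = exp(−αd) given in the Figure 1 caption, and the units in which the distance is measured are not restated in this section, so the scale is left to the caller.

```python
import math
import random


def periodic_distance(p, q, size=1.0):
    """Shortest distance between two points on a square with periodic boundaries."""
    dx = abs(p[0] - q[0])
    dy = abs(p[1] - q[1])
    # On a torus, the shorter way around may cross the boundary.
    dx = min(dx, size - dx)
    dy = min(dy, size - dy)
    return math.hypot(dx, dy)


def scrounge_probability(distance, alpha):
    """P_S = exp(-alpha * d): probability of heading toward a social cue."""
    return math.exp(-alpha * distance)


def follows_cue(agent_pos, cue_pos, alpha, rng=random):
    """Bernoulli draw: True if the agent decides to pursue the cue."""
    d = periodic_distance(agent_pos, cue_pos)
    return rng.random() < scrounge_probability(d, alpha)
```

With α = 0 the probability is 1 at every distance, so the agent always follows the cue; larger α makes the agent increasingly unlikely to respond to distant cues.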

With probability $1 - P_S$, the agents followed a producer strategy and chose a target location based on their Lévy walk, with the movement distance d drawn from the power-law distribution

$$P(d) = C\, d^{-\mu},$$

where $d_{\min} \le d \le L$, $d_{\min} = 10^{-3}$ is the minimum distance that an agent could move, L = 1 is the grid size, and µ is the power-law exponent, 1 < µ ≤ 3. Similar to Eq. 5, C is a normalization constant such that

$$\int_{d_{\min}}^{L} C\, d^{-\mu}\, \mathrm{d}d = 1.$$

The Lévy exponent µ modulated the search strategy as a continuum between shorter, more exploitative movements and longer, more explorative movements. If an agent encountered resources or social cues while moving along a path given by the Lévy walk, the agent truncated its movement, and consumed the resource or followed the social cue with the probability P_S, respectively. Multiple agents could occupy a location simultaneously without any penalty. If […] are also outlined in a flowchart in the Supplementary Materials (Fig.S4).
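One standard way to draw movement lengths from a power law truncated to the interval between d_min and L is inverse-transform sampling. The sketch below assumes that approach; the source does not say how lengths were sampled, and the function names are hypothetical. For an exponent µ ≠ 1, normalizing the density gives C = (1 − µ) / (L^(1−µ) − d_min^(1−µ)), and inverting the cumulative distribution gives d = (d_min^(1−µ) + u (L^(1−µ) − d_min^(1−µ)))^(1/(1−µ)) for a uniform random number u.

```python
import random


def power_law_constant(mu, d_min=1e-3, d_max=1.0):
    """Normalization constant C of P(d) = C * d**(-mu) on [d_min, d_max], mu != 1."""
    e = 1.0 - mu
    return e / (d_max**e - d_min**e)


def sample_step_length(mu, d_min=1e-3, d_max=1.0, rng=random):
    """Draw one movement length by inverting the truncated power-law CDF."""
    e = 1.0 - mu
    u = rng.random()  # uniform in [0, 1)
    return (d_min**e + u * (d_max**e - d_min**e)) ** (1.0 / e)
```

Drawing many lengths with µ = 1.1 and with µ = 3 shows the continuum described above: the smaller exponent produces long movements far more often, while the larger one concentrates lengths near d_min.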

Our model did not have any explicit fitness costs; however, there were various costs associated with optimal searching and foraging, such as opportunity costs and competition. For instance, the resources were limited and did not regenerate, and as more agents reached a patch, the resources depleted, and the agents who followed a cue to walk to that patch faced substantial opportunity costs. Each simulation ended when 30% of the resources were consumed, which ensured that the initial degree of clustering was mostly preserved throughout each simulation. Foraging efficiency η was computed as the total number of resources found divided by the average distance moved per agent. Efficiency was further normalized by dividing η by the total number of resources available (N_R) to facilitate comparisons across conditions. We varied α to take values between 0 and 1, and µ as 1.1, 2, and 3. We further simulated different conditions for resource density (N_R = 1000, 10000), resource distribution (β = 1.1, 2, 3), and group size (N_A = 10, 20, 30, 40, 50).
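The efficiency measure defined above can be written out as a short function. This is a sketch of the stated definition only; the function and argument names are hypothetical, and details such as how distances were accumulated per agent are not given in this section.

```python
def foraging_efficiency(resources_found, distances_moved, n_resources):
    """Normalized efficiency: (resources found / mean distance per agent) / N_R."""
    if not distances_moved:
        raise ValueError("need at least one agent")
    mean_distance = sum(distances_moved) / len(distances_moved)
    if mean_distance <= 0:
        raise ValueError("agents must have moved a positive distance")
    eta = resources_found / mean_distance
    # Dividing by the number of available resources makes runs with
    # different resource densities comparable.
    return eta / n_resources
```

For example, 300 resources found by three agents that moved 10, 20, and 30 distance units, with N_R = 1000, gives 300 / 20 / 1000 = 0.015.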

Five hundred simulations were run for each parameter combination, and averaged results are reported here. Here we report parameter values that affected explore/exploit trade-offs in individual search as well as social learning.

In the supplementary materials, we report results on the effects of resource environments, individual search strategies, and group sizes for groups composed of pure producers (α → 1) and scroungers (α → 0) (Fig.S11). We also report […] showing that µ = 2 implements the best trade-off for individuals between exploitative and explorative search by generating a random walk that balances long, extensive movements with small movements resembling area-restricted search. As discussed above, when α decreased enough to drive social learning, group search efficiency for clustered resources improved substantially compared with individual Lévy walks. However, the benefits of social learning depended upon the individual search strategy, and the optimal value of the Lévy exponent shifted from µ = 2. We found that with social learning, the optimal Lévy exponent decreased and shifted to µ = 1.1. As agents responded to social information more frequently, group search became more efficient when individual search became increasingly composed of frequent exploratory, long movements with µ → 1.1 (see section 3.4 for more details). High levels of individual exploration helped groups sample the environment faster and created more opportunities for social learning.

When individual exploration was lacking (for example, µ = 3), social learning was not as efficient and led to only a small increase in group performance. Moreover, groups with exploitative search behavior and larger sizes benefited more from selective social learning relative to minimally-selective social learning (Fig.S5). We explain this result in the next section.

[…] To illustrate, imagine that an agent happens upon a cluster of resources. It sends a resource signal, and another agent heads towards the cluster. They both find more resources in the cluster, and that increases the time they spend there.

In turn, this increases the chances of other agents responding to their signal and joining in at the cluster, and so on. This

When agents' individual search strategy was closer to a Brownian walk (µ ≈ 3) with frequent turns and short movements, minimal selectivity (or excessive social learning) led to more substantial grouping between the foragers and restricted them to small areas of the environment for longer durations (Fig.S12b). Thus, a more selective social learning strategy decreased the grouping between the agents and increased group performance (Fig.S5). In contrast, when individual search strategy included fast, super-diffusive exploratory bouts (1.1 ≤ µ ≤ 2), agents could quickly disband and disperse across the environment after depleting a resource cluster, which further increased their search efficiency (see previous section). However, when social information was less prevalent in scarce clusters (N_R = 1000; β = 1.1), selective social learning was less efficient than minimally-selective social learning with high levels of exploration (µ = 1.1).

We further manipulated the amount of social information available in the environment by increasing the group size of agents (N_A), where more agents increased the number of overall social cues. We found that when the group size increased (Fig.3), the benefits of minimally-selective social learning relative to selective social learning further […] in bigger sub-groups (Fig.S12a), and for longer durations (Fig.S12b), which decreased the group performance (see Fig.S9, S10 for temporal dynamics of this pattern).

By contrast, more selective responses to social cues ($\alpha = 10^{-2}$) helped to avoid over-grouping and instead gave rise to multiple groupings around multiple clusters (see Fig.S12d (right) and S12c). Multiple, simultaneous sub-grouping of agents effectively balanced collective exploration of new clusters with the exploitation of found clusters. Moreover, the advantage of selective social learning relative to minimally-selective social learning was stronger for µ = 3 than µ = 1.1 (Fig.3). An increase in snowballing due to larger group sizes also decreased the exploration of new resources.

When agents were slow to disperse after aggregating, an increase in group size further slowed down their dispersal, and more selectivity in social learning was required to maintain exploration. This effect was further exaggerated for richer resource clusters (N_R = 10000).

Taken together, these results suggest that the individual-level explore-exploit trade-off (given by µ) affected the optimal

We found that with the use of social cues, exploratory walks (µ = 1.1) were truncated, resulting in trajectories with µ closer to the theoretical optimum of 2. Thus, it was beneficial for social learners to engage in explorative, independent search and replace exploitative movements driven by random Lévy walks with exploitative search driven by more reliable social cues. When resources were sparse (N_R = 1000), the strategies that maximized search efficiency (µ = 1.1 and $\alpha < 10^{-2}$) resulted in trajectories with µ ≈ 1.5 (Fig. 4a (top)). However, when the exploitative area-restricted search was beneficial in dense resource clusters, the efficient trajectories were composed of shorter movements (µ > 2). This result is in line with previous findings that showed the advantages of more exploitative search in dense resource environments (47; 48).
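This section does not state how the exponent of the observed trajectories was estimated. Purely as an illustration, and not as the authors' procedure, a common choice is the maximum-likelihood estimator for a continuous power law with a known lower bound, µ̂ = 1 + n / Σ ln(d_i / d_min). The sketch below assumes that estimator and ignores the upper truncation at L, which can bias the estimate when many movements are close to L; the function name is hypothetical.

```python
import math
import random


def estimate_exponent(lengths, d_min):
    """Maximum-likelihood exponent of a continuous power law with lower bound d_min.

    Assumes no upper cutoff; movement lengths below d_min are ignored.
    """
    tail = [d for d in lengths if d >= d_min]
    log_sum = sum(math.log(d / d_min) for d in tail)
    if log_sum <= 0:
        raise ValueError("need at least one length strictly greater than d_min")
    return 1.0 + len(tail) / log_sum


# Synthetic check on untruncated power-law lengths generated with exponent 2.
rng = random.Random(0)
synthetic = [1e-3 * (1.0 - rng.random()) ** (-1.0 / (2.0 - 1.0)) for _ in range(20000)]
estimated_mu = estimate_exponent(synthetic, d_min=1e-3)
```

On synthetic lengths like these, the estimate should land close to the generating exponent; applied to simulated trajectories, the same function would summarize how truncation by social cues shifts the effective exponent.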

These effects were also reflected in larger group sizes (Fig. 4a (bottom)). We found that in richer resource patches […] individuals balance the explore/exploit trade-off with the Lévy exponent of 2 (µ ≈ 2) in the absence of information about the environment. We found that when social information was available and could be effectively exploited in clustered environments (β = 3), it was optimal to replace exploitation driven by Lévy walks with exploitation driven by social cues and to balance it with high levels of random exploration. Exploratory agents diffused quickly across the environment with minimal overlap, thereby covering territory at a faster rate. Such high diffusion rates also permitted agents to disband from others after exploiting a resource cluster and search other parts of the environment, especially in larger groups. In this way, groups can balance finding new resources quickly and accurately exploiting the resources found.

Furthermore, we found that the optimal combination between independent and random exploration and collective and informed exploitation gave rise to trajectories with µ ≈ 2. Although this result adds to the vast literature on Lévy walks that shows the general optimality of search patterns resembling µ = 2, it also demonstrates that Lévy patterns from informed processes are more efficient than from random processes (55), suggests an alternative heuristic that can be used to optimize collective search, and contributes to understanding how information can guide agents to increase […]

Our model also tested the optimal degree of selectivity on the benefits of social learning under varying conditions.

The social selectivity parameter, α, simulated a minimal heuristic that modulated the use of social information based […] collective foraging strategies of Caenorhabditis elegans in patchy food distributions.