Introduction

Visual search, which involves finding an object of interest within a background of distracting visual information, is one of the most important tasks almost every visual system needs to perform1,2,3,4. From detecting food items to locating lurking predators, this ability has to be accurate and fast to ensure survival. In humans, it is commonly accepted that visual search works in one of two major modes: parallel search and serial search4. These modes differ in the dependence of reaction time on the number of distracting items. Thus, in practice, humans perform different visual search tasks with differing degrees of efficiency.

In parallel search mode, differences between the target and distractors that make them distinct can result in a very efficient search that yields detection times that are independent of the number of distracting objects4,5,6, as if the entire visual field was being processed concurrently. In these cases the target is said to “pop out”. In serial search mode, the target and the distractors share properties that make the target harder to find, and no pop-out is facilitated. In this case, reaction time depends on the number of distractors, and performance indicates a serial scanning through the visual scene until the target is detected.

A central component of the prominent theories that account for visual search performance, and pop-out in particular, is the notion of a saliency map, which refers to a transformation of image positions to a visual ‘importance’ map. Saliency maps are generated concurrently across the entire visual field and guide the allocation and shifts of attention. Thus, visual search based on saliency maps does not depend on the size of the field, the complexity of the stimuli, or the number of distracting objects present5. However, if saliency cannot be computed from feature contrast, or if the target cannot be distinguished by a single feature, then the process reverts to the serial mode by forcing attention to scan the display item-by-item.

Despite their intimate link, it is important to note the distinction between saliency and pop-out. Saliency describes how noticeable a stimulus is to an animal, as measured by the frequency with which a particular target is selected. Pop-out reflects a fast reaction time that does not increase as the number of distracting stimuli in the environment increases. It is not about how often or how quickly a target is chosen, but rather about how the reaction time is affected by distractors.

While the neural mechanisms responsible for pop-out behaviour are not fully understood, candidate neural correlates that may facilitate this perceptual phenomenon have been studied. It has been suggested that areas involved in oculomotor control, including the lateral intraparietal cortex, frontal eye fields and superior colliculus, are instrumental in distinguishing between target and distractors during visual search tasks7,8 and in the production of a saccade to the target9,10. In addition, it was found that the responses of some neurons in primary visual cortex to stimuli in their receptive field can be modulated by stimuli outside of their classical receptive field11,12,13,14,15,16,17,18,19. Importantly, these so-called contextual modulations were shown to depend on specific differences between the properties of the visual stimuli inside and outside of the receptive field, a key component in computing visual saliency. A retinotopic collection of neurons with this property could constitute a saliency map in the mammalian visual system; pop-out might then be achieved by decision-making computation, such as a winner-take-all mechanism20, that detects the most active (that is, salient) location in the map.
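The idea of a feature-contrast saliency map read out by a winner-take-all stage can be illustrated with a toy sketch. It is not a model of any neural circuit: the function names `saliency_map` and `winner_take_all`, the contrast measure (absolute difference from the mean of the other items) and the speed values are all assumptions chosen for illustration.

```python
from statistics import mean


def saliency_map(speeds):
    """Toy feature-contrast measure (an assumption, not the authors' model):
    the saliency of each item is the absolute difference between its speed
    and the mean speed of all the other items in the display."""
    saliency = []
    for i, s in enumerate(speeds):
        others = speeds[:i] + speeds[i + 1:]
        saliency.append(abs(s - mean(others)))
    return saliency


def winner_take_all(saliency):
    """Return the index of the most salient location in the map."""
    return max(range(len(saliency)), key=lambda i: saliency[i])


# Hypothetical display: six distractors at speed 1.0 and one target at 2.0.
speeds = [1.0, 1.0, 1.0, 2.0, 1.0, 1.0, 1.0]
sal = saliency_map(speeds)
winner = winner_take_all(sal)  # index of the odd-speed item
```

In this sketch every location is evaluated by the same local rule, so the readout involves no item-by-item scan. This is the intuition behind reaction times that do not grow with the number of distractors.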

While numerous studies have addressed the behavioural, neural and computational aspects of pop-out in mammals, essentially no research has been devoted to non-mammalian species. Here we were motivated to gain a deeper understanding of pop-out behaviour by asking whether other animals exhibit pop-out and by determining the neural mechanisms that enable pop-out behaviour in these animals.

To address these questions we studied the archer fish21,22,23,24,25,26, a species that lacks a cortex27 but exhibits complex visual behaviours and unique hunting practices. Owing to its remarkable ability to shoot down prey found on foliage above the water level, and its ability to learn to distinguish between artificial targets presented on a computer monitor in an experimental setting28, the archer fish exhibits overt target selection that can be leveraged for reporting psychophysical decisions in controlled lab experiments24,26,29,30. At the same time, electrophysiological recordings from the central nervous system of archer fish enable the necessary exploration of neural mechanisms underlying behavioural phenomena31. In this work, we study pop-out in the archer fish at both of these levels.

Motivated by the observation that moving insects seem to capture an archer fish’s attention more strongly than stationary insects32, we focus our study on motion pop-out. In particular, we ask whether a target that differs from the distractors in motion features such as speed or direction elicits pop-out in visual search tasks. Behaviourally, we find that moving targets indeed elicit pop-out performance, as evidenced by reaction times that are independent of the number of distractors. Extracellular recordings then reveal that a population of neurons in the archer fish optic tectum not only respond to motion cues but also exhibit contextual modulations similar to those observed in mammalian primary cortical neurons. We therefore hypothesize that these neurons may constitute the foundations on which a saliency map is represented in the archer fish brain.

Results

Archer fish exhibit pop-out search mode

We measured the behaviour of four archer fish in two sets of experiments involving visual search. In both experiments, the fish were presented with a display of moving bars (Fig. 1a). We used displays that contained 4, 6 or 8 distractor bars, and one odd moving bar that was chosen randomly. From now on we refer to the odd moving bar as the target. In the speed experiments, the target moved twice as fast as the distractors and therefore twice as far to preserve the direction of motion (Fig. 1b, see Methods). In the direction experiments the target and the distractors moved in opposite directions at the same speed (Fig. 1c). The fish were rewarded for shooting at any of the bars. Each fish performed 30 trials in each of the three conditions in both sets of experiments for a total of 180 trials (see Methods).

Figure 1: Archer fish are capable of using pop-out search of moving targets.

(a) Schematic drawing of the behavioural setup: the fish is presented with several targets in the shape of bars moving on a computer screen. We used three conditions with displays that contained 4, 6 or 8 distractors, and one odd-moving bar that was chosen randomly. (b) In the speed experiment all bars are moving in the same direction and the target (central bar in this example) is moving twice as fast. (c) In the direction experiment, all bars are moving at the same speed and the target (central bar in this example) is moving in the opposite direction. (d) The target selection rates (mean and 95% confidence interval for individual fish, mean and SE for all fish) in the direction and speed experiments were calculated to determine whether the target was salient to each individual fish and all fish together. In the speed experiment, all fish shoot at the target with significantly higher probability than predicted by chance alone (black dashed lines). In the direction experiment, the target selection rate was slightly higher than predicted by chance (n=30 for individual fish, n=120 for all-fish data, binomial test). Asterisks denote significant differences (*P<0.05, **P<0.01, ***P<0.001). (e) The reaction time of the four fish and reaction time for all fish (median and 95% confidence interval for individual fish, mean and SE for the all-fish population) were calculated for the 4, 6 and 8 distractor conditions in the speed experiment. The reaction time did not increase as a function of the number of distractors (permutation test, P>0.3), implying the existence of pop-out in visual search for moving bars with different speeds. Blue line denotes the slope of the standard linear regression.

Before exploring the possibility of pop-out in visual search of moving targets, we first had to verify that the target is salient for the fish, that is, selected often enough to facilitate the study of reaction times. To do so, we compared the target selection rate with the chance value of shooting at one of the bars. We found that, in the speed experiment, the target selection rates were significantly higher than chance values (pooled data analysis, mean rates of 68%, 57% and 59% compared with 20%, 14% and 11% chance values, for the 4, 6 and 8 distractors conditions, respectively, P<0.001, Fig. 1d). Thus, target speed is a salient feature for the archer fish.

In the direction experiment, the target selection rates of the fish were significantly lower than in the speed experiment (mean rates of 24%, 28% and 23% for the 4, 6 and 8 distractor conditions, permutation test, P<0.01 Fig. 1d), implying that the direction of target motion is not as salient a feature as target speed. However, after analysing the data from all fish we found that they shot at the target 91 times, while the chance value was only 54 shots (P<0.01, binomial test, see Methods). This indicates that overall the fish selected the target significantly more than predicted by chance. Thus we conclude that both direction and speed are salient features for the archer fish.
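The comparison of the pooled shot count with chance can be sketched as a one-sided exact binomial test. The exact procedure is described in the Methods, which are not reproduced here, so pooling the three chance levels (1/5, 1/7 and 1/9) into a single probability and using a one-sided tail are both assumptions. `binomial_upper_tail` is a hypothetical helper name.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): a one-sided exact binomial test."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Four fish, three distractor conditions, 30 trials each (from the text).
n_trials = 4 * 3 * 30

# Expected number of target shots by chance: 120 trials per condition,
# with 5, 7 or 9 bars on the screen (4, 6 or 8 distractors plus the target).
expected_by_chance = 120 * (1 / 5 + 1 / 7 + 1 / 9)

# Assumption: treat the pooled data as one binomial sample with this rate.
p_chance = expected_by_chance / n_trials

# 91 observed target shots in the direction experiment (from the text).
p_value = binomial_upper_tail(91, n_trials, p_chance)
```

Under these assumptions the expected count rounds to the 54 shots quoted in the text, and the tail probability for 91 target shots falls well below 0.01.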

Having established the saliency of the target, we then explored whether the archer fish exhibit pop-out in visual search of such moving bars. To do so, we measured the reaction time between stimulus onset and initiation of the shot in the three conditions in the speed experiment (Fig. 1e) and, in particular, for each fish we examined the median reaction time as a function of the number of distractors. To establish pop-out, the reaction times should not increase with the number of distractors4. For this purpose, we fit a line to the medians using standard linear regression. For all fish, the slope of the regression was not significantly different from zero (0.02 s per distractor, −0.02 s per distractor, −0.05 s per distractor and 0.03 s per distractor for the four fish, respectively, permutation test, all values of P>0.3, see Methods). This result indicates that reaction time does not increase when the number of distractors is increased and therefore that the target effectively pops out.
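The slope analysis can be sketched as follows: fit a least-squares line to the per-condition median reaction times, then shuffle the distractor-count labels across trials to build a null distribution for the slope. The exact permutation scheme is given in the Methods and may differ; the one-sided test, the function names and the synthetic reaction times below are assumptions for illustration only.

```python
import random
from statistics import median


def regression_slope(xs, ys):
    """Ordinary least-squares slope of ys on xs."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def median_slope(trials):
    """trials: list of (n_distractors, reaction_time) pairs.
    Returns the slope of the per-condition medians against n_distractors."""
    conditions = sorted({n for n, _ in trials})
    medians = [median(rt for n, rt in trials if n == c) for c in conditions]
    return regression_slope(conditions, medians)


def permutation_pvalue(trials, n_perm=2000, seed=0):
    """One-sided p-value: how often does a label shuffle give a slope at
    least as large as the observed one?"""
    rng = random.Random(seed)
    observed = median_slope(trials)
    labels = [n for n, _ in trials]
    rts = [rt for _, rt in trials]
    count = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if median_slope(list(zip(labels, rts))) >= observed:
            count += 1
    return observed, (count + 1) / (n_perm + 1)


# Synthetic reaction times in seconds (hypothetical, not the recorded data).
flat = [(n, 1.0 + 0.01 * (i % 10)) for n in (4, 6, 8) for i in range(30)]
rising = [(n, 1.0 + 0.1 * n + 0.01 * (i % 10)) for n in (4, 6, 8) for i in range(30)]

flat_slope, flat_p = permutation_pvalue(flat)
rising_slope, rising_p = permutation_pvalue(rising)
```

With the flat data the slope is zero and the permutation p-value is large, which is the pop-out signature. With the rising data the slope is about 0.1 s per distractor and the p-value is small, which is the signature of serial search.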

To control for possible confounding effects due to change of target direction during the motion cycle and/or grouping of similarly moving distractors, we also tested the fish with moving Gabor patches (see Methods). Two fish participated in these experiments, and here too, a robust pop-out effect was found (Supplementary Fig. 1).

In addition, we conducted two more control experiments to verify that the fish do not target only a bar with a certain speed. In the first control experiment we doubled the speeds of the bars such that the distractors now moved at a speed previously used for the target bar, while the new target moved twice as fast. We found that the fish shot at the target with rates that did not change significantly from the original speed experiment (Supplementary Fig. 2a) and that this task elicited a pop-out response by the fish (Supplementary Fig. 2b). In the second control experiment we first trained the fish to detect the slow moving bar among fast distractors. We found that the target selection rates were significantly higher than predicted by chance (Supplementary Fig. 2c, right panel). In addition, we found that the reaction times did not depend on the number of distracting objects, which again indicates that the target popped out during this visual search (Supplementary Fig. 2d). Then we swapped the speeds of the target and distractors and trained the fish to shoot at the fast moving bar among slow moving distractors. As expected, we again found that the target selection rates were higher than predicted by chance (Supplementary Fig. 2c, left panel) and that this task elicited pop-out (Supplementary Fig. 2d). In addition, we found that the target selection rates in both tasks were comparable. To conclude, these findings suggest that pop-out in visual search does not depend explicitly on target speed.

Archer fish exhibit serial search mode

We next explored whether archer fish also exhibit serial search, where reaction time depends linearly on the number of distractors. In the first of two experiments we set the motion of all the bars to have the same speed and direction, and the discriminatory feature to be the width of the target. First, we used just two targets, one twice as wide as the other. We found the preferred bar width of each fish, and used that bar as the target and the other as the distractors. For two fish we used the thin bar as the target (Fig. 2a) and for the third fish we used the thick bar.

Figure 2: Archer fish can perform serial search.

(a) Schematic drawing of the behavioural stimulus for the size experiment: all bars are moving with the same direction and speed. For fish 1 and fish 2, the target is thinner than the other bars (central bar in this example). For fish 5, the target is thicker than the other bars. (b) All fish chose to shoot at the target with significantly higher rate than predicted by chance (mean and 95% confidence interval for individual fish, mean and SE for all fish, chance level is marked by black dashed line, n=30 for individual fish, n=90, for all-fish data, binomial test). Asterisks denote significant differences (***P<0.001). (c) Reaction time in the size experiment increases with the number of distractors (permutation test, P<0.05). This is an indication of serial processing of the visual scene. Blue line denotes the slope of the standard linear regression. (d) Schematic drawing of the behavioural stimulus for the conjunction search experiment: several targets shaped as moving bars are presented to the fish and it needs to select the target that is both thinner and moving faster than the other targets (central bar in this example). (e) The target selection rate is significantly higher than predicted by chance (mean and 95% confidence interval for individual fish, mean and SE for all fish, chance level is marked by black dashed line, n=40 for individual fish, n=120, for all-fish data, binomial test). Asterisks denote significant differences (***P<0.001). (f) Reaction time increases as a function of the number of distractors (permutation test, P<0.05). Blue line denotes the slope of the standard linear regression.

As before, we found that the width of the bar is a salient feature for the archer fish, as the target selection rates in the different distractor conditions were significantly higher than chance (pooled data analysis, mean rates of 82%, 79% and 78% compared with 20%, 14% and 11%, for the 4, 6 and 8 distractors conditions, respectively, P<0.001, Fig. 2b). However, unlike in the speed experiment, this time the median reaction time increased with the number of distractors (Fig. 2c). Fitting a line to these medians by linear regression yielded positive slopes, significantly different from zero (0.31, 0.09 and 0.1 s per distractor for the three fish, respectively, permutation test, P<0.05). This implies that the fish indeed exhibit serial search to detect the target in this case (for additional control, see Methods and Supplementary Fig. 2e,f).

In the second experiment we tested whether the archer fish exhibits serial search in conjunction search scenarios—stimuli in which the target is defined by a unique combination of two visual features. In our case the target was defined by a unique combination of speed and width that distinguished it from the other bars. In each condition half of the distracting bars had the same speed as the target bar but were twice as wide. The other half of the distracting bars had the same width as the target but moved half as fast (Fig. 2d).

Analysing the results of this experiment, we found that conjunctive targets like those used were not salient, in the sense that the fish did not spontaneously shoot more often at the target than the distractors, thus precluding analysis of reaction time. As a result, we conditioned the fish to prefer this target by restricting the reward during training (that is, during training the fish was rewarded only if it shot the designated target). When the learning curve (that is, target selection rate) reached a plateau, the selection rates were indeed significantly higher than chance (P<0.001, Fig. 2e) suggesting that the fish indeed understood the task. Then, after running the experiment again, we fitted the medians of reaction times with a linear regression and found that the slope of the regression was significantly positive (0.69, 0.06 and 0.05 s per distractor, for the three fish, permutation test, P<0.05, Fig. 2f). We conclude that conjunction search elicits serial operation, quite similar to behaviour in humans and other mammals.

Contextually modulated neurons in archer fish optic tectum

Having revealed both serial and parallel search modes in the archer fish at the behavioural level, we turned our exploration to neural mechanisms of saliency in the archer fish. In particular, we tested whether neurons in the fish optic tectum possess non-classical receptive field properties, that is, whether or not they modulate their response to stimuli inside of the receptive field as a result of stimulation outside of their receptive field. To search for such contextually modulated neurons we recorded neural activity extracellularly from 65 neurons in the superficial layers of the optic tectum from 9 awake and restrained archer fish (Fig. 3a, see Methods).

Figure 3: Contextually modulated neurons in the archer fish optic tectum.

(a) Schematic drawing of the experimental setup: the fish is restrained in a small water tank while looking at a computer monitor. An electrode is inserted in the optic tectum and single neurons are recorded. (b) Schematic view of the stimulus: the receptive field is located on the screen and the preferred orientation of the neuron is determined. For illustration purposes the receptive field boundaries are marked with a red dashed line. Moving bars with speed, direction or no contrast were presented inside and outside of the receptive field. (c) Firing rates as a function of time as recorded in a speed-contrast neuron. The stimulus is presented for three periods for 25 repeats. The instantaneous firing rate was higher when the bar inside of the receptive field moved faster than the peripheral bars. The bar position within the receptive field is depicted in the top panel. (d) Mean firing rates of a speed-contrast neuron show a stronger response for the speed contrast condition compared with the other two contrast conditions. (e–j) The same as in c and d for a direction-contrast neuron (e and f), both-contrast neuron (g and h) and no-contrast neuron (i and j). Differences were considered significant at P<0.05 using t-test (*P<0.05).

Neurons in the optic tectum of the archer fish can be characterized by orientation tuning to bars moving across their receptive field31. Using such moving bars we first determined the preferred orientation and boundaries of the receptive field of each neuron. The experimental stimuli then consisted of a moving bar in the preferred orientation inside of the receptive field and eight moving bars outside of the receptive field (Fig. 3b, see Methods). On the basis of this general structure we tested four different conditions: (1) speed contrast, in which the inner bar moved twice as fast as the peripheral bars; (2) direction contrast, in which the inner bar and the peripheral bars moved in opposite directions; (3) no contrast, in which the inner bar moved in accordance with the outer bars without any speed or direction contrast; and (4) single bar, in which only the centre bar was displayed. Each condition consisted of 15–25 repetitions of three cycles of movement, 2 s each, for a total of 6 s per repetition (see Methods). The responses of four selected neurons are presented in Fig. 3c–j.

We found four major classes of neurons. ‘Speed-contrast’ neurons were characterized by higher firing rates in the speed contrast condition compared with the direction contrast and the no contrast conditions. This can be seen in both the firing rate during bar movement (Fig. 3c) and in the mean spike count (Fig. 3d). ‘Direction-contrast’ neurons were characterized by a higher firing rate in the direction contrast condition compared with the speed contrast and no contrast conditions (Fig. 3e,f). ‘Both-contrast’ neurons were characterized by a high firing rate in both the direction and the speed contrast conditions compared with the no contrast condition (Fig. 3g,h). Finally, ‘no-contrast’ neurons are those which had roughly the same firing rate in all three conditions (Fig. 3i,j).

We found that 63% of the neurons (41/65) were contextually modulated neurons, while 17% of the neurons (11/65) were classified as no-contrast neurons and 20% of the neurons (13/65) as context inhibited neurons (see Methods). To further assign each of the contextually modulated neurons to a specific class we used the mean firing rates in each condition as a classification measure (see Methods). We classified 17% of the neurons (7/41) as speed-contrast, 41.5% of the neurons as direction-contrast neurons (17/41) and 41.5% of the neurons (17/41) as both-contrast neurons.

Combining two features induces stronger behavioural response

Having established saliency and pop-out at the behavioural level, as well as the existence of contextually modulated neurons at the physiological level, we then moved to further explore their possible correspondence by manipulating the strength of behavioural saliency and exploring the consequences at the neural level. More specifically, we conducted another set of experiments where target bars were defined by two motion features, both speed and direction, in comparison with all the distractor bars (see Fig. 4a). If archer fish are similar to mammals, one would expect such stimuli to induce additivity33 at both the behavioural and neural levels.

Figure 4: Additive effect of the speed and direction visual dimensions in archer fish pop-out behaviour.

(a) Schematic drawing of the behavioural stimulus for the additive experiment: The target (central bar in this example) is moving faster and in the opposite direction with respect to the distractors. The fish selects one of the bars and shoots at it. We used displays containing 4, 6 or 8 distractors, and one target that was chosen randomly. (b) The target selection rate averaged across all distractor conditions shows that the combined effect of direction and speed was significantly larger than the effect of each single feature alone (n=90 for individual fish, binomial test). Asterisks denote significant differences (***P<0.001). (c) Reaction time as a function of the number of distractors in the additive experiment indicates that there was no significant increase in reaction time when the number of distractors was increased (permutation test, P>0.15). This is an indication that the archer fish used a pop-out search to detect the odd-moving bar. Blue dashed line denotes the slope of the standard linear regression.

As before, four archer fish were presented with a display of one oddly moving target bar and 4, 6 or 8 distractors and were rewarded for shooting at any of the bars. We measured the target selection rate to explore its saliency and tested for pop-out search mode by calculating the reaction time as a function of the number of distractors. Fig. 4b shows the target selection rates for each of the fish. We judged by the high selection rate of the target, in three of the four fish, that the additive condition was more salient than targets characterized by one feature only (Fig. 4b). The other fish exhibited equally strong responses for the speed experiment, and the additive experiment, both being significantly stronger than the direction experiment. Overall, the additive condition was more salient than targets characterized by one feature only.

Finally, we tested whether ‘additive targets’ elicit pop-out visual search. As before, we measured the reaction times of each fish as the number of distractors increased, and fitted the median reaction times with a linear regression to determine whether they depend on the number of distracting objects. We found that the slope of the regression was not significantly different from zero (0.01, 0.05, 0.08 and 0 s per distractor, for the four fish, permutation test, P>0.15, Fig. 4c), which again indicates pop-out behaviour for this stimulus, both for individual fish and for all fish combined.

We also compared the slopes of the parallel search tasks (speed experiment and the additive experiment) and the serial search tasks (size experiment and conjunction search experiment) and found that the latter were significantly steeper than the former (two-sample Kolmogorov–Smirnov test, P<0.02), indicating a qualitative regime shift between these two general conditions.
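The comparison of slopes can be sketched with a two-sample Kolmogorov–Smirnov statistic. The sketch assumes that the samples being compared are the per-fish slopes quoted in the text (eight for the parallel tasks, six for the serial tasks), which the text does not state explicitly. In place of the usual asymptotic p-value it enumerates every possible relabelling of the 14 pooled slopes, which is a substitution made here for self-containment and may not match the authors' computation.

```python
from itertools import combinations


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap between
    the two empirical cumulative distribution functions."""
    d = 0.0
    for x in set(a) | set(b):
        fa = sum(v <= x for v in a) / len(a)
        fb = sum(v <= x for v in b) / len(b)
        d = max(d, abs(fa - fb))
    return d


# Slopes in s per distractor, as quoted in the text.
parallel = [0.02, -0.02, -0.05, 0.03,   # speed experiment
            0.01, 0.05, 0.08, 0.0]      # additive experiment
serial = [0.31, 0.09, 0.1,              # size experiment
          0.69, 0.06, 0.05]             # conjunction experiment

d_obs = ks_statistic(parallel, serial)

# Exact permutation p-value: try every way of assigning 8 of the 14
# pooled slopes to the "parallel" group (an assumption, see lead-in).
pooled = parallel + serial
indices = range(len(pooled))
hits = 0
total = 0
for chosen in combinations(indices, len(parallel)):
    chosen_set = set(chosen)
    a = [pooled[i] for i in chosen]
    b = [pooled[i] for i in indices if i not in chosen_set]
    total += 1
    if ks_statistic(a, b) >= d_obs - 1e-12:  # tolerance for float rounding
        hits += 1
p_value = hits / total
```

Enumerating all splits is feasible only because the samples are tiny; for larger samples a library routine with an asymptotic or exact p-value would be the practical choice.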

Combining two features induces stronger neuronal response

To examine whether the additive behaviour carries over to the neural response of the contextually modulated neurons, we measured their electrophysiological response to an ‘Additive condition’. As in the behaviour experiment we used a stimulus where the bar inside of the receptive field of the neuron moved twice as fast and in the opposite direction to the peripheral bars. We found four populations of neurons, samples of which are presented in Fig. 5a–h. As can be seen, some neurons that were previously classified as contextually modulated neurons exhibit a much higher firing rate in the additive condition than in the speed contrast, direction contrast and no contrast conditions. This was revealed by the examination of the firing rate during the bar movement (Fig. 5a) and also by the mean spike count (Fig. 5b). In addition, some of the contextually modulated neurons do not exhibit this increased firing rate in the additive contrast condition (Fig. 5c,d), and can therefore be considered non-additive response neurons. The same categorization into additive response neurons and non-additive response neurons can be applied to neurons that did not show contextual modulation in the single-feature contrast condition (Fig. 5e–h).

Figure 5: Contextually modulated neurons in the archer fish optic tectum show additive responses to two visual features.

We had an additional condition where the bar within the receptive field was different from the bars outside of the receptive field both in direction and speed while also testing the no contrast, direction contrast and speed-contrast conditions as presented in Figure 3. (a) Firing rate as function of time for a contextually modulated additive response neuron showed elevated response when the stimulus was different in both direction and speed. Again the stimulus is presented for three periods for 25 repeats. The bar position within the receptive field is depicted in the top panel. (b) Mean firing rate in the additive condition was significantly elevated with respect to the other conditions. This is an indication that the two visual dimensions of direction and speed have an additive effect in driving the firing of this particular neuron. (c and d) Example of a contextually modulated non-additive response neuron where the additive condition does not produce higher firing rates than the speed contrast and the direction contrast, although the speed contrast response is higher than the no contrast response. (e and f) Example of a contextually unmodulated additive response neuron where the additive condition produces a higher firing rate with respect to the other conditions, but neither the speed contrast nor the direction contrast individually result in higher firing rates than the no contrast response. (g and h) Example of a contextually unmodulated non-additive response neuron: the additive condition does not exhibit higher firing rate than the speed contrast or the direction contrast. Furthermore, the speed contrast response and the direction contrast response are not significantly different from the no-contrast response. Differences were considered significant at P<0.05 using t-test (*P<0.05).

On the basis of these response patterns we classified each neuron as contextually modulated additive or non-additive and contextually unmodulated additive or non-additive. A neuron was classified as having an additive response only if the mean firing rate in the additive condition was significantly higher than the firing rates in the other three conditions. In this way, the frequency of additive response neurons was found to be 25% (16/65), and more specifically, 18% (12/65) of the neurons were classified as contextually modulated additive response neurons.
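The additive criterion stated above can be sketched as a small decision rule: a neuron counts as additive only if its response in the additive condition is significantly higher than in each of the speed contrast, direction contrast and no contrast conditions. The figure captions report a t-test; to stay within the standard library this sketch substitutes a one-sided permutation test on the difference of means, so it is not the authors' procedure. The function names, the condition keys and the firing rates are hypothetical.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def perm_test_greater(a, b, n_perm=2000, seed=0):
    """One-sided permutation p-value for mean(a) > mean(b).
    Stand-in for the t-test reported in the figure captions."""
    rng = random.Random(seed)
    observed = mean(a) - mean(b)
    pooled = list(a) + list(b)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if mean(pooled[:len(a)]) - mean(pooled[len(a):]) >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)


def is_additive(rates, alpha=0.05):
    """rates maps condition name -> list of per-repetition firing rates.
    Additive only if the additive condition beats all three others."""
    others = ("speed", "direction", "none")
    return all(perm_test_greater(rates["additive"], rates[c]) < alpha
               for c in others)


# Hypothetical firing rates (spikes per second), 25 repetitions each.
additive_neuron = {
    "none": [10 + (i % 5) for i in range(25)],
    "speed": [14 + (i % 5) for i in range(25)],
    "direction": [13 + (i % 5) for i in range(25)],
    "additive": [22 + (i % 5) for i in range(25)],
}
# Same neuron, but the additive response is no higher than the speed response.
non_additive_neuron = dict(additive_neuron, additive=additive_neuron["speed"])

additive_flag = is_additive(additive_neuron)
non_additive_flag = is_additive(non_additive_neuron)
```

Requiring all three comparisons to pass makes the rule conservative: a neuron whose additive response only matches its best single-feature response is labelled non-additive.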

Discussion

In this study, we examined whether archer fish exhibit pop-out in visual search tasks and explored the possible neural mechanisms underlying this behaviour in the optic tectum, the primary visual area of the archer fish brain. We found that a target differing in speed from its surroundings was salient to the archer fish and elicited reaction times that did not increase as a function of the number of distractors (Fig. 1e). At the same time we found that, under other conditions, the archer fish performs serial search with reaction times that increase with the number of distractors (Fig. 2c,f). Thus archer fish exhibit the two major modes of visual search found in mammals in general, and humans in particular.

We showed that speed of motion allows the target to be found efficiently and in a parallel manner while the size of the target does not. In humans, it was found that colour and motion enable efficient search, but for other features the evidence is less convincing. For example, shape, which was initially considered a basic feature, is now considered to be an ill-defined feature since other visual features can be defined as aspects of shapes34. It would be interesting to see whether there is a correspondence between the guiding features in humans and those in the archer fish.

In addition, we recorded the activity of single neurons in the optic tectum of the archer fish and found that the majority of neurons possess contextual modulation properties. These neurons increase their firing rate when the stimulus inside their receptive field has different motion properties than the stimuli outside of their receptive field. In line with the findings in mammals, we hypothesize that these neurons provide an important building block in the neural computation underlying saliency, and therefore, pop-out behaviour.

To further connect the behaviour and neural mechanisms, we manipulated the visual stimulus in both the behavioural and electrophysiological experiments and tested for additivity in the response at both levels. We found that when the target was defined by two motion features, its selection rate by the fish was higher than for single-feature targets (Fig. 4b) while a correspondingly stronger response was observed at the neural level (Fig. 5a,b). Taken together, these results strongly support the possibility that the contextually modulated neurons we found provide the substrate on which a saliency map is represented in the optic tectum of the archer fish. Given similar findings in other species, our results suggest a degree of universality in the computational principles used by organisms to address issues of visual saliency and target selection.

To the best of our knowledge, our study is the first to show that an animal outside of the mammalian class performs visual search in both the serial and the parallel mode. Across the animal kingdom, it was found that honeybees detect a colour target serially35, while humans4, monkeys1 and cats36 can use parallel search mode to detect targets with various visual features. Our work shows that a pop-out search mode exists in fish, and it may indicate that preattentive parallel analysis of the visual field is a common, and perhaps even universal, mechanism across vertebrate visual systems. At the same time, our study is also the first to demonstrate the existence of contextually modulated neurons in fish for two visual motion features, speed and direction of motion. Such neurons were originally found in the mammalian cortex for visual features such as orientation, colour, shape and motion11,12,14,16,37, and it is commonly accepted that they can function as the basis of saliency maps. That is, they can serve as ‘detectors’ that single out locations in the visual field that differ from their surroundings, for the benefit of visual segregation and saliency.

Finally, we note that the results presented in our work may have implications for the understanding of visual search from evolutionary and developmental perspectives. The optic tectum is analogous to the mammalian superior colliculus27. Our findings of motion-based contextual modulations in the optic tectum agree with previous arguments that saliency maps may be located in attentional control regions outside the mammalian visual cortex, and in particular in the mammalian superior colliculus9,10,38. With the two lines of evidence converging, this hypothesis now enjoys additional support and invites further exploration and verification.

In another study, Rischawy and Schuster30 showed that the archer fish can perform a visual search task for stationary objects. In their study, reaction times depended linearly on the number of distracting objects in the background. The difference is that for motion-induced search the archer fish performs the search in a parallel manner, while for stationary objects the search is performed serially. The archer fish will not ignore a static insect; however, it might be that they employ different strategies for hunting moving and stationary insects.

In their natural habitat28,39, archer fish cruise just below the water's surface looking for insect prey on leaves, branches and roots above. While they move, the visual scene will be moving across their visual field in one direction and with constant motion. However, moving insects will appear to move at a different speed and/or direction. Therefore, a motion pop-out mechanism can help archer fish detect small, often well camouflaged, insects against a highly complex and highly contrasting background. The challenge of detecting, identifying and accurately spitting at an insect that is only a couple of millimetres long at distances of over a metre above the water is thus greatly assisted by visual pop-out.

Methods

Animals

All experiments with fish were approved by the Ben-Gurion University of the Negev Institutional Animal Care and Use Committee and were in accordance with government regulations of the State of Israel. Archer fish (Toxotes chatareus), 6–13 cm in length, 10–15 g body weight were used in this study. The fish were kept in a water tank filled with brackish water (2–2.5 g of red sea salts mix for 1 liter of water) at 26–28 °C. The room was illuminated with artificial light with 16/8 h day–night cycle.

Fish training

A total of eight fish were gradually trained, in different experiments, to respond to a moving black bar on a white background presented on an LCD screen (E177FP, 17′′, Dell, USA) placed on top of a transparent glass plate 40 cm above water level. In this setting, 1 cm on the screen corresponds to 1.43° in the zenith of each water surface point. Each of the fish was housed in a separate water tank 30 × 50 × 40 cm in size. Fish were first trained to shoot at insect images and were rewarded with a food pellet for each successful shot. This part of the training took about 1–3 training sessions of 10–20 trials each. After this training, the image was replaced by a black static bar (0.25 cm × 1 cm) presented at arbitrary locations. Later, the static bar was replaced with a moving bar (0.25 cm × 1 cm). The speed of the moving bar was 4 cm s−1. This part of the training took about 1–2 training sessions of 10–20 trials each. Some of the fish died during the work; therefore they do not appear in all of the figures. The fish are listed according to the order of figures.
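The conversion between screen distance and visual angle quoted above can be checked with a short calculation. The following is a minimal sketch, assuming simple flat-screen geometry viewed from a point on the water surface directly below the stimulus, with the 40 cm screen height taken from the text; it ignores refraction and the depth of the fish, and the function name is hypothetical.

```python
import math

def screen_cm_to_degrees(length_cm: float, screen_height_cm: float = 40.0) -> float:
    """Visual angle (in degrees) subtended by a length on the screen,
    seen from the water surface point directly below it.

    Assumes plain trigonometry: angle = atan(length / height)."""
    return math.degrees(math.atan(length_cm / screen_height_cm))

# One centimetre on the screen, viewed from 40 cm below.
deg_per_cm = screen_cm_to_degrees(1.0)

# Angular size of the 0.25 cm x 1 cm bar used in training.
bar_width_deg = screen_cm_to_degrees(0.25)
bar_length_deg = screen_cm_to_degrees(1.0)

print(f"1 cm on screen ~ {deg_per_cm:.2f} deg")
print(f"bar ~ {bar_width_deg:.2f} deg x {bar_length_deg:.2f} deg")
```

Under these assumptions the result for 1 cm rounds to the 1.43° stated in the text; the angle seen by a submerged fish will differ with its depth and position.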

Behavioural experiments

Stimuli were presented using PowerPoint presentations (Microsoft, Seattle, WA, USA). All experiments were recorded using an HD camera (Handycam HDR-SR11E, Sony, Tokyo, Japan) at 25 frames per second and stored offline for further analysis.

There were six types of behavioural experiments; the first five experiments used displays with 4, 6 or 8 distractor bars and one odd moving bar that was chosen randomly. In all experiments the bars moved in the direction orthogonal to their orientation. In experiments 1, 2, 3, 5 and 6 the fish were rewarded for shooting at any of the targets. In experiment 4, the fish were rewarded only for shooting at the odd moving bar. In these experiments, black bars sized 0.25 cm × 1 cm were displayed on a white background. The distance between bar centres ranged between 5 and 7 cm. The sixth experiment consisted of three conditions with displays that contained 7, 10 or 14 Gabor patches and one odd moving Gabor patch. In this experiment, Gabor patches sized 2 cm × 2 cm were displayed on a white background. The distance between Gabor centres ranged between 5 and 7 cm. The different experiments were:

(1) Speed experiment: the target moved twice as fast as the distractors (Fig. 1b). The speeds of the target and the distractors were 4 and 2 cm s−1, respectively. Each trial consisted of four cycles of back-and-forth motion, for a total of 4 s. To control for speed preference, we conducted two additional experiments. In the first control experiment, the speeds of the target and the distractors were 8 and 4 cm s−1, respectively. In the second control experiment, the speeds of the target and the distractors were 2 and 4 cm s−1, respectively.

(2) Direction experiment: the target and the distractors moved in opposite directions (Fig. 1c). The speed of both the target and the distractors was 4 cm s−1. Each trial consisted of four cycles of back-and-forth motion, for a total of 4 s.

(3) Size experiment: the odd moving bar was half as wide as the distractor bars (Fig. 2a). The speed of both the target and the distractors was 4 cm s−1. Each trial consisted of 10 cycles of back-and-forth motion, for a total of 10 s. To control for size preference, we ran an experiment in which the target was twice as wide as the distractor bars.

(4) Conjunction search experiment: half of the distracting bars had the same speed as the odd moving bar but were twice as wide. The other half of the distracting bars had the same width as the odd moving bar but half its speed (Fig. 2d). The speed of the target was 4 cm s−1 and its width was 0.25 cm. Each trial consisted of 10 cycles of back-and-forth motion, for a total of 10 s.

(5) Additive experiment: the target moved both twice as fast as the distractors and in the opposite direction (Fig. 4a). The speeds of the target and the distractors were 4 and 2 cm s−1, respectively. Each trial consisted of four cycles of back-and-forth motion, for a total of 4 s.

(6) Gabor experiment: the speed of all the patches was 1.5 cm s−1, with a spatial period of 0.6 cm. The odd patch and the distractors moved in opposite directions (Supplementary Fig. 1).

It is important to note that the motion parameters phase, speed and amplitude are mutually constrained. That is, if one of them is fixed and another is changed, the third is determined. In the speed experiment we fixed the relative phase between the target and distractors and chose to change the relative speed between them. As a consequence, the distance travelled by the bars also differed. For simplicity, we refer to this stimulus as a speed stimulus, keeping in mind that any effect we see could be related to the amplitude of motion as well.
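The coupling between speed and amplitude can be made concrete with the numbers of the speed experiment. The sketch below assumes constant-speed back-and-forth (triangle-wave) motion and a common cycle duration of 1 s for target and distractors, which follows from four cycles in 4 s; the function name is hypothetical.

```python
def sweep_amplitude_cm(speed_cm_per_s: float, cycle_duration_s: float) -> float:
    """Distance covered in one direction during half a back-and-forth cycle.

    Assumes constant speed, so the bar travels out for half of the cycle
    and back for the other half."""
    return speed_cm_per_s * cycle_duration_s / 2.0

# Four cycles in 4 s gives a 1 s cycle for every bar (fixed relative phase).
cycle_s = 4.0 / 4

target_amplitude = sweep_amplitude_cm(4.0, cycle_s)      # target at 4 cm/s
distractor_amplitude = sweep_amplitude_cm(2.0, cycle_s)  # distractors at 2 cm/s

print(target_amplitude, distractor_amplitude)
```

With the cycle duration held fixed, doubling the speed necessarily doubles the distance travelled, which is why a speed difference and an amplitude difference cannot be separated in this stimulus.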

Behavioural data analysis

Target selection rates and reaction times were retrieved from the movies by the experimenter and were saved as Matlab files for further analyses. We computed the binomial cumulative distribution function for the target selection rates and compared them with chance values, using the binomial test, to determine whether the true probability of choosing the target is higher than chance value.
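As an illustration of this comparison against chance, the following sketch computes a one-sided binomial P value from the upper tail of the binomial distribution. The counts are hypothetical and are not data from this study; the chance level of 1/5 assumes a display with one target among four distractors and equal probability of shooting at each bar.

```python
from math import comb

def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p): the one-sided P value for observing
    at least k target selections in n shots if the true rate were p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# Hypothetical example: 12 of 20 shots hit the target.
n_shots = 20
n_target_hits = 12
chance = 1 / 5  # one target among five bars (assumed equal preference)

p_value = binomial_upper_tail(n_target_hits, n_shots, chance)
print(f"P(X >= {n_target_hits}) = {p_value:.2e}")
```

A small P value indicates that the selection rate is unlikely to arise from choosing bars at random.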

To calculate the 95% confidence intervals of the individual fish reaction time, we used the standard formulae for confidence intervals for the median40. The lower and upper 95% confidence limits are given by ranked values, where n is the number of shots to the target.
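For readers who wish to reproduce this type of interval, the sketch below uses a widely used rank-based approximation for the confidence interval of a median, in which the lower and upper limits are the values at ranks n/2 − 1.96√n/2 and 1 + n/2 + 1.96√n/2, rounded to the nearest integer. That this is the exact formula used here is an assumption, and the function names are hypothetical.

```python
import math

def median_ci_ranks(n: int, z: float = 1.96) -> tuple[int, int]:
    """1-based ranks of the lower and upper confidence limits for the median.

    Assumed normal-approximation formula:
    lower = n/2 - z*sqrt(n)/2, upper = 1 + n/2 + z*sqrt(n)/2,
    each rounded to the nearest integer and clipped to [1, n]."""
    half_width = z * math.sqrt(n) / 2.0
    lower = max(1, round(n / 2.0 - half_width))
    upper = min(n, round(1 + n / 2.0 + half_width))
    return lower, upper

def median_ci(values: list[float], z: float = 1.96) -> tuple[float, float]:
    """Confidence limits for the median, read off the sorted sample."""
    ordered = sorted(values)
    lower, upper = median_ci_ranks(len(ordered), z)
    return ordered[lower - 1], ordered[upper - 1]

# Hypothetical sample of 100 reaction times (arbitrary units 1..100).
sample = list(range(1, 101))
print(median_ci_ranks(len(sample)))
print(median_ci(sample))
```

For small n the approximation is coarse, and exact binomial ranks may be preferable.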

To determine whether the reaction time increases linearly as a function of the number of distracting bars, we first calculated the median reaction time for each condition. We then fitted a line to these medians using standard linear regression and found the slope of the regression. Next, we used a permutation test with 1,000 repetitions to assess the probability of finding a slope equal to or greater than the original slope. A probability <0.05 was considered significant and therefore implied that the reaction time increases as a function of the number of distracting bars.
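The procedure can be sketched as follows. The permutation scheme is an assumption: here the condition labels (number of distractors) are shuffled across individual trials, and the medians and slope are recomputed for each shuffle. The data are made up for illustration, and all names are hypothetical.

```python
import random
from statistics import median

def slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx

def median_slope(labels, rts):
    """Slope of the line fitted to the per-condition median reaction times."""
    conditions = sorted(set(labels))
    medians = [median(rt for lab, rt in zip(labels, rts) if lab == c)
               for c in conditions]
    return slope(conditions, medians)

def permutation_p_value(labels, rts, n_perm=1000, seed=0):
    """Fraction of label shuffles whose slope is >= the observed slope."""
    rng = random.Random(seed)
    observed = median_slope(labels, rts)
    shuffled = list(labels)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the link between condition and RT
        if median_slope(shuffled, rts) >= observed:
            count += 1
    return observed, count / n_perm

# Made-up reaction times (s) that grow with the number of distractors.
labels = [n for n in (4, 6, 8) for _ in range(10)]
rts = [0.5 + 0.25 * n + 0.05 * (i % 5) for n in (4, 6, 8) for i in range(10)]

observed_slope, p_value = permutation_p_value(labels, rts)
print(observed_slope, p_value)
```

A flat relationship, as expected for pop-out, would give an observed slope near zero and a large P value under this test.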

Surgery

Fish were anaesthetised with MS-222 (A-5040, Sigma-Aldrich, St Louis, MO, USA) at 100 mg l−1 of tank water and restrained in a special device, and their gills were watered continuously with tank water containing MS-222 (50 mg per liter). The watering of the gills was essential because exposure to MS-222 can cause respiratory failure. An incision was made over the optic tectum, the skin and fatty tissue were removed, and Lidocaine (L-7757, Sigma-Aldrich, St Louis, MO, USA) was applied at the boundaries of the incision. At this point we injected the fish with 5–15 μl of the non-depolarizing muscle relaxant gallamine triethiodide (17 g per liter, G 8134, Sigma-Aldrich, St Louis, MO, USA) into the spine, towards the tail, to prevent muscle movement during the experiment. Only after we confirmed that eye movements were eliminated did we continue with the rest of the procedure. A dental drill (Micro drill #097883, with a 2.7 mm tip diameter, stainless steel trephines, #18004-27, Fine Science Tools, Foster City, CA, USA) was then used to open the skull and meninges over the optic tectum. A silver wire (76.2 μm in diameter, tip coated with silver chloride) was placed in the cerebrospinal fluid near the optic tectum and used as a reference electrode.

In vivo electrophysiology

The fish and the restraining device were placed together in a smaller water tank (length 25 cm, width 6 cm, height 6 cm) filled with brackish water (2–2.5 g per liter of red sea salt) up to 0.5 cm above eye level (no MS-222 at this stage, see Fig. 3a). The fish’s gills were continuously watered through a tube inserted into its mouth to compensate for possible respiratory degradation. The fish was placed so that its right eye was 0.3 cm from the glass wall (parallel to the sagittal plane of the fish) in the centre of the tank. This glass wall was higher than the other walls (12 cm height) and thus allowed the fish a wide visual field (about 110° in both the vertical and horizontal axes). The fish was kept in the device for 10–20 min to make sure that the effect of the anaesthetic had worn off and that it was breathing on its own. Using a single electrode (tungsten, glass coated, 250 μm diameter, 2 MΩ impedance, 60 mm long, cat # 366-060620-11, Alpha-Omega, Israel) mounted on a calibrated manipulator (Narishige, Japan), recordings were made from the superficial layers (up to 500 μm deep) of the optic tectum. The signal obtained was amplified (× 10⁴) and filtered (band-pass box filter, 300 Hz–10 kHz range) by an amplifier (DAM 50, WPI, USA) and then transmitted through two parallel channels: (1) the signal was sampled and recorded with a computer at 20 kHz and (2) the signal went through an analogue notch filter removing 50 Hz, and then to an audio monitor and an oscilloscope. In this way the neural response could be both heard and seen in real time during the experiment. The first recording in a session occurred about 30 min after the surgery ended. The average duration of a typical recording session was 5 h, and we were able to hold single units for up to 30–40 min, obtaining about seven neurons per fish. Spike sorting was done offline using custom-written Matlab routines31. We recorded a total of 86 neurons from 12 different archer fish in all sets of experiments.

Estimation of the location of the receptive field

To estimate the location of the neuron’s receptive field, a bar was moved interactively by the experimenter across the screen with different orientations and moving directions to detect the limits of the receptive field. The bar was moved across the screen until a strong response was heard; this point was marked as one of the edges of the receptive field. In a similar way, the bar was moved in different directions to determine the remaining borders of the receptive field. This method enabled us to mark the receptive field boundary quickly enough to map many neurons from each animal. The error in determining the exact location of the edges is 1.5°.

Visual stimuli

To determine the neuron’s preferred direction of motion, a bar with length and width adjusted to match the receptive field size was first moved across the receptive field with direction of motion orthogonal to the bar’s orientation. The preferred direction of the neuron was determined using the auditory output of the recording system. Then a bar with a width of 1 cm (corresponding to 3–6°, depending on the distance of the screen from the fish eye) and length adjusted to match the receptive field length was configured to move across the width of the receptive field. The speed of motion was one receptive field width per second. Eight bars were displayed in a matrix around the receptive field (Fig. 3b). The distance between the central bar and the surrounding bars was two receptive field sizes along the axis of movement and one receptive field size along the axis perpendicular to the movement. The stimulus consisted of 15–25 repetitions of three cycles of movement, 2 s each, for a total of 6 s per repetition, followed by 2 s of black screen, to avoid any adaptation effects.

There were five types of stimulus in the electrophysiological experiment: (1) Single bar: one bar moving inside the receptive field with a speed of one receptive field width per second. (2) Speed contrast: the inner bar moved twice as fast as the outer bars. That is, the bar started its movement half a receptive field width before the edge of the receptive field and finished the movement half a receptive field width after the edge of the receptive field on the other side, with a speed of two receptive field widths per second. The outer bars moved with a speed of one receptive field width per second. (3) Direction contrast: the inner bar and the outer bars moved in opposite directions. All bars moved with a speed of one receptive field width per second. (4) No contrast: the inner bar moved in coincidence with the outer bars with a speed of one receptive field width per second. (5) Additive contrast: the inner bar moved both twice as fast, that is, two receptive field widths per second, and in the opposite direction to the bars outside of the receptive field.

In addition, we conducted three control experiments. The motivation for the first control was that in the behavioural experiment, the observed angular speed of the moving bars may change from trial to trial due to the movement of the fish (that is, swimming) in the tank and the lack of smooth pursuit eye movements. To control for such an effect, we changed the speed of motion in the speed contrast condition to one receptive field width per second for the inner bar and half a receptive field width per second for the outer bars. We recorded 21 neurons from 3 different fish for the subset of data in this experiment. We found that the ratios of speed contrast neurons and speed neutral neurons in both conditions were 48% (10/21) and 52% (11/21), respectively. We concluded that the population of speed contrast neurons was not changed significantly as a result of this manipulation. In the second control experiment, we changed the spacing between the inner bar and the outer bars. In this experiment, the distance between the centre bar and the surrounding bars was three receptive field sizes along the axis of movement and one and a half receptive field sizes along the axis perpendicular to the movement (Supplementary Fig. 3a). We recorded 14 neurons from 2 different fish in this experiment. We found that the ratio of contextually modulated neurons in this experiment was 57% (8/14). We concluded that the population of contextually modulated neurons was not changed significantly as a result of this manipulation (χ²-test, P=0.68). In the third control experiment, we changed the polarity of the stimulus. That is, we used black moving bars on a white background (Supplementary Fig. 3b). Here we recorded 21 neurons from 3 different fish. We found that the ratio of contextually modulated neurons in this experiment was 52% (11/21). We concluded that the population of contextually modulated neurons was not changed significantly as a result of this manipulation (χ²-test, P=0.38).

In addition, we performed two power analyses. In the first analysis we found that, with the number of neurons we actually recorded, we could achieve significance at the P<0.05 level for a contextually modulated proportion of 29% or less. In the second analysis we found that to detect a significant difference of 0.05% in 90% of the cases, with an estimate of the variance derived from the actual data collected, we would need >200 neurons in each of the control experiments. Therefore, we are convinced that any effect that may exist must be a weak one. We conclude that the population of contextually modulated neurons is a robust phenomenon and does not depend in a critical manner on the exact details of the stimulus parameters.

Classification of neurons

To quantify the effect of the surrounding bars on the response of each neuron, we first measured the firing rate of the neuron in each cycle of the stimulus by counting the spikes in the entire 6 s cycle of the stimulus display, in each condition. Then we used a t-test to assess the statistical significance of the difference between the firing rates for the different conditions. We used the Holm–Bonferroni method to adjust for multiple comparisons. For convenience, we denote the firing rates in the speed contrast, direction contrast, no contrast and additive contrast conditions as S, D, N and A, respectively. In addition, we denote a significant increase by >> and a non-significant increase by =. Neurons with S>>N and D=N were classified as speed-contrast neurons. Neurons with D>>N and S=N were classified as direction-contrast neurons. Neurons with S>>N and D>>N were classified as both-contrast neurons. Neurons with S=N and D=N were classified as no-contrast neurons. Neurons with N>>S or N>>D were classified as contextually inhibited neurons. Finally, neurons with A>>S, A>>D and A>>N were classified as additive neurons.
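The classification rules above can be sketched in code. This is an illustration under stated assumptions rather than the routine used in the study: the t-test P values are taken as given inputs, the Holm–Bonferroni correction is applied over the five comparisons shown (the actual set of comparisons corrected together is not specified), the inhibition rule is checked first when rules could overlap, and all rates, P values and names are hypothetical.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Holm-Bonferroni step-down procedure: returns a reject flag per test."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # once one test fails, all larger P values fail too
    return reject

def relation(rate_a, rate_b, significant):
    """'>>' or '<<' for a significant difference, '=' otherwise."""
    if not significant:
        return "="
    return ">>" if rate_a > rate_b else "<<"

def classify(rates, significant):
    """Apply the S/D/N/A rules; returns (contrast class, is_additive)."""
    s_vs_n = relation(rates["S"], rates["N"], significant[("S", "N")])
    d_vs_n = relation(rates["D"], rates["N"], significant[("D", "N")])
    if "<<" in (s_vs_n, d_vs_n):  # N>>S or N>>D (assumed to take priority)
        label = "contextually inhibited"
    elif s_vs_n == ">>" and d_vs_n == ">>":
        label = "both-contrast"
    elif s_vs_n == ">>":
        label = "speed-contrast"
    elif d_vs_n == ">>":
        label = "direction-contrast"
    else:
        label = "no-contrast"
    additive = all(
        relation(rates["A"], rates[c], significant[("A", c)]) == ">>"
        for c in ("S", "D", "N")
    )
    return label, additive

# Hypothetical neuron: mean firing rates (spikes/s) and raw t-test P values.
rates = {"S": 30.0, "D": 12.0, "N": 10.0, "A": 45.0}
pairs = [("S", "N"), ("D", "N"), ("A", "S"), ("A", "D"), ("A", "N")]
raw_p = [0.001, 0.30, 0.004, 0.002, 0.0005]

significant = dict(zip(pairs, holm_bonferroni(raw_p)))
label, is_additive = classify(rates, significant)
print(label, is_additive)
```

In this made-up case the neuron is labelled as a speed-contrast neuron that also meets the additive criterion; changing the priority of the inhibition rule would only matter for neurons that are both enhanced in one condition and suppressed in the other.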

Additional information

How to cite this article: Ben-Tov, M. et al. Pop-out in visual search of moving targets in the archer fish. Nat. Commun. 6:6476 doi: 10.1038/ncomms7476 (2015).