Serotonergic Neurons Mediate Operant Conditioning in Drosophila Larvae

Observed across species, operant conditioning facilitates learned associations between behaviours and outcomes, biasing future action selection to maximise reward and avoid punishment. To elucidate the underlying neural mechanisms, we built a high-throughput tracker for Drosophila melanogaster larvae, combining real-time behaviour detection with closed-loop optogenetic and thermogenetic stimulation capabilities. We demonstrate operant conditioning in Drosophila larvae by inducing a bend direction preference through optogenetic activation of reward-encoding serotonergic neurons. Specifically, we establish that the ventral nerve cord is necessary for this memory formation. Our results extend the known role of serotonergic neurons in insect learning and point to the existence of learning circuits outside the mushroom body. This work supports future studies on the function of serotonin and the mechanisms underlying operant conditioning at both circuit and cellular levels.


Animals must rapidly alter their behaviour in response to environmental changes. An important adaptation strategy is associative learning (Dickinson, 1981; Rescorla, 1988), in which an animal learns to predict an unconditioned stimulus (US) by the occurrence of a conditioned stimulus (CS). The US is often a punishing or rewarding event such as pain or the discovery of a new food source (Pavlov, 1927). The nature of the CS distinguishes two major associative learning types: classical conditioning (Pavlov, 1927) and operant conditioning (Skinner, 1938; Thorndike, 1911).

In classical conditioning, the CS is an inherently neutral environmental stimulus such as a sound, odour, or visual cue. Pairing with an appetitive or aversive US leads to learned approach or avoid- […]

[…] vices (DMDs) which were programmed to project small 1 cm² squares at the location of individual larvae. Both DMDs, which were positioned to project over the entire plate area, were operated simultaneously (Figure 1C).

Thermogenetic stimulation of individual larvae was achieved by directing a 1490 nm infrared (IR) laser beam through a two-axis scanning galvanometer mirror positioning system (Figure 1C) […] the 1490 nm wavelength is well-absorbed by water (Curcio and Petty, 1951), larvae exposed to the IR beam were rapidly heated. We took advantage of the galvanometer's high scanning velocity to rapidly cycle the beam between four larvae (Figure 1D).

Software architecture

[…]

Optogenetic and thermogenetic stimulation efficiency verified by behavioural readout

We conducted proof-of-principle experiments to ensure that our set-up could be successfully used for optogenetic stimulation (Figure 2A). Ohyama […] (Figure 2C), suggesting that the DMDs could be used for optogenetic stimulation without activating the animals' photoreceptors.

We also verified the efficacy of the galvanometer set-up for thermogenetic stimulation (Figure 2D).
We tested whether 69F06-Gal4 x UAS-dTrpA1 and 72F11-Gal4 x UAS-dTrpA1 larvae rolled upon exposure to the IR laser (Figure 2E, see also Materials and methods). In each stimulation cycle, we observed above-threshold rolls in over 70% of 69F06-Gal4 x UAS-dTrpA1 larvae and over 35% of 72F11-Gal4 x UAS-dTrpA1 larvae; a significant contrast to the attP2 x UAS-dTrpA1 control larvae whose roll rate was close to zero. We concluded that these heating conditions were effective for targeted Trp channel activation without larvae perceiving strong pain (Figure 2F).

Operant conditioning of larval bend direction

We chose optogenetic activation of reward circuits as a US for automated operant conditioning.

The main challenge was determining which neurons could convey a sufficient reinforcement signal, especially as the capacity for Drosophila larvae to exhibit operant learning was not yet demon- […] Gal4 x UAS-CsChrimson larvae to bend more often to one side than the other. Although stimulation side was randomized across trials, we describe (for simplicity) the experiment procedure where this predefined side was the left. Each experiment began with a one-minute test period where no light was presented. What followed were four training sessions, each three minutes long, in which larvae received optogenetic stimulation when bending to the left. Between training sessions, […]

Figure 2. Optogenetic and thermogenetic stimulation with the high-throughput tracker. a. Hardware design schematic for optogenetic stimulation. Although the high-throughput tracker included two digital micromirror devices (DMDs), only one is shown for simplicity. b. Proof-of-principle experiment protocol for optogenetic stimulation. c. The fraction of larvae for which a roll was detected in each stimulation cycle. 69F06-Gal4 x UAS-CsChrimson and 72F11-Gal4 x UAS-CsChrimson larvae (CsChrimson expressed in neurons triggering roll behaviour; experiment groups) were compared to attP2 x UAS-CsChrimson larvae (no CsChrimson expression; control group). Fisher's exact test was used to calculate statistical differences between the experiment and control groups (*** p < 0.001). d. Hardware design schematic for thermogenetic stimulation. Although the high-throughput tracker included four two-axis galvanometers, only one is shown for simplicity. IR: infrared. e. Proof-of-principle experiment protocol for thermogenetic stimulation. f. The fraction of larvae for which a roll was detected in each stimulation cycle. 69F06-Gal4 x UAS-dTrpA1 and 72F11-Gal4 x UAS-dTrpA1 larvae (dTrpA1 expressed in neurons triggering roll behaviour; experiment groups) were compared to attP2 x UAS-dTrpA1 larvae (no dTrpA1 expression; control group). Fisher's exact test was used to calculate statistical differences between the experiment and control groups (*** p < 0.001).
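The group comparisons in Figure 2 rest on Fisher's exact test applied to roll counts. As a minimal sketch of that test, assuming a 2 x 2 table of larvae that did or did not roll in one experiment group and one control group, the two-sided p-value can be computed directly from the hypergeometric distribution. The function name and any counts passed to it are hypothetical and are not data from this study.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2 x 2 table [[a, b], [c, d]].

    Hypothetical layout (an assumption for illustration):
        a = experiment larvae that rolled, b = experiment larvae that did not,
        c = control larvae that rolled,    d = control larvae that did not.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of seeing x rolls in the first row
        # when all row and column totals are held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables that are at least as unlikely as
    # the observed one; the small relative tolerance guards against
    # floating-point ties.
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)
```

A call such as `fisher_exact_two_sided(rolled_exp, not_rolled_exp, rolled_ctrl, not_rolled_ctrl)` would then be compared against the chosen significance threshold for each stimulation cycle.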

Figure 3. Operant conditioning of bend direction in Drosophila larvae requires the ventral nerve cord. a. Experiment protocol using the high-throughput closed-loop tracker. Behaviours are depicted as larval contours (black) with head (green). During training, the larva received an optogenetic stimulus (red light bulb) whenever it bent to one predefined side (here depicted as the left for simplicity), and light was switched off during all other behaviours (grey light bulb). b,d,e. Larval bend rate shown as the number of bends per minute, grouped by bend direction. The bend rate to the stimulated side (depicted as a left bend with a red light bulb for simplicity) is shown in red and the bend rate to the unstimulated side (depicted as a right bend with a grey light bulb for simplicity) is shown in grey. For larvae that received random, uncorrelated stimulation during 50% of bends, the bend rates to the left and right are shown in black. Statistical differences within groups were tested with a two-sided Wilcoxon signed-rank test; statistical differences between two groups were tested with a two-sided Mann-Whitney U test. c,f. Probability that a given bend was directed towards the stimulated side or, in the case of the uncorrelated training group, towards the left. Grey line indicates equal probability of 0.5 for bends to either side. Statistics calculated from a two-sided Wilcoxon signed-rank test. b-f. Gal4 expression depicted as color-coded CNS. All data are shown as mean ± s.e.m.; n.s. p ≥ 0.05 (not significant), * p < 0.05, ** p < 0.01, *** p < 0.001. b. Bend rate for Ddc-Gal4 x UAS-CsChrimson larvae. Data are shown from the test period before the first training session and the test period after the fourth training session. c. Data from same experiments as in b. d. Same data as in b, but bend rate for uncorrelated training group was calculated without stratification by bend direction. e. Bend rate for Ddc-Gal4 x UAS-CsChrimson; tsh-LexA, LexAop-Gal80 and 58E02-Gal4 x UAS-CsChrimson larvae. The effector, UAS-CsChrimson, is omitted from the figure for visual clarity. Data are shown from the test period immediately following the fourth training session. f. Data from same experiments as in e.

For each larva, two measures served as a read-out for bend direction preference: i) the bend rate, measured as the number of bends per minute performed towards a given side, and ii) the probability that a given bend was directed towards the stimulated side, obtained by normalising the bend rate with the total number of bends performed by the larva in that minute. Individual larva variation in bend rate yielded different results for these measures at the population level. In the one-minute test prior to the first training session, we observed no significant difference in larval bend rate to either side and the likelihood of these naïve animals choosing one side over the other was not significantly different from chance. In the one-minute test following the fourth training session, larvae showed a preference for bends towards the side paired with red light stimulation during training, and the probability of these larvae bending towards this previously stimulated side was significantly greater than 50% (Figure 3B).

The light-dependent activation of neurons using CsChrimson requires a cofactor, retinal, which […] iment. This suggested that the US, which triggered a learned direction preference for bends in larvae raised on retinal, was indeed the collective activation of all Ddc neurons and not the red light itself. Notably, when directly comparing larvae raised with retinal to this control group raised without, the two groups showed no significant difference in the bend rate towards the stimulated side. Instead, the bend rate towards the unstimulated side was significantly reduced in larvae that received paired training compared to this control (Figure 3B). This raised the question of whether larvae were learning to prefer the side paired with the rewarding US, or rather to avoid the side without the stimulus.
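The two read-outs of bend direction preference described above (bend rate per side, and the probability that a bend goes to the stimulated side) can be sketched as follows. This is a minimal illustration, assuming each larva's test period is available as a list of bend directions labelled "L" or "R"; the function name, the labels, and the example counts are hypothetical and are not taken from the tracker software or the data.

```python
from typing import Sequence, Tuple


def bend_measures(
    bend_sides: Sequence[str],
    stimulated_side: str,
    minutes: float = 1.0,
) -> Tuple[float, float, float]:
    """Return (rate_stimulated, rate_unstimulated, p_stimulated) for one larva.

    Rates are bends per minute towards each side. The probability is the
    number of bends to the stimulated side divided by all bends in the
    period; it is NaN when the larva did not bend at all.
    """
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    n_stimulated = sum(1 for side in bend_sides if side == stimulated_side)
    n_unstimulated = len(bend_sides) - n_stimulated
    total = n_stimulated + n_unstimulated
    p_stimulated = n_stimulated / total if total else float("nan")
    return n_stimulated / minutes, n_unstimulated / minutes, p_stimulated


# Two hypothetical larvae with very different overall bend rates.
larva_a = ["L"] * 8 + ["R"] * 2   # frequent bender, mostly left
larva_b = ["L"] * 1 + ["R"] * 3   # infrequent bender, mostly right
measures = [bend_measures(larva, "L") for larva in (larva_a, larva_b)]

mean_rate_left = sum(m[0] for m in measures) / len(measures)
mean_rate_right = sum(m[1] for m in measures) / len(measures)
mean_probability = sum(m[2] for m in measures) / len(measures)
```

In this toy case the mean bend rates favour the left more strongly than the mean per-larva probability does, because the frequent bender dominates the rate average while each larva counts equally in the probability average. This is one way the two measures can diverge at the population level.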

To confirm that the bend preference we observed after training was attributable to pairing light with bends solely in one direction, we conducted another control experiment in which larvae received random, uncorrelated stimulation during 50% of bends regardless of direction. After training, larvae showed neither a difference in absolute left and right bend rates, nor a significant probability of choosing one side over the other (Figure 3B, Figure 3C). These bend rates averaged together were indistinguishable from those of pair-trained larvae as they bent to the previously stimulated side. However, larvae which received uncorrelated training showed a significantly higher bend rate overall compared to pair-trained larvae bending to the previously unstimulated side (Figure 3D).

The mushroom body is not sufficient to mediate operant conditioning in larvae

Our experiments showed that activation of Ddc neurons is a sufficient US for operant conditioning. […] bend direction preference. Their sufficiency was inconclusive, however, since perhaps two or more distinct groups of Ddc neurons needed collective activation in order to form a memory.

We then assessed whether exclusively activating the PAM cluster dopaminergic neurons innervating the MB could induce operant conditioning, as is the case for classical conditioning. 58E02-Gal4 drives expression in the majority of these neurons (Rohwedder et al., 2016). 58E02-Gal4 x UAS-CsChrimson larvae did not develop any direction preference for bends following training (Figure 3E, Figure 3F). It is unsurprising that activation of these neurons alone could not act as a rewarding US in this paradigm, given our finding that Ddc neurons in the brain and SEZ are insufficient. It is remarkable, however, because it suggests that the neural circuits signalling reward in operant con- […]

[…] group, larvae were exposed to alternating three-minute presentations of ethyl acetate with red light and air with no light. To ensure that any observed effects were a result of learning rather than innate odour preference or avoidance, an unpaired group was trained simultaneously with reciprocal stimulus presentation (odour/dark, air/light). Following training, larvae in both groups were tested on their preference for the odour in the absence of light (Figure 4A). All learning scores were compared to a negative control containing no GAL4 driver, w1118 x UAS-CsChrimson, which did not exhibit a learning phenotype (Figure 4B). Consistent with prior study results (Rohwedder […]

[…] During training, larvae in the paired group received three minutes of optogenetic red light stimulation (solid red circles) paired with the odour (white cloud) followed by three minutes of darkness (solid white circles) paired with air (no cloud). The unpaired group received reciprocal stimulus presentation (dark paired with odour, light paired with air). This procedure was repeated three times. In half of the experiments, the order of training trials was reversed, starting with air presentation instead of odour presentation. Both groups were then tested for learned odour preference in the dark with odour presented on one side of the plate and no odour on the other (PI = performance index). b. Performance indices following olfactory conditioning, plotted as raw data points and mean. w1118 x UAS-CsChrimson was the negative control (grey, n = 8), 58E02-Gal4 x UAS-CsChrimson was the positive control (blue, n = 8). Statistical comparisons to w1118 x UAS-CsChrimson were calculated using a two-sided Mann-Whitney U test with Bonferroni correction; n.s. p ≥ 0.05/7 (not significant), ** p < 0.01/7. Statistical comparisons to Tph-Gal4 x UAS-CsChrimson were calculated using a two-sided Mann-Whitney U test with Bonferroni correction; n.s. p ≥ 0.05/2 (not significant), *** p < 0.001/2. c,d. All data are shown as mean ± s.e.m.; n.s. p ≥ 0.05 (not significant), * p < 0.05. c. Experiments followed the protocol depicted in Figure 3A. Data are shown from the test period immediately following the fourth training session. Larval bend rate shown as the number of bends per minute, grouped by bend direction. The bend rate to the stimulated side (depicted as a left bend with a red light bulb for simplicity) is shown in red and the bend rate to the unstimulated side (depicted as a right bend with a grey light bulb for simplicity) is shown in grey. Statistical differences within a group were tested with a two-sided Wilcoxon signed-rank test. d. Probability that a given bend is directed towards the stimulated side. Grey line indicates equal probability of 0.5 for bends to either side. Statistics were based on a two-sided Wilcoxon signed-rank test.

[…] previously stimulated and unstimulated sides in the one-minute test period (Figure 4C). Furthermore, the probability that any given bend was directed towards the previously stimulated side was not significantly different from chance (Figure 4D). Activating these dopaminergic neurons was an insufficient substitute for reward or punishment in operant conditioning.

Paired activation of Tph-Gal4 neurons during bends to one side resulted in a significantly higher bend rate to the stimulated side relative to the unstimulated side during the test period (Figure 4C). The probability of bending in the previously stimulated direction was also significantly elevated (Figure 4D). In this way, activation of Tph-positive serotonergic neurons paired with bends to one side was sufficient for the formation of a learned direction preference. Combining this result with the knowledge that operant conditioning was impaired following restriction of Ddc-Gal4 x UAS-CsChrimson expression to the brain and SEZ suggests that the serotonergic neurons of the VNC were necessary for memory formation in this paradigm. Because Tph-Gal4 is a broad driver line, it is possible that its expression pattern contains brain or SEZ neurons outside of those in Ddc-Gal4. The existence of these neurons could have potentially induced learning through an alternate mechanism independent from that which drove memory formation following Ddc neuron activation.

To assess whether the VNC serotonergic neurons were necessary for the observed operant conditioning effect, we used tsh-Gal80 to restrict the Tph-Gal4 expression pattern to the brain and […] (Huser et al., 2012).

While there are few serotonergic VNC candidates, we could not conclude from our data whether […] since, to our knowledge, no sparse driver lines exist to exclusively target VNC serotonergic neurons.

[…] One possible explanation for these discrepancies is that multiple groups of sensory neurons must be co-activated in order to relay a meaningful reward signal. Alternatively, it may be necessary to adjust the temporal pattern or intensity of optogenetic stimulation.

[…] (Mendoza et al., 2014). The vertebrate homologue FOXP2 is associated with deficits in human speech acquisition (Lai et al., 2001), song learning in birds (Haesler et al., 2007), and motor learning in mice (Groszer et al., 2008).

[…] (Lee et al., 2011). Furthermore, serotonin receptor signalling is required for memory formation in classical conditioning tasks (Johnson et al., 2011). In larvae, aversive olfactory conditioning is impaired by either ablation of serotonergic neurons during development or mutations in a serotonin receptor gene (Huser et al., 2017).

Our work suggests a novel role of serotonin as a reward signal for learning in Drosophila larvae.

In our olfactory classical conditioning screen, optogenetic stimulation of serotonergic neurons in the brain and SEZ was sufficient to induce strong appetitive learning. Conversely, operant con- […]

[…] To track larvae over time, the host computer assigned a numerical identifier to each eligible object. We used distance-based tracking with a hard threshold of 40 pixels to maintain larval ID based […] were edge pixels (Figure 1-Figure Supplement 1).

[…] (Figure 1-Figure Supplement 2).

We defined the larval spine as 11 points running along the central body axis from head to tail (Figure 1-Figure Supplement 3; Swierczek et al., 2011). In addition to head and tail, the Behaviour […]

The Behaviour Programme transformed the raw contour and spine from camera coordinates (in pixels) to world coordinates (in mm). If stable larval detection criteria were met, all spine points were temporally smoothed using exponential smoothing (Figure 1-Figure Supplement 3).

Feature extraction

We developed a machine learning approach to address the high deformability of the larva shape, ensure live execution, reduce overfitting, and limit the volume of data tagging. What follows is a brief summary of larval features describing motion direction, body shape, and velocity that were […]

To extract features in real time and address various sources of noise, we implemented exponential smoothing defined as follows for a given feature f (Figure 1-Figure Supplement 7): […] (Figure 1-Figure Supplement 7).

Convolution was used to approximate a smoothed squared derivative for each feature (Figure 1-Figure Supplement 8); useful for integrating information over time without needing to further expand the feature space. The underlying mathematical concepts were motivated by Masson et al. (2012). For a given feature f at time t, f_convolved_squared was calculated as follows: […]

Behaviour classifiers

Behaviour classifiers were developed using a user interface similar to JAABA (Kabra et al., 2013). The underlying algorithms combined trained neural networks and empirically determined linear […] (Table 1).

The bend classifier was based on predefined thresholds for temporally smoothed body shape […] on the forward classifier and a threshold on forward tail velocity.

The roll classifier was based on thresholds for body shape and velocity combined with no ball […] fitting the corresponding camera coordinates.

We determined that DMD illumination using the default light output was not uniform at plate level, which could have resulted in variable optogenetic stimulation depending on larval location.

The maximum achievable light intensity at the plate's edge was approximately 40% of the peak value at its centre. We therefore normalised the pixel intensity of the DMD image to the highest […] to camera coordinates using the existing world-to-camera transform and was then mapped to a pair of galvanometer input voltages using the look-up tables.

Laser intensity calibration was also necessary to ensure that all larvae received the same stimu-

The experiment protocol began and ended with a one-minute test period without optogenetic stimulation. Between these test periods were four three-minute training sessions during which larvae received red light stimulation of 285 µW/cm² for the entire duration of the detected bend. […] necessary to remove invalid objects from the data set prior to behavioural analysis. These included corrupted objects (e.g. scratches on the plate or residual food) that the software mistook for larvae.

They also included larvae that lost their object identity and were consequently detected for only 859 part of the experiment (e.g. after temporarily reaching the plate's edge or touching other larvae).

After equally splitting each experiment into 60 s time bins, we retained objects for analysis that […] of neurons (US) was paired with odour presentation (CS) to induce olfactory memory (Figure 4A). For each driver line, data was acquired from at least two separate crosses.

Classical conditioning followed a procedure similar to those described in Gerber and Hendel […]

[…] A simplified example is shown using a 10 x 10 pixel box containing a small object. a. The object (black) was detected against the background (white) using binary thresholding. Edge pixels were detected by combining the results of vertical and horizontal image convolution with a 2 x 1 XOR kernel using an OR operator. b. The contour points were reconstructed in an iterative process, starting with the edge pixel closest to the centre of the box. The next contour point was defined as the first neighbouring pixel that was found to be an edge pixel. Neighbouring pixels were assessed clockwise from the pixel directly above the contour point. The process ended when no eligible edge pixels could be found.

[…] The larval contour (black outline) and head and tail (green) are shown. a. Initial detection of head and tail. The head was the contour point with the sharpest curvature. The tail was the contour point with the next-sharpest curvature which did not lie in close proximity to the head. b. The initial detection of head and tail was incorrect in some cases. False detection could be corrected by swapping head and tail, thereby minimising the distances from head and tail in the current frame (solid contour) to head and tail in the previous frame (transparent contour). c. The correction described in b failed if larvae curled up such that the contour appeared circular ("ball"). To eliminate this source of false head and tail detection, these events were detected using a ball classifier.
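Two steps from the captions above lend themselves to a short sketch: edge detection with a 2 x 1 XOR kernel, and the head/tail swap correction. The code below is an illustration under stated assumptions rather than the tracker's implementation. It assumes the thresholded image is a list of rows of 0/1 values, that the XOR result is stored at the first pixel of each compared pair (the captions do not say which pixel of the pair is marked), and that the swap decision compares the summed head and tail distances to the previous frame. All function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def edge_pixels(mask: Sequence[Sequence[int]]) -> List[List[bool]]:
    """Mark edge pixels in a binary mask (1 = object, 0 = background).

    A pixel is XOR-compared with its right neighbour (horizontal pass) and
    with the neighbour below it (vertical pass); the two results are then
    combined with OR. Pixels in the last column or row have no such
    neighbour, so the corresponding pass contributes nothing there.
    """
    rows, cols = len(mask), len(mask[0])
    edges = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            horizontal = c + 1 < cols and mask[r][c] != mask[r][c + 1]
            vertical = r + 1 < rows and mask[r][c] != mask[r + 1][c]
            edges[r][c] = horizontal or vertical
    return edges


def correct_head_tail(
    head: Point, tail: Point, prev_head: Point, prev_tail: Point
) -> Tuple[Point, Point]:
    """Swap head and tail if that brings them closer to the previous frame.

    Assumption: closeness is the sum of the head-to-head and tail-to-tail
    Euclidean distances between consecutive frames.
    """
    cost_keep = math.dist(head, prev_head) + math.dist(tail, prev_tail)
    cost_swap = math.dist(tail, prev_head) + math.dist(head, prev_tail)
    if cost_swap < cost_keep:
        return tail, head
    return head, tail
```

As the caption for panel c notes, a distance-based swap of this kind cannot be trusted when the contour is nearly circular, which is why a separate ball classifier is needed to flag those frames.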
Figure 1-Figure supplement 3. Calculating a smooth spine and landmark points. The larval contour is shown (black outline). The spine comprised eleven points (black), including head and tail (green). a. The raw spine points were obtained by finding the centres between equally spaced contour points on either half of the contour as defined by head and tail. The first spine point was the head, the last spine point was the tail. b. The smooth spine was obtained by exponentially smoothing the raw spine. c. Four additional landmark points, neck_top, neck, and neck_down (blue), and the contour centroid (grey), were calculated.

[…] head and tail are shown in green. a. crab_speed (blue) was defined as the component of neck_speed (grey) that was orthogonal to direction_vector_filtered (black). b. parallel_speed (blue) was defined as the component of neck_speed_filtered (grey) that was parallel to direction_vector_filtered (black). c. parallel_speed_tail_raw (blue) was defined as the component of tail_speed_filtered (grey) that was parallel to direction_tail_vector_filtered (black). d. tail was defined as the angle between tail_speed_filtered (grey) and direction_tail_vector_filtered (black).

[…] Figure 4B. Gal4 expression depicted as color-coded central nervous system. Preference scores for paired (light/odour, dark/air) and unpaired (dark/odour, light/air) groups are shown in red and grey, respectively.
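The velocity features defined in the Figure 1 supplement caption above (parallel_speed and crab_speed) are the components of a velocity vector parallel and orthogonal to a body direction vector. A minimal sketch of that decomposition follows, assuming 2D vectors given as (x, y) tuples. The function name is hypothetical, and the sign convention for the orthogonal component (positive when the velocity points counter-clockwise from the direction vector) is an assumption, since the text does not specify one.

```python
import math
from typing import Tuple

Vector = Tuple[float, float]


def decompose_speed(velocity: Vector, direction: Vector) -> Tuple[float, float]:
    """Split a velocity into components relative to a direction vector.

    Returns (parallel, orthogonal):
      parallel   = projection of velocity onto the unit direction vector
      orthogonal = signed component perpendicular to it (2D cross product
                   of the unit direction with the velocity)
    """
    dx, dy = direction
    norm = math.hypot(dx, dy)
    if norm == 0:
        raise ValueError("direction vector must be non-zero")
    ux, uy = dx / norm, dy / norm
    vx, vy = velocity
    parallel = vx * ux + vy * uy
    orthogonal = ux * vy - uy * vx
    return parallel, orthogonal
```

For a neck velocity and a filtered direction vector, the first returned value would play the role of parallel_speed and the second that of crab_speed. The squares of the two components always sum to the squared magnitude of the velocity, which gives a quick consistency check.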