Birds multiplex spectral and temporal visual information via retinal On- and Off-channels

Early retinal circuits divide incoming visual information into functionally opposite elementary signals: On and Off, transient and sustained, chromatic and achromatic. Together these signals can yield an efficient representation of the scene for transmission to the brain via the optic nerve. For example, primate On- and Off-parasol circuits are transient, while On- and Off-midget circuits are sustained. But this long-standing interpretation of retinal function is based on mammals, and it is unclear whether this functional arrangement is common to all vertebrates. Here we show that poultry chicks use a fundamentally different strategy to communicate information from the eye to the brain. Rather than using functionally opposite pairs of retinal output channels, chicks encode the polarity, timing, and spectral composition of visual stimuli in a highly correlated manner: fast achromatic information is encoded by Off-circuits, and slow chromatic information overwhelmingly by On-circuits. Moreover, most retinal output channels combine On- and Off-circuits to simultaneously encode, or multiplex, both achromatic and chromatic information. Our results from birds conform to evidence from fish, amphibians, and reptiles which retain the full ancestral complement of four spectral types of cone photoreceptors. By contrast, mammals lost two of these cones early in their evolution, and we posit that this loss drove a radical simplification and reorganisation of retinal circuits, while birds and many other extant non-mammalian lineages retain the ancestral strategy for retinal image processing.

HIGHLIGHTS
First large-scale survey of visual functions in an avian retina
Off-circuits are fast and achromatic, On-circuits are slow and chromatic
Most avian RGCs are OnOff and encode both types of information
Colour and greyscale information can be decoded based on the kinetics


Vertebrate retinal circuits process a stream of spatial, temporal, and spectral information into parallel channels for transmission to the brain 1. The number of channels, what each channel encodes, and why, is a subject of active debate 2, but some general principles have emerged 3. For example, to save energy and to expand dynamic range, retinal circuits divide the visual signal into an approximately equal number of On- and Off-channels 4, and into transient and sustained channels [5][6][7]. Information about polarity and kinetics is thereby represented by independent 'elementary building blocks' with decorrelated signals 5. Similarly, wavelength information is efficiently funnelled into a subset of chromatic channels in parallel to the achromatic channels that dominate the retinal output 8,9.

These long-standing principles are overwhelmingly based on research on mammals, such as mice, primates, cats, and rabbits, and evidence from non-mammalian species is beginning to question their generality. [...] is lacking for most non-mammalian lineages, and especially for birds 17,18. This is in part due to the difficulty of recording from the ex-vivo bird retina 19- [...] bird.

We find that unlike mammals 2,26, chicks represent information about the polarity, kinetics, and wavelength in a highly correlated manner, with fast and achromatic Off-circuits, and slow and chromatic On-circuits. Moreover, most retinal outputs combine both On- and Off-circuits to simultaneously inform about both sets of information.

Most recorded cells correspond to retinal ganglion cells.
We recorded from n = 17 pieces of dorsal retina (n = 14 animals), yielding n = 3,987 spike sorted cells that passed a minimum response criterion (Methods). To estimate what fraction of these cells stemmed from retinal ganglion cells (RGCs), which are the retina's output neurons, rather than from displaced amacrine cells (dACs) 3, we computed the 'electrical image' 20 (EI) for a subset of spike-sorted cells (Figure 1f-l, Methods, see also Supplemental Video S1). This image permits inferring (i) each cell's spike initiation zone and soma, (ii) the trajectory of its axon, if present and well-attached (e.g. Figure 1f), (iii) its conduction velocity (Figure 1g), and (iv) whether it displayed saltatory (Figure 1h) or non-saltatory spike propagation (Figures 1f,i, cf. Figure 1j). We reasoned that axon bearing cells projecting towards the optic disc were likely to be RGCs. In contrast, axon-less cells likely included dACs and any RGCs whose axons were not detected (e.g. because they were on the array-edge, or because their axon was far from the retinal surface [...] to capture some of the diversity found. Cell 1 responded transiently to light offset (Figure 1m), preferred long- over short-wavelengths (Figure 1n), and had bandpass temporal tuning centred around 10 Hz (Figure 1o). The cell's spectral kernels were correspondingly narrow, indicating high-frequency temporal tuning, and confirmed the cell's non-opponent long wavelength preference (Figure 1p). Cells 2 and 3 were both OnOff cells but differed in their kinetics and spectral tuning. Cell 2 was On-sustained but Off-transient, with broad, low-pass tuning to temporal flicker, and its spectral kernels were colour opponent. [...] were sustained.
Third, analysis of the spectral kernels (Methods) revealed that colour processing was also linked to polarity: Most OnOff clusters had large-amplitude colour opponent kernels, whereas most Off- or On-dominated clusters were non-opponent or had low-amplitude kernels (Figure 2c). As a subset of these latter clusters responded more strongly to 'coloured' steps than to the white steps that were used to define their polarity and transience (discussed below), we also computed these measures from their peak responses to the 'coloured' stimuli (Supplemental Figure S2a).

OnOff dominance as a hallmark of non-mammalian retinas?
To compare chick RGC responses to those of other vertebrates, we sourced widefield 'white-step' response datasets of RGCs from larval zebrafish 10, mice 37, and humans 38 [...] revealed similarities between the chick and zebrafish datasets, which systematically differed from the mammalian systems.

Chicks (Figure 2a), and to a lesser extent zebrafish (Figure 2d), feature a sizable complement of OnOff channels (see also Supplemental Figure 2c,e,f). OnOff channels are also dominant in salamanders 15 and turtles 39. In contrast, segregated On- and Off-channels predominate in mice and humans (Figure 2e,f). Even the few well-described mammalian OnOff channels, such as the OnOff direction selective RGCs of mice or the small bistratified RGCs of primates, are dominated by either their On- or their Off-components when probed with 'white' steps of light. Second, the link between polarity and kinetics observed for chicks (Figure 2a) was also a feature of the zebrafish dataset (Figure 2d), but no such trend was detectable for mice or humans, whose RGCs occupy the full coding space encompassed by polarity and kinetics (Figure 2e,f).

What might be the benefit for chicks, and perhaps also for other non-mammalian vertebrates, of having combined OnOff-channels, rather than segregated channels for encoding On- and Off-events? And why are polarity, kinetics, and colour opponency linked?

To address these questions, we now analyse each system in turn: Off-, On- and then OnOff. We find that Off-channels encode fast achromatic contrast, On-channels primarily deal with spectral information, while their combination into OnOff-channels simultaneously carries information about both achromatic and chromatic aspects of the visual stimulus.

Off-circuits encode achromatic temporal contrast
No cluster yielded exclusively Off-responses, but the most Off-dominated cluster C1 was distinctive as the only cluster having large amplitude kernels without appreciable colour opponency ([...]

Although C1,2 were the two most Off-dominated clusters and had the fastest temporal tuning, their flicker responses were On at low and Off at high frequencies (Figure 3h). Importantly, the same reversal of response polarity applied to all clusters with any phase locking (Supplemental Figure 3b).

Thus, it appears that chicks use On- and Off-circuits to encode slow and fast temporal contrast respectively. Axonal conduction velocities of the four groups supported this conclusion: Off-RGCs were faster than OnOff-RGCs, which in turn were faster than On-RGCs (Figure 3i).

Beyond speed, Off-circuits had three properties suited to encoding achromatic intensity, exemplified by cluster C1. First, Off-responses had linear contrast-response functions to white steps (Figure 3j). Second, their spectral tuning matched a log-sensitivity function of the LWS opsin (Figure 3k), which is expressed in red single cones and the double cone (Figure 1b [...] On-responses were diverse (cf. Figure 4, discussed below).

Clusters C1 and C2, which contain the chick's rapid achromatic Off-circuits, comprised 4.6% and 2.8% of recorded cells respectively. By comparison, the functionally similar primate parasol cells comprise 10-16% of RGCs 41, while the types of alpha cells in mice make up about 5% 31,42-44. Unlike in chicks, these mammalian fast achromatic contrast systems have equal proportions of On- and Off-cells.

On-circuits encode wavelength information
In contrast to the homogeneous, fast, and achromatic Off-circuits, On-circuits tended to be heterogeneous, slow, and spectrally nuanced. For example, Ontr-cluster C18 had a highly selective On-response to blue-light (Figure 4a-c), while cluster C15 was selective for red-light (Figure 4d-f). Both clusters were selective for coloured over white light with the peak response to the spectral stimuli exceeding the corresponding 'white' response. Both C15 and C18 lacked any overt sign of colour-opponency but their spectral tuning was narrower than that of the spectrally closest opsins (Figure 4b [...] their spectral tuning functions were broader than a single opsin in isolation.

To explore how On- and Off-responses encode wavelength information we computed two indices (Methods): The spectral dominance index compares a cluster's largest colour step response to its 100% white-step response, such that -1 and 1 indicate exclusive response to white and colour steps, respectively, while 0 indicates equal responses. Conversely, the spectral tuning index compares the width of a cluster's spectral tuning function to that of its spectrally nearest log-opsin template, such that -1 and 1 respectively [...] remaining Off and OnOff clusters lie between these extremes. By contrast, Off-responses generally fell around the origin of both indices, indicating no preference for white or coloured light, and spectral tuning consistent with drive from a single opsin (Figure 4k, cf. Supplemental Figure S4b). Accordingly, On-circuits had spectrally distinct responses, while Off-circuits exhibited low spectral diversity. This difference was also illustrated by [...] than the more linear contrast-response functions of Off-circuits (cf. Figure 3l). The Onsus group was an exception to this rule, with supra-linear contrast-response functions.
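As an illustration of the spectral dominance index, the sketch below uses a normalised-difference form, (colour - white) / (colour + white). This form is an assumption: it is chosen only because it reproduces the stated endpoints (-1 for white-only, 1 for colour-only, 0 for equal responses), and the exact definition is given in the Methods. The function name and arguments are hypothetical, and amplitudes are assumed to be non-negative.

```python
def spectral_dominance_index(colour_amplitudes, white_amplitude):
    """Assumed normalised-difference index in [-1, 1].

    colour_amplitudes: non-negative response amplitudes to each colour step.
    white_amplitude:   non-negative response amplitude to the 100% white step.
    """
    colour_peak = max(colour_amplitudes)  # largest colour-step response
    total = colour_peak + white_amplitude
    if total == 0:
        # No response to either stimulus: report no preference.
        return 0.0
    return (colour_peak - white_amplitude) / total
```

A cluster responding only to coloured steps would score 1 under this form, and one responding only to the white step would score -1.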

Taken together, it therefore appears that the chick retina disproportionately leverages Off-circuits to encode rapid achromatic contrast (Figure 3), and On-circuits to encode wavelength (Figure 4).

OnOff channels multiplex spectral and temporal information
Building [...] Accordingly, we explored the possibility that OnOff-channels in the chick retina might be simultaneously informative about both spectral and temporal contrast (i.e. "colour" and "greyscale").

Evidence for multiplexing comes from the spectral kernels of OnOff cells (Figure 5a). The full kernels were colour opponent in all OnOff clusters (C2-13), but their opponency was usually time-dependent with the spectral kernels converging to a non-opponent Off-signal in the final ~100 ms preceding the spike. This effect is illustrated by cluster C7 (Figure 5a, fifth entry): at ~500 ms preceding the spike, the cluster is 'red/green'-On 'blue'-Off opponent (shaded in brown), but as the blue-kernel was monophasic (i.e. [...]

To systematically quantify this property, we defined colour-opponent (+1) and non-opponent (-1) phases in each OnOff cluster's spectral kernels (Figure 5a, brown and grey shadings, respectively). We also normalised each cluster's kernels in time (with "-1" and "0" indicating the timepoints where the kernels first exceed a minimum threshold amplitude, and the time of the spike, respectively). For all twelve OnOff clusters we then computed the mean±SD 'opponency fraction' over normalised time (Figure 5b; 1 and -1 on the y-axis denoting that all clusters were opponent or non-opponent, respectively, Methods). This confirmed the systematic nature of the spectro-temporal responses: On average, spectral kernels were colour opponent over long time scales, but non-opponent over short time scales.
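The opponency fraction described above can be sketched as a per-timepoint mean and SD across clusters of the +1/-1 phase labels. This is a minimal illustration under stated assumptions: every cluster's labels are taken to be already resampled onto the same normalised time bins, the population SD is used (the text does not say which SD), and the function and argument names are hypothetical.

```python
from statistics import mean, pstdev


def opponency_fraction(phase_labels):
    """Mean and SD of opponency labels across clusters, per normalised time bin.

    phase_labels: one list per cluster, all of equal length, with entries
                  +1 (colour-opponent phase) or -1 (non-opponent phase).
    Returns (means, sds), each with one value per time bin.
    """
    n_bins = len(phase_labels[0])
    if any(len(row) != n_bins for row in phase_labels):
        raise ValueError("all clusters must share the same normalised time bins")
    per_bin = list(zip(*phase_labels))  # regroup labels by time bin
    means = [mean(labels) for labels in per_bin]
    sds = [pstdev(labels) for labels in per_bin]  # population SD (assumption)
    return means, sds
```

A bin where every cluster is opponent gives a mean of 1 with zero SD; a bin where clusters are split evenly gives a mean of 0.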

Even though OnOff clusters differed by nearly an order of magnitude in their overall kernel kinetics (Figure 5a, cf. Extended Data 1, Figure 4b,c), within each cluster, kinetic order was stereotyped: Red-kernels tended to be the fastest, followed by green-, then cyan-, and finally blue. Accordingly, most OnOff clusters encoded a similar spectral hierarchy over different time-scales. Clusters C2-4 were fast at all four wavelengths, C5,6 intermediate and C7-12 were remarkably slow. The achromatic and strongly Off-dominated cluster C1 fitted the fast extreme of this pattern, as the fast non-opponent fraction of the kernel was retained but the slow opponent fraction lost (Extended Data 1). Overall, the analysis of spectral kernels suggests that OnOff channels simultaneously encode slow spectral and fast achromatic information, with the two sets of features being segregated, and thus decodable, by their relative timings.

"Time-dependent opponency" in OnOff channels emerges from differential spectral integration across On- and Off-circuits.
We next asked how these time-wavelength features might relate to the properties of the On- and Off-circuits in isolation. To this end, we analysed the OnOff clusters' On- and Off-responses to 'white' and 'coloured' steps of light (Figure 5c-f). In line with corresponding observations at the level of the kernels, this revealed a systematic wavelength-dependence of clusters' step responses: For both On- and Off-transitions, the largest amplitudes tended to occur for red, followed in turn by yellow, green, and cyan. For the Off-channel, this trend extended to blue, while for On, blue-responses tended to be larger than cyan responses. Together, these spectral responses accounted for the previously observed biphasic and monophasic spectral tuning functions of the On- and Off-channel, respectively (On: Figure 4l, Off: Figure 3m).

Step responses differed in their latencies as well as their amplitudes (Figure 5c,d). For example, red responses tended to be large, with short latencies, while cyan responses were smaller and delayed. An inverse link between response amplitude and latency is expected, but the variance across these two parameters was sometimes substantial. For example, the latencies of red- and green-On responses of cluster C6 differed by more than 60 milliseconds (red: 104 ms, green: 167 ms) despite their almost identical slopes (Figure 5c, fourth entry). By comparison, the On-response to the 100% white step was intermediate at 110 ms. In fact, within the On-channel, the latencies of red-responses tended to be slightly below those of the corresponding 100% contrast white step responses. Conversely, in the Off channel the white response was generally dominant (Figure 5d).
To explore if and how these types of amplitude and kinetic differences might encode stimulus wavelength and/or intensity, we quantified the OnOff clusters' 'colour' and 'white' step-response amplitudes and latencies and normalised each relative to their respective red-response (Figure 5e,f). This revealed that for the On-, but not for the Off-channel, the combination of these two simple metrics alone sufficed to substantially disentangle wavelength from intensity information. Red-responses were systematically faster than white-responses, allowing their detection by latency alone (Figure 5e), whereas green- and blue-, but not cyan- or yellow-responses, could be distinguished from the 'white' contrast series by their long latencies, despite their large amplitudes. While white-responses increased in latency while dropping in amplitude with decreasing contrast, blue- and green-On responses had almost equal amplitudes to red-responses but a nearly two-fold greater latency. In contrast, the Off-responses to 'coloured' and 'white'-steps had similar amplitudes and latencies, precluding their differentiation by these metrics (Figure 5f).
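The normalisation of amplitudes and latencies relative to the red response can be illustrated as follows. The sketch assumes that "relative to" means a simple ratio, which the text does not state explicitly; the function name and the dictionary layout are hypothetical.

```python
def normalise_to_red(responses):
    """Express each step response relative to the red-step response.

    responses: dict mapping a stimulus name to (amplitude, latency_ms);
               must contain the key "red".
    Returns a dict with the same keys and (relative_amplitude, relative_latency).
    Division by the red response is an assumed form of the normalisation.
    """
    red_amplitude, red_latency = responses["red"]
    if red_amplitude == 0 or red_latency == 0:
        raise ValueError("red response must be non-zero to normalise against it")
    return {
        name: (amplitude / red_amplitude, latency / red_latency)
        for name, (amplitude, latency) in responses.items()
    }
```

With the cluster C6 On-latencies quoted above (red 104 ms, green 167 ms), the relative green latency under this form would be 167/104, or about 1.6.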

Beyond amplitude and latency, step responses of the OnOff cells differed in their overall temporal envelopes. Accordingly, to capture amplitude and kinetic differences across 'coloured' and 'white'-step responses we used Principal Component Analysis (PCA). We first averaged step responses to all red-, yellow-, green-, cyan- and blue-steps as well as to the 100, 80, 60, 40 and 20% contrast steps (Figure 5g,h, top), which as expected reproduced the amplitude and kinetic features discussed previously. From these averages, we performed PCA separately on On- and Off-responses, both of which yielded a relatively slow first component followed by a faster second component that together captured >99% of the total variance (Figure 5g,h, middle). We then projected each response's loading onto these two components (Figure 5g,h, bottom). As predicted from our analysis of amplitudes and latencies alone (Figure 5e,f), this highlighted systematic differences in the encoding of 'coloured' versus 'white' stimuli for the On-channel, but a common encoding scheme for the Off-channel. For On-responses, the white contrast series was almost completely captured by the first principal component, while colour steps exhibited an approximately fixed loading onto PC1 but wavelength-dependent differences onto PC2. As expected, for Off-responses colour and white steps followed essentially the same trajectory in PC-space.
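The PCA step can be sketched in a few lines to make the roles of components and loadings concrete. This is an illustrative sketch and not the analysis code used here: it assumes each averaged step response is one observation with time bins as variables, that traces are mean-centred before decomposition, and it swaps in power iteration with deflation (adequate for two well-separated components) in place of a library eigensolver. All function names are hypothetical.

```python
import math


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _leading_eigenpair(matrix, n_iter=500):
    """Leading eigenvector/eigenvalue of a symmetric PSD matrix (power iteration)."""
    size = len(matrix)
    vec = [1.0 + 0.01 * i for i in range(size)]  # deterministic starting vector
    for _ in range(n_iter):
        nxt = [_dot(row, vec) for row in matrix]
        norm = math.sqrt(_dot(nxt, nxt))
        if norm < 1e-12:  # no variance left in this matrix
            return [0.0] * size, 0.0
        vec = [x / norm for x in nxt]
    eigenvalue = _dot(vec, [_dot(row, vec) for row in matrix])
    return vec, eigenvalue


def pca_two_components(traces):
    """traces: list of >= 2 equal-length response traces (one per stimulus).

    Returns (components, explained, scores):
      components: two temporal waveforms (PC1, PC2), unit length
      explained:  fraction of total variance captured by each
      scores:     per-trace projections onto (PC1, PC2)
    """
    n, t = len(traces), len(traces[0])
    mean_trace = [sum(tr[j] for tr in traces) / n for j in range(t)]
    centred = [[tr[j] - mean_trace[j] for j in range(t)] for tr in traces]
    cov = [[sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(t)]
           for i in range(t)]
    total_var = sum(cov[i][i] for i in range(t))
    components, explained = [], []
    for _ in range(2):
        vec, val = _leading_eigenpair(cov)
        components.append(vec)
        explained.append(val / total_var if total_var > 0 else 0.0)
        # Deflation: remove the component just found before searching again.
        cov = [[cov[i][j] - val * vec[i] * vec[j] for j in range(t)]
               for i in range(t)]
    scores = [[_dot(c, comp) for comp in components] for c in centred]
    return components, explained, scores
```

Plotting the two columns of `scores` against each other would give a PC-space view of the kind described above; note that the sign of each component is arbitrary.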

Finally, to establish the generality of this encoding strategy, we performed PCA separately for a subset of individual OnOff-clusters (C3-8) (Figure 5i,j). In each case, ensuring a common polarity of the first two components (Figure 5i,j, top), we normalised the loadings onto the two PCs between -1 and 1 to enable side-by-side comparison (Figure 5i,j, bottom). By and large, this strategy recapitulated the same distribution of loadings in PC space across coloured and white steps. White-On responses mostly followed PC1 with generally only weak loadings onto PC2, while conversely, red-, yellow-, green- and blue-On-responses required a relatively fixed loading onto PC1 but systematically distinct loadings onto PC2. In contrast, as before, coloured and white Off-responses followed a common, intermixed trajectory.

Taken together, we conclude that in the chick retina, and perhaps also in some other non-mammalian species, On- and Off-channels predominately encode slow-spectral and fast-achromatic information, respectively, while a majority of OnOff-channels simultaneously capture both types of information.

DISCUSSION

We have shown that the spiking output from the chick retina comprises multiple kinetically and spectrally diverse OnOff-channels, alongside sparser populations of fast achromatic Off-, and slow chromatic On-channels. These correlations between polarity, kinetics, and wavelength selectivity (Figure 2a-c) imply that there is a general organising principle in this avian retina, where Off- and On-channels represent fundamentally different aspects of the visual scene, namely time and colour.

Pathway splitting in non-mammalian retinas
The division of the visual signal into On- and Off-pathways at the retina's first synapse is a fundamental and ancient 46 feature of vertebrate vision 4. The division reduces energy requirements and increases the dynamic range 47 so it is not surprising that sensory systems should balance On- and Off-channels, as exemplified in mammal RGCs 2 (cf. Figure 2d-f). Well-segregated and coordinated On- and Off-channels also emerge de novo in retina-inspired computational models aiming to capture as much information as possible from natural scenes 48,49. Why, then, do chicks not comply with this arrangement?

One partial explanation might be that beyond efficient use of neural bandwidth and energy, other aspects of visual processing benefit from combining On- and Off-signals. Specifically, encoding wavelength independent of intensity, which is an essential feature of colour vision, requires opponency between spectrally distinct On- and Off-inputs 28.
One example of an OnOff chromatic opponent channel is the primate small bistratified RGC 50,51, which has two notable parallels to the multiple colour opponent OnOff cells of chicks. First, as in chicks, the Off-circuit is driven by LWS cones, and second, the blue On- and yellow Off-circuits have different time-courses: On is slow, but Off is fast 50. In primates, this feature is probably inherited from corresponding kinetic differences between SWS1 and LWS cones 52. It is unknown if also in chicks the different spectral photoreceptors have systematically different kinetics, but if they do, it might partly account for the correspondingly systematic kinetic differences in spectral kernels observed at the level of RGCs (Figure 5a) [...] 'blue and slow'), the overall response kinetics of these clusters varied by more than a log unit in speed. The different OnOff clusters therefore systematically encoded similar sets of spectral and kinetic contrasts, but for a range of temporal regimes.

To what extent these considerations will stand the scrutiny of systematic computational exploration will be important to explore in the future. It will then also be key to establish whether and how such a system might be extendable to the encoding of space. Based on long-standing work on the retinal output of mammals 57, a central prediction would be that the slow and fast cells should correspondingly encode small and large-scale spatial features, respectively.

Why are chick and mammal retinas so different?
The systematic functional differences in the retinal organisation of chicks, and perhaps of other vertebrate lineages, from that of mammals (Figure 2) imply that different clades have evolved divergent strategies for communicating visual information from the eye to the brain (Figure 6). Exploring these differences, their computational consequences, and how they may have come about, will be important in the future, but for now it may be useful to posit one possible avenue of exploration; namely that birds, reptiles, amphibians, and fish differ from mammals in their complements of photoreceptors (Figure 6a). While some lineages in each of the former clades retain the full complement of four ancestral single cones 58, often elaborating them in various ways, early mammals lost their SWS2 and RH2 cones 28,29, so that typical mammalian retinas are driven by two cone inputs, but many non-mammalian retinas are driven by four or more.

This systematic difference in inputs may lead to different retinal processing challenges. Perhaps the complex interplay of time and wavelength coding in the chick is a consequence of their relatively complex input system that is carried over from ancestral vertebrates, while the loss of two cones more than 200 million years ago, alongside, presumably, freeing up a substantial diversity of now disused inner retinal circuits, left mammals with an opportunity to evolve new, powerful processing strategies that were previously precluded.

In support, mammalian retinas also differ from those of birds, reptiles, amphibians, and fish in other ways (Figure 6b)

The avian double cone as the input to the fast, achromatic Off-circuit?
Bird and reptile eyes are unique in having 'double cones' that are distinct from the full complement of ancestral single cones 28,40,63. Double cones are made up of two tightly associated cells: a principal and an accessory member, which are independently wired into outer retinal circuits 63. Both members express the same 'red' LWS opsin that is also found in the ancestral LWS single cones. However, unlike LWS single cones of other species, which are generally associated with achromatic processing 29,30,33, direct recordings from either member of avian double cones have not been achieved, leaving insights as to their functions speculative 70. In general, their numerical abundance in the periphery 71, but absence from the fovea 72, hints at a key role in finely resolved temporal rather than spatial processing 10,73,74.

In chick, the only robust responses to fast achromatic flicker occurred in the Off channel, whose spectral tuning was consistent with near-exclusive drive from an LWS-expressing photoreceptor system (Figure 3). The simplest explanation for these observations is that the chick's rapid and achromatic Off-circuits are driven by either or both members of the double cone, and/or the red-single cone. In tentative agreement, at least two anatomical types of chick bipolar cells receive exclusive direct input from the principal member of double cones alongside inputs from rods 63. Both these cells stratify in the upper to middle fraction of the inner retina which is generally associated with Off-dominated processing 5. Our data therefore lend further credence to the idea that the avian double-cone system might support fast, achromatic vision 40, and add the perhaps surprising notion that this signal appears to be exclusively carried by Off-circuits.

On for 'colour', Off as a common reference
To distinguish wavelength from intensity, circuits for colour vision use colour opponency as their fundamental 'currency' 28. However, beyond a basic requirement of combining spectrally distinct On- and Off-signals, there are many options to build an opponent circuit. A short-wavelength On-pathway could be combined with a long-wavelength Off-pathway, but it is equally plausible to do the reverse, i.e. to combine short-Off with long-On. And yet, overwhelmingly, vertebrate circuits for colour vision, including in mammals, appear to favour the former 9,11,28,29,32,75. Second, when more than one colour opponent axis is established, the second axis could either oppose two entirely new wavelength ranges, or it could reuse one of the two signals from the first axis. Here, the spectral heterogeneity of the On-channels (Figure 4), but homogeneity of the Off-channels (Figure 3), suggests that chicks do the latter: They systematically oppose their spectrally diverse On-signals to a common, LWS-driven Off reference. Similarly, the zebrafish brain appears to be dominated by spectrally narrow and diverse On-signals, but spectrally homogenous, broad and LWS-shaped Off-signals 13.

Cortex-like hue-coding in the avian retina?
Despite lacking overt signs of colour opponency, many cells were 'spectrally selective' in that they exhibited sharper-than-opsin spectral tuning and a preference for spectrally narrow over 'white' light (Figure 4). Sharper-than-opsin tuning can, in principle, be achieved by rectification of a spectrally broader, non-opponent drive. However, in that case the cell would nevertheless be expected to respond strongly to 'white' stimulation, which was not the case for many of the spectrally narrow clusters (Figure 4j). Alternatively, narrow tuning could be built by rectifying an already opponent input.
For example, a hypothetical RGC rectifying an incoming drive from a colour opponent BC 12,69,76 such that only the On-lobe of the opponency persists could readily account for the profusion of 'colour-selective' On-RGCs in our dataset. Neurons with similar properties have long been discussed as part of the colour vision machinery of the primate cortex, where they are usually referred to as hue-selective 77. Similarly, narrower-than-opsin On-responses also exist in the zebrafish tectum 13

[...] hours overnight. In total darkness using infrared goggles, chicks were sacrificed by cervical dislocation followed by cutting of the aorta. Eyes were enucleated by first cutting the eyelid around the cornea, followed by a single anterior leading cut between eyes and beak. Using curved forceps (FST 11652-10, FST Heidelberg, Germany), the eyes were then lifted, and the optic nerve was cut using scissors with a partially blunt tip (FST 14083-08).

In the following the remaining muscle tissue around the eyes were removed 846 and lifted from the skull. The eyes were cut two times proximal to the edge 847 of the cornea using pointed scissors (FST 15017-10) and transferred into 848 two bottles containing preheated, high-magnesium oxygenated ringer 849 solution at 37°C. The bottles were light sealed. Eyes were transferred to the 850 experimental site for retinal dissection. Total time for the above was ~15 min. 851 Dissection followed procedures detailed in Ref 80 . However, all steps were 852 performed under infrared light using night vision googles (PSV-14, ACT in 853 Black, Luxembourg), and in a high-magnesium dissection ringer solution. 854 The following steps were performed in a petri dish unless otherwise stated: 855 1. Removal of the cornea from the eyeball with as few cuts around the 856 horizon as possible (FST 15017-10). 857 2. Cutting the eyeball along the dorsalventral axis using two cuts from 858 opposite sides. 859 3. Removing the vitreous from the eyeball using forceps. hemisphere. 862 5. Transfer of this piece onto a filter paper, with the RGCs facing the filter 863 paper and the remaining sclera facing up. 864 6. The filter paper and tissue were then transferred onto a kitchen roll 865 paper to draw solution, which flattened the tissue and aided attachment to 866 the filter paper. 867 7. The remaining sclera, choroid and retinal pigment epithelium were 868 removed from the retina by using forceps to peel those layers off the retina, 869 which remained attached to the filter paper. 870 8. The filter paper and retina were transferred back to the petri dish. 871 9. The retina was removed from the filter paper using forceps. 872 10. A smaller, about 2.5 mm 2 piece was cut. The retina would commonly 873 get folded at a few places during step 6 due to flatting of what normally is a 874 curved tissue. In this step, an area without folds was chosen. 
Folded parts of the retina were avoided even if this resulted in a smaller preparation. In addition, corners in the tissue were rounded off since these sites tended to trigger the degenerative waves.
11. The retina was transferred to the MEA chamber using a spoon. The tissue was guided onto the spoon using forceps and was continuously guided while on the spoon to avoid strong movement of the tissue during transfer.
12. In the MEA chamber the corners of the tissue were cut. This was done to remove parts of the tissue that had been damaged by the forceps during the tissue-guiding steps.
13. The tissue was placed onto the electrode array with a fine paintbrush.
14. The MEA chamber was dried using kitchen roll to suck out the ringer solution. As soon as the tissue was exposed to the air, new ringer solution was added directly on top of the tissue.

In what follows we describe the procedures followed to cluster the data. We clustered using all four of the above stimuli: CS, WS, Chirp and SK. To determine the spiking rate over time for each cell in response to each of the CS, WS and Chirp stimuli, we mapped all spike times onto the time interval spanned by the first repeat of that stimulus and applied kernel density estimation (KDE) using the Matlab routine ksdensity. We used the default probability density function for the KDE, such that the area under the resulting curve is equal to one, thus normalising the spiking rate across cells

defined as the most common amplitude value throughout the full response trace to these two stimuli, which yielded reasonable estimates as judged by manual inspection. This cluster-wise baseline estimate was also individually associated with each cell within a cluster because this procedure was judged to yield more robust cell-wise estimates compared to computing the same metric based on each cell.
The baseline value was used in two ways: as a display item in the spectral tuning and contrast-response plots (e.g. Figure 3j,k), and as a means to normalise response amplitudes across clusters, where tuning functions were normalised between 0 (baseline) and 1 (peak response, e.g. Figure 3l,m).

For each step transition (i.e. On and Off), we also computed 'transient' and 'sustained' response measures based on the peak spike rate within time windows of 80-160 and 240-2,000 ms following the step transition, respectively. To ameliorate the effects of noise, we used box-smoothed (window size of 40 ms) response traces to estimate these latter two metrics. The transient and sustained response metrics were used as the basis of the transience indices (e.g. Figure 2b). The transient-response amplitude was further used as the basis of the spectral dominance index (Figure 4j,k, Supplemental Figure 4), and to compute normalised amplitudes used for relating latencies and amplitudes in Figure 5e,f.
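The transient and sustained measures described above can be sketched as follows. This is a minimal illustration rather than the analysis code used for the paper: it assumes a rate trace sampled at a fixed step `dt_ms` whose first sample coincides with the step transition, and the function names (`box_smooth`, `window_peak`, `transient_sustained`) are hypothetical.

```python
import numpy as np

def box_smooth(trace, dt_ms, window_ms=40.0):
    """Moving-average (box) smoothing with the window given in milliseconds."""
    n = max(1, int(round(window_ms / dt_ms)))
    kernel = np.ones(n) / n
    return np.convolve(trace, kernel, mode="same")

def window_peak(trace, dt_ms, start_ms, stop_ms):
    """Peak value of `trace` between start_ms and stop_ms after the step."""
    i0 = int(round(start_ms / dt_ms))
    i1 = int(round(stop_ms / dt_ms))
    return float(np.max(trace[i0:i1]))

def transient_sustained(rate, dt_ms):
    """Return (transient, sustained) peaks from a box-smoothed rate trace.

    Assumption: rate[0] is the sample at the step transition.
    Windows follow the text: 80-160 ms and 240-2,000 ms.
    """
    smoothed = box_smooth(np.asarray(rate, dtype=float), dt_ms)
    transient = window_peak(smoothed, dt_ms, 80.0, 160.0)
    sustained = window_peak(smoothed, dt_ms, 240.0, 2000.0)
    return transient, sustained
```

Any baseline subtraction or normalisation would be applied separately; it is left out here because the text does not specify at which stage it enters these two measures.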

Polarity index (PI). For clusters, the polarity index (PI) was computed based on On- and Off-response amplitudes (see above) of the 100% contrast WS as:

$$\mathrm{PI} = \frac{A_{On} - A_{Off}}{A_{On} + A_{Off}}$$

where $A_{On}$ and $A_{Off}$ are the On- and Off-response amplitudes, respectively. Accordingly, PI ranged from -1 to 1 to denote entirely Off- and On-dominated responses, respectively. A PI of 0 denotes a cell with equal amplitude On- and Off-responses. The same measure was also used to compute polarity of the zebrafish, mouse and human datasets (Figure 2d-f, Supplemental Figure 2c), with the exception that in this case, $A_{On}$ and $A_{Off}$ were taken as the total response for 1 second (rather than 2 seconds) following a step-transition. This adjustment was necessary because the step-duration in the mouse dataset was 1 second.
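As an illustration, a polarity index with the properties stated above can be sketched as below. The sketch assumes the common normalised-difference form, (A_On - A_Off) / (A_On + A_Off), and non-negative amplitudes; the function name and the handling of a cell with no response (returning NaN) are assumptions made for this example.

```python
def polarity_index(a_on, a_off):
    """Normalised On/Off difference: -1 (pure Off) to +1 (pure On).

    Assumes non-negative amplitudes. Returns NaN when both are zero,
    which is a choice made for this sketch, not taken from the text.
    """
    total = a_on + a_off
    if total == 0:
        return float("nan")
    return (a_on - a_off) / total
```

For example, `polarity_index(3.0, 1.0)` gives 0.5, i.e. an On-dominated response with a clear Off component.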

Transience index (TI

This 'compound TI' was used for Figure 2b-e, while the individual On- and Off-TIs were used for Supplemental Figure 2d.

CS responses, and that of the 100% contrast white response. SD ranged from -1 to 1, indicating responses entirely dominated by WS and CS stimuli, respectively.

Spectral tuning (ST). An index of "spectral tuning" (ST) was devised to indicate how closely the spectral tuning of a given response matches that of any of the four opsins expressed across the chick's cones (LWS, RH2, SWS2, SWS1 17 ). To compute ST, we first log-transformed Govardovskii templates 84 of each of the four opsins and stretched them over one log-unit, such that in each case a linear response of 10% and 100% maximum was mapped to zero and 1, respectively. Such a ~10-fold log-transform is expected based on the phototransduction cascade of cones 85 . The resultant transforms are also used as the basis of all opsin-templates presented across the figures. For each cell or cluster's normalised spectral tuning function (see above) we next determined which of the four templates provided the closest match based on their correlation coefficient. We then subtracted the normalised tuning functions from their respective template, and computed ST as the mean of their difference, multiplied by -1. As such, ST was zero when the opsin template and response were perfectly matched, but negative and positive, respectively, if the response was spectrally broader or narrower than the opsin. Throughout, we applied a minimum response threshold of 3 spikes as the peak response; tuning functions based on fewer spikes were not considered.
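The template transform and the template-matching step can be sketched as follows. This is an illustrative sketch under stated assumptions: the function names are hypothetical, the log-compression is implemented as 1 + log10(x) clipped to [0, 1] (so that 10% maps to 0 and 100% to 1), and values below 10% are set to 0. The ST value itself is not computed here.

```python
import numpy as np

def log_compress(template):
    """Map linear sensitivity 0.1 -> 0 and 1.0 -> 1 on a log10 axis.

    Values below 10% of maximum (including zeros) are clipped to 0,
    which is an assumption of this sketch.
    """
    t = np.asarray(template, dtype=float)
    t = t / t.max()
    with np.errstate(divide="ignore"):
        out = 1.0 + np.log10(t)
    return np.clip(out, 0.0, 1.0)

def best_template(tuning, templates):
    """Return (name, r) of the template with the highest Pearson correlation."""
    scores = {
        name: float(np.corrcoef(tuning, tpl)[0, 1])
        for name, tpl in templates.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]

# Hypothetical usage with placeholder Gaussian curves. These are NOT
# Govardovskii templates, and the centres below are arbitrary.
wavelengths = np.linspace(360.0, 660.0, 31)
gauss = lambda centre: np.exp(-0.5 * ((wavelengths - centre) / 30.0) ** 2)
templates = {"A": log_compress(gauss(450.0)), "B": log_compress(gauss(560.0))}
name, r = best_template(0.8 * templates["B"], templates)
```

In practice the placeholder curves would be replaced by the four log-transformed opsin templates, sampled at the same wavelengths as the measured tuning function.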

Latency. Response latency was computed separately from On- and Off-transitions of all time-smoothed WS and CS responses (40 ms window size) as the time to half peak.

Principal Component Analysis (PCA). To compare the temporal response-envelopes elicited by CS and WS stimuli, we used principal component analysis. Computing separately for On- and Off-transitions, for a given cluster (Figure 5i,j), or the mean responses of all OnOff clusters (Figure 5g,h), we combined the first five CS (i.e. excluding UV) and first five WS (100, 90, 80, 70, 60% contrast) into a 50x10 input matrix (50 time bins of 20 ms each, 10 responses). To exclude residual activity from the preceding stimulus, we zeroed the first 4 bins (=80 ms). The first two principal components emerging from PCA across this matrix consistently explained >94% of the total variance. Accordingly, higher PCs were discarded. To relate the results from the PCA across clusters (Figure 5i,j), sets of loadings were individually peak normalised to 1.
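The latency and PCA steps can be sketched as below. This is a minimal sketch and not the original analysis code. It assumes that latency is read out as the first sample reaching half of the peak (without interpolation), and that time bins are treated as observations and the ten responses as variables, with each response mean-centred before an SVD-based PCA. All function names are hypothetical.

```python
import numpy as np

def half_peak_latency(trace, dt_ms):
    """Time (ms) of the first sample reaching half of the peak value."""
    trace = np.asarray(trace, dtype=float)
    peak = trace.max()
    if peak <= 0:
        return float("nan")
    return float(np.argmax(trace >= 0.5 * peak) * dt_ms)

def response_pca(responses, n_zero_bins=4, n_components=2):
    """PCA over a (time bins x responses) matrix, e.g. 50 x 10.

    Returns (scores, loadings, explained):
      scores    - temporal components, shape (n_bins, n_components)
      loadings  - per-response weights, peak-normalised to 1,
                  shape (n_responses, n_components)
      explained - fraction of total variance per kept component
    """
    x = np.array(responses, dtype=float)      # copy; input is not modified
    x[:n_zero_bins, :] = 0.0                  # drop residual activity
    x = x - x.mean(axis=0, keepdims=True)     # centre each response (assumption)
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    loadings = vt[:n_components].T
    loadings = loadings / np.abs(loadings).max(axis=0)
    explained = s[:n_components] ** 2 / np.sum(s ** 2)
    return scores, loadings, explained
```

The sign of each component is arbitrary in an SVD, so comparing loadings across clusters may require an additional sign convention that is not covered here.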

Spectral Kernels (SK)

Kernel amplitudes. Kernels were individually (R, G, C, B) z-normalised based on timepoints between 1,000 and 500 ms preceding the spike, and amplitudes were subsequently computed (in z-scores) as the difference between their maximum and minimum values. Cluster-mean kernels with an amplitude <2.5 were discarded from further analysis.

Spectral Centroids. To quantify the kinetics of each kernel, we estimated their central frequency in the Fourier domain ('spectral centroid') as follows.

We first computed each (R, G, C, B) kernel's probability mass function as its

the onset of the On- and Off-phase, respectively, as indicated e.g. in Figure 3h). For each cell and time bin we then determined the resultant median vector strength r (ranging from 0 to 1 to indicate a random relationship between each spike and the stimulus, and perfect phase locking, respectively) as described elsewhere 86 . For statistical testing, we also computed the same