Near-optimal rotation of colour space by zebrafish cones in vivo

For colour vision, retinal circuits must separate information about intensity and wavelength. This requires circuit-level comparison of at least two spectrally distinct photoreceptors. However, many vertebrates use four or more, and in those cases the nature and implementation of this computation remains poorly understood. Here, we establish the complete circuit architecture and function of outer retinal circuits underlying colour processing in the tetrachromatic larval zebrafish. Our findings reveal that the specific spectral tunings of the four cone types near optimally rotate the encoding of natural daylight in a principal component analysis (PCA)-like manner to yield one primary achromatic axis, two colour-opponent axes as well as a secondary UV-achromatic axis for prey capture. We note that fruit flies – the only other tetrachromat species where comparable circuit-level information is available - use essentially the same strategy to extract spectral information from their relatively blue-shifted terrestrial visual world. Together, our results suggest that rotating colour space into achromatic and chromatic axes at the eye’s first synapse may be a fundamental principle of colour vision when using more than two spectrally well-separated photoreceptor types.


INTRODUCTION
In visual scenes, information about wavelength is fundamentally entwined with information about intensity because the spectrum of natural light is highly correlated [1][2][3] . Accordingly, wavelength information must be extracted by comparing the signals from at least two spectrally distinct photoreceptors. To this end, most animal eyes use up to five spectral types of photoreceptors for daylight vision, with around four being the norm for vertebrates (reviewed in Ref 4 ). However, our knowledge on how the signals from four or more spectral types of photoreceptors are harnessed at a circuit level to extract this specific chromatic information remains limited.
Increasing the diversity of available spectral photoreceptors exponentially expands the diversity of theoretically detectable spectral contrasts. However, there is a law of diminishing returns: In natural scenes, some spectral contrasts are much more abundant than others. For efficient coding [5][6][7] , animal visual systems should therefore prioritise the specific contrasts that are particularly prevalent in their natural visual world.
Here, we explored how zebrafish compute wavelength and intensity information from their natural visual world. Like many surface-dwelling fish, already their larvae use the 'full' ancient tetrachromatic cone-photoreceptor complement comprising red-, green-, blue-and UV-cones 8 . Importantly, their retinal circuits can be non-invasively monitored and manipulated in the live animal 9 to provide insights into the computation of colour in the intact circuit.
We asked three questions: (i) What is the in vivo spectral tuning of zebrafish cone-synapses, (ii) how does this specific tuning support efficient sampling and decomposition of natural light and (iii) what is the circuit implementation? Surprisingly, we found that zebrafish effectively 'solve' the basic wavelength discrimination problem already at their first synapse: Red-cones encode broadband achromatic information, green-cones signal along an optimal primary chromatic axis, blue/UV cones together encode a second, near-optimal chromatic axis, while UV-cones by themselves provide a secondary 'UV-isolating' achromatic signalpresumably for visual prey capture 10 . We go on to show how this spectral tuning is anatomically and functionally implemented at the circuit-level using horizontal cells.
Finally, zebrafish are not alone in using such an efficient strategy. By linking the spectral tuning of Drosophila melanogaster photoreceptors 11 with hyperspectral natural imaging data 12 we note that fruit flies use essentially the same trick. However, their spectral tunings are systematically blue-shifted compared to those of zebrafish, presumably to directly acknowledge the relatively blue-shifted statistics of natural light above the water 13 . Taken together, our findings highlight a potentially general circuit-level mechanism of vision whereby incoming light is nearoptimally decomposed into "colour" and "greyscale" components at the earliest possible site.

Spectral tuning of zebrafish cones in vivo.
To determine spectral tuning functions of the larval zebrafish's four cone types 8 (red, green, blue, UV), we custom-built a hyperspectral full-field stimulator based on an earlier design 14 (Fig. 1a,b). A diffraction grating was used to reflect the light from 14 LEDs (350 to 680 nm) into a collimated fibreoptic that was pointed at the live zebrafish's eye mounted under a 2-photon (2P) microscope. To avoid spectral cross-talk with the 2P imaging system, we line-synchronised each LED's activity with the scanner retrace 15,16 . Together, this arrangement permitted spectrally oversampling the much broader cone opsins (Fig. 1b,c) during in vivo 2P imaging in the eye. 14 : Two banks holding a total of 14 spectrally distinct and collimated LEDs are pointed at a diffraction grating. A spectrally narrowed fraction of each LED's light is then further collimated into a light guide and presented as full-field to the zebrafish under the 2P microscope. To prevent spectral cross-talk, the LEDs are time-interleaved with the 2P scan 15 . b, Effective spectra of the 14 LEDs measured at the sample plane. c, Govadovskiitemplates of opsin absorption spectra for the four zebrafish cones 49,50 . Green and blue cones, but not red and UV cones, display strong spectral opponency. We generated four cone-type specific SyGCaMP6f lines (Fig. 2a,b) 17 and measured the spectral tuning of each cone type at the level of their pre-synaptic terminals (pedicles) (Fig. 2c,d). Here, cones connect with other cones via gap junctions 18 , with horizontal cells (HCs) which provide both feedback and feedforward inhibition 19 , as well as with bipolar cells (BCs) which carry the photoreceptor signal to the feature extracting circuits of the inner retina 20 .

Figure 1 | Hyperspectral stimulation under 2-photon. a, Schematic of the stimulator, inspired by Ref
From fluorescence traces, we extracted tuning functions (Methods), inverting both the x-and y-axes ( Fig. 2d and inset). The inversions were done to display tuning functions from short-to long-wavelengths as is conventional, and to compensate for the fact that vertebrate photoreceptors hyperpolarise in response to light 21 . We further adhered to the time-inversion henceforth to facilitate comparison between raw data and summary plots (e.g. Fig. 2e). We systematically measured such tuning functions for n = 409, 394, 425, 431 individual red-, green-, blue-and UVcones, respectively (n= 9, 11, 12, 7 fish). A total of n = 172, 288, 312, 410 recordings, respectively, passed a quality criterion (Methods, Fig. S1a-d) and were kept for further analysis.
Because the larval zebrafish eye is both structurally and functionally asymmetrical 10,13,[22][23][24] , we always sampled from four different regions of the eye's sagittal plane: dorsal (D), nasal (N), ventral (V) and the area temporalis (acute zone, AZ (also known as "strike zone" 13 )). With exceptions noted below, we found that the spectral tuning of cones was approximately eye-position invariant (Fig. S1d). For further analysis we therefore averaged across cones irrespective their position in the eye (Fig.  2e,f). On average, red-and UV-cones had monophasic (non-opponent) tuning functions that were largely in line with the tuning function of their respective log-transformed opsins (Methods). Such a log-transform is expected from the nature of signal transfer between outer segment phototransduction to synaptic calcium in the pedicle 11,25,26 . Red-cones were broadly tuned and never exhibited opponency (Fig. 2f, left). In fact, some individual red-cones hyperpolarised in response to all tested wavelengths (Fig. 2e, left, cf. Fig.  S1d). Nevertheless, on average red-cone sensitivity was weakly suppressed in the UV-range compared to the log-transformed opsin template. In contrast, all UV-cones were narrowly tuned up to the shortwavelength cut-off imposed by the eye optics (~350 nm, unpublished observations). Their tuning curve near perfectly matched the respective opsin template (Fig. 2b, right). UV-cones in the AZ and ventral retina moreover exhibited weak but significant opponency to mid-wavelengths (Fig. S1d).
Unlike red-and UV-cones, the in vivo tuning functions of green-and bluecones did not match their log-transformed opsin templates. Instead, these cones consistently exhibited strong spectral opponency to mid-and/or long-wavelength light. Here, blue-cones had a highly consistent zerocrossing at 483±1 nm, while most green cones inverted at 523±1 nm (mean, 95% confidence intervals, Methods). Green-cones in the acute zone were slightly long-wavelength shifted with a zero-crossing at 533±1 nm (Fig. S1d).
Together, to our knowledge this established the first direct and in vivo measurements of cone-pedicles' spectral tuning functions in a vertebrate.
Near-optimal encoding of achromatic and chromatic contrasts in natural light. To explore how the specific in vivo cone tuning functions may support zebrafish vision in nature, we next computed the distribution of achromatic and chromatic content of light in their natural habitat. For this, we used a total of n = 30 underwater hyperspectral images (1,000 pixels each: 30,000 spectra) 12,13 (Fig. 3a-c). Using one example scan for illustration (Fig. 3a), we first computed each cone's view of the world in the absence of outer retinal feedback by multiplying each opsin spectrum with each pixel spectrum pointwise and summing over the entire spectrum ( Fig.  3d-f). In this configuration, the intensity-normalised representations of the scene by each of the four cones were extremely similar as expected from high spectral correlations in natural light (Fig. 3d). In contrast, when the same scene was computed for the intact outer retinal network by taking the in vivo cone-tuning functions (from Fig. 2f), the different cones instead delivered much more distinct images ( Fig. 3g-i).
Next, to determine the spectral axes that optimally capture the variance of natural light in the zebrafish's natural underwater world, we used principal component analysis (PCA) across the spectra of all n = 30,000 pixels in the data set (Fig. 3c, j-l). Due to the strong spectral correlations in natural light, the first component (PC1) captured the achromatic ("black and white") image content, while subsequent components (PC2, PC3 etc.) captured the major chromatic ("colour") axes in decreasing order of importance 1,5 . We computed what the example scene would look like if sampled by detectors that were directly based on PC1-3. We found that the scenes processed by PC1 and PC2 (Fig. 3j) were highly reminiscent of the scenes sampled by in vivo red-and green cones, respectively (Fig. 3g). Next, PC3 closely resembled the scene when reconstructed by a short-wavelength axis (S, cyan) that was the average of in vivo blue-and UV-cones (cyan: short-wavelength, S). Direct superposition of these cones' spectral tuning functions with the first three principal components further illustrated this striking match (Fig. 3m), especially in the case of the achromatic (PC1 vs red-cone) and first chromatic axis (PC2 versus green-cone). These specific in vivo cone-tuning functions served to decorrelate cone-type responses to natural light (Fig. S2a), as further quantified in a substantial reduction in the multivariate dependence (Fig. S2b,c; see Methods). Taken together, it therefore appears that larval zebrafish effectively rotate colour space already at their visual system's first synapse signal along an achromatic axis (red-cones) as well as along two chromatic axes (green-cones, blue+UV-cones), which are chosen to decorrelate the chromatic space almost optimally.

Figure 3 | In vivo cone tunings near-optimally decompose natural light statistics. a-c,
Hyperspectral data acquisition from zebrafish natural visual world. A 60° window around the visual horizon of an example scene recorded in the zebrafish natural habitat (a) was sampled at 1,000 equi-spaced points with a custom-built spectrometer-based scanner 12 (b) to yield 1,000 individual spectral readings from that scene. (c) summarises the pooled and z-normalised data from n = 30 scenes (30,000 spectra) with mean±SD (data from Ref 13  Next, a conceptually similar decomposition of natural light also appears to be used in Drosophila (Fig. 3n), the only other tetrachromatic species where in vivo spectral tuning functions of photoreceptor outputs are available 11 . In flies, R7/8 photoreceptors are associated with colour vision 27 . We therefore compared spectral tuning curves of the four varieties (pR7, yR7, pR8, yR8) of Drosophila R7/8-type photoreceptors (taken from 11 ) with the principal components that emerged from natural spectra of n = 4 1,000 pixel daytime field and forest scenes 12 . Like for zebrafish, this showed that their spectral tuning curves were reasonably well approximated by the first three terrestrial PCs: PC1 and yR8, PC2 and pR8 and finally PC3 and pR7/yR7. However, relative to zebrafish, all spectra were blue-shifted (Fig. S2d,e), in line with the relatively-increased predominance of short-wavelength light above the water (Fig. S2f). This suggests that rotating colour space into achromatic and chromatic axes as early as possible may be a fundamental principle of colour vision when using more than two spectrally well-separated photoreceptor types, in a striking example of convergent evolution.
We next wondered how the specific cone tunings in larval zebrafish ( Spectral tuning of zebrafish cones is fully accounted for by expressed opsins and horizontal cell feedback. The nature of phototransduction in cone-photoreceptors dictates that the absorption of photons can lead to a drop in synaptic calcium. Accordingly, light-driven increases in synaptic calcium ( Fig. 2f) must come from a sign-inverting connection from other cones, most likely via horizontal cells (HCs) 28,29 . We therefore pharmacologically blocked HC (Methods, Fig. 4a,b). This completely abolished all spectral opponency and increased the UVresponse amplitude of red cones. As a result, now all four cone-tuning functions were fully accounted for by the respective log-transformed opsins ( Fig. 4a,b, Fig. S3a). This further implied that possible heterotypical conecone gap junctions, if present, do not strongly contribute to spectral conetuning. In support, cone-tunings were essentially invariant to additional genetic ablation of UV-cones in the absence of HCs ( Fig. 4c-f). Moreover, reducing overall stimulus brightness to probe for possible response saturation had no major effects on tuning functions (Fig. 4g,h). Taken together, our results strongly suggest that in vivo, the spectral tuning of all zebrafish cones is driven by the expressed opsin variant and shaped only by specific connections with HCs relaying feedforward signals from other cones. What are these HC connections?

Figure 4 | Opsin-like cone-responses in the absence of horizontal cells. a,b, Population responses of each cone type during pharmacological blockage of HCs (a, Methods) and population mean±95% confidence intervals with log-transformed respective opsin template superimposed (b, Methods). c, pharmaco-genetic UV-cone ablation in the background of pan-cone GCaMP labelling before (top) and 24h after 2h treatment of metronidazole (10 mM) application (bottom, Methods). d, e, red-cone tunings after UV-cone ablation (n = 77) (d) and after additional pharmacological HC blockage (n = 103) (e). Shown are heatmaps (left) and means±SD (solid lines+shadings), and analogous data in the presence of UV-cones (dotted, from Figs. 2f, 4b). f, as (d), but here recording from blue cones (n = 30). g,h, red-(n = 17) (g) and UV-cone tunings (n = 43) (h) at reduced overall stimulus-light intensities (solid lines + shadings, Methods), compared to tunings at 'standard' light intensities (from Fig. 2f). Grey bars on the x-axis in (d-h) indicate significant differences based on the 99% confidence intervals of the fitted GAMs (Methods). Note that heatmaps (a,d-h) are time-inverted to facilitate comparison to summary plots (b, d-h).
A connectome of the larval zebrafish outer retina. Light-microscopy studies in adult zebrafish described at least three types of cone-HCs (H1-3), which contact R/G/B/(U), G/B/U and B/U cones, respectively 28,30 . However, for larval zebrafish HC-types and their connections to cones are not known except for H3 31 . To complete this gap in knowledge we used a connectomics approach based a combination of serial-section electron microscopy ( Fig. 5) and confocal imaging (Fig. S4, Methods). In total, we reconstructed a 70 x 35 x 35 µm patch of larval outer retina in the acute zone, which comprised n = 140 cones n = 16 HCs (Fig. 5a-d). UV-and blue-cones were identified directly in the EM-volume based on their characteristic OPL-proximal mitochondrial pockets (UV, Fig. S4a) and somata (blue, Fig. S4b), respectively. This allowed initially sorting cones into three groups: UV, blue and red/green. Next, we traced each HC's dendritic tree and identified their connections to cones belonging to each of these cone-groups ( Fig. 5d-k, Fig. S4c-h). Relating each HC's relative connectivity to UV-cones to their connections to red/green-cones allowed separating HCs into three groups (Fig. 5i, Fig. S4g), which were verified by clustering the HCs on all extracted features (Methods). These were dubbed H1, H2, and H3, based on their similarity to known adult HC types 30,32,33 . The same classification was then further confirmed by confocal microscopy (Fig. S4d-h). Of these, H1 reliably contacted all red/green-cones within their dendritic field, while H2 systematically avoided approximately half of these cones. In line with confocal data (Fig. S4), this allowed disambiguating red-cones (contacted only by H1) from green-cones (contacted by both H1 and H2). With the exception of n = 14 of 66 redgreen cones that could not be unequivocally allocated due to their location at the edge of the volume (yellow, counted as 0.5 red, 0.5 green in Fig.  5b,d), this completed cone-type identifications.
From here, we quantified each HC groups' connections to the four cone types. This revealed that H1 contacted essentially all red-, green-and bluecones within their dendritic fields, but imperfectly avoided UV-cones (Fig.  5j,k). In contrast, H2 by definition never contacted red-cones, but contacted all other cones including UV cones. Finally, H3 was strongly dominated by UV-cone contacts, with a small contribution from blue-cones. H3 never contacted red-or green-cones. Together, this largely confirmed that adult HC connectivity is already present in larvae, and moreover contributed cone-weighting information for the three HC types. We next asked how this specific HC-connectivity matrix underpins cone-spectral tunings.

H1 horizontal cells likely underlie most spectral tuning.
To explore how the three HC-types contribute to spectral cones-tunings, we first set up a series of circuit models for all possible combinations of HCs (Methods). These linear interaction models included the established connectivity structure (Fig. 5k) and were driven by the cone tunings in the absence of HCs (Fig. 4a), with the goal of explaining cone-tunings in the presence of HCs (Fig. 2f). We performed Bayesian inferences for the model parameters using likelihood-free inference 34 based on the cones' tunings, and we assumed sign-preserving connections from cones to HCs but sign-inverting connections from HCs to cones. The model recapitulated well the in-vivo tuning functions of all cones when simultaneously drawing on all three HC types. However, almost the same fit quality was achieved when using H1 alone ( Fig. 6a-d, cf. Fig. S5a-c), while H2 mainly fine-tuned the blue-and UV-cones and H3 had negligible impact on any cone-tunings (SFig. 5a). In fact, any model that included H1 outperformed any model that excluded H1 (Fig. S5a-c). H1, where present, also consistently provided the strongest feedback amongst HCs (Fig. 6d, Fig. S5c). Together, modelling therefore suggests that H1-like HCs are the main circuit element underlying the invivo spectral tuning of zebrafish cones. Moreover, the inferred relative cone-type weighting for H1 approximated their anatomical connectivity established by EM (Fig. 5j), with the exception of green-cones which had stronger-than-expected weights (Fig. 6d)possibly uncovering a relativelyincreased synaptic gain at this site. Fig. 5k (Methods). Cone tunings are initiated based on in vivo data during HC block (Fig. 4b) 52

and ROIs (bottom)) and responses (g, mean superimposed on individual repeats). h-j, results of clustering of mean responses from n = 86 ROIs (h, n = 15 fish) with cluster means (i) and extracted tuning functions (j, means±SD). k,l, mean tunings of in vivo HC clusters (k, from j), and superposition of each measured modelled (solid lines, from e) and measured (shading, from k) HCs. Note that raw-(g) and averaged (i) HCresponses as well as the summary heatmap (h) are time-inverted to facilitate comparison with summary plots (j-l).
Finally, we sought to verify the model by experimentally measuring the spectral tunings of HCs and comparing these to the predicted HC tunings from the full model (Fig. 6e). For this, we used in vivo 2P voltage imaging of HCs somata using ASAP3 35 (Fig. 6f-l) (Methods). In total, recordings from n = 86 HCs that passed a quality criterion (Methods) were sorted into three clusters (Methods). The largest cluster exhibited a spectrally broad, monophasic response that closely matched the model's prediction for H1 (Fig. 6l, see also Refs 28,32 ). Next, short-wavelength biased clusters 2 and 3 closely matched the model's prediction for H2 and H3, respectively 28,32 .
Taken together, our physiological recordings from cones (Figs. 1,2,4) and horizontal cells (Fig. 6f-l), linked to synaptic level EM-reconstructions (Fig.  5) and computational modelling (Fig. 6a-e) provide a comprehensive in vivo account of spectral processing for near-optimal decomposition of natural light (Fig. 3) at the visual system's first synapse in a tetrachromatic vertebrate.

Lead Contact
Further information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact, Tom Baden (t.baden@sussex.ac.uk).

Data and Code Availability
Pre-processed functional 2-photon imaging data, natural imaging data, EM-data, HC circuit modelling data, and associated summary statistics will be made freely available via the relevant links on http://www.badenlab.org/resources and http://www.retinal-functomics.net. Code for the model and the statistical analysis of the experimental data is available on Github (https://github.com/berenslab/cone_colour_tuning). Natural imaging datasets were published previously as part of Refs 12,13 .

Light Stimulation.
With fish mounted on their side with one eye facing upwards towards the objective, light stimulation was delivered as full-field flashes from a spectrally broad liquid waveguide with a low NA (0.59, 77555 Newport), positioned next to the objective at ~45˚. To image different regions in the eye, the fish was rotated each time to best illuminate the relevant patch photoreceptors given this stimulator-geometry. The other end of the waveguide was positioned behind a collimatorfocussing lens complex (Thorlabs, ACL25416U-A, LD4103) which collected the light from a diffraction grating that was illuminated by 14 spectrally distinct light-emitting diodes (LEDs) (details on LEDs below). Following an earlier design 14 , the specific wavelength and relative angle of each LED to the diffraction grating defined the spectrum of light collected by the collimator to be ultimately delivered to the fish's eye, according to: where α is the angle of light incident to the diffraction grating, λ the wavelength (in nm), β the first order diffraction exit angle and G the diffraction grating's groove density. Moreover, each LED was individually collimated (Signal Construct SML 1089 -LT-0454) and attached to a rail (Thorlabs, XE25L450/M; XE25L225/M) by a 3D printed holder (available at https://github.com/BadenLab/HyperspectralStimulator). From here, stimulus power was adjusted to be equal across all LEDs From here, stimulus power was adjusted to be equal across all LEDs at 0.44 µW with the exception of the two most shortwavelength ones, which were set to 0.20. An Arduino Due (Arduino) and LED driver (Adafruit TCL5947) were used to control and drive the LEDs, respectively. Each LED could be individually controlled, with brightness defined via 12-bit depth pulse-width-modulation (PWM). To time-separate scanning and stimulating epochs, a global "blanking" signal was used to switch off all LEDs during 2P scanning but enable them during the retrace, at line-rate of 500 Hz (see also Refs 15,16 . The stimulator code is available at https://github.com/BadenLab/HyperspectralStimulator).

2-photon calcium and voltage imaging.
All 2-photon imaging was performed on a MOM-type 2-photon microscope (designed by W. Denk, MPI, Martinsried; purchased through Sutter Instruments/Science Products) equipped with a mode-locked Ti:Sapphire laser (Chameleon Vision-S, Coherent) tuned to 960 nm for SyGCaMP6f and ASAP3 imaging. To measure HC tuning functions, we first expressed GCaMP6f in HCs. However, while we observed strong light-driven calcium responses at their dendritic tips, adjacent to cone terminals and thus indicative of local processing, we did not observe robust calcium responses in the HC soma (as a proxy of global processing). This lack of somatic calcium responses could be due to a putative lack of voltage-gated calcium channels in larval HC somata (unlike e.g. in adult mouse 43 ). Instead, we therefore measured voltage responses using the genetically encoded voltage sensor, ASAP3 35 , which presumably also gave a more direct readout of HC global function. We used two fluorescence detection channels for SyGCaMP6f/ASAP3 (F48x573, AHF/Chroma) and mCherry (F39x628, AHF/Chroma), and a water immersion objective (W Plan-Apochromat 20x/1,0 DIC M27, Zeiss). For image acquisition, we used custom-written software (ScanM, by M. Mueller, MPI, Martinsried and T. Euler, CIN, Tuebingen) running under IGOR pro 6.3 for Windows (Wavemetrics). Recording configurations were as follows: UV-cone SyGCaMP6f 128x128 pixels (2 ms per line, 3.9 Hz) or 256x256 pixels (2 ms per line, 1.95 Hz); all other cones SyGCaMP6f and horizontal cell ASAP3 256x256 pixels (2 ms per line, 1.95 Hz).

Pre-processing and extraction of response amplitudes of 2-photon data.
Regions of interest (ROIs), corresponding to individual presynaptic cone terminals were defined automatically based on local thresholding of the recording stack's standard deviation (s.d., typically > 25) projection over time, followed by filtering for size and shape using custom written scripts running under IGOR Pro 6.3 (Wavemetrics), as used previously in (Yoshimatsu et al., 2020). Specifically, only ellipsoidal ROIs (<150% elongation) of size 2-5 μm 2 were further analyzed. For ASAP3 recordings, ROIs were manually placed to follow the shape of individual HC somata. Calcium or voltage traces for each ROI were extracted and z-normalized based on the time interval 1-6 s at the beginning of recordings prior to presentation of systematic light stimulation. A stimulus time marker embedded in the recording data served to align the traces relative to the visual stimulus with a temporal precision of 2 ms. Following the approach used in Ref 44 , a quality criterium (QC) of how well a cell responded to a stimulus were computed as where C is the T by R response matrix (time samples by stimulus repetitions) and 〈 〉x and Var[ ]x denote the mean and variance across the indicated dimension, respectively. If all trials are identical such that the mean response is a perfect representative of the response, QC is equal to 1. If all trials are random with fixed variance, QC is equal to 1/R. For further analysis, we used only cells that responded well to the stimulus (QC >0.4 for SyGCaMP6f or >0.32 for ASAP3).
After filtering out poorly responsive cells using QC, outliners were removed using principal component analysis. Because in all cone types, PC1 explained >80% variance of the data, we computed the loading values of the principal component 1 of cone tuning function within each cone type and defined outliners as the cones with PC1 loading below 1.25 times the length of the 97 percentile departure from the mean. To extract response amplitudes to each stimulus wavelength, an exponential curve was fit to the entire rising (or falling, for hyperpolarising responses) phase during each stimulus presentation, with the maximum value of the fitted curve was taken as the response amplitude. Because cones are intrinsically "Off-cells" (i.e. hyperpolarize to light) we then signinverted extracted amplitude values such that Off-responses would yield positive amplitude readings, and vice versa for On-responses. However, for voltage imaging, because ASAP3 fluorescence intensity increases as cells hyperpolarize, we preserved the polarity of the response amplitudes.
Immunostaining and confocal imaging. Larval zebrafish (7-8 dpf) were euthanised by tricane overdose and then fixed in 4% paraformaldehyde (PFA, Agar Scientific, AGR1026) in PBS for 30 min at room temperature. After three washes in PBS, whole eyes were enucleated and the cornea was removed by hand using the tip of a 30 G needle. Dissected and fixed samples were treated with PBS containing 0.5% TritonX-100 (Sigma, X100) for at least 10 mins and up to 1 day, followed by the addition of primary antibodies. After 3-5 days incubation at 4°C, samples were washed three times with PBS 0.5% TritonX-100 solution and treated with secondary antibodies. After one day incubation, samples were mounted in 1% agar in PBS on a cover slip and subsequently PBS was replaced with mounting media (VectaShield, H-1000) for imaging. For HC imaging (Fig.  S5c-f), the retina was flat-mounted with the photoreceptors facing to the cover slip. For cone side-view imaging (Fig. S5a), the lens was kept attached the retina to maintain the spherical shape of the retina, with the whole "retina-ball" mounted with the lens side facing to the cover slip. All presented data was imaged in the acute zone.
UV-cone ablation. Larval zebrafish were immersed in fish water containing 10 mM Metronidazole (Met) for 2 hours to ablate nfsBexpressing UV-cones. Following Met treatment, zebrafish were transferred into fish water without Met and fed regularly until used for two-photon imaging.

Electron-microscopy data acquisition, reconstruction and annotation.
A larval zebrafish (8 dpf) was euthanised by tricane overdose and then a small incision on a cornea was made using 30G needle in a fixative solution containing 4% glutaraldehyde (AGR1312, Agar Scientific,) in 0.12M cacodylate buffer, pH 7.4. The tissue was immediately transferred into a 1.5 ml tube with the fixative, centrifuged at 3,000 rpm for 3 min, and further fixed in the fixative over-night on a shaker at room temperature. Subsequently, the tissue was washed 3 times in 0.12M cacodylate buffer, pH7.4 and incubated in a solution containing 1.5% potassium ferrocyanide and 2% osmium tetroxide (OsO4) in 0.1M cacodylate buffer (0.66% lead in 0.03M aspartic acid, pH 5.5) for 1 hour. After washing, the tissue was placed in a freshly made thiocarbohydrazide solution (0.1g TCH in 10 ml double-distilled H20 heated to 600 C for 1 h) for 20 min at room temperature (RT). After another rinse, at RT, the tissue was incubated in 2% OsO4 for 30 min at RT. The samples were rinsed again and stained en bloc in 1% uranyl acetate overnight at 40 C, washed and stained with Walton's lead aspartate for 30 min. After a final wash, the retinal pieces were dehydrated in a graded ice-cold alcohol series, and placed in propylene oxide at RT for 10 min. Finally, the sample was embedded in Durcupan resin. Semi-thin sections (0.5 -1 µm thick) were cut and stained with toluidine blue, until the fiducial marks (box) in the GCL appeared. The block was then trimmed and mounted in a Serial-blockface scanning electron microscope (GATAN/Zeiss, 3View). Serial sections were cut at 50 nm thickness and imaged at an xy resolution of 5 nm. Two tiles, each about 40 µm x 40 µm with an overlap of about 10%, covering the entire photoreceptor and horizontal cell layers in a side view at the acute zone were obtained. The image stacks were concatenated and aligned using TrackEM (NIH). The HCs and cones were traced or painted using the tracing and painting tools in TrackEM2 46 .

Clustering of HCs in EM and Confocal data.
To validate the ad hoc group assignment based on UV contacts (HC area) and R/G contacts for the electron microscopy (Fig. 5h,i) and confocal data (Fig. S5g) we used Mixture of Gaussian (MoG) clustering on all extracted features. These features (area size, number of contacts to R/G, B, U, for EM and area size, tip density, number of contacts to R, G, B/U for CM) were z-normalized and clustered in the same framework as the HC recordings (see below). The MoG clusters did coincide with the ad hoc group assignment.

Opsin Templates and log transforms.
For the log-transformed opsin templates (Fig. 2f, 4b) we assumed a baseline activation (represented by b in Eq. 2) and fit a linear transformation to take the arbitrary scaling of the recordings into account. We then optimized the function to minimize the mean squared error (MSE) between and the data of the HC block condition for each cone type: (1) where y is the mean of the HC block condition and f is the function . ( For the optimization we used the python package scipy.optimize.minimize (version 1.4.1). The inverse of this procedure is shown in Fig. S3a, where the mean of HC block condition is fitted in the same way to the opsin curves of each cone with the function: The data distribution (25 and 50 and 75 percentiles) is then calculated by passing each individual HC block recording through the optimized function .

Model of cone and HC interaction.
We modelled cone-HC interactions as a linear model and included the established (Fig. 5k)  Hereby we assume a symmetric connectivity pattern which is justified by the symmetrical cone mosaic in zebrafish. With these definitions, we can formulate the model recurrently as following: The inputs to the HCs is defined as where represents the raw activity in the synapse, which still has to be shifted according to the baseline. The summed outputs of the HCs are computed as Finally, the raw activity in the synapses are computed as where represents the wavelength dependent opsin activation.
The same formulas hold for computing the baseline of the cones, for which was set to 1, which accords to the applied normalization on the recorded data. The final output of the model are the tuning curves shifted to the cone specific baselines and normalized.
The same normalization procedure was applied to the shown HC spectra, which are the normalized spectra .
In the reduced models, in which we only included specific types of HCs, we set the corresponding entries in the weight matrix W to zero but did not change the model otherwise.
Model input. To extract the cone tuning curves from the experimental data for the model, we computed the mean amplitude of each bright and dark three seconds interval but excluded in each interval the first second as adaption time. We then took for every individual trace the difference of each bright interval to its preceding dark interval based on these means.
Finally, we averaged over these values for each cone type and experimental condition and, by assuming smooth tuning functions, interpolated (using the scipy function scipy.interpolate.interp1d) the data to an equidistant resolution of 1nm.
As input to our model we took the normalized traces of the blocked HC condition. This normalization can be interpreted as a maximal dark current of 1 and a minimal current of 0 during activation. The input acted as "opsinsensitivity" curves of the cones. We decided to use these curves instead of the theoretical available opsin tuning curves since we have a pure linear model and as shown in Fig. 4b these traces are a good proxy for the log-transformed opsin templates, which is the effective activation for this linear model. All spectral tuning curves of the cones were normalized to have a maximal absolute value of one.
Fitting procedure. We used the Sequential Neural Posterior Estimation method (also called SNPE-B) described in Ref 34 (code available at https://github.com/mackelab/delfi, version: 0.5.1) with small modifications which were already applied in 47 to fit our model.
In brief, SNPE-B draws parameters over several rounds from a (proposal) prior and evaluates the model for these parameters. For the evaluations the discrepancy function is computed and a mixture density network (MDN) is trained on the data pairs . The posterior is then calculated as and used as a new proposal prior in the next sampling round: . We took the MSE between model output and the data as discrepancy function. This implies , but as our data is noisy, our model cannot get to a MSE of zero. This would mean, that the MDN has to extrapolate to unreached discrepancy values, which could lead to an unstable behaviour. As a consequence, we took as the 0.01-percentile of in each round. This evaluation of can be understood as the posterior over the parameters for the "best possible" model evaluations. Testing for different percentiles in a reasonable range did not change the results. We took the same approach for setting an adaptive bandwidth for the kernel (see also Ref 47 ). As for a few models the posteriors became slightly worse after some rounds, we compared post-hoc the posterior distributions of each round and took the one with the smallest 1-percentile of its samples.
We ran SNPE-B over five rounds, with 200,000 samples per round. The prior was a multivariate normal distribution with mean and covariance where n is the number of model parameters, ranging from 11 (all HCs) to 3 (only H2). We chose three Gaussian components for the MoG and a MDN with two hidden layers with 100 nodes each. In each round the network was trained for 600 epochs with a minimum batch size of 500 and continuous learning started in round two. To let the MDN focus on regions of low discrepancy, we used a combined Uniform-Half-Gaussian kernel which was constant 1 up to and decayed then as a half Gaussian. The scale of the Half-Gaussian part was in each round chosen as the 20-percentile of the discrepancy values. For the presented tuning curves 100,000 samples were drawn from the final posterior and the model evaluated.

HC clustering based on spectral tuning.
To identify functional clusters we used a Mixture of Gaussians model (sklearn.mixture.GaussianMixture, version 0.21.2) with three components and diagonal covariance matrices on the pre-processed tuning curves (n = 86) which were additionally normalized to have maximal value of one. Aiming for a stable clustering, we ran the algorithm 1,000 times with different random seeds and chose the ones with the smallest BIC and under these chose the partition which appeared most often. The different runs did not change the general shape of the cluster means, but the specific assignment was variable for some traces. With this procedure we got a partition with n = 12, 19, 55 elements, which were allocated to the known functional tunings for HCs of adult zebrafish 28,32 .
Natural Imaging Data Analysis. The hyperspectral data were elementwise multiplied with a deuterium light source derived correction curve [S.x]. The data were restricted to the domain of 360-650 nm and z-normalised within a given scan. Here, the long-wavelength end of the domain was decided based on the long-wavelength opsin absorption curve; the shortwavelength end was dictated by the sensitivity of the spectrometer. The hyperspectral PCs were obtained using the scitkit-learn 0.22.1 implementation of the Principal Component Analysis algorithm. Only the first four components are displayed.
Hyperspectral measurement points were spatially aligned within the scan according to the scan raster (see Ref 12,13 for details). Pixel brightness is the projection of a given PC, or mean of the convolution with the opsin absorption or the observed cone response curves respectively. Presented images were smoothed using a Gaussian filter (sigma = 2px). Sum of Squares difference was taken between pairs of z-normalised images as well as their negatives. The lowest Sum of Squares (=highest correlation, either with the original or the negative) is displayed. Smoothing did not significantly affect this measure.
To statistically compare scene reconstructions by different sets of tuning functions (Fig. S2a-c), we used two parallel strategies. First, we computed the correlation coefficient between reconstructions by the different channels (e.g. in vivo red cone vs. green cone) as indicated for each of n = 30 scenes, thus yielding 30 correlation coefficients for each combination of channels in each condition. Amongst each comparison we then computed the mean and SD, as shown.
Second, to capture the multivariate dependence directly, we computed the mutual information under Gaussian assumption: where is the correlation matrix of the scene representations in the different channels (e.g. 4x4 in vivo: red-, green-, blue-, UV-cone). As the diagonal of is constant and equal to , the mutual information is proportional to the latter quantity. We normalized this quantity by the mutual information of the opsin set of tuning functions.

Statistics.
No statistical methods were used to predetermine sample size. Owing to the exploratory nature of our study, we did not use randomization or blinding.
We used Generalized Additive Models (GAMs) to analyse the relationships between wavelength and cone activity under different experimental conditions (Fig. 4d-h, Fig. S3). GAMs can be understood as an extension to the generalized linear model by allowing linear predictors, which depend on smooth functions of the underlying variables 48 . We used the mgcvpackage (version 1.8-31) in R on an Ubuntu 16.04.6 LTS workstation with default parameters. We modelled the dependence of the variable of interest as a smooth term with 13 degrees of freedom. The models explained ~59-82% of the deviance. Statistical significance for differences between the dependence of activation in the different experimental conditions were obtained using the plot_diff function of the itsadugpackage for R (version 2.3). Significance of opponency (Fig. S1d) and zero crossings of the tuning curves (Fig. 2f, Fig. S1d) were also calculated based on GAMs with "zone" as an additional predictive variable and grouping where applicable. Figure   S1 | Quality filtering of cone-recordings (related to Figure 2). a, heatmaps of all cone recordings of each type as indicated (n = 1,659), ranked by quality criterion (QC, Methods) from these recordings. c, First principal components from tunings extracted from all ROIs of a given cone type (PC1 and 2 are shown). In all cone types, PC1 explains >80% of the variance in the data. Thus, we used PC1 loadings to remove outliners (Methods). d, Means±SD for all ROIs that passed QC and outliner filtering, segregated by recording region (acute zone (AZ), black; dorsal (D), dark grey; nasal (N), mid grey; ventral (V), light grey). Note that most respective tunings superimpose well, indicating that cone-tunings are approximately eye-region invariant. Minor exceptions to this notion are labelled as applicable. Note also that heatmaps (a) are time-inverted to facilitate comparison to summary plots (c,d). Figure 3). a-c, statistical comparisons of between cone in vivo tunings and PCs: (a) pairwise correlation coefficients between component-reconstructions (as in Fig. 3e,h,k) for all n = 30 example scenes, for opsins, cone in vivo-tunings and PCs, as indicated. For opsins and cone in vivo tunings, both RGBU and RGS groups are indicated. Note generally low correlations for in vivo tuning functions and PCs, but high positive correlations for opsins. Shown are means±SD superimposed on scatter of all data. b,c, multivariate dependence measured using mutual information (Methods) normalised between opsins (mean defined as 1) and PCs (mean defined as 0). Note that in vivo cone tunings are generally smaller than 1 (indicating reduced dependence compared to opsins), and often close to 0 (i.e. maximally decorrelated). d-f, (cf. Fig. 3n,m) comparison of datasets from Drosophila (thick shaded lines) and zebrafish larvae (thin solid lines) for "functionally homologous" photoreceptor tunings (d), natural scene PCs (e) and overall spectral means±SDs (f). note that the overall waveforms are similar, but systematically blue-shifted for both Drosophila photoreceptor tunings and natural scene PCs. Figure 4). a, exponential fits of the HC blockage to the opsin curves. Dots indicating for which wavelength the opsin curve lies within (black) or outside (red) 50% of the data distribution. Figure 5). a, vertical cross-section through outer retina, with the four cone-types individually labelled by the combination of transgenic labelling of cone types in Tg(Opn1sw1:GFP, Opn1sw2:mCherry, thrb:Tomato) and zpr-1 antibody immunostaining (red, green, blue, magenta, as indicated) on a background of a DAPI nuclear stain (grey) (Methods). b, as (a) but now DAPI signal in specific cone types were extracted. c, Example single HC randomly labelled by plasmid injection into one-cell stage eggs to express membrane targeting YFP, with cone pedicles of red cones (Tg(thrb:Tomato), red) and both red-and green cones labelled (zpr-1 antibody immunostaining, cyan). This allowed directly attributing each HC contact with red-, green-cones (e.g. arrowheads) or others (blue-or UV-cone). d-f, example single HCs identified as H1 (d), H2 (e) and H3 (f), with cone-type contacts indicated. g, HC dendritic area in relation to tip density (g1) and percentage of red-and green-cone contacts (g2) for n = 25 HCs allowed splitting HCs into 3 groups, here allocated to H1-3 as indicated. h, Relative cone contributions to the three HCs. Figure 6). a, cone tuning functions (as in Fig. 4b) that emerge from models comprised of different HC combinations as indicated, with best fit, median and 25/75 percentiles plotted on top of the measured in vivo cone tunings (thick shaded lines). b-e, normalised loss (b), distribution of weights (c) and emergent HC tunings (d) for all modelled HC-combinations (as Fig. 6c-e,  respectively).