Abstract
The visual cortex is organized hierarchically, but extensive recurrent and parallel pathways make it challenging to decipher signal flow between neuronal populations. Here, we recorded spiking activity from neurons in six interconnected areas along the mouse visual hierarchy. By analyzing leading and lagging spike-timing relationships among all measured neurons, we created a cellular-scale directed graph of the network. Using a novel module-detection algorithm to cluster neurons based on shared connectivity patterns, we uncovered several multi-regional communication modules that are distributed across the visual hierarchy. Based on the direction of signal flow between modules, differences in layer and area distributions, and distinct temporal dynamics, we found these modules support different stages of sensory processing. One module is positioned to transmit feedforward sensory signals along the hierarchy, whereas the other integrates inputs for recurrent processing. These results reveal a novel multi-area modular view of signal propagation in the mouse visual cortical hierarchy.
Introduction
Information processing in the neocortex involves signal representation, transformation, and transmission between processing levels or modules (Perkel and Bullock, 1968). Sensory systems are organized as anatomical hierarchies of many connected regions (Felleman and Van Essen, 1991; Harris et al., 2019). In this scheme, areas at different hierarchical levels are thought to correspond to distinct processing modules linked by feedforward and feedback projections. Whereas stimulus representation (such as single neuron tuning preference and population coding) has been extensively studied (Averbeck et al., 2006; Cunningham and Yu, 2014; Hubel and Wiesel, 1962; Mountcastle et al., 1963; Orban, 2008; Parker and Newsome, 1998), the principles by which spiking signals are transmitted between neuronal populations across areas of a processing hierarchy are much less understood (Kohn et al., 2020; Kumar et al., 2010; Zavitz and Price, 2019).
Tracking signal propagation in distributed networks requires simultaneous measurement of large numbers of interacting neurons both within and between cortical areas in awake, behaving animals (Buzsáki, 2004; Kohn et al., 2020; Zavitz and Price, 2019). However, multi-area cellular recordings are technically challenging and have only recently become more common. Previous work examining inter-area communication between neurons on millisecond timescales has typically involved only two areas (Chen et al., 2017; Goldey et al., 2014; Jia et al., 2013; Reid and Alonso, 1995; Semedo et al., 2019; Zandvakili and Kohn, 2015). These studies have shown the stimulus dependence and functional specificity of pair-wise interactions. However, the larger view of network communication spanning multiple hierarchical levels of sensory processing is missing because of the limited number of simultaneously recorded regions. Other studies of inter-area communication have used local field potentials (LFPs) to measure aggregate population activity across many areas, associating feedforward versus feedback signaling with gamma and beta frequency rhythms (Bastos et al., 2015, 2018; van Kerkoerle et al., 2014; Wong et al., 2016). However, LFP measurements do not resolve single neurons, making cellular-scale signal flow difficult to assess. Widefield imaging techniques provide simultaneous access to many cortical regions (Musall et al., 2019; Salkoff et al., 2019) and can achieve cellular resolution. However, these signals have much slower dynamics and fail to capture signal transmission through spiking at millisecond timescales. In general, the field lacks datasets that investigate signal propagation in multi-area sensory networks with single-neuron resolution and high temporal fidelity.
In our previous work, we built a recording platform using multiple Neuropixels probes to measure populations of spiking neurons from six levels of the mouse visual cortical hierarchy (Siegle et al., 2021) (Fig. 1A). By analyzing correlations between spiking units we found that the average direction of the signal flow during bottom-up sensory drive follows the anatomically defined mouse visual hierarchy (Harris et al., 2019; Siegle et al., 2021). Consistent with this, we observed longer sensory response latencies at higher hierarchy levels. However, many fundamental questions remain. Ample anatomical evidence demonstrates the existence of parallel pathways and recurrent connections between hierarchical levels of the cortex (Felleman and Van Essen, 1991; Gămănuţ et al., 2018; Harris et al., 2019; Markov et al., 2014a). In the mouse, primary visual cortex (V1) makes direct projections to all higher visual areas (Harris et al., 2019; Wang and Burkhalter, 2007), and even single cells can project in parallel to multiple areas via branching axons (Han et al., 2018). In addition, local subnetworks exist that show clustered connectivity that could support the coactivation of groups (or assemblies) of neurons (Perin et al., 2011; Song et al., 2005; Yoshimura et al., 2005).
Given the complexity of local and inter-area connections, it is unclear what the relevant signal-transmission modules in the cortex are. The basic view is that each cortical area corresponds to a processing stage that performs local computations and sends output signals sequentially from one area to the next. However, the notion that separate anatomical areas correspond to distinct processing stages is likely only a first-order characterization. Because of the dense lateral and recurrent connectivity in the cortex, it is possible that major communication hubs actually span multiple anatomical areas. Thus, a critical foundational step for understanding signal flow in the cortex is to subdivide the system into functionally relevant processing modules at the neuronal, yet multi-regional, network level.
Here, we take a novel approach to decompose the cortical network communication structure into functional modules. Using multi-area recordings from six hierarchical levels, we infer signal flow based on the statistics of leading-lagging spike timing of pairs of neurons. Treating each neuron as a node we create an adjacency matrix with directed weights for simultaneously recorded cortical units, and we use an unsupervised clustering algorithm to uncover multi-regional sets of neurons based on their shared functional connectivity patterns. This method revealed two major communication modules, both of which included neurons that spanned the cortical hierarchy.
Results
In each experimental session we used multiple Neuropixels probes to record populations of neurons from six areas of the mouse visual hierarchy. Each cortical area has its own map of visual space (Garrett et al., 2014), and resides at a different hierarchical level as determined both anatomically (D’Souza et al., 2020; Harris et al., 2019) and physiologically (Siegle et al., 2021). Area V1 is at the bottom of the hierarchy, followed by areas RL/LM, AL, PM, and AM (Fig. 1a). Overall, recording sessions in this study yielded 632 ± 18 simultaneously recorded neurons (a.k.a. sorted units (Siegle et al., 2021)) distributed across cortical layers and areas (n = 19 mice, mean ± SEM) (Fig. 1b). We used full-field drifting grating stimuli to provide strong bottom-up sensory drive, evoking a large number of spikes per unit time for our functional connectivity estimation. Consistent with the known visual hierarchy in the mouse (Harris et al., 2019; Siegle et al., 2021), the mean response latency in each area followed a sequential progression (Fig. 1c). However, all areas were co-active for substantial portions of the sensory response, thereby providing opportunities for recurrent interactions. To facilitate functional connectivity analysis, neurons in our dataset were filtered by firing rate and receptive field location (n = 3487; 29% of total units recorded across all mice; see Methods and Fig. S1).
Identification of multi-regional functional modules
To characterize fast timescale functional interactions relevant to signal transmission, in each mouse we quantified spiking correlations between all pairs of neurons using jitter-corrected cross-correlogram (CCG) analysis; this captures relative spike timing between two neurons within the jitter window (25 ms) but removes stimulus-locked signals and correlations longer than the jitter window (Jia et al., 2013; Smith and Kohn, 2008a). For each neuronal pair, we determined the connection weight by computing the difference of the CCG in a 13 ms window (half of the jitter window) before and after zero time lag (Fig. 1d). The sign of this weight describes the signal flow direction (temporally leading or following) between the pair of neurons. Computing this for all pairs produced a connectivity matrix describing the directed functional interactions of all simultaneously recorded neurons in each mouse (Fig. 1e). These functional connectivity matrices displayed non-random structure (Fig. S2). Inspection suggested separable groups of neurons with similar patterns of functional connectivity. We next sought to uncover these groups algorithmically by clustering their functional connection profiles.
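The weight computation described above can be sketched in a few lines (a minimal illustration only; the function name, bin sizes, and toy correlogram are ours, and the actual pipeline includes jitter correction of the raw CCG as described in the Methods):

```python
import numpy as np

def directed_weight(ccg, bin_ms=1, win_ms=13):
    """Directed connection weight from a jitter-corrected CCG.

    ccg: 1-D array of correlogram values centered on zero lag
    (odd length); positive lags mean the source neuron's spikes
    precede the target's. Returns the summed CCG over (0, +win]
    minus the sum over [-win, 0), so a positive weight indicates
    the source neuron tends to lead the target.
    """
    center = len(ccg) // 2                      # index of zero lag
    nbins = win_ms // bin_ms
    lead = ccg[center + 1 : center + 1 + nbins].sum()
    lag = ccg[center - nbins : center].sum()
    return lead - lag

# toy CCG with excess correlation just after zero lag
ccg = np.zeros(51)                              # lags -25..+25 ms, 1 ms bins
ccg[26:30] = 0.01
w = directed_weight(ccg)                        # positive: source leads
```

Computed for every ordered pair of simultaneously recorded neurons, such weights fill one row of the adjacency matrix per source neuron.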
To systematically identify sets of source neurons with shared functional connectivity patterns, we clustered the connectivity matrix by treating connections from each source neuron to all target neurons as features (Fig. 1e,f; Fig. S3; see Methods for details). This procedure yielded three robust clusters of source neurons (Fig. S4): cluster 1 had mostly weak connections; neurons in cluster 2 were dominated by strong positive connection weights, indicating they tended to lead (or drive) network activity; and neurons in cluster 3 were dominated by strong negative connection weights, indicating they are mainly driven by, and follow, activity in the network (Fig. 1g,h). Given the bias in positive versus negative connection weights from each cluster (and its functional implication of directionality), we refer to the set of cluster 2 source neurons as the ‘driver’ module and the set of cluster 3 source neurons as the ‘driven’ module (Fig. 1h). Supporting the robustness of these clusters, we observed similar network modules using spectral clustering and bi-clustering algorithms (Pedregosa et al., 2011) (Fig. S4). Moreover, these three clusters were identified in each mouse we examined, suggesting they are a core organizational feature in the mouse visual cortical network (Fig. S5).
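As a simplified stand-in for this clustering step (the paper's own algorithm is described in the Methods; here we use a generic k-means from scikit-learn on a synthetic weight matrix with three planted groups, so all numbers below are hypothetical):

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# synthetic directed weight matrix: rows = source neurons,
# columns = target neurons; three planted groups of 30 neurons
n = 90
W = 0.001 * rng.standard_normal((n, n))
W[:30] += 0.01          # 'driver'-like rows: positive weights (lead)
W[30:60] -= 0.01        # 'driven'-like rows: negative weights (follow)
np.fill_diagonal(W, 0.0)

# cluster source neurons by their full outgoing-connection profiles
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(W)

# characterize each cluster by its mean connection weight
means = sorted(W[labels == k].mean() for k in range(3))
```

The key design choice mirrored here is that each source neuron's entire row of directed weights is the feature vector, so neurons are grouped by whom they lead and follow, not by where they sit anatomically.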
In which cortical areas and layers do neurons in these separate modules reside? Interestingly, neurons in both modules were spatially distributed across all levels of the cortical hierarchy rather than being localized to specific regions (Fig. 2a). Nonetheless, the proportion of neurons in the two modules showed area biases. Overall, the proportion of driver neurons gradually decreased along the hierarchy (Fig. 2a; Spearman’s correlation with each area’s hierarchical position (Harris et al., 2019): r = -0.89, p = 0.019), whereas the proportion of driven neurons increased (Spearman’s correlation = 0.89, p = 0.019 for driven module). Both driver and driven neurons were present in all cortical layers but showed laminar biases. Driver neurons were enriched in the middle and superficial layers, whereas driven neurons were more common in deeper cortical layers (Fig. 2b; layers were defined by current source density analysis, see Fig. S6). Anatomical tracing studies have shown that neurons mediating feedforward projections tend to originate in superficial layers (Felleman and Van Essen, 1991; Markov et al., 2014a), and the fraction of feedforward projecting neurons in superficial layers decreases along the hierarchy (Barone et al., 2000; Markov et al., 2014b). Neurons in the driver module followed this same pattern, suggesting this module could be involved in feedforward processing, while the driven module might be more involved in recurrent processing.
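The area-wise trend can be tested with a rank correlation against hierarchy position; a sketch with illustrative (not actual) per-area module fractions:

```python
from scipy.stats import spearmanr

# ordinal hierarchy positions for the six areas (V1 lowest, AM highest)
hierarchy = [0, 1, 2, 3, 4, 5]
# hypothetical driver-module fractions per area, for illustration only
driver_frac = [0.45, 0.40, 0.38, 0.30, 0.28, 0.25]

r, p = spearmanr(hierarchy, driver_frac)   # monotone decrease: r = -1
```

A Spearman correlation is appropriate here because the hierarchy scores are ordinal and only the monotonic trend, not its linear shape, is of interest.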
Signal integration and distribution by neurons in different modules
Node convergence degree (input to one neuron from others) and divergence degree (output from one neuron to others) are two network properties that differentially support the integration versus distribution of signals (Tononi et al., 1998). The mean functional connection weight of driver (positive) and driven (negative) neurons suggests their convergence and divergence differ. To quantify this directly, we computed the divergence degree of each neuron as the number of significant positive connections (outward projections from the neuron) relative to the size of the network (see Methods). A significant connection is identified when the absolute weight > 10⁻⁶, a threshold value defined by half of the standard deviation of the weight distribution across all mice (see Methods). Likewise, the convergence degree of a neuron was computed as the fraction of significant negative connections (weight < −10⁻⁶; inward projections to the neuron) relative to the size of the network. Driver module neurons had a higher divergence degree than driven neurons (Fig. 2d; 2-way ANOVA across areas: between modules F = 1058.6, p = 3e-188; among areas F = 33.2, p = 1e-32; interaction F = 3.0, p = 0.013). In contrast, driven module neurons had a higher convergence degree (Fig. 2e; between modules F = 239.9, p = 5.6e-51; among areas F = 20.6, p = 3.9e-20; interaction F = 3.1, p = 0.007). Thus, from a network perspective, neurons in the driver module are better positioned to distribute information, whereas neurons in the driven module are better positioned for signal integration.
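Under these definitions, the degree computation reduces to thresholding and counting signed weights per source row (a sketch with our own variable names; here we normalize by the number of possible partners, one reading of "relative to the size of the network"):

```python
import numpy as np

def degree_metrics(W, thresh=1e-6):
    """Divergence and convergence degree for each neuron.

    W: directed weight matrix, W[i, j] = weight of the functional
    connection from source i toward target j (diagonal ignored).
    Divergence = fraction of significant positive (outward) weights;
    convergence = fraction of significant negative (inward) weights.
    """
    n = W.shape[0]
    mask = ~np.eye(n, dtype=bool)            # exclude self-connections
    pos = (W > thresh) & mask                # source leads: outputs
    neg = (W < -thresh) & mask               # source follows: inputs
    divergence = pos.sum(axis=1) / (n - 1)
    convergence = neg.sum(axis=1) / (n - 1)
    return divergence, convergence

# toy 3-neuron network: neuron 0 only leads, neuron 2 only follows
W = np.array([[0.0,  1e-3, 1e-3],
              [-1e-3, 0.0, 1e-3],
              [-1e-3, -1e-3, 0.0]])
div, conv = degree_metrics(W)
```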
The greater convergence onto driven neurons indicates they combine signals during visual processing. In the visual system, the convergence of multiple simple-like neurons, whose responses are modulated by the phase of a drifting grating, can give rise to complex-like neurons that are tolerant to grating phase. Simple- and complex-like responses can be quantified using a modulation index (MI) that describes the degree of modulation at the preferred temporal frequency of a drifting grating stimulus (Matteucci et al., 2019). To test whether the driven module contains more complex-like neurons, consistent with their dominant converging inputs relative to the driver module, we computed the MI for individual neurons and compared the two modules (Fig. 3a,b). Overall, the driven population was more complex than the driver population (p = 3.3e-24; Mann-Whitney U test). This result suggests that even though the driver and driven units are classified based on their functional network connectivity, these neurons could be fundamentally different, with distinct receptive field properties (simple- vs. complex-like).
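The paper uses the MI of Matteucci et al. (2019); as a rough, hypothetical stand-in, the classic F1/F0 ratio captures the same simple-versus-complex distinction from a trial-averaged PSTH:

```python
import numpy as np

def f1f0_ratio(psth, dt, tf):
    """F1/F0-style modulation measure from a trial-averaged PSTH.

    psth: firing rate over time (spikes/s); dt: bin width (s);
    tf: temporal frequency of the drifting grating (Hz).
    F0 is the mean rate; F1 is the response amplitude at tf.
    Larger values indicate simple-like (phase-modulated) responses.
    """
    f0 = psth.mean()
    t = np.arange(len(psth)) * dt
    # amplitude of the Fourier component at the stimulus frequency
    f1 = 2 * np.abs(np.mean(psth * np.exp(-2j * np.pi * tf * t)))
    return f1 / f0 if f0 > 0 else 0.0

# toy PSTHs over 1 s for a 2 Hz grating
dt, tf = 0.001, 2.0
t = np.arange(1000) * dt
simple_like = 10 + 5 * np.cos(2 * np.pi * tf * t)  # phase-modulated
complex_like = np.full(1000, 10.0)                  # unmodulated elevation
r_simple = f1f0_ratio(simple_like, dt, tf)
r_complex = f1f0_ratio(complex_like, dt, tf)
```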
Signal transmission between and within modules
Having now defined several robust network modules, each spanning the visual hierarchy, we next sought to evaluate the direction of signal flow within and between these separable network clusters. To do this we took advantage of the weight (adjacency) matrix that we defined previously (Fig. 1), which describes interactions and the direction of signal flow between all recorded neurons. We first focused on signal transmission from one module to the other by examining the subnetworks that are defined by the connections from driver to driven neurons (Fig. 4a top) and from driven to driver neurons (Fig. 4a bottom). Connections between these modules were largely unidirectional; that is, driver neurons from each source area (including top areas of the visual hierarchy, e.g. PM and AM) made output connections to driven neurons, and driven neurons in each source area received input connections from driver neurons (Fig. 4b). The asymmetry of these connections indicates a largely unidirectional signal flow from the driver to driven module.
To quantify this signal flow from the perspective of individual cortical areas, we define a metric called the ‘area in-out index’, which describes the relative fraction of input versus output connections from a source area for a given subnetwork. An in-out index of 1 indicates all connections are inputs to the source area, and an in-out index of -1 indicates all connections are outputs (see Methods). For all areas of the driver-to-driven subnetwork, the in-out index was close to -1 (Fig. 4c, top), indicating virtually all connections are outputs. In contrast, the in-out index was close to 1 in each area for the driven-to-driver subnetwork across mice (Fig. 4c, bottom), supporting the view of unidirectional signal flow from the driver to the driven module. Moreover, even though the in-out index is similar across areas for each subnetwork, the absolute number of outward connections in the driver-to-driven subnetwork gradually decreased along the visual hierarchy, while the number of inward connections in the driven-to-driver subnetwork gradually increased. Overlaying these subnetworks for each source area showed a clear separation of inward and outward projections across the cortical depth (Fig. 4d), consistent with the laminar dependency of the two modules in each area (Fig. 2b).
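A minimal sketch of the in-out index for one source area (variable names are ours; following the sign convention above, positive weights in a source neuron's row are treated as outputs and negative weights as inputs):

```python
import numpy as np

def area_in_out_index(W, areas, area, thresh=1e-6):
    """In-out index for one source area within a subnetwork.

    W: directed weight matrix (rows = source neurons); areas: area
    label for each neuron. Returns (inputs - outputs) / (inputs +
    outputs): +1 means all of the area's connections are inputs,
    -1 means all are outputs.
    """
    rows = W[np.asarray(areas) == area]
    n_out = int((rows > thresh).sum())    # source neurons lead: outputs
    n_in = int((rows < -thresh).sum())    # source neurons follow: inputs
    total = n_in + n_out
    return (n_in - n_out) / total if total else 0.0

# toy 2-neuron subnetwork: the V1 unit leads the AM unit
W = np.array([[0.0, 0.01],
              [-0.01, 0.0]])
areas = ['V1', 'AM']
idx_v1 = area_in_out_index(W, areas, 'V1')   # all outputs
idx_am = area_in_out_index(W, areas, 'AM')   # all inputs
```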
The directional communication from the driver to the driven module suggests these two modules might be sequentially activated during visual stimulation. To test this, we quantified the stimulus response latency for each module and found that spiking activity in the driver module preceded the driven module (time to peak = 60.5 ± 0.3 ms versus 80.0 ± 0.6 ms; rank-sum test statistic = -29, p = 7.1e-186) (Fig. 4e). We performed simulations demonstrating that the existence of brief-timescale correlations between neurons in these modules does not necessarily entail a temporal offset between modules in the stimulus-triggered average response, and vice versa (Fig. S7). Thus, the directional communication implied by the subnetworks of the adjacency matrix is supported by the temporal progression of signals between the modules.
Whereas connectivity between modules was largely unidirectional, the connections within each module contained both inputs and outputs suggesting more possibilities for recurrent processing (Fig. 5a,b). Thus, to explore the within-module communication structure, we computed the in-out index for each source area of the within-module subnetworks (Fig. 5c). Interestingly, within the driver module the in-out index systematically increased across the hierarchy of areas: V1 had a negative in-out index and mostly made output connections to driver neurons in other areas; in contrast, driver neurons in AM received more inputs compared to outputs as indicated by a positive in-out index (Fig. 5c, top; Spearman’s correlation with hierarchy is -0.94, p = 0.0048). Connections within the driven module were more balanced: the in-out index was close to 0 for each area and did not significantly correlate with the hierarchy (Spearman’s correlation, p = 0.16; Fig. 5c bottom).
The within-module patterns of connectivity suggest that neurons in the driver module relay feedforward signals to other driver neurons across areas, whereas the connections within the driven module are positioned to mediate recurrent inter-areal interactions. Consistent with this, in the driver module, visually evoked response latencies systematically increased with level in the anatomical hierarchy (Fig. 5d; correlation with mouse anatomical hierarchy score from (Harris et al., 2019): Spearman’s r = 0.83, p = 0.04). In contrast, visual latencies in the driven module were delayed relative to the driver module and did not show an organized progression across the hierarchy (Spearman’s r = 0.14, p = 0.79). Together, these results suggest a working model in which these separate neuronal modules serve as separate stages of signal propagation during visual processing: one module transmits feedforward signals about external stimuli along the hierarchy; the other integrates and processes recurrent signals given inputs from the driver module (Fig. 5e).
Population response precision differs between modules
Modeling studies of feedforward networks have investigated how spiking signals are propagated across modules of sequential processing (Kumar et al., 2010; Reyes, 2003; Vogels and Abbott, 2005). Depending on network connections and synaptic weights, successive stages can propagate synchronous activity (e.g. a synfire chain) or asynchronous fluctuations in firing rate (rate-coded signals). In vitro experiments with cultured networks reported decreased within-module synchrony of spiking activity as signals were relayed across sequential processing modules (Barral et al., 2019). However, in vivo evidence from awake animals is rare (but see Zandvakili and Kohn, 2015). The distributed functional modules we describe here could represent sequential stages of signal processing that are interdigitated within anatomically defined areas. In this context, we compared the onset-latency synchrony of the first stimulus-evoked spike on a trial-by-trial basis for the driver and driven modules (Fig. 6a-c). Neurons in the driver module were more tightly synchronized than those in the driven module (Fig. 6c; driver 15.1 ± 4.2 ms versus driven 16.4 ± 4.0 ms, n = 8480 trials, Student’s t-test statistic T = -20.2, p = 2.6e-89). In addition, the spread of the first peak (pulse packet (Kumar et al., 2010)) of the population response within the driver module on each trial (Fig. 6e-f) was also more compact (Fig. 6e; Student’s t-test statistic T = -31.4, p = 3.3e-211, n = 8480 trials across 19 mice), with a significantly earlier peak response (Fig. 6f; T = -17.4, p = 1.4e-67). These results show that transmission between the driver and driven modules is associated with increased temporal spread of within-module spiking and decreased population onset synchronization. This is consistent with the concept of increasing recurrent interactions deeper into the processing chain (Goris et al., 2014; Lu et al., 2001).
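One simple way to quantify per-trial onset synchrony, under our assumption that it is measured as the across-neuron spread of first evoked spike times (the function name and toy data are hypothetical):

```python
import numpy as np

def onset_jitter(first_spike_ms):
    """Per-trial onset synchrony of a module: the standard deviation
    of first stimulus-evoked spike times across the module's neurons
    (smaller values = tighter synchronization).

    first_spike_ms: array of shape (n_trials, n_neurons), with NaN
    where a neuron fired no spike on that trial.
    """
    return np.nanstd(first_spike_ms, axis=1)

# toy example: one tightly synchronized vs. one dispersed module
tight = np.array([[50.0, 51.0, 52.0, 49.0]])   # ms, single trial
loose = np.array([[40.0, 70.0, 55.0, 90.0]])
tight_sd = onset_jitter(tight)[0]
loose_sd = onset_jitter(loose)[0]
```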
Discussion
Our results provide a multi-area perspective on signal flow in the mouse visual network. Contrary to a simple feedforward model in which each hierarchical level sequentially transmits signals up the chain of areas, we provide evidence for module-based signal propagation that involves neurons at multiple levels of the hierarchy simultaneously. The cluster of cells we have denoted the driver module are engaged earlier in the sensory processing stream, while driven module neurons are recruited later; this suggests these modules might represent distinct processing stages, which is supported by their functional interaction, their anatomical biases, their network convergence, and their within-module temporal coordination.
A method for identifying multi-area functional modules
Community detection methods have been used in human brain imaging studies to identify functional modules at the millimeter scale with distributed organization (Betzel et al., 2018; Power et al., 2011; Sporns and Betzel, 2016), but these techniques have rarely been applied to cellular-resolution networks of a sensory system. The clustering algorithm we developed in this study provides a new approach for identifying functional modules in spiking networks and could be helpful for dissecting network substructure in a variety of systems and contexts. Our method relied on first quantifying fast-timescale functional connectivity (putative synaptic interactions) in the recorded network. We define a directed, rather than undirected, matrix of interactions between all neurons. Our unsupervised clustering method then identifies distinct neuronal populations in the network based on their shared patterns of inferred input and output connectivity. By allowing the algorithm to cluster neurons from all recorded brain regions together, we uncovered modules that do not map directly onto single cortical areas or layers, although there are area and layer biases. Thus our approach has the potential to identify functionally relevant, interacting subpopulations of neurons that are not constrained simply by anatomical parcellation. Many behavioral and cognitive operations are likely mediated by such distributed neuronal ensembles.
Separate sets of neurons for distributing and integrating information
In each recorded mouse we consistently identified similar modules of neurons based on their shared functional connectivity during visual stimulation. One module is dominated by weak connections, which could be due to non-optimal activation or incompleteness of the recorded network. The other two modules had strong positive connections (driver module) or strong negative connections (driven module) and are the focus of this paper.
Neurons in the driver module lead activity in the network and show higher divergence implying they could be involved in distributing feedforward signals. The driver units also have more transient dynamics compared to the driven units, and their responses are more synchronized as a population. Anatomically, the driver units are distributed in the middle and superficial layers and their proportion decreased along the visual hierarchy. All of these features are consistent with these neurons being more involved in feedforward processing. The driven module, in contrast, is positioned to process recurrent signals. It has higher neuronal convergence and there are more bi-directional functional connections amongst neurons in this module.
Future work can seek to identify the anatomical substrate for these functional observations. In the mouse visual cortex, V1 projects to all higher order visual areas (Harris et al., 2019; Wang and Burkhalter, 2007), and single neurons can make branched projections to multiple areas (Han et al., 2018). If these projections were module-specific, this anatomical connectivity could underlie the functional network structure we observe. There is also evidence that feedforward and feedback circuits are mediated by separate sets of neurons (Berezovskii et al., 2011; Markov et al., 2014b); this compartmentalization could help explain the division between the driver and driven modules identified by functional connectivity patterns.
Beyond feedforward and feedback circuits, the idea of separate sets of neurons for distributing versus integrating signals has not been explored functionally in a sensory hierarchy. These findings could have interesting computational implications for the architectural design of models of cortical networks. Rather than building a hierarchical network with sequential stages representing areas, such models could include intercalated groups of neurons in each area that participate in these separate processing levels.
Future directions
Although our recordings spanned six levels of the visual hierarchy, this is still an incomplete sampling of the network. It is likely that the separation into two modules is only a coarse, first-order characterization. Greater sampling of the network (e.g. more areas, more neurons) and more complex stimuli could be used to drive neurons with different dynamics. Future work could use more naturalistic stimuli (with longer recordings) and behavioral tasks to drive the network into different states, and then apply the same method to test the dynamics of the network modules under those brain states. Finally, targeted perturbations should be used to relate functional module dynamics to specific behavioral and cognitive operations.
Author contributions
Conceptualization: X.J., J.H.S. and S.R.O.
Investigation, validation and methodology: J.H.S., X.J., S.D., G.H and T.R.
Formal analyses: X.J.
Visualization: X.J., S.R.O.
Original draft written by X.J. and S.R.O. with input and editing from J.H.S.
All co-authors reviewed the manuscript.
Competing interests
The authors declare no competing interests.
Materials and Methods
Mice
Mice were maintained in the Allen Institute animal facility and used in accordance with protocols approved by the Allen Institute’s Institutional Animal Care and Use Committee. Four mouse genotypes were used: wild-type C57BL/6J (Jackson Laboratories) (n = 11) or Pvalb-IRES-Cre (n = 1), Vip-IRES-Cre (n = 2), and Sst-IRES-Cre (n = 5) mice bred in-house and crossed with an Ai32 channelrhodopsin reporter line. Following surgery, all mice were single-housed and maintained on a reverse 12-hour light cycle. All experiments were performed during the dark cycle.
Data collection
Experimental data collection followed the procedures described in Siegle, Jia et al., 2019 (Siegle et al., 2019). A summary of these methods is provided below. Thirteen of the 19 datasets in this study were previously released on the Allen Institute website via the AllenSDK (https://github.com/AllenInstitute/AllenSDK).
Surgical methods
All surgical methods used here are the same as (Siegle et al., 2019). Briefly, to enable co-registration across the surgical, intrinsic signal imaging, and electrophysiology rigs, each animal was implanted with a titanium headframe that provides access to the brain via a cranial window and permits head fixation in a reproducible configuration. To implant the headframe, mice were initially anesthetized with 5% isoflurane (1-3 min) and placed in a stereotaxic frame (Model# 1900, Kopf). Isoflurane levels were maintained at 1.5-2.5% for surgery and body temperature was maintained at 37.5°C. Carprofen was administered for pain management (5-10 mg/kg, S.C.). Atropine was administered to suppress bronchial secretions and regulate heart rhythm (0.02-0.05 mg/kg, S.C.). The headframe was placed on the skull and fixed in place with White C&B Metabond (Parkell). Once the Metabond was dry, the mouse was placed in a custom clamp to position the skull at a rotated angle of 20°, to facilitate the creation of the craniotomy over visual cortex. A circular piece of skull 5 mm in diameter was removed, and a durotomy was performed. The brain was covered by a 5 mm diameter circular glass coverslip, with a 1 mm lip extending over the intact skull. The bottom of the coverslip was coated with a layer of silicone to reduce adhesion to the brain surface. At the end of the procedure, but prior to recovery from anesthesia, the mouse was transferred to a photo-documentation station to capture a spatially registered image of the cranial window.
On the day of recording (at least four weeks after the initial surgery), the cranial coverslip was removed and replaced with an insertion window containing holes aligned to six cortical visual areas. First, the mouse was anesthetized with isoflurane (3%–5% induction and 1.5% maintenance, 100% O2) and eyes were protected with ocular lubricant (I Drop, VetPLUS). Body temperature was maintained at 37.5°C (TC-1000 temperature controller, CWE, Incorporated). The cranial window was gently removed to expose the brain. An insertion window with holes for probe penetration based on each mouse’s individual visual area map was then placed in the headframe well and sealed with Metabond. An agarose mixture was injected underneath the window and allowed to solidify. The mixture consisted of 0.4 g high EEO Agarose (Sigma-Aldrich), 0.42 g Certified Low-Melt Agarose (Bio Rad), and 20.5 mL ACSF (135.0 mM NaCl, 5.4 mM KCl, mM MgCl2, 1.8 mM CaCl2, 5.0 mM HEPES). This mixture was optimized to be firm enough to stabilize the brain with minimal probe drift, but pliable enough to allow the probes to pass through without bending. A layer of silicone oil (30,000 cSt, Aldrich) was added over the holes in the insertion window to prevent the agarose from drying. A 3D-printed plastic cap was screwed into the headframe well to keep out cage debris. At the end of this procedure, mice were returned to their home cages for 1-2 hours prior to the Neuropixels recording session.
Intrinsic Signal Imaging
Intrinsic signal imaging was performed approximately 15 days after the initial surgery and 25 days before the experiment. Intrinsic signal imaging was used to obtain retinotopic maps representing the spatial relationship of the visual field (or, in this case, coordinate position on the stimulus monitor) to locations within each cortical area (Fig. 1a). The maps made it possible to delineate functionally defined visual area boundaries in order to target Neuropixels probes to retinotopically defined locations in primary and higher order visual cortical areas (Garrett et al., 2014).
Habituation
Mice underwent two weeks of habituation in sound-attenuated training boxes containing a headframe holder, running wheel, and stimulus monitor. Each mouse was trained by the same operator throughout the 2-week period. During the first week, the operator gently handled the mice, introduced them to the running wheel, and head-fixed them with progressively longer durations each day. During the second week, mice ran freely on the wheel and were exposed to visual stimuli for 10 to 50 min per day. The following week, mice underwent habituation sessions of 75 minutes and 100 minutes on the recording rig, in which they viewed a truncated version of the same stimulus shown during the experiment.
Electrophysiology Experiments
All neural recordings were carried out with Neuropixels probes (Jun et al., 2017). Each probe contains 960 recording sites, of which a subset (374 for “Neuropixels 3a” or 383 for “Neuropixels 1.0”) can be configured for recording at any given time. The electrodes closest to the tip were always used, providing a maximum of 3.84 mm of tissue coverage. The sites are arranged in a checkerboard pattern on a 70 μm wide x 10 mm long shank. The signals from each recording site are split in hardware into a spike band (30 kHz sampling rate, 500 Hz highpass filter) and an LFP band (2.5 kHz sampling rate, 1000 Hz lowpass filter).
The experimental rig was designed to allow six Neuropixels probes to penetrate the brain approximately perpendicular to the surface of visual cortex (Siegle et al., 2019). Each probe was mounted on a 3-axis micromanipulator (New Scale Technologies, Victor, NY); the manipulators were in turn mounted on a solid aluminum plate, known as the probe cartridge. The mouse was placed on the running wheel and fixed to the headframe clamp. The tip of each probe was aligned to target the desired retinotopic region in each area. Brightfield photo-documentation images were taken with the probes fully retracted, after the probes reached the brain surface, and again after the probes were fully inserted. An IR dichroic mirror was placed in front of the right eye to allow an eyetracking camera to operate without interference from the visual stimulus. A black curtain was then lowered over the front of the rig, placing the mice in complete darkness except for the visual stimulus monitor.
Neuropixels data was acquired at 30 kHz (spike band) and 2.5 kHz (LFP band) using the Open Ephys GUI (Siegle et al., 2017). Gain settings of 500x and 250x were used for the spike band and LFP band, respectively. Each probe was either connected to a dedicated FPGA streaming data over Ethernet (Neuropixels 3a) or a PXIe card inside a National Instruments chassis (Neuropixels 1.0). Raw neural data was streamed to a compressed format for archiving and extracted prior to analysis.
Cortical Area Targeting
To confirm the identity of the cortical visual areas, images of the probes taken during the experiment were compared to images of the brain surface vasculature taken during the ISI session (see above). Vasculature patterns were used to overlay the visual area map on an image of the brain surface with the probes inserted (Fig 1a). To maximize measurable functional connectivity across areas, we targeted the center of gaze in all areas (except for RL, for which we targeted the center of mass because of its geometry) with overlapping receptive fields (RF) guided by a retinotopic map. Targeting was validated by mapping receptive fields of all sorted units with small Gabor patches presented at different locations on the screen (see below). All analysis was restricted to neurons with well-defined receptive fields within the screen boundaries.
Visual Stimulus
Visual stimuli were generated using custom scripts based on PsychoPy (Peirce, 2007) and were displayed using an ASUS PA248Q LCD monitor, with 1920 x 1200 pixels (21.93 in wide, 60 Hz refresh rate). Stimuli were presented monocularly, and the monitor was positioned 15 cm from the mouse’s right eye and spanned 120° x 95° of visual space prior to stimulus warping. Each monitor was gamma corrected and had a mean luminance of 50 cd/m2. To account for the close viewing angle of the mouse, a spherical warping was applied to all stimuli to ensure that the apparent size, speed, and spatial frequency were constant across the monitor as seen from the mouse’s perspective.
Visual stimuli for receptive fields (RFs)
Receptive field location was mapped with small Gabor patches. The receptive field mapping stimulus consisted of 2 Hz, 0.04 cycles per degree drifting gratings (3 directions: 0°, 45°, 90°) with a 20° circular mask. These Gabor patches randomly appeared at one of 81 locations on the screen (9 x 9 grid with 10° spacing) for 250 ms at a time, with no blank interval.
Visual stimuli for current source density (CSD)
Current source density for layer estimation used full-field flash stimuli (a series of dark or light full-field images with luminance = 100 cd/m2), each lasting 250 ms and separated by a 1.75 second inter-trial interval.
Visual stimuli for functional connectivity
Functional connectivity during the stimulus-driven condition was measured using drifting grating stimuli, which were presented at 4 directions (0°, 45°, 90°, 135°), with a temporal frequency of 2 cycles/sec and a contrast of 0.8. In each trial, the grating was presented for 2 sec followed by 1 sec of gray screen. Each condition was presented for 75-100 trials.
Spike Sorting
Prior to spike sorting, the spike-band data passed through 4 steps: DC offset removal, median subtraction, filtering, and whitening. First, the median value of each channel was subtracted to center the signals around zero. Next, the median across channels was subtracted to remove common-mode noise. The median-subtracted data file is the input to the Kilosort2 Matlab package (https://github.com/mouseland/kilosort2), which applies a 150 Hz high-pass filter, followed by whitening in blocks of 32 channels. The filtered, whitened data is saved to a separate file for the spike sorting step.
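The two median-subtraction steps can be sketched in a few lines of NumPy (a simplified illustration, not the released pipeline code; `preprocess_spike_band` is a hypothetical name):

```python
import numpy as np

def preprocess_spike_band(data):
    """Apply the two median-subtraction steps described above to raw
    spike-band data of shape (n_samples, n_channels)."""
    # 1. DC offset removal: subtract each channel's median over time.
    centered = data - np.median(data, axis=0, keepdims=True)
    # 2. Common-mode noise removal: subtract the median across
    #    channels at every sample.
    return centered - np.median(centered, axis=1, keepdims=True)
```

The 150 Hz high-pass filtering and whitening are performed inside Kilosort2 and are therefore omitted here.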
Kilosort2 was used to identify spike times and assign spikes to individual units (Stringer et al., 2019). Kilosort2 attempts to model the complete dataset as a sum of spike “templates.” The shape and location of each template are iteratively refined until the data can be accurately reconstructed from a set of N templates at M spike times, with each individual template scaled by an amplitude, a. A critical feature of Kilosort2 is that it allows templates to change their shape over time, to account for the motion of neurons relative to the probe over the course of the experiment. Stabilizing the brain using an agarose-filled plastic window has virtually eliminated probe motion associated with animal running, but slow drift of the probe over ∼3-hour experiments is still observed. Kilosort2 is able to accurately track units as they move along the probe axis, eliminating the need for the manual merging step that was required with the original version of Kilosort (Pachitariu et al., 2016). The spike-sorting step runs in approximately real time (∼3 hours per session) using a dual-processor Intel 4-core, 2.6 GHz workstation with an NVIDIA GTX 1070 GPU. We used the default parameters in Kilosort2, with an initial threshold of 12, and a final-pass threshold of 8.
The Kilosort2 algorithm will occasionally fit a template to the residual left behind after another template has been subtracted from the original data, resulting in double-counted spikes. This can create the appearance of an artificially high number of ISI violations for one unit or artificially high zero-time-lag synchrony between nearby units. To eliminate the possibility that this artificial synchrony will contaminate data analysis, the outputs of Kilosort2 are post-processed to remove spikes with peak times within 5 samples (0.16 ms) and peak waveforms within 5 channels (∼50 microns).
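The duplicate-removal rule can be illustrated with a small helper (a sketch using the stated 5-sample/5-channel thresholds; the function name and the greedy keep-the-earlier-spike policy are our assumptions):

```python
import numpy as np

def remove_double_counted(spike_samples, peak_channels,
                          max_dt=5, max_dch=5):
    """Return a boolean mask keeping spikes that are NOT duplicates.

    A spike is flagged as double-counted when its peak time is within
    `max_dt` samples AND its peak channel within `max_dch` channels of
    an earlier, kept spike. Inputs are assumed sorted by spike time.
    """
    keep = np.ones(len(spike_samples), dtype=bool)
    for i in range(1, len(spike_samples)):
        j = i - 1
        # Scan backwards over spikes close enough in time.
        while j >= 0 and spike_samples[i] - spike_samples[j] <= max_dt:
            if keep[j] and abs(peak_channels[i] - peak_channels[j]) <= max_dch:
                keep[i] = False
                break
            j -= 1
    return keep
```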
Kilosort2 generates templates of a fixed length (2 ms) that matches the time course of an extracellularly detected spike waveform. However, there are no constraints on template shape, which means that the algorithm often fits templates to voltage fluctuations with characteristics that could not physically result from the current flow associated with an action potential. The units associated with these templates are considered “noise,” and are automatically filtered out based on 3 criteria: spread (single channel, or >25 channels), shape (no peak and trough, based on wavelet decomposition), or multiple spatial peaks (waveforms are non-localized along the probe axis).
Following the spike sorting step, data for each session was uploaded to the Allen Institute Laboratory Information Management System (LIMS). Each dataset was run through the same series of processing steps using a set of project-specific workflows (AllenSDK v1.0.2) in order to generate NeurodataWithoutBorders (NWB) files used for further analysis.
Analysis Methods
Dataset
In total, units from 19 mice were included in our functional connectivity analysis. Spike sorting, quality control, and preprocessing steps followed the same procedures as (Siegle et al., 2019). 13 out of 19 of these datasets were previously released on the Allen Institute website via the AllenSDK (https://github.com/AllenInstitute/AllenSDK). On average, 632 ±18 sorted cortical units were simultaneously recorded in each mouse. We set a firing rate threshold to select units for functional connectivity analysis. Firing rate (FR) was defined as the average number of spikes in a window from 50 ms to 500 ms after the onset of the drifting gratings stimulus (Fig. S1). Only units with mean FR > 2 spikes/second were used for pairwise cross-correlogram (CCG) calculation, which resulted in an average of 356 ±7 units in each mouse (n = 6773 units in total). Because functional connectivity varies with receptive field position (Jia et al., 2013), we further constrained the dataset to include units with receptive field centers at least 10 degrees away from the edge of the monitor (see Visual receptive fields section below; Fig. S1). After filtering by receptive field location, we ended up with 184 ±8 units per mouse used for the final clustering procedure (n = 3487 units in total). After applying clustering on the functional connectivity matrix constrained by both FR and RF location in each mouse, the total numbers of units belonging to each cluster were: n_cluster1 = 1386, n_cluster2 = 1131, n_cluster3 = 970.
Quantification and statistical analysis
All analyses were performed in Python. The main analysis packages used in this paper are Scipy (Virtanen et al., 2020), scikit-learn (Pedregosa et al., 2011), statsmodels (Seabold and Perktold, 2010), and NetworkX (Hagberg et al., 2008). Error bars, unless otherwise specified, were computed as standard error of the mean. When comparing the difference between two independent variables, if their distribution is Gaussian-like (normality test), we used Student’s t-test; if their distribution is non-Gaussian, we used a rank sum test. When testing whether a distribution is significantly different from 0, we used a one-sample t-test. When comparing variables between modules across cortical areas, we used two-way analysis of variance (ANOVA) to assess both the main effect between modules and whether there is any interaction across areas. When comparing similarity to the previously established anatomical visual hierarchy in mouse (Harris et al., 2019), we calculated the correlation between our measured variable (e.g. first spike latency) and the previously calculated hierarchy score (V1: -0.50, RL: -0.12, LM: -0.13, AL: 0.00, PM: 0.11, AM: 0.29), using Spearman’s correlation to estimate the rank order significance. Statistical details and p-values can be found in the Results section or figure legends.
Visual receptive fields
Receptive fields were mapped with Gabor patches (20° each; 3 different orientations (0°, 45°, 90°), temporal frequency = 2 cyc/s, spatial frequency = 0.04 cyc/deg) shown randomly at 81 different locations (9 x 9 grid, 10° separation between pixel centers) on a gray background on a 120° x 95° monitor (1920 x 1200 pixels, 21.93 inches wide, 60 Hz refresh rate). The receptive field map (RF) for one unit is defined as the mean 2D histogram of spike counts at each of the 81 locations (Fig. S1a); each pixel covers a 10° x 10° square. The receptive field was then thresholded at 20% of the maximum response (Fig. S1b) to remove potentially noisy pixels. Then, a 2D Gaussian was fit to the thresholded visual receptive map to estimate the center of the receptive field location (Fig. S1c).
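The threshold-and-fit procedure might look like the following sketch (SciPy's `curve_fit` stands in for whatever optimizer was actually used; `fit_rf_center` is a hypothetical name, and coordinates here are in grid-pixel units):

```python
import numpy as np
from scipy.optimize import curve_fit

def gauss2d(xy, amp, x0, y0, sx, sy):
    """2D Gaussian evaluated on flattened coordinate arrays."""
    x, y = xy
    return amp * np.exp(-((x - x0)**2 / (2 * sx**2)
                          + (y - y0)**2 / (2 * sy**2)))

def fit_rf_center(rf_map):
    """Threshold an RF map at 20% of its maximum, then fit a 2D
    Gaussian to estimate the receptive-field center."""
    rf = rf_map.copy()
    rf[rf < 0.2 * rf.max()] = 0.0            # remove noisy pixels
    ny, nx = rf.shape
    x, y = np.meshgrid(np.arange(nx), np.arange(ny))
    p0 = (rf.max(), nx / 2, ny / 2, 1.0, 1.0)
    popt, _ = curve_fit(gauss2d, (x.ravel(), y.ravel()), rf.ravel(), p0=p0)
    return popt[1], popt[2]                  # (x0, y0) center estimate
```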
Peristimulus time histogram (PSTH)
To visualize the temporal dynamics of a neuronal population (Fig. 1, Fig. 3, and Fig. 4), the activity of each neuron was binned at 1 ms, averaged across trials (n = 75), smoothed with a Gaussian filter with a standard deviation of 3 ms, baseline subtracted (baseline period from 0 to 0.03 s relative to stimulus onset), and normalized by dividing by the maximum of the response between 0 and 1.5 s after stimulus onset. The normalized PSTHs of individual neurons were averaged within a neuronal population; the error bars indicate standard error of the mean across neurons.
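The normalization recipe above can be sketched as follows (a simplified illustration; argument names are ours):

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

def normalized_psth(spike_counts, bin_ms=1, smooth_sd_ms=3,
                    baseline=(0, 30), resp=(0, 1500)):
    """Trial-average, smooth, baseline-subtract, and peak-normalize a
    PSTH. `spike_counts` is a (n_trials, n_bins) array of 1 ms bins
    aligned to stimulus onset; windows are in bins of `bin_ms`."""
    psth = spike_counts.mean(axis=0)                      # trial average
    psth = gaussian_filter1d(psth, smooth_sd_ms / bin_ms) # 3 ms smoothing
    psth = psth - psth[baseline[0]:baseline[1]].mean()    # baseline subtract
    peak = psth[resp[0]:resp[1]].max()                    # 0-1.5 s window
    return psth / peak
```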
Functional connectivity
We analyzed functional interactions between pairs of simultaneously recorded neurons by calculating the spike train cross-correlogram (CCG) (Jia et al., 2013; Smith and Kohn, 2008b; Zandvakili and Kohn, 2015). For a pair of neurons with spike trains x1 and x2, the CCG is defined as:

$$\mathrm{CCG}(\tau) = \frac{\frac{1}{M}\sum_{i=1}^{M}\sum_{t=1}^{N} x_1^i(t)\, x_2^i(t+\tau)}{\theta(\tau)\sqrt{\lambda_1 \lambda_2}}$$

where M is the number of trials, N is the number of bins in the trial, $x_1^i$ and $x_2^i$ are the spike trains of the two units on trial i, τ is the time lag relative to reference spikes, and λ1 and λ2 are the mean firing rates of the two units. The CCG is essentially a sliding dot product between two spike trains. θ(τ) is the triangular function which corrects for the overlapping time bins caused by the sliding window. To correct for firing rate dependency, we normalized the CCG by the geometric mean spike rate. An individually normalized CCG is computed separately for each drifting grating orientation and averaged across orientations to obtain the CCG for each pair of units.
The jitter-corrected CCG was created by subtracting the expected value of CCGs produced from a resampled version of the original dataset with spike times randomly perturbed (jittered) within the jitter window (Harrison and Geman, 2009; Smith and Kohn, 2008b). The correction term (CCGjittered) is the true expected value, which reflects the average over all possible resamples of the original dataset. CCGjittered is normalized by the geometric mean rate before being subtracted from CCGoriginal. The analytical formula used to create a probability distribution of resampled spikes is provided in Harrison and Geman, 2009. This method disrupts the temporal correlation within the jitter window, while maintaining the number of spikes in each jitter window and the shape of the PSTH averaged across trials.
For our measurement, a 25 ms jitter window was chosen based on previous studies (Jia et al., 2013; Zandvakili and Kohn, 2015). This jitter-correction method removes both the stimulus-locked component of the response, as well as slow fluctuations larger than the jitter window. The remaining fast timescale correlation is more likely to be related to signal propagation between two neurons. Therefore, the jitter-corrected CCG reflects temporal correlations between a pair of neurons within the jitter-window (25ms).
We then calculated the directed connection weight by subtracting the sum of the jitter-corrected CCG over (−13 to 0) ms from its sum over (0 to 13) ms (Fig. 1d). The 13 ms window was defined as half of the 25 ms jitter window we used, and also because real functional delays between neurons in the mouse occur on the timescale of milliseconds to tens of milliseconds (Siegle et al., 2019). The magnitude of the resulting value indicates the strength, and its sign the direction, of the functional connection between a pair of neurons. Computing this for all pairs of neurons produced a directional, cellular-resolution connectivity matrix for each mouse (Fig. 1e).
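The full pipeline (rate-normalized CCG, jitter correction, directed weight) can be sketched as follows. This is a simplified illustration: the jitter correction here is a Monte Carlo resampling stand-in for the analytical expectation of Harrison and Geman (2009), and all function names are ours.

```python
import numpy as np

def normalized_ccg(x1, x2, max_lag):
    """Trial-averaged, triangle-corrected, rate-normalized CCG for two
    (n_trials, n_bins) spike-count arrays."""
    M, N = x1.shape
    lags = np.arange(-max_lag, max_lag + 1)
    theta = N - np.abs(lags)                 # triangle correction
    raw = np.zeros(len(lags))
    for i in range(M):
        full = np.correlate(x2[i], x1[i], mode='full')
        mid = N - 1                          # index of zero lag
        raw += full[mid - max_lag: mid + max_lag + 1]
    raw /= M
    lam1, lam2 = x1.mean(), x2.mean()        # mean rates per bin
    return raw / (theta * np.sqrt(lam1 * lam2))

def jitter_spikes(x, window, rng):
    """Permute spike counts within non-overlapping `window`-bin blocks,
    destroying fine timing but preserving counts per window."""
    out = x.copy()
    for start in range(0, x.shape[1], window):
        out[:, start:start + window] = rng.permuted(
            out[:, start:start + window], axis=1)
    return out

def jitter_corrected_ccg(x1, x2, max_lag, jitter=25, n_resample=20, rng=None):
    """Subtract the mean CCG of jittered resamples from the raw CCG."""
    rng = np.random.default_rng(rng)
    ccg = normalized_ccg(x1, x2, max_lag)
    acc = np.zeros_like(ccg)
    for _ in range(n_resample):
        acc += normalized_ccg(jitter_spikes(x1, jitter, rng),
                              jitter_spikes(x2, jitter, rng), max_lag)
    return ccg - acc / n_resample

def directed_weight(ccg, max_lag, half=13):
    """Sum over lags (0, 13] minus sum over lags [-13, 0)."""
    zero = max_lag
    return ccg[zero + 1: zero + half + 1].sum() - ccg[zero - half: zero].sum()
```

A positive weight indicates that the first neuron tends to lead the second at millisecond timescales.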
Clustering
Non-randomness
We first tested whether there is modular structure (non-randomness) in the measured connectivity matrix by computing the graph spectrum, based on spectral graph theory (Spielman, 2008). The eigenvalues of a graph are defined as the eigenvalues of its adjacency matrix (Fornito et al., 2016); the set of eigenvalues of a graph forms its graph spectrum. The randomness of the matrix was quantified by comparing the graph spectrum of the original connectivity matrix with that of a shuffled connectivity matrix, in which the x and y axes were shuffled independently, and with that of a randomly generated connectivity matrix of the same size. We found that the graph spectrum of the original matrix showed significantly higher explained variance by the top eigenvalues than the shuffled matrix and the random matrix, suggesting that the measured connectivity matrix has non-random structure (Fig. S2).
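The comparison can be sketched by measuring how much of the spectrum is captured by the leading eigenvalues (a simplified illustration; the exact variance-explained measure used in Fig. S2 may differ):

```python
import numpy as np

def top_eig_variance(W, k=3):
    """Fraction of total spectral magnitude captured by the k
    largest-magnitude eigenvalues of a (possibly asymmetric)
    adjacency matrix W."""
    ev = np.sort(np.abs(np.linalg.eigvals(W)))[::-1]
    return ev[:k].sum() / ev.sum()

def shuffle_matrix(W, rng):
    """Null model: permute rows and columns independently."""
    n = W.shape[0]
    return W[rng.permutation(n)][:, rng.permutation(n)]
```

A modular matrix concentrates spectral magnitude in a few eigenvalues, whereas shuffled or random matrices spread it across the spectrum.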
Defining the number of clusters
The number of clusters was determined using several complementary methods (Fig. S4a):
The Elbow method estimates the percentage of variance explained for a given number of clusters k. The number of clusters is estimated at the point where the curve turns into a plateau. The following measure represents the sum of within-cluster distances (pairwise distances) between all points in a given cluster C_k containing n_k points:

$$D_k = \sum_{x_i \in C_k} \sum_{x_j \in C_k} \lVert x_i - x_j \rVert^2$$

Adding the normalized within-cluster sums-of-squares gives a measure of the compactness of our clustering, or the pooled within-cluster sum of squares around the cluster means:

$$W_K = \sum_{k=1}^{K} \frac{1}{2 n_k} D_k$$

W_K decreases monotonically with the number of clusters K. The number of clusters is chosen at the point where the marginal gain drops (the point where the slope changes most dramatically), the ‘elbow’.
Gap statistics (Tibshirani et al., 2001) seeks to standardize the comparison of log W_k with a null reference distribution of the data, i.e. a distribution with no obvious clustering. The estimate for the optimal number of clusters K is the value for which log W_k falls the farthest below this reference curve. This information is contained in the following formula for the gap statistic:

$$\mathrm{Gap}_n(k) = E^*_n\{\log W_k\} - \log W_k$$

where $E^*_n$ denotes the expectation under a sample of size n from the reference distribution. The estimate $\hat{k}$ will be the value maximizing Gap_n(k) after we take the sampling distribution into account.
Clustering density estimates the data distribution density for a given k by calculating a density function f(k) (Pham et al., 2005). The value of f(k) is the ratio of the real distortion to the estimated distortion. When the data are uniformly distributed, the value of f(k) is 1. When there are areas of concentration in the data distribution, the value of f(k) decreases. Therefore, the number of clusters is determined by finding the minimum value of f(k).
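The first two criteria can be sketched with scikit-learn, noting that the KMeans inertia equals the pooled within-cluster sum of squares W_k (a simplified illustration; the reference distribution here is a uniform box over the data range, one of the options in Tibshirani et al., 2001):

```python
import numpy as np
from sklearn.cluster import KMeans

def log_wk(X, k, seed=0):
    """log of the pooled within-cluster sum of squares W_k
    (KMeans inertia = sum of squared distances to centroids)."""
    km = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(X)
    return np.log(km.inertia_)

def gap_statistic(X, k, n_ref=10, seed=0):
    """Gap_n(k): expected log W_k under a uniform reference
    distribution minus the observed log W_k."""
    rng = np.random.default_rng(seed)
    lo, hi = X.min(axis=0), X.max(axis=0)
    ref = [log_wk(rng.uniform(lo, hi, size=X.shape), k, seed)
           for _ in range(n_ref)]
    return np.mean(ref) - log_wk(X, k, seed)
```

For well-clustered data, the gap statistic should be larger at the true number of clusters than at k = 1.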
Combining the estimates from the above three methods, we determined the optimal number of clusters to be 3.
Method for clustering
In order to find neurons that have correlated connectivity patterns to the rest of the network, we clustered the directed connectivity matrix by treating the connectivity pattern from each source neuron to all target neurons as features (Fig. 1f and Fig. S3). To reduce noise, we projected the connectivity features into a lower dimensional space with principal component analysis (PCA), only keeping the top principal components that explained 80% of total variance. We then applied a consensus clustering method (Monti et al., 2003) with k-means to obtain robust clusters that are not biased by random initial conditions. First, we constructed a co-clustering association matrix by running k-means with different initial conditions 100 times (100 runs was sufficient for the co-clustering probabilities to stabilize). Each entry in the matrix represents the probability of two units belonging to the same cluster. Then, we clustered the association matrix with hierarchical clustering to determine the cluster labels. The number of clusters was determined using methods described in the previous section.
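A minimal sketch of this consensus procedure (the function name and some defaults are our assumptions, not the released code):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans
from scipy.cluster.hierarchy import linkage, fcluster

def consensus_cluster(W, n_clusters=3, n_runs=100, var_kept=0.8, seed=0):
    """PCA on each neuron's outgoing connectivity pattern, repeated
    k-means, a co-clustering association matrix, then hierarchical
    clustering of the associations."""
    # Keep the top PCs explaining `var_kept` of the variance.
    X = PCA(n_components=var_kept, random_state=seed).fit_transform(W)
    n = X.shape[0]
    assoc = np.zeros((n, n))
    for run in range(n_runs):
        labels = KMeans(n_clusters=n_clusters, n_init=1,
                        random_state=seed + run).fit_predict(X)
        assoc += labels[:, None] == labels[None, :]
    assoc /= n_runs                      # P(two units co-cluster)
    # Hierarchical clustering on co-clustering distances (condensed form).
    Z = linkage(1 - assoc[np.triu_indices(n, 1)], method='average')
    return fcluster(Z, n_clusters, criterion='maxclust')
```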
Comparing different clustering methods
Our consensus clustering was based on the k-means clustering method, which measures the compactness of points based on features in the reduced PCA space (see above). We compared this clustering method with two other methods for detecting modular structure in the adjacency matrix: spectral clustering (sklearn.cluster.SpectralClustering) and biclustering (sklearn.cluster.SpectralBiclustering).
Spectral clustering determines the clusters based on the connectivity of data points: points that are connected or immediately next to each other are placed in the same cluster. In spectral clustering, the data points are treated as nodes of a graph, and the clustering is treated as a graph partitioning problem. The nodes are then mapped to a low-dimensional space that can be easily segregated to form clusters. The spectral clustering is carried out in 3 steps: 1. Compute a similarity graph (k-nearest neighbors). 2. Project the data onto a low-dimensional space (compute the graph Laplacian L, and the eigenvalues and eigenvectors of L). 3. Create clusters (use the eigenvector corresponding to the 2nd eigenvalue to assign values to each node, then split the nodes with k-means for the given number of clusters).
Biclustering (or block clustering) is a method to simultaneously cluster the rows and columns of a matrix. For an m (sample) by n (feature) matrix, the algorithm generates biclusters: subsets of rows that exhibit similar connectivity patterns across a subset of columns.
The results of consensus clustering, spectral clustering, and biclustering of the functional connectivity matrix are shown in Fig. S4c. The three methods showed relatively consistent clustering results (Fig. S4d) in detecting units that belong to the three clusters dominated by different weight patterns. Therefore, our clustering findings are general and do not depend on the specific clustering method used.
Cluster quality
We used two methods, which were previously used to evaluate spike sorting cluster quality (Siegle et al., 2019), to quantify neuronal population cluster quality given different numbers of clusters (Fig. S4b). The d-prime (d’) was calculated using Fisher’s linear discriminant analysis to find the line of maximum separation in PC space (Hill et al., 2011). d′ indicates the unbiased separability of the cluster of interest from all other clusters. The higher the value, the more distinguishable are the clusters. Hit-rate was calculated with the nearest-neighbors method (n_neighbors = 3), which is a non-parametric estimate of exemplar contamination in each cluster. For each unit belonging to the cluster of interest, the three nearest units in principal-component space are identified. The “hit rate” is defined as the fraction of these units that belong to the cluster of interest. This metric is based on the “isolation” metric from (Chung et al., 2017). The higher the value, the less contamination in each cluster.
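Both metrics can be sketched with scikit-learn (hypothetical helper names; the original d′ computation of Hill et al. (2011) may differ in detail):

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.neighbors import NearestNeighbors

def d_prime(X, is_member):
    """Separability of one cluster from the rest along the LDA axis
    of maximum separation."""
    lda = LinearDiscriminantAnalysis(n_components=1).fit(X, is_member)
    proj = (X @ lda.coef_.T).ravel()
    a, b = proj[is_member], proj[~is_member]
    return abs(a.mean() - b.mean()) / np.sqrt(0.5 * (a.var() + b.var()))

def hit_rate(X, labels, cluster, n_neighbors=3):
    """Mean fraction of each member's nearest neighbors that share its
    cluster label (cf. the 'isolation' metric of Chung et al., 2017)."""
    nn = NearestNeighbors(n_neighbors=n_neighbors + 1).fit(X)
    # Drop column 0: each query point's nearest neighbor is itself.
    idx = nn.kneighbors(X[labels == cluster], return_distance=False)[:, 1:]
    return (labels[idx] == cluster).mean()
```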
Module distribution
The area distribution of neurons within each module was quantified by calculating the proportional number of units in one area relative to the total number of units in all areas for a given module. The proportion of one module across areas sums to 1. To minimize sampling bias across areas, we subsampled units in each area to match the number of units across areas. The final result was a bootstrapped mean (sampling with replacement to match the number of units in each area; n_boot = 100). Error bars represent the bootstrapped standard deviation across all units in all mice. Results are only shown for the ‘driver’ and ‘driven’ modules. No systematic area bias was observed for cluster 1 (the cluster with non-significant connections) units (result not shown).
The distribution of each neuronal module across layers was quantified by first dividing units into superficial, middle, and deep layers according to the location of layer 4 estimated from the CSD (Fig. S6). We then calculated the proportion of units across these three layers for a given neuronal module. To minimize sampling bias across layers, we subsampled units in each layer to match the number of units across layers. Means and error bars were calculated using the same bootstrapping method as for the area distributions.
Graph creation
To create graph visualizations (Fig. 3b,d), we first condensed our single-unit connectivity matrix to a single-recording-site connectivity matrix by combining units with peak channels on the same electrode. Then, we treated each site as a node in the graph. For an intuitive representation, nodes belonging to the same cortical area are placed close together and arranged clockwise from superficial to deep layers. The location of each area is determined by the top-down view of the physical locations of visual areas on the left hemisphere (Fig. 1a). The edges of the graph represent connections between sites, with red lines indicating projections from the source unit (positive weight) and blue lines indicating projections back to the source unit (negative weight). The threshold for significant connections is defined as an absolute weight larger than 10−6 coincidences/spike, which is half of the standard deviation of the weight distribution across all mice.
Divergence and convergence degree
Divergence degree is similar in concept to the outdegree of a graph. It is defined as the proportion of significant positive connections (weight > 10−6) from a source neuron to the rest of the network (N neurons):

$$\mathrm{Div}_i = \frac{C_{i,+}}{N}$$

where C_{i,+} represents the number of positive connections from neuron i to the network.
Convergence degree is similar in concept to the indegree of a graph. It is defined as the proportion of significant negative connections (weight < −10−6) to source neuron i from the rest of the network:

$$\mathrm{Conv}_i = \frac{C_{i,-}}{N}$$

where C_{i,−} represents the number of negative connections to neuron i from the network.
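Given the directed weight matrix W (W[i, j] is the weight from neuron i to neuron j, with negative entries indicating connections back to neuron i), both degrees can be computed as follows (a sketch; normalizing by N − 1 to exclude self-connections is our assumption):

```python
import numpy as np

def divergence_convergence(W, thresh=1e-6):
    """Per-neuron divergence (out-) and convergence (in-) degree from
    a directed weight matrix W."""
    n = W.shape[0]
    div = (W > thresh).sum(axis=1) / (n - 1)    # significant positive outputs
    conv = (W < -thresh).sum(axis=1) / (n - 1)  # significant inputs (negative)
    return div, conv
```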
Temporal dynamics analysis
Response latency
Two different measurements were used to estimate response latency. The peak response latency was defined as the time when a neuron’s response reached its first peak after stimulus onset. The time to first spike was estimated in each trial by searching for the first spike occurring at least 30 ms after stimulus onset. If no spike was detected within 250 ms after stimulus onset, that trial was excluded. The overall latency for each unit was defined as the mean time to first spike across trials.
Population onset response synchrony
We used the spread of time-to-first spike for all neurons within a module on a single trial as an indicator of population response onset synchronization. The spread was calculated by fitting a Gaussian to each trial’s time-to-first-spike distribution:

$$f(x) = A \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$

where x is the spike time relative to stimulus onset, µ is an estimate of the average time-to-first-spike, and σ is an estimate of the spread of the time-to-first-spike distribution for one trial.
Population response spread of the first peak
To quantify the spike response spread of a neuronal ensemble, we estimated the width of the population PSTH for each trial. The PSTH was calculated with 2 ms bins, and convolved with a Gaussian kernel of width 5 ms. The properties of the first peak were estimated using scipy.signal.find_peaks. The peak width represents the half-width at half maximum of the peak, while peak height represents the maximum of the peak. The spike spread of a neuronal population is an important parameter for quantifying how signals are transmitted through a feedforward network.
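A sketch of the peak measurement with `scipy.signal.find_peaks` (here `peak_widths` at half prominence stands in for the half-width-at-half-maximum measure described above; the helper name is ours):

```python
import numpy as np
from scipy.signal import find_peaks, peak_widths
from scipy.ndimage import gaussian_filter1d

def first_peak_props(psth_counts, bin_ms=2, kernel_ms=5):
    """Height and width (in ms) of the first peak of a single-trial
    population PSTH binned at `bin_ms`."""
    psth = gaussian_filter1d(np.asarray(psth_counts, float),
                             kernel_ms / bin_ms)
    peaks, props = find_peaks(psth, height=0)
    widths = peak_widths(psth, peaks, rel_height=0.5)[0]
    first = 0                        # first peak after stimulus onset
    return props['peak_heights'][first], widths[first] * bin_ms
```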
In-out index
To quantify the proportion of projections out from a source area relative to inputs back into the source area, we defined the in-out index as

$$\mathrm{IOI} = \frac{C_{in} - C_{out}}{C_{in} + C_{out}}$$

where C_out is the number of connections from the source area to other areas in the given network and C_in is the number of connections from other areas to the source area. This index reflects the asymmetry of the in- and out-degree of a source area. When the value is close to −1, the source area is dominated by outward projections (positive weights). When the value is close to 1, the source area is dominated by inward projections (negative weights). When the value is 0, the source area has balanced outward and inward connections.
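The index is a one-liner; the sign convention follows the description above:

```python
def in_out_index(c_in, c_out):
    """In-out index: (C_in - C_out) / (C_in + C_out).
    -1 = purely outward, +1 = purely inward, 0 = balanced."""
    return (c_in - c_out) / (c_in + c_out)
```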
Layer definition
We estimated the depth of the middle layer of cortex by first calculating the current source density (CSD) using simultaneously recorded local field potentials (LFP) (Fig. S6). The CSD was computed using the method in (Stoelzel et al., 2009), using the LFP within 250 ms after stimulus onset. First, we calculated the average evoked (stimulus-locked) local field potential at each recording site. Next, we duplicated the uppermost and lowermost field traces and smoothed these signals across sites:

$$\bar{\varphi}(r) = \frac{1}{4}\left[\varphi(r+h) + 2\varphi(r) + \varphi(r-h)\right]$$

where φ is the field potential, r is the coordinate perpendicular to the layers, and h is the spatial sampling interval (40 μm in our case). Then, we calculated the second spatial derivative:

$$\mathrm{CSD}(r) = \frac{1}{h^2}\left[\bar{\varphi}(r+h) - 2\bar{\varphi}(r) + \bar{\varphi}(r-h)\right]$$
In the resulting CSD map, current sinks are indicated by downward deflections and sources by upward deflections. To facilitate visualization, we smoothed the CSD with 2D Gaussian kernels (σx = 1; σy = 2). To find the middle layer, we defined the first sink within 100 ms after stimulus onset as the input layer (center channel) by searching for the local maximum on the CSD map (first sink), followed by a source.
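The CSD computation can be sketched as follows (a simplified illustration; the edge-duplication and three-point smoothing follow the description above, and the sign convention of the plotted map may differ):

```python
import numpy as np

def compute_csd(lfp, h=40e-6):
    """Second-spatial-derivative CSD estimate from an
    (n_channels, n_time) evoked LFP array; `h` is the inter-channel
    spacing in meters."""
    # Duplicate the uppermost and lowermost traces, then apply the
    # three-point spatial smoothing (phi(r+h) + 2*phi(r) + phi(r-h)) / 4.
    padded = np.vstack([lfp[:1], lfp, lfp[-1:]])
    smoothed = 0.25 * (padded[:-2] + 2 * padded[1:-1] + padded[2:])
    # Second spatial derivative of the smoothed traces.
    pad2 = np.vstack([smoothed[:1], smoothed, smoothed[-1:]])
    return (pad2[:-2] - 2 * pad2[1:-1] + pad2[2:]) / h**2
```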
We used the middle layer estimation for two metrics in our paper. For the calculation of layer distribution bias of ‘driver’ and ‘driven’ modules, we partitioned the cortical layers into three layers: middle layer (center channel ±8 channels, which is ±40μm), superficial layer (channels above middle layer), and deep layers (channels below middle layer and above white matter). For the layer dependence of PSTH response latency (Fig. S6), we set the middle layer to depth = 0 and defined depths around the middle layer with 8-channel spacing (40μm spacing). Units within a depth range were grouped together to calculate a PSTH and the latency for that depth was estimated based on this grouped PSTH.
Simulations to test mathematical relationship between PSTH shape and CCG sharp peak
Because we observed that the functional connectivity defined ‘driver’ module responded earlier than the ‘driven’ module (Fig. 3d), we wondered whether the brief timescale relationship of ‘driver’ leading ‘driven’ was a consequence of the general latency reflected in averaged PSTH. Even though our jitter-correction method should have removed stimulus-locked components and the observed directionality should only reflect brief-timescale signal transmission, we still wanted to rule out the possibility that the observed asymmetry in the CCG is merely a reflection of the trial-averaged PSTH latency.
We used a simple simulation to carry out positive and negative controls (Fig. S7). The negative control tested whether two neurons with correlated, but temporally offset, PSTH traces will necessarily show significant peaks in their jitter-corrected CCG (25 ms jitter window). The positive control tested whether two neurons with uncorrelated PSTH traces can produce a significant peak in their CCG if we artificially introduce millisecond-timescale correlations. The mathematical expression of the tests is formulated as follows:
Given two PSTH traces λ1(t) and λ2(t), we simulated Poisson spike trains:

$$x_j^i(t) \sim \mathrm{Poisson}\left(\lambda_j(t)\,\Delta t\right), \quad j = 1, 2$$

over time T for 100 repeats, where the PSTHs of the two simulated spike trains (X1 and X2) matched the shape of λ1(t) and λ2(t). Synchronized spikes were introduced to x1(t) and x2(t) only for the positive control. CCGs before and after jitter correction were calculated between X1 and X2. We found that brief-timescale correlations between two neurons (identified by significant peaks in the CCG) do not depend on the shape and relative timing of their PSTHs, but, as expected, reflect only their fine-timescale temporal relationship.
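The simulation can be sketched as follows (λ values are interpreted here as expected spike counts per bin; `n_sync` injects the synchronized spikes used only in the positive control, and the helper name is ours):

```python
import numpy as np

def simulate_poisson_trains(lam1, lam2, n_trials=100, n_sync=0, rng=None):
    """Simulate two inhomogeneous Poisson spike trains whose PSTHs
    follow lam1(t) and lam2(t) (expected counts per bin); optionally
    inject `n_sync` synchronous spikes per trial."""
    rng = np.random.default_rng(rng)
    x1 = rng.poisson(lam1, size=(n_trials, len(lam1)))
    x2 = rng.poisson(lam2, size=(n_trials, len(lam2)))
    if n_sync:
        for i in range(n_trials):
            t = rng.integers(0, len(lam1), size=n_sync)
            x1[i, t] += 1
            x2[i, t] += 1    # same bins -> millisecond-scale correlation
    return x1, x2
```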
Data and code availability
The majority of the data in this study (13 of 19 experiments) was publicly released as an open dataset on the Allen Institute website in October 2019, and is available via the AllenSDK (https://allensdk.readthedocs.io/en/latest/visual_coding_neuropixels.html). Additional data and software will be deposited to GitHub.
Acknowledgments
We thank the Allen Institute founder, Paul G. Allen, for his vision, encouragement and support. We thank the Transgenic Colony Management for mouse breeding and Laboratory Animal Services for mouse import and wellness care. We thank the Neurosurgery and Behavior Team for surgical procedures and habituation. We thank Shiella Caldejon for running intrinsic signal imaging experiments, and Rusty Nicovich and Kiet Ngo for collecting optical projection tomography data. We thank the following for helpful discussions: Yazan Billeh, Uygar Sumbul, and Daniel Denman. We thank the following for helpful feedback on the manuscript: Daniel Denman, Hannah Choi, Marina Garrett, Gabe Ocker, Adam Kohn, and Christof Koch.