## Abstract

Modern electrophysiological recordings and optical imaging techniques have revealed a diverse spectrum of spatiotemporal neural activities underlying fundamental cognitive processing. Oscillations, traveling waves and other complex population dynamical patterns are often concomitant with sensory processing, information transfer, decision making and memory consolidation. While neural population models such as neural mass, population density and kinetic theoretical models have been used to capture a wide range of the experimentally observed dynamics, a full account of how the multi-scale dynamics emerges from the detailed biophysical properties of individual neurons and the network architecture remains elusive. Here we apply a recently developed coarse-graining framework for reduced-dimensional descriptions of neuronal networks to model visual cortical dynamics. We show how, without introducing any new parameters, a sequence of models culminating in an augmented system of spatially-coupled ODEs can effectively model a wide range of the observed cortical dynamics, ranging from visual stimulus orientation dynamics to traveling waves induced by visual illusory stimuli. In addition to an efficient simulation method, this framework also offers an analytic approach to studying large-scale network dynamics. As such, the dimensional reduction naturally leads to mesoscopic variables that capture the interplay between neuronal population stochasticity and network architecture that we believe to underlie many emergent cortical phenomena.

## Introduction

Multi-channel recordings and optical imaging have revealed a vast repertoire of spatiotemporal activity patterns in the brain. This rich hierarchy ranges from localized activation to traveling waves, to dynamically switching cortical states [1–4]. The activities can be stimulus-driven or internally generated and are shown to affect information processing, sensory perception and cognitive tasks [5–11]. Mathematically, the emergence of the many spatial and temporal scales in cortical dynamics presents a tremendous challenge for modelers and theoreticians. The rapid development of computational power has allowed us to study very large networks, and a combination of large-scale network simulations [12–15], reduced dimensional models (e.g., neural field models, mean-field population models, and kinetic theories [3, 6, 9, 16–19]), and machine learning [20–22] has been used to successfully describe many experimental phenomena. The principal mechanism underlying the diverse spectrum of network activities is likely to be the strong competition between excitatory and inhibitory neuronal populations. However, a theoretical account of how the detailed biophysical properties of individual neurons, the local network properties and cortical architecture can lead to the observed emergent multi-scale dynamics is lacking. Here we make progress towards such a theoretical model by making use of a coarse-graining formalism that has been successful at capturing the rich repertoire of heterogeneous dynamics that exists even in a small homogeneous neuronal network. Recently, working with a small, idealized network of linear integrate-and-fire neurons and using a partitioned-ensemble average (PEA), we derived a sequence of population dynamics models, ranging from Master equations, to Fokker-Planck models, and finally to an augmented system of ODEs that explicitly accounts for the interaction between neuronal spiking activity and internal neuronal voltages [23].

In [23] (hereafter ZSRT19), from the Fokker-Planck description of neuronal population dynamics, we derived an augmented low-dimensional ODE system by introducing a hierarchy of neuronal voltage moments and a maximum entropy closure. We showed that, by carefully incorporating the PEA into our simulation algorithms, our dimensionally reduced population models could faithfully capture highly heterogeneous network dynamics, ranging from transient to sustained, from driven to self-organized, and from oscillatory to nearly synchronous network activity in the form of multiple-firing events. More importantly, the PEA formalism provided a conceptual framework for mathematically coarse-graining the emergent network dynamics from first principles. However, so far, the applications have been restricted to small networks with homogeneous connectivities, an idealization met only by the smallest of local cortical networks. Here we extend our methodology to networks with slowly varying spatial inhomogeneities and apply it to study the emergent dynamics of the primary visual cortex (V1).

The mammalian V1 is of particular interest to many neuroscientists owing to its fundamental role in visual processing and the common belief that understanding network function in V1 will advance our understanding of other areas of the mammalian brain. While individual V1 neurons show preference for the orientation of a visual stimulus through elevated firing rates, optical imaging experiments show that orientation preference is distributed in pinwheel-like hypercolumns that tile the cortical surface. Recent optical imaging techniques, particularly voltage-sensitive dye (VSD) imaging, can capture V1 network dynamics with high spatial and temporal resolution and have revealed the important interplay between visual stimulus, subthreshold population dynamics and large-scale, coherent activities [1, 2, 24–26].

The application of the ZSRT19 formalism to V1 naturally produces a spatially-coupled ODE system, consistent with locally organized visual feature maps. By examining local network patches smaller than orientation hypercolumns, we show that sub-hypercolumn temporal dynamics can be well captured by a low-dimensional set of voltage moments. By adding orientation-specific couplings between orientation hypercolumns, we were able to recapitulate the cortical wave generation and propagation induced by visual illusory stimuli. Finally, by modeling the temporal difference between the On- and Off-visual pathways as temporal differences of the respective inputs into V1, our reduced-dimensional model can account for the induction of propagating voltage waves from darks to brights.

## Materials and Methods

### A Large-scale V1 Model

#### An Integrate-and-Fire Neuronal Network

We model individual V1 neurons (excitatory and inhibitory) as current-based, linear, integrate-and-fire (I&F) point neurons, whose membrane potentials evolve by

*τ*_{V} d*v*_{j}^{Q}/d*t* = −(*v*_{j}^{Q} − *V*^{L}) + *I*_{j}^{Q,LGN}(*t*) + *I*_{j}^{QE}(*t*) + *I*_{j}^{QI}(*t*), (1)

where the superscript *Q* ∈ {*E*, *I*} represents the type (excitatory or inhibitory) of each neuron, the subscript *j* indexes the spatial location of the neuron within the V1 network, and the leakage timescale of the membrane potential is *τ*_{V} = 20 ms. We normalize the membrane potentials by setting the spiking voltage threshold *V*_{T} = 1 and the rest (leak) potential *V*^{L} = 0. In I&F dynamics, the voltage evolves continuously until it reaches the threshold *V*_{T}, after which it is immediately reset to rest for an absolute refractory period (*τ*_{ref} = 2–3 ms). Each individual neuron is driven by its synaptic currents, arising from feedforward input through the LGN and from the recurrent network activities of the excitatory and inhibitory populations. The I&F neuron has become a widely used model for the description of spiking neurons because of its relative ease of mathematical analysis, while its dynamics remains sufficiently rich to capture diverse neural processing. The I&F model describes the membrane potential of a neuron in terms of the synaptic current inputs it receives, either from cortico-cortical recurrent interactions or from external injections. The various synaptic inputs are as follows:
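The single-neuron dynamics described above can be sketched in a few lines. The following is a minimal illustration (not the authors' code) of current-based linear I&F integration with threshold, reset, and an absolute refractory period; the function and parameter names are hypothetical:

```python
import numpy as np

def simulate_lif(I, dt=0.1, tau_v=20.0, v_thresh=1.0, v_rest=0.0, t_ref=2.0):
    """Forward-Euler integration of a current-based leaky I&F neuron.

    I      : total synaptic current at each time step (normalized units)
    tau_v  : membrane leak time constant, ms (tau_V = 20 ms in the text)
    Returns the voltage trace and the list of spike times (ms)."""
    v, ref_left = v_rest, 0.0
    vs, spikes = [], []
    for n, i_n in enumerate(I):
        if ref_left > 0.0:                   # absolute refractory period: hold at rest
            ref_left -= dt
            v = v_rest
        else:
            v += (dt / tau_v) * (-(v - v_rest) + i_n)   # leak plus synaptic drive
            if v >= v_thresh:                # threshold crossing: record spike, reset
                spikes.append(n * dt)
                v = v_rest
                ref_left = t_ref
        vs.append(v)
    return np.array(vs), spikes
```

With a constant suprathreshold drive the model fires periodically; with zero drive the voltage stays at rest.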

#### Feedforward LGN Input

We model the feedforward LGN input as an independent Poisson process of spike times with a prescribed firing rate and synaptic strength *S*^{QY}. The temporal kernel *α*_{ext} models the time course [27, 28] of the synaptic current induced by each LGN spike. Our first coarse-graining approximation is to use continuous firing rates, rather than discrete Poisson spikes, to describe the LGN input [13, 29].
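This rate-based approximation can be checked numerically. The sketch below (illustrative rate and filter values, not the paper's parameters) compares a Poisson spike train filtered by a unit-area synaptic kernel against the smooth drive obtained by replacing the spikes with the underlying rate:

```python
import numpy as np

rng = np.random.default_rng(0)
dt, T = 0.1, 20000.0                  # ms; a long run so the average converges
t = np.arange(0.0, T, dt)
rate = 0.05                           # assumed LGN rate: 0.05 spikes/ms (50 Hz)
tau = 5.0                             # assumed synaptic filter time constant (ms)
kernel = np.exp(-np.arange(0.0, 50.0, dt) / tau) / tau * dt   # unit-area filter

# Discrete Poisson spike train filtered by the synaptic time course ...
spikes = (rng.random(t.size) < rate * dt).astype(float) / dt
g_poisson = np.convolve(spikes, kernel, mode="full")[: t.size]

# ... versus the continuous, rate-based drive used in the coarse-graining
g_rate = np.full_like(t, rate)
```

On average the filtered spike train fluctuates around the rate-based drive; the coarse-grained description keeps only the smooth component.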

For the rotating drifting grating stimuli used in the first experiment, the screen displays an intensity pattern *I*(**X**, *t*), which is given by

*I*(**X**, *t*) = *I*_{0}[1 + *ε* sin(**k** · **X** − *ωt* + *φ*)],

where *φ* describes the stimulus spatial phase and **k** = |**k**|(cos *θ*_{t}, sin *θ*_{t}) reflects the spatial frequency and instantaneous orientation. The response of each LGN cell can be modeled as a rectified, linear spatiotemporal convolution of the visual stimulus, with space and time kernels constrained by experiments [30, 31]. The LGN cells come in two polarities, “On-” and “Off-” cells, and each V1 cell receives synaptic inputs from a collection of LGN cells that are arranged spatially in a Gabor-like pattern [13, 27, 30]. Here, following Shelley & McLaughlin [28], we model the summed LGN inputs into an individual V1 neuron accordingly; note that here the index *i* is not limited to labeling single neurons, but can also index individual cortical patches in our coarse-grained network model (see below).
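The rectified, linear spatiotemporal convolution described above can be sketched as follows. This is an illustrative implementation only; the kernel shapes and the `polarity` switch between On- and Off-cells are simplifying assumptions, not the experimentally constrained kernels of [30, 31]:

```python
import numpy as np

def lgn_response(stimulus, k_space, k_time, polarity=+1, background=0.0):
    """Rectified linear spatiotemporal response of a model LGN cell.

    stimulus : array (T, H, W) of screen intensities
    k_space  : array (H, W), spatial receptive field (e.g. center-surround)
    k_time   : array (Tk,), temporal kernel
    polarity : +1 for an On-cell, -1 for an Off-cell"""
    drive = np.tensordot(stimulus, k_space, axes=([1, 2], [0, 1]))  # spatial filter
    drive = np.convolve(drive, k_time, mode="full")[: drive.size]   # temporal filter
    return np.maximum(0.0, background + polarity * drive)           # rectification
```

An On-cell responds to a brightening of its receptive-field center, while the rectification silences the Off-cell for the same stimulus, and vice versa.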

In our second experiment, transient On- and Off-visual stimuli were used to probe the dynamics of the On- and Off-visual pathways. The On- and Off-visual pathways already exhibit differences at the LGN [32–35], such as spatially segregated On- and Off- afferent couplings from the RGCs, different response times of the On- and Off- visual pathways, and so on. Here we use two sets of parameters ({*τ*_{1},*τ*_{2}} in Eq (3)) to model the On- and Off-LGN feedforward time courses (Fig 1B).

#### Cortical Architecture

To study the dynamics of a patch of a layer of V1, we construct a 2-dimensional network with spatially structured synaptic connections, through which V1 excitatory and inhibitory neurons recurrently interact. These recurrent cortical connections are represented by the third and fourth terms in Eq (1).

In our first experiment, a single orientation hypercolumn with multiple orientation domains was modeled, and only local cortical interactions (< 500 *μm*) were included in the simulations. We therefore regard the network as an idealized two-dimensional neuronal network with all-to-all, isotropic cortical connections. The strengths of these synaptic couplings fall off with the spatial separation between two neurons [36–38].

In the more complicated, second experiment, we model a larger patch of V1, with a spatial range of about 2.5 × 1.5 *mm*^{2} and containing 5 × 3 orientation hypercolumns. On this scale, it is crucial to include long-range synaptic connections in the model. The strengths of long-range (> 1000 *μm*) horizontal synaptic connections specifically depend on orientation preferences of the pre- and post-synaptic neurons [39–41]:
where the excitatory and inhibitory synaptic currents have the form of Eq (6,7), with *Q* ∈ {*E*,*I*}. Here denotes the time of the *k*-th spike of the *j*′-th excitatory/inhibitory neuron. We include slow NMDA synaptic currents in addition to the fast excitatory synaptic currents mediated by AMPA and the fast inhibitory synaptic currents mediated by GABA [1, 40–43]. The normalized spatial profiles of the cortical coupling strengths (*K*^{QQ′}(*d*)), i.e., of both short-range local connections and long-range horizontal connections, are modeled as normalized 2D Gaussian functions (Eq (8)) of the cortical distance *d* = |*c*_{j} − *c*_{j′}| between two neurons (or two coarse-grained populations); denotes the spatial length-scale of the corresponding type of connection. The spatial kernels are normalized by , as described in previous studies [36, 44].

The time courses of the cortical recurrent input due to individual spikes at times {*T*_{j′,k}} can be expressed in the form of alpha functions:
where *α*_{P}(*t*), *P* ∈ {*AMPA*, *NMDA*, *GABA*}, with *τ*_{1} > *τ*_{2}, and *B* is a normalization factor that sets the peak value to *A*_{max}; the rise- and decay-time constants are *τ*_{r} = *τ*_{1}*τ*_{2}/(*τ*_{1} − *τ*_{2}) and *τ*_{d} = *τ*_{1}, respectively (Eq (9,10)). In our theoretical model below, we model the synaptic time course of the fast excitatory (AMPA) and inhibitory (GABA) synapses with an instantaneous rise-time and an infinitely fast decay-time. Therefore, once a cortical neuron fires, the fast cortical synaptic currents create an instantaneous jump in the membrane potential of the post-synaptic neuron. We model the effects of the slow NMDA-type current with a nearly instantaneous rise-time and a decay-time . Thus, the recurrent, cortical synaptic inputs are given by
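The difference-of-exponentials time course just described can be sketched directly. In the check below, the NMDA-like values *τ*₁ = 80 ms, *τ*₂ = 2 ms are illustrative, not the paper's calibrated constants:

```python
import numpy as np

def alpha_kernel(t, tau1, tau2, a_max=1.0):
    """Difference-of-exponentials synaptic time course (the form of Eq (9,10)).

    Requires tau1 > tau2; the decay time is tau_d = tau1 and the effective
    rise time is tau_r = tau1*tau2/(tau1 - tau2).  The factor b plays the
    role of B, normalizing the peak of the kernel to a_max."""
    t = np.asarray(t, dtype=float)
    g = np.exp(-t / tau1) - np.exp(-t / tau2)
    t_peak = tau1 * tau2 / (tau1 - tau2) * np.log(tau1 / tau2)   # argmax of g
    b = a_max / (np.exp(-t_peak / tau1) - np.exp(-t_peak / tau2))
    return np.where(t >= 0.0, b * g, 0.0)
```

The kernel vanishes for *t* < 0 (causality) and at *t* = 0, rises on the scale *τ*_{r}, and decays on the scale *τ*_{d}.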

The fast and slow excitatory recurrent synaptic strengths associated with excitatory- and inhibitory-type postsynaptic neurons are and , respectively. Similarly, the fast inhibitory synaptic strengths associated with excitatory- and inhibitory-type postsynaptic neurons are represented by and . Because all spatial kernels are normalized, these parameters set the strengths and relative weights of the synaptic couplings.

Thus, our model is an idealization of the experimentally observed connectivity of a V1 layer, in which the strengths and spatiotemporal properties of the synaptic connections depend on the neuronal types, on orientation preferences, and on whether the neurons lie in the same orientation hypercolumn.
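A compact sketch of this connectivity rule follows, combining an isotropic Gaussian local part with a long-range part weighted by orientation similarity. The von Mises-like similarity factor and the parameters `w_long` and `kappa` are illustrative placeholders, not values from the model:

```python
import numpy as np

def coupling_matrix(coords, prefs, sigma_local, sigma_long, w_long=0.2, kappa=4.0):
    """Normalized coupling strengths between neurons (or CG patches).

    coords : (N, 2) cortical positions (mm); prefs : (N,) preferred orientations (rad).
    Local part: isotropic 2D Gaussian of cortical distance.  Long-range part:
    Gaussian in distance, weighted by similarity of orientation preference."""
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    local = np.exp(-d**2 / (2.0 * sigma_local**2))
    tuning = np.exp(kappa * (np.cos(2.0 * (prefs[:, None] - prefs[None, :])) - 1.0))
    longr = w_long * tuning * np.exp(-d**2 / (2.0 * sigma_long**2))
    k = local + longr
    return k / k.sum(axis=1, keepdims=True)     # normalize total incoming weight
```

The double-angle cosine makes orientations 0 and π equivalent, so iso-orientation pairs receive the strongest long-range weights at a given distance.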

### A Coarse-Grained Neuronal Network

Numerous studies have suggested that cortical functional maps, such as the orientation pinwheel structure, phase preference, and spatial frequency preference, are arranged in regular, organized spatial patterns across the visual cortex. We therefore partitioned the 2D cortical network into small patches, each of which is large enough to contain a large number of neurons, but still small enough to ensure that functional (i.e., physiological) properties, like orientation preference, are roughly constant within a single patch.

Specifically, in the first experiment, only one hypercolumn was simulated to model the cortical responses to a rotating drifting grating. In the primary visual cortex, cells show sensitivity to orientation through elevated firing rates, and to spatial phase, which depends on the position of the grating stimulus. Orientation preference is arranged in pinwheels, while spatial phase preference is distributed randomly. To ensure that neuronal populations contain disordered, well-sampled preferred spatial phases, we designed 4 similar cortical patches covering each hypercolumn, but with different preferred spatial phases (0°, 90°, 180°, 270°) (see Fig 1). Each hypercolumn has a regular pinwheel-structured orientation map and is further coarse-grained (CG) into 6 × 6 CG patches. Each CG patch with a particular preferred spatial phase consists of 58 neurons; these neurons are located in clusters and share similar orientation preferences. Furthermore, since only a single hypercolumn was simulated in this model, only local cortical interactions (< 500 *μm*) were included, and these interactions were all-to-all and assumed to be isotropic.

In the second experiment, a larger cortical area, comprising many hypercolumns, was modeled. Thus, the synaptic connections of this model consisted of isotropic short-range connections and long-range horizontal synaptic connections, which are known to depend on orientation preference. To capture the population dynamics of this large cortical area and emphasize the significant role of the long-range, orientation-dependent connections, we divided each hypercolumn into 4 × 4 CG patches. According to their positions in the pinwheel-structured orientation map, these CG patches belonged to 4 different orientation clusters. In addition to local isotropic short-range connections similar to those in the first experiment, long-range horizontal connections across different hypercolumns were also included. In summary, this model has 5 × 3 hypercolumns, each of which is further divided into 4 × 4 CG patches. The specific space-time settings of the population dynamics framework studied in this work are shown in Fig 1.
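The partition into CG patches can be sketched as follows, using an idealized pinwheel map and a block-wise circular average. The grid sizes are the ones quoted above; everything else (function names, grid resolution) is illustrative:

```python
import numpy as np

def pinwheel_map(n=48):
    """Idealized pinwheel orientation map on an n x n grid: the preferred
    orientation rotates once around the pinwheel center."""
    y, x = np.mgrid[0:n, 0:n]
    c = (n - 1) / 2.0
    return (np.arctan2(y - c, x - c) / 2.0) % np.pi      # orientation in [0, pi)

def coarse_grain(theta, patches=6):
    """Partition the map into patches x patches CG blocks and return each
    block's circular-mean preferred orientation."""
    b = theta.shape[0] // patches
    blocks = theta.reshape(patches, b, patches, b)
    z = np.exp(2j * blocks)                              # double-angle statistics
    return (np.angle(z.mean(axis=(1, 3))) / 2.0) % np.pi
```

The double-angle complex average is used because orientation is a π-periodic quantity: naively averaging angles near 0 and near π would give a spurious intermediate preference.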

#### Coarse-Grained Network Model

Once the full I&F network configuration is set up, we can begin to coarse-grain. As is common in population density methods, we consider two biophysically relevant mesoscopic quantities: the firing rate and the distribution of neuronal membrane potentials.

The probability of finding a neuron whose membrane potential lies in the voltage bin (*v*, *v* + *dv*) at time *t* within a given ensemble (labeled by *j* and *Q*) is governed by a Master equation:
where *m*^{E/I}(*t*) is the excitatory or inhibitory population-averaged mean firing rate of each neuron as a function of time. Note that this equation already relates the two relevant mesoscopic quantities, the firing rate and the distribution of membrane potentials.

To simplify further, we assume that the jumps in membrane potential due to the external Poisson spikes, as well as to the recurrent network spikes, are small, so that the Master equation Eq (15) can be approximated by a standard Fokker-Planck equation of the form:
where the flux *J*^{Q} is defined as:

The drift coefficient (the so-called slaving voltage) and the diffusion coefficient depend on the spatiotemporal network couplings as well as on the cortical activities, and can be written as
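Collecting the pieces just defined, the Fokker-Planck system takes the standard advection-diffusion form. The following is a hedged reconstruction, writing the density, slaving voltage, and diffusion coefficient as *ρ*_{j}^{Q}, *V*_{S,j}^{Q} and *D*_{j}^{Q} (assumed symbol names):

```latex
\frac{\partial \rho^{Q}_{j}}{\partial t}(v,t)
  = -\frac{\partial J^{Q}_{j}}{\partial v}(v,t),
\qquad
J^{Q}_{j}(v,t)
  = -\frac{1}{\tau_{V}}\bigl(v - V^{Q}_{S,j}(t)\bigr)\,\rho^{Q}_{j}(v,t)
    - D^{Q}_{j}(t)\,\frac{\partial \rho^{Q}_{j}}{\partial v}(v,t).
```

In this form the drift relaxes the voltage toward the slaving voltage on the timescale *τ*_{V}, and the population firing rate is read off as the flux evaluated at the threshold, *m*_{j}^{Q}(*t*) = *J*_{j}^{Q}(*V*_{T}, *t*).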

The firing rate of the *j*-th patch is determined by the flux across the threshold *V*_{T}:

Furthermore, neurons that have just reached the threshold are immediately reset to the reset voltage *V*_{R}. Therefore, to ensure that the voltage distribution is continuous, we have
with the boundary conditions . Combining with Eq (20) and substituting into Eq (16), we obtain the equilibrium voltage distribution

Note that and are determined by Eq (18,19), and the equilibrium firing rate of a single neuron is determined via Eq (20), which contains and , and is then normalized. Furthermore, and in turn depend on (if we assume that ). Because of this assumption, and are interdependent, which means that Eq (22) is an implicit equation. In this sense, Eq (16) is nonlinear.

We can reduce Eq (16) further by using voltage moments. Let us define the voltage-moments and by:

Combining Eq (23) and the Fokker-Planck Eq(16), the evolution equations for the voltage-moments become

For homogeneous networks, the detailed derivation of Eq (24,25) is given in [45].

We note that, in general, the *k*-th moment does not depend on higher-order moments, so at first glance this infinite hierarchy of moments does not suffer from the standard closure problem. However, the dynamics of each moment depends explicitly on the population firing rate, which can only be computed from the population voltage distribution . Therefore, any finite set of moments cannot uniquely define the population distribution, and so, to solve this nonstandard closure problem, we introduce a modified maximum entropy (ME) method. That is, we compute the voltage distribution that is ‘closest’ to the equilibrium voltage distribution given a finite set of moments , *k* = 1,2,…,*M*. We found that, in the simulations performed in this paper, good performance can be achieved with only 4 augmented variables for each CG patch. (A more complete and detailed discussion can be found in our previous work [19, 45, 46].)
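The ME step can be sketched as a convex dual problem: among all densities matching the prescribed moments, the one with minimum KL divergence from the equilibrium density is an exponential tilt of that density. The grid-based implementation below is a hypothetical sketch (not the authors' algorithm), using `scipy.optimize.minimize` on the dual objective:

```python
import numpy as np
from scipy.optimize import minimize

def max_entropy_density(v, rho_eq, moments):
    """Density closest (in KL divergence) to rho_eq that matches the first M
    voltage moments <v^k>, k = 1..M -- a sketch of the modified ME closure.

    v, rho_eq : voltage grid and reference (equilibrium) density on that grid
    moments   : target values of <v^k>, k = 1..M"""
    dv = v[1] - v[0]
    mu = np.asarray(moments, dtype=float)
    powers = np.vstack([v**k for k in range(1, mu.size + 1)])     # (M, n)

    def dual(lam):                      # convex dual of the constrained problem
        logw = lam @ powers
        c = logw.max()                  # shift for numerical stability
        z = np.sum(rho_eq * np.exp(logw - c)) * dv
        return np.log(z) + c - lam @ mu

    lam = minimize(dual, np.zeros(mu.size), method="BFGS").x
    rho = rho_eq * np.exp(lam @ powers)  # exponential tilt of the reference
    return rho / (rho.sum() * dv)        # renormalize on the grid
```

At the optimum the gradient of the dual vanishes, which is exactly the statement that the tilted density reproduces the target moments.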

One of the major purposes of coarse-graining the large-scale neural network is to improve simulation efficiency. Our moment CG method is an ensemble-average method, corresponding to an average across many independent trials of a large-scale I&F neural network. The time taken by our CG model is therefore far less than that of the corresponding large-scale I&F network, for which hundreds or thousands of realizations with identical statistical properties would need to be simulated and averaged.

In our simulations, we found that the most time-consuming step is the optimization via the maximum entropy algorithm. In cases where the slaving voltage is relatively far from *V*_{T}, so that the voltage distribution is not sharply localized near the threshold, the equilibrium voltage distribution can be well approximated by a Gaussian
and we can obtain the firing rate from Eq (20) with the drift and diffusion terms as in Eq (18,19).

#### Parameter Calibration

First, we choose the strength of the external, feedforward input (i.e., the LGN input) based on observations under a single square stimulus. A single darkening/brightening stimulus generates moderate population activity (i.e., subthreshold membrane potentials and suprathreshold firing rates); we specifically adjust the external driving strength *S*^{QY} so that the voltage component generated by the external input exceeds the voltage threshold, which means that the external input can trigger a population firing event in the directly stimulated regions at the initial stage. This initial population firing event is one of the conditions for the subsequent isotropically spreading activity. Second, we choose a larger strength for the excitatory local connections than for the inhibitory ones, to ensure local spreading (recruitment). Finally, based on the observed differences in the onset times of the responses to On-/Off-stimuli, we select different time parameters {*τ*_{1},*τ*_{2}} (Eq (3)) to model the different feedforward time kernels of the On-/Off-visual pathways.

Considering the experimentally observed VSD data under visual stimuli with more complex spatiotemporal structure, i.e., the counterchanging On-/Off-stimulus and the Hikosaka LMI stimulus, the propagating wave-like population activity pattern suggests a critical role for long-range NMDA-type synaptic connections [47, 48]. We then choose the strengths of the long-range connections within our model so that our simulation results reproduce the traveling-wave population activity pattern, qualitatively matching the wave speeds in the experimental data.

## Results

We first validate our moment closure by applying it to model the network response to a rotating drifting-grating stimulus. Here, the visual stimulus is a sinusoidally modulated grating, drifting at 4 Hz and rotating at 20°/sec. We numerically simulate one single orientation hypercolumn (see Fig 1A and Methods). Figure 2 shows, over one rotation period, the temporal dynamics of the total synaptic inputs (Fig 2A), which consist of the external (i.e., LGN) input (Fig 2B), excitatory cortical input (Fig 2C) and inhibitory cortical input (Fig 2D). (The faster oscillations correspond to the drift rate of the sinusoidally modulated grating.) Each panel plots the dynamics at two locations with respect to the pinwheel center, with red/blue representing neuronal populations far from/near the pinwheel center, respectively. Clearly, the neurons away from the pinwheel center (in the so-called iso-orientation domain) have flatter temporal responses compared to neurons near the pinwheel center. Since the grating is also slowly rotating, the envelope of the temporal response curve can be used to estimate the orientation tuning of each neuronal population. Fig 2E displays the population-averaged membrane potentials and slaving voltages (see Methods) of individual patches (marked in Figs 2F and H); red/blue solid lines are population-averaged membrane potentials of neurons away from/near the pinwheel center; pink/cyan dashed lines are the corresponding slaving voltages. Figs 2F and H show the spatial patterns of the population-averaged membrane potential and population firing rate at *t* = 3475 ms. From Figs 2E-H we can infer that the neurons near the pinwheel center are more selective to stimulus orientation while the neurons far from the pinwheel center are less selective, consistent with the network model of McLaughlin & Rangan [49].

Population activity in the cortex forms characteristic clusters in both space and time. While the processing of local, small-scale stimulus orientation appears to be performed within individual orientation hypercolumns, the integration and processing of more global features are believed to be functions of a network organization on scales spanning multiple hypercolumns. Furthermore, many experimental studies have revealed the existence of spatially separated On- and Off-visual pathways [50–52], along with temporal differences between them, originating in the RGCs and persisting into V1. On the scale of single neurons, the well-known spatial arrangement of LGN inputs shapes individual V1 receptive fields. The temporal asymmetry between these pathways was revealed by comparing the neuronal responses to brightened versus darkened stimuli [50].

Recently, using VSD optical imaging, Rekauzke et al showed that the temporal difference in the On-/Off-visual pathways can lead to propagating subthreshold cortical activity, possibly contributing to motion perception [26]. In their experiments, Rekauzke et al used darkening and brightening square stimuli to probe the difference between the On-/Off-visual pathways. The darkening square stimulus is an initially bright square on a uniformly dark background that is changed to grey, and the brightening square stimulus is an initially grey square switched to bright (the leftmost columns in Figs 3A and B). VSD imaging captured the cortical activities aroused in the directly stimulated location before they spread nearly isotropically via horizontal connections (the top rows of the right panels in Figs 3A and B). They also found that responses to the off stimulus (darkened square) arrived ~10 ms before those to the on stimulus (brightened square), thus confirming the existence of temporal differences between On-/Off-processing.

Using simultaneous darkening and brightening squares at adjacent locations (the so-called counterchanging stimulus; see the leftmost column in Fig 3C), Rekauzke et al found propagating cortical activity flowing from the darkened area towards the brightened location. This propagating activity is similar to the wave-like response to a moving square stimulus, suggesting that the temporal asymmetries in the On-/Off-visual pathways can lead to motion signals in higher-order visual perception, a hypothesis corroborated by psychophysics experiments.

To generate a physiologically plausible model of the cortex, we use these experimentally recorded VSD data to calibrate our large-scale I&F neuronal network. First, using the results from single darkening and brightening square stimulus, we adjust the LGN input strengths, the On-/Off-LGN temporal kernels, and the strengths of local connectivities. Using the wave propagation properties revealed in the counterchanging stimulus, we calibrate the strengths of long-range connections.

We applied this large-scale I&F model to study the dynamical responses of a realistic cortex comprising hypercolumns and neurons. Individual neurons and massive, complex inter-connections are the essential network elements in this model. Although the network dynamics can be accurately traced by a set of ordinary differential equations (ODEs), direct simulation of such a large-scale I&F neuron model is computationally expensive (considering the large number of neurons and the many stochastic realizations of network dynamics that may be required). Furthermore, because of the dimensionality of the network, it is difficult to analyze mathematically.

In matching the dynamical responses of our large-scale I&F model cortex, we found that the illusory phenomenon can be appropriately captured and modeled as the complex collective activity of the cortical circuit, and that it depends crucially on the interaction of neuronal populations. We therefore perform a reduction of the large-scale I&F model, organizing neurons with similar properties into spatially coarse-grained subpopulations.

For our coarse-grained, augmented ODE model, we take the parameters directly from the large-scale I&F model. Although a direct comparison between the large-scale I&F model and the CG model shows slightly different spatial activity patterns, the essential wave propagation, from darks to brights, can be reproduced (right panels in Fig 3C). (See Methods for a detailed description for parameter calibration.)

To illustrate how our dimensional reduction captures spatiotemporal cortical activity in general, we stimulate our moment model with a novel version of the LMI stimulus paradigm. In Fig 3E, we combine features of the On-/Off-counterchanging squares of Rekauzke et al with the original Hikosaka LMI into a new visual stimulus, which we call the ON-/OFF-LMI. The stimulus starts with a cue, a small stationary bright square that switches OFF, followed a few milliseconds (~10-30 ms) later by a grey stationary bar that turns ON to bright at an adjacent location.

To compare the stimulus-generated neuronal activities between the large-scale I&F network model and our CG results, in Fig 3F we display activity patterns in a spatially one-dimensional representation (see Fig 3D for details). Excitatory, inhibitory, and aggregated population-averaged membrane potentials of patches in the CG model, and membrane potentials of point neurons in the large-scale I&F model, are displayed from left to right. The membrane potentials initially arise in patches receiving the darkened square stimulus: the earliest responses (cyan/green) appear in the upper left corner of each panel, before spreading isotropically, while the amplitudes gradually increase (yellow/red). In the second stage, after the bar stimulus turns ON to bright, a gradual wave-like propagation of population-averaged membrane potentials emerges. This wave propagation emerges in the region between the middle and bottom portions (white rectangle in Fig 3F) of the activity, which receives the brightening bar stimulus. The moment in time when the population activity of a CG patch reaches a particular level (here we chose 80% of the maximum activity) shows a continuous shift, i.e., the farther the patch is from the initial stimulus, the later its activity reaches this level. The counterclockwise tilting contours (activities with the same amplitude/color) in the corresponding region intuitively reflect this phenomenon (upper row in Fig 3F). The lower-left sub-plot in Fig 3F summarizes the temporal traces of population-averaged membrane potentials at seven evenly spaced locations within the area receiving the bar stimulus; we observe a rightward time shift (time delay) along the direction away from the initial stimulus, occurring before the subthreshold neuronal responses corresponding to the brightening bar stimulus (about 50 ms).
The two lower-right sub-plots in Fig 3F show the measured position of the propagating wave as a function of time, from the CG model (left) and the large-scale I&F model (right). Both show a wave speed of about 0.05 m/s, consistent with experimental results [24].
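The wave-speed measurement can be sketched as follows (a hypothetical reimplementation, not the authors' analysis code): for each location, find the first time the activity reaches a fixed fraction of its local maximum, then fit position against that crossing time.

```python
import numpy as np

def wave_speed(activity, dt, dx, level=0.8):
    """Estimate propagation speed from a space-time activity map.

    activity : array (T, X), e.g. population-averaged membrane potential
    dt, dx   : time step (ms) and patch spacing (mm)
    Returns the slope of a linear fit of position vs. front-crossing time."""
    n_t, n_x = activity.shape
    times = np.empty(n_x)
    for x in range(n_x):
        trace = activity[:, x]
        times[x] = np.argmax(trace >= level * trace.max()) * dt  # first crossing
    pos = np.arange(n_x) * dx
    return np.polyfit(times, pos, 1)[0]       # mm/ms (numerically equal to m/s)

# Sanity check on a synthetic wave traveling at 0.05 mm/ms
dt, dx, speed = 1.0, 0.1, 0.05
t = np.arange(200)[:, None] * dt
x = np.arange(30)[None, :] * dx
act = np.exp(-((t - x / speed - 20.0) ** 2) / 50.0)
```

Fitting position against crossing time, rather than reading off a single pair of points, averages out discretization noise in the front detection.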

The ON-/OFF-LMI reveals two nontrivial temporal properties: the first is the time delay of about 20 ms inherited from the intrinsic latency between the On-/Off-visual pathways, and the second is the time difference of about 30 ms between the appearance of the cue square and the long bar. The combination of these two timing differences ‘primes’ the cortical network and initiates a traveling cortical wave when a second stimulus immediately appears in a nearby location [14]. Our CG model indicates that the V1 network can integrate and make full use of these two types of temporal differences to induce a ‘priming’ effect, so that an appropriately placed second stimulus triggers cortical voltage propagation. The spatiotemporal activity patterns from the CG model are presented in Fig 3F; as we see, the activities closely resemble those of the I&F model (rightmost panels in Fig 3F).

We note that the speed of the traveling wave is largely independent of the cue contrast. In Fig 4 we plot the trajectory of the propagating wave (in the region marked by the white rectangle in Fig 3F) as a function of time. Reducing the darkened square's contrast from 100% to 50%, we observe a delayed initiation of the wave. Moreover, across these different cue contrasts, every point on the trajectory of the traveling wave shifts by roughly the same amount of time (a shift along the x-axis). Fig 4 indicates that after an initial transient, the induced traveling wave reaches a steady-state velocity of about 0.05 mm/ms (i.e., 0.05 m/s), independent of cue contrast. This speed is roughly consistent with the wave speed induced by the Hikosaka line-motion illusion paradigm.

In Fig 5, we show the CG results from 1) the Hikosaka LMI stimulus (Fig 5A), 2) a moving square (Figs 5D and S1L), and 3) two types of drawn-out squares: one drawn directly out to bar length (Figs 5G and S1I), and another starting out as the Hikosaka LMI, with a priming square that vanishes, followed by a drawn-out square (Fig 5J). VSD signals evoked by these stimuli are presented in Figs 5B, E and H (VSD data are not available for the second type of drawn-out stimulus); the corresponding CG model results are displayed in Figs 5C, F and I.

We note that in the cases with VSD imaging data, our CG model reproduces the main features of cortical wave propagation. For instance, in the LMI of Fig 5C, the initial square stimulus activates cortical neurons after about 40 ms, with neuronal activity persisting even after the disappearance of the square stimulus (t = 48 ms or 5Δ). Then, at about 60 ms, a bar is flashed. The neuronal patches (in the VSD data and in our CG model) in the region near the previous firing build up activity after ~20 ms, before spreading to the right. We compare this to a physically moving square stimulus (Figs 5D-F) to demonstrate that the wave-like cortical activity pattern under LMI is very similar to the activity induced by a square moving at the appropriate speed. The blue line in Fig 5D denotes the trajectory of the head of the moving square stimulus, and its slope indicates the speed (~32 deg/sec).

## Discussion

Information processing in the brain is often reflected by organized, coherent activity patterns that are distributed across almost the entire cortex [11, 24, 53, 54]. This population-level neuronal activity is often thought of as an emergent property of strongly coupled recurrent networks [6, 55–57]. Furthermore, it is believed that the information is embedded in the spatiotemporal patterns arising from the cooperation and competition between the external stimulus, the intrinsic neuronal dynamics and the network architecture. Modern experimental techniques (such as VSD imaging) can capture these phenomena with high spatial and temporal resolution [1]. However, understanding the network mechanisms underlying the generation and maintenance of neuronal population activity has mainly been addressed through large-scale numerical simulations, and a mathematical framework for extracting simplified representations remains a major theoretical challenge [15, 49, 56, 58].

The mammalian early visual pathway is a complex system whose functions emerge from interactions that take place simultaneously on a vast range of spatial and temporal scales. Optical imaging experiments of V1 reveal visual feature preferences organized into mm-scale hypercolumns that tessellate the V1 network. While local connectivities on the hypercolumn scale appear to be isotropic, longer-range, reciprocal connections are excitatory-only and orientation-specific [39–41, 43]. These long-range horizontal connections are likely responsible for the synchronization of gamma-band oscillations that signify visual processing beyond single-cell receptive fields. Voltage-sensitive dye imaging has also revealed how various spatial and temporal scales interact to produce visual illusions, such as the line-motion illusion.

Standard ensemble averages of neuronal network dynamics lead naturally to Master equations, Fokker-Planck systems, and other types of kinetic theories [12, 16, 17, 31, 42, 59–62]. The classic formulations of neuronal network dynamics at the population level, i.e., population/ensemble density models or Wilson-Cowan models, evolve according to the conservation of probability density flux and represent the population activity of a cortical region by one or more variables.

While these reductions are effective descriptions of coherent population dynamics, they yield systems of partial differential equations that are still time-consuming to solve and not always easily amenable to analysis and applications. Mean-field approximations replace the ensemble density with the expected value of the network state variables. Further refinements [3, 9] take the fluctuations of neuronal activity into account by introducing *ad hoc* Gaussian noise terms. Recently, a Master equation formalism proposed by El Boustani and Destexhe allows for a ‘mesoscopic’-level description of population dynamics. Their model can be used beyond stability analysis, but requires a phenomenologically fitted f-I curve.

In ZSRT19, by introducing voltage moments for a homogeneous neuronal population (see Methods), we derived equations governing the population voltage moments that, when combined with a maximum entropy closure, resulted in an augmented ODE system. Numerical simulations suggested that many network dynamics can be captured by 2 to 4 moments. Based on this formalism, here we developed an analytical framework for modeling and analyzing large-scale coherent cortical activity. We first applied our method to a single orientation hypercolumn in V1 before using it to model an extended network spanning roughly 25 square millimeters of V1, an area containing a vast number of neurons. On this scale, VSD imaging of cat V1 has revealed coherent wave propagation that may underlie motion perception [26].
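As a toy illustration of the moment variables involved (a sketch in the spirit of the text, not the ZSRT19 derivation itself; the function name is ours), the first few voltage moments of a homogeneous patch can be read off directly from sampled membrane potentials, with 2 to 4 of them summarizing the population voltage distribution:

```python
import numpy as np

def voltage_moments(v, n_moments=4):
    """Return the population mean followed by the central moments
    2..n_moments of a patch's membrane potentials."""
    mu = v.mean()
    return np.array([mu] + [((v - mu) ** k).mean()
                            for k in range(2, n_moments + 1)])

# Synthetic patch: 10,000 membrane potentials (mV) drawn from a
# Gaussian; for a Gaussian the 3rd central moment is ~0 and the
# 4th is ~3*sigma^4.
rng = np.random.default_rng(0)
v = rng.normal(loc=-60.0, scale=4.0, size=10_000)
m = voltage_moments(v)   # [mean, variance, 3rd, 4th central moment]
```

In the CG framework these moments become the dynamical variables of each patch, evolved by the augmented ODE system rather than recomputed from individual neurons.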

While many previous studies have derived dimensional reductions, our work demonstrates theoretically how a low-dimensional characterization can be derived from a careful assessment of the local-scale population dynamics, without the introduction of any new parameters. Furthermore, our voltage-moment description can be extended to nonlinear I&F models (see, for instance, [63]) and generalized to include other cellular effects, e.g., adaptation and short-term depression [7–9]. However, each of these additions may lead to maximum entropy closure models that are hard to solve numerically. Investigating suitable approximations and numerically expedient solutions to the maximum entropy closure is therefore an important direction for future work.

In surveying the rich repertoire of cortical dynamics, a natural question arises: what concise, unified characterization can capture the co-existence of diverse, heterogeneous dynamical states in highly recurrent neuronal networks? It is believed that large-scale neuronal information processing emerges from the interaction between the external input, individual neuronal dynamics and cortical architecture. Many studies have analyzed the dynamical effects of different types of mechanisms [59, 64, 65]. Here we showed how a system of voltage moments can naturally and effectively model various VSD imaging experiments. Furthermore, our coarse-grained representations can be used to assess the importance of various mechanisms and facilitate our understanding of the rich dynamical states within the mammalian brain. Future work will focus on the incorporation of higher-order network motifs, which are responsible for higher-order correlations beyond the mean-field approximation and are likely to have important consequences for information processing.