Dynamic causal modeling of layered magnetoencephalographic event-related responses

The layered architecture of cortex is thought to play a fundamental role in shaping cortical computations. However, direct electrophysiological measurements of layered activity are not possible non-invasively in humans. Recent advances have shown that a distinction of two layers can be achieved using magnetoencephalography in combination with head casts and advanced spatial modeling. In this technical note, we present a dynamic causal model of a single cortical microcircuit that models event related potentials. The model captures the average dynamics of a detailed two layered circuit. It combines a temporal model of neural dynamics with a spatial model of a layer specific lead field to facilitate layer separation. In simulations we show that the spatial arrangement of the two layers can be successfully recovered using Bayesian inference. The layered model can also be distinguished from a single dipole model. We conclude that precision magnetoencephalography in combination with detailed dynamical system modeling can be used to study non-invasively the fast dynamics of layered computations.


Introduction
Its layered structure is one of the defining anatomical features of cerebral cortex (Douglas and Martin, 2004). How this layered, hierarchical organization relates to the computational properties of the human brain is still not understood in detail. However, theoretical predictions have been made about a possible role of the layered architecture (Bastos et al., 2012;Douglas and Martin, 2007;Heinzle et al., 2007). For example, predictive coding (Friston, 2005;Rao and Ballard, 1999), a specific instantiation of the general "Bayesian brain" theory (Knill and Pouget, 2004), can be implemented by a hierarchy of layered microcircuits, with distinct computational roles of neurons in different layers (Bastos et al., 2012;Shipp, 2016;Stephan et al., 2019).
In order to test hypotheses like predictive coding in humans, non-invasive measurements of layered activity are required. One option for this is presented by recent advances in highresolution layered functional magnetic resonance imaging (fMRI) Kok et al., 2016;Muckli et al., 2015). However, fMRI provides an indirect and intrinsically slow measure of neural activity which might suffer from blood draining effects when applied to layers (Heinzle et al., 2016). Separation of feedforward (bottom-up) and feedback (top-down) streams would be facilitated by a more direct, electrophysiological measurement of layered activity as, for example, provided by magnetoencephalography (MEG, for a review see Baillet, 2017). The spatial specificity of MEG can be improved by assuring precise positioning of the head within the MEG and a high-resolution cortical model of layers (Bonaiuto et al., 2018a;Troebinger et al., 2014b). Simulation studies have shown that under these conditions oscillatory activity can be reliably distinguished between two layers located at the pial surface or gray matter -white matter boundary, respectively (Bonaiuto et al., 2018b;Troebinger et al., 2014a). To date, these studies have focused on inversion of a precise "spatial model" describing the mapping from one (or several) current dipole(s) distributed on the two cortical surfaces to sensor activity.
In contrast, temporal models which describe the dynamics of neural activity are routinely considered within the dynamic causal modeling (DCM) framework. DCMs of EEG and MEG are generative models that explain measured MEG/EEG sensor data through a forward mapping of activity in one or several connected cortical columns. These are modeled as microcircuits consisting of several neural populations (e.g. supra-and infragranular pyramidal neurons, spiny stellate and inhibitory neurons of a cortical column). For reviews see for example (Kiebel et al., 2009;Moran et al., 2013). However, the spatial model in DCM has, to date, not allowed the investigation of neural signals at the resolution of layers. Instead, this approach has considered contributions of individual populations from different layers by attributing the weighted summed activity to a single point source or current dipole. Using this forward model, networks of layered microcircuits were used to study the relative importance of feed-forward and feed-back connections in auditory mismatch (Garrido et al., 2008) and the processing of face  and somatosensory stimuli (Auksztulewicz and Blankenburg, 2013). Recently, a dynamic causal model was applied to invasively recorded depth dependent cortical activity of mice (Pinotsis et al., 2017), directly predicting layered signals. This model was based on previous work that proposed a small network of groups of neurons to explain somatosensory evoked potentials (Jones et al., 2007) and mu-rhythms (Jones et al., 2009) in MEG. Notably, Jones et al (2007) used detailed compartmental neurons to directly model currents in dendritic trees. They manually adjusted parameters to match the resulting current dipole changes to source reconstructed MEG data.
Here, we augment previous generative modeling attempts that focus mainly on temporal models (e.g. Garrido et al., 2008) or on spatial aspects (Bonaiuto et al., 2018a;Troebinger et al., 2014a). We fit a temporal model of a laminar circuit to simulated MEG data while making use of precise anatomical information. This allows us to assign different dipoles to different cortical layers. As a basis for the temporal model we use a canonical microcircuit (Bastos et al., 2012;Moran et al., 2013) approximation to the Jones model (Jones et al., 2007). In simulations, we show that a model with correct layer information can be distinguished from models where this information is reversed or missing at signal to noise levels that are commonly observed in MEG. We show how the addition of temporal information removes considerable ambiguity in the spatial model by constraining sources to lie on the correct sulcal wall. We then investigate how this inference on layered circuits depends on parameter settings and how well the distance between current dipoles in supra-and infragranular layers can be estimated. Finally, we investigate to what degree adding more sensors increases the sensitivity of the method. In summary, our simulations show that inversion of layered dynamic models is possible under reasonable SNR settings.

Methods
The simulations and model inversions performed in this paper use the DCM for EEG and MEG framework Kiebel et al., 2009Kiebel et al., , 2006. The generative model for MEG data consists of a temporal model (a set of differential equations that describe the temporal evolution of neural activity in a layered microcircuit) and a spatial model (the lead field mapping that converts current dipoles in different cortical layers to signals in MEG sensors).
We first introduce the dynamical layered model, a canonical microcircuit (Bastos et al., 2012;Moran et al., 2013) which we adapted to replicate the dynamics of a detailed compartmental neuronal circuit used to explain somatosensory event related MEG responses (Jones et al., 2007), referred to as Jones compartmental microcircuit model (JCM) throughout this paper.
Second, we explain the spatial model used for simulating MEG traces from layered cortical activity (Bonaiuto et al., 2018b(Bonaiuto et al., , 2018aTroebinger et al., 2014aTroebinger et al., , 2014b. Third, we outline the simulation and inversion of DCM for layered MEG and present the simulations carried out to test the feasibility of inferring layered circuit activity from MEG data. Figure 1 graphically summarizes the model used in this paper. The temporal dynamics of the model were controlled by a canonical microcircuit model (CMC: spiny stellates: black, inhibitory interneurons: red, superficial pyramidal cells: green, deep pyramidal cells: blue). The CMC received two waves of bottom-up input to the granular layers (iGR and lGR) and top-down input to the pyramidal cells (sGR). The parameters of the CMC were fitted in order to match responses of CMC supra-and infra-granular pyramids (red traces) to the corresponding cells simulated by the JCM red traces. Bottom: In order to generate MEG sensor signals we assumed that the activity of the pyramidal neurons generated a current dipole positioned in the upper (pial surface) and lower (grey matter -white matter boundary) layer of cortex, with an orientation perpendicular to the cortical surface. The distance d specifies the distance between the two dipoles and hence relates to cortical thickness.

The temporal model
In this section, we describe the canonical microcircuit model (MCM; Moran et al., 2013) and how it was initially fitted to the Jones compartmental microcircuit model (Jones et al., 2007).

Canonical microcircuit model
The layered microcircuit in this paper was a neural mass model (Moran et al., 2013) that describes the dynamics of a cortical layer by focusing on key neuronal populations in different laminae. In order to allow for compatibility with the standard models in SPM, we used the canonical microcircuit model (CMC) as the basic layered neuronal network. This circuit includes one population of spiny stellate neurons, one population of inhibitory interneurons, and two populations of pyramidal neurons in layers 2/3 and 5, respectively. A schematic of the model circuit is shown in Figure 1. The dynamics of the populations are controlled by a second order differential equation following the work by Jansen and Rit (1995): Here, is the activity of population k. , controls the oscillatory behavior of the cell population and may differ between excitatory (e) and inhibitory (i) neurons. , is a gain parameter defining the maximal postsynaptic potential, are connectivity parameters and � � is a sigmoid function transforming activation into a "synaptic" input. Inhibitory connections are defined by setting the corresponding to be negative. Finally, is the "external" input into population k.
The parameters of the CMC were adapted to approximate the responses of the JCM. In order to adequately reproduce the JCM with the CMC, we included the three different inputs proposed in (Jones et al., 2007): an initial granular (iGR) layer input targeting layer 4, a supragranular (SGR) layer input -modelling feedback from higher regions and targeting all four populations, but with different weights (see Table 2, below), and a late granular (lGR) layer input targeting layer 4. All three inputs were described by Gaussian distributions. Table 1 summarizes the corresponding mean values and dispersion.

Fitting the CMC to the JCM
In order to simulate realistic traces of supra-and infragranular currents, we used the model of (Jones et al., 2007). We downloaded the version available at http://senselab.med.yale.edu/senselab/modeldb/ and simulated a somatosensory evoked potential. The model was run 100 times as proposed by Jones et al. (2007). Traces were created by taking the average of these 100 runs. In the next step, we fitted (using SPMs spm_nlsi_GN.m) the CMC to the JCM using a simple forward model that matched the activity of the supra-and infra-granular layer to the corresponding current dipoles of the JCM model.
The signal was simulated for the first 160 ms after stimulation (i.e. 0-160 ms). Figure 2A illustrates the traces of the JCM model and the fit. We used these maximum a posteriori (MAP) estimates of this fitting procedure for all subsequent simulations of MEG data. The resulting parameters for the simulation of the temporal model are summarized in Table 2.

Spatial Model
While the temporal model fully determines the traces of the two pyramidal population, a spatial model is needed to transform the neuronal signals to sensor data. The spatial model constitutes the forward part of the generative model, mapping hidden states to observations (measurements). The current dipoles of the two populations were modelled as a dipole pair, one for the superficial and one for the deep layer traces. The apical dendrites of both superficial and deep pyramidal neurons are oriented perpendicularly to the cortical surface.
Therefore, we assume that the dipole pairs have the same orientation and their positions differ only along this axis (see Bonaiuto et al., 2019, for a comparison of different options to define dipole orientation). In order to mimic a somatosensory stimulation experiment, the dipoles were placed in area BA3b of a normalized brain (MNI space). Here, we used the average location of BA3b according to Papadelis et al (2011). For simulations, dipole orientation was defined by the angle between the dipole and its location vector with respect to the MEG volume conductor origin. Note that in an inversion with real data, the orientation of the dipole would be defined perpendicularly to cortex based on the anatomy of the subject. Here, the radial angle was chosen to be 51° except for simulations where the angle was varied. These exceptions are indicated in the Results section. Finally, the dipoles of the two layers were separated by a distance of 2 millimeters along the dipole orientation. The distance of 2 mm is motivated by cortical thickness measurements in humans which ranges from 1 to 4.5 mm (Fischl and Dale, 2000) but tends to be relatively thin, around 2 mm, in primary sensory areas (Scholtens et al., 2015).
The lead fields of the different layers were calculated with the single shell model (Nolte, 2003) of SPM (Litvak et al., 2011). Synthetic sensor data of an MEG scanner were simulated as described in an earlier paper (Bonaiuto et al., 2018b). Briefly, we used an affine spatial transformation from MNI space to sensor space in order to place the current dipoles in the correct position within the MEG scanner. We then simulated a total of 274 sensors. Figure 2B illustrates the lead fields for two dipole pairs in right BA3b. For display purposes and to highlight the difference between the lead fields of superficial and deep pyramidal neurons, the distance between the two dipoles of a pair was set to 4mm. As expected, the lead fields of the superficial and deep dipole within a pair are very similar. Nevertheless, the two dipoles of a cortical column show differences which are better visible when plotting the lead field in one dimension ( Figure   2B). In this work, we try to exploit these small differences to make statements about layer differentiation. Figure 2C illustrates an example of simulated data from the whole model, using a layered microcircuit in the right BA3b. are mapped to the sensors through the lead fields illustrated in B. Note that for display purposes the distance between the dipoles was 4 mm in these simulations.

Simulations
In order to test whether the proposed layered circuit could be inferred from sensor data, we simulated MEG data using the generative model described above with a forward model consisting of two lead fields for the two current dipoles of upper and lower layer pyramidal cells. The strategy of the simulations is explained in Figure 3. In general, we simulated data from a model that assigned superficial activity to a current dipole close to the pial surface and activity of deep pyramidal neurons to a current dipole close to the gray-white matter boundary.
We refer to this model as the layer correct (LC) model. The sensor traces generated from these simulations were then used as data for subsequent model inversion: In order to investigate whether and how well the assignment of the two sources could be inferred from the data, we inverted several models using the generated data. We then used Bayesian model comparison to establish which was the most likely model, given the data. Model inversion was performed using the variational Bayes approach in SPM (Variational Laplace) which yields the variational negative free energy (F) as an approximation to log model evidence (Friston et al., 2007). Log model evidence is a measure of model goodness used to score different models that were inverted based on the same data set. A difference in log model evidence larger than 3 (equivalent to a Bayes factor larger than 20) is usually considered strong evidence in favour of one model compared to another (Kass and Raftery, 1995).
In a first set of simulations, we compared the LC model to two competing hypotheses: First, we assessed a model with the inverted assignment of layered activity (layer inverted model, LI), that is, with superficial activity assigned to a current dipole close to the gray-white matter boundary and activity of deep pyramidal neurons assigned to a dipole close to the pial surface.
Second, we also considered a model where both dipoles were located at the same location (at mid-cortical depth, comparable to traditional DCMs). We refer to this model as layers detached (LD) or single-source model. In all simulations, we created models with the correct association between neural populations and dipoles. For model inversion, different models were fitted to the generated data and then compared, for example the layer correct to the layer inverted model.
After this first set of model comparisons, we conducted a series of simulations to test how sensitive the separation between LC and LI models was and how much it depended on the particular choice of parameters. For this, we simulated different dipole pairs and tested how well the LC and LI models could be separated. In particular, we varied the orientation of the dipoles, cortical thickness, and the distance to the closest sensor. Finally, we conducted a simulation to explore the effects of the arrangement of MEG sensors with a particular focus on the possibility of reducing the number of sensors.
All data were simulated using the CMC with the parameters indicated in Table 2. If not stated otherwise, the model generating the data had a distance of 2 mm between the superficial and deep dipole. After creating the sensor data (sampled at a frequency of 2400 Hz), white Gaussian noise was added to the signals. The noise of each sensor was assumed to be independent from the other sensors and also independent between time points. Furthermore, we assumed the same amount of noise, i.e. variance, for all sensors. We characterize the amount of noise by the signal to noise ratio (SNR) as defined by Goldenholz et al (2009): where 2 is defined as the signal on sensor k for unit input, 2 is the variance of the noise added to the k th sensor, N is the number of sensors, and a² the source amplitude. Note that describes the overall SNR. The SNR of individual sensors varies and depends on their respective signal strength. Here, we explored SNRs between -20 and 10 dB. In order to avoid effects of any particular noise instance, each simulation was repeated with 20 randomly created noise traces. It is worth noting that the model we propose here can be fitted to averaged event-related responses (ERPs). Thus noise levels have to be compared to the SNR of averaged evoked potentials. Averaging the signal over n trials reduces the variance by a factor of n: 2 = 2 ⁄ . This results in an additive increase of SNR by 10 log 10 .

Software note
All code used here was implemented in MATLAB (MathWorks, Inc., Natick, Massachusetts,

Results
In this section, we will first show that Bayesian model comparison can distinguish the correctly layered model from two alternative models, one with both dipoles placed in the middle of cortex and one with an inverted dipole position. Next, we explore to what extent model inversion provides model evidence estimates specific enough to distinguish different cortical depths. We then proceed to a thorough analysis of the influence of dipole properties such as orientation and separation of the two layers on model inversion. Finally, we illustrate through simulation how the combined spatio-temporal model is able to distinguish the sign of the current dipole, i.e. the orientation of pyramidal neurons in cortex.

Proof of Concept
Subsequently, we will show that using both temporal and spatial features improves model prediction at moderate SNR. We do this by comparing the free energy of three different models that were all fitted to the data generated with a layer correct model with a distance of 2 mm between the two layers.  Figure 4 for different SNRs ranging from 10dB to -20dB. As expected, models could be more clearly distinguished with increasing SNR. Also, as expected, this difference in model evidence was greater for the more spatially distinct pairings. That is, there was a greater difference between layer correct (LC) and layer inverted (LI) models than between layer correct and layer detached (LD or point-source) models. The improvement was decisive (ΔF > 3) for both comparisons for SNR levels of 0dB or larger. The correct model was reliably selected in most cases even for SNRs of -3.3dB (for some instantiations of noise even -6.7dB) when comparing the LC-and LI-model.

Model Comparison: Specificity for Cortical Thickness
Next, we considered the situation where the exact cortical thickness is not known. For this, we created a set of models with thicknesses between -2mm (layer inverted) and 4mm (thick cortex) and tested which of these models best predicted data generated by a model with cortical thickness of 2mm. These simulations were performed for different SNRs of -20, -10, 0, and 10dB, respectively. Figure 5 summarizes the results of these simulations. A discrimination of cortical depth was possible based on the negative free energy difference between models at SNRs of 10dB and 0dB. At a SNR of -10dB and -20dB, none of the models was significantly more likely than the others, i.e. all ∆F < 3. Overall, these simulations illustrate that model comparison can be employed to investigate cortical thickness. However, this requires relatively high SNR of 0dB.

Influence of Dipole Properties
In order to determine how properties of the dipole pair impact on a successful layer differentiation, we conducted simulations with certain properties changed. In particular, we varied four different quantities: (i) cortical thickness, (ii) dipole orientation, and (iii) the distance to the sensors, i.e. whether the source was closer to or further away from the skull and, therefore, the closest sensor. For all of these properties, we investigated how much they influenced model discrimination. While in previous simulations the noise was always defined relative to the signal, we adopted a different strategy here. In order to avoid the possibility that changes in signal strength due to changing dipole properties would lead to changes in noise, a fixed noise level was added to all simulations discussed in this paragraph. Concretely, the noise level was chosen to yield an SNR of 0dB when considering the dipole pair used in Figure   4. The same amount of noise was then added to all simulations, potentially leading to different SNRs, e.g. when the current dipole was placed further away from the skull.
Cortical thickness was defined by changing the distance between the deep and superficial dipole for both generation of data as well as model inversion. In simulations, cortical thickness varied from 0 to 4 millimeters. The thicker the cortex, the stronger the LC model was favored over the LI model. For the given noise level, a significant layer discrimination occurred for a cortical thickness above 1 mm. The orientation of the dipole was defined as the angle between the dipole and the radial, i.e. the vector connecting the centre of the volume conductor with the position of the dipole. It was varied from 0° to 90° in steps of 6°. As expected for MEG dipoles the difference in free energy between models increased with increasing angle. In other words, discriminability between layered models is difficult in patches of cortex that are oriented in parallel to the sensor grid (i.e. with current dipoles oriented towards the sensor grid), while sources from sulci where dipoles are oriented in parallel to the sensor grid will facilitate layered model discriminability. While the two first parameters (dipole distance and orientation) correspond mainly to anatomical features of cortex, the distance to the closest sensor can be strongly influenced by how sensors are arranged around the head.
When varying the distance to the closest sensor, the dipole pair location was changed along the line connecting the origin in MNI space with BA3a. The distance between the dipoles (cortical thickness) was set to 2 mm and the dipole angle was kept as in the main simulations.
As expected, we observed that the closer the dipole to the sensors, the more successful layer differentiation became. Hence, for discrimination between different layered models, it is highly advantageous to measure close to the sources, i.e. bring the sensors as close as possible to the skull. This would be the case, for example, in MEG measurements using optically pumped magnetometers (OPMs; Tierney et al., 2019).

Dipole Direction Estimation
In this section, we demonstrate, using simulated data, how the combination of a temporal and spatial model facilitates estimating the correct direction of cortical current flow. This is a longstanding problem in M/EEG as data can often be equally well explained by one of two sources on either side of a sulcus each with different polarity. We assumed that the simulated dipole pair was located in the bank of a sulcus and tested whether it could be distinguished from a second dipole pair at a distance of 7 mm at the opposite side of the sulcus (Fig. 7A). Hence, all dipoles were oriented along the same axis but with different polarities. Data were simulated with the first dipole pair assuming a separation of 2mm and both sources pointing towards the pial surface. We then tested four different models to explain the data. Two of the models were placed at the correct location but with the dipoles pointing either towards the pial surface (pial positive) or away from the pial surface (pial-negative). The other two models were placed at the opposite bank of the sulcus and also had dipoles either oriented towards or away from the pial surface. For all four models, we inverted the data for cortical thicknesses between -4mm (layer inverted) and 4mm.
The results of this simulation are depicted in Figure 7. The free energy difference relative to the (true) generating model is shown for all models as a function of the cortical thickness of the estimated model. Models with the correct dipole location and pial-positive dipole orientation performed much better than all other models, independently of the estimated cortical thickness.
Examining the competing models more closely, it becomes obvious that a dipole location error of 7mm was much less severe than getting the generator polarity wrong (compare the cyan and violet curves in Figure 7). It is worth noting that for the wrong position, the model with pialnegative dipoles points into the same direction as the generating model (pial-postivie dipoles at true position) because the orientation of cortex is inverted on the other side of the sulcus.
As a consequence, the LI model is favoured over the LC model in this scenario. Anatomical information therefore considerably reduces the number of possible source locations as source models situated on half of the sulcal walls will not be fitted well given the temporal model's prediction. The temporal model can thus act as a constraint to help enforce that the sign of the dipole orientation, in particular of the dominating dipole, has been chosen correctly.

Figure 7:
Comparing layered circuits across a sulcus. A: Illustration of a simulated scenario. We simulated the problem of distinguishing two layered microcircuits located in the two banks of a sulcus. The correct location used for generating the data was on the right (red and magenta rectangles). The wrong location of the microcircuit was located in the other bank of the sulcus 7 mm away (blue and cyan). For each dipole location we introduce a model with correctly orientated, pial-positive (pp) dipole moments (red and blue) and inverted, pial-negative (pn) dipole moments (magenta and cyan). B: Free energy (F) difference of all inverted models compared to the generating model (red, dipole distance: 2mm). Colours as in panel A. Mean (solid lines) free energy difference and standard deviations (dashed lines) over 20 simulations with SNR = 0dB are shown. The dashed vertical line indicates the true (i.e. generating) dipole distance. Note the separation and change in scale between the three segments of the y-axis.

Reducing the number of sensors.
So far, we used data from all 274 sensors for model inversion. While this includes all possible information it makes simulations and model inversion slow. Furthermore, many sensors are quite far away from the dipoles and do not contribute much information because their signal is dominated by noise. For the inversion of real data this could potentially become a problem because a model predicting temporal traces that are 0 everywhere would explain a large part of the data, namely all sensors far away from the current moment. In addition, recent advances in the development of OPMs have led to a new generation of flexible MEG sensors (Boto et al., 2018(Boto et al., , 2017 and make it possible to place sensors in specific positions. Here, we investigate how using only a subset of the sensor traces influences model comparison. Simulations were conducted with the standard dipole in BA3 and an SNR of 0 dB as defined above. The results are plotted in Figure 8. We slowly reduced the number N of sensors using two different approaches. First, we always selected the N dipoles with the highest average (between superficial and deep dipole) lead field potential (blue bars in Figure 8). Second, we selected the N dipoles with the largest difference between the two dipole sources (red bars in Figure 8).
Generally, free energy differences are larger (and model selection is thus more robust) the more sensors are used. If the sensors with the highest summed lead field are used, average ∆F remains around 3 up to N = 17 sensors. With more than 17 sensors, free energy differences rise clearly above this level. When the sensors with the highest difference between upper and lower layer lead fields are used, free energy differences remain at a level slightly above 3 up to N = 29 and then start to rise. Hence, in order to clearly separate the LC model from the LI

Discussion
In this work, we have shown that the contribution of the deep and superficial layers of a cortical column can, in principle, be discriminated using MEG combined with layered electrophysiological models of neural activity. In particular, it is possible to separate the correct layered model that generated the data from a model where the current dipoles of pyramidal neurons are swapped between superficial and deep layers. We further explored how sensitive this model selection is to anatomical details (location and orientation of the dipole) and the distribution of sensors. Overall, our simulations suggest that inference of layered models is feasible in MEG data at a signal to noise ratio that can be achieved with MEG when event related potentials are averaged over several trials. Importantly, we have shown that both temporal and spatial information improve the discrimination between competing models. In particular, the temporal model adds considerable robustness to the spatial model by constraining current flow direction and eliminating competing models on opposing sulcal walls.
In the following, we will discuss this approach in the light of alternative methods such as fMRI, highlight its limitations and future improvements, and propose applications on real data.

Measuring layered cortical activity
Layer-specific activity of evoked potentials can be measured directly with invasive electrodes with many electrical contacts and therefore many electrophysiological measurements throughout cortical depth (Javitt et al., 1996;Schroeder et al., 1995;Self et al., 2013). The resulting local field potentials or current source densities are usually directly compared between different experimental conditions, for example to investigate whether a cortical column is involved in figure ground segregation or not (Self et al., 2013), or to study layered differences of inputs into somatosensory areas (Schroeder et al., 1995). Although these invasive extra-cellular measures are far more "direct" measures than EEG/MEG, they can still be open to interpretation as they are still subject to modelling assumptions (Gratiy et al., 2011;Haegens et al., 2015). Such invasive data were used in conjunction with a two layered DCM (for oscillatory activity) to infer layered activity (Pinotsis et al., 2017). Hence, depth resolving electrophysiological recordings offer an important bridge and testbed for the method proposed here. However, with the exception of patients implanted with electrodes for presurgical localization of epileptic foci, such invasive measures are not possible in humans.
Recent advances in magnetic resonance imaging (MRI) at high-fields (for a review see van der Zwaag et al., 2016) have made it possible to measure functional MRI (fMRI) at submillimeter resolution, allowing for separating at least two cortical layers. While the initial studies focused on early visual (Koopmans et al., 2011(Koopmans et al., , 2010Siero et al., 2011) and motor areas (Huber et al., 2015;Siero et al., 2011), more recent applications have shown a variety of applications in cognitive neuroscience Finn et al., 2019;Huber et al., 2017;Kok et al., 2016;Lawrence et al., 2019;Muckli et al., 2015). In addition to these cognitive neuroscience approaches, a relationship between oscillatory activity measured with EEG and laminar fMRI has been demonstrated (Scheeringa et al., 2016).
However, there are two major challenges when analyzing and interpreting layered fMRI which measures neural activity only indirectly. First, fMRI activity is filtered by the hemodynamic response function whose dominant frequency is in the range of 0.04 Hz. This might be too slow to fully capture fast computations. Second, layered fMRI can be contaminated by blood draining effects between layers. Although these can be modeled (Havlicek and Uludag, 2019;Heinzle et al., 2016), they pose an additional complexity for the analysis and interpretation of the data.

Models of layered cortical activity
Computational models of layered cortical activity have a long tradition dating back to the introduction of a canonical microcircuit to model the activity in visual cortex after stimulation of thalamic afferents (Douglas et al., 1989;Douglas and Martin, 1991). This cortical circuit is believed to form the computational substrate for all cortical computations and can, for example, be adapted to model the monkey frontal eye fields that guide eye movements (Heinzle et al., 2007). However, this model was not directly fitted to electrophysiological data. Similarly, the detailed layered compartmental model by Jones and colleagues (Jones et al., 2007) is able to explain evoked potentials as well as oscillatory behaviour (Jones et al., 2009), but was never rigorously fitted to data. A simulation toolbox based on this model allows simulations of EEG and MEG data (Neymotin et al., 2020). In addition, as with most other modelling attempts of EEG/MEG data, the relative laminar displacements of the different neuronal populations were not considered in the forward model. Instead, it was assumed that currents of all pyramidal cells are linearly combined at a virtual point-source.
A similar approach is taken by DCM for EEG/MEG (Kiebel et al., 2009;Moran et al., 2013), where the activity of different cell populations of a canonical microcircuit is summed (with population specific weighting) to yield a dynamically changing electric or current dipole at a single position in cortex. In a recent version of DCM for fMRI (Friston et al., 2019) activity in a layered canonical microcircuit is converted to a single hemodynamic signal. Hence, while both approaches take into account models that consider different layers, the spatial information about layers is not considered in the modelling approach. This is a practical assumption for most EEG/MEG recordings where uncertainty in head or electrode position has a much greater influence on the forward model (Dalal et al., 2014;Hillebrand, 2003). The approach outlined here is a first step towards aligning the advanced temporal modelling literature with more recent spatial modelling work in which relative head-to-sensor geometry issues have been mitigated (Bonaiuto et al., 2018b(Bonaiuto et al., , 2018aTroebinger et al., 2014bTroebinger et al., , 2014b.

Limitations and potential future improvements
In this section, we discuss limitations of the current approach and suggest some potential improvements for the future. One of the key assumptions of the simulations in this paper was that the location of the dipole pair is known. Future versions could also include a forward model where the geometry of the source configuration is estimated as part of the inversion. This could attempted in a similar manner to conventional DCM, where spatial priors give initial dipole locations (Kiebel et al., 2009. We should stress that this model contains just a single additional parameter (the displacement) to conventional point source models; and the prior mean and variance of this parameter (related to cortical thickness) can be informed by the anatomy. A second important limitation is that we have so far considered only one region. In most experiments, several cortical regions will be active, and it is not evident how well the mixture of the activity in several cortical microcircuits can be separated into layered activity.
While the above points concern mainly the spatial model, one could also think of enhancing the temporal model. One obvious extension would be to use the JCM directly to fit the data.
However, this is complicated by the spiking nature of neural activity which is not suitable for the gradient based optimization in the variational Bayes approach used here.
In our simulations, we have used independently and identically distributed Gaussian noise at the sensor level. This is a simplification of the colored and possibly temporally dependent noise in real MEG data (Engemann and Gramfort, 2015). The results indicate that high SNR (-3dB to 0 dB) data is needed to robustly infer layered structure. The estimated SNR of MEG data is roughly in the range of -20 dB to -30dB (Goldenholz et al., 2009) depending on the location of the source. Averaging over 100 or even 1000 trials would bring this range to 0dB to -10dB or 10dB to 0dB, respectively. Indeed, the improved SNR afforded by head-cast MEG allows for identifying laminar-specific spectral responses in sensory and motor cortices (Bonaiuto et al., 2018a). Our approach demonstrates the feasibility for temporal and spatial DCMs for evoked responses, but in the future this approach could be extended to laminar-resolved human MEG of spectral activity.

Future applications
One of the most prominent contemporary theories of brain function is the Bayesian brain hypothesis (Friston, 2010;Knill and Pouget, 2004). It has been suggested that predictive coding, an implementation of the Bayesian brain, has a natural embedding in cortical microcircuits (Bastos et al., 2012;Mumford, 1992;Rao and Ballard, 1999). Hence, both layered fMRI (Stephan et al., 2019) and layered MEG could provide important experimental evidence in support of or against this hypothesis. One of the advantages of MEG over fMRI is its high temporal resolution. Cortical computation is relatively fast (on the order of tens to hundreds of milliseconds) which is reflected in the timing of event related responses. In order to make inferences about such fast processes, it is highly advantageous to acquire data at a high temporal resolution.
Having established the feasibility of layered DCM in MEG using simulations, the next step will be to invert the model on data from an ERP experiment. Given that the distance to the sensors is critical, the somatosensory cortex seems the most promising candidate. As a next step, one could then move to model several regions. In all these applications, it will be paramount to constrain the spatial location of the dipole pair as narrowly as possible using anatomically precise measurements. One possibility to improve SNR in MEG is the use of OPMs which allow recordings using head-mounted sensors (removing relative head-movement issues), and promise higher SNR recordings as well as potentially much longer recording times (Boto et al., 2018(Boto et al., , 2017(Boto et al., , 2016. These sensors, placed directly on the scalp, could also take advantage of the higher-spatial frequency information generated by these dipole pairs (figure 2B). We have shown here that layered circuits can be inferred with relatively few cryogenic sensors (roughly 20 in our simulations). This number could potentially be further optimized by using different sensor types of geometries.