Abstract
We developed a biophysically-detailed model of the macaque auditory thalamocortical circuits, including primary auditory cortex (A1), medial geniculate body (MGB) and thalamic reticular nuclei (TRN), using the NEURON simulator and NetPyNE multiscale modeling tool. We simulated A1 as a cortical column with a depth of 2000 μm and 200 μm diameter, containing over 12k neurons and 30M synapses. Neuron densities, laminar locations, classes, morphology and biophysics, and connectivity at the long-range, local and dendritic scale were derived from published experimental data. The A1 model included 6 cortical layers and multiple populations of neurons consisting of 4 excitatory and 4 inhibitory types, and was reciprocally connected to the thalamus (MGB and TRN) mimicking anatomical connectivity. MGB included core and matrix thalamocortical neurons with layer-specific projection patterns to A1, and thalamic interneurons projecting locally. Auditory stimulus related inputs were simulated using phenomenological models of the cochlear auditory nerve and the inferior colliculus, which served as input to MGB. The model generated cell type and layer-specific firing rates consistent with overall ranges observed experimentally, and accurately simulated the corresponding local field potentials (LFPs), current source density (CSD), and electroencephalogram (EEG) signals. Laminar CSD patterns during spontaneous activity and responses to speech input were similar to those recorded experimentally. Physiological oscillations emerged spontaneously across frequency bands without external rhythmic inputs and were comparable to those recorded in vivo. We used the model to unravel the contributions from distinct cell type and layer-specific neuronal populations to oscillation events detected in CSD, and explore how these relate to the population firing patterns. 
Overall, the computational model provides a quantitative theoretical framework to integrate and interpret a wide range of experimental data in auditory circuits. It also constitutes a powerful tool to evaluate hypotheses and make predictions about the cellular and network mechanisms underlying common experimental measurements, including MUA, LFP and EEG signals.
1. Introduction
The auditory system is involved in a number of crucial sensory functions, including speech processing (Hamilton et al. 2021), (Matsumoto et al. 2011), (Fontolan et al. 2014), (Gourévitch et al. 2008), sound localization (Andéol et al. 2011), (Carlile, Martin, and McAnally 2005), (Ahveninen, Kopčo, and Jääskeläinen 2014), pitch discrimination (Tramo, Shah, and Braida 2002), (Tramo et al. 2005), (Dykstra et al. 2012), (Hyde, Peretz, and Zatorre 2008), and voice recognition (Latinus et al. 2013), (Holmes and Johnsrude 2021). Aberrations along this pathway can result in a wide variety of pathologies. Hearing loss, for example, can result from lesions in either the peripheral (Merchant and Rosowski 2008), (Raveh et al. 2002) or central (Taniwaki et al. 2000), (Brody et al. 2013), (Cavinato et al. 2012) parts of this pathway, while other abnormalities can result in increased sensitivity to sound volume (Baguley 2003) or difficulty processing music (Zendel et al. 2015), (Albouy et al. 2013).
Achieving a full understanding of this system is complicated by the many interareal pathways, the complexity of the inter- and intralaminar circuitry, the heterogeneity of neuronal cell types and behaviors, and the diversity of network coding mechanisms. A growing body of experimental data, with findings drawn from different methods at different biological scales, begets the need for a framework which can integrate these disparate findings and be used to investigate the system as a whole. The model we present here uses multiscale information with macaque-specific cortical dimensions, a diversity of excitatory and inhibitory cell types with data-driven electrophysiology (Povysheva et al. 2007), data-driven population density and connectivity, detailed thalamic circuits (including a full thalamocortical loop) (Markov et al. 2011), and realistic inputs from upstream structures such as cochlea and inferior colliculus.
Given the multiscale detail of this model, it can also be used to make predictions about the cellular and network-level mechanisms governing oscillatory dynamics in auditory cortex, which is important since cortical oscillations are known to play a prominent role in neural information processing. In auditory cortex, these oscillations may be particularly important for speech processing (Dimitrijevic et al. 2017), (Giraud and Poeppel 2012), (Ghitza 2011), with oscillations in different frequency bands synchronizing to and tracking the dynamic properties of speech waveforms (Giraud and Poeppel 2012). In some cases, oscillatory behavior in the auditory cortex can even be used to predict speech intelligibility (Ghitza 2011), (Dimitrijevic et al. 2017). Abnormalities in auditory cortex oscillations have been observed in pathologies that include auditory processing deficits, such as schizophrenia (Y. Hirano et al. 2020), (Spencer et al. 2009), (S. Hirano et al. 2018) and autism spectrum disorder (Gandal et al. 2010), (De Stefano et al. 2019). Increased oscillatory activity at rest (Y. Hirano et al. 2020), strong cross-frequency synchronization (Spencer et al. 2009), (S. Hirano et al. 2018), and impaired phase-locking between auditory cortex oscillations and incoming speech stimuli (Gandal et al. 2010), (De Stefano et al. 2019), (Jochaut et al. 2015) have been observed in these disorders, and may help explain some of the auditory processing related deficits seen in these disease states (Spencer et al. 2009; Paciello et al. 2021; Lakatos et al. 2019; Edgar et al. 2015; Gandal et al. 2010).
Our model can be used to investigate the cellular- and network-level mechanisms underlying thalamocortical oscillations, by first reproducing the oscillations in silico and then examining the activity which occurs at different scales: from subthreshold currents and dendritic effects, to activity in different thalamic pathways (e.g. core vs. matrix). Bridging these hierarchical levels also allows us to gain insights into the biophysical mechanisms underlying activity observed during experimental recordings that occur at different scales (e.g. single cell recordings, multiunit activity, local field potentials, current source density, electro- and magneto-encephalography).
2. Results
2.1 Development of a data-driven model of macaque auditory thalamocortical circuits
We developed a biophysically-detailed model of macaque auditory thalamocortical circuits, including medial geniculate body (MGB), thalamic reticular nucleus (TRN) and primary auditory cortex (A1). To provide input to the thalamic populations, we connected a phenomenological model of the cochlear nucleus, auditory nerve and inferior colliculus (IC). This resulted in a realistic model capable of processing arbitrary input sounds along the main stages of the macaque auditory pathway (Fig. 1). While details of each stage can be found in the Methods section, the current section includes an overall description of the main features of the model.
We reconstructed a cylindrical volume of 200 μm diameter and 2000 μm depth of A1 tissue (Fig. 2). The model included 12,187 neurons and over 25 million synapses, corresponding to the full neuronal density of the volume modeled. The model was divided into 7 layers -- L1, L2, L3, L4, L5A, L5B and L6 -- each with boundaries, neuronal densities, and distribution of cell types derived from experimental data (J. A. Winer and Larue 1996; Coen-Cagli, Kanitscheider, and Pouget 2017; Markram et al. 2015; Billeh et al. 2020; Kelly and Hawken 2017; Tremblay, Lee, and Rudy 2016; Lefort et al. 2009; Schuman et al. 2019; Harris and Shepherd 2015; Huang, Larue, and Winer 1999). We included the four main classes of excitatory neurons: intratelencephalic (IT), present in all layers except L1; spiny stellate (ITS) in L4; pyramidal tract (PT) in L5B; and corticothalamic (CT) in L5A, L5B and L6. The dendritic length of cell types in different layers was adapted according to experimental data. While many previous cortical models only include one or two interneuron types, we incorporated a greater diversity of cell types by including four classes of interneurons: somatostatin (SOM), parvalbumin (PV), vasoactive intestinal peptide (VIP), and neurogliaform (NGF). All 4 classes were present in all layers except L1, which only included NGF. The MGB included two types of thalamocortical neurons and thalamic interneurons. The TRN included a population of inhibitory cells. Thalamic populations were in turn divided into core and matrix subpopulations, each with distinct wiring. The total number of thalamic neurons was 721, with cell densities and ratios of the different cell types derived from published studies.
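As a rough consistency check on these dimensions, the enclosed volume and the implied average neuron density can be computed directly. The sketch below assumes that the 200 μm figure is the column diameter, as stated in the abstract; the function names are illustrative and are not part of the model code.

```python
import math


def cylinder_volume_mm3(diameter_um, depth_um):
    """Volume of a cylindrical column, converted from cubic micrometers to cubic millimeters."""
    radius_um = diameter_um / 2.0
    volume_um3 = math.pi * radius_um ** 2 * depth_um
    return volume_um3 / 1e9  # 1 mm^3 = 1e9 um^3


def mean_density_per_mm3(n_neurons, diameter_um, depth_um):
    """Average neuron density over the whole column (ignores laminar differences)."""
    return n_neurons / cylinder_volume_mm3(diameter_um, depth_um)


# Dimensions and neuron count taken from the text; the diameter reading is an assumption.
volume = cylinder_volume_mm3(200.0, 2000.0)
density = mean_density_per_mm3(12187, 200.0, 2000.0)
print(f"volume = {volume:.4f} mm^3, mean density = {density:.0f} neurons/mm^3")
```

This is only a column-wide average; the model itself assigns layer-specific densities.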
Connectivity in the model was established for each pair of the 42 cortical and thalamic populations resulting in layer- and cell type-specific projections (Fig. 3). Each projection between populations was characterized by a probability of connection and unitary connection strength (in mV), defined as the PSP amplitude in a postsynaptic neuron in response to a single presynaptic spike. The probability of connection from cortical inhibitory populations decayed exponentially with cell-to-cell distance. Synapses were distributed along specific regions of the somatodendritic tree for each projection. Excitatory synapses included colocalized AMPA and NMDA receptors, and inhibitory synapses included different combinations of slow GABAA, fast GABAA and GABAB receptors, depending on cell types. The values for all the connectivity parameters were derived from over 30 published experimental studies (see Methods). Afferent projections from other brain regions were modeled by providing background independent Poisson spiking inputs to apical excitatory and basal inhibitory synapses, adjusted for each cell type to result in low spontaneous firing rates (∼1 Hz). Where available we used data from the NHP auditory system, but otherwise resorted to data from other species, including rodent, cat and human. We employed automated parameter optimization methods to fine tune the connectivity strengths to obtain physiologically constrained firing rates across all populations.
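The exponential decay of inhibitory connection probability with distance can be illustrated as follows. This is a minimal sketch, not the model's implementation: the functional form p(d) = p_max · exp(-d / λ), the peak probability (0.4) and the length constant (50 μm) are assumptions chosen only for illustration.

```python
import math
import random


def connection_probability(p_max, distance_um, length_constant_um):
    """Connection probability that decays exponentially with cell-to-cell distance."""
    return p_max * math.exp(-distance_um / length_constant_um)


def is_connected(pre_xyz, post_xyz, p_max, length_constant_um, rng):
    """Draw one Bernoulli sample deciding whether a pre/post pair is connected."""
    distance_um = math.dist(pre_xyz, post_xyz)
    return rng.random() < connection_probability(p_max, distance_um, length_constant_um)


rng = random.Random(0)  # fixed seed so the wiring is reproducible
p_near = connection_probability(0.4, 0.0, 50.0)   # hypothetical values
p_far = connection_probability(0.4, 50.0, 50.0)   # one length constant away
connected = is_connected((0.0, 0.0, 0.0), (30.0, 40.0, 0.0), 0.4, 50.0, rng)
print(p_near, round(p_far, 4), connected)
```

At one length constant the probability falls to 1/e of its peak, so nearby inhibitory cells are much more likely to be connected than distant ones.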
To achieve variability in the baseline model we modified the randomization seeds used to generate the probabilistic connections and spike times of Poisson background inputs. Specifically, we ran 25 simulations with different seeds (5 connectivity × 5 input seeds), each lasting 11.5 seconds (the first 1.5 seconds were required to reach steady state). This resulted in 250 seconds of simulated data, which is comparable to some of the macaque experimental datasets used. Modeling results that include statistics were calculated across all 25 of the 10-second simulations. This provided a measure of the robustness of the model to variations in connectivity and inputs, which are comparable to the variability across different macaques and trials, respectively.
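The seed grid and the resulting amount of analyzed data can be sketched as below. The seed values (0 to 4) and the dictionary keys are hypothetical placeholders; only the 5 × 5 design, the 11.5-second duration and the 1.5-second initial period come from the text.

```python
from itertools import product

CONN_SEEDS = range(5)    # hypothetical seed values for connectivity
INPUT_SEEDS = range(5)   # hypothetical seed values for background inputs
SIM_DURATION_S = 11.5
TRANSIENT_S = 1.5        # initial period needed to reach steady state

# One run per (connectivity seed, input seed) pair.
runs = [
    {"conn_seed": conn, "input_seed": inp, "duration_s": SIM_DURATION_S}
    for conn, inp in product(CONN_SEEDS, INPUT_SEEDS)
]

analyzed_per_run_s = SIM_DURATION_S - TRANSIENT_S
total_analyzed_s = len(runs) * analyzed_per_run_s
print(len(runs), total_analyzed_s)  # 25 runs, 250.0 s
```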
We developed the models using NetPyNE (Salvador Dura-Bernal et al. 2019) and parallel NEURON (Lytton et al. 2016). Overall, we ran over 500,000 simulations in order to tune the model parameters and explore model responses to different inputs and conditions. This required over 5 million core hours on several supercomputers, primarily Google Cloud Platform. All model source code, results, and comparisons to experimental data are publicly available on ModelDB and GitHub.
2.2 Cell type and layer-specific activity recorded at multiple scales
The model generated layer and cell type-specific spontaneous activity (Fig. 4). Distinct spiking patterns were recorded across thalamus and cortex (Fig. 4A): TC and TRN showed clear alpha oscillations (∼8 Hz); cortical granular and supragranular layers exhibited a similar oscillatory pattern but more diffuse over time and with higher variability in the peak amplitudes; and infragranular layers showed more tonic firing and a slower delta (∼2 Hz) oscillation. Spiking responses also varied across cell types within a layer, e.g. L5B IT cells fired tonically whereas L5B CT cells showed phasic firing at delta frequency (only 2 peaks of activity). Overall, average spontaneous firing rates were below 5 Hz for excitatory neurons and below 20 Hz for inhibitory neurons, consistent with experimental data (X. Wang et al. 2005; Sakata and Harris 2009; Eggermont 1992; Hromádka, Deweese, and Zador 2008). Spontaneous activity was simulated by driving the thalamic and cortical neurons with non-rhythmic Poisson-distributed low amplitude background inputs. Therefore, the distinct responses of neural populations must be a consequence of their heterogeneous biophysical properties and synaptic connectivity.
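A non-rhythmic Poisson background input of the kind described above can be sketched by drawing exponentially distributed inter-spike intervals. This is a generic illustration rather than the model's input code; the 20 Hz rate and 10 s duration are assumed values, and `poisson_spike_times` is a hypothetical helper.

```python
import random


def poisson_spike_times(rate_hz, duration_s, rng):
    """Homogeneous Poisson spike train: inter-spike intervals are exponential with mean 1/rate."""
    times = []
    t = rng.expovariate(rate_hz)
    while t < duration_s:
        times.append(t)
        t += rng.expovariate(rate_hz)
    return times


# Independent streams are obtained by giving each input its own seeded generator.
spikes_a = poisson_spike_times(20.0, 10.0, random.Random(1))
spikes_b = poisson_spike_times(20.0, 10.0, random.Random(2))
print(len(spikes_a), len(spikes_b))
```

Because each interval is drawn independently, the train carries no built-in rhythm, which is the property the argument about emergent oscillations relies on.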
The model responses were recorded and analyzed at multiple scales (Fig. 4): neuronal membrane voltage traces (Fig. 4C), spike times (Fig. 4A), firing rate statistics (Fig. 4B), local field potentials (LFPs) and current source density (CSD) analysis (Fig. 4D), and current dipole moments and electroencephalogram (EEG) signals (Fig. 4E). These measurements represent the same underlying biophysical phenomenon as evidenced by activity features shared across them, e.g. activity peaks around 800 ms and 1300 ms (see Figs. 4A, 4D, 4E). This illustrates how the model can be used to interpret common experimental measurements (MUA, LFP, EEG) and relate them to the underlying biophysical circuit properties. In Section 2.4 we use this approach to disentangle the layer and cell type-specific biophysical sources of an oscillation event.
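For readers unfamiliar with how a laminar CSD profile relates to LFPs, a common estimate is the negative second spatial difference of the LFP across equally spaced electrode contacts. The sketch below assumes that standard estimator, unit conductivity, and a sign convention in which sinks are negative; conventions differ between studies, and the source does not state in this section which one was used.

```python
def csd_second_difference(lfp, spacing_um):
    """Estimate CSD at interior channels from laminar LFPs.

    lfp: list of channels ordered by depth, each a list of samples.
    Returns one CSD trace per interior channel (first and last channels are dropped).
    """
    n_channels = len(lfp)
    if n_channels < 3:
        raise ValueError("at least 3 channels are needed for a second spatial difference")
    h2 = spacing_um ** 2
    csd = []
    for ch in range(1, n_channels - 1):
        above, here, below = lfp[ch - 1], lfp[ch], lfp[ch + 1]
        csd.append([-(a - 2.0 * v + b) / h2 for a, v, b in zip(above, here, below)])
    return csd


# Toy data: a negative LFP deflection confined to the middle of three channels.
toy_lfp = [[0.0, 0.0], [-1.0, 0.0], [0.0, 0.0]]
toy_csd = csd_second_difference(toy_lfp, spacing_um=100.0)
print(toy_csd)
```

In this toy case the middle channel yields a negative CSD value at the first sample, which is a sink under the convention assumed here.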
To examine patterns of LFP/CSD activity in NHP recordings, we first determined the supragranular, granular, and infragranular layer depths for each macaque subject, as done previously (Lakatos et al. 2016). In macaques, the determination of the supragranular, granular, and infragranular layer depths relied on functional demarcation of these regions based on responses to preferred modality stimuli. For each NHP subject, we examined an averaged CSD profile resulting from the presentation of a stimulus which provoked an excitatory response in A1 (e.g. clicks, best frequency tones). An early sink in this CSD profile indicated the presence of the granular layer, while source / sink pairs above and below the granular layer designated the presence of the supragranular and infragranular layers, respectively.
Characteristic laminar LFP/CSD activity patterns recorded experimentally in macaques were qualitatively reproduced in model simulations (Fig. 5). For example, during spontaneous activity we observed examples in both experiment and model showing: 1) ∼50 ms long current sinks (red) in the granular layer together with current sources (blue) immediately above and below, plus current sources (blue) in the most superficial electrodes (Fig. 5A); 2) ∼150 ms long current sinks fluctuating around the border of the granular and infragranular layers with current sources immediately below, and again in the most superficial electrodes (Fig. 5B); and 3) ∼150 ms long current sources in the granular layer with current sinks above and below (Fig. 5C). While reproducing responses to specific speech utterances is outside the scope of this paper, examples of laminar LFP/CSD responses to speech are shown in Figure 5 and illustrate that the cochlea and IC model can be used to provide auditory stimuli to the biophysical thalamocortical model, which in turn generates activity patterns that resemble those recorded experimentally. These include ∼150-200 ms long current sinks in the granular layer, with alternating current sources and sinks in the infragranular layers (Figs. 5D,E); and a short ∼30 ms current source in the granular layer with similar duration current sources in the supra- and infragranular layers (Fig. 5F). Although there are similarities, these three examples of spontaneous and speech responses also illustrate the variability observed both within and between the experimental and modeling datasets. In the next section, we further quantify this variability in terms of the oscillatory power of spontaneous responses.
2.3 Emergence of spontaneous physiological oscillations across frequency bands
Physiological oscillations across a range of frequency bands were observed in both the macaque and model thalamocortical circuits. In the model, these oscillations emerged despite having no oscillatory background inputs, suggesting they resulted from the cellular intrinsic biophysics and circuit connectivity. We quantified the power spectral density (PSD) of 10-second LFPs recorded from different macaques and from the model (Fig. 6). These results illustrate the high variability of spontaneous responses measured within and across macaques, which was comparable to that generated by the model. Despite this variability, the model exhibits features similar to those observed consistently across macaques, including peaks at delta, theta/alpha and beta frequencies. To quantify these similarities and establish whether the LFP PSD generated by our model could be distinguished from that of macaques, we performed principal component analysis (PCA) (Fig. 6B). The first two principal components explained a large proportion of the variance (PC1=57%, PC2=14%). The cluster of model data points partly overlapped (10/25 data points) with those of macaques 1 and 2, rendering them indistinguishable via PCA (circled green points in Fig. 6B). Furthermore, the mean PCA distance between the macaque 2 and 3 clusters is greater (2.2) than between the macaque 2 and model clusters (1.3). Further validation was provided by plotting a shuffled version of the model LFP PSDs, which appeared as a clearly separate cluster with no overlap with the macaque data (except for 1 outlier from macaque 3). The correlation matrix (Fig. 6C) across all LFP PSDs shows a much stronger correlation between the model and macaque than between shuffled model and macaque (0.31 vs 0.006).
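The kind of pairwise comparison summarized by a correlation matrix across PSDs can be sketched as follows. This is a generic Pearson correlation over toy spectra; the numbers are invented for illustration, and the source does not specify the exact correlation measure or preprocessing used for Fig. 6C.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def correlation_matrix(spectra):
    """All pairwise correlations between spectra sampled on the same frequency bins."""
    return [[pearson(a, b) for b in spectra] for a in spectra]


# Toy spectra (arbitrary units): two with similar shape, one with bins rearranged.
toy_spectra = [
    [4.0, 3.0, 2.0, 1.0],
    [8.0, 6.0, 4.0, 2.5],
    [1.0, 4.0, 2.0, 3.0],
]
matrix = correlation_matrix(toy_spectra)
for row in matrix:
    print([round(value, 3) for value in row])
```

Spectra with the same shape correlate strongly regardless of overall amplitude, while rearranging the frequency bins, as in a shuffled control, destroys the correlation.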
Individual oscillation events were detected in current source density (CSD) data from resting state recordings gathered in silico from the A1 model and in vivo from non-human primates (NHP), using software that has previously been used to detect and quantify features of oscillation events in human and NHP electrophysiology recordings (S. A. Neymotin et al. 2020). Once identified, oscillation events were classified according to frequency band: delta (0.5-4 Hz), theta (4-9 Hz), alpha (9-15 Hz), beta (15-29 Hz), gamma (30-80 Hz). Oscillation events were then sorted once more based on their laminar location, in either the supragranular, granular, or infragranular layers. We were thus able to compare model and NHP oscillation events that occurred in the same regions of the cortical column, within the same frequency band. Examples of matching individual oscillation events from each frequency band are shown in Figure 7 (note that the theta and alpha oscillation events are from supragranular layers, the beta and delta events are from infragranular layers, and the gamma event is from the granular layer).
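The two-step sorting described above (by frequency band, then by laminar location) can be sketched as below. The band limits are those listed in the text; treating each band as a half-open interval, leaving frequencies between 29 and 30 Hz unassigned, and the event field names are assumptions made for this illustration and do not describe the detection software.

```python
from collections import defaultdict

# Band limits in Hz, as listed in the text.
BANDS = [
    ("delta", 0.5, 4.0),
    ("theta", 4.0, 9.0),
    ("alpha", 9.0, 15.0),
    ("beta", 15.0, 29.0),
    ("gamma", 30.0, 80.0),
]


def classify_band(peak_hz):
    """Return the band whose half-open range [low, high) contains peak_hz, else None."""
    for name, low, high in BANDS:
        if low <= peak_hz < high:
            return name
    return None


def group_events(events):
    """Group events by (band, laminar location); events outside all bands are skipped."""
    groups = defaultdict(list)
    for event in events:
        band = classify_band(event["peak_hz"])
        if band is not None:
            groups[(band, event["layer"])].append(event)
    return dict(groups)


# Hypothetical events with made-up values.
events = [
    {"peak_hz": 5.5, "layer": "supragranular"},
    {"peak_hz": 2.0, "layer": "infragranular"},
    {"peak_hz": 40.0, "layer": "granular"},
    {"peak_hz": 29.5, "layer": "granular"},
]
grouped = group_events(events)
print(sorted(grouped))
```

Grouping on the (band, layer) pair is what allows model and NHP events to be compared like for like.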
Several oscillation event features were used to compare across model and NHP data from different frequency bands, including temporal duration, peak frequency, and number of cycles in the oscillation (see Fig. 8). Overall, these three features showed similar average values and overlapping distributions when compared across the model and NHP, and across cortical layers. Duration was the most consistent value, with close average values across model and NHP at all frequency bands (t-test, p>0.05). Average peak frequency was the same for most frequency bands (t-test, p>0.05), with the exception of: 1) theta, which showed a slightly lower average frequency compared to the NHP data (t-test, p<0.05), and 2) gamma, which showed a slightly higher average frequency compared to NHP (t-test, p<0.05). Similarly, the number of cycles per oscillation event was on average the same across frequency bands (t-test, p>0.05), with the exception of gamma, which showed a slightly higher average value (t-test, p<0.05). The minor discrepancies in average values may however represent an artifact due to the short overall duration of simulations (10 sec), compared to the duration of NHP recordings, which were on the order of minutes.
2.4 Unraveling the biophysical mechanisms underlying physiological oscillations at the cellular and circuit scales
After verifying that the oscillation events detected in the model data were comparable to the events observed in the NHP data, we used the model to examine the network- and population-level activity occurring during these oscillation events. This illustrates one of the advantages of the model. Not only were we able to obtain the overall LFP data, but the biological detail of the model also allowed us to record and examine each population’s contribution to the overall LFP signal. Additionally, we were able to examine the spiking activity of each population at the time of each event, similar to multiunit activity observed during neurophysiological recordings but with additional cell-type specificity.
Figure 9 illustrates this approach using a physiologically realistic oscillation event detected in the simulated A1 column supragranular layer (Fig. 9A, also seen in Fig. 7D, left). To examine the activity occurring in the network during this oscillation event, we first examined the LFP signal from each cell population over all electrodes (Fig. 9B). This information showed that the intratelencephalic cells from layers 3 (IT3) and 5A (IT5A), along with the pyramidal tract population in layer 5B (PT5B), made the strongest contributions to the overall LFP signal during this event. Interestingly, the dominant LFP frequency of each of these populations was more broadly distributed in time, and centered at ∼7 Hz, slightly higher than the theta event’s peak frequency of 5.5 Hz. The slower peak frequency detected in the theta event could arise from phase shifts between the individual population LFPs, each of which is integrated in the supragranular CSD signal from which the event was extracted. The corresponding spiking activity was then examined for each of these cell populations (Fig. 9C, D, E, bottom panels). Notably, the low-frequency theta event coincided with higher frequency gamma events in both the LFP spectrograms and the spike rate spectrograms for all populations involved. In each population, these gamma events also occurred during periods when the population spike rates were elevated, consistent with the observation that elevated excitatory neuron firing leads to detectable gamma signals (Leszczyński et al. 2020). In addition, there was evidence of spike/field coherence, seen in the peak firing times of IT3 neurons coinciding with troughs in the LFP theta rhythm.
Overall, the presence of coincident theta and gamma demonstrates a cross-frequency interaction often observed in neural oscillations (Jensen and Colgin 2007), and highlights how the model can be used to make predictions on the origins of these complex dynamics observed in auditory cortex in vivo (O’Connell et al. 2015).
3. Discussion
3.1. Key findings and novelties
We have developed the first detailed multiscale model of macaque auditory thalamocortical circuits, including MGB, TRN and A1, and validated it against in vivo experimental data. The model integrates experimental data on the physiology, morphology, biophysics, density, laminar distribution and proportion of different cell types, as well as their local and long-range synaptic connectivity (Figs. 1-3). Realistic auditory inputs can be provided to the thalamus via a phenomenological model of the cochlear nucleus and IC. The model generates cell type and layer-specific firing rates consistent with overall ranges observed experimentally, and can accurately simulate the corresponding measures at multiple scales: local field potentials (LFPs), laminar current source density (CSD) analysis, and electroencephalogram (EEG) signals (Fig. 4). We identified multiple laminar CSD patterns during spontaneous activity and responses to speech similar to those recorded experimentally (Fig. 5). Physiological oscillations emerged across frequency bands without external rhythmic inputs and were comparable to those recorded spontaneously in vivo: despite significant variability across animals and time, the spectral power showed peaks at delta, theta/alpha and beta frequencies (Fig. 6). We analyzed individual CSD oscillation events and identified closely matching examples across all frequency bands (Fig. 7). Broadly, the statistics on simulated CSD oscillation event duration, peak frequency and number of cycles across layers and frequency bands were consistent with those reported in vivo (Fig. 8). We used the model to make predictions about the contributions from the distinct neural populations to specific oscillation events and their potential cross-frequency interactions (Fig. 9). Notably, the model highlights the importance of disentangling individual neuronal population contributions to oscillatory activity: the event detected (Fig. 9) displayed an apparent peak at theta, which the model suggests is the integrated contribution of phase-shifted faster (∼7 Hz) population level activity. This also underlines the importance of modeling to help interpret the basic properties of electrophysiology data recorded in vivo.
Although circuit models of similar size and complexity have been developed, these have largely focused on rodent visual (Billeh et al. 2020) and somatosensory (Markram et al. 2015) cortices. A highly-detailed rat somatosensory cortex model was used to study stimulus specific adaptation in the auditory cortex by modifying thalamic inputs (Amsalem et al., n.d.), but the overall model cortical architecture and connectivity was not adapted to replicate the particularities of auditory circuits or the macaque species. Compared to our model, previous models of auditory cortex lack significant detail in terms of neuron model complexity, range of cell types, neuronal density and distribution, and/or circuit connectivity (Park and Geffen 2020; Stanley et al., n.d.; Loebel, Nelken, and Tsodyks 2007; Zulfiqar, Moerel, and Formisano 2019; Kudela et al. 2018).
In short, the main novelties of our model are that it: incorporates available data specific to the macaque species and auditory cortex; includes a wide range of cortical and thalamic excitatory and inhibitory cell types; has cell type and layer-specific synaptic connectivity, including bidirectional thalamic connections with distinct core and matrix projections; can simulate realistic auditory inputs through a cochlear and inferior colliculus model; can generate realistic multiscale measures, including spiking activity, LFP, CSD, and EEG; and recapitulates a range of macaque A1 in vivo results.
3.2. Challenges and limitations
The model is necessarily incomplete and inaccurate due to gaps in experimental data and in our theoretical understanding of biological principles. This is particularly true for the NHP auditory system, which has been less studied and is not as well characterized as, for example, the rodent visual system. More specifically, the availability of electrophysiological and connectivity data from the macaque auditory system for the different cortical and thalamic cell types was limited, so, when required, we used data from other macaque regions or from other mammalian auditory systems. Validating the layer and cell type-specific firing rates was also challenging due to lack of macaque A1 data, thus many of the comparisons to experiments rely on the readily available laminar LFP and CSD measures. Despite these limitations, we believe our model incorporates more properties specific to macaque A1 than any previous model. Furthermore, it can be iteratively improved and further validated as new and more precise data becomes available.
Generating physiologically constrained firing rates in all model populations required parameter tuning (also referred to as parameter fitting or optimization) of the connection strengths within biologically realistic ranges. This process was particularly challenging in this model, e.g. compared to our previous motor cortex model (Subhashini Sivagnanam et al. 2020), and required developing and iteratively improving our automated parameter optimization methods. We believe the reason for this is the additional inhibitory cell types (VIP, NGF) and thalamic circuits, including the complex recurrent intracortical and thalamocortical interactions. The optimization methods resulted in a range of distinct model parameter combinations that produced valid network dynamics -- this is known as parameter degeneracy. It is well known that biological neural circuits exhibit this same property: different combinations of neuron intrinsic and synaptic properties -- each varying up to several orders of magnitude -- can result in the circuit exhibiting the same physiological and functional outcome (Prinz, Bucher, and Marder 2004).
The relatively narrow diameter (200 μm) of our simulated cortical column did not allow for a detailed implementation of the tonotopic organization of thalamic inputs. Nonetheless, the A1 column was tuned to a specific best frequency, as determined by the filtering of inputs through the cochlear and IC model, and A1 received a realistic number of afferent core and matrix thalamic inputs, with layer and cell type specificity. Future model versions can be extended to have a larger diameter column or multiple columns, each receiving distinct thalamic projections, enabling the study of circuit mechanisms that support frequency discrimination of auditory stimuli. Hence, in this study, we did not attempt to reproduce speech responses in detail, and instead focused on reproducing features of spontaneous activity, including the high variability observed experimentally.
We simulate, for the first time, EEG signals based on the current dipoles of individual neurons in a realistic model of macaque auditory cortical circuits. Calculating the voltage at the different scalp electrodes requires a realistic head volume conduction model. Unfortunately, we did not find a macaque head model, and had to use the standard human head model available within the LFPy tool (Hagen et al. 2018). This served as a proof-of-concept of the multiscale capabilities of our model.
3.3. Outlook on research and clinical applications
Overall, the computational model provides a quantitative theoretical framework to integrate and interpret a wide range of experimental data, generate testable hypotheses and make quantitative predictions. It constitutes a powerful tool to study the biophysical underpinnings of different experimental measurements, including LFP, EEG, and MEG (Samuel A. Neymotin et al. 2020). This theoretical framework represents a baseline model that can be updated and extended as new data becomes available. Ongoing efforts by the BRAIN Initiative Cell Census Network (BICCN) and others may soon provide a cell census of the mammalian auditory cortex, similar to that recently made available for the motor cortex (BRAIN Initiative Cell Census Network (BICCN) 2021). Our model is fully open source and implemented using the NetPyNE tool (Salvador Dura-Bernal et al. 2019), which was explicitly designed to facilitate integration of experimental data through an intuitive language focused on describing biological parameters. This will enable other researchers to readily adapt the model to reproduce experimental manipulations, e.g. chemogenetic or pharmacologic interventions, or dynamics associated with different brain diseases. Work is already ongoing to adapt the model to study the EEG correlates of schizophrenia in A1 (Metzner and Steuber 2021), and to evaluate a novel LFP recording device (Abrego et al. 2021). To facilitate interoperability with other tools, NetPyNE can also export the model to standard formats, such as NeuroML (Gleeson et al. 2019) and SONATA (Dai et al. 2020), making it widely available to the community. This is also the first detailed circuit model to incorporate naturalistic auditory inputs, allowing future research linking structure, dynamics and function, and providing insights into neural representations during naturalistic stimulus processing. 
Given the general similarities between NHP and human thalamocortical circuitry (Herculano-Houzel 2009; Passingham 1973), this data-driven model has high translational relevance: it can start to bridge the gap across species and offer insights into healthy and pathological auditory system dynamics in humans.
4. Methods
We developed a model of the macaque auditory system consisting of a phenomenological model of cochlea and IC, and biophysically-detailed models of auditory thalamic and cortical circuits (Fig. 1). We validated the model against macaque in vivo experimental data. This section details the modeling, experimental and analysis methods used.
4.1 Single neuron models
Morphology and physiology of neuron classes
The network includes conductance-based cell models with parameters optimized to reproduce physiological responses. We used simplified morphologies of 1-6 compartments for each cell type, and sized dendritic lengths to match macaque cortical dimensions (Oliver et al. 2018; Figs 8.3/4.4). We fitted the electrophysiology of each cell type to extant electrophysiology data from macaque when available, or from other animal models when it was not. Passive parameters, such as membrane capacitance, were tuned to fit the resting membrane potential (RMP) and other features of subthreshold traces (e.g. sag from hyperpolarization). Active parameters, such as the fast sodium channel density, were tuned to reproduce characteristics like oscillatory bursting and the firing rate curve.
Within the A1 network, we modeled four classes of excitatory neurons: the intratelencephalic spiny stellate (ITS), intratelencephalic pyramidal (IT), pyramidal tract (PT) and corticothalamic (CT). These were distributed across the six cortical layers. The ITS model consisted of 3 compartments (a soma and 2 dendrites), and was adapted from a previously published Layer 4 spiny stellate model (Mainen and Sejnowski 1996). There is evidence for the presence of stellate cells in A1 in mammals, including rodents, rabbits, bats, cats and humans (Meyer, González-Hernández, and Ferres-Torres 1989; Oliver et al. 2018; Harris and Shepherd 2015; Y. Wang, Brzozowska-Prechtl, and Karten 2010), although in some species these were relatively rare compared to visual and somatosensory cortices. Several macaque studies also mention the role of A1 L4 stellate cells in receiving input from thalamus (Steinschneider et al. 1998, 1992; Fishman et al. 2000). The IT pyramidal (ITP), PT, and CT cell models were each composed of 6 compartments: a soma, axon, basal dendrite, and 3 apical dendrites. These models were based on previous work (Samuel A. Neymotin et al. 2017), in which simplified cell models were optimized to reproduce subthreshold and firing dynamics observed in vivo (Suter, Migliore, and Shepherd 2013; Yamawaki et al. 2014; Oswald et al. 2013). Apical dendrite lengths were modified to match macaque cortical dimensions and layer-specific connectivity requirements. The classification of cortical neurons into IT, PT and CT is based not only on their projection targets, but also on their local connectivity, laminar location, morphology, intrinsic physiology and genetics (Harris and Shepherd 2015; Shepherd and Yamawaki 2021). Although the PT terminology may be confusing for A1, this cell class refers to subcerebral projection neurons, including brainstem, and has been previously used for non-motor cortical regions (A1, V1, S1) (Harris and Shepherd 2015; Shepherd and Yamawaki 2021; Baker et al. 2018). PT neurons have also been labeled as ‘extratelencephalic’ (ET), but this does not distinguish them from the also extratelencephalic CT neurons. In A1, a category of neurons described as ‘large pyramidal cells’ overlaps significantly with the features of the PT cell class: they mostly occupy L5B, have thick-tufted morphologies reaching up to L1, are intrinsically bursting and project to brainstem, including inferior colliculus (IC), superior olivary complex and the cochlear nuclear complex (Budinger and Kanold 2018; Jeffery A. Winer et al. 2002; Baker et al. 2018; Jeffery A. Winer and Schreiner 2010).
Four classes of inhibitory neurons (NGF, SOM, PV, VIP) were also simulated in the A1 network model. The vasoactive intestinal peptide (VIP) cell model utilized a previously published 5-compartment model (Turi et al. 2019), as did the 3-compartment somatostatin (SOM) and parvalbumin (PV) interneurons (Konstantoudaki et al. 2014). Parameters such as dendritic length were modified to better fit extant cortical data regarding rheobase and f-I curve (Tripathy et al. 2015). The neurogliaform (NGF) cell model was adapted from an existing model in rodent (Bezaire et al. 2016), with soma compartment size modified to more closely match the geometry of NGF cells in monkeys (Povysheva et al. 2007). Channel mechanisms, including A-type potassium and Ih currents, were also added to the soma compartment to replicate the electrophysiological characteristics (e.g. sag, f-I curve) described for these cell types in the literature (Povysheva et al. 2007).
In thalamus, the modeled MGB contained thalamocortical (TC) cells, high-threshold thalamocortical cells (HTC), and local thalamic interneurons (TI). The TC and HTC cells were both single-compartment models capable of tonic and burst firing (Iavarone et al. 2019), with the HTC model having the addition of a high-threshold T-type channel mechanism (Vijayan and Kopell 2012). The locally inhibitory TI cells had 2 compartments (a soma and a dendrite), and were fitted to in vitro electrophysiology data recorded from lateral geniculate nucleus (Zhu, Uhlrich, and Lytton 1999a, 1999b; Zhu, Lytton, and Xue 1999). These cells were optimized to reproduce the oscillatory bursting observed in this cell type (Zhu, Lytton, and Xue 1999). The thalamic reticular nucleus (TRN) contained the single-compartment inhibitory reticular (IRE) cells, with parameters also optimized to display this cell type’s characteristic intrinsic rhythmicity (Destexhe et al. 1994, 1996).
4.2 Thalamocortical circuit model populations
Auditory thalamus
Our auditory thalamus model included the medial geniculate body (MGB) and the thalamic reticular nucleus (TRN). The MGB was composed of two types of thalamocortical neurons (TC, HTC) and thalamic interneurons (TI). TRN was composed of reticular nucleus cells (IRE). The overall proportion of excitatory to inhibitory neurons was 3:1. For the TC, TI and IRE cell types we included two separate populations in order to capture the distinct connectivity patterns of the core vs matrix thalamic circuits. Matrix populations were labeled with an ‘M’ at the end: TCM, IREM, TIM. The proportion of core to matrix neurons was 1:1. The density and ratio of the different thalamic populations were based on experimental data (J. A. Winer and Larue 1996; Huang, Larue, and Winer 1999). The resulting ratio of thalamic to cortical neurons was 1:17, consistent with published data (Coen-Cagli, Kanitscheider, and Pouget 2017).
Auditory cortex
We modeled a cylindrical volume of the macaque primary auditory cortex (A1) with a 200 µm diameter and 2000 µm height (cortical depth) including 12,187 neurons and over 25 million synapses (Fig. 2). The cylinder diameter was chosen to approximately match the horizontal dendritic span of a neuron located at the center, consistent with previous modeling approaches (Markram et al. 2015; Billeh et al. 2020). Cortical depth and layer boundaries were based on published macaque data (Kelly and Hawken 2017; Tremblay, Lee, and Rudy 2016). The model includes 36 neural populations distributed across the 6 cortical layers and consisting of 4 excitatory (IT, ITS, PT, CT) and 4 inhibitory types (SOM, PV, VIP and NGF). Details of the biophysics and morphology of each cell type are provided in the “Single neuron models” section above. The laminar distribution, cell density and proportion of each cell type were based on experimental data (Kelly and Hawken 2017; Lefort et al. 2009; Schuman et al. 2019; Harris and Shepherd 2015). Layer 1 included only NGF cells. Layers 2 to 6 included IT, SOM, PV, VIP and NGF cells. Additionally, ITS cells were added to layer 4; PT cells to L5B; and CT cells to layers 5A, 5B and 6. The resulting number of cells in each population depended on the modeled volume, layer boundaries and neuronal proportions and densities per layer.
4.3 Thalamocortical circuit model connectivity
Connectivity parameters: connection probability and weight
We characterized connectivity in the thalamocortical circuit using two parameters for each projection: probability of connection and unitary connection strength. Probability of connection was defined as the probability that each neuron in the postsynaptic population was connected to a neuron in the presynaptic population. For example, if both pre- and postsynaptic populations have 100 neurons, a probability of 10% will result in an average of 1,000 connections (10% of the total 10,000 possible connections). The set of presynaptic neurons to connect to was randomly selected, and autapses and multapses were not allowed. Given that the neuronal morphologies were simplified to 6 or fewer compartments, we used a single synaptic contact for each cell-to-cell connection.
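The probability-of-connection rule can be illustrated with a minimal sketch. It assumes that each pre/post pair is tested independently against the connection probability, which gives at most one connection per pair (no multapses); the function name and the `same_population` flag are hypothetical and are not part of NetPyNE.

```python
import random

def connect_populations(n_pre, n_post, prob, same_population=False, seed=0):
    """Return a list of (pre, post) index pairs drawn with probability `prob`.

    Hypothetical helper: each pair is tested once, so multapses cannot occur.
    When pre and post are the same population, self-connections (autapses)
    are skipped.
    """
    rng = random.Random(seed)
    conns = []
    for post in range(n_post):
        for pre in range(n_pre):
            if same_population and pre == post:
                continue  # no autapses
            if rng.random() < prob:
                conns.append((pre, post))
    return conns

# 100 x 100 neurons at 10% probability: about 1,000 connections on average
conns = connect_populations(100, 100, 0.10, seed=1)
print(len(conns))
```

The exact count varies from seed to seed around the expected value of 1,000.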
Unitary connection strength was defined as the EPSP amplitude in response to a spike from a single presynaptic neuron. Given that synaptic weights in NEURON are typically defined as a change in conductance (in µS), we derived a scaling factor to map unitary EPSP amplitude (in mV) to synaptic weights. To do this we simulated an excitatory synaptic input at each neuron segment, adjusted to generate a somatic EPSP of 0.5 mV. We then calculated a scaling factor for each neuron segment that converted the EPSP amplitude (mV) values used to define connectivity in NetPyNE into the corresponding NEURON synaptic weights (in µS). This resulted in the somatic EPSP response to a unitary connection input being independent of synaptic location, also termed synaptic democracy (Poirazi and Papoutsi 2020). This is consistent with experimental evidence showing synaptic conductances increased with distance from soma, to normalize somatic EPSP amplitude of inputs within µm of soma (Magee and Cook 2000). We thresholded dendritic scaling factors to 4x that of the soma to avoid overexcitability in the network in cases when neurons receive hundreds of inputs that interact nonlinearly (Spruston 2008; Behabadi et al. 2012).
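The per-segment scaling and the 4x cap can be sketched as follows. This is an illustration only: it assumes that, for each segment, the synaptic weight (µS) producing the 0.5 mV reference somatic EPSP is already known, and that the mapping from EPSP amplitude to weight is linear. The segment names and weight values are invented for the example.

```python
def scaling_factors(weight_for_ref_epsp, ref_epsp_mv=0.5, soma_key="soma", max_ratio=4.0):
    """Map each segment to a factor (µS per mV) converting EPSP amplitude to weight.

    weight_for_ref_epsp: {segment: weight in µS that yields a somatic EPSP of
    ref_epsp_mv}. Dendritic factors are capped at max_ratio times the somatic one.
    """
    soma_factor = weight_for_ref_epsp[soma_key] / ref_epsp_mv
    cap = max_ratio * soma_factor
    return {seg: min(w / ref_epsp_mv, cap) for seg, w in weight_for_ref_epsp.items()}

# Hypothetical values: distal inputs need larger weights for the same somatic EPSP
measured = {"soma": 0.0005, "apic_0": 0.0010, "apic_2": 0.0030}
factors = scaling_factors(measured)

# Weight (µS) for a 1.2 mV unitary connection placed on each segment
weights = {seg: 1.2 * f for seg, f in factors.items()}
print(weights)
```

Here the most distal segment would need 6x the somatic factor, so it is clipped to 4x.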
Types of synapses
Excitatory synapses consisted of colocalized AMPA (rise, decay τ: 0.05, 5.3 ms) and NMDA (rise, decay τ: 15, 150 ms) receptors, both with reversal potential of 0 mV. The ratio of NMDA to AMPA receptors was 1:1 (Myme et al. 2003), meaning their weights were each set to 50% of the connection weight. NMDA conductance was scaled by 1/(1 + 0.28 · Mg · exp(−0.062 · V)) with Mg = 1 mM (Jahr and Stevens 1990). Inhibitory synapses from SOM to excitatory neurons consisted of a slow GABAA receptor (rise, decay τ: 2, 100 ms) and a GABAB receptor, with a 9:1 ratio; synapses from SOM to inhibitory neurons only included the slow GABAA receptor; synapses from PV consisted of a fast GABAA receptor (rise, decay τ: 0.07, 18.2 ms); synapses from VIP included a different fast GABAA receptor (rise, decay τ: 0.3, 6.4 ms) (Pi et al. 2013); and synapses from NGF included the GABAA and GABAB receptors with a 1:1 ratio. The reversal potential was 0 mV for AMPA and NMDA, -80 mV for all GABAA and -93 mV for GABAB. The GABAB synapse was modeled using second messenger connectivity to a G protein-coupled inwardly-rectifying potassium channel (GIRK) (Destexhe, Babloyantz, and Sejnowski 1993). The remaining synapses were modeled with a double-exponential mechanism.
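As a worked illustration of the voltage-dependent NMDA scaling, the sketch below evaluates the magnesium block factor, assuming the standard Jahr and Stevens form with an exponent coefficient of 0.062 per mV, and splits a connection weight equally between AMPA and NMDA. The function names are hypothetical.

```python
import math

def nmda_mg_block(v_mv, mg_mm=1.0):
    """Fraction of NMDA conductance available at membrane potential v_mv (mV)."""
    return 1.0 / (1.0 + 0.28 * mg_mm * math.exp(-0.062 * v_mv))

def split_excitatory_weight(weight, nmda_fraction=0.5):
    """Split a connection weight between AMPA and NMDA (1:1 by default)."""
    return {"AMPA": weight * (1.0 - nmda_fraction), "NMDA": weight * nmda_fraction}

for v in (-80.0, -60.0, -40.0, 0.0):
    print(v, round(nmda_mg_block(v), 4))
```

The factor rises monotonically with depolarization, so NMDA currents contribute little near rest and substantially more near 0 mV.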
Connection delays
Connection delays were estimated as 2 ms to account for presynaptic release and postsynaptic receptor delays, plus a variable propagation delay calculated as the 3D Euclidean distance between the pre- and postsynaptic cell bodies divided by a propagation speed of 0.5 m/s. Conduction velocities of unmyelinated axons range between 0.5-10 m/s (Purves et al. 2018), but here we chose the lowest value given that our soma-to-soma distance underestimates the non-straight trajectory of axons and the distance to target dendritic synapses.
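The delay rule can be written out as a short sketch: a fixed 2 ms synaptic delay plus the soma-to-soma Euclidean distance divided by the propagation speed. Coordinates are assumed to be in µm, and the function name is illustrative.

```python
import math

def connection_delay_ms(pre_xyz_um, post_xyz_um, synaptic_delay_ms=2.0, speed_m_per_s=0.5):
    """Total delay (ms) = synaptic delay + 3D distance / propagation speed."""
    dist_um = math.dist(pre_xyz_um, post_xyz_um)
    speed_um_per_ms = speed_m_per_s * 1000.0  # 1 m/s equals 1000 µm/ms
    return synaptic_delay_ms + dist_um / speed_um_per_ms

# Two somas 500 µm apart: 2 ms + 500 µm / (500 µm/ms) = 3 ms
print(connection_delay_ms((0.0, 0.0, 0.0), (300.0, 400.0, 0.0)))
```

Within a 200 µm wide, 2000 µm deep column the propagation term therefore stays below roughly 4 ms.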
Intra-thalamic connectivity
Intrathalamic connectivity was derived from existing rodent, cat and primate experimental and computational studies (Bonjean et al. 2012; Cruikshank et al. 2010; Serkov and Gonchar 1996; Jones 2002; Billeh et al. 2020) (see Fig. 2). More specifically, connection probabilities and unitary strength for TC→RE, RE→TC and RE→RE (both core and matrix populations) were largely based on a previous primate thalamus study (Bonjean et al. 2012) and validated with data from mouse ventrobasal thalamus (Cruikshank et al. 2010) and cat MGBv (Bonjean et al. 2012; Cruikshank et al. 2010; Serkov and Gonchar 1996; Jones 2002; Billeh et al. 2020). No evidence was found for TC recurrent connections. Thalamic interneuron connectivity was derived from the same cat MGBv study, which provided the number of synaptic contacts for TI→TI, TI→TC and TC→TI, from which we estimated the probability of connection for each projection. We also verified that our model intra-thalamic connectivity was generally consistent with that of the Allen Brain Institute visual thalamocortical model (Jones 2002; Billeh et al. 2020). Given that thalamic neuron models were single-compartment, no specific dendritic synaptic location information was included.
Intra-cortical connectivity
Connectivity within the A1 local circuit populations was defined as a function of pre- and postsynaptic cell type and layer. Given the overall lack of detailed cell type-specific connectivity experimental data for macaque A1, we used as a starting point the connectivity from two experimentally-grounded mammalian cortical microcircuit modeling studies: the Allen Brain Institute (ABI) V1 (Billeh et al. 2020) and the Blue Brain Project (BBP) S1 (Markram et al. 2015). We then updated the model connectivity with experimental data specific to macaque A1, when available, or simply mammalian A1.
Both studies included the projection-specific probability of connection and unitary connection strength parameters that we required for our model. However, the ABI V1 model had fewer excitatory (1) and inhibitory (3) broad cell types than our A1 model (4 E and 4 I), whereas the BBP S1 model included significantly more (11 E and 15 I). Neither model included the distinction between L5A and L5B present in our A1 model. The ABI V1 model did provide length constants to implement distance-dependent connectivity, which we wanted to include for some of the A1 projections. Therefore, as a first step, we mapped our cell types to the closest ones in the ABI V1 model and obtained the corresponding connectivity matrices for A1. We then updated the A1 connectivity of cell types that were missing from ABI V1 (specifically the ITS, PT, CT and VIP cell types) based on data from BBP S1. To do this we mapped A1 cell types to those closest in BBP S1, and scaled the connectivity parameters of missing cell types proportionally, using shared cell types as reference (e.g. IT or PV). Through this systematic approach we were able to combine data from ABI V1 and BBP S1 in a consistent way, to determine the connectivity parameters of all the A1 populations.
Inhibitory connections were further refined using data from A1 (Budinger and Kanold 2018; Kato, Asinof, and Isaacson 2017; Pi et al. 2013) or from studies with more detailed cell type-specific data (Tremblay, Lee, and Rudy 2016; Naka and Adesnik 2016). We updated the L2/3 SOM connectivity so they project strongly not only to superficial layer excitatory neurons, but also to deeper ones by targeting their apical dendrites; this was not the case for PV cells, which projected strongly mostly to intralaminar excitatory neurons (Kato, Asinof, and Isaacson 2017; Naka and Adesnik 2016). More specifically, probabilities of connection from L2/3 SOM and PV to excitatory neurons were a function of the postsynaptic neuron layer (L1-L6) based on data from an A1 study (Kato, Asinof, and Isaacson 2017). The probability of connection from VIP to excitatory neurons was set to a very low value derived from mouse A1 data (Pi et al. 2013). Following this same study, VIP→SOM connections were made strong, VIP→PV weak, and VIP→VIP very weak. Connection probabilities of all I→E/I projections decayed exponentially with distance using a projection-specific length constant obtained from the ABI V1 study.
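The exponential distance dependence of the inhibitory connection probabilities can be sketched as below. The functional form p(d) = p0 · exp(−d/λ) follows the description above; which distance is used (horizontal or 3D) and the numeric values of p0 and λ are assumptions made only for this example.

```python
import math

def distance_dependent_prob(p0, dist_um, length_const_um):
    """Connection probability decaying exponentially with distance."""
    return p0 * math.exp(-dist_um / length_const_um)

# Hypothetical projection: peak probability 0.4, length constant 100 µm
for d in (0.0, 50.0, 100.0, 200.0):
    print(d, round(distance_dependent_prob(0.4, d, 100.0), 4))
```

At one length constant the probability falls to about 37% of its peak value.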
Information on the dendritic location of synaptic inputs was also incorporated, when available, into the model. Cortical excitatory synapses targeted the soma and proximal dendrites of L2-4 excitatory neurons, distal dendrites of L5-6 excitatory neurons, and were uniformly distributed in cortical inhibitory neurons (Billeh et al. 2020; Budinger and Kanold 2018; Harris and Shepherd 2015). L1 NGF neurons targeted the apical tuft of excitatory neurons, L2-4 NGF targeted the apical trunk of L2-4 excitatory neurons and the upper trunk of L5-6 excitatory neurons, and L5-6 NGF targeted the lower trunk of L5-6 excitatory neurons (Tremblay, Lee, and Rudy 2016; Budinger and Kanold 2018; Naka and Adesnik 2016). Synapses from SOM interneurons were uniformly distributed along excitatory neurons, those from PV and VIP neurons targeted the soma and proximal dendrites of excitatory neurons (Naka and Adesnik 2016; Kato, Asinof, and Isaacson 2017; Tremblay, Lee, and Rudy 2016).
Thalamocortical and corticothalamic connectivity
Thalamocortical connections were layer- and cell type-specific and were derived from studies in mouse auditory cortex (Ji et al. 2016) and rodent somatosensory cortex (Constantinople and Bruno 2013; Cruikshank et al. 2010). Core MGB thalamocortical neurons projected to cortical excitatory neurons in layers 3 to 6. The strongest projections were to layer 4 ITP, ITS and PV neurons. Weaker thalamocortical projections also targeted L3 IT and PV; L4 SOM and NGF; L5-6 IT, CT and PV; and L5B PT and SOM. Matrix thalamocortical neurons projected to excitatory neurons in all layers except 4, and to L1 NGF, L2/3 PV and SOM, and L5-6 PV. Core thalamic inputs targeted the soma and proximal dendrites of cortical excitatory cells, whereas matrix thalamic inputs targeted their distal dendrites (Bonjean et al. 2012; Jones 2002).
Corticothalamic projections originating from L5A, L5B and L6 CT neurons targeted all core thalamus populations (TC, HTC, TI and IRE); whereas projections from L5B IT and PT neurons targeted the matrix thalamus populations (TCM, TIM, IREM). Connectivity data was derived from primate and rodent studies on auditory cortex and other cortical regions (Bonjean et al. 2012; Yamawaki and Shepherd 2015; Budinger and Kanold 2018; Harris and Shepherd 2015; Jones 2002; Crandall, Cruikshank, and Connors 2015).
4.3 Background inputs
To model the influence of the other brain regions not explicitly modeled on auditory cortex and thalamus, we provided background inputs to all our model neurons. These inputs were modeled as independent Poisson spike generators for each cell, targeting apical excitatory and basal inhibitory synapses, with an average firing rate of 40 Hz. Connection weights were automatically adjusted for each cell type to ensure that, in the absence of local circuit connectivity, all neurons exhibited a low spontaneous firing rate of approximately 1 Hz.
4.4 Full model synaptic weight tuning
Overview of approach
Although we followed a systematic data-driven approach to build our model, the complete experimental dataset required to build a detailed model of the macaque auditory thalamocortical system is currently not available. Therefore, we had to combine experimental data from different species, different brain regions and obtained using different recording techniques. It is therefore not surprising that in order to obtain physiologically constrained firing rates across all populations we needed to tune the connectivity parameters. Automated optimization methods have been previously used for simpler networks (e.g. recurrent point-neuron spiking networks) (Nicola and Clopath 2017; Sussillo and Abbott 2009; S. Dura-Bernal et al. 2017; Carlson et al. 2014; Hasegan et al. 2021). However, optimization of large-scale biophysically-detailed networks typically requires expert-guided parameter adjustments (Bezaire et al. 2016; Markram et al. 2015), for example through parameter sweeps (grid search) (Billeh et al. 2020). In order to find a more systematic approach to tune this type of model, here we explored automated optimization methods, and gradually refined them and combined them with heuristic approaches as needed. Here we describe the final approach employed to obtain the tuned network.
Automated optimization algorithm
Our starting point was the network with cell type-specific background inputs so that all cells fired at approximately 1 Hz in the absence of connectivity. We then added connectivity with parameters taken from the literature and similar existing models. The resulting network included many silent populations (0 Hz) and others firing at very high rates (>100 Hz). Our aim was to obtain a baseline network where all populations fired within biologically constrained rates.
After classical grid search methods failed, we evaluated Optuna (http://optuna.org) (Akiba et al. 2019), a hyperparameter optimization framework designed for machine learning applications, which dynamically searches the parameter space. Compared to the evolutionary algorithms we used in the past, Optuna has the advantage of producing similar results while not requiring all candidates of a generation to be completed before moving to the next one; instead it dynamically decides the next candidate to explore based on all the candidates evaluated up to that point. This makes it faster and less resource-consuming.
In order to automate the process, we implemented a fitness function to automatically evaluate how good each of the solutions was: where Np is the number of neural populations, Nt is the number of time periods that are evaluated, p is the population index, i is the time period index, r(p, i) is the average firing rate for the population with index p during the time period with index i, t(p, i) and s(p, i) are the target rate mean and standard deviation for the population with index p during the time period with index i, and fitmax is the maximum (worst possible) fitness value. For each population the target firing rate is described by a Gaussian function with a mean, a standard deviation and a minimum threshold (for E pops: mean=5, std=20, min=0.05; for I pops: mean=10, std=30, min=0.05). This Gaussian function is evaluated across four consecutive 250 ms periods, to ensure relative homogeneity in the firing rates, e.g. to avoid populations that fire only during the first 100 ms but have an average firing rate matching the target rate. The mean output of this function across populations and time periods constitutes the fitness error (Fig 11).
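A fitness function of this kind can be sketched as follows. The exact per-population error term is an assumption: the sketch uses the exponential of the absolute deviation from the target, normalized by the target width and capped at fitmax, and assigns fitmax to any population whose rate falls below its minimum threshold. The value fitmax = 1000 and the population names are illustrative, not taken from the model.

```python
import math

def fitness_error(rates, targets, fit_max=1000.0):
    """Mean error across populations and time periods (lower is better).

    rates:   {population: [mean rate (Hz) in each evaluated time period]}
    targets: {population: {"mean": Hz, "std": Hz, "min": Hz}}
    """
    errors = []
    for pop, period_rates in rates.items():
        t = targets[pop]
        for r in period_rates:
            if r < t["min"]:
                errors.append(fit_max)  # silent population: worst possible value
            else:
                errors.append(min(math.exp(abs(r - t["mean"]) / t["std"]), fit_max))
    return sum(errors) / len(errors)

# Target parameters as stated for excitatory and inhibitory populations
targets = {
    "IT2": {"mean": 5.0, "std": 20.0, "min": 0.05},
    "PV2": {"mean": 10.0, "std": 30.0, "min": 0.05},
}
# Four consecutive 250 ms periods per population (made-up rates)
rates = {"IT2": [4.0, 6.0, 5.0, 5.5], "PV2": [12.0, 9.0, 0.0, 11.0]}
print(fitness_error(rates, targets))
```

Because the silent period of PV2 contributes fitmax, a population that fires only in some periods is penalized even when its average rate is close to the target.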
Layer-specific and cell-type specific parameters
In order to reduce the fitness errors we gradually included more tuned parameters (see Fig. 11A). Our final approach included 4 projection-class weight gains (E→E, E→I, I→E, I→I) for each of the 7 layers (L1, L2, L3, L4, L5A, L5B and L6). Analysis also revealed highly specific dynamics for each of the four inhibitory cell types, which prompted us to include inhibitory cell type-specific weight gains: E→PV, E→SOM, E→VIP, E→NGF and PV→E, SOM→E, VIP→E, NGF→E. Including both layer-specific and cell type-specific parameters resulted in better overall solutions with lower fitness errors.
Stepwise layer-by-layer tuning
Increasing the number of parameters (dimensions) increases the size of the parameter space to explore, which increases the number of optimization trials (simulations) required to obtain a good solution, and increases the risk of getting stuck in local minima. There are two main ways to reduce the parameter space: 1) reducing the number of parameters, e.g. including only parameters for a subset of layers, or of projection types; and 2) reducing the range of parameter values explored, e.g. constraining these based on previous optimization results. Both of these are implemented in the stepwise layer-by-layer tuning approach we employed. This approach reduced the massive HPC resources required to explore the large model parameter spaces.
To implement the layer-by-layer tuning approach we first optimized the parameters within L4 alone. Once this layer achieved valid firing rates in all cell populations we added L3, and tuned the L3 connectivity parameters, while we kept the L4 parameters within a small range of the previously obtained solution. We repeated this for L2, L5A, L5B, L6 and finally L1. Due to a small bug when tuning L2 and L3, once the full model was tuned, we retuned L2 and L3 while keeping the rest of the parameters within a small range (Fig. 11). A similar layer-by-layer approach was followed to tune the Allen Brain Institute V1 model (Billeh et al. 2020), although they used a heuristic unidimensional grid search approach, whereas we employed an automated multidimensional dynamic search using Optuna.
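The stepwise procedure amounts to building, at each step, a search range per parameter: wide for the newly added layer and narrow around the previous solution for layers already tuned. The sketch below illustrates this bookkeeping only; the parameter naming scheme, the wide range (0.1 to 5.0) and the ±20% tolerance are assumptions for the example, and the optimizer call itself is omitted.

```python
LAYER_ORDER = ["L4", "L3", "L2", "L5A", "L5B", "L6", "L1"]  # tuning order
PROJECTIONS = ["EE", "EI", "IE", "II"]  # projection-class weight gains

def search_ranges(step, tuned, wide=(0.1, 5.0), tol=0.2):
    """Return {parameter: (low, high)} for tuning step `step` (0-based).

    Parameters already present in `tuned` are constrained to +/- tol around
    their previous value; parameters of the newly added layer get `wide`.
    """
    ranges = {}
    for layer in LAYER_ORDER[: step + 1]:
        for proj in PROJECTIONS:
            name = f"{proj}_{layer}"
            if name in tuned:
                value = tuned[name]
                ranges[name] = (value * (1.0 - tol), value * (1.0 + tol))
            else:
                ranges[name] = wide
    return ranges

# Step 0 explores L4 alone; step 1 adds L3 and keeps L4 near its solution
l4_solution = {"EE_L4": 1.0, "EI_L4": 2.0, "IE_L4": 0.5, "II_L4": 1.5}
print(search_ranges(1, l4_solution))
```

Each step thus adds only four wide dimensions to the search while the remaining dimensions stay tightly bounded.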
Projection-specific weight tuning
Once we had obtained a reasonable solution for most model populations using the layer-by-layer approach, we required additional fine-tuning to improve the rates of specific populations. In particular, SOM2 and SOM3 were silent (0 Hz), and PV2 and VIP2 were firing at excessively high rates (>100 Hz). The parameters explored up to that point did not appear to provide enough specificity to improve the rates of these populations without worsening some of the others. Therefore, we had to tune the weight gains of specific population-to-population projections, e.g. from IT2 to SOM2. Using Optuna we optimized the weights of all projections targeting the populations with inadequate rates: PV2, SOM2, VIP2 and SOM3. This resulted in improved rates for these populations.
Final model
Our final network included all 43 thalamic and cortical populations firing between 0.1 and 25 Hz, i.e. no epileptic or silent populations. Due to the unprecedented scale and level of detail in the model, e.g. the complex interaction between 4 interneuron types, we had to employ an exploratory approach, evaluating several methods to tune the weights. Overall, this required over 500k simulations and over 5M core hours on Google Cloud HPCs. The lessons learned during this process should facilitate the automated tuning of similar detailed models in the future.
4.5 Phenomenological models of peripheral auditory structures
To simulate spontaneous activity in our baseline model we used background white noise as inputs to our thalamus and cortical populations. However, in order to accurately simulate auditory stimuli input we also connected a model of peripheral auditory structures such as the auditory nerve (AN) and inferior colliculus (IC). To simulate these structures we used phenomenological models that captured the signal transformations occurring in these regions (Krishna and Semple 2000). These models produced outputs that were used to drive the thalamocortical cells in the downstream, more biologically detailed portion of the auditory pathway model. The AN responses modeled here included several characteristic nonlinearities such as rate saturation, adaptation, and phase locking (Carney, Li, and McDonough 2015; Krishna and Semple 2000). Outputs from the AN model were convolved and modulated with synaptic information and used as inputs to a phenomenological model of inferior colliculus (IC). Model neurons of the IC utilized different types of modulation transfer functions to capture both the spectral and amplitude modulation tuning observed in this structure (Carney, Li, and McDonough 2015; Nelson and Carney 2004; Krishna and Semple 2000; Joris, Schreiner, and Rees 2004). These phenomenological models mitigated common encoding issues encountered at high frequencies and high sound levels, providing us with IC outputs that were useful throughout a broad range of frequencies and noise (Carney, Li, and McDonough 2015).
The AN and IC models were implemented in Matlab and are available within the UR_EAR 2.0 tool. We used .wav files as input to this tool and obtained the IC time-resolved firing rates. The model allowed customization of several options, including the cochlear central frequency and bandwidth. We saved the firing rates for different input sounds and converted these to spike times using a Python-based inhomogeneous Poisson generator (Muller et al. 2007). We then used NEURON spike generators (VecStims) defined in NetPyNE to provide the IC spike times as input to the model thalamic populations.
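Converting a time-resolved firing rate into spike times can be illustrated with a minimal inhomogeneous Poisson sketch. Note that this uses the standard thinning method as a stand-in; it is not the generator cited above, and the function name, sampling step and rate profile are hypothetical.

```python
import random

def inhomogeneous_poisson(rate_hz, dt_ms, seed=0):
    """Spike times (ms) for a piecewise-constant rate sampled every dt_ms.

    Thinning: draw candidate spikes from a homogeneous process at the peak
    rate, then keep each with probability rate(t) / peak rate.
    """
    rng = random.Random(seed)
    r_max = max(rate_hz)
    if r_max <= 0:
        return []
    duration_ms = len(rate_hz) * dt_ms
    spikes, t = [], 0.0
    while True:
        t += rng.expovariate(r_max / 1000.0)  # peak rate in spikes per ms
        if t >= duration_ms:
            break
        idx = min(int(t / dt_ms), len(rate_hz) - 1)
        if rng.random() < rate_hz[idx] / r_max:
            spikes.append(t)
    return spikes

# 500 ms of silence followed by 500 ms at 100 Hz, sampled at 1 ms resolution
rate = [0.0] * 500 + [100.0] * 500
spike_times = inhomogeneous_poisson(rate, dt_ms=1.0, seed=3)
print(len(spike_times), spike_times[:3])
```

The resulting list of spike times is the kind of input that a spike-replay generator such as a VecStim can then deliver to the thalamic populations.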
4.6 Model building, simulation and optimization
We developed the computational model using the NetPyNE tool (Dura-Bernal et al. 2019), and ran all parallel simulations using NEURON 8.0 (Lytton et al. 2016; Carnevale and Hines 2006) with a fixed time step of 0.05 ms. NetPyNE is a Python package that provides a high-level interface to NEURON, and allows defining complicated multiscale models using an intuitive declarative language focusing on the biological parameters. NetPyNE then translates these specifications into a NEURON model, facilitates running parallel simulations, and automates the optimization and exploration of parameters using supercomputers. We executed our simulations primarily on Google Cloud supercomputers using a Slurm-based cluster with 80-core compute nodes (Subhashini Sivagnanam et al. 2020). Some simulations were also run on XSEDE supercomputers Comet and Stampede, either using our own allocations or through the Neuroscience Gateway (NSG) (S. Sivagnanam et al. 2013). We used the NetPyNE software tool to design, execute, organize, and analyze the simulations, as well as to export our model to the SONATA (Dai et al. 2020) and NeuroML (Gleeson et al. 2019) standards.
4.7 Data Analysis and Visualization
Spiking raster plot, firing rate statistics and voltage traces
The NetPyNE package (Dura-Bernal et al. 2019) was used to record and analyze simulation output data, and visualize spiking raster plots, firing rate statistics, and neuronal membrane voltage traces.
Local Field Potential (LFP)
At each in silico electrode, the local field potential (LFP) was calculated as the sum of the extracellular potential from each neuronal segment. For estimation of extracellular potential, we used the line source approximation method and assumed that the model neurons were immersed in an ohmic medium with a fixed conductivity of sigma = 0.3 mS/mm (Parasuram et al. 2016; Gold et al. 2006; Dura-Bernal et al. 2019). Electrodes were spatially distributed at 100 μm intervals along a vertical axis of the 2000 μm A1 column. Model LFP recording, analysis and visualization was performed using the NetPyNE package.
Current Source Density (CSD)
We compared the in vivo and in silico current source density (CSD) signals from data recorded at rest in supragranular, granular, and infragranular layers. CSD was calculated as the second spatial derivative of the LFP. CSD analysis and visualization was performed using the NetPyNE package (Dura-Bernal et al. 2019).
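The second spatial derivative can be approximated with a three-point finite difference across neighboring electrodes, as in the sketch below. The sign convention (negative second derivative, so that sinks are negative) and the omission of the tissue conductivity factor are assumptions of this sketch, so the output is proportional to the CSD rather than in absolute units.

```python
def csd_from_lfp(lfp, spacing_um=100.0, sign=-1.0):
    """Estimate CSD from laminar LFP by a second spatial difference.

    lfp: list of channels ordered by depth, each a list of samples over time.
    Returns len(lfp) - 2 channels (the two edge electrodes are lost).
    """
    h2 = spacing_um ** 2
    csd = []
    for ch in range(1, len(lfp) - 1):
        above, here, below = lfp[ch - 1], lfp[ch], lfp[ch + 1]
        csd.append([sign * (a - 2.0 * b + c) / h2 for a, b, c in zip(above, here, below)])
    return csd

# A potential varying linearly with depth contains no sources or sinks
linear = [[float(z)] for z in range(5)]
print(csd_from_lfp(linear, spacing_um=1.0))
```

With electrodes every 100 μm along a 2000 μm column, this yields CSD estimates at all contacts except the topmost and bottommost.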
Oscillation event detection
Using the OEvent package (S. A. Neymotin et al. 2020), we also used Morlet wavelet spectrograms and their corresponding CSD waveforms to identify individual oscillation events occurring in spontaneous data, and to compare these events across in vivo and in silico contexts. OEvent extracts moderate/high-power events using 7-cycle Morlet wavelets on non-overlapping 10 s windows (Sherman et al. 2016; Samuel A. Neymotin et al. 2020). We used linearly spaced frequencies (0.25 Hz frequency increments), ranging from 0.25 - 125 Hz. Power time-series of each wavelet transform are normalized by median power from the recording/simulation. We apply a local maximum filter to detect peaks in the spectrogram. Local peaks are assessed to determine whether their power exceeds a 4x median threshold to detect moderate- to high-power events. Frequency and time bounds around the peak are determined by including time and frequency values before/after, above/below peak frequency until power falls below the smaller of ½ maximum event amplitude and 4× median threshold. As shown in Fig. 7, this produces a bounding box around each oscillation event that can be used to determine frequency spread (minF to maxF), duration, and peak frequency (frequency at which maximum power is detected). We merge events when their bounding box overlapping area in the spectrogram exceeds 50% of the minimum area of each individual event. This allows continuity of events separated by minor fluctuations below threshold. We then calculate additional features from this set of events, including the number of cycles (event duration × peak frequency). We classify events into standard frequency bands on the following intervals: delta (0.5-4 Hz), theta (4-9 Hz), alpha (9-15 Hz), beta (15-29 Hz), gamma (30-80 Hz). Classification was based on the frequency at which maximum power occurred during each event. 
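The final classification step and the number-of-cycles feature can be sketched in a few lines. This is not the OEvent implementation: the function names are hypothetical, band edges are treated as lower-inclusive and upper-exclusive (an assumption), and peak frequencies falling outside the listed intervals, for example between 29 and 30 Hz, are left unclassified.

```python
# Frequency bands as listed above: (name, low Hz, high Hz)
BANDS = [
    ("delta", 0.5, 4.0),
    ("theta", 4.0, 9.0),
    ("alpha", 9.0, 15.0),
    ("beta", 15.0, 29.0),
    ("gamma", 30.0, 80.0),
]

def classify_band(peak_hz):
    """Return the band containing the event's peak frequency, or None."""
    for name, low, high in BANDS:
        if low <= peak_hz < high:
            return name
    return None

def n_cycles(duration_s, peak_hz):
    """Number of cycles = event duration (s) x peak frequency (Hz)."""
    return duration_s * peak_hz

# A 0.5 s event peaking at 10 Hz is an alpha event lasting 5 cycles
print(classify_band(10.0), n_cycles(0.5, 10.0))
```

Classifying by peak frequency rather than by the full frequency spread means an event whose bounding box straddles two bands is still assigned to a single band.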
These oscillation event analysis techniques detected morphologically similar events in the simulated and NHP data (Figs. 7 and 8).
4.8 Experiment recordings
We used a previously described dataset of invasively recorded laminar electrode array local field potentials from the primary auditory cortex (A1) of non-human primates (NHP) (Neymotin et al. 2020). NHP electrophysiological data were recorded during acute penetrations of area A1 of the auditory cortex of rhesus macaques weighing 5-8 kg, which had been prepared surgically for chronic awake electrophysiological recordings. Prior to surgery, each animal was adapted to a custom-fitted primate chair and to the sound-proofed recording chamber. All procedures were approved in advance by the Animal Care and Use Committee of the Nathan Kline Institute. Preparation of subjects for chronic awake intracortical recording was performed under general anesthesia using aseptic techniques. To provide access to the brain, either Cilux (Crist Instruments) or polyetheretherketone (PEEK; Rogue Research Inc.) recording chambers were positioned normal to the cortical surface of the superior temporal plane for orthogonal penetration of A1. These recording chambers and a PEEK headpost (both used to permit painless head restraint) were secured to the skull with ceramic screws and embedded in dental acrylic. Each NHP was given a minimum of 6 weeks for post-operative recovery before behavioral training and data collection began. Before and after recordings, NHPs had full access to fluids and food. No rewards were offered during recordings.
The data were recorded during waking rest with eyes mostly open, while the macaques were in a dark room. In a subset of recordings, short sentences in the English language were presented at 80 dB SPL. During recordings, NHPs were head-fixed and linear array multielectrodes (23 contacts with 100, 125 or 150 µm intercontact spacing, Plexon Inc.) were acutely positioned to sample all cortical layers of A1. Neuroelectric signals were continuously recorded with a sampling rate of 44 kHz using the Alpha Omega SnR system. For NHP data analyses using current-source density (CSD) signals, CSD was calculated as the second spatial derivative of laminar local field potential recordings. This was done to reduce potential issues related to volume-conducted activity.
Acknowledgements
Research supported by R01DC012947, P50MH109429, NIBIB U24EB028998, NIBIB U01EB017695, NYS DOH01-C32250GG-3450000, NSF 190444, Army Research Office W911NF-19-1-0402, ARO URAP supplement. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Army Research Office or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for Government purposes, notwithstanding any copyright notation herein.