## Abstract

Oscillations figure prominently in models of brain function and dysfunction. It would be informative to analyze oscillations at the level of individual neurons, by estimating the power spectral density (PSD) of spike trains. Unfortunately, spike train spectra exhibit a global distortion generated by the neuronal recovery period (“RP”, the post-spike drop in spike probability, which can extend beyond the refractory period). This distortion can increase false negative and false positive rates in statistical tests for oscillatory effects. An existing “shuffling” procedure corrects for RP distortion by removing the spectral component explained by the inter-spike interval (ISI) distribution. However, this procedure sacrifices any oscillation-related information in the ISIs, and power at the corresponding frequencies in the PSD. Here, we ask whether a three-step “residuals” method can improve upon the shuffling method’s performance. First, we estimate the RP duration (*n _{r}*) from the ISI distribution. Second, we fit the spike train with a point process model that predicts spike likelihood based on the time elapsed since the most recent of any spikes falling within the preceding

*n*milliseconds. Third, we compute the PSD of the model’s residuals. We initially compared the residuals and shuffling methods’ performance over diverse synthetic spike trains. The residuals method generally yielded the most accurate classification of true-versus false-positive oscillatory power, with this result principally driven by enhanced sensitivity to oscillations in sparse spike trains. Subsequent evaluations used single-unit data from the internal globus pallidus (GPi) and ventrolateral anterior thalamus (VLa) of a parkinsonian monkey, in which pathological alpha-beta oscillations (8-30 Hz) were anticipated. Over these units, the residuals method reported the greatest incidence of significant alpha-beta power, with low firing rates predicting residuals-selective oscillation detection. Overall, these results encourage further development of the residuals approach, including expansion to capture additional aperiodic spectral features.

_{r}**Author Summary** Models of brain function propose that some neurons fire action potentials (“spikes”) rhythmically. That is, the spikes arise from a latent rate function that oscillates at a consistent frequency. We might attempt to detect such oscillations by seeking peaks against a flat baseline in a spike train’s power spectrum. Yet the baseline is not flat: The transient post-spike suppression in spike rate (“recovery period”) introduces a distortion that can both obscure true oscillatory peaks and generate false positives. An established method aims to address the distortion by removing the spectral content that can be recreated with shuffled versions of the spike train. However, this procedure can remove some information about spike rhythmicity, thereby hindering sensitivity to true oscillations. We developed an alternative method that uses regression to remove recovery period effects from the spike train, and computes the spectrum of the residuals. Evaluations on synthetic data demonstrated that this “residuals” method improved upon the established method’s sensitivity to true oscillations. In experimentally-acquired data from a primate model of Parkinson’s Disease, the residuals method also returned increased detections of the expected pathological 8-30 Hz oscillations. These results suggest that the residuals method may support future discoveries of otherwise undetected spike rhythms.

## Introduction

To estimate oscillations in neural activity, it is conventional to compute the power spectral density (PSD) of measures of extracellular fields (e.g., local field potentials (LFPs), or electrocorticograms (ECoG)). Such signals reflect a complex mixture of contributions from neuronal populations [1], and are therefore not optimal when the objective is to investigate oscillations at the level of individual neurons. In these situations, we can attempt to apply spectral analysis to single-unit spike trains. Ideally, the resulting PSD output – along with post-processing and statistics -- should enable the inference of oscillations in the rate function that generated the observed spikes [2]. We adopt the working definition of an oscillation as a rhythmic modulation in action potential (“spike”) occurrence within a narrow frequency range (expanding upon the usage from [3], in the context of continuous signals).

To detect true periodic components in a spike-derived PSD function accurately, we must first address a challenge: These spectra exhibit a distortion pattern that is common to phenomena subject to measurement “dead time” (i.e., an interval after an event, when the occurrence or detection of a subsequent event is impeded [4]). In the context of single-unit spike trains, this dead time is typically attributed to the neuronal refractory period, which may be absolute (*p*(spike) = 0) or relative (gradually recovering *p*(spike)). The simplest example of this distortion occurs in the context of a hypothetical spike train arising from a Poisson process interrupted by an absolute refractory period. In this case, the resulting PSD departs from the flat spectrum anticipated for an unmodified Poisson process, and is instead characterized by a trough over a range of low frequencies, followed by elevated power (often accompanied by visible ringing) over the high frequencies [5, 6]. The precise form of the distortion differs modestly for relative refractory periods (see Fig. 1 for a synthetic example), and is exacerbated by high firing rates (FR) and long refractory period durations [7].

The present research examines two methods – including a commonly-used approach (the “shuffling method”), and a novel alternative (the “residuals method”) – that aim to remove this dead time-induced distortion from spike train spectra, and ultimately facilitate more sensitive and selective detection of the periodic effects of interest. Beyond these two options (and an analytical variant on the shuffling approach [8]), we are aware of no additional methods specifically designed to correct distortion in neuronal spike train spectra. Therefore, we limit our comparison to these two methods.

Before describing the correction methods, it is necessary to clarify a terminology choice: Instead of attributing the dead time to the refractory period, we will use a mechanism-neutral term: “recovery period” (see [9] for similar usage). In data from our lab (including the empirically-sampled units we present here), we have observed several examples of units that exhibit a dead time (i.e., sharp post-spike drops in the hazard function) that far exceeds the typical duration of the channel dynamics responsible for refractoriness (5-10 ms [10]). Such long post-spike pauses might be generated by either circuit-level or intrinsic mechanisms (such as those underlying pacemaker patterns [11]). Since neither of the methods that we discuss attempt to selectively correct for the biological refractory period *per se*, we adopt the generic recovery period (RP) terminology.

When a spike train features an oscillation, the corresponding peak in the PSD is superimposed on the recovery period-associated distortion pattern (see Fig. 2A for synthetic examples). The distortion complicates standard statistical procedures that aim to distinguish such oscillatory PSD peaks from baseline by using a null hypothesis model that is estimated from the spectrum itself. In particular, detection problems arise when the baseline does not account for the RP contribution.

An especially detailed treatment of RP-associated distortion [7] -- which also introduced the established shuffling method -- illustrated the oscillation detection challenges with a straightforward statistical test that assumes a Poisson process (in the time domain) and corresponding white noise (in the frequency domain) as the baseline. These authors observed that, over the highest frequencies (beyond the frequency bands typically prioritized by neuroscientists), the PSD functions of neuronal spike trains tended to approach those of matched-FR Poisson processes, thereby motivating the construction of confidence intervals (CI) based on the mean and variance over these high-frequency points. Points in the PSD function that cross the CI bound (see Fig. 2A, dashed lines, for CI examples) are labeled as significant, with the possible added requirement that the points also fall within an *a priori* search range (set at (0, 100] Hz in the figures). Because this statistical test does not account for the RP distortion, it is prone to miss oscillatory peaks that fall within the low frequency trough (see the two examples in Fig. 2A), and incorrectly label as significant points at higher frequencies, beyond the trough, which exhibit elevated power as a consequence of the distortion effect alone [7]. Although the examples in Fig. 2A do not show false positives within the (0, 100] Hz search range, the PSD in the left panel does include a peak that crosses the CI threshold at ≈112 Hz. Note that this general pattern of false negatives and positives also appears when the CIs for the Poisson null hypothesis are estimated using an alternative analytical procedure (see [7] and [12] for details).

A switch to a different, but still RP-neglecting baseline model is insufficient to resolve these issues. For example, consider the common practice of fitting a “1/f-like” function to the spectra of field potential measures, in order to separate an assumed aperiodic baseline from periodic effects [3]. Although this practice can be effective for such population-level signals – on which individual units’ RPs will not have a discernable impact – it is unclear how a 1/f-like function could be successfully fit to a raw spike train spectrum.

To account for the RP distortion, the correction method proposed by [7] builds upon the null hypothesis that the spike train emerges from a renewal process with a matched inter-spike interval (ISI) distribution. Recall that a renewal process is a sequence of events (here, spikes) for which the waiting times (the ISIs) are independently sampled from an identical distribution [13]. Since we define the recovery period as an influence solely on the waiting time between two successive events, and not on any higher order structure (e.g., correlations between successive waiting times), the renewal null hypothesis does capture the temporal structure attributable to the RP.

The ISI shuffling method described by [7] estimates the renewal equivalent spectrum for a spike train. According to the simplest, global version of this scheme, several control spike trains are generated by randomly permuting the ISIs of the original. The PSDs computed from these control spike trains are then averaged to estimate the renewal equivalent spectrum. An analytical alternative to this Monte Carlo procedure does also exist, which leverages an established method to compute the renewal process spectrum from an ISI distribution [8, 14]. Although this analytical approach is fast and non-stochastic, it does require the use of rectangular spectral analysis windows. For the present work, we focus on the shuffling approach, given its flexibility to tapered windows (e.g., Hamming windows, or the Slepian sequences used by the multitaper method [15]).

Regardless of whether shuffling or the analytical method is used, the resulting control spectrum is then divided out from the original spectrum to obtain the corrected PSD. Given a corrected spectrum, it is customary to proceed with the aforementioned procedure of constructing CIs based on the high frequencies, under the assumption that the null hypothesis of a flat spectrum is now appropriate.

Shuffling correction continues to see common use in single-unit studies, especially in research on the motor control circuitry (e.g., [16–20]), but also in other domains [21, 22]. The method is a well-justified option when researchers are willing to forgo the temporal information present in the ISI distribution. However, as noted by [7], the ISIs can carry information about the rhythmic occurrence of individual spikes, which the shuffling method will remove. A simple, rigid example of this occurs in the case of precisely timed pacemaking activity – that is, single spikes separated by a relatively consistent period *P*, plus or minus a modest amount of jitter (i.e., an approximation of a Dirac comb [23]). In such cases, the pacemaking interval is indistinguishable from a recovery period, implying that such patterns should be detected through means other than RP-corrected spectra (e.g., by seeking instances of extremely low ISI variance). A more subtle example occurs when rhythmic spiking appears to be driven by an approximately sinusoidal function, but the mean FR is so low that spikes often occur only once per oscillation cycle, and may at times skip cycles altogether. To detect and statistically validate such cases, spectral analysis may prove more helpful, but how to effectively address RP distortion while preserving the sparse information regarding the underlying rhythmicity remains unresolved.

Here we explore a new “residuals” method which aims to selectively remove RP-associated variance from a spike train, while leaving intact the information present in the long-lag ISIs. The specific method we describe should be seen as one instantiation of a general, mixed time- and frequency-domain framework that can be adapted flexibly to accommodate analysis goals that are more complex than those prioritized here. We also note that the method represents the combination of a number of existing ideas and techniques from prior work that has sought to estimate spike rhythms exclusively in the time domain (e.g., [24–26]).

The residuals method consists of three stages. First, similar to previous works [24, 27], we use the ISI distribution to inform an estimate of the RP duration () for the input spike train. Second, we construct a point process model (PPM) that predicts the time-varying spike probability (*p _{spk}(t)*) as a function of the time elapsed since the most recent spike, but only when a spike was observed within the last ms; outside of this RP interval, the prediction relies on the default constant term alone. In other words, this “bounded last-spike” model estimates the component of the data that can be explained by the base firing rate and the first ms of the estimated post-spike hazard function.

The fitted PPM is assumed to account for the variance associated with the RP, thereby implying that the raw residuals (actual - predicted values) should represent a “corrected” time series. Therefore, as the third step, we submit the residuals to spectral analysis. The resulting PSD is subjected to the same statistical procedures as were described for the shuffling-corrected spectra.

To our knowledge, no prior work has applied such an approach to the analysis of spike train oscillations. However, we do note some conceptual relatedness to a two-step PPM fitting procedure reported by [24], which was used to account for slow fluctuations in spike rate prior to seeking oscillations in the time domain.

We present a comparison of the shuffling and residuals methods on both synthetic and real neuronal spike trains. In the synthetic data – for which ground truth is certain -- we report a general advantage for the residuals method in the accurate classification of true and false oscillatory points in the corrected spectra. In follow-up analyses, we find that the residuals method is particularly advantageous in the detection of true oscillations, especially when these are of a moderately strong amplitude, and arise from units that fire at rates only modestly greater than the frequency of the oscillation to be detected.

The neuronal data were acquired from the internal globus pallidus (GPi) and ventrolateral anterior thalamus (VLa) of a parkinsonian monkey, thereby presenting a dataset for which the *a priori* expectation of pathological oscillations is high (specifically in the 12-30 Hz beta- and 8-12 Hz alpha-frequency ranges [28, 29]), and for which spiking may tend to be sparse (particularly within the VLa, [30]). As expected for biological data, the findings are complex. We do find several instances of candidate alpha-beta oscillations that are detected by the residuals method exclusively, especially amongst the slowest-spiking units. At the same time, we also find a small number of alpha-beta oscillations flagged exclusively by the shuffling method. We additionally observe aperiodic spectral components (1/f-like trends and probable bursting) that residuals but not shuffling correction leave intact, and propose options for expansion upon the residuals method that could account for these additional phenomena.

## Materials and Methods

### Ethics Statement

The single nonhuman primate (NHP) dataset analyzed in this study (see “Experimental Data Methods”) was acquired under a protocol approved by the Institutional Animal Care and Use Committee of the University of Pittsburgh (#18093682). Experimental procedures and animal care complied with the National Institutes of Health Guide for the Care and Use of Laboratory Animals, the PHS Policy on the Humane Care and Use of Laboratory Animals, and the recommendations of the Weatherall report. The animal was housed with a single cage mate in a climate-controlled room, received regular enrichments, and consumed a diet that included fresh fruits and vegetables daily. Surgical procedures were conducted under general anesthesia, with analgesics administered post-operatively.

### Methods Overview

In the subsections that follow, we first describe the procedures for comparing the two correction methods (shuffling and residuals) on synthetic spike trains. These include the steps for generating the spike trains, calculating the original and corrected PSDs, and evaluating the methods’ accuracy in detecting known oscillations. Subsequently, we describe the experimental procedures used in acquiring the NHP data, and also the few details of PSD analysis that differed between the NHP data and the synthetic spike trains.

All simulations and analyses were implemented in Matlab (R2022a, The Mathworks, Inc., Natwick, MA; RRID:SCR_001622). Matlab code sufficient to reproduce the reported results is available on Github (https://github.com/kc13/residuals_spectral_analysis; DOI: 10.5281/zenodo.8323322). We make use of conventional Matlab notation for referring to the ranges spanned by / increments separating vector elements ([min:step:max]).

Data files are available in both a Zenodo repository (DOI: 10.5281/zenodo.8313070) and the Github repository (see also the Data Availability statement). The original synthetic datasets and NHP spike train data are available on Zenodo. The Github repository stores copies of the NHP spike train data, and also synthetic datasets that are represented in a postprocessed state (the subsampling and pROC output, see “Comparison of ROC curves for the Correction Methods”; note the original synthetic datasets exceed Github file size limits).

### Synthetic Data Methods

#### Spike Train Simulation

We generated synthetic spike trains through the procedure described in [7]. For each time bin *t* (Δt = 1 ms), the spike probability *p _{spk}(t)* was determined according to the discrete-time approximation of an inhomogeneous Poisson process:
Here,

*n*indexes the millisecond latency since the last spike occurrence, and

*n*denotes the duration of the recovery period (RP). Therefore, the spike probability at

_{r}*t*depends on whether the simulated unit is in the recovery period (RP) or has recovered the steady state (SS) firing pattern. During the steady state,

*p*reflects the sum of two components: (1)

_{spk}(t)*p*

_{base}, corresponding to the mean steady state firing rate (FR), and (2) an oscillation term, in which 0 ≤

*p*≤

_{osc}*p*

_{base}governs the oscillation amplitude, and

*f*(specified in Hz) indicates the oscillation frequency. Following from [2], we set

_{osc}*p*

_{ose}indirectly through variation of the modulation index,

*m = p*/

_{osc}*p*; i.e., the ratio of the peak and mean steady state FRs, respectively. The above

_{base}*p*

_{osc}constraint implies that 0 ≤

*m*≤ 1. For all simulations,

*p*<< .5, thereby implying that

_{base}*p*remained bounded in [0,1].

_{spk}(t)During the recovery period, *p _{spk}(t)* is set equal to the steady state probability following modulation by an exponential recovery term, where 0 ≤

*k*< 1. Following from [7], our simulations adopted default RP parameter settings of

*k*= 0.7 and

*n*= 9; a set of follow-up tests considered the use of other values (see “Evaluation of the PSD Correction Methods”).

_{r}The duration of the synthetic spike trains (*T*) was constrained to multiples of 1024 ms (the segment size used for spectral analysis). The settings used for *T* and three other free parameters (*p _{base}*,

*m, f*) are described in further detail in “Evaluation of the PSD Correction Methods”. The output of each simulation run was a

_{osc}*T*x 1 binary vector (a “delta vector”), indicating time bins of spike occurrence at a resolution of 1 kHz.

#### Uncorrected PSD

To obtain the initial, uncorrected PSD estimates, we submitted the delta vectors (demeaned segment-wise) to Welch’s method (Matlab *pwelch*, Hamming window length = NFFT = 1024 ms, Fs = 1 kHz, zero segment overlap). PSD functions were represented as 513-element vectors spanning the frequency range [0, 500] Hz, implying a frequency resolution of *Δf* ≈ 0.9766 Hz.

Within each PSD function, we sought points of statistically significant power in the range (0,100] Hz. Significance was determined using the CI-construction procedure from [7] which was described in the Introduction. For our analyses, points in the (0, 100] Hz range of a spectrum were labeled as significant if they exceeded the upper limit of a one-sided, [100*(1-α_{c})]% confidence interval formed using the mean and SD over the [250, 500] Hz range of the same spectrum (α_{c} = Bonferroni-corrected significance level, with respect to the 102 points in the (0,100] range). All figures of uncorrected PSDs were generated with α_{c} = .05.

For the shuffling and residuals methods detailed below, all spectral analysis and statistical testing steps made use of the same routines as were applied to the uncorrected PSDs, and the same default α_{c} = .05 for figure generation; α_{c} was varied away from this default for the ROC analyses described in “Evaluation of the PSD Correction Methods”.

#### Shuffling-corrected PSD

To obtain the shuffling-corrected PSD for a given spike train, we first generated 100 surrogate, ISI-shuffled versions of the original delta vector (Fig. 2B). We implemented the global shuffling method described by [7] (i.e., random permutation of all ISIs across the entire vector). We chose this approach over local shuffling (permutation of ISIs within nonoverlapping segments) since our simulations emphasized relatively low firing rates (necessitating a broad shuffling scope to admit a sufficient number of ISIs) and did not use time-varying simulation parameters (eliminating the need to capture local variations). Additionally, using the same primary evaluation that we describe for the shuffling and residuals methods (see “Evaluation of the PSD Correction Methods”), we verified superior performance for global as compared to local shuffling. Code sufficient to reproduce this observation is available in the Github repository.

Each of the 100 globally shuffled spike trains was submitted to spectral analysis, and the mean of the resulting PSDs was divided from the original uncorrected PSD, producing the final shuffling-corrected PSD.

#### Residuals-corrected PSD

The residuals method entails three steps: (1) estimation of the recovery period duration, (2) fitting a point process model (PPM) to the spike train, which estimates the effect of the time elapsed since the most recent spike, up to a history bound corresponding to the estimated RP duration, and (3) applying spectral analysis to the residuals of the PPM fit.

##### Estimation of Recovery Period Duration

We define the recovery period as a post-spike reduction in a unit’s spike probability, terminating upon return to a steady state firing rate. For our synthetic spike trains (excluding a series of follow-up datasets), the ground truth duration of the recovery period was *n _{r}* = 9 ms.

Our estimation approach builds upon a general logic that was described in [27], although note that these authors did not propose an explicit estimation algorithm. We expand upon these authors’ ideas by detailing a specific algorithm that operates on the probability density function (PDF) of the inter-spike intervals (ISI). To introduce our approach, we use a simple synthetic spike train for which *m* = 0 (i.e., no oscillation present, such that firing is shaped by only *p*_{base} and the RP); however, subsequent analyses will show that the algorithm performs reasonably when *m* > 0.

The ISI PDF for an example spike train generated with *m* = 0 is shown in Fig. S1A. Recall that, in the case of Poisson firing (i.e., *n _{r}* = 0, with an intensity λ =

*p*

_{base}), the observed ISIs are sampled from an exponential PDF [31]:

*P(ISI = x) = λe*. Because a spike train with an RP is not Poisson, an exponential curve fit to such a spike train’s ISI PDF will overestimate the probability mass assigned to the ISI lags ≤

^{-λx}*n*

_{r}. Now, consider what we should observe if we progressively shift the exponential curve fit rightward. Fig S1A illustrates this process for a series of example lags

*L*assigned to the

*x*= 1 position of the exponential fit. When

*L*≤

*n*

_{r}, the exponential curve continues to overestimate the probability mass for the lowest lags. However, when

*L*=

*n*+ 1, the exponential model of this left-cropped ISI distribution is appropriate, as firing outside of the RP is Poisson. Our strategy for identifying the boundary between the end of the RP and the start of the steady state relies on this expected transition. We additionally leverage the observation that statistical evidence in favor of the exponential fit (i.e., the extent to which the addition of an exponential curve explains variance beyond that explained by a constant term alone) tends to be stronger when

_{r}*L*=

*n*+ 1 than when it is incremented to the immediately neighboring lags to the right.

_{r}The estimation algorithm proceeded as follows:

Given a delta vector, compute the ISI histogram, with bins spanning the ms lags in [1, max(ISI)].

For each

*L*in*lags*:Extract the histogram values over the range [L, max(ISI)].

Normalize the extracted histogram to unit area to form a PDF.

Fit a single scaled exponential to the PDF using the generalized linear model (GLM) of the form

*PDF*(*x*) =*e*^{BO}*e*^{B1x}→ log(*PDF*(*x*)) =*B*_{O}+*B*_{1}*x*. See the next subsection for more detail on this form of GLM.Compute the deviance difference, which is a standard statistic for assessing the improvement in goodness-of-fit contributed by a full GLM (“model 1”, here set to the constant term plus the exponential lag effect), compared to a GLM restricted to the constant

*B*term (“model 0”). The difference statistic is Δ_{0}*D*=*D*_{O}−*D*_{1}= −2[log − log ]; i.e., the log likelihood ratio scaled by −2 [31, 32].Upon confirming the first instance of a

*ΔD*value that is a local maximum (*ΔD(L-1) < ΔD(L) > ΔD(L+1*)), break out of the loop iterating over*lags*.

The L associated with the local

*ΔD*maximum marks the start of the steady state region; the estimated recovery period duration, , is set to L-1 ms.

For the synthetic unit with *m* = 0, this process of finding the lag *L* associated with the local *ΔD* maximum is illustrated in Fig. S1B. Note that in this case, the duration of the recovery period is accurately estimated as L-1 = 9 ms.

Figures 3A-B illustrate the same process as applied to the two synthetic oscillatory spike trains introduced in Fig. 2. Although the presence of non-zero oscillation (i.e., *m* > 0) implies that the steady-state firing is not Poisson, we observe that the curve-fitting algorithm is robust to this departure and again correctly identifies the ground truth RP duration of 9 ms. Evaluation of the RP estimation accuracy over a full dataset of synthetic spike trains is presented in the Results.

##### Bounded Last-Spike Point Process Model

Once the RP duration is estimated, the next step is to estimate the RP shape. For this purpose, we fit the spike train with a point process model (PPM).

With reference to [9, 24, 31, 33, 34], we will briefly review the fundamental concepts and terminology for point process models in general, and for our implemented Poisson GLM in particular. A PPM seeks to predict the likelihood that a discrete event (e.g., a spike) will occur at time *t*, given a set of covariates. For a stochastic point process represented in continuous time (e.g., a spike train expressed as a series of spike timestamps), the PPM models the conditional intensity function (CIF):
where *P*[*N*(*t* + Δ*t*) − *N*(*t*) = 1 | *θ*, *H*(*t*)] is the conditional probability of a discrete event instance (e.g., a spike) occurring in the time interval (*t*, *t* + Δ*t*], given a covariate history *H*(*t*) and model parameters *θ*. For our discrete-time spike train representation (the delta vectors), this continuous formulation may be approximated using the assumptions that for *Δt* ≤ 1 ms, *P*(spike in (*t*, t+ *Δt*]) ≈ *λ*[*θ*, *H*(*t*)]Δ*t*. For the remainder of this text, as a shorthand, we will follow the convention of using *t* to index the ms-width bins of the discrete-time representation (as opposed to continuous time).

Given this discrete-time representation, a Poisson GLM (i.e., a generalized linear model with a log link function, also known as Poisson regression) is one example of a PPM that may be used to predict spike probability. We implemented the bounded last-spike model as a Poisson GLM of the form
where *s*(t)* denotes the time of the most recent spike [33], and *δ _{t-j,s∗(t)}* is the Kronecker delta function, returning 1 when

*t-j*equals

*s*(t)*, and 0 otherwise. In other words, the model estimates the variance in the logarithm of spike intensity that is explained by either the constant term alone (outside of the RP) or the constant plus the weighted contribution of the millisecond interval transpired since the last spike. Consequently, inside the RP, λ is modeled as the multiplicative contribution of the exponentiated constant and last-spike factors (

*e*

^{βO}

*e*

^{βj}, for

*j*=

*t*-

*s*), and as

_{*}(t)*e*

^{βO}outside of the RP.

The model was fit to the spike train data using Matlab *fitglm,* which applies the conventional iteratively re-weighted least squares (IRLS) method to seek the log likelihood-maximizing parameters. Note that the log likelihood function for a model such as (3) is known to be concave [9], thereby implying that if a maximum exists, it must be unique.

##### Spectral Analysis of Model Residuals

Given a set of parameter estimates , the raw residuals of the modeled spike train (the observed - fitted values) are calculated as
where *I(t)* represents the binary spike train input to the PPM. The final step of the residuals method is submission of *r(t)* to spectral analysis. Note that (4) only defines *r(t)* for *t* ≥ + 1 ms. To maintain alignment with the original time series (and the divisibility by the 1024 ms segment size), the initial 1024 ms segment submitted to *pwelch* consisted of the initial 1024- ms of computed residuals, demeaned and prepended with zero-padding of ms.

##### Clarifications and Relationship to Pre-existing PPM Approaches

We conclude this description of the residuals method with two points concerning the “bounded last-spike” PPM in particular. The first point concerns the use of the term “bounded”: Here, we intend to emphasize the bounded, -millisecond historical window within which the most recent spike must fall, to be eligible to contribute to the spike intensity prediction at time *t*. We make this point to prevent confusion with another common usage of “bounded”: For a discrete-time point process, the count of events that may occur in each sampled time bin is expected to be bounded in [0,1], thereby satisfying the conditions for “orderliness” [35]. Although our model was not named for this type of boundedness, we note that the orderliness expectation is met by the synthetic time series we are analyzing (given the constraints of our simulation framework) and is very likely to be met by our sample of empirically-acquired spike trains (given the typical assumption of a 1 ms lower bound on the neuronal refractory period [31]).

As a second point, we note that the bounded last-spike PPM draws upon multiple features of pre-existing PPMs in the spike modeling literature, although it is also important to clarify how our current implementation differs from these earlier approaches. For this purpose, two categories of pre-existing PPMs are relevant. The models in both categories aimed to account for spike activity relatively comprehensively (in contrast to our model, which aims to selectively capture the RP effect).

A principal representative of the first category is the Inhomogeneous Markov Interval (IMI) model of [33]. The IMI model predicts spike intensity as a function of the time elapsed since the most recent spike (*t-s*(t)*, with no bound imposed) as further modulated by a function the current clock time (*t*); the two functions’ forms are estimated with regression splines. Therefore, our current model shares the IMI model’s incorporation of a *t-s*(t)* term, but differs in its exclusion of the clock time term, and also of the *t-s*(t)* information when *t-s*(t)* > . Our PPM also differs in its modeling of the *t-s*(t)* effect with a series of delta functions, as opposed to the splines used by [33]. We did test a version of the residuals method that replaced the delta functions with cubic splines (specifically, the modified cardinal splines from [34]), and that adjusted the number and location of the knots according to rules that depended on the magnitude of . The spline-based model was found to perform slightly worse than the delta function version, as demonstrated through the same evaluation procedures as were used to compare global and local shuffling. See the Github repository for code sufficient to reproduce this result.

The second category consists of autoregressive PPMs which predict spike intensity as a function of the spike counts recorded for each of a series of historical time bins (e.g., [25, 26]). Specifically, these models consider a long historical time interval (up to *t* - 150 ms in [25]), divide this interval into time bins of interest, and use the counts of spikes recorded in each bin as individual predictor variables in the PPM. Typically, a subset of narrow (e.g., 1 ms) bins positioned over the most recent historical timepoints (e.g,. −10 through −1 ms) are assumed to capture the combined effects of both the recovery period and any other short-lag history effects (e.g., bursting). Our present approach draws upon these PPMs’ use of indicator functions to model history effects limited within bounded historical intervals, but differs in its exclusive focus on the most recent millseconds, and on the timing of the most recent spike that occurred within that historical window.

#### Evaluation of the PSD Correction Methods

Comparison of the residuals and shuffling methods entailed (1) simulation of synthetic spike trains over a range of physiologically-relevant parameters, (2) generation of the two methods’ corrected PSDs, with tabulation of hit and false alarm rates for these PSDs across a range of α_{c} levels, and (3) comparison of the methods’ classification of true and false oscillations, by way of ROC (Receiver Operating Characteristic) analysis (collapsed across parameter settings) and regression analysis of the residuals-shuffling differences in the hit and false alarm rates (to examine parameter effects).

##### Synthetic Spike Train Test Sets

Method evaluation was first carried out on a primary synthetic test set; subsequently, we generated a number of additional test sets to address follow-up questions. For each test set, and for each of the unique combinations resulting from the crossing of the varied simulation parameters, we randomly generated one hundred spike trains.

For the primary test set, 100 spike trains were simulated for each of the 540 unique combinations that resulted from crossing the following four varied parameters:

Spike train duration (

*T*): {30,60,120} × 1024 ms windowsOscillation frequency (

*f*): {7,9,12,20,32} Hz_{osc}Base firing rate offset (

*p*, such that_{base_offset}*p*=_{base}*f*+_{osc}*p*): +2_{base_offset}^{[0:1:5]}HzModulation index (

*m = p*): [0:0.2:1]_{osc}/p_{base}

These parameters emphasize relatively low duration and firing rate settings, with the short durations consistent with the limitations that can be posed by real experimental data, and the low FRs in line with the motivation for developing the residuals method. At the same time, the setting of *p*_{base} as a positive offset from *f*_{osc} ensured that, in most cases, the synthetic unit averaged > 1 spike per oscillation cycle, although the recovery period implied that the average dropped to slightly < 1 in some of the *f _{osc}* + 1 cases. For

*m*, note that the inclusion of 0 allowed for tracking of false positive detections of significant power when no oscillatory effect was introduced into the simulation.

Three kinds of variants on this primary dataset were created to address follow-up questions. First, we generated one dataset in which *p _{base}* and

*p*were directly manipulated, leaving

_{base_offset}*f*=

_{osc}*p*-

_{base}*p-*as the indirectly manipulated variable. This dataset allowed for tests to determine whether the methods’ relative performance was significantly influenced by the magnitude of the

_{base_offset}*p*difference, when controlling for

_{base}- f_{osc}*p*itself. For this dataset, the simulations used the same

_{base}*T*and

*m*settings as were used for the primary dataset. We crossed these factors with two levels of

*p*(15 Hz, 20 Hz) and 4 levels of

_{base}*p*([2:2:8] Hz), resulting in 144 unique parameter combinations.

_{base_offset}The second and third variations involved the use of higher FRs, and alternatives to the default RP parameters, respectively. For the high-FR simulations, the set of sampled *p _{base_offset}* values was changed to [35:10:85] Hz; these were added to the

*f*values used for the primary dataset ((2) in the list above) to determine the

_{osc}*p*settings. For the altered-RP simulations, we tested both shorter and longer RP durations (

_{base}*n*) and varying exponential steepness terms (

_{r}*k*); full details are described in the Results.

##### Labeling of Hits and False Alarms

For each of the 540 x 100 (or 144 x 100) synthetic spike trains in a test set, the shuffling and residuals methods were both applied to produce corrected PSDs. Given a corrected PSD and an α_{c} setting, statistical testing returned a binary vector indicating whether oscillatory power exceeded the CI bound for each of the 102 frequency bins of width *Δf* = 0.9766 Hz in (0,100] Hz. Frequency bins are labeled by their lower bound (0.9766, 1.9531, 2.9297…). Let **f** denote the frequency labels and s(f) denote the binary significance output. We classified a PSD as containing a “hit” (i.e., a true positive detection of oscillatory power) if (1) m > 0 and (2) any s(f) = 1 for those labels *f*_{i} ∊ **f** that were the 3 nearest neighbors of *f*_{osc} (e.g., *f*_{i} ∊ {10.7422,11.7188,12.6953} for *f*_{osc} = 12 Hz). We classified a PSD as containing a false alarm (FA) if either of two cases held true: (1) *m* = 0, s(f) = 1 for any label *f*_{i} ∊ **f**, or (2) *m* > 0, s(f) = 1 for *f _{i}* ∊ |

*f*

_{i}-

*f*

_{osc}| > 5 Hz.

Note that the output of this classification procedure was a pair of binary labels for each PSD, indicating whether each contained ≥ 1 hit, and ≥ 1 FA, respectively. For the ROC analysis described below, counts of hits and FAs refer to the counts of these PSD-level labels (as opposed to the total counts of individual hit- or FA-classified frequency bins).

##### Comparison of ROC curves for the Correction Methods

ROC analysis of the shuffling and residuals methods consisted of three steps: (1) random selection of subsamples of the PSDs, to be used for a bootstrapping procedure for statistical testing, (2) calculation of partial ROC curves and corresponding partial area under the curve (pAUC) values for the subsampled PSDs, and (3) statistical comparison of the pAUC values yielded by the shuffling and residuals methods.

The subsampling procedure was iterated 1000 times. For each subsampling iteration, we randomly sampled, without replacement, 20 of the 100 synthetic spike trains generated within each of the 540 (or 144) parameter combinations. The full resulting subsample of 540 x 20 = 10800 spike trains (or 144 x 20 = 2880 spike trains), and the corresponding hit and FA classifications for the shuffling- and residuals-corrected PSDs, were used to generate separate shuffling and residuals ROC curves for that iteration.

For a given subsampling iteration and correction method, we formed an ROC curve as the function relating hits to FAs over a range of α_{c} values. The full set of α_{c} values consisted of the sorted vector of the values 1 and 5 multiplied by a series of negative powers of 10 ([1 5] ^{T} * [10^{[-8:1:-1}]) and 1. The curves were formed by collapsing across the simulation parameter combinations, such that a point on an ROC curve represented the fraction of the 10800 corrected PSDs that were associated with ≥ 1 hit and ≥ 1 FA, respectively. (Note that the inclusion of the *m* = 0 case implies that the hit rate would fall short of an upper asymptote of 1 for an ideal observer.)

For each analyzed dataset, the above procedure resulted in 1000 pairs of shuffling- and residuals-generated ROC curves. As the ranges of spanned hit and FA rates varied slightly across the individual ROC curves, we analyzed the classification performance represented by these curves with a partial area under the curve (pAUC) approach [36]. Specifically, the raw curves were cropped (using linear interpolation as needed) to span the intersection of all 2000 curves’ FA ranges. For each cropped curve, we used the trapezoidal rule (Matlab *trapz*) to compute the area under this bounded curve region. Due to the bootstrapping process, the paired differences in the shuffling and residuals methods’ pAUC estimates (D_{pAUC}(i) = pAUC_{res}(i) – pAUC_{shuf}(i)) were assumed to approximate a normal distribution; therefore, a two-tailed paired *t*-test was used to assess the null hypothesis of D_{pAUC} = 0.

##### Analysis of Simulation Parameter Effects on Relative Hit and False Alarm Rates

To examine the simulation parameter effects, we focused on a fixed α_{c} = .05 threshold, and applied regression analyses to the residuals-shuffling differences in HRs (D_{HR}) and FAs (D_{FA}) separately. To avoid specific distributional assumptions regarding D_{HR} and D_{FA}, we conducted these analyses using the same 1000 subsamples as were used for the pROC tests. We considered all simulation parameter combinations, with the exception of those for which *m* = 0 (to omit cases when D_{HR} = 0 - 0 by definition, and with this same exception applied to the D_{FA} analyses for consistency). Consequently, 450 x 1000 (or 120 x 1000) unique D_{HR} and D_{FA} estimates, respectively, contributed to the two analyses, for each dataset on which they were performed.

Separate general linear models for the two dependent variables (DVs), D_{HR} and D_{FA}, were constructed using five independent variables (IVs): the mean-centered equivalents of the four directly manipulated simulation parameters (*T*, *f*_{osc}, *p _{base_offset}*, and

*m*for most datasets, with

*f*replaced with

_{osc}*p*when

_{base}*p*was directly manipulated), and also the square of the mean-centered modulation strength parameter ((

_{base}*m*-

*m*-)

^{2}), given trends in the data visualizations that were suggestive of quadratic effects of the

*m*factor (Fig. 4A-B, Fig. S2). For each DV, the full GLM consisted of the intercept term plus the sum of the main effect, 2-way interaction, and 3-way interaction terms formed from the five IVs.

##### Post-Hoc Analyses of False Alarms in Spike Trains with High Firing Rates

In addition to the above ROC and regression analyses, we ran an additional set of post-hoc analyses to the synthetic dataset that had been generated with high FR settings (see “Synthetic Spike Train Test Sets” above). These follow-up assessments were motivated by the observation of unexpectedly high FA rates, especially for spike trains generated with long *T* and high *m*, and either low *f _{osc}* (residuals method) or high

*f*(shuffling method; see Fig. S8-S10). We report three analyses, all of which summarized over the 1000 spike train subsamples, and focused on the

_{osc}*T*= 120 x 1024 ms and

*m*= 1.0 conditions.

The first analysis was a linear regression consisting of two IVs (*f _{osc,} p_{base_}*

_{offset}) and the DV (the subsampled D

_{FA}scores). Consequently, 30 (= number of

*f*x

_{osc}*p*

_{base_}_{offset}combinations) x 1000 (=

*n*subsamples) unique D

_{FA}estimates contributed to this analysis. The regression model included both main effects and the

*f*x

_{osc}*p*interaction term.

_{base_offset}The second and third analyses entailed the calculation of descriptive statistics. These statistics were computed separately for the two correction methods. For one set of descriptive statistics, we report the mean absolute FA rate for each of the two methods and the two most extreme *f*_{osc} cases (7 Hz, 32 Hz); this value simply corresponded to the grand average of the FA rates, collapsed over the 1000 subsamples and six *p _{base_offset}* settings. For the other set, we report summaries of the peak power (and associated frequencies) for the PSDs’ false alarm points, expressed as a ratio over the peak power for the hit points. For each correction method, we iterated over the 1000 subsamples of spike trains, and for each subsample, over the 30 unique

*f*x

_{osc}*p*

_{base}_{_offset}setting combinations. Within each subsample, each unique

*f*x

_{osc}*p*

_{base}_{_offset}setting combination is represented by 20 spike trains (given the constraint that

*T*= 120 x 1024 ms and

*m*= 1.0). For the correction method under consideration, we identified the FA and hit points present within the 20 respective corrected PSDs. For each PSD, the peak (i.e., maximum) power estimates were identified over any identified FA and hit points, respectively. For a PSD that included both FA and hit points (as was common in the

*T*= 120 x 1024 ms,

*m*= 1.0 cases), we computed the peak

_{FA}/ peak

_{Hit}power ratio. These peak ratios and the associated peak FA frequencies were then averaged over the PSDs (for which they were defined), and then averaged over the 30 parameter setting combinations and 1000 subsamples.

### Experimental Data Methods

#### Subject

We analyzed data from a single rhesus monkey (monkey G, 7.1 kg, female) which was one of two animals studied in an investigation of MPTP (1-methyl-4-phenyl-1,2,3,6-tetrahydropyridine)-induced parkinsonism. Here, we selectively analyzed data from the post-MPTP state. A prior report based on the pre-MPTP data [30] provides detailed explanations of several components of the experimental protocol that are also applicable to the post-MPTP dataset; for these components, we briefly recap the main points below.

#### Task

Single-unit extracellular signals were collected during performance of a visually-cued reaching task. on which monkey G had been trained to a high degree of proficiency. Food rewards were delivered for successfully performed trials. Full task details are available in [30]. Here, the most relevant task component is the pre-reach delay period that spanned a variable duration interval (1.01 - 3.50 s for the trials included in our analyses) from trial start to go-cue onset. During this delay period, the left hand rested at a start position (a metal bar) fixed at waist height to the animal’s left. The onsets/offsets of hand contact with this start position were detected with an infrared sensor.

#### Surgery

Devices were implanted surgically to allow access to the right thalamus and globus pallidus via a parasagittal approach. The surgery was performed under sterile conditions with ketamine induction followed by Isoflurane anesthesia. Vital signs (i.e., pulse rate, blood pressure, respiration, end-tidal pCO2, and EKG) were monitored throughout the surgery to ensure a proper level of anesthesia. Analgesics and antibiotics were administered prophylactically.

On protocols.io, we have posted two step-by-step protocols relevant to the surgical procedures (craniotomy https://www.protocols.io/private/8791490F352811EEAFC70A58A9FEAC02 and head fixation post and recording chamber implantation https://www.protocols.io/private/A317A58E352711EEAFC70A58A9FEAC02).

#### MPTP Administration

The animal was rendered hemiparkinsonian by injection of MPTP into the right internal carotid artery (0.5-mg/kg, [37]). This model of parkinsonism was chosen to facilitate care of the animal and to increase the likelihood that the animal would perform the behavioral task following intoxication [38]. The MPTP administration procedure was performed under general anesthesia (1-3% Isoflurane) and prophylactic antibiotics and analgesics were administered post-surgically. The animal developed stable signs of parkinsonism contralateral to the infusion (i.e., on the left side of the body).

On protocols.io, we have posted a protocol that includes a description of the procedures for internal carotid artery administration of MPTP (https://www.protocols.io/private/7BED34B9490A11EEACE60A58A9FEAC02).

#### Data Acquisition and Spike Sorting

Regions of GPi and VLa to be targeted for single-unit recording were located using standard neurophysiologic mapping methods [30]. The extracellular spiking activity of neurons in GPi and VLa was collected using multiple glass-insulated tungsten microelectrodes (0.5–1.5MΩ, Alpha Omega Co.) or 16-contact linear probes (0.5–1.0MΩ, V-probe, Plexon Inc.). Data were amplified (4×, 2Hz–7.5kHz), digitized at 24 kHz (16-bit resolution; Tucker Davis Technologies), and saved to disk as continuous data. The stored neuronal data were high-pass filtered (Fpass: 300Hz, Matlab *firpm*) and thresholded, and candidate action potentials were sorted into clusters in principal components space (Off-line Sorter, Plexon Inc.; RRID:SCR_000012). Clusters were accepted as well-isolated single-units only if the unit’s action potentials were of a consistent shape and could be separated reliably from the waveforms of other neurons as well as from background noise throughout the period of recording. Times of spike occurrence were saved at millisecond accuracy.

On protocols.io, we have posted protocols relevant to electrophysiological recording (https://www.protocols.io/private/7CF970AD3BE111EE882A0A58A9FEAC02) and recording chamber maintenance (https://www.protocols.io/private/F56F59C63BDD11EE882A0A58A9FEAC02).

#### Task-Aligned Analysis Windows

To focus on intervals of the monkey G spike trains when oscillatory activity was expected to be high, we restricted our analyses to task windows spanning the first second of each trial. These windows sampled data from the pre-reach delay periods; such intervals of enforced hand stillness have been historically associated with elevated beta power [39]. The 1000 ms window was chosen instead of the PSD segment size (1024 ms) because a small fraction of the go-cue onsets occurred at post-start latencies in (1000, 1024] ms. This selection step jointly served a second purpose of reducing potential sources of substantial non-stationarity in the analyzed spike data (along with a firing rate screening step, described further below).

#### Unit and Trial Selection Criteria

The process of selecting units and trials to include in analysis proceeded in three stages. As a first step, we excluded units recorded from the GPi and VLa when these failed to meet basic quality checks (e.g., poor isolation during spike sorting, or highly anomalous spike waveform features) or exhibited outlier FRs (as measured over the full recording period) of > 106.48 Hz (GPi) or > 39.50 Hz (VLa). Units with FRs < 1 Hz were also excluded.

A second, trial selection step was prompted by the observation of units for which the “hold” (i.e., pre-reach delay) period-windowed data exhibited visible inhomogeneity in firing rates over trials, often appearing as blocks of trials during which spiking was either absent or very sparse, preceded or followed by blocks for which spiking was more regular. As this pattern was suggestive of considerable non-stationarity remaining in the hold period-restricted data, we implemented a procedure to detect and remove windows associated with trials with aberrant firing rates. This procedure itself entailed multiple steps:

Prepare an

*nW*x 1 vector**S**indicating the number of spikes in each 1000 s window.Remove those rows from

**S**reporting < 2 spikes.Use a Gaussian Mixture Model (GMM) to cluster the remaining entries of

**S**, using a forward search procedure to set the number of Gaussians (*k*) parameter:For each

**k**in 1 to length(**S**):Fit

*k*Gaussians to the data in**S**(Matlab*fitgmdist*) using initial centers (*µ*), variance (*σ*), and mixing proportion (^{2}*p*) settings as determined by the Variance Partitioning method [40].Record the Bayesian Information Criterion (BIC) value for this

*k*. The BIC is a negative log-likelihood-based measure that is penalized by the magnitude of*k*; lower BIC values are preferable.Break out of the loop when either:

BIC(k) > BIC(k-1)

A failure to converge is reported for iteration

*k*.

Upon exiting the loop, return the GMM with the number of Gaussians set to

*k*-*1*.

Assign each row in

**S**to one of the*k*Gaussians (hard clustering with Matlab*cluster*, which assigns the cluster labels that maximize the probability of observing the data, given the GMM parameters).Find the Gaussian

**G**with the maximum number of assigned windows. Retain only those windows from**S**that were assigned to**G**.From the windows assigned to

**G,**remove those windows for which the number of spikes*nSpk*is an outlier, given the mean and standard deviation parameters estimated for**G**. A window is defined as an outlier when |*nSpk*-*µ*| > 3*_{g}*σ*)._{g}

For each unit, the output of this trial (i.e., window) selection step was assumed to represent a set of 1000 ms segments for which the firing rates were relatively homogeneous. As a final step, we discarded those units for which < 30 windows survived the trial selection procedure, or for which the mean firing rate over the surviving windows was < 1 Hz.

#### Windowed Data: Uncorrected PSD

Note that Welch’s method processes time series segments independently (i.e., the final estimated PSDs are the average of the individual PSD estimates for the Hamming-windowed 1024 ms data segments). Prior to calling *pwelch*, we prepared the hold period data by demeaning and adding 24 ms of zero padding to each 1 s hold period vector, and then concatenating these 1 s vectors into a single time series. This vector was then submitted to identical PSD analysis procedures as were described for the synthetic data.

#### Windowed Data: Shuffling-corrected PSD

Similar to the synthetic dataset, many of the recorded units exhibited low FRs, such that local shuffling was not a practical option. We extended the global shuffling option from [7] (which was only defined for unbroken time series) to temporally separated data segments with the following algorithm (all indices assume indexing starts at 1):

Input:

**dMat**(“delta matrix”), an*nD*x*T*matrix, consisting of the original*nD*delta vectors of length*T*msec (here,*T*= 1000;*nD*= number of trials).Create a list,

**sISI**, consisting of the concatenation of all ISIs encountered in**dMat**, sorted in descending order of size. (Note that**sISI**will effectively function like a max-oriented priority queue.)Initialize an

*nD*x*T*matrix**wMat**(“working matrix”) with zero entries.Initialize an

*nD*x 1 vector**timeLeft**with entries of the value*T*-1 (here, 999), corresponding to the size of the largest ISI that might be inserted into each row of**wMat**. (Note that 1 ms must be reserved for insertion of the first spike for the first ISI).While

**sISI**is not empty:**isi**← top (max duration) entry, removed from**sISI****openRows**← indices of rows in which**timeLeft**≥**isi****targetRow**← randomly chosen row from**openRows**Insert

**isi**into**wMat[targetRow]**:If this is the first ISI entered into targetRow, place “spikes” (1 entries) at

**wMat[targetRow][1]**and**wMat[targetRow][1+isi]**Otherwise, place a spike at

**wMat[targetRow][i+isi]**, where**i**is the index of the rightmost spike present in the row

**timeLeft[targetRow] ← timeLeft[targetRow] - isi**

For each row in

**wMat:**If > 2 spikes are present in a row, randomly re-sequence the ISIs (as in the standard shuffling procedure)

If

**timeLeft[row]**> 0: Right shift the spikes (insert a leading series of 0s) by a random integer drawn from [0,**timeLeft[row]**]

**shufMat**(“shuffled matrix”) ←**wMat**

To summarize, the algorithm implements global ISI shuffling by (1) initializing an empty data matrix, and a pool of ISIs to be inserted into it, (2) randomly inserting ISIs into the new matrix, proceeding from the longest to shortest ISIs, and packing the rows in a left-aligned fashion, and (3) randomly re-sequencing and right-shifting the ISIs in each row, to eliminate the structure introduced by step (2).

Each matrix of shuffled spikes was submitted to PSD analysis through an identical manner as was described for the uncorrected PSDs, and the standard division of the original PSD by the mean of the shuffling-generated PSDs was then used to produce the corrected PSD.

#### Windowed Data: Residuals-corrected PSD

Recall that, in the case of an unbroken time series, the residuals output, *r(t)*, is not defined for the first ms. In the case of the windowed data, the initial ms must be dropped for the residuals output corresponding to each trial. Therefore, the delta vectors corresponding to each individual trial were zero-padded by 24+ ms, to both account for the initial RP-dependent lag, and to match the expected 1024 ms segment size. All other aspects of the residuals method remain as described in the context of the unbroken synthetic data.

#### Reference Synthetic Data for Comparison with Bursting Units

We conducted follow-up examination of two VLa spike trains, which exhibited spiking timing patterns and corrected PSDs that were distinct from those generally observed in our synthetic datasets. To better characterize the distinctive features of these spike trains, which included suspected burst-firing behavior, we generated additional sets of reference spike trains, which did not include a bursting component, but did roughly match the two VLa units with respect to other key parameters. All simulations used the same framework as described above. For the synthetic data matched to the first VLa unit, *T* = 118 x 1024 ms, *f _{osc}* = 13 Hz, and

*p*= 15 Hz. For the simulations matched to the second VLa unit,

_{base}*T*= 51 x 1024 ms,

*f*= 14 Hz, and

_{osc}*p*= 21 Hz. For both sets of simulations,

_{base}*m*= 0.6,

*n*= 1, and

_{r}*k*= 0, and

*n*= 100 synthetic spike trains were generated.

#### Population-Level Comparisons of Oscillation Detection Rates

We used three analyses to compare the counts of significant oscillations reported by the two methods, for specific frequency bands of interest. All three analyses were performed over the pooled GPi and VLa data. The first analysis compared the proportions of PSDs for which the residuals and shuffling methods reported at least one point of significant power within the 8-30 Hz alpha-beta band. Proportions were compared with McNemar’s χ^{2} test, with continuity correction (R *mcnemar.test*, R Core Team, 2021; RRID:SCR_001905). The second analysis focused specifically on those PSDs for which at least one method reported an alpha-beta oscillation (as defined above), and used logistic regression (Matlab *glmfit*, logit link) to ask whether the corresponding unit’s mean FR inversely predicted residuals-specific identification of alpha-beta incidence. The dependent measure was a binary coding of the PSDs (1 = alpha-beta detected by residuals method alone; 0 = detection by either both methods or shuffling only). The third analysis used McNemar’s χ^{2} to compare the proportions of PSDs for which the residuals and shuffling methods reported at least two consecutive points of significant power in the (0,4] Hz delta range. This delta band measure was intended to serve as a substitute for more traditional forms of 1/f-like trend assessment (which involve curve fits that do not accommodate possible RP distortion [3]).

## Results

### Results Overview

In the following, we first consider the shuffling and residuals methods in the context of synthetic spike trains. We will highlight two example cases that aid in illustrating the procedures and outputs of the two correction methods, and then present the results of a systematic comparison of the methods’ performance over simulations that encompass diverse parameter settings. Subsequently, we present the shuffling- and residuals-corrected output for the empirical single-unit data collected from the parkinsonian monkey, and highlight cases in which the divergence of the two methods’ output is particularly notable.

### Synthetic Data Results

#### Shuffling Method: Examples of Detected and Non-Detected Oscillations

Stochastic spike trains were generated according to the framework described in [2, 7]; full details are provided in the Materials and Methods. To briefly recap, spike arrivals were sampled from 1 kHz rate functions, *p*_{spk}(t), which could be varied with respect to the parameters that governed the exponentially rising recovery period (steepness *k*; duration *n _{r}*), the oscillation (frequency

*f*; modulation strength

_{osc}*m*), the baseline FR (

*p*), and the simulation duration (

_{base}*T*). We considered a variety of parameter settings, although with an initial priority on relatively low

*p*values, given the challenges posed by cases in which oscillation detection may be particularly reliant on the information present in the ISI distribution.

_{base}Figure 2 presents two illustrative cases generated from these simulation procedures (spike train duration *T* = 60 *x* 1024 ms, *f*_{osc} = 9 Hz; for all synthetic data, *n _{r}* = 9 and

*k*= 0.7, unless otherwise specified). The two cases are similar in that standard statistical assessment of the uncorrected spike trains fails to detect the underlying 9 Hz oscillatory component, but differ in the factors that likely contribute to this failure, and in the shuffling method’s ability to recover this spectral feature.

Figure 2A shows the uncorrected spectra. As compared to the second (Fig 2A *right*) example, the first spike train was generated with a relatively high base FR (41 Hz), but with a somewhat low amplitude modulation factor (*m* = 40%). As noted in the Introduction, high FRs lead to more dramatic RP-related distortions of the associated spectra [7]. Given these combined limitations (low *m*, greater distortion), the 9 Hz peak fails to cross the upper bound of the 95% corrected CI. The second spike train was generated with a stronger modulation setting (*m* = 60%), and a lower base FR (11 Hz), which results in less dramatic RP distortion; however, these advantages are offset by the reduced oscillation signal-to-noise ratio (SNR) that is also a known consequence of low FRs [2].

To both of these spike trains, we applied the standard global shuffling procedure (Fig 2B), yielding estimates of the renewal-equivalent power spectra (Fig 2A, red lines) and the shuffling-corrected PSDs. For the high FR, low *m* example, the mean shuffled data PSD follows the RP distortion trend while avoiding the implanted oscillation. Consequently, division by this shuffling-generated PSD achieves its intended outcome: The corrected spectrum (Fig 2C *left*) is generally flat, apart from the 9 Hz peak, which does surpass the upper frequency-derived CI threshold.

However, such successful peak recovery is not observed for the low FR, high *m* example (Fig 2C *right*). In this case, the shuffled PSD appears to capture both the distortion trend, and, to a modest extent, the 9 Hz peak. Correction reduces the relative height of this spectral feature, thereby compounding the challenges to detecting the spectral peak that were already present due to the low SNR. Even with the RP distortion removed, the 9 Hz peak falls just short of the significance threshold.

Figure 2D, which plots the ISI PDFs for the two cases, provides a time domain perspective on these different shuffling outcomes, by highlighting the spike timing information that shuffling correction removes. In the higher FR, lower *m* case (Fig 2D *left*), > 50% of the ISIs were ≤ 23 ms, thereby limiting the opportunity for the somewhat modest amplitude, 9 Hz oscillatory trend (period ≈ 111.11 ms) to substantially influence the shape of the ISI distribution. Qualitatively, the PDF resembles the form expected for a Poisson process modulated by a recovery period alone (i.e., an initial dip followed by an exponential distribution). For the lower FR, higher *m* case (Fig 2D *right*), 50% of the ISIs exceeded 73 ms (≈65.7% of the 9 Hz cycle length), and > 31% were 112 ms or longer (i.e., the spiking skipped entire cycles). These long survival times afforded greater opportunity for the moderately strong oscillation term to more visibly shape the ISI distribution, with relative peaks and valleys appearing in rough correspondence to that expected for the 9 Hz rhythm.

To summarize, the lower FR/higher *m* example illustrates an instance relevant to the general caveat raised by [7]: If much of the rhythmicity of interest is present in the ISI distribution, then caution is warranted when applying the shuffling method (or the analytical alternative), as such correction will remove any oscillatory information present in the first-order ISI statistics. What our example here helps to clarify is that such scenarios emerge not only from precise pacemaker-like patterns, but also in spike trains in which a known sinusoidal drive is expressed through relatively sparse spiking. This observation motivated our interest in examining how the residuals method, which models and removes spike history effects restricted to a short history window, might perform in such scenarios.

#### Residuals Method: Procedure and Example Case Performance

Figure 3 illustrates the steps of the residuals method, and the outcomes for the two example synthetic spike trains introduced above. To reiterate, the main steps include (1) estimation of the RP duration, (2) fitting the spike train with the bounded last-spike point process model (PPM), with the bound set to the RP duration, and (3) applying spectral analysis to the residuals of the fitted PPM.

##### RP Estimation Algorithm

The RP duration estimation method is illustrated in Fig. 3A-B (for the two example cases) and also Fig. S1 (for a spike train with no oscillation). To recap (see Materials and Methods for full details), we sought to translate the general strategy outlined by [27] into a specific algorithm that might perform with reasonable effectiveness on limited, sparse spiking data.

We specifically implemented a procedure that compares the variance explained by a series of right-shifted exponential curves fit to the ISI PDF. We pursued this approach after observing, on cases such as the Fig. 2-3 examples, that the curve goodness-of-fit often reaches its first local maximum when the exponential is anchored to the first post-RP time bin (i.e., when a unit with *m* = 0% should first exhibit ISIs consistent with Poisson spiking), even when noise and oscillatory trends create visible departures of the post-RP region from a clean exponential form. In Fig. 3A we highlight select iterations of the right-shifted curve fits, for the two examples. When the curves are left-anchored to the first possible start lag (1 ms), one observes visibly poor fits over the initial, RP-spanning time bins. As shown in Fig. 3B, these mismatches correspond to relatively low values for our goodness of fit metric (the deviance difference, see Materials and Methods). Iterating through the subsequent RP-containing bins, the fit measure gradually improves, reaching local peaks (in both examples) at the first post-RP time bin of 10 ms. After detecting subsequent decreases in the exponential term’s explanatory power at 11 ms, we confirm the presence of initial local maxima, and return, for both spike trains, RP duration estimates () that match the ground truth *n _{r}* of 9 ms.

Note that more thorough evaluation of RP estimation performance is provided in the Methods Evaluation section, where we report on the systematic comparison of the shuffling and residuals methods.

##### Bounded Last-Spike Point Process Model

The next step of the residuals method entails the estimation of the shape of the RP (i.e., the *t-s _{*}(t)* effect; see Materials and Methods) bounded within a historical interval of millseconds. Fig. 3C illustrates the inputs and outputs of our current estimation procedure on the two example cases and with = 9 ms. As in global shuffling, the input (leftmost panels, zoomed in on the first 209 ms) was the entire binary spike train vector.

As stated in the Introduction, we chose to use the point process modeling (PPM) framework to estimate the *t-s _{*}(t)* effect. In general, PPMs may operate on binary incidence vectors to estimate discrete-time approximations of conditional intensity functions (CIFs; see [9, 31] for additional background). When PPMs are used to predict future spikes on the basis of spike history, the CIF (

*λ*[

*t*|

*θ*,

*H*(

*t*)]) represents the time varying probability that a spike will occur in time bin

*t*, given the spiking history

*H(t)*and the model parameters (ϴ). PPMs can be implemented in a variety of frameworks; we chose the well-established route of utilizing the Poisson Generalized Linear Model framework (i.e., linear regression with a log link function). The GLM took on the form , where β

_{0}is the intercept term, the terms indicate whether the most recent spike fell within the time bin corresponding to

*j*ms in the past, and the

*β*weight the additive contributions of these last-spike lags to the predicted logλ(t). Once the vector of coefficient estimates is found, the linear expression is exponentiated to obtain the fit of the model to the original spike train (excluding the first time bins for which no fit is defined). Due to the exponentiation step, the constant and the last-spike factors combine multiplicatively in the estimation of λ(t) [9].

_{j}Qualitatively, the GLM fits to the two example spike trains (Fig. 3C, center panels) do appear to approximate the forms anticipated on the basis of the ground truth RP and *p _{base}* parameters. When no spikes occurred in the prior ms, the predicted λ(t) were approximately 0.041 and 0.011 for examples #1 and #2, respectively. When spikes did occur, the predicted λ(t) approximated the gradually rising RP trend.

##### Spectral Analysis

Given a GLM fit, the residuals time series is generated as the actual-predicted time series (Fig. 3C, rightmost panels; note the initial time bins are filled with zero padding). The residuals are submitted to identical spectral analysis steps as are applied to the original time series. For the two example cases (Fig. 3D), the known 9 Hz peaks are successfully recovered, including for the lower FR spike train (Fig 3D, *right*) for which the shuffling method did not detect significant oscillation. This sensitivity gain – in tandem with the desired flattening of the RP distortion – is the result that we aimed to achieve by selectively removing bounded *t-s _{*}(t)* effects, and leaving intact the temporal structure expressed by the longer lags of the ISI distribution.

#### Methods Evaluation

##### Oscillation Detection Performance: Low-to-Moderate Spike Rates

So far, we have shown a pair of example spike trains, for which the residuals method either matched or exceeded the shuffling method’s detection of a known oscillation. To compare the two methods comprehensively, we ran a series of simulations and analyses that took into account a broad set of factors that may influence method performance, including the simulation parameter settings, and the corrected alpha level (α_{c}) of the significance test. Evaluation began with a primary synthetic dataset that focused on low-to-moderate base firing rates (range 8-64 Hz), oscillations in the theta-low gamma ranges (7-32 Hz), and the default recovery period parameters; we also generated a series of secondary datasets to address follow-up questions.

For the primary dataset, we directly manipulated four simulation parameters (*T*, *f _{osc}*,

*m*, and

*p*=

_{base_offset}*p*); the tables in Fig. 4A-B list the full set of parameter settings. For each unique setting combination, we generated 100 spike trains, computed the shuffling- and residuals-corrected PSDs, and tabulated the true and false positive oscillation detection rates for each (see the Fig. 4 caption for the hit rate (HR) and false alarm (FA) rate definitions).

_{base}- f_{osc}To compare the classification accuracy of the two methods over a range of α_{c} thresholds, we conducted a pROC (partial Receiver Operating Characteristic) analysis (Fig. 4D). This analysis was implemented over a pooled dataset (i.e., collapsing over the parameter settings in Fig. 4A-B, which are further considered in the additional analyses below). As in traditional ROC analysis, we plotted hit versus false alarm rates as the decision threshold (α_{c}) was varied, with shuffling and residuals curves generated for each of 1000 subsamples of the full spike train dataset (with the subsampling performed to obtain bootstrapped estimates of variability). Since the original curves spanned variable FA and HR ranges, which did not fully span [0,1] (even with the inclusion of extreme significance thresholds), we truncated each curve to a partial ROC estimate that covered a restricted FA range (see Fig. 4 caption for details). The above steps enabled confirmation that the partial area under the ROC curve (pAUC) was reliably greater for the residuals method as compared against the shuffling method (*t*-test of pAUC_{res}-pAUC_{shuf} : t(999) = 1.3048e+03, *p* << .001). This result indicates that the residuals method generally improves oscillation detection performance on this collapsed-parameter, primary synthetic dataset.

We have described a rationale for anticipating that the residuals method may be especially advantageous for the lowest-FR spike trains, and in particular for sensitivity to the detection of oscillations when the base FR is low relative to the frequency of the oscillation to be detected. This reasoning leads to the prediction of an inverse relationship between *p _{base_offset}* and the magnitude of the residuals-associated improvement in hit rates. A qualitative trend consistent with this prediction is visible in the top right table of Fig. 4A (representing D

_{HR}= the residuals-shuffling differences in HR, for the original, non-subsampled data). For the select simulation parameters reported in the table, the predicted inverse relationship between D

_{HR}and

*p*is apparent as a general tendency for higher D

_{base_offset}_{HR}values (and warmer heatmap colors) towards the lower rows (corresponding to lower

*p*

_{base_offset}settings) of the table. The Fig. 4A D

_{HR}values also point to a general inverted U-shaped trend relating

*m*to D

_{HR}, such that the residuals’ advantage appears to be strongest for moderate modulation strengths, as compared to the lowest

*m*levels (when the oscillation is either non-existent or too weak for either method to detect) and highest

*m*levels (when the oscillation is readily detected by both methods).

To quantify and statistically test these trends, and to assess the parametric effects on the methods’ relative sensitivity and specificity more broadly, we conducted separate regression analyses of D_{HR} and D_{FA} (FA_{res}-FA_{shuf}), with the significance threshold held fixed (α_{c} = .05), and the manipulated simulation parameters taken into consideration. As the regression analyses assume normally distributed noise in D_{HR} and D_{FA}, we conducted these regressions using D_{HR} and D_{FA} values derived from the same subsampled HR and FA output as informed the pROC analysis. Fig. S2-S3 show D_{HR} and D_{FA} (summarized as the mean +/- standard error over subsamples) for all 540 unique combinations of the manipulated parameters. The D_{HR}- and D_{FA}-specific regression models included predictors corresponding to all four directly manipulated parameters (T,f_{osc}, *p _{base_offset}*, and

*m*, with all

*m*= 0 cases excluded), and a quadratic term for

*m*.

The full regression results for the primary synthetic dataset are provided in Tables S1-S2; we highlight a subset of the significant effects here. The D_{HR} results (Table S1) statistically confirmed the predicted inverse relationship between D_{HR} and *p _{base_offset}* (

*p*term:

_{base_offset}*t*= −292.25,

*p*<< .001) and the observed inverted U-shaped relationship between D

_{HR}and

*m*(

*m*

^{2}term:

*t*= −296.41,

*p*<< .001). As an additional test of the

*p*effect – which was correlated with

_{base_offset}*p*

_{base}in these simulations, thereby creating ambiguity in the interpretation of the effect as reflecting either base FR -

*f*or the absolute base FR – we created a follow-up synthetic dataset in which a directly manipulated

_{osc}*p*factor (levels = 15 or 20 Hz) was crossed with an independently varied

_{base}*p*

_{base_offset}factor (levels = [2:2:8] Hz) and the same

*T*and

*m*parameter sets as were described above. Regression analyses of the

*D*

_{HR}estimates from these simulations (Table S3;

*D*

_{HR}values are plotted in Fig. S4) replicated the expected negative

*p*effect (

_{base_offset}*t*= −59.81,

*p*<< .001). This result supports the interpretation of the margin between the firing rate and to-be-detected oscillation frequency as independently influential on the residuals method’s enhanced sensitivity (beyond the effect of the absolute

*p*).

_{base}No specific predictions were made regarding the two methods’ relative false alarm rates, and for many of the parameter combinations represented in Fig. S3 (and Fig. S5 for the dataset with the direct *p _{base}* manipulation), the D

_{FA}values clustered near zero. Nevertheless, the very high

*n*of subsamples considered resulted in high sensitivity to parametric effects, and significant main effects and interactions were found (Table S2; Table S4); however, their modest scale leaves uncertain the practical implications of these trends. As discussed below, more noticeable trends were found upon introducing higher FRs to the simulations.

##### Oscillation Detection Performance: High Firing Rates

The residuals method was motivated by a problem presented by sparse spike trains, but we might also ask how effectively it performs with more vigorously spiking units. To this end, we created a secondary dataset that differed from the primary dataset in its use of higher *p _{base_offset}* values (Fig. S6-S8).

Three patterns stand out in the results. First, as anticipated, high FRs reduced the residuals method’s overall performance advantage: pAUC_{res}-pAUC_{shuf} remained reliably positive (Fig. S6D), but with only a narrow margin of performance improvement. Second, as also anticipated, this narrowed performance gap was mirrored by a similar narrowing in the residuals’ sensitivity advantage. D_{HR} values remained mostly positive but very modest (Fig. S7; Table S5).

Third, we observed a complex interaction pattern in the methods’ relative false alarm rates. Complete illustrations and statistical results are available in Fig. S8-S11 and Table S6. Here, we will highlight four principal observations (all based on quantities summarized over the 1000 subsamples; see Materials and Methods). (1) As duration *T* increased, and especially as the modulation strength *m* approached 100%, D_{FA} scores grew either very positive (for low oscillation frequencies) or negative (for high frequencies; Fig. S8). In other words, in the high *T*, high *m* regime, the residuals method switched from producing either more or fewer FAs than shuffling did, as *f _{osc}* increased (

*T x f*interaction:

_{osc}x m*t*= - 178.70,

*p*<< .001). (2) Although

*f*is correlated with the

_{osc}*p*parameter, the sign flip in D

_{base}_{FA}scores cannot be easily attributed to increasing FRs. In the

*T*= 120 x 1024 ms,

*m*= 100% cases, the effect of

*p*

_{base_offset}on D

_{FA}negatively interacted with

*f*(

_{osc}*t*

_{f_osc x}_{pbase_offset}= −68.23;

*p*<< .001), reflecting a tendency for the sign of the FR offset effect to likewise flip. (3) Plots of the average absolute FAs for the lowest

*f*setting (7 Hz, Fig. S9) show that in the high

_{osc}*T*, high

*m*cases, the shuffling FA rates remained mostly low (mean over all subsample

*x p*combinations = 10.16%) in contrast to the elevated residuals FA rates (overall mean = 47.55%). For the highest

_{base}*f*setting (32 Hz, Fig. S10), the high

_{osc}*T*, high

*m*cases resulted in very elevated FA rates for shuffling (overall mean = 76.09%) and less dramatically elevated FA rates for the residuals method (overall mean = 31.73%). (4) Summarizing over all

*T*= 120 x 1024 ms,

*m*= 1.0 cases, both methods’ false detections were typically characterized by very low peak power, relative to that observed for hit points (mean power

_{FA}/power

_{Hit}= 0.15 for residuals and 0.22 for shuffling) and high, gamma range frequencies (means = 83.10 Hz for residuals and 57.86 Hz for shuffling). Fig. S11 presents representative PSDs for each method, in addition to samples of the associated synthetic spike trains. Although an ideal solution would prevent the false detections, their consistent features could aid in the development of post-hoc rules for flagging suspected FA points in either method’s corrected PSDs.

##### Evaluation of Recovery Period Duration Estimation

Overall, the preceding analyses indicate that the residuals method achieves its primary objective of enhancing oscillation detection in sparse spiking scenarios, while maintaining performance comparable to that of the shuffling method when spiking is more vigorous (excluding the limited high-FA cases described above). Correspondingly, the first step of the method – the estimation of the RP duration – was also reasonably accurate. In the primary synthetic dataset, for the RP estimates collapsed across the full set of 540 x 100 spike trains (Fig. 4C), 51.07% identified the 9 ms *n _{r}* setting precisely, 85.13% erred by ≤ 1 ms, and 94.78% erred by ≤ 2 ms. The follow-up, high FR dataset also demonstrated generally low error rates (Fig. S6C: % of absolute errors, |

*err*|, ≤ 0, 1, or 2 ms: 66.22%, 93.53%, 99.87%).

To determine whether this accurate performance might generalize to other RP durations and forms, we generated three additional secondary datasets, with the original RP parameters modified to produce either a long relative RP (*n _{r}* = 18 ms, steepness parameter

*k*= 0.7), a short, sharply-rising relative RP (

*n*= 4 ms,

_{r}*k*= 0.4), or a short absolute RP (

*n*= 3 ms,

_{r}*k*= 0). Estimates clustered closely around the ground truth

*n*for both the 3 ms absolute RP (Fig. S12C: % |

_{r}*err*| ≤ 0, 1, or 2 ms: 87.28%, 96.62%, 98.90%) and the 4 ms relative RP (Fig. S13C: % |

*err*| ≤ 0, 1, or 2 ms: 84.03%, 96.58%, 99.01%). When the relative RP with the default

*k*= 0.7 steepness is doubled to 18 ms, absolute error rates are comparable to those for the original 9 ms duration if the deviations under consideration are doubled accordingly (Fig. S14C: % |

*err*| ≤ 0, 2, or 4 ms: 44.79%, 86.73%, 92.72%).

These datasets also afford the opportunity to ask whether the residuals method retains its performance advantage when confronted with variable RP features. As reported above for the default RP datasets, panels A, B, and D of Fig. S12-S14 show, for each RP setting and correction method, highlighted HR and FA outcomes for select parameter combinations, and the overall pROC and pAUC output. For all three RP configurations, residuals remains the better-performing oscillation detector (*t*- tests of pAUC_{res}-pAUC_{shuf} : *t*s(999) > 1.0608e+03, *p*s << .001).

### Experimental Data Results

#### Motivation and Approach

Further comparison of the residuals and shuffling methods, conducted on real, biological single-unit data, served two purposes. First, any qualitative discrepancies between the neuronal and synthetic spectra provide useful information that can help inform the development of more accurate simulation frameworks in the future. Second, and most importantly for the present purposes, these evaluations can help experimentalists anticipate what might be encountered when applying either PSD correction method to their own data, and determine what further development, if any, might be needed to tailor the methods to the specific needs of real-world research questions.

In empirically-observed, neuronal spike trains, the ground truth is unknowable. That said, we prioritized sampling neural data acquired during circumstances for which the prior expectation of oscillations is high, based on evidence from sources that are distinct from spike trains. Towards that end, we evaluated the two correction methods’ output on single-unit data recorded from the internal globus pallidus (GPi) and GPi-targeted ventrolateral anterior thalamus (VLa) of a monkey with MPTP-induced parkinsonism, during the pre-reach delay (i.e., “hold”) periods of a cued reaching task. Across multiple species and motor-related brain regions, pathologically exaggerated oscillations in neural activity are a hallmark of parkinsonism, especially within the beta (12-30 Hz) and neighboring alpha (8-12 Hz) frequency bands (see [28, 29] for recent general reviews, and [41, 42] for examples specific to the thalamus and GPi, respectively). Moreover, alpha-beta oscillations are especially pronounced during periods of postural maintenance, such as the hold periods under consideration here [39]. Since the bulk of the evidence concerning these oscillations has been obtained from recordings of local field potentials (LFPs), our expectation of their occurrence in spike trains is based on a source that, while ultimately derived from the same underlying electrophysiological signal, is not affected by RP-related (and general point process-related) challenges to oscillation measurement.

The shuffling and residuals methods’ output were compared for 92 GPi units (analyzed recording duration min, median, max = 30, 80.5, 262 s) and 78 VLa units (31, 74.5, 191 s). As anticipated based on previous reports [30], firing rates for the analyzed rest periods were generally slower for the VLa than for the GPi (*p* << .001 by the Wilcoxon rank sum test; min, median, max FR for VLa = 3.09, 12.24, 41.02 Hz; for GPi = 3.13, 39.90, 114.96 Hz). Our simulation results allow us to predict the likely sensitivity of the two methods for detecting oscillations in data with durations and firing rates typical of this empirical dataset. The synthetic data hit rate results for a representative low beta oscillatory frequency (12 Hz), low base firing rate (13 Hz), and short spike train duration (30 x 1024 ms; Fig. S15) predict that, for an oscillation modulation strength of at least 60%, the residuals method at minimum will detect the oscillation > 65% of the time.

Application of the two methods to the empirical dataset revealed considerable variability in the resulting corrected PSDs, but some recurring patterns do stand out. Here, we will highlight two of these patterns with four representative units (Fig. 5, Fig S16). The full population results are summarized in Fig. 6. To highlight the (0,100] Hz region over which we sought significant power, all neural spectral plots are confined to the initial 100 Hz; however, note that full spectra may be generated using code from the Github repository that accompanies this report.

#### Single Unit Examples

Fig. 5 presents two units that illustrate the first recurring pattern, which is characterized by its consistency with what the simulations would predict: beta (or alpha) peaks that were recovered by the residuals method exclusively, from sparse spike trains that share qualitative features with our synthetic data. Fig. 5A shows such an example from the VLa (mean FR = 11.90 Hz, analyzed duration = 63 s). The uncorrected PSD matches the canonical RP distortion form, with a candidate beta oscillation (peak = 13.67 Hz) sitting atop the depressed low-frequency region. The oscillation is flagged as significant in the residuals PSD, but not the uncorrected or shuffling PSD. In this case, the behavior of the shuffling method is similar to that illustrated in the right-most column of Fig. 2A: The shuffling approximation of the renewal equivalent PSD captures not only the global distortion-consistent trend, but also much of the candidate beta peak, resulting in partial removal of this feature from the shuffling-corrected PSD. The 11.90 Hz FR indicates that, when considered relative to the candidate beta oscillation’s frequency, a spike occurred on average just .87 times per cycle. Fig. 5B shows a second example of this pattern, in an unusually slowly-spiking GPi unit (mean FR = 7.65 Hz, analyzed duration = 96 s, peak of alpha-beta oscillation = 15.62 Hz).

Fig. S16 presents two VLa units that illustrate the second pattern, broadly categorized as suspected alpha-beta oscillations accompanied by temporal structure that was not incorporated into our simulations. Such additional structure included both burst firing (that is, transient intervals of elevated spike rates, see both panels A and B) and prominent 1/f-like trends (panel B). The two correction methods’ behavior could differ considerably in these scenarios, with variable outcomes for their relative sensitivity to likely oscillations.

In the first VLa example (Fig. S16A), a likely low beta oscillation (peak alpha-beta frequency = 12.70 Hz) appears in tandem with a sporadic tendency for the unit to fire in a highly stereotyped pattern of two-spike bursts. The joint presence of these two patterns is most apparent in the autocorrelation plot (top left panel). Key features of the intra- and extra-burst spiking can be discerned from the ISI PDF (Fig. S17A) and joint [ISI_{n},ISI_{n+1}] PDF (Fig. S17C, [43]): ISIs of 2-3 ms occur exclusively between pairs of spikes that are bracketed by much longer ISIs (≥ 14 ms), and all remaining spikes are separated by ISIs that last at least 7 ms (and often much longer). As such, the ISI statistics for this unit are consistent with a description of the spike train as switching between two different states, for which the effective recovery period varies from very short (intra-burst) to much longer (extra-burst).

This two-state characterization lends insight into the PSD outcomes. In the uncorrected spectrum, the beta peak is sunken into a typical distortion-related trough, and the shuffling method achieves satisfactory flattening of the spectrum and raising of the peak. In contrast, the residuals method performs very little distortion correction, due to the 1 ms estimated RP duration (see Fig. 17A-B, *left*) and resulting 1 ms bound on the last-spike effects estimated by the PPM. Such a model should adequately account for RP effects within the two-spike bursts: On simulated, non-bursty spike trains with a 1 ms recovery period and parameters that otherwise roughly matched those for this VLa unit (see Materials and Methods), the residuals method exhibited accurate oscillation detection (for *f _{osc}* = 12 Hz;

*m*= 0.6: hit rate = 100%, FA rate = 7%). The apparent under-correction in this VLa example is likely due to the insufficient removal of the long RP effects that occur outside of the bursts. In the Discussion we propose potential extensions upon the residuals method that may address the need to accommodate burst and non-burst states with differing RP dynamics.

In the second VLa example (Fig. S16B), a likely low beta oscillation (peak alpha-beta frequency = 13.67 Hz) appears alongside a somewhat different pattern of spiking, relative to the first example. The ISIs are not split across disconnected zones of the histograms (Fig. 17A, *right*), but the presence of bursting episodes is suggested by the high density of very short ISI incidence (28.08% of all ISIs ∊ [2,3] ms, compared to a mean of 4.83% for matched, non-bursty synthetic units; see Materials and Methods) and sequential co-occurrence (6.02% of all ISI_{n}, ISI_{n+1} pairs ∊ [2,3] ms, compared to a mean of 0.25% for the matched synthetic units). Therefore, it is conceivable that this unit’s spiking behavior might also be appropriately described as switching between states of higher FRs (intra-burst) and lower FRs (extra-burst). If so, whether the RP would also be best characterized as varying between the two states is unclear.

These spike pattern observations may again aid interpretation of the PSD results, to some extent. In the uncorrected spectrum, the beta peak sits atop a prominent 1/f-like trend, to which aperiodic burst-associated FR fluctuations could be a contributor [44]. The 1/f-like feature stands out more strongly than any RP distortion that may be present, consistent with the possibility that a very short RP characterizes much of the spike train. The shuffling method does remove the 1/f-like curve, but also removes the beta peak. The residuals method again performs little correction – owing to the 1 ms estimated RP – and therefore retains both the beta peak and 1/f-like trend (with many of the elevated low-frequency points of the PSD also labeled as significant). As alluded to in the Introduction, such 1/f-like features should ideally be modeled and considered an additional component of the baseline against which the statistical significance of candidate oscillatory peaks is determined. Possible strategies for extracting the 1/f-like component – which include both the aforementioned extensions to the residuals method to model bursts, and post-hoc function fits to the corrected PSD – will also be considered further in the Discussion.

#### GPi and VLa Population Results

Fig. 6 illustrates results from the two correction methods for the whole neuronal dataset. The three columns show simplified versions of the spectra (again zoomed in to 0-100 Hz on the *x* axis), in which any points that are both significant and labeled as a local spectral peak are labeled in magenta, and all other significant points are marked as blue. Spectra are plotted according to the order of the neuron’s observed firing rate, with the lowest-FR units at the top.

The population-level patterns mirror those that we highlight with the individual unit examples. First, we found trends consistent with those that our simulations would predict, with respect to overall incidence of alpha-beta oscillation detections, and the variation of these detections with firing rates. Collapsing across the GPi and VLa units, the residuals method reported that a greater proportion of the total units exhibited significant alpha-beta power, compared to the shuffling method (66/170 versus 37/170 units, McNemar’s *χ*^{2} = 19.12, *p* < 1.3e-5). Moreover, for those units for which alpha-beta band oscillations were reported by at least one method, lower firing rates were predictive of an increased likelihood of residuals-only detection (as compared to shuffling-only or joint method detection; *β*_{FR} = - 0.089, p < 1.5e-5). Therefore, to the extent that the exclusively residuals-identified power did indeed correspond to true underlying alpha-beta oscillations, the method did fulfill the expectation of enhanced sensitivity, especially in the presence of low firing rates. Note also that the residuals method did identify alpha-beta oscillations over units with a variety of estimated recovery periods (median RP = 3.5 ms, iqr = 6 ms), consistent with the high hit rates seen across the varied RP parameters in the simulations (Fig. S12-S14).

At the population level, the second pattern – non-oscillatory contributors to spike timing – is most visible as a regular occurrence of 1/f-like effects. In Fig. 6, PSDs with strong 1/f-like trends tend to appear as significant points in the lowest frequency bands, starting with the (0,4] Hz delta band, and at times extending well beyond it. Using a simple rule to label likely 1/f-like trends (≥ 2 consecutive significant points within (0,4] Hz; see Materials and Methods), we identified a greater incidence of these trends in the residuals-versus shuffling-corrected PSDs (85/170 versus 25/170 units, McNemar’s *χ*^{2} = 54.40, *p* < 1.7e-13). This result is consistent with the shuffling method’s aggressive removal of the 1/f-like trend in Fig. S16B. Also consistent with the Fig. S16B unit, we can identify additional cases in Fig. 6 for which the residuals method retains a 1/f-like trend that extends into a series of significant alpha-beta points. Therefore, with respect to the above observations of residuals-only alpha-beta detections, an open question does remain regarding the extent to which putative periodic power in these bands would remain significant after removal of the 1/f-like, presumably aperiodic spectral component.

Again consistent with the Fig. S16B example, we did observe additional units for which probable bursting may have contributed to 1/f-like trends. For example, for four additional VLa units that met the above delta band rule, > 25% of the ISIs fell within 2-3 ms (range: 35.76% - 42.91%), in spite of overall mean FRs of 5.35 Hz - 8.26 Hz.

## Discussion

### Discussion Overview

To recap, our principal objective was to evaluate a novel approach for removal of the recovery period’s distortion of spike train power spectra, with the ultimate aim of improving the accuracy of oscillation detection. An established ISI shuffling method [7] approaches this problem by extracting the spectral component explained by the spike train’s ISI distribution alone. Because the ISI distribution contains information about not only the recovery period, but also oscillations, the shuffling method may hinder detection of relatively subtle oscillatory spiking, especially in slowly spiking units. As an alternative, we developed a regression-based “residuals” method that also models and removes temporal structure related to the ISIs, but only those ISIs that fall within a historical bound determined by the estimated duration of the recovery period (RP) function. We compared residuals-corrected versus shuffling-corrected PSDs to determine whether the former enhanced sensitivity to oscillations, with this effect predicted to be greatest for low FR spike trains.

The following sections summarize the predicted and less anticipated aspects of our findings, propose future directions for method development, and situate the residuals method in the context of the broader set of tools available for rhythmic spiking analysis.

### Predicted Findings

The two methods were initially evaluated on synthetic spike trains, generated with varying settings for firing rate, oscillatory frequency and modulation strength, and recording duration, and with the RP represented as an exponentially-recovering modulation of spike probability. Collapsing over these variable settings, we found that residuals correction enabled more accurate classification of true versus false positive points in the power spectra. More detailed examination of the varied parameters, and of their impact on the methods’ relative hit and false alarm rates, revealed three general trends. First, and as anticipated, residuals correction primarily benefited from enhanced sensitivity to the ground truth oscillations, especially when firing rates were low, relative to the oscillation frequency. Second, two other predictors of strong relative residuals performance (moderate oscillation modulation (*m*); short spike train duration (*T*)) suggest that this method may be most useful for identifying subtle oscillatory effects that reside close to a threshold level of detectability. Third, the largely sensitivity-driven benefit was gained, for most parameter settings, without exaggeration of false positive oscillation detections. On a limited subset of high FR, strongly oscillating spike trains, both the residuals and shuffling methods did tend to produce PSDs with very low-power FA peaks; we propose strategies for addressing these FAs farther below.

The largely favorable results for the residuals method generalized across variable recovery period durations (3, 4, 9, and 18 ms), and shapes (i.e., absolute or exponentially-rising, with variable steepness). This persistence of successful detection across RP characteristics was accompanied by relatively accurate estimation of the ground truth RP durations by the ISI distribution-fitting algorithm, with estimation errors rarely exceeding 1-2 ms.

In the real GPi and VLa data – keeping in mind the caveat concerning the unknowability of the ground truth – we did observe additional evidence consistent with our expectations. Focusing on the alpha-beta frequency range that was anticipated in the parkinsonian NHP under consideration, we found that the residuals method reported significant oscillations for a greater fraction of the units than the shuffling method did. This result was anticipated on the basis of the prevalence of low-to-moderate FRs in this dataset (especially amongst the VLa units), and follow-up tests confirmed the relevance of low spike rates for predicting residuals-only oscillation detection. Although we cannot presently determine whether the residuals-flagged oscillations are “real”, it is conceivable that future empirical work could aid in providing corroborating evidence. For example, confidence that an oscillatory drive governs a unit’s spiking could be bolstered by the discovery of matching oscillatory trends in the unit’s synaptic input dynamics (e.g., as tracked by high resolution biosensors of the relevant neurotransmitters [45]). One might also anticipate that any significant spike oscillations may correspond to similar oscillations in a unit’s membrane voltage. Consistent with this expectation, a recent study in mice [46] observed that striatal cholinergic interneurons that exhibited robust delta rhythms in their spiking (as estimated in the time domain with an ISI-based heuristic) also showed strong delta rhythms in subthreshold membrane voltage (as estimated with wavelets).

A subset of the GPi and VLa spike trains exhibited aperiodic structure that the current implementation of the residuals method was not designed to address. Consequently, the corresponding residuals PSDs should be interpreted with care. Below, we discuss potential strategies for addressing these additional patterns that may arise in empirical spike trains.

### Areas for Future Development

In our synthetic and empirical spike trains, three special cases gave rise to unanticipated outcomes in the corrected PSDs. Below, we briefly review these cases, and the attendant directions for future development.

#### False alarms with fast spiking and strong modulation

When operating on synthetic inputs generated with high base FRs (*p _{base}*) and strong oscillatory modulation (

*m*), both shuffling and residuals correction produced spectra with elevated false alarm rates (Fig. S8). This trend worsened with increasing spike train duration (

*T*), and with variation in the “true” oscillation frequency (

*f*). The nature of the

_{osc}*f*effect depended on the method in question. For the lowest frequencies (e.g., 7 Hz; Fig. S9), residuals correction commonly yielded false positives, but the shuffling FA rate remained mostly low. As

_{osc}*f*approached the highest tested settings (e.g., 32 Hz; Fig. S10), shuffling FAs become extremely frequent, while the residuals FA rate moderated somewhat, such that the relative specificity of the two methods flipped. Under the FA-promoting conditions, both methods reported false peaks that tended to fall within the gamma band, and were characterized by very low power that barely exceeded the statistical significance threshold (Fig. S11A). The distinctive features of these false alarms suggest that a simple PSD postprocessing step (e.g., imposition of a more stringent significance and/or effect size threshold) may help minimize their appearance in future analyses.

_{osc}Future development of the residuals method should seek to prevent these elevated false alarms. In the fast spiking, strong modulation cases, a key challenge concerns the heightened correlation between the effect that we are attempting to model (the recovery period) and the unmodeled oscillation effect. This correlation engenders an “omitted variable bias” [47] in RP function estimation. Visualization of representative spike trains (Fig. S11B) illustrates the core difficulty: For a strongly oscillating spike train, the spikes and the recovery period regressors will principally cluster under the positive phases of the oscillation cycle. Consequently, the RP regressors are associated not only with the recent occurrence of single spikes, but also with an elevation of the background firing rate above its overall mean. The result is an underestimation of the RP’s suppressive effect, which in turn leads to incomplete removal of the RP distortion effect from the corrected PSD.

Stevenson [47] noted that such biases might be reduced through introduction of variables that may capture suspected unmodeled effects (here, oscillations of unknown frequency), but also acknowledged that such strategies can be assumption-intensive and nontrivial. Such variables are present in those PPMs that estimate spike rhythms entirely in the time domain (e.g., [24–26]). We recap the pros and cons of these time domain methods in the concluding section below (“Conclusions: Tools for Spike Oscillation Analysis.”).

#### Burst firing impacts on the corrected spectra

Spike “bursts” consist of transient intervals of rapid spiking, occurring against a background of a slower-rate steady state [48–50]. In the empirical data, we observed multiple units that displayed likely bursting, as inferred from their ISI statistics (e.g., as in Fig. S16-S17). Previous investigations have reported bursting in the parkinsonian GPi and VLa (see [51] for a review).

Burst firing can introduce two variations in the spike train that may influence the form of the residuals-corrected PSD. First, by definition, bursting creates increases in the background FR, which may occur aperiodically. Our current residuals implementation does not remove such variation, which may in turn contribute to 1/f-like trends in the resulting PSD (consistent with Fig. S17B; see also [44]). Second, bursting could alter the recovery period, as suggested by the Fig. S17A VLa example. Our current model cannot accommodate distinct burst and non-burst recovery periods, thereby preventing the accurate removal of all RP-associated variance. In the VLa example, the model estimated and removed the short, within-burst RP, resulting in insufficient correction for the long, between-bursts RP.

Future development could conceivably expand the current residuals method to capture burst- and non-burst states. Previous work has proposed discrete state-space models designed to infer burst states in point process time series [52, 53] and could inform such an approach. Armed with an algorithm that labels spike train timepoints with their likelihood of pertaining to burst or non-burst intervals, one could attempt to model the two states’ base FRs and RPs with separate sets of delta functions, with these functions weighted by the inferred probabilities that either state is active at each timepoint (i.e., , where *k* indexes over the states, and the *α _{k}* (

*t*) denote the complementary state probabilities). Note that adoption of this strategy would require

*a priori*estimates of the two states’ RP durations ((

*k*)), and therefore the development of a dual-RP estimation algorithm.

#### Aperiodic spectral components

Several GPi and VLa units exhibited prominent 1/f-like trends in both the uncorrected and residuals spectra. “1/f-like” refers to spectral components that are well-approximated by a 1/f^{χ} function, to which researchers commonly add a “knee” parameter to accommodate a potential bend in the curve [54]. Strong 1/f-like effects are ubiquitous in neural activity spectra (and especially EEG/LFP PSDs) and represent the frequency domain manifestation of aperiodic trends. Removal of these trends will be necessary for rigorous interpretation of the suprathreshold points (e.g., the green points in Fig. S16B) in the residuals-corrected PSDs.

Two general strategies may help achieve this removal. First, as noted above, aperiodic burst spiking may contribute to the 1/f-like component [44]. Therefore, effective removal of burst-related variance at the regression stage could ultimately attenuate some 1/f-like effects. However, bursts represent just one of the many diverse underlying dynamics that may give rise to 1/f-like trends [55]. Therefore, complementary procedures will still be needed to remove any remaining aperiodic effects.

The second strategy entails the standard postprocessing step of removing a fitted 1/f^{χ} function (with a possible knee component) from the PSD. This step would assume that residuals correction has removed sufficient RP-related distortion from the spectrum. The routine introduced by [3] offers one option for achieving a robust curve fit to the PSD. This method additionally fits a series of Gaussians to any narrowband spectral peaks, and those Gaussians that sufficiently exceed the aperiodic baseline are inferred to represent likely oscillations.

### Adaptation to Time-Frequency Analysis

In its present form, the residuals method assumes the presence of stable oscillatory components. Adaptation of the method to the analysis of time-varying oscillations may be feasible, depending on two factors: the unit’s spike rate, and the stability of its RP function. Vigorous spiking and a stable RP present the optimal scenario. In this case, one could reasonably apply the PPM strategy that we described here, and submit the residuals time series to the preferred time-frequency analysis method (e.g., short-time Fourier Transform (STFT), Wavelet Transform, or bandpass filtering followed by Hilbert Transform).

If spiking is slow, then insufficient data may be available to detect transient oscillations, even with adequate RP correction. Simulations may aid in estimating the minimal spike rate required for sensitivity to any hypothesized effects.

If the RP function is likely to vary over time, the pattern of the suspected variation should inform the adaptation approach. If one assumes that the RP varies across transitions between latent discrete states, but remains stable within states, then it may be reasonable to first apply a multi-state PPM (as we have described for bursting) and then submit the residuals to time-frequency analysis. Alternatively, if one assumes both fast spiking and RP stability within each segment of an STFT analysis, one might conceivably execute the full residuals routine (RP estimation/PPM/PSD) on each segment independently. More challenging data features (e.g., complicated RP variation patterns) will require the development of methods that go beyond the basic approaches that we have laid out here.

If time-frequency analyses are attempted, appropriate postprocessing should be applied to extract aperiodic trends. For example, Wilson et al. [56] recently proposed a time-resolved method for aperiodic and periodic trend decomposition, which extends upon the stationarity-assuming method of Donoghue et al. [3].

### Conclusions: Tools for Spike Oscillation Analysis

Throughout this report, we have referenced a set of available tools that both estimate spike oscillations and control for recovery period effects. Here, we briefly summarize the major features of each method, with an eye to the tradeoffs to consider when deciding which approach to adopt for a given project. We divide these tools into those that operate entirely in the time domain, and those that return outputs in the frequency domain.

#### Time domain methods

We further split the time domain methods into two subcategories. The first subcategory encompasses all Poisson GLM-based PPMs that may capture both recovery period and oscillation effects (e.g., [25, 26]). As described in the Materials and Methods (“Clarifications and Relationship to Pre-existing PPM Approaches”), these PPMs typically model spike history effects with a series of indicator functions of varying width. A set of narrow, short-lag regressors estimate recovery period and bursting effects, and wider, longer-lag regressors are positioned to estimate periodic effects. For example, spike counts summarized over the −50 to −40 ms bin may be used to estimate 20-25 Hz rhythms [25]. Note that the fundamental differences between such methods and the current residuals implementation are relatively modest. The principal distinctions concern the choice of indicator regressors to include, and the estimation of oscillatory trends through either strategically spaced regressors or export of the residuals to spectral analysis.

Depending on a researcher’s goals, the replacement of the PSD with binned history effects may stand out as a limitation of these alternative GLM methods. The bin-based oscillation estimates sacrifice the high frequency resolution of a standard DFT-produced spectrum, and represent a departure from the typical basis set of sinusoids. These characteristics may complicate projects that aim to draw comparisons with previous spectral data, or derive precise oscillation frequency information. However, when these limitations are acceptable, these GLM methods can offer some advantages. For example, by incorporating oscillation terms alongside the short-lag regressors, these models may mitigate the omitted variable bias in RP estimation.

The second subcategory consists of the Latent Oscillatory Spike Train (LOST) model of [24]. LOST is a distinct, Bayesian PPM, which shares some key features with the regression-based models, but also differs in significant ways. The shared features include short-lag history terms (implemented through splines) and other covariate regressors (e.g., related to task events) that could be easily incorporated into a standard GLM. The core difference entails the modeling of oscillations. LOST represents underlying oscillatory drives as continuous latent states, which are constrained by priors, but may vary over time with respect to their precise modulation strengths and center frequencies. These latent oscillatory components in turn shape the ongoing spike probability. This approach therefore offers a number of benefits, especially for those users who seek a principled accounting for modest fluctuations in the properties of the rhythmic drive. Moreover, as with the above-mentioned GLM methods, LOST’s simultaneous estimation of short-lag and oscillatory trends may reduce any omitted variable bias effects. Note that LOST does constrain the number of distinct oscillatory effects that the user can simultaneously estimate, and may be best suited for small datasets (due to both runtime demands and manual fine-tuning requirements).

#### Frequency domain methods

The shuffling and residuals methods form the frequency domain category. These methods offer the advantage of a corrected power spectrum estimate.

We reiterate that the shuffling method remains a reasonable approach for RP distortion correction in moderate-to-high-FR spike trains. In addition, shuffling appeared to handle some instances of aperiodic structure (e.g., 1/f-like trends, and the Fig. S16A bursting example) more effectively than the current residuals implementation did. The principal tradeoff for these strengths is reduced sensitivity to oscillations in sparse spike trains. Note also that the shuffling method’s run time may grow very long as the spike train duration and count of shuffling iterations increases, and that this method shares the residuals method’s vulnerability to biases introduced by strong oscillations (as indicated by Fig. S10-S11).

The residuals method is an especially useful option when a power spectrum is required, spike rates are low, and when other factors, such as modest modulation strength and short recording duration, challenge oscillation detection. In the implementation we present here, the method draws upon an accessible and flexible GLM framework. Moreover, the generation of a corrected time series output – and not only corrected oscillation estimates – expands the set of analyses that the underlying framework might ultimately support, including common time-frequency analyses.

## Supporting Information

### Supporting Figures

### Supporting Tables

## Data Availability Statement

Both the experimentally-acquired single-unit data and the analyzed simulation outputs are available on Zenodo at https://doi.org/10.5281/zenodo.8313070. Code sufficient to reproduce all figures and tables, and the simulations and analyses that informed them, is available on Github: https://github.com/kc13/residuals_spectral_analysis. The most recent Github release (1.0.0) has been issued the DOI 10.5281/zenodo.8323322 by Zenodo. The Github repository also stores copies of the single-unit data, and synthetic datasets that are represented in a postprocessed state.

## Financial Disclosure Statement

This work was supported by the National Institute of Neurological Disorders and Stroke at the National Institutes of Health (grant number R01NS117058-01 to RST; https://www.ninds.nih.gov/). Additionally, this research was funded in part by Aligning Science Across Parkinson’s through the Michael J. Fox Foundation for Parkinson’s Research (ASAP-020519 to RST; https://parkinsonsroadmap.org/, https://www.michaeljfox.org/). For the purpose of open access, the authors have applied a Creative Commons Attribution 4.0 International (CC BY) public copyright license to all Author Accepted Manuscripts arising from this submission. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

## Acknowledgments

We thank Lisa Nieman-Vento for her contributions to animal care.