Fluctuation-driven plasticity allows for flexible rewiring of neuronal assemblies

Synaptic connections in neuronal circuits are modulated by pre- and post-synaptic spiking activity. Heuristic models of this process of synaptic plasticity can provide excellent fits to results from in-vitro experiments in which pre- and post-synaptic spiking is varied in a controlled fashion. However, the plasticity rules inferred from fitting such data are inevitably unstable, in that given constant pre- and post-synaptic activity the synapse will either fully potentiate or depress. This instability can be held in check by adding additional mechanisms, such as homeostasis. Here we consider an alternative scenario in which the plasticity rule itself is stable. When this is the case, net potentiation or depression only occur when pre- and post-synaptic activity vary in time, e.g. when driven by time-varying inputs. We study how the features of such inputs shape the recurrent synaptic connections in models of neuronal circuits. In the case of oscillatory inputs, the resulting structure is strongly affected by the phase relationship between drive to different neurons. In large networks, distributed phases tend to lead to hierarchical clustering. Our results may be of relevance for understanding the effect of sensory-driven inputs, which are by nature time-varying, on synaptic plasticity, and hence on learning and memory.


I. INTRODUCTION
Learning occurs through changes in the synaptic weights between cells in neuronal circuits. Understanding learning therefore requires working out the rules by which these changes happen at a single synapse, and then studying the consequence of this process at the network level. One major finding regarding the rules underlying synaptic plasticity was the observation that the synaptic weight change could depend on the relative timing of pre-and post-synaptic spikes [1,18]. Theoretical work has studied how such spike-timing dependent plasticity (STDP) can shape recurrent connections in neuronal networks [3,9] and serve to encode fixed point attractors in large-scale spiking models [30]. However, like all Hebbian plasticity rules, STDP is intrinsically unstable, and leads either to full saturation of all synaptic connections, or complete depression, depending on the sign of the integral of the plasticity window. Essentially, plasticity always occurs if the product of pre-and post-synaptic rates is non-zero, even if constant. Therefore, additional stabilizing mechanisms are required in order to allow for the emergence of non-trivial connectivity patterns [12,24,29,30]. A multiplicative STDP rule, for which potentiation is progressively weaker the stronger the synapse, stabilizes weights but does not readily allow for the emergence of non-trivial network structure [20,23,26].
Here we consider the scenario in which plasticity only occurs in the presence of timevarying rates. This mode of plasticity seems particularly relevant for learning given that most salient events unfold over time and would be expected to result in time-varying firing rates in the relevant brain circuits. For simplicity we explore this plasticity regime by assuming that the integral of the STDP window is exactly zero, i.e. that there is a balance between potentiation and depression when firing rates are constant. This obviates the need for added stabilizing mechanisms and allows for an in-depth analysis.
We consider both oscillatory as well as noisy drive and develop a theoretical framework for pairs of linear firing rate neurons. Specifically, we make use of the separation of timescales between neuronal and synaptic dynamics to derive self-consistent evolution equations for the synaptic weights. Analysis of these equations reveals a rich phase diagram from which the resulting connectivity motif can be predicted depending on the phase difference in the case of oscillatory drive, or the correlation and delay in the case of noisy drive. We also find many regions of multistability, meaning that the final connectivity motif will also depend on the initial configuration of weights. For the case of oscillatory drive we study the effect of the balanced STDP rule in networks numerically, and show that the resulting connectivity matrix can be well predicted from the pairwise theory in several relevant cases.

II. GENERAL TWO-NEURON MODEL
A. Firing rate formulation for two neurons with forcing We begin by considering the simplest possible scenario of two coupled neurons. We model the neuronal activity with a linear firing rate equation, which will allow for a complete analysis. The equations are τṙ 1 = −r 1 + w 12 r 2 + I 1 (t), τṙ 2 = −r 2 + w 21 r 1 + I 2 (t), (1) where w 12 and w 21 are the recurrent synaptic weights. The external inputs vary in time; we will consider both oscillatory inputs as well as correlated noise sources in subsequent sections.
Mathematical assumption: Rate dynamics Neurons are excitable units which can respond in an all-or-none manner to inputs. Specifically, if the neuronal membrane potential exceeds a threshold, an action potential or "spike" is generated, which propagates down the axon and causes the release of neurotransmitter onto postsynaptic targets. These excitable dynamics can be modelled as a system of coupled nonlinear ordinary differential equations (ODEs) [13], or in a simplified manner via one-dimensional ODEs with discontinuous threshold and reset conditions [6,16]. However, sometimes it is not the details of the subthreshold activity of the cell which is of primary interest, and but rather the mean spike rate. The dynamics of this rate can be described in a so-called firing rate equation [28], which is generally heuristic in nature. Here we will explore how variations in the spike rate affect plasticity, and hence take this firing rate approach. As a further simplification we take the rate dynamics to be linear in Eqs.1.
Here we interpret the rates in Eqs.1 as the underlying probability for a Poisson spiking process, and then apply a so-called spike-timing dependent plasticity (STDP) rule [15].
That is, in a small time interval ∆t, the probability that neuron 1 generates a spike is just r 1 (t)∆t. Once the spikes have been generated, a presynaptic spike of neuron i followed by a postsynaptic spike of neuron j at a latency T leads to a potentiation of synapse w ji by an amount A + e −T /τ + and a depression of synapse w ij by an amount -A − e −T /τ − . Synapses are bounded below by zero and above by a maximum value w max . Whenever there is a new spike the synaptic weights are updated in this way for all past spike pairs, see [21] for an efficient numerical implementation. This rule, together with Eqs.1, provide a complete model which can be simulated numerically. Note that while we are formally making use of an STDP rule here, the exact spike timing plays no role. That is, plasticity is due only to dynamics in the rate. Such rates effects appear to dominate over contributions due to spike timing in models of STDP when realistic input patterns are considered. Specifically, experimental protocols have traditionally used highly regular, repeated pairings of pre-and post-synaptic activity, and the observed synaptic plasticity has therefore been natually attributed to the exact spike timing [1,18]. However, theoretical work has shown that when the inferred rules are used in the presence of more in-vivo like spike trains with a high level of irregularity, the resulting synaptic plasticty can be accounted for to a large extent just by variations in the underlying firing rate [11].
Mathematical assumption: Pairwise STDP Long-term changes in synaptic strength are due to a complex chain of biochemical processes which are set in motion by the in-flux of calcium at the synapse [5]. Indeed, heuristic models of synaptic plasticity based on the local calcium signal at the synapse can reproduce the diverse phenomenology of potentiation and depression oberserved in in-vitro experiments [10]. The calcium signal itself is determined by both pre-and post-synaptic spiking, which suggests a phenomenological model based only on the timing of the pre-and postsynaptic spikes may provide a good approximate description of the plasticity dynamics [1,17]. The advantage of such a description is that it requires no knowledge of the subthreshold state of the neurons. Such spike-timing dependent plasticity models have proven very successful at fitting in-vitro data [21]. Here we use an STDP rule in which changes at the synapse are determined solely by pre-post spike pairs. Our selfconsistent description of neuronal activity and synaptic plasticity is therefore entirely at the level of spiking activity, and hence ignores the details of the subthreshold state of the neurons and calcium levels at the synapse.
If the amplitude of potentiations and depressions is small compared to the maximum synaptic strength w max , then the evolution of the synaptic weights can be approximated by the following integrals [15] (2) The first integral includes the contribution of all spike pairs leading to depression of the synapse w ij . Specifically, given Poisson processes, the probability of the spike pair in which cell i has spiked in a small interval around time t and cell j has spiked previously at a latency T is just r i (t)r j (t − T ) (T > 0). The integral sums up spike pairs at all possible latencies up until the current time. The second integral is analogous, but for spike pairs which lead to potentiation of the synapse. When the neuronal dynamics is stationary, the integrals can be written in the compact forṁ where A(T ) = A + e −T /τ + for T > 0 and −A − e T /τ − for T < 0, see Appendix A for a detailed explanation. Eqs.3 together with the rate equations Eqs.1 consitute a self-consistent approximation to the full model, and which is ammenable to analysis.

B. Asymptotic approximation for slow plasticity
Eqs.1 and 3 cannot be solved for directly. The reason is the presence of quadratic nonlinearities of the firing rate variables in the integrals in Eqs.3. However, we can take advantage of the slowness of synaptic plasticity compared to the firng rate dynamics to derive an approximate system of equations which can be solved exactly. Specifically, we assume that the amplitude of potentiation and that of depression are small and formalize this by replacing the kernel A(T ) = ǫĀ(T ) in Eqs.3. We also define a new, slow time t s = ǫt and allow the rates and the synaptic weights to evolve both on a fast as well as on a slow timescale. We expand the rates and weights in orders of ǫ and find that the leading order solution, where (r 1 , r 2 , w 12 , w 21 ) = (r 0 1 , r 0 2 , w 0 12 , w 0 21 ) + O(ǫ), obeys the following coupled equations τ ∂ t r 0 1 = −r 0 1 + w 0 12 r 0 2 + I 1 (t), τ ∂ t r 0 2 = −r 0 2 + w 0 21 r 0 1 + I 2 (t), see Appendix A for details of the derivation. In Eqs.4 there is a formal separation of the timescale of evolution of the rates from that of the synaptic weights, which is much slower.
In fact, the synaptic weights are only a function of the slow-time and hence can be treated as constants in the first two equations, allowing one to solve for the rates using techniques from linear algebra. The integrals can then be formally evaluated, yielding self-consistent evolution equations for the synaptic weights alone. The integrals over the fast time are performed over an appropriate time window, e.g. over one period of oscillation for oscillatory drive.
Mathematical assumption: Separation of timescales Neuronal membrane time constants generally range from milliseconds to tens of milliseconds. On the other hand, the time course of plasticity as gleaned from in-vitro and in-vivo studies can be much slower.
Specifically, in in-vitro protocols for the induction of plasticity via STDP, repeated pairings are required in order to observe a change in the synaptic efficacy[1]. On the other hand, rapid plasticity can also be induced experimentally through burst protocols[2], and has been inferred from fast remapping of place cell activity in the hippocampus [22]. Certainly, the formation of episodic memory requires plasticity to be fast enough for one-shot learning.
Our assumption of a separation of timescales therefore means we are modelling slow changes in cell responses, perhaps related to perceptual learning.

III. OSCILLATORY DRIVE
We first consider the case of oscillatory drive with frequency ω and phase difference φ.
Specifically, we take I 1 (t) = Ie iωt+iφ 1 and I 2 (t) = Ie iωt+iφ 2 . The physiological firing rates are given by the real parts of r 1 and r 2 , which are then also used to calculate the weights self-consistently in Eqs.4, yieldinġ and we have left off the superscript 0 for simplicity. Physiological assumption: oscillatory drive Oscillations are ubiquitous in the brain, although it remains unknown for the most part what their functional role might be. It has been hypothesized that the phase relationship between neuronal ensembles oscillating at the same frequency may influence their communication [7]. Here we explore that possibility that this phase relationship may play a role in influencing the directionality and degree of synaptic plasticity.
An analysis of Eqs.5 reveals that there are no fixed point solutions for w 12 , w 21 ≥ 0.
However, by studying the sign of the right hand side in Eqs.5 at the boundaries of the allowable domains, we can find stable solutions. For example, the fully potentiated solution (bidirectional motif) is stable ifẇ 12 ,ẇ 21 > 0 for (w 12 , w 21 ) = (w max , w max ). This condition is clearly satisfied for φ = 0 as long asÃ + (ω) −Ã − (ω) > 0, which holds when potentiation dominates at short latencies. On the other hand, when the φ = π,ẇ 12 ,ẇ 21 < 0, meaning that there is a critical value of the phase for which the fully potentiated solution becomes unstable. In a plane of the phase versus the frequency of the forcing, there is therefore a curve below which the potentiated solution is stable, see the orange line in Fig.1. An analogous argument can be made for the fully depressed solution (unconnected motif), which in this case is stable above a critical curve, see the blue line in Fig.1. Finally, the solution for which one synapse is fully potentiated and the other fully depressed (unidirectional motif) is stable between the dashed, black lines in Fig.1. Note that there is a region of bistability between the unidirectional motif and the bidirectional (unconnected) motif, indicated by the orange (blue) hatching in Fig.1.
The phase planes shown in Fig.1 show a clear resonance for values of a critical forcing frequency, approximately around 5Hz in this case. Specifically, the fully potentiated and fully depressed states are stable over a much wider range of forcing phases in this regime.
Additionally, and as illustrated in Fig.2, the rate of change of the synaptic weights is also maximal around this value of the frequency, and decreases to zero for zero frequency and in the limit of high frequencies. Both effects can be understood as the interaction of the forcing frequency with the window of plasticity, i.e. there is a "best" frequency which maximizes the integral in Eq.3. Precisely this resonance mechanism has been invoked to explain the role of theta oscillations in driving plasticity in rodent hippocampus [25]. This optimal frequency can be found by taking the derivative of the growth rate as a function of the forcing frequency. Doing so in the case where w 12 = w 21 = 0 leads to the simple relation Numerical simulations of the full model agree well with the analysis of the reduced system.
Specifically, given a fixed forcing frequency, there is a range of phase-differences near the in-phase forcing which lead to both synapses potentiating (PP) (orange region in Fig.1).

A. Oscillation-driven plasticity in networks
In the previous section we derived evolution equations for the synaptic weights for a single pair of neurons. We showed that these equations can admit several different stable configurations of weights depending mainly on the phase difference of the forcing, while the The symbols indicate the parameter values used for the simulations in Fig.3. C. Phase plane for  Note that, even though the theory was developed for linear firing rate neurons, it nonetheless correctly predicts the final state of the weight matrix even for nonlinear rate neurons, at least in this simple case. In fact, for the simulations shown in Fig.4, the neuronal transfer function is taken to be a Heaviside function, see Appendix B for details. Given this choice, the clusters, once formed, can exhibit bistability, see Fig.4C.
The applicability of the pairwise theory to networks in which the forcing is clustered is further illustrated in Fig.5. Specifically, with a phase difference of π/2 in a two-cluster network, we expect the cross-cluster connectivity to be unidirectional, from the leader to the follower. This is indeed what is found in simulation, see Fig.5A. For the case of three clusters, in which clusters 2 and 3 have a phase difference of π/2 and π with cluster 1 respectively, Fig.5B shows that the same unidirectional motif is found between clusters 2 and 1, and 3 and 2, while 1 and 3 become uncoupled, all as predicted from the pairwise theory.
The theory can furthermore be extended to the case in which the phase difference is distributed uniformly across the network, from 0 to π. In this case the interaction between any pair of neurons depends mainly on the weights between those two neurons because the influence of the rest of the network is close to zero, see Appendix B for details. In this case, neurons with similar phases are expected to form strong recurrent connections, while sufficiently different phase difference will lead to leader-follower unidirectional motifs, and phase differences near π will lead to complete uncoupling. Numerical simulations show that the resulting synaptic weight matrix is, in fact, very close to that predicted from the pure pairwise theory, see Fig.5C.
The correlation in the drive between the two noise sources is c, and the input to neuron 2 is lagged by an amount D. Given this input, we can calculate the self-consistent dynamics for the synaptic weights, as before, assuming that the impact of each spike pair is weak compared to the dynamic range of the synapse. The equations are FIG. 5. Network connectivity depends on the distribution of phases and is well-predicted from pairwise theory. A. A two-cluster network with a phase difference of π/2. B. A three-cluster network with phases φ 1 = φ, φ 2 = φ + π/2 and φ 3 = φ + π. C. A network with phases distributed uniformly between 0 and π. Other parameters are the same as in Fig.4 ẇ 12 = σ 2 8 where Note that lim D→0 G + = lim D→0 G − = F . It may appear from Eqs.7 that the dynamics is singular for w 12 → 0 or w 21 → 0, but these limits are, in fact, well defined, see Appendix C for details.
Physiological assumption: Correlated, noisy drive Neuronal activity reflects in part sensory input and in part the internal state of the animal. Many neurons respond selectively to particular features of a sensory stimulus, for example the orientation of bar [14], or the direction of motion of an object [19]. In the face of a generic time-varying sensory input, neurons with similar feature selectvities in a given cortical area will likely receive correlated inputs, e.g. they may share presynaptic inputs. Similarly, neurons with quite different feature selectivities may receive nearly uncorrelated or even negatively correlated inputs. In this section we investigate how the recurrent connectivity between such neurons is affected by the degree of correlation in their inputs. We choose Gaussian white-noise processes with a given correlation for simplicity.
As before, there are no fixed point solutions of Eqs.7. However, the equations together with the condition that 0 ≤ w ij ≤ w max , can be used to determine which synaptic states are stable, as in the previous section. Fig.6 shows sample phase planes as a function of the correlation and delay in the common noisy drive, and for three values of w max . In all cases, the fully potentiated state (PP) and the asymmetric state (DP) are favored for positive correlations. The state (DP) corresponds to a potentiated connection from 1 to 2, which occurs when the delay is sufficiently long (2 follows 1 here). For negative correlations the phase diagram is more complex, allowing for up to eight distinct regions. In general, the fully depressed state is stable when correlations are negative and the delay is not too large.
On the other hand, the asymmetric state (PD) is stable everywhere for negative c. plasticity rule we have chosen is formally a spike-timing dependent rule (STDP), although the fact that we generate spikes as a Poisson process insures that the actual spike timing plays no role here. Rather it is only variations in the underlying rates for the Poisson processes which can lead to plasticity in our model. This appears to be the dominant factor in shaping plasticity given realistic spike trains, even when spiking correlations are taken into account [11]. We additionally assume that the plasticity rule is balanced, namely that the integral over the STDP window is identically zero. If this is not the case, then the synaptic weights between neurons with non-zero rates would always grow or decay, depending on the sign of the integral. This implies that in network simulations the final synaptic weight matrix will always saturate or decay to zero, ruling out the emergence of any non-trivial structure. Additional mechanisms, such as homeostasis, are needed in this case in order to avoid saturation [30]. On the other hand, with the balance assumption only time-variations in the firing rates can drive plasticity. Specifically, the change in a given synaptic weight depends on the covariance of the pre-and post-synaptic firing weights, multiplied by the STDP window [25].
We also note that our equations do not take into account the variability arising due to the stochasticity of firing, but rather only the mean Poisson rates. In numerical simulations of the full, stochastic system, one observes fluctuations which may momentarily drive the synaptic weight away from its stable meanfield value (this is clearly seen in Fig.7) or even cause transitions between stable states. These effects are not captured by Eqs.5 or Eqs.7.
Oscillatory drive: In the case of two periodically forced neurons, the resulting connectivity motif depends solely on the phase difference, while the time it takes for the connectivity to reach its steady state depends strongly on the forcing frequency. When potentiation dominates at short latencies then small phase differences lead to a fully potentiated motif.
Larger phase differences generate a unidirectional motif in which the connection from leader to follower potentiates whereas the other depresses. Finally in the vicinity of anti-phase forcing both synapses depress and the neurons decouple. Analysis of Eqs.5 furthermore reveals several regions of bistability between these different motifs. Simulations agree well with the analysis, see Fig.3. Previous work on STDP in a network model of hippocampus found similar effects of the phase of oscillation on connectivity motifs through simulation [4].
The evolution equations for the synaptic weights of a network of arbitrary size can be derived using the separation of time scales technique. However, for several cases of interest the theory for pairs of neurons can be used to gain insight into the resutling connectivity of large networks. This includes the case of several clusters forced at different phases, as well as the case of a uniform distribution of phases. When phases are widely distributed the general finding is the emergence of hierarchical structure. Specifically, there is clustering locally between neurons with similar phases of the forcing, but between neuron pairs with disparate phases, unidirectional connections form according to the leader-follower phaserelationship. It is interesting to note that precisely this type of hierarchical clustering in cortical microcircuits has been inferred from data collected through multiple patch-clamp experiments in slices [27].
Noisy drive: In the case of two neurons forced by white-noise inputs with a given correlation and time-delay, the resulting phase diagram is very rich, see Fig.6. Although positive correlations tend to lead to potentiation and negative correlations to depression, the combined effect of correlation and delay is complex, and multistability is the rule. This is borne out in numerical simulations for pairs of neurons, see Fig.7. It is unclear how this will affect the emergent connectivity in large networks of neurons, and requires additional study.
The plasticity process we describe here consists of a build up over time of a large number of small changes. As such, it is slow, and would not be a relevant mechanism for rapid memory formation, such as episodic memory. Rather, such a process shapes the connectivity in recurrent circuits in accordance with regularities in the statistics of the inputs. For example, if we consider an area in the visual pathway, neurons with similar feature selectivity and overlapping receptive fields would exhibit positive correlations in their output in response to a time-varying stimulus. Both for the case of oscillatory as well as noisy dynamics our analysis would predict a potentiation of recurrent connections between these neurons. This would lead to an enhanced response to a similar stimulus over time. Neurons with similar feature selectivity but non-overlapping receptive fields would likely exhibit similar yet timedelayed (or out-of-phase) inputs in response to the motion of an object across the visual field, etc. Therefore, we expect that the statistics of some sensory stimuli can be mapped on to the parameters of the inputs in our model: frequency, phase, input-correlation, delay. Repeated exposure to the same sensory stimuli would lead to a slow reshaping of the recurrent circuit connectivity and hence the neuronal response. Such a process may be relevant for the phenomenon of perceptual learning [8].

APPENDIX A: DERIVATION OF SELF-CONSISTENT EQUATIONS FOR SYNAP-TIC WEIGHTS
Here we derive a self-consistent set of equations for the synaptic weights w 12 and w 21 by assuming a separation of time-scales between the rate dynamics and the synaptic plasticity.
We first note that the evolution equations for the synaptic weights, Eqs.2 can be rewritten for the case of stationary rate dynamics by noting that r i (t)r j (t−T ) t = r i (t+T )r j (t) t , which allows for a change of variables in the second integral, leading to Eq.3. Strictly speaking this correspondence only holds when the dynamics has been averaged over the fast time. As we are only interested in the slow-time dynamics, we write Eq.3 as if the correspondence were exact, cognizant that it is a slight abuse of notation. Now, given real-valued, time-varying inputs to the two neurons, the equations are τṙ 1 = −r 1 + w 12 r 2 + I 1 (t), τṙ 2 = −r 2 + w 21 r 1 + I 2 (t), where A(T ) is the plasticity rule. There is no general analytical solution to these equations given the quadratic nonlinearities. However, if we assume that each synaptic weight change is small, then the synaptic weights will evolve much more slowly than the rates and we can formally separate the time scales in a multi-scale analysis. To do this we introduce the small parameter ǫ ≪ 1 such that A(T ) = ǫÃ(T ). We also introduce the slow time t s = ǫt and allow for the rates and weights to evolve on both fast and slow time scales, i.e. they are functions of t and t s , and these two times are taken to be independent variables.
At order O(ǫ) we have τ ∂ ts r 0 1 + τ ∂ t r 1 1 = −r 1 1 + w 1 12 r 0 2 + w 0 12 r 1 2 , τ ∂ ts r 0 2 + τ ∂ t r 1 2 = −r 1 2 + w 1 21 r 0 1 + w 0 21 r 1 1 , The first two equations give a correction to the leading order solution of the firing rates, which we will not use here. The weight equations at first glance do not seem solvable since we are expected to solve for both the leading order solution of the synaptic weights as well as the next order correction in the same set of equations. However, we know that the leading order terms are independent of the fast time t, which will allow us to solve for both. Specifically, the evolution of the leading-order weights will depend only on those terms from the integral which are independent of the fast time. This leads to Eqs.15. For simplicity in notation, in what follows we will drop the superscripts and tildes and write, for the leading-order solution, simply τ ∂ t r 1 = −r 1 + w 12 r 2 + I 1 (t), τ ∂ t r 2 = −r 2 + w 21 r 1 + I 2 (t), dT A(T )r 2 (t)r 1 (t + T ), We will consider the specific cases of oscillatory and noisy drive below.

APPENDIX B: OSCILLATORY DRIVE
Here we study the case where the neurons are driven sinusoidally with a frequency ω and with a phase difference of φ, i.e. I 1 = I 0 + Ie iωt+iφ 1 and I 2 = I 0 + Ie iωt+iφ 2 . The (complex) rates can be written (r 1 , r 2 ) = (R 10 (t s ), R 20 (t s )) + (R 11 (t s ), R 21 (t s ))e iωt . We find that Because we only consider balanced plasticity rules here, the constant, baseline rates will not affect the synaptic rates. Nonetheless, when conducting numerical simulations it is important to take large enough constant drive I 0 to ensure positive rates.
In order to calculate the equations for the synaptic weights we must use the real part of the complex rates. As an illustration we consider the equation for the weight w 12 , which is dTĀ(T )Re(r 2 (t))Re(r 1 (t + T )).

The quadratic term
Re(r 2 (t))Re(r 1 (t + T )) = 1 4 R 2 e iωt +R 2 e −iωt · R 1 e iωt+iωT +R 1 e −iωt−iωT , Note that the first two terms are independent of the fast time t, while the second two terms oscillate on the fast timescale with a frequency 2ω. Integrating over the fast timescale therefore eliminates the latter terms. Performing the second integral and then doing the analogous calculation for the other weight leads to Eqs.5 where φ = φ 2 − φ 1 .
Growth rate of synaptic weights. While the final state of the synaptic weights depends on the phase difference, the rate at which plasticity occurs is strongly influenced by the frequency of forcing. This can be most easily seen for the case of in-phase forcing φ = 0, for which we expect both weights to potentiate (or depress for an anti-hebbian rule). Assuming w ij = w ji leads to a right-hand side (growth rate) of Eqs.5 which is simply proportional toÃ + (ω) −Ã − (ω) which is zero for ω = 0 and as ω → ∞, while it has a maximum for

Theory for networks
For the case of n coupled neurons, the rate equation for the ith neuron is where I i = I 0 + Ie iφ i , while the evolution equation for the synaptic weight from neuron j to neuron i is still described by Eq.3. We can once again apply the separation of timescales formally by defining A(T ) = ǫÃ(T ) where ǫ ≪ 1 and defining the slow time t s = ǫt. The rates can be written in vector form as r(t, t s ) = R 0 (t s ) + R 1 (t s , ω)e iωt where where I, W are the identity matrix and weight matrix respectively, e is a vector of ones, while the jth element of the vector p is e iφ j .
Applicability of pairwise theory to network simulations If we consider the rate equations for a pair of neurons j and k in the network Eqs.17 we find, applying the separation of time-scales approach detailed in Appendix A, that the oscillatory components obey where ξ a = 1 N N l=1, =j,k w 0 al R l1 . From this we find the complex amplitudes where R = 1/((1 + iτ ω) 2 − w 0 jk w 0 kj ). Note that these equations are identical to those for the complex amplitudes for the pairwise case (with renormalized weights), Eqs.16, with the exception of the meanfield terms ξ j and ξ k . The slow dynamics of the synaptic weight w jk is then given by Note that in principle the rates r k and r j still depend on the meanfield terms and hence this equation is not self-consistent as in the pairwise case. The influence of these meanfield terms depends strongly on the distribution of phases of the complex amplitudes. In one of the two limiting cases, if all of the phases are aligned then the moduli of the terms all sum. This is equivalent to the summation of vectors all with the same angle. In the other limiting case, if the phases are uniformaly distributed, then the resultant modulus will be close to zero because we are summing many vectors all with distinct phases (as long as the moduli and phases are only weakly correlated or uncorrelated). Hence in this limit the inlfuence of the meanfield vanishes and only the pairwise interactions matter. This latter case is the relevant one for Fig.5C and explains why the pairwise theory correctly predicts the network structure after learning.

Network simulations
For the simulations shown in Fig.4, the following nonlinear rate equations were used τṙ = −r + αΘ 1 K Wr − I 0 + I , where Θ is the Heaviside function. A spike from a neuron i in a timestep ∆t occurs with probability r i ∆t. Given the spike trains from neurons i and j, a weight w ij undergoes updates from all spike pairs according to the STDP rule, see [21] for the numerical scheme.
For the simulations in Fig.5 linear rate equations are used.
Therefore, the noisy drive to the two neurons has correlation c. The correlated input is delayed to neuron 2 with respect to neuron 1 by a time D.
To solve the system of rate equations we rewrite it in vector form as where r = (r 1 , r 2 ), I = (I 1 , I 2 ) and W =   −1 w 12 We diagonalize the connectivity matrix W = QΛQ −1 and obtain the system of independent equations τu = Λu + Q −1 I, where u = Q −1 r. The matrices resulting from the diagonalization are where dW u , dW v and dW r are the stochastic differentials corresponding to the Gaussian yields the expected value of the product of rates. Namely, dT A(T )E(r 2 (t)r 1 (t + T )), and similarly for w 21 . Evaluating this expectation requires products of stochastic integrals.
For independent processes this expectation is always zero, while for integrals of the same process the product can be expressed as a standard integral through the so-called Ito isometry.
For example, which is independent of t at long times. Performing these integrals yields the evolution equations, Eqs.7.
If w 12 ≪ 1 and w 21 can be order one, theṅ