Summary
Neural plasticity can be seen as ultimately aiming at the maximization of reward. However, the world is complicated and nonlinear, and so are neurons’ firing properties. For a neuron, learning which changes lead to more reward is an estimation problem: would there have been more reward if the neural activity had been different? Statistically, this is a causal inference problem. Here we show how the spiking discontinuity of neurons can be used as a tool to estimate the causal influence of a neuron’s activity on reward. We show how this discontinuity can be used to derive a novel learning rule that can operate in the presence of nonlinearities and the confounding influence of other neurons. We establish a link between simple learning rules and an existing causal inference method from econometrics, yielding proofs of both the correctness of the approach and its asymptotic behavior.
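To make the idea of estimating at a threshold concrete, the following is a minimal sketch on toy data, not the learning rule derived in this paper. It assumes the econometric method in question behaves like a regression discontinuity estimator: the neuron spikes when a scalar drive `z` crosses a threshold, and reward depends both causally on the spike (through a hypothetical effect `beta`) and smoothly on `z` itself, which stands in for confounding by other neurons. All function names, parameter values, and the reward model are illustrative assumptions.

```python
import random


def simulate(n=100_000, threshold=1.0, beta=2.0, seed=0):
    """Toy (drive, spike, reward) samples; every parameter here is hypothetical."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)              # drive of the neuron
        spike = 1 if z >= threshold else 0   # spiking discontinuity
        # Reward: causal effect of the spike plus a smooth, nonlinear
        # dependence on z (the confound) plus noise.
        reward = beta * spike + 3.0 * z + z * z + rng.gauss(0.0, 1.0)
        samples.append((z, spike, reward))
    return samples


def mean(xs):
    return sum(xs) / len(xs)


def naive_estimate(samples):
    """Mean reward with a spike minus mean reward without one (confounded)."""
    spiked = [r for _, s, r in samples if s == 1]
    silent = [r for _, s, r in samples if s == 0]
    return mean(spiked) - mean(silent)


def fit_at_threshold(points, threshold):
    """Least-squares line through (z, reward) points, evaluated at the threshold."""
    zs = [z - threshold for z, _ in points]
    rs = [r for _, r in points]
    z_bar, r_bar = mean(zs), mean(rs)
    sxx = sum((z - z_bar) ** 2 for z in zs)
    sxy = sum((z - z_bar) * (r - r_bar) for z, r in zip(zs, rs))
    slope = sxy / sxx
    return r_bar - slope * z_bar  # intercept, i.e. fitted reward at z = threshold


def rdd_estimate(samples, threshold, window):
    """Jump in fitted reward at the threshold, using only near-threshold samples."""
    above = [(z, r) for z, s, r in samples if s == 1 and z - threshold < window]
    below = [(z, r) for z, s, r in samples if s == 0 and threshold - z < window]
    return fit_at_threshold(above, threshold) - fit_at_threshold(below, threshold)


if __name__ == "__main__":
    data = simulate()
    print("naive difference of means:", naive_estimate(data))
    print("discontinuity estimate:   ", rdd_estimate(data, threshold=1.0, window=0.2))
```

In this toy setting the naive comparison mixes the spike's effect with the smooth dependence of reward on the drive, whereas the comparison restricted to a narrow window around the threshold isolates the jump attributable to the spike. The window width trades bias against variance and is chosen arbitrarily here.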