ABSTRACT
Extinction learning suppresses conditioned reward responses and is thus fundamental for adapting to changing environmental demands and for controlling excessive reward seeking. The medial prefrontal cortex (mPFC) monitors and controls conditioned reward responses. Using in vivo multiple single-unit recordings in the mPFC, we studied the relationship between single-unit and population dynamics during different phases of an operant conditioning task. To examine the fine temporal relation between neural activity and behavior, we developed a model-based statistical analysis that captured behavioral idiosyncrasies. We found that single-unit responses to conditioned stimuli changed throughout the course of a session, even under stable experimental conditions and consistent behavior. However, when behavioral responses to task contingencies had to be updated during the extinction phase, unit-specific modulations became coordinated across the whole population, pushing the network into a new stable attractor state. These results show that extinction learning is not associated with suppressed mPFC responses to conditioned stimuli, but is driven by single-unit coordination into population-wide transitions of the animal’s internal state.