Minimal Circuit Model of Reward Prediction Error Computations and Effects of Nicotinic Modulations
Dopamine (DA) neurons in the ventral tegmental area (VTA) are thought to encode reward prediction errors (RPEs) by comparing actual and expected rewards. In recent years, much work has been done to identify how the brain uses and computes this signal. While several lines of evidence suggest that the interplay between DA neurons and inhibitory interneurons in the VTA implements the RPE computation, it remains unclear how DA neurons learn key quantities, such as the amplitude and timing of primary rewards during conditioning tasks. Furthermore, endogenous acetylcholine and exogenous nicotine likely also affect these computations by acting on both VTA DA and GABA (γ-aminobutyric acid) neurons via nicotinic acetylcholine receptors (nAChRs). To explore potential circuit-level mechanisms for RPE computation during classical conditioning tasks, we developed a minimal computational model of the VTA circuitry. The model was designed to account for several reward-related properties of VTA afferents and for recent findings on VTA GABA neuron dynamics during conditioning. With this minimal model, we show that the RPE can be learned by a two-speed process that computes reward timing and magnitude. By including models of nAChR-mediated currents in the VTA DA-GABA circuit, we show that nicotine should reduce the action of acetylcholine on VTA GABA neurons through receptor desensitization and potentially boost DA responses to reward-related signals in a non-trivial manner. Together, our results delineate the mechanisms by which RPEs are computed in the brain and suggest a hypothesis on nicotine-mediated effects on reward-related perception and decision-making.
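For orientation, the notion of an RPE as the difference between actual and expected reward can be sketched with a generic Rescorla-Wagner update. This is a textbook illustration only, not the VTA circuit model developed in this work; the function name, the learning rate, and the trial sequence are assumptions chosen for the example.

```python
def rescorla_wagner(rewards, learning_rate=0.1):
    """Track an expected reward across trials and return the RPE on each trial.

    Generic textbook rule (assumed for illustration), not the paper's model:
        rpe      = actual - expected
        expected = expected + learning_rate * rpe
    """
    expected = 0.0
    rpes = []
    for actual in rewards:
        rpe = actual - expected           # prediction error on this trial
        rpes.append(rpe)
        expected += learning_rate * rpe   # move the expectation toward the outcome
    return expected, rpes


# 50 rewarded trials followed by one trial where the reward is omitted.
trials = [1.0] * 50 + [0.0]
expected, rpes = rescorla_wagner(trials)

print(f"RPE on first rewarded trial: {rpes[0]:.3f}")
print(f"RPE on last rewarded trial:  {rpes[49]:.3f}")
print(f"RPE on omitted reward:       {rpes[50]:.3f}")
```

In this sketch the RPE is large and positive when the reward is unexpected, shrinks toward zero as the expectation is learned, and turns negative when an expected reward is omitted.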