Task Planning in “Block World” with Deep Reinforcement Learning
Humans often change their beliefs or behavior due to the behavior or opinions of others. This study explored, with the use of human event-related potentials (ERPs), whether social conformity is based on a general performance-monitoring mechanism. We tested the hypothesis that conflicts with a normative group opinion evoke a feedback-related negativity (FRN) often associated with performance monitoring and subsequent adjustment of behavior. The experimental results show that individual judgments of facial attractiveness were adjusted in line with a normative group opinion. A mismatch between individual and group opinions triggered a frontocentral negative deflection with the maximum at 200 ms, similar to FRN. Overall, a conflict with a normative group opinion triggered a cascade of neuronal responses: from an earlier FRN response reflecting a conflict with the normative opinion to a later ERP component (peaking at 380 ms) reflecting a conforming behavioral adjustment. These results add to the growing literature on neuronal mechanisms of social influence by disentangling the conflict-monitoring signal in response to the perceived violation of social norms and the neural signal of a conforming behavioral adjustment.
Drug addiction implicates both the reward learning and the homeostatic regulation mechanisms of the brain. This has stimulated two partially successful theoretical perspectives on addiction. Many important aspects of addiction, however, remain to be explained within a single, unified framework that integrates the two mechanisms. Building upon a recently developed homeostatic reinforcement learning theory, the authors focus on a key transition stage of addiction that is well modeled in animals, escalation of drug use, and propose a computational theory of cocaine addiction in which cocaine reinforces behavior through its rapid homeostatic corrective effect, whereas its chronic use induces slow and long-lasting changes in the homeostatic setpoint. Simulations show that the new theory accounts for key behavioral and neurobiological features of addiction, most notably escalation of cocaine use, drug-primed craving and relapse, individual differences underlying dose-response curves, and dopamine D2-receptor downregulation in addicts. The theory also generates unique predictions about cocaine self-administration behavior in rats that are confirmed by new experimental results. Viewing addiction as a homeostatic reinforcement learning disorder coherently explains many behavioral and neurobiological aspects of the transition to cocaine addiction and suggests a new perspective toward understanding addiction.
Efficiently regulating internal homeostasis and defending it against perturbations requires adaptive behavioral strategies. However, the computational principles mediating the interaction between homeostatic and associative learning processes remain undefined. Here we use a definition of primary rewards, as outcomes fulfilling physiological needs, to build a normative theory showing how the learning of motivated behaviors may be modulated by internal states. Within this framework, we mathematically prove that seeking rewards is equivalent to the fundamental objective of physiological stability, defining the notion of physiological rationality of behavior. We further suggest a formal basis for temporal discounting of rewards by showing that discounting motivates animals to follow the shortest path in the space of physiological variables toward the desired setpoint. We also explain how animals learn to act predictively to preclude prospective homeostatic challenges, and we account for several other behavioral patterns. Finally, we suggest a computational role for the interaction between the hypothalamus and the brain reward system.
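The drive-reduction idea in the abstract above can be illustrated with a minimal Python sketch. The abstract gives no formulas, so everything here is an assumption made for illustration: the Euclidean drive function, the names `drive`, `reward`, and `discounted_return`, and the value `gamma=0.9`. The sketch treats a primary reward as the reduction in distance to the setpoint and compares a direct path with a detour.

```python
def drive(state, setpoint):
    """Distance of the internal state from the setpoint.

    Assumption: plain Euclidean distance; the abstract does not specify the form.
    """
    return sum((s_star - s) ** 2 for s, s_star in zip(state, setpoint)) ** 0.5


def reward(state, next_state, setpoint):
    """Primary reward as drive reduction: positive when the outcome
    moves the internal state closer to the setpoint."""
    return drive(state, setpoint) - drive(next_state, setpoint)


def discounted_return(path, setpoint, gamma=0.9):
    """Sum of discounted drive-reduction rewards along a path of internal states."""
    total = 0.0
    for t in range(len(path) - 1):
        total += gamma ** t * reward(path[t], path[t + 1], setpoint)
    return total


setpoint = (0.0, 0.0)
direct = [(3.0, 4.0), (0.0, 0.0)]               # straight to the setpoint
detour = [(3.0, 4.0), (6.0, 8.0), (0.0, 0.0)]   # moves away first, then returns

print(discounted_return(direct, setpoint))             # discounted, direct path
print(discounted_return(detour, setpoint))             # discounted, detour
print(discounted_return(detour, setpoint, gamma=1.0))  # undiscounted detour
```

Without discounting (`gamma=1.0`), the rewards telescope, so both paths yield the same return: the initial drive minus the final drive. With `gamma < 1`, the detour scores lower than the direct path, which is one way to read the claim that discounting favors the shortest path toward the setpoint.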
Our decisions are affected not only by objective information about the available options but also by other people. Recent brain imaging studies have adopted a cognitive neuroscience approach to studying the neural mechanisms of social influence. A number of studies have shown that social influence is associated with neural activity in the medial prefrontal cortex and ventral striatum, two brain areas involved in fundamental, not exclusively social, mechanisms of performance monitoring. Therefore, the neural mechanisms of social influence could be deeply integrated into our general neuronal performance-monitoring mechanisms.