From Feedback- to Response-Based Performance Monitoring in Active and Observational Learning
Humans can adapt their behavior by learning from the consequences of their own actions or by observing others. Gradual active learning of action-outcome contingencies is accompanied by a shift from feedback- to response-based performance monitoring. This shift is reflected by complementary learning-related changes of two ACC-driven ERP components, the feedback-related negativity (FRN) and the error-related negativity (ERN), which have both been suggested to signal events "worse than expected," that is, a negative prediction error. Although recent research has identified comparable components for observed behavior and outcomes (observational ERN and FRN), it is as yet unknown whether these components are similarly modulated by prediction errors and thus also reflect behavioral adaptation. In this study, two groups of 15 participants learned action-outcome contingencies either actively or by observation. In active learners, FRN amplitude for negative feedback decreased and ERN amplitude in response to erroneous actions increased with learning, whereas observational ERN and FRN in observational learners did not exhibit learning-related changes. Learning performance, assessed in test trials without feedback, was comparable between groups, as was the ERN following actively performed errors during test trials. In summary, the results show that action-outcome associations can be learned similarly well actively and by observation. The mechanisms involved appear to differ, with the FRN in active learning reflecting the integration of information about own actions and the accompanying outcomes.
Efficient regulation of internal homeostasis and defending it against perturbations requires adaptive behavioral strategies. However, the computational principles mediating the interaction between homeostatic and associative learning processes remain undefined. Here we use a definition of primary rewards, as outcomes fulfilling physiological needs, to build a normative theory showing how learning motivated behaviors may be modulated by internal states. Within this framework, we mathematically prove that seeking rewards is equivalent to the fundamental objective of physiological stability, defining the notion of physiological rationality of behavior. We further suggest a formal basis for temporal discounting of rewards by showing that discounting motivates animals to follow the shortest path in the space of physiological variables toward the desired setpoint. We also explain how animals learn to act predictively to preclude prospective homeostatic challenges, and several other behavioral patterns. Finally, we suggest a computational role for interaction between hypothalamus and the brain reward system.
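The claimed equivalence between reward seeking and physiological stability can be illustrated with a minimal sketch. It assumes, for illustration only, that the drive is the Euclidean distance between the internal state and the setpoint and that the reward of an outcome is the drive reduction it causes; the names `drive` and `homeostatic_reward` and the numeric values are hypothetical and are not taken from the abstract.

```python
import numpy as np

def drive(state, setpoint):
    """Distance of the internal state from the setpoint (Euclidean by assumption)."""
    return float(np.linalg.norm(np.asarray(setpoint, float) - np.asarray(state, float)))

def homeostatic_reward(state, outcome, setpoint):
    """Reward of an outcome, defined here as the reduction in drive it produces."""
    next_state = np.asarray(state, float) + np.asarray(outcome, float)
    return drive(state, setpoint) - drive(next_state, setpoint)

# Hypothetical two-variable internal state (e.g. temperature, glucose level).
setpoint = [37.0, 5.0]
deprived = [37.0, 3.0]

# An outcome that moves the state toward the setpoint is rewarding,
# while one that moves it away is punishing.
r_toward = homeostatic_reward(deprived, [0.0, 1.0], setpoint)
r_away = homeostatic_reward(deprived, [0.0, -1.0], setpoint)

# Rewards along a trajectory sum to the net reduction in drive.
step1 = homeostatic_reward(deprived, [0.0, 1.0], setpoint)
step2 = homeostatic_reward([37.0, 4.0], [0.0, 1.0], setpoint)
total_reward = step1 + step2
```

Under this definition the undiscounted sum of rewards equals the initial drive minus the final drive, so maximizing accumulated reward and minimizing deviation from the setpoint coincide in this toy setting.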
We present two examples of how human-like behavior can be implemented in a model of a computer player to improve its characteristics and decision-making patterns in a video game. First, we describe a reinforcement learning model that helps choose the best weapon depending on reward values obtained from shooting combat situations. Second, we consider obstacle-avoiding path planning adapted to a tactical visibility measure. We describe an implementation of a path-smoothing model that allows the use of penalties (negative rewards) for walking through "bad" tactical positions. We also study path-finding algorithms, such as an improved I-ARA* search algorithm for dynamic graphs that copies the human discrete decision-making model of reconsidering goals, similar to the PageRank algorithm. All the approaches demonstrate how human behavior can be modeled in applications where the perception of intelligent agent actions is significant.
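The idea of choosing a weapon from reward values can be sketched as a simple epsilon-greedy value learner. This is not the model from the paper: the class `WeaponSelector`, the weapon names, and the simulated reward means are all hypothetical, and the sketch ignores combat context such as distance to the target.

```python
import random

class WeaponSelector:
    """Epsilon-greedy learner keeping a running mean reward per weapon."""

    def __init__(self, weapons, epsilon=0.1, seed=0):
        self.values = {w: 0.0 for w in weapons}   # estimated mean reward
        self.counts = {w: 0 for w in weapons}     # number of updates per weapon
        self.epsilon = epsilon
        self.rng = random.Random(seed)

    def choose(self):
        # Explore with probability epsilon, otherwise pick the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(list(self.values))
        return max(self.values, key=self.values.get)

    def update(self, weapon, reward):
        # Incremental mean: V <- V + (r - V) / n
        self.counts[weapon] += 1
        self.values[weapon] += (reward - self.values[weapon]) / self.counts[weapon]

# Simulated combat: hypothetical mean rewards (e.g. damage dealt) per weapon.
true_means = {"pistol": 1.0, "shotgun": 2.0, "rifle": 3.0}
bot = WeaponSelector(true_means, epsilon=0.2, seed=1)
noise = random.Random(2)
for _ in range(500):
    weapon = bot.choose()
    bot.update(weapon, noise.gauss(true_means[weapon], 0.5))
```

A penalty for a poor outcome is passed to `update` as a negative reward, which lowers the running estimate for that weapon.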
This article considers a combination of two modern aspects of game development: (i) the impact of high-quality graphics and virtual reality (VR) on how users adapt to believe in the realness of in-game events seen with their own eyes; (ii) modeling the behavior of an enemy under automatic computer control, called a BOT, which reacts similarly to human players. We consider the First-Person Shooter (FPS) game genre, which simulates the experience of combat. We describe some tricks to overcome simulator sickness in a shooter with respect to the Oculus Rift and HTC Vive headsets. We created a BOT model that strongly reduces conflict and uncertainty in matching human expectations. The BOT passes the Alan Turing test in a VR game with an 80% threshold of believable human-like behavior.
Adaptive and Learning Agents Workshop at International Joint Conference on Autonomous Agents and Multiagent Systems
The article considers the conditions necessary for arranging an educational environment that can provide the required level of training. One of the key factors affecting the quality of education is the professional level of teachers, their ability and willingness to design their own training technologies and to use non-standard methods of solving educational problems, such as active learning. The article presents research on the process of assessing written papers and on a project-based learning experiment.
Our decisions are affected not only by objective information about the available options but also by other people. Recent brain imaging studies have adopted the cognitive neuroscience approach for studying the neural mechanisms of social influence. A number of studies have shown that social influence is associated with neural activity in the medial prefrontal cortex and ventral striatum, which are two brain areas involved in the fundamental and not exclusively social mechanisms of performance monitoring. Therefore, the neural mechanisms of social influence could be deeply integrated into our general neuronal performance-monitoring mechanisms.
This paper analyzes distractor effects on attentional task performance in different paradigms. I demonstrate how distractors may affect performance negatively (interference effect), positively (redundancy effect), or neutrally (null effect). Distractor effects described in the literature are classified according to their hypothetical source. The general rule of the theory is also introduced. It contains a formal prediction of the particular distractor effect, based on entropy and redundancy measures from the mathematical theory of communication (Shannon, 1948). Single- vs dual-process frameworks are considered as hypothetical mechanisms underpinning the distractor effects. Distractor profiles (DPs) are also introduced for the formalization and simple visualization of experimental data concerning the distractor effects. Typical shapes of DPs and their interpretations are discussed with examples from three frequently cited experiments. Finally, the paper introduces a hierarchical hypothesis stating that distractor effects of different classes modulate one another in a level-wise fashion.
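The entropy and redundancy measures mentioned above are the standard quantities of Shannon's theory and can be computed as in the following sketch. It does not reproduce the paper's general rule; the function names and the example distributions are hypothetical, and relative redundancy is taken as R = 1 - H / H_max with H_max the entropy of a uniform distribution over the same outcomes.

```python
import math

def entropy(probs):
    """Shannon entropy in bits; zero-probability outcomes contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def redundancy(probs):
    """Relative redundancy R = 1 - H / H_max, where H_max = log2(number of outcomes).

    A single-outcome distribution has H_max = 0; 0.0 is returned by convention.
    """
    h_max = math.log2(len(probs))
    return 1.0 - entropy(probs) / h_max if h_max > 0 else 0.0

# Hypothetical stimulus distributions over two possible display states.
uniform = [0.5, 0.5]        # maximally uncertain display
predictable = [0.9, 0.1]    # display that is largely predictable

h_uniform = entropy(uniform)
r_uniform = redundancy(uniform)
r_predictable = redundancy(predictable)
```

In this sketch a more predictable display has lower entropy and therefore higher redundancy than a uniform one.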
This article describes the experience of studying factors influencing the social well-being of educational migrants as measured by means of a psychological well-being scale (A. Perrudet-Badoux, G.A. Mendelsohn, J. Chiche, 1988) previously adapted for Russian by M.V. Sokolova. A statistical analysis of the scale's reliability is performed. Trends in the dynamics of subjective well-being are identified on the basis of a correlation analysis between the conditions of adaptation and its success rate, and potential mechanisms for developing subjective well-being among student migrants living in student hostels are described. Particular attention is paid to commuting as a factor of adaptation.