Models of Learning in Games: A Survey
This survey analyzes the central ideas and the current state of the economic theory of learning in games. Within game theory, learning can be thought of both as an alternative to equilibrium and as a way to better understand the nature of equilibria. Beyond game theory, the theory of learning casts economic theory (for example, the classic Cournot oligopoly) in a new light, poses interesting theoretical problems, is nontrivial from an econometric perspective, and can be studied with experimental methods. It also links economics with unexpected scientific disciplines: biology, the philosophy of rationality, and computer science. However, existing surveys are not particularly accessible to beginners, and none are available in Russian. This survey intends to fill these gaps. It can serve both as an introduction and as a short reference. We analyze issues of classification as well as the models themselves. Theoretical descriptions are illustrated with concrete examples. Special attention is devoted to empirical and experimental work. We also draw conclusions and speculate on the prospects of the field and its future role in economic theory.
In this article, I examine a model of oligopolistic competition in which consumers search for prices but have no knowledge of the underlying price distribution. The consumers' behaviour satisfies four consistency requirements and, as a result, their beliefs about the underlying distribution maximise Shannon entropy. I derive the optimal stopping rule and equilibrium price distribution of the model. Unlike in Stahl (1989), the expected price is decreasing in the number of firms. Moreover, consumers can benefit from being uninformed, if the number of firms is sufficiently large.
Efficient regulation of internal homeostasis and defending it against perturbations requires adaptive behavioral strategies. However, the computational principles mediating the interaction between homeostatic and associative learning processes remain undefined. Here we use a definition of primary rewards, as outcomes fulfilling physiological needs, to build a normative theory showing how learning motivated behaviors may be modulated by internal states. Within this framework, we mathematically prove that seeking rewards is equivalent to the fundamental objective of physiological stability, defining the notion of physiological rationality of behavior. We further suggest a formal basis for temporal discounting of rewards by showing that discounting motivates animals to follow the shortest path in the space of physiological variables toward the desired setpoint. We also explain how animals learn to act predictively to preclude prospective homeostatic challenges, and several other behavioral patterns. Finally, we suggest a computational role for interaction between hypothalamus and the brain reward system.
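The link between reward and physiological stability described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that "drive" is the Euclidean distance between the internal state and the setpoint and that reward is the reduction in drive from one step to the next; the abstract does not specify these functional forms, and the names `drive`, `reward`, and `discounted_return` are hypothetical.

```python
import math

def drive(state, setpoint):
    # Assumed form: distance of the internal state from the setpoint.
    return math.dist(state, setpoint)

def reward(state, next_state, setpoint):
    # Assumed form: reward is the reduction in drive achieved by the step.
    return drive(state, setpoint) - drive(next_state, setpoint)

def discounted_return(path, setpoint, gamma=1.0):
    # Sum of step rewards along a path of internal states, discounted by gamma.
    steps = zip(path, path[1:])
    return sum(gamma ** t * reward(a, b, setpoint)
               for t, (a, b) in enumerate(steps))

setpoint = (3.0, 4.0)
direct = [(0.0, 0.0), (3.0, 4.0)]
detour = [(0.0, 0.0), (0.0, 4.0), (3.0, 4.0)]

# Without discounting, both paths earn the same total: the initial drive.
undiscounted_direct = discounted_return(direct, setpoint)
undiscounted_detour = discounted_return(detour, setpoint)

# With discounting, the path that reaches the setpoint sooner earns more.
discounted_direct = discounted_return(direct, setpoint, gamma=0.9)
discounted_detour = discounted_return(detour, setpoint, gamma=0.9)
```

Under these assumptions the undiscounted rewards telescope, so maximizing reward amounts to minimizing the final distance from the setpoint, while a discount factor below one favors the shorter route through the space of physiological variables.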
We present two examples of how human-like behavior can be implemented in a computer-player model to improve its characteristics and decision-making patterns in a video game. First, we describe a reinforcement learning model that helps to choose the best weapon depending on reward values obtained from shooting combat situations. Second, we consider obstacle-avoiding path planning adapted to a tactical visibility measure. We describe an implementation of a path-smoothing model that allows the use of penalties (negative rewards) for walking through "bad" tactical positions. We also study path-finding algorithms, such as an improved I-ARA* search algorithm for dynamic graphs, by copying a human discrete decision-making model of reconsidering goals, similar to the PageRank algorithm. All the approaches demonstrate how human behavior can be modeled in applications where the actions of an intelligent agent are highly perceptible.
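As a rough illustration of reward-driven weapon choice, the sketch below uses a simple epsilon-greedy value-learning scheme. This is an assumed stand-in, not the model from the paper: the weapon names, reward values, and the functions `choose_weapon`, `update`, and `train` are all hypothetical.

```python
import random

def choose_weapon(q_values, epsilon, rng):
    # Explore a random weapon with probability epsilon, otherwise exploit.
    if rng.random() < epsilon:
        return rng.choice(sorted(q_values))
    return max(q_values, key=q_values.get)

def update(q_values, weapon, reward, alpha=0.1):
    # Move the value estimate for this weapon toward the observed reward.
    q_values[weapon] += alpha * (reward - q_values[weapon])

def train(mean_reward, episodes=2000, epsilon=0.1, noise=0.5, seed=0):
    # mean_reward maps each (hypothetical) weapon to its average combat reward.
    rng = random.Random(seed)
    q_values = {weapon: 0.0 for weapon in mean_reward}
    for _ in range(episodes):
        weapon = choose_weapon(q_values, epsilon, rng)
        observed = rng.gauss(mean_reward[weapon], noise)
        update(q_values, weapon, observed)
    return q_values

# Illustrative reward levels only; not taken from the paper.
learned = train({"pistol": 1.0, "shotgun": 3.0, "rifle": 2.0}, noise=0.0)
best = max(learned, key=learned.get)
```

In a fuller model the value table would presumably be indexed by the combat situation (for example, distance to the target) as well as by the weapon; the single-state version here only shows the update rule.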
This article considers a combination of two modern aspects of game development: (i) the impact of high-quality graphics and virtual reality (VR) adaptation on the user's belief, through their own eyes, in the realness of in-game events; (ii) modeling the behavior of a computer-controlled enemy, called a BOT, which reacts similarly to human players. We consider the First-Person Shooter (FPS) game genre, which simulates the experience of combat actions. We describe some tricks to overcome simulator sickness in a shooter for the Oculus Rift and HTC Vive headsets. We created a BOT model that strongly reduces the conflict and uncertainty in matching human expectations. The BOT passes the Alan Turing test in a VR game with an 80% threshold of believable human-like behavior.
Adaptive and Learning Agents Workshop at International Joint Conference on Autonomous Agents and Multiagent Systems
Humans often change their beliefs or behavior due to the behavior or opinions of others. This study explored, with the use of human event-related potentials (ERPs), whether social conformity is based on a general performance-monitoring mechanism. We tested the hypothesis that conflicts with a normative group opinion evoke a feedback-related negativity (FRN) often associated with performance monitoring and subsequent adjustment of behavior. The experimental results show that individual judgments of facial attractiveness were adjusted in line with a normative group opinion. A mismatch between individual and group opinions triggered a frontocentral negative deflection with the maximum at 200 ms, similar to FRN. Overall, a conflict with a normative group opinion triggered a cascade of neuronal responses: from an earlier FRN response reflecting a conflict with the normative opinion to a later ERP component (peaking at 380 ms) reflecting a conforming behavioral adjustment. These results add to the growing literature on neuronal mechanisms of social influence by disentangling the conflict-monitoring signal in response to the perceived violation of social norms and the neural signal of a conforming behavioral adjustment.
Our decisions are affected not only by objective information about the available options but also by other people. Recent brain imaging studies have adopted the cognitive neuroscience approach for studying the neural mechanisms of social influence. A number of studies have shown that social influence is associated with neural activity in the medial prefrontal cortex and ventral striatum, which are two brain areas involved in the fundamental and not exclusively social mechanisms of performance monitoring. Therefore, the neural mechanisms of social influence could be deeply integrated into our general neuronal performance-monitoring mechanisms.
Smoking is a problem, bringing significant social and economic costs to Russian society. However, ratification of the World Health Organization Framework Convention on Tobacco Control makes it possible to improve Russian legislation according to the international standards. So, I describe some measures that should be taken by the Russian authorities in the nearest future, and I examine their efficiency. By studying the international evidence I analyze the impact of the smoke-free areas, advertisement and sponsorship bans, tax increases, etc. on the prevalence of smoking, cigarette consumption and some other indicators. I also investigate the obstacles confronting the Russian authorities when they introduce new policy measures and the public attitude towards these measures. I conclude that there is a number of easy-to-implement anti-smoking activities that need no financial resources but only a political will.
One of the most important indicators of a company's success is the increase in its value. The article examines traditional methods of company valuation and presents evidence that applying these methods is incorrect in the new stage of the economy. It is therefore necessary to create a new valuation method based on the new main source of a company's success, namely its intellectual capital.