A Comparative Evaluation of Machine Learning Methods for Robot Navigation Through Human Crowds
Robot navigation through crowds poses a difficult challenge for AI systems: the methods must produce fast and efficient movement without compromising safety. Most approaches to date have focused on combining pathfinding algorithms with machine learning for predicting pedestrian walking behavior. More recently, reinforcement learning techniques have been proposed in the research literature. In this paper, we perform a comparative evaluation of pathfinding/prediction and reinforcement learning approaches on a crowd movement dataset collected from surveillance videos taken at Grand Central Station in New York. The results demonstrate the strong superiority of state-of-the-art reinforcement learning approaches over pathfinding with state-of-the-art behavior prediction techniques.
We present two examples of how human-like behavior can be implemented in a model of a computer player to improve its characteristics and decision-making patterns in a video game. First, we describe a reinforcement learning model that helps choose the best weapon depending on reward values obtained from shooting combat situations. Second, we consider obstacle-avoiding path planning adapted to a tactical visibility measure. We describe an implementation of a path smoothing model that allows the use of penalties (negative rewards) for walking through "bad" tactical positions. We also study path-finding algorithms, such as an improved I-ARA* search algorithm for dynamic graphs, by copying a human discrete decision-making model of reconsidering goals, similar to the PageRank algorithm. All the approaches demonstrate how human behavior can be modeled in applications where the perception of intelligent agent actions is significant.
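The idea of penalizing "bad" tactical positions during path planning can be illustrated with a minimal sketch. It is not the smoothing model or the I-ARA* variant from the abstract: it substitutes plain Dijkstra search on a 4-connected grid, and the names `plan_path` and `grid_penalty`, as well as the cost rule (one unit per step plus the penalty of the entered cell), are assumptions made only for illustration.

```python
import heapq

def plan_path(grid_penalty, start, goal):
    """Dijkstra on a 4-connected grid (illustrative stand-in, not I-ARA*).

    Entering a cell costs 1 plus that cell's penalty (assumed cost rule),
    so cells with a high tactical penalty are avoided when a cheaper
    detour exists.
    """
    rows, cols = len(grid_penalty), len(grid_penalty[0])
    best = {start: 0.0}          # cheapest known cost to each cell
    parent = {}                  # back-pointers for path reconstruction
    frontier = [(0.0, start)]
    while frontier:
        cost, cell = heapq.heappop(frontier)
        if cell == goal:
            break
        if cost > best[cell]:
            continue             # stale queue entry
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                new_cost = cost + 1.0 + grid_penalty[nr][nc]
                if new_cost < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new_cost
                    parent[(nr, nc)] = cell
                    heapq.heappush(frontier, (new_cost, (nr, nc)))
    if goal not in best:
        return None, float("inf")
    path = [goal]
    while path[-1] != start:
        path.append(parent[path[-1]])
    path.reverse()
    return path, best[goal]

# Hypothetical map: the middle column is exposed in its top two cells.
penalties = [
    [0.0, 5.0, 0.0],
    [0.0, 5.0, 0.0],
    [0.0, 0.0, 0.0],
]
path, cost = plan_path(penalties, (0, 0), (0, 2))
```

With these hypothetical penalties, the planner prefers the six-step detour through the bottom row over the two-step route across the exposed cell, because the penalty outweighs the extra distance.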
This publication presents the invited, plenary, and poster papers of the 20th Saint-Petersburg International Conference on Integrated Navigation Systems (27-29 May, 2013).
Efficient regulation of internal homeostasis and defending it against perturbations requires adaptive behavioral strategies. However, the computational principles mediating the interaction between homeostatic and associative learning processes remain undefined. Here we use a definition of primary rewards, as outcomes fulfilling physiological needs, to build a normative theory showing how learning motivated behaviors may be modulated by internal states. Within this framework, we mathematically prove that seeking rewards is equivalent to the fundamental objective of physiological stability, defining the notion of physiological rationality of behavior. We further suggest a formal basis for temporal discounting of rewards by showing that discounting motivates animals to follow the shortest path in the space of physiological variables toward the desired setpoint. We also explain how animals learn to act predictively to preclude prospective homeostatic challenges, and several other behavioral patterns. Finally, we suggest a computational role for interaction between hypothalamus and the brain reward system.
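The link between reward and physiological stability can be made concrete with a small sketch. The abstract does not give the formulas, so the following is only one plausible formalization: the drive is assumed to be the Euclidean distance of the internal state from the setpoint, and the reward of an outcome is assumed to be the resulting reduction in drive. The names `drive` and `reward` and the example numbers are hypothetical.

```python
import math

def drive(state, setpoint):
    """Assumed drive function: Euclidean distance from the setpoint."""
    return math.sqrt(sum((s - h) ** 2 for s, h in zip(setpoint, state)))

def reward(state, outcome, setpoint):
    """Assumed reward: how much an outcome reduces the drive.

    `outcome` is the change the outcome causes in each physiological
    variable; positive reward means the state moved toward the setpoint.
    """
    next_state = tuple(h + k for h, k in zip(state, outcome))
    return drive(state, setpoint) - drive(next_state, setpoint)

# Hypothetical two-variable internal state (e.g., glucose and temperature).
setpoint = (10.0, 10.0)
state = (6.0, 10.0)

toward = reward(state, (3.0, 0.0), setpoint)    # moves toward the setpoint
away = reward(state, (-2.0, 0.0), setpoint)     # moves away from it
```

Under these assumptions, an outcome that moves the state toward the setpoint earns a positive reward and one that moves it away earns a negative reward, so maximizing accumulated reward and minimizing deviation from the setpoint point in the same direction.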
Humans often change their beliefs or behavior due to the behavior or opinions of others. This study explored, with the use of human event-related potentials (ERPs), whether social conformity is based on a general performance-monitoring mechanism. We tested the hypothesis that conflicts with a normative group opinion evoke a feedback-related negativity (FRN) often associated with performance monitoring and subsequent adjustment of behavior. The experimental results show that individual judgments of facial attractiveness were adjusted in line with a normative group opinion. A mismatch between individual and group opinions triggered a frontocentral negative deflection with the maximum at 200 ms, similar to FRN. Overall, a conflict with a normative group opinion triggered a cascade of neuronal responses: from an earlier FRN response reflecting a conflict with the normative opinion to a later ERP component (peaking at 380 ms) reflecting a conforming behavioral adjustment. These results add to the growing literature on neuronal mechanisms of social influence by disentangling the conflict-monitoring signal in response to the perceived violation of social norms and the neural signal of a conforming behavioral adjustment.
Humans can adapt their behavior by learning from the consequences of their own actions or by observing others. Gradual active learning of action-outcome contingencies is accompanied by a shift from feedback- to response-based performance monitoring. This shift is reflected by complementary learning-related changes of two ACC-driven ERP components, the feedback-related negativity (FRN) and the error-related negativity (ERN), which have both been suggested to signal events "worse than expected," that is, a negative prediction error. Although recent research has identified comparable components for observed behavior and outcomes (observational ERN and FRN), it is as yet unknown whether these components are similarly modulated by prediction errors and thus also reflect behavioral adaptation. In this study, two groups of 15 participants learned action-outcome contingencies either actively or by observation. In active learners, FRN amplitude for negative feedback decreased and ERN amplitude in response to erroneous actions increased with learning, whereas observational ERN and FRN in observational learners did not exhibit learning-related changes. Learning performance, assessed in test trials without feedback, was comparable between groups, as was the ERN following actively performed errors during test trials. In summary, the results show that action-outcome associations can be learned similarly well actively and by observation. The mechanisms involved appear to differ, with the FRN in active learning reflecting the integration of information about own actions and the accompanying outcomes.
The article describes the routes taken by visitors to the Tsaritsyno museum-reserve (Moscow) after its reconstruction, both in the most popular and crowded "historical" part of the park and in its distant areas. In addition, we consider which types of visitors prefer certain routes, as well as how visitors experience space in different parts of the park (that is, their different modes of perception). The article describes such modes as "consumption of public space", "romantic tourist gaze", and the "existential" mode.
A comprehensive step-by-step method for choosing the accelerometer for a new-generation laser strapdown inertial navigation system is considered. The results of research and comparative tests of Si-flex and Q-flex accelerometers are presented. The advantages and problems of these accelerometers connected with the pendulum material are described, and the influence of this material on the accelerometers' accuracy parameters, as well as on the inertial navigation system's accuracy class and its place among innovative strapdown systems, is considered.