A Comparative Evaluation of Machine Learning Methods for Robot Navigation Through Human Crowds

Shpilman A., Kudenko D., Gaydashenko A.

doi:10.1109/ICMLA.2018.00089

P. 553–557.

Robot navigation through crowds poses a difficult challenge to AI systems, since methods must produce fast and efficient movement without compromising safety. Most approaches to date have focused on combining pathfinding algorithms with machine learning for pedestrian walking prediction. More recently, reinforcement learning techniques have been proposed in the research literature. In this paper, we perform a comparative evaluation of pathfinding/prediction and reinforcement learning approaches on a crowd movement dataset collected from surveillance videos taken at Grand Central Station in New York. The results demonstrate the strong superiority of state-of-the-art reinforcement learning approaches over pathfinding with state-of-the-art behavior prediction techniques.

Language: English

Keywords: reinforcement learning, navigation, robot kinematics, collision avoidance

In book

2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA)

IEEE, 2018.

Artificial Neural Networks and Machine Learning. ICANN 2025 International Workshops and Special Sessions: 34th International Conference on Artificial Neural Networks, Kaunas, Lithuania, September 9–12, 2025, Proceedings, Part V

Cham: Springer, 2025.

This book constitutes the refereed proceedings of 34th International Workshops which were held in conjunction with the 34th International Conference on Artificial Neural Networks and Machine Learning, ICANN 2025, held in Kaunas, Lithuania, September 9–12, 2025. The 20 full papers and 8 abstracts included in this workshop volume were carefully reviewed and selected from 42 submissions. ...

Added: September 29, 2025

Analysis of a Company Model in Conditions of Unstable Demand Using Reinforcement Learning Methods

Delev A., Semakov S., in: 2025 8th International Conference on Artificial Intelligence and Big Data (ICAIBD). IEEE, 2025. P. 318–322.

Profit is one of the most important economic indicators of a company’s performance, and for every company it is necessary to allocate resources in such a way as to obtain the maximum possible profit. The profit maximization problem is usually a dynamic optimization problem. This article discusses an approach to solving the production expansion problem ...

Added: August 25, 2025

Pseudo-collusion in a centralized algorithmic financial market

Pastushkov A., Boulatov A., Finance Research Letters 2025 Vol. 83 Article 107671

Recent studies have increasingly explored whether reinforcement learning algorithms can give rise to cooperative behavior that results in non-competitive pricing across various market settings. In financial markets, Cartea et al. (2022) show that market makers using multi-armed bandit (MAB) algorithms generally converge to competitive pricing in quote-driven over-the-counter (OTC) markets, barring some unlikely exceptions where ...

Added: June 19, 2025

Distributed Multi-Agent Navigation Based on Reciprocal Collision Avoidance and Locally Confined Multi-Agent Path Finding

Dergachev S., Yakovlev K., in: 2021 IEEE 17th International Conference on Automation Science and Engineering (CASE). IEEE, 2024. P. 1489–1494.

Avoiding collisions is the core problem in multiagent navigation. In decentralized settings, when agents have limited communication and sensory capabilities, collisions are typically avoided in a reactive fashion, relying on local observations/communications. Prominent collision avoidance techniques, e.g. ORCA, are computationally efficient and scale well to a large number of agents. However, in numerous scenarios, involving ...

Added: May 5, 2025

The beer game bullwhip effect mitigation: a deep reinforcement learning approach

Rozhkov M., Alyamovskaya N., Zakhodiakin G., International Journal of Production Research 2025 Vol. 63 No. 18 P. 6630–6647

This article investigates the application of reinforcement learning (RL) methods to optimise a four-echelon linear supply chain model with stochastic demand. The proposed supply chain configuration is largely based on the production-distribution supply chain of the MIT Supply Chain Beer Game. We show that RL can significantly improve ordering efficiency and overall supply chain performance. ...

Added: March 24, 2025

Revisiting Non-Acyclic GFlowNets in Discrete Environments

Morozov N., Maximov I. V., Tiapkin D. et al. Series "Working papers by Cornell University". 2025.

Generative Flow Networks (GFlowNets) are a family of generative models that learn to sample objects from a given probability distribution, potentially known up to a normalizing constant. Instead of working in the object space, GFlowNets proceed by sampling trajectories in an appropriately constructed directed acyclic graph environment, greatly relying on the acyclicity of the graph. ...

Added: February 20, 2025

Implementation of Bug1 and Bug2 Basic Path-Planning Algorithms for a TurtleBot 3 Robot in ROS Noetic

Spektor I., Zagirov A., Safin R. et al., in: Proceedings of the 2024 International Conference on Artificial Life and Robotics, February 22 to 25, 2024, J:COM Horuto Hall, Oita, Japan. 29th AROB International Meeting Series. ALife Robotics Corporation Ltd., 2024. P. 272–275.

Added: February 19, 2025

Deep Reinforcement Learning-Based Congestion Control for File Transfer over QUIC

Blokhin A., Kalev V., Pusev R. et al., in: 2024 IEEE International Multi-Conference on Engineering, Computer and Information Sciences (SIBIRCON). Novosibirsk: IEEE, 2024. P. 25–30.

Congestion control is one of the key communication mechanisms in the QUIC protocol. It controls how much data can be sent to an endpoint, and at what rate, at a particular moment in time, so that shared network resources are used better and congestive collapse is avoided. In this work we tackle the problem of ...

Added: December 18, 2024

Model predictive path integral for decentralized multi-agent collision avoidance

Dergachev S., Yakovlev K., PeerJ Computer Science 2024 Vol. 10 Article e2220

Collision avoidance is a crucial component of any decentralized multi-agent navigation system. Currently, most of the existing multi-agent collision-avoidance methods either do not take into account the kinematic constraints of the agents (i.e., they assume that an agent might change the direction of movement instantaneously) or are tailored to specific kinematic motion models (e.g., car-like ...

Added: August 28, 2024

Generative Flow Networks as Entropy-Regularized RL

Tiapkin D., Morozov N., Naumov A. et al., in: Proceedings of The 27th International Conference on Artificial Intelligence and Statistics (AISTATS 2024), 2-4 May 2024, Palau de Congressos, Valencia, Spain. Vol. 238. Valencia: PMLR, 2024. P. 4213–4221.

The recently proposed generative flow networks (GFlowNets) are a method of training a policy to sample compositional discrete objects with probabilities proportional to a given reward via a sequence of actions. GFlowNets exploit the sequential nature of the problem, drawing parallels with reinforcement learning (RL). Our work extends the connection between RL and GFlowNets to ...

Added: June 22, 2024

Model-free Posterior Sampling via Learning Rate Randomization

Tiapkin D., Belomestny D., Calandriello D. et al., in: Advances in Neural Information Processing Systems 36 (NeurIPS 2023). Curran Associates, Inc., 2023. P. 73719–73774.

Added: February 17, 2024

Reinforcement Procedure for Randomized Machine Learning

Popkov Y. S., Dubnov Y. A., Popkov A. Yu., Mathematics 2023 Vol. 11 No. 17 Article 3651

This paper is devoted to problem-oriented reinforcement methods for the numerical implementation of Randomized Machine Learning. We have developed a scheme of the reinforcement procedure based on the agent approach and Bellman’s optimality principle. This procedure ensures strictly monotonic properties of a sequence of local records in the iterative computational procedure of the learning process. ...

Added: February 5, 2024

Fast Rates for Maximum Entropy Exploration

Tiapkin D., Belomestny D., Calandriello D. et al., in: Proceedings of the 40th International Conference on Machine Learning, 23-29 July 2023, Honolulu, Hawaii, USA. Vol. 202. PMLR, 2023. P. 34161–34221.

Added: December 1, 2023

2023 International Symposium ELMAR, 11-13 September 2023, Zadar, Croatia

Saleh H., IEEE, 2023.

Estimating depth is necessary to understand and navigate the environment surrounding us. Over the years, many active sensors have been developed to measure depth, but they are expensive and require additional space for mounting. A cheaper alternative is estimating depth from a single RGB image taken by an ordinary monocular camera, which can be placed ...

Added: November 30, 2023

Local Trajectory Planning for a Wheeled Robot in a Confined Environment Based on Model Predictive Control

Alhaddad M., Mironov K. V., Dergachev S. et al., Robotics and Technical Cybernetics 2023 Vol. 11 No. 3 P. 205–214

The task of local trajectory planning for an autonomous wheeled robotic platform in a cluttered indoor environment is considered. Such an environment might include narrow passages whose width is less than the length of the platform. Therefore, it is not possible to apply the standard approach, in which the obstacles are inflated by the maximum radius of the platform. ...

Added: September 8, 2023

Sharp Deviations Bounds for Dirichlet Weighted Sums with Application to analysis of Bayesian algorithms

Tiapkin D., Belomestny D., Naumov A. et al., Working papers by Cornell University. Series math "arxiv.org" 2023 Article 2304.03056

In this work, we derive sharp non-asymptotic deviation bounds for weighted sums of Dirichlet random variables. These bounds are based on a novel integral representation of the density of a weighted Dirichlet sum. This representation allows us to obtain a Gaussian-like approximation for the sum distribution using geometry and complex analysis methods. Our results generalize ...

Added: June 28, 2023

Variance Reduction for Policy-Gradient Methods via Empirical Variance Minimization

Belomestny D., Kaledin M., Golubev A. 2022.

Policy-gradient methods in Reinforcement Learning (RL) are universal and widely applied in practice, but their performance suffers from the high variance of the gradient estimate. Several procedures have been proposed to reduce it, including actor-critic (AC) and advantage actor-critic (A2C) methods. Recently these approaches have gained a new perspective due to the introduction of Deep RL: both new control ...

Added: April 14, 2023

A note on observational equivalence of micro assumptions on macro level

Ponomarenko A. A., Economics: The Open-Access, Open-Assessment E-Journal 2020 Vol. 14 P. 1–15

The author sets up a simplistic agent-based model in which agents learn by reinforcement while observing an incomplete set of variables. The model is employed to generate an artificial dataset that is used to estimate standard macroeconometric models. The author shows that the results are qualitatively indistinguishable (in terms of the signs and significances of the ...

Added: March 28, 2023