Flatland Competition 2020: MAPF and MARL for Efficient Train Coordination on a Grid World

Laurent F.; Schneider M.; Scheller C.; Watson J.; Li J.; Chen Z.; Zheng Y.; Chan S.; Svidchenko O.; Ivanov D.; Shpilman A.; Spirovska E.; Tanevski O.; Nikov A.; Grunder R.; Galevski D.; Mitrovski J.; Sartoretti G.; Luo Z.; Damani M.; Bhattacharya N.; Agarwal S.; Egli A.; Nygren E.; Mohanty S.

P. 275–301.

Laurent F., Schneider M., Scheller C., Watson J., Li J., Chen Z., Zheng Y., Chan S., Махнев К. И., Svidchenko O., Егоров В. С., Ivanov D., Shpilman A., Spirovska E., Tanevski O., Nikov A., Grunder R., Galevski D., Mitrovski J., Sartoretti G., Luo Z., Damani M., Bhattacharya N., Agarwal S., Egli A., Nygren E., Mohanty S.

The Flatland competition aimed at finding novel approaches to solve the vehicle re-scheduling problem (VRSP). The VRSP is concerned with scheduling trips in traffic networks and the re-scheduling of vehicles when disruptions occur, for example the breakdown of a vehicle. While solving the VRSP in various settings has been an active area in operations research (OR) for decades, the ever-growing complexity of modern railway networks makes dynamic real-time scheduling of traffic virtually impossible. Recently, multi-agent reinforcement learning (MARL) has successfully tackled challenging tasks where many agents need to be coordinated, such as multiplayer video games. However, the coordination of hundreds of agents in a real-life setting like a railway network remains challenging and the Flatland environment used for the competition models these real-world properties in a simplified manner. Submissions had to bring as many trains (agents) to their target stations in as little time as possible. While the best submissions were in the OR category, participants found many promising MARL approaches. Using both centralized and decentralized learning-based approaches, top submissions used graph representations of the environment to construct tree-based observations. Further, different coordination mechanisms were implemented, such as communication and prioritization between agents. This paper presents the competition setup, four outstanding solutions to the competition, and a cross-comparison between them.

Language: English

Keywords: operations research; deep reinforcement learning; multi-agent path finding; multi-agent reinforcement learning; vehicle re-scheduling problem

Publication based on the results of:

New game-theoretic methods to resource allocation problems (2022)

In book

Proceedings of Machine Learning Research

Vol. 133: Proceedings of the NeurIPS 2020: Competition and Demonstration Track. PMLR, 2021.

Multi-Agent Pathfinding with Continuous Time

Andreychuk A., Yakovlev K., Atzmon D. et al., in: Proceedings of the 28th International Joint Conference on Artificial Intelligence (IJCAI 2019). International Joint Conferences on Artificial Intelligence, 2019. P. 39–45.

Multi-Agent Pathfinding (MAPF) is the problem of finding paths for multiple agents such that every agent reaches its goal and the agents do not collide. Most prior work on MAPF was on grids, assumed agents’ actions have uniform duration, and that time is discretized into timesteps. We propose a MAPF algorithm that does not rely on these assumptions, is ...

Added: August 21, 2019

Recent Advances of the Russian Operations Research Society

Cambridge: Cambridge Scholars Publishing, 2020.

This collection of articles highlights the most interesting new results from the IX Moscow International Operations Research Conference, the largest Russian meeting in this field, held every three years for leading experts. These papers will interest researchers and organizations specialized in OR, Game Theory, System Analysis, Macro- and Micro-economic Modelling, and Actuarial Mathematics. The volume ...

Added: July 8, 2020

Models, Algorithms, and Technologies for Network Analysis

NY: Springer, 2013.

This volume contains two types of papers—a selection of contributions from the “Second International Conference in Network Analysis” held in Nizhny Novgorod on May 7–9, 2012, and papers submitted to an "open call for papers" reflecting the activities of LATNA at the Higher School for Economics. This volume contains many new results in modeling and powerful ...

Added: September 27, 2013

Deep Reinforcement Learning with VizDoom First-Person Shooter

Akimov D., Makarov I., in: Proceedings of the Fifth Workshop on Experimental Economics and Machine Learning at the National Research University Higher School of Economics co-located with the Seventh International Conference on Applied Research in Economics (iCare7). Aachen: CEUR Workshop Proceedings, 2019. P. 3–17.

In this work, we study deep reinforcement algorithms for partially observable Markov decision processes (POMDP) combined with Deep Q-Networks. To our knowledge, we are the first to apply standard Markov decision process architectures to POMDP scenarios. We propose an extension of DQN with Dueling Networks and several other model-free policies to training agent using deep ...

Added: November 19, 2019

Decentralized Unlabeled Multi-agent Navigation in Continuous Space

Dergachev S., Yakovlev K., in: Interactive Collaborative Robotics. 9th International Conference, ICR 2024, Mexico City, Mexico, October 14–18, 2024, Proceedings. Cham: Springer, 2024. P. 186–200.

In this work, we study the problem where a group of mobile agents needs to reach a set of goal locations, but it does not matter which agent reaches a specific goal. Unlike most of the existing works on this topic that typically assume the existence of the centralized planner (or controller) and limit the ...

Added: September 11, 2024

Operations Research Techniques in Wildfire Fuel Management

Gillen C., Matsypura D., Prokopyev O., in: Optimization Methods and Applications, In Honor of Ivan V. Sergienko's 80th Birthday, Springer Optimization and Its Applications. Vol. 130. Springer, 2017. P. 119–135.

Wildfires are a naturally occurring phenomenon in many places of the world. While they perform a number of important ecological functions, the proximity of human activities to forest landscapes requires a measure of control/preparedness to address safety concerns and mitigate damage. An important technique utilized by forest managers is that ...

Added: November 28, 2017

Mathematical Optimization Theory and Operations Research. 18th International Conference, MOTOR 2019 (LNCS)

Springer, 2019.

This book constitutes the proceedings of the 18th International Conference on Mathematical Optimization Theory and Operations Research, MOTOR 2019, held in Ekaterinburg, Russia, in July 2019. The 48 full papers presented in this volume were carefully reviewed and selected from 170 submissions. MOTOR 2019 is a successor of the well-known International and All-Russian conference series, which were ...

Added: October 31, 2020

Deep Multi-Agent Reinforcement Learning with Relevance Graphs

Shpilman A., Malysheva A., Sung T. T. et al., in: Deep RL Workshop NeurIPS 2018. [s.n.], 2018. P. 1–10.

Over recent years, deep reinforcement learning has shown strong successes in complex single-agent tasks, and more recently this approach has also been applied to multi-agent domains. In this paper, we propose a novel approach, called MAGnet, to multi-agent reinforcement learning (MARL) that utilizes a relevance graph representation of the environment obtained by a self-attention mechanism, and a message-generation technique inspired ...

Added: January 18, 2019

Prioritized Multi-Agent Path Finding for Differential Drive Robots

Yakovlev K., Andreychuk A., Vorobyev V., in: Proceedings of the 2019 European Conference on Mobile Robotics (ECMR 2019). Prague: IEEE, 2019. P. 1–6.

Methods for centralized planning of the collision-free trajectories for a fleet of mobile robots typically solve the discretized version of the problem and rely on numerous simplifying assumptions, e.g. moves of uniform duration, cardinal only translations, equal speed and size of the robots etc., thus the resultant plans can not always be directly executed by ...

Added: January 15, 2020

VIII Moscow International Conference on Operations Research (ORM2016)

M.: Max press, 2016.

Added: March 5, 2019

Clusters, orders, trees: methods and applications. In Honor of Boris Mirkin's 70th Birthday

Berlin: Springer, 2014.

The volume is dedicated to Boris Mirkin on the occasion of his 70th birthday. In addition to his startling PhD results in abstract automata theory, Mirkin’s ground breaking contributions in various fields of decision making and data analysis have marked the fourth quarter of the 20th century and beyond. Mirkin has done pioneering work in ...

Added: October 7, 2013

Any-Angle Pathfinding for Multiple Agents Based on SIPP Algorithm

Yakovlev K., Andreychuk A., in: Proceedings of the 27th International Conference on Automated Planning and Scheduling (ICAPS 2017). Palo Alto: AAAI Press, 2017. P. 586–593.

The problem of finding conflict-free trajectories for multiple agents of identical circular shape, operating in shared 2D workspace, is addressed in the paper and decoupled, e.g., prioritized, approach is used to solve this problem. Agents’ workspace is tessellated into the square grid on which any-angle moves are allowed, e.g. each agent can move into an ...

Added: July 17, 2017

Deep Reinforcement Learning in VizDoom via DQN and Actor-Critic Agents

Bakhanova M., Makarov I., in: Advances in Computational Intelligence: 16th International Work-Conference on Artificial Neural Networks, IWANN 2021, Virtual Event, June 16–18, 2021, Proceedings, Part I. Vol. 12861. Springer, 2021. Ch. 12. P. 138–150.

In this work, we study the problem of learning reinforcement learning-based agents in a first-person shooter environment VizDoom. We compare several well-known architectures, such as DQN, DDQN, A3C, and Curiosity-driven model, while highlighting the main differences in learned policies of agents trained via these models. ...

Added: September 1, 2021

Workshop on AI for Autonomous Driving (AIAD)

[s.n.], 2020.

Self-driving cars and advanced safety features present one of today’s greatest challenges and opportunities for Artificial Intelligence (AI). Despite billions of dollars of investments and encouraging progress under certain operational constraints, there are no driverless cars on public roads today without human safety drivers. Autonomous Driving research spans a wide spectrum, from modular architectures -- ...

Added: December 28, 2020

Heuristic Algorithm for Solving the Cosmonauts Training Planning Problem

Lazarev A., Khusnullin N., Musatova E. et al., in: CEUR Workshop Proceedings (CEUR-WS.org) of the VIII International Conference on Optimization Methods and Applications “OPTIMIZATION AND APPLICATIONS” (OPTIMA-2017). Vol. 1987. [s.n.], 2017. Ch. 1987. P. 364–369.

The cosmonauts training planning problem is a problem of construction of cosmonauts training timetable. Each cosmonaut has his own set of tasks which should be performed with respect to resource and time constraints. The problem is to determine start moments for all considered tasks. This problem is a generalization of the resource-constrained project ...

Added: October 20, 2017

Towards a Complete Multi-agent Pathfinding Algorithm for Large Agents

Dergachev S., Yakovlev K., in: Advances in Computational Intelligence. 21st Mexican International Conference on Artificial Intelligence, MICAI 2022, Monterrey, Mexico, October 24–29, 2022, Proceedings. Cham: Springer, 2022. P. 355–367.

Multi-agent pathfinding (MAPF) is a challenging problem which is hard to solve optimally even when simplifying assumptions are adopted, e.g. planar graphs (typically – grids), discretized time, uniform duration of move and wait actions etc. On the other hand, MAPF under such restrictive assumptions (also known as the Classical MAPF) is equivalent to the so-called ...

Added: May 16, 2023

Proceedings of the IX Moscow International Conference on Operations Research (ORM2018)

Moscow: Max Press, 2018.

This volume comprises the proceedings of the IX Moscow International Conference on Operations Research (ORM 2018 – Germeyer100) in the scope of fundamental research and applications of decision-making theory under uncertainty, operations research in multiple areas as well as numerical methods of operations research. The conference is devoted to the centenary of the outstanding Soviet ...

Added: October 22, 2018

Discrete Optimization and Operations Research/9th International Conference, DOOR 2016, Vladivostok, Russia, September 19-23, 2016, Proceedings

Springer, 2016.

This book constitutes the proceedings of the 9th International Conference on Discrete Optimization and Operations Research, DOOR 2016, held in Vladivostok, Russia, in September 2016. The 39 full papers presented in this volume were carefully reviewed and selected from 181 submissions. They were organized in topical sections named: discrete optimization; scheduling problems; facility location; mathematical programming; ...

Added: September 12, 2016

VIII Moscow International Conference on Operations Research (ORM2016) Moscow, October 17–22, 2016

[s.n.], 2016.

Added: October 21, 2016

Decentralized Unlabeled Multi-agent Pathfinding Via Target And Priority Swapping

Dergachev S., Yakovlev K., in: ECAI 2024. 27th European Conference on Artificial Intelligence, 19–24 October 2024, Santiago de Compostela, Spain – Including 13th Conference on Prestigious Applications of Intelligent Systems (PAIS 2024). IOS Press, 2024.

In this paper we study a challenging variant of the multi-agent pathfinding problem (MAPF), when a set of agents must reach a set of goal locations, but it does not matter which agent reaches a specific goal -- Anonymous MAPF (AMAPF). Current optimal and suboptimal AMAPF solvers rely on the existence of a centralized controller ...

Added: September 11, 2024

Operations Research Forum

Springer, 2023.

Added: May 24, 2022

A Mathematical Model for the Astronaut Training Scheduling Problem

Musatova E. G., Lazarev A. A., Ponomarev K. et al., IFAC-PapersOnLine 2016 Vol. 49 No. 12 P. 221–225

We consider a problem of the astronaut training scheduling. Each astronaut has his own set of tasks which should be performed with respect to resource and time constraints. The problem is to determine start moments for all considered tasks. For this issue a mathematical model based on integer linear programming is proposed. Computational results of ...

Added: October 31, 2016

Mathematical Optimization Theory and Operations Research. 23rd International Conference, MOTOR 2024, Omsk, Russia, June 30–July 6, 2024, Proceedings. LNCS, volume 14766

Springer, 2024.

This volume contains the refereed proceedings of the 23rd International Conference on Mathematical Optimization Theory and Operations Research (MOTOR 2024) held from June 30 to July 06, 2024, in Omsk, Russia. The MOTOR conference joined several well-known conferences on mathematical programming, optimization, and operations research which had been held in the Urals, Siberia and the ...

Added: August 9, 2024

26th European Conference on Operational Research, Abstract Book

Rome: Sapienza Università di Roma, 2013.

EURO 2013 Abstract Book ...

Added: March 5, 2019