Jorge Cortés

Professor





Hierarchical reinforcement learning via dynamic subspace search for multi-agent planning
A. Ma, M. Ouimet, J. Cortés
Autonomous Robots 44 (3-4) (2020), 485-503


Abstract

We consider scenarios where a swarm of unmanned vehicles (UxVs) seeks to satisfy a number of diverse, spatially distributed objectives. The UxVs strive to determine an efficient schedule of tasks to service the objectives while operating in a coordinated fashion. We focus on developing autonomous high-level planning, where low-level controls are leveraged from previous work in distributed motion, target tracking, localization, and communication. We rely on the use of state and action abstractions in a Markov decision process framework to introduce a hierarchical algorithm, Dynamic Domain Reduction for Multi-Agent Planning, that enables multi-agent planning for large multi-objective environments. Our analysis establishes the correctness of our search procedure within any subenvironment and characterizes the algorithm performance with respect to the optimal trajectories in single-agent and sequential multi-agent deployment scenarios using tools from submodularity. Simulated results show significant improvement over using a standard Monte Carlo tree search in an environment with large state and action spaces.


Mechanical and Aerospace Engineering, University of California, San Diego
9500 Gilman Dr, La Jolla, California, 92093-0411

Ph: 1-858-822-7930
Fax: 1-858-822-3107

cortes at ucsd.edu
Skype id: jorgilliyo