Jorge Cortés


Hierarchical reinforcement learning via dynamic subspace search for multi-agent planning
A. Ma, M. Ouimet, J. Cortés
Autonomous Robots 44 (3-4) (2020), 485-503


We consider scenarios where a swarm of unmanned vehicles (UxVs) seeks to satisfy a number of diverse, spatially distributed objectives. The UxVs strive to determine an efficient schedule of tasks to service the objectives while operating in a coordinated fashion. We focus on developing autonomous high-level planning, where low-level controls are leveraged from previous work in distributed motion, target tracking, localization, and communication. We rely on the use of state and action abstractions in a Markov decision process framework to introduce a hierarchical algorithm, Dynamic Domain Reduction for Multi-Agent Planning, that enables multi-agent planning for large multi-objective environments. Our analysis establishes the correctness of our search procedure within any subenvironment and characterizes the algorithm performance with respect to the optimal trajectories in single-agent and sequential multi-agent deployment scenarios using tools from submodularity. Simulated results show significant improvement over using a standard Monte Carlo tree search in an environment with large state and action spaces.


Mechanical and Aerospace Engineering, University of California, San Diego
9500 Gilman Dr, La Jolla, California, 92093-0411

Ph: 1-858-822-7930
Fax: 1-858-822-3107

cortes at
Skype id: jorgilliyo