Tools and methods for learning in multi-agent maritime strategy games

Embargo Date
2027-01-29
Abstract
Reinforcement learning (RL) presents a powerful paradigm for learning complex robotic tasks. While many works have studied its use in solving single-agent strategic and adversarial games in simulation, its application to games played in real-world maritime multi-agent environments is lacking. This can be attributed in part to the scarcity of learning-compatible multi-agent maritime simulators with sim-to-real pipelines, and to a lack of imitation and reinforcement learning methods designed for long-horizon, adversarial, multi-agent strategy games in which coordination and cooperation are required. To date, cooperative multi-agent RL methods are often evaluated either on simple tasks with finite state spaces or on curated mini-games from a small handful of real-time strategy video games. This leaves us with a suite of methods that either cannot handle the state-space complexity or are too domain-specialized to learn viable strategies for other multi-agent strategy games, such as maritime capture-the-flag. In this dissertation, we address this gap from the ground up by introducing a Python-based simulator and pipeline for training and deploying policies for multi-agent maritime strategy games, testing and deploying a sample-efficient imitation learning algorithm for multi-agent maritime capture-the-flag, and developing a preference-based inverse reinforcement learning method for reward extrapolation from suboptimal demonstrations of long-horizon zero-sum games.
Description
2026