Adaptive Robust Markov Decision Process for Wide-Area Surveillance with Collaborative Combat Aircraft
Collaborative Combat Aircraft (CCAs) are envisioned to enable autonomous Intelligence, Surveillance, and Reconnaissance (ISR) missions in contested environments, where adversaries may act strategically to deceive or evade detection. These missions pose significant challenges due to model uncertainty and the need for safe, real-time decision-making. While reinforcement learning (RL) offers adaptability, it lacks the safety guarantees required for critical operations. Robust Markov Decision Processes (RMDPs) offer worst-case guarantees but are traditionally limited by static ambiguity sets, which capture the uncertainty over the true environment model. This paper presents an adaptive RMDP framework tailored to wide-area ISR with CCAs. We introduce a mission-specific Markov Decision Process (MDP) formulation where aircraft alternate between movement and surveillance states. Adversarial tactics are modeled as a finite set of transition kernels, each capturing a different assumption about how the adversary’s sensing or environmental conditions affect the rewards. Our approach incrementally refines policies by passively eliminating inconsistent threat models over time, allowing each agent to shift from conservative to efficient behaviors while maintaining robustness. Across both Gaussian and non-Gaussian threat models and a range of network topologies, our adaptive robust planner consistently achieves higher performance and lower exposure risk than nominal and static robust baselines.
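The two ideas named in the abstract, worst-case planning over a finite set of transition kernels and pruning of kernels that observations contradict, can be illustrated with a minimal sketch. This is not the paper's implementation. It assumes a tabular MDP and takes the worst case independently for each state-action pair (an (s,a)-rectangular simplification). The function names, the elimination threshold `eps`, and the two-state toy problem are all hypothetical.

```python
import numpy as np

def robust_value_iteration(kernels, rewards, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Worst-case value iteration over a finite set of kernels.

    kernels: list of arrays P[s, a, s'] (candidate threat models)
    rewards: array R[s, a]
    Assumption: the worst case is taken per (s, a) pair.
    """
    P = np.stack(kernels)                      # shape (K, S, A, S)
    V = np.zeros(rewards.shape[0])
    for _ in range(max_iter):
        EV = P @ V                             # expected next value, shape (K, S, A)
        Q = rewards + gamma * EV.min(axis=0)   # pessimistic over the K models
        V_new = Q.max(axis=1)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    return V, Q.argmax(axis=1)

def eliminate_inconsistent(kernels, s, a, s_next, eps=1e-6):
    """Drop kernels that give the observed transition probability <= eps.

    Hypothetical rule: if every kernel would be dropped, keep the set unchanged.
    """
    kept = [P for P in kernels if P[s, a, s_next] > eps]
    return kept if kept else kernels

# Toy problem (hypothetical): state 0 = safe, state 1 = exposed (absorbing).
# Action 0 = cautious, action 1 = aggressive.
R = np.array([[0.5, 1.0],
              [0.0, 0.0]])

benign = np.zeros((2, 2, 2))
benign[0, 0, 0] = 1.0      # cautious keeps the aircraft safe
benign[0, 1, 0] = 1.0      # aggressive is also safe under the benign model
benign[1, :, 1] = 1.0      # exposed is absorbing

hostile = np.zeros((2, 2, 2))
hostile[0, 0, 0] = 1.0     # cautious keeps the aircraft safe
hostile[0, 1, 1] = 1.0     # aggressive leads to exposure under the hostile model
hostile[1, :, 1] = 1.0     # exposed is absorbing

models = [benign, hostile]
V_robust, pi_robust = robust_value_iteration(models, R)

# An observed transition (s=0, a=1, s'=0) is impossible under the hostile model.
models_after = eliminate_inconsistent(models, s=0, a=1, s_next=0)
V_adapt, pi_adapt = robust_value_iteration(models_after, R)
```

In this toy setting the planner chooses the cautious action in the safe state while both models remain plausible, then switches to the aggressive action once the hostile model has been ruled out. This mirrors the shift from conservative to efficient behavior described above.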
2-s2.0-105031186080
Michigan Engineering
École Polytechnique Fédérale de Lausanne
Michigan Engineering
2026-01-08
9781624107658
REVIEWED
EPFL
| Event name | Event acronym | Event place | Event date |
| --- | --- | --- | --- |
| | | Orlando, FL, US | 2026-01-12 - 2026-01-16 |