{"slug": "trident-breaking-the-hybrid-safety-physics-coupling-for-provably-safe-multi", "title": "TRIDENT: Breaking the Hybrid-Safety-Physics Coupling for Provably Safe Multi-Agent Reinforcement Learning", "summary": "Researchers introduced TRIDENT, which they describe as the first multi-agent reinforcement learning (MARL) framework whose three components are co-designed to cancel the biases that arise when hybrid discrete-continuous actions, training-time safety constraints, and physics-governed dynamics are combined. On multi-UAV mobile-edge computing, autonomous intersection management, and a hybrid SMAC variant, TRIDENT cuts training-time violations by 95.5% over MADDPG and 76.3% over MACPO, while improving reward by 13.5% over the strongest unconstrained baseline.", "body_md": "arXiv:2606.18308v1\nAbstract: Safe coordination in networked cyber-physical systems forces learning algorithms to simultaneously handle hybrid discrete-continuous actions, hard training-time safety constraints, and physics-governed dynamics. We show that these three features form a directed cycle of biases that defeats any naive composition of off-the-shelf modules, and formalize this as a three-way coupling lemma. We then introduce TRIDENT, the first MARL framework whose three components are co-designed to cancel each leak: a Richardson-Romberg gradient correction reducing Gumbel-Softmax bias from O(tau) to O(tau^2), a Lyapunov-constrained sequential trust-region update enforcing per-iterate feasibility, and a physics-informed residual critic that decomposes value rather than reward. We prove an O~(1/sqrt(K)) convergence rate to a constrained Nash equilibrium and an O(sqrt(K)) cumulative-violation bound. On multi-UAV mobile-edge computing, autonomous intersection management, and a hybrid SMAC variant, TRIDENT cuts training-time violations by 95.5% over MADDPG and 76.3% over MACPO, while improving reward by 13.5% over the strongest unconstrained baseline.", "url": "https://wpnews.pro/news/trident-breaking-the-hybrid-safety-physics-coupling-for-provably-safe-multi", "canonical_source": "https://arxiv.org/abs/2606.18308", "published_at": "2026-06-18 04:00:00+00:00", "updated_at": "2026-06-18 04:29:14.201666+00:00", "lang": "en", "topics": ["machine-learning", "ai-safety", "autonomous-vehicles", "ai-research"], "entities": ["TRIDENT", "MADDPG", "MACPO", "SMAC"], "alternates": {"html": "https://wpnews.pro/news/trident-breaking-the-hybrid-safety-physics-coupling-for-provably-safe-multi", "markdown": "https://wpnews.pro/news/trident-breaking-the-hybrid-safety-physics-coupling-for-provably-safe-multi.md", "text": "https://wpnews.pro/news/trident-breaking-the-hybrid-safety-physics-coupling-for-provably-safe-multi.txt", "jsonld": "https://wpnews.pro/news/trident-breaking-the-hybrid-safety-physics-coupling-for-provably-safe-multi.jsonld"}}