{"slug": "towards-scalable-multi-task-reinforcement-learning-with-large-decision-models", "title": "Towards Scalable Multi-Task Reinforcement Learning with Large Decision Models", "summary": "Researchers introduced LDM-v0, a large decision model trained offline on trajectories from thousands of heterogeneous reinforcement learning environments, including robotics, autonomous driving, and video games. The single transformer policy matched the performance of task-specific reference policies across approximately 1,000 environments, demonstrating the feasibility of large-scale multi-task RL pretraining.", "body_md": "arXiv:2606.24962v1\nAbstract: Recent progress in large-scale sequence modeling has shown that a single model can learn useful representations across highly diverse data distributions. Inspired by these advances, we investigate whether a unified transformer policy can be trained across large collections of heterogeneous reinforcement learning environments.\nWe introduce LDM-v0, a Large Decision Model trained offline on trajectories collected from thousands of environments spanning multiple domains and modalities. LDM-v0 is a multi-task, multi-modal transformer policy conditioned on histories of observations, actions, rewards, and termination signals, and trained through supervised next-action prediction over offline trajectories. We describe the environment infrastructure, automated data generation pipeline, model architecture, and training methodology used to build LDM-v0, and evaluate its performance across diverse environments. We show that a single pretrained model matches the performance of independently trained task-specific reference policies on approximately 1,000 environments including robotics, autonomous driving, inventory management, cybersecurity, trading, and video games.
These results demonstrate the feasibility of large-scale offline pretraining across heterogeneous reinforcement learning environments using a single transformer policy.", "url": "https://wpnews.pro/news/towards-scalable-multi-task-reinforcement-learning-with-large-decision-models", "canonical_source": "https://arxiv.org/abs/2606.24962", "published_at": "2026-06-25 04:00:00+00:00", "updated_at": "2026-06-25 04:26:25.510238+00:00", "lang": "en", "topics": ["machine-learning", "large-language-models", "ai-research"], "entities": ["LDM-v0", "Large Decision Model"], "alternates": {"html": "https://wpnews.pro/news/towards-scalable-multi-task-reinforcement-learning-with-large-decision-models", "markdown": "https://wpnews.pro/news/towards-scalable-multi-task-reinforcement-learning-with-large-decision-models.md", "text": "https://wpnews.pro/news/towards-scalable-multi-task-reinforcement-learning-with-large-decision-models.txt", "jsonld": "https://wpnews.pro/news/towards-scalable-multi-task-reinforcement-learning-with-large-decision-models.jsonld"}}