{"slug": "tycoonle-a-jax-reinforcement-learning-environment-for-long-horizon-planning", "title": "TycoonLE: A JAX reinforcement learning environment for long-horizon planning", "summary": "Researchers released TycoonLE, a JAX-based reinforcement learning environment for long-horizon planning in a simulated logistics economy. The environment supports action legality, delayed rewards, and replayable audit traces, with a companion benchmark report at TycoonBench. It is designed to study agent planning and decision-making under economic constraints.", "body_md": "Tycoon Learning Environment (TycoonLE) is a reinforcement learning environment for economically grounded, long-horizon planning. Agents operate in a simulated logistics economy where they allocate capital, build transport routes, move cargo, manage debt, and optimize delayed returns.\n\nIt is designed to study action legality, candidate-frontier decision interfaces, financing timing, delayed rewards, procedural variation, and replayable audit traces.\n\nTycoonLE uses a fixed-shape interface. 
Agents choose among valid route, finance, and wait candidates, making rollouts compatible with JAX transformations such as `jit`, `vmap`, and `scan`.\n\nThe replay UI makes policies inspectable through route choices, cargo flow, financing behavior, reward, score, and profit over time.\n\nTycoonBench provides a companion benchmark report for comparing agent and model performance on TycoonLE planning tasks: [vrtnis.github.io/tycoonbench](https://vrtnis.github.io/tycoonbench/).\n\nUse Python 3.11 or 3.12:\n\n```\npy -3.12 -m venv .venv\n.\\.venv\\Scripts\\python.exe -m pip install -e \".[test]\"\nnpm install\n```\n\n```python\nimport jax\nfrom tycoonle_jax import TycoonLE\n\n# Create an environment and reset it with a fixed PRNG key.\nenv = TycoonLE(split=\"dev\", family=\"chain\")\nstate, timestep = env.reset(jax.random.PRNGKey(0))\n# argmax returns the first index of the action mask's maximum.\naction = timestep.observation.action_mask.argmax()\nstate, timestep = env.step(state, action)\n```\n\nExport a replay:\n\n```\n.\\.venv\\Scripts\\python.exe examples\\quickstart.py\nnpm run dev\n```\n\nOpen the browser UI and load `runs/quickstart/replay.json`.\n\nRun tests:\n\n```\n.\\.venv\\Scripts\\python.exe -m pytest\nnpm run build\n```\n\nRun a small PPO smoke train:\n\n```\n.\\.venv\\Scripts\\python.exe examples\\train_ppo_jax.py --updates 1 --num-envs 4 --rollout-length 4 --update-epochs 1 --hidden-sizes 32\n```\n\nIf you find this work useful, consider citing:\n\n```\n@software{tycoonle,\n  title = {TycoonLE},\n  author = {TycoonLE contributors},\n  year = {2026},\n  url = {https://github.com/vrtnis/tycoon-learning-environment}\n}\n```\n\nTycoonLE uses sprite artwork from [OpenGFX](https://github.com/OpenTTD/OpenGFX), an open-source graphics base set for [OpenTTD](https://www.openttd.org/).", "url": "https://wpnews.pro/news/tycoonle-a-jax-reinforcement-learning-environment-for-long-horizon-planning", "canonical_source": "https://github.com/vrtnis/tycoon-learning-environment", "published_at": "2026-06-13 02:02:07+00:00", "updated_at": "2026-06-13 02:20:09.078446+00:00", "lang": "en", 
"topics": ["artificial-intelligence", "machine-learning", "ai-research", "ai-agents", "developer-tools"], "entities": ["TycoonLE", "JAX", "TycoonBench", "OpenGFX", "OpenTTD"], "alternates": {"html": "https://wpnews.pro/news/tycoonle-a-jax-reinforcement-learning-environment-for-long-horizon-planning", "markdown": "https://wpnews.pro/news/tycoonle-a-jax-reinforcement-learning-environment-for-long-horizon-planning.md", "text": "https://wpnews.pro/news/tycoonle-a-jax-reinforcement-learning-environment-for-long-horizon-planning.txt", "jsonld": "https://wpnews.pro/news/tycoonle-a-jax-reinforcement-learning-environment-for-long-horizon-planning.jsonld"}}