How to Add Live Telemetry and Failure Diagnosis to Isaac Lab, MuJoCo, or Gazebo Training in Under 5 Minutes

SimTooReal, a platform for robotics teams, now enables live telemetry and failure diagnosis for training runs in Isaac Lab, MuJoCo, Gazebo, and LeRobot with a single command. The tool wraps existing training scripts to stream 28+ metrics in real time and recognizes over 50 failure patterns from logs, eliminating the need to wait for log files to debug issues. A free public diagnosis page and a transfer scoring tool further allow teams to compare simulation and real-world trajectories without overhauling their training stack.

If you train robot policies long enough, you eventually realize the main problem is not launching runs. It is answering these questions fast enough:

This post walks through a practical approach using SimTooReal, a platform built for robotics teams working across Isaac Lab, MuJoCo, Gazebo, and LeRobot workflows. The nice part is that the basic setup does not require rewriting your training loop.

From the public docs and feature pages, the platform is built around a few concrete capabilities:

For this post, we will focus on the fastest path: getting live metrics and failure diagnosis around an existing run.

    pip install simtooreal-agent

That is the shortest path shown in the docs. For a generic training script:

    simtooreal-agent run -- python train.py --env HalfCheetah-v5 --algo ppo

For an Isaac Lab run:

    simtooreal-agent run -- python scripts/rsl_rl/train.py --task Isaac-Ant-v0 --headless

The agent sits around your existing command, parses stdout in real time, and streams metrics into the SimTooReal dashboard. The docs route is here: https://www.simtooreal.com/docs

Once the run is wrapped, you can monitor live training behavior rather than waiting for logs to finish writing. The platform advertises 28+ streamed signals. In practice, the most useful ones for day-to-day robotics training are likely to be:

The reason this matters is simple: a reward curve alone is not enough. I have seen runs where:

If you only check the final chart, you catch those issues too late.

This is the part I like most in the public feature set. SimTooReal's failure intelligence is designed to recognize more than 50 patterns from live logs, including:

The platform also exposes a free public diagnosis page: https://www.simtooreal.com/diagnose

That page accepts logs from frameworks like:

So even before you wire the full workflow, you can already use the diagnosis engine as a quick debugging surface.
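
To make the "wrap the command, parse stdout, flag failures" idea more concrete, here is a minimal sketch of how such a wrapper could work in principle. It is not SimTooReal's implementation: the `key=value` metric format, the two failure patterns, and the names `parse_line` and `run_wrapped` are all assumptions chosen for illustration, and real trainers print metrics in many different shapes.

```python
import re
import subprocess
import sys

# Assumed log format: "key=value" or "key: value" pairs with numeric values.
# Real training scripts vary widely, so treat this regex as a placeholder.
METRIC_RE = re.compile(
    r"(?P<key>[A-Za-z_][\w/]*)\s*[:=]\s*"
    r"(?P<value>-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)"
)

# Illustrative failure patterns only; not the platform's actual catalogue.
FAILURE_PATTERNS = {
    "nan_value": re.compile(r"\bnan\b", re.IGNORECASE),
    "cuda_oom": re.compile(r"CUDA out of memory", re.IGNORECASE),
}


def parse_line(line):
    """Return (metrics, failures) extracted from one line of stdout."""
    metrics = {m["key"]: float(m["value"]) for m in METRIC_RE.finditer(line)}
    failures = [name for name, pat in FAILURE_PATTERNS.items() if pat.search(line)]
    return metrics, failures


def run_wrapped(cmd, on_event=print):
    """Run cmd, echo its output, and call on_event for each informative line."""
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so tracebacks are scanned too
        text=True,
        bufsize=1,  # line-buffered reads on our side of the pipe
    )
    for line in proc.stdout:
        sys.stdout.write(line)  # keep the original console output visible
        metrics, failures = parse_line(line)
        if metrics or failures:
            on_event({"metrics": metrics, "failures": failures})
    return proc.wait()
```

A call such as `run_wrapped(["python", "-u", "train.py", "--env", "HalfCheetah-v5", "--algo", "ppo"])` would echo the run as usual and hand each parsed event to `on_event`, which in a real agent would be a network client rather than `print`. The `-u` flag matters here because a Python child process buffers stdout in blocks when it is piped, which would delay the "live" part.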

The training monitor page also shows an optional direct integration path for teams that want deeper instrumentation. Example:

    from isaacmonitor_sdk import MonitorClient

    client = MonitorClient(run_id="my-run-001")

    # i, mean_reward, and entropy come from your own training loop
    client.log_metrics(
        iteration=i,
        mean_reward=mean_reward,
        entropy=entropy,
        reward_components={
            "velocity": 0.8,
            "upright": 0.6,
            "contact": -0.1,
        },
    )

    client.log_failure(joint="knee_left", reason="joint limit exceeded")
    client.finish_run()

If your team already has a structured training loop and wants richer event data than stdout parsing alone, this is the cleaner path.

Once you start collecting trajectories from both sim and real hardware, the next useful step is the public transfer scoring tool: https://www.simtooreal.com/score

The CLI example from the product pages is:

    simtooreal score --sim sim_traj.csv --real real_traj.csv

According to the site, this compares trajectories with Dynamic Time Warping and returns a transfer score out of 100. That is useful because it turns the vague statement "transfer seems okay" into something you can compare over time.

This is where the product goes beyond metrics collection. The deployment flow on the site is built around:

If your team is already training policies successfully, this is the next maturity step. Training visibility helps you produce better checkpoints. Deployment gates help you avoid promoting the wrong ones.

A lot of robotics infra tools fail because they ask for too much up front. What makes this workflow practical is that it gives you a progressive path:

That is a much better adoption curve than "rebuild your training stack around our SDK."

If you work in robot learning, the gap between simulation and hardware is not just a modeling problem. It is a visibility problem and a decision problem. A good workflow should help you answer:

That is what SimTooReal is aiming to solve.
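
For readers who have not used Dynamic Time Warping before, the transfer-scoring idea mentioned earlier can be sketched in a few lines. The `dtw_distance` function below is the textbook DTW recurrence on 1-D sequences. The `transfer_score` mapping to a 0-100 range, including its `tolerance` parameter and exponential form, is purely an assumption for illustration; the site does not document how its score is derived.

```python
import math


def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) DTW with absolute difference as the cost."""
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both trajectories must be non-empty")
    # prev[j] holds the best alignment cost of a[:i-1] against b[:j]
    prev = [math.inf] * (m + 1)
    prev[0] = 0.0
    for i in range(1, n + 1):
        cur = [math.inf] * (m + 1)
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of: step in a, step in b, or step in both
            cur[j] = cost + min(prev[j], cur[j - 1], prev[j - 1])
        prev = cur
    return prev[m]


def transfer_score(sim, real, tolerance=1.0):
    """Hypothetical 0-100 score: 100 for identical shape, decaying with DTW cost."""
    per_step = dtw_distance(sim, real) / max(len(sim), len(real))
    return 100.0 * math.exp(-per_step / tolerance)
```

Because DTW allows one sample to align with several samples in the other sequence, a real trajectory that is simply slower than the simulated one still scores well; only differences in shape pull the score down. For multi-joint trajectories you would swap the absolute difference for a vector norm.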

Useful links:

https://www.simtooreal.com/docs
https://www.simtooreal.com/features/train
https://www.simtooreal.com/features/failures
https://www.simtooreal.com/diagnose
https://www.simtooreal.com/score

Start with the free diagnosis tool or the docs quickstart.