cd /news/robotics/isaac-lab-example-dual-arm-nero-reac… Β· home β€Ί topics β€Ί robotics β€Ί article
[ARTICLE Β· art-39268] src=hackster.io β†— pub= topic=robotics verified=true sentiment=↑ positive

Isaac Lab Example: Dual-Arm Nero Reach Training

NVIDIA Isaac Lab has been extended to support the AgileX Dual-Arm Nero Manipulator, enabling developers to train dual-arm reinforcement learning policies for robotic manipulation. The open-source implementation provides robot configuration files, URDF assets, and training scripts for reach tasks. This integration aims to accelerate research in adaptive robotic control for unstructured environments.

read9 min views1 publishedJun 25, 2026
Isaac Lab Example: Dual-Arm Nero Reach Training
Image: Hackster (auto-discovered)

Robotic manipulation remains one of the most important research directions in embodied AI. While traditional kinematics-based controllers provide stable motion execution, they often struggle in unstructured environments where adaptability is required.

Recent advances in Reinforcement Learning (RL) have enabled robotic arms to learn task-oriented behaviors directly from interaction, making it possible to achieve robust control policies without manually designing every motion strategy.

In this project, we extend the original SO-ARM101 Isaac Lab implementation by integrating the AgileX Dual-Arm Nero Manipulator, allowing developers to quickly train and validate dual-arm RL policies using NVIDIA Isaac Lab.

ReposityOpen-source implementation:

smalleha/isaac_so_arm101: Isaac Lab external project for SO-ARM100/101 arm robot.

β”œβ”€β”€ robotsβ”‚   β”œβ”€β”€ dual_neroβ”‚   β”‚   β”œβ”€β”€ dual_nero.pyβ”‚   β”‚   β”œβ”€β”€ __init__.pyβ”‚   β”‚   β”œβ”€β”€ meshesβ”‚   β”‚   └── urdfβ”‚   β”‚       └── dual_nero.urdfβ”œβ”€β”€ scriptsβ”‚   β”œβ”€β”€ rl_gamesβ”‚   β”‚   β”œβ”€β”€ play.pyβ”‚   β”‚   └── train.pyβ”‚   └── zero_agent.pyβ”œβ”€β”€ tasksβ”‚   β”œβ”€β”€ __init__.pyβ”‚   └── reachβ”‚       β”œβ”€β”€ agentsβ”‚       β”œβ”€β”€ dual_nero_joint_pos_env_cfg.pyβ”‚       β”œβ”€β”€ dual_nero_reach_env_cfg.pyβ”‚       β”œβ”€β”€ __init__.pyβ”‚       └── mdp└── ui_extension_example.py

Key additions include:

Before importing the robot into Isaac Lab, mesh references inside the URDF should be converted to relative paths.

For example:

<link name="base_link">    <inertial>      <origin rpy="0 0 0" xyz="-0.0592395620981769 -0.068642440505388 0.0562764736144042"/>      <mass value="4.46524857458863"/>      <inertia ixx="0.0608289280191989" ixy="-2.54649959130438E-06" ixz="1.11851948046933E-07" iyy="0.0218722514454004" iyz="-0.000689477252402357" izz="0.0680524540174318"/>    </inertial>    <visual>      <origin rpy="0 0 0" xyz="0 0 0"/>      <geometry>        <mesh filename="../meshes/base_link.dae"/>      </geometry>      <material name="">        <color rgba="0.776470588235294 0.756862745098039 0.737254901960784 1"/>      </material>    </visual>    <collision>      <origin rpy="0 0 0" xyz="0 0 0"/>      <geometry>        <mesh filename="../meshes/base_link.dae"/>      </geometry>    </collision>  </link>

This ensures that the asset can correctly locate mesh resources during simulation.

3.2 Creating the Robot ConfigurationNext, create a robot configuration file:

isaac_so_arm101/robots/dual_nero/dual_nero.py

This file defines:

  • Robot articulation properties
  • Joint stiffness and damping
  • Actuator configuration
  • Initial joint states
  • Gripper settings

The resulting DUAL_NERO_CFG

object becomes the robot asset used by Isaac Lab during training.

from pathlib import Pathimport isaaclab.sim as sim_utilsfrom isaaclab.actuators import ImplicitActuatorCfgfrom isaaclab.assets.articulation import ArticulationCfgTEMPLATE_ASSETS_DATA_DIR = Path(__file__).resolve().parentDUAL_NERO_CFG = ArticulationCfg(    spawn=sim_utils.UrdfFileCfg(        fix_base=True,        merge_fixed_joints=False,        replace_cylinders_with_capsules=True,        asset_path=f"{TEMPLATE_ASSETS_DATA_DIR}/urdf/dual_nero.urdf",        activate_contact_sensors=False, # set as false while waiting for capsule implementation        rigid_props=sim_utils.RigidBodyPropertiesCfg(            disable_gravity=False,            max_depenetration_velocity=5.0,        ),        articulation_props=sim_utils.ArticulationRootPropertiesCfg(            enabled_self_collisions=True,            solver_position_iteration_count=8,            solver_velocity_iteration_count=0,        ),        joint_drive=sim_utils.UrdfConverterCfg.JointDriveCfg(            gains=sim_utils.UrdfConverterCfg.JointDriveCfg.PDGainsCfg(stiffness=0, damping=0)        ),    ),    init_state=ArticulationCfg.InitialStateCfg(        rot=(1.0, 0.0, 0.0, 0.0),        joint_pos={                "left_joint.*": 0.0,                "right_joint.*": 0.0,                "left_gripper_joint.*": 0.0,                  "right_gripper_joint.*": 0.0,          },        # Set initial joint velocities to zero        joint_vel={".*": 0.0},    ),    actuators={            "arm": ImplicitActuatorCfg(                joint_names_expr=["left_joint.*", "right_joint.*"],                effort_limit=25.0,                 velocity_limit=1.5,                               stiffness={                    "left_joint1": 200.0,                     "left_joint2": 170.0,                    "left_joint3": 120.0,                    "left_joint4": 80.0,                    "left_joint5": 50.0,                    "left_joint6": 20.0,                    "left_joint7": 10.0,                    "right_joint1": 200.0,                     "right_joint2": 170.0,                    "right_joint3": 120.0,                    "right_joint4": 80.0,                    "right_joint5": 50.0,                    "right_joint6": 20.0,                    "right_joint7": 10.0                },                               damping={                    "left_joint1": 100.0,                    "left_joint2": 60.0,                    "left_joint3": 70.0,                    "left_joint4": 24.0,                    "left_joint5": 20.0,                    "left_joint6": 10.0,                    "left_joint7": 5,                    "right_joint1": 100.0,                    "right_joint2": 60.0,                    "right_joint3": 70.0,                    "right_joint4": 24.0,                    "right_joint5": 20.0,                    "right_joint6": 10.0,                    "right_joint7": 5,                },            ),        "gripper": ImplicitActuatorCfg(            joint_names_expr=["left_gripper_joint.*","right_gripper_joint.*"],            effort_limit_sim=22,  # Increased from 1.9 to 2.5 for stronger grip            velocity_limit_sim=1.5,            stiffness=800.0,  # Increased from 25.0 to 60.0 for more reliable closing            damping=20.0,  # Increased from 10.0 to 20.0 for stability        ),        },    soft_joint_pos_limit_factor=0.9,)

Create an init.py file inside the dual_nero directory so that Python recognizes it as a package.

Two environment configuration files are required:

tasks/reach/β”œβ”€β”€ dual_nero_joint_pos_env_cfg.py└── dual_nero_reach_env_cfg.py

4.1 Joint Position Environment Configurationdual_nero_joint_pos_env_cfg.py

specifies:

  • Controlled joints
  • End-effector links
  • Action spaces
  • Command targets
import mathimport isaaclab_tasks.manager_based.manipulation.reach.mdp as mdpfrom isaaclab.utils import configclassfrom isaac_so_arm101.robots import DUAL_NERO_CFG # noqa: F401   from isaac_so_arm101.tasks.reach.dual_nero_reach_env_cfg import Dual_NeroReachEnvCfgfrom isaaclab.assets.articulation import ArticulationCfg@configclassclass Dual_Nero_ReachEnvCfg(Dual_NeroReachEnvCfg):    def __post_init__(self):        # post init of parent        super().__post_init__()        # switch robot to OpenArm        self.scene.robot = DUAL_NERO_CFG.replace(prim_path="{ENV_REGEX_NS}/Robot",)        # override rewards        self.rewards.left_end_effector_position_tracking.params["asset_cfg"].body_names = ["left_gripper_base"]        self.rewards.left_end_effector_position_tracking_fine_grained.params["asset_cfg"].body_names = [            "left_gripper_base"        ]        self.rewards.left_end_effector_orientation_tracking.params["asset_cfg"].body_names = ["left_gripper_base"]        self.rewards.right_end_effector_position_tracking.params["asset_cfg"].body_names = ["right_gripper_base"]        self.rewards.right_end_effector_position_tracking_fine_grained.params["asset_cfg"].body_names = [            "right_gripper_base"        ]        self.rewards.right_end_effector_orientation_tracking.params["asset_cfg"].body_names = ["right_gripper_base"]        # override actions        self.actions.left_arm_action = mdp.JointPositionActionCfg(            asset_name="robot",            joint_names=[                "left_joint.*",            ],            scale=0.5,            use_default_offset=True,        )        self.actions.right_arm_action = mdp.JointPositionActionCfg(            asset_name="robot",            joint_names=[                "right_joint.*",            ],            scale=0.5,            use_default_offset=True,        )        # override command generator body        # end-effector is along z-direction        self.commands.left_ee_pose.body_name = "left_gripper_base"        self.commands.right_ee_pose.body_name = "right_gripper_base"@configclass    class Dual_Nero_ReachEnvCfg_PLAY(Dual_Nero_ReachEnvCfg):    def __post_init__(self):        # post init of parent        super().__post_init__()        # make a smaller scene for play        self.scene.num_envs = 50        self.scene.env_spacing = 2.5        # disable randomization for play        self.observations.policy.enable_corruption = False

4.2 Reach Task Definitiondual_nero_reach_env_cfg.py

contains the full RL environment definition.

This includes:

1.Scene Configuration

  • Ground plane
  • Lighting
  • Robot asset

2.Command Generation

3.Observation Space

4.Reward Design

5.Curriculum Learning

6.Episode Settings

import mathfrom dataclasses import MISSINGimport isaaclab.sim as sim_utilsfrom isaaclab.assets import ArticulationCfg, AssetBaseCfgfrom isaaclab.envs import ManagerBasedRLEnvCfgfrom isaaclab.managers import ActionTermCfg as ActionTermfrom isaaclab.managers import CurriculumTermCfg as CurrTermfrom isaaclab.managers import EventTermCfg as EventTermfrom isaaclab.managers import ObservationGroupCfg as ObsGroupfrom isaaclab.managers import ObservationTermCfg as ObsTermfrom isaaclab.managers import RewardTermCfg as RewTermfrom isaaclab.managers import SceneEntityCfgfrom isaaclab.managers import TerminationTermCfg as DoneTermfrom isaaclab.scene import InteractiveSceneCfgfrom isaaclab.utils import configclassfrom isaaclab.utils.noise import AdditiveUniformNoiseCfg as Unoiseimport isaaclab_tasks.manager_based.manipulation.reach.mdp as mdp### Scene definition##@configclassclass Dual_NeroReachSceneCfg(InteractiveSceneCfg):    """Configuration for the scene with a robotic arm."""    # world    ground = AssetBaseCfg(        prim_path="/World/ground",        spawn=sim_utils.GroundPlaneCfg(),        init_state=AssetBaseCfg.InitialStateCfg(pos=(0.0, 0.0, 0)),    )    # robots    robot: ArticulationCfg = MISSING    # lights    light = AssetBaseCfg(        prim_path="/World/light",        spawn=sim_utils.DomeLightCfg(color=(0.75, 0.75, 0.75), intensity=2500.0),    )### MDP settings##@configclassclass CommandsCfg:    """Command terms for the MDP."""    left_ee_pose = mdp.UniformPoseCommandCfg(        asset_name="robot",        body_name=MISSING,        resampling_time_range=(4.0, 4.0),        debug_vis=True,        ranges=mdp.UniformPoseCommandCfg.Ranges(            pos_x=(0.5, 0.5),            pos_y=(0.15, 0.25),            pos_z=(0.3, 0.5),            roll=(-math.pi / 6, math.pi / 6),            pitch=(3 * math.pi / 2, 3 * math.pi / 2),            yaw=(8 * math.pi / 9, 10 * math.pi / 9),        ),    )    right_ee_pose = mdp.UniformPoseCommandCfg(        asset_name="robot",        body_name=MISSING,        resampling_time_range=(4.0, 4.0),        debug_vis=True,        ranges=mdp.UniformPoseCommandCfg.Ranges(            pos_x=(0.5, 0.5),            pos_y=(-0.25, -0.15),            pos_z=(0.3, 0.5),            roll=(-math.pi / 6, math.pi / 6),            pitch=(3 * math.pi / 2, 3 * math.pi / 2),            yaw=(8 * math.pi / 9, 10 * math.pi / 9),        ),    )@configclassclass ActionsCfg:    """Action specifications for the MDP."""    left_arm_action: ActionTerm = MISSING    right_arm_action: ActionTerm = MISSING@configclassclass ObservationsCfg:    """Observation specifications for the MDP."""    @configclass    class PolicyCfg(ObsGroup):        """Observations for policy group."""        # observation terms (order preserved)        left_joint_pos = ObsTerm(            func=mdp.joint_pos_rel,            params={                "asset_cfg": SceneEntityCfg(                    "robot",                    joint_names=[                        "left_joint.*",                    ],                )            },            noise=Unoise(n_min=-0.01, n_max=0.01),        )        right_joint_pos = ObsTerm(            func=mdp.joint_pos_rel,            params={                "asset_cfg": SceneEntityCfg(                    "robot",                    joint_names=[                        "right_joint.*",                    ],                )            },            noise=Unoise(n_min=-0.01, n_max=0.01),        )        left_joint_vel = ObsTerm(            func=mdp.joint_vel_rel,            params={                "asset_cfg": SceneEntityCfg(                    "robot",                    joint_names=[                        "left_joint.*",                    ],                )            },            noise=Unoise(n_min=-0.01, n_max=0.01),        )        right_joint_vel = ObsTerm(            func=mdp.joint_vel_rel,            params={                "asset_cfg": SceneEntityCfg(                    "robot",                    joint_names=[                        "right_joint.*",                    ],                )            },            noise=Unoise(n_min=-0.01, n_max=0.01),        )        left_pose_command = ObsTerm(func=mdp.generated_commands, params={"command_name": "left_ee_pose"})        right_pose_command = ObsTerm(func=mdp.generated_commands, params={"command_name": "right_ee_pose"})        left_actions = ObsTerm(func=mdp.last_action, params={"action_name": "left_arm_action"})        right_actions = ObsTerm(func=mdp.last_action, params={"action_name": "right_arm_action"})        def __post_init__(self):            self.enable_corruption = True            self.concatenate_terms = True    # observation groups    policy: PolicyCfg = PolicyCfg()@configclassclass EventCfg:    """Configuration for events."""    reset_robot_joints = EventTerm(        func=mdp.reset_joints_by_scale,        mode="reset",        params={            "position_range": (0.5, 1.5),            "velocity_range": (0.0, 0.0),        },    )@configclassclass RewardsCfg:    """Reward terms for the MDP."""    left_end_effector_position_tracking = RewTerm(        func=mdp.position_command_error,        weight=-0.2,        params={            "asset_cfg": SceneEntityCfg("robot", body_names=MISSING),            "command_name": "left_ee_pose",        },    )    right_end_effector_position_tracking = RewTerm(        func=mdp.position_command_error,        weight=-0.25,        params={            "asset_cfg": SceneEntityCfg("robot", body_names=MISSING),            "command_name": "right_ee_pose",        },    )    left_end_effector_position_tracking_fine_grained = RewTerm(        func=mdp.position_command_error_tanh,        weight=0.1,        params={            "asset_cfg": SceneEntityCfg("robot", body_names=MISSING),            "std": 0.1,            "command_name": "left_ee_pose",        },    )    right_end_effector_position_tracking_fine_grained = RewTerm(        func=mdp.position_command_error_tanh,        weight=0.2,        params={            "asset_cfg": SceneEntityCfg("robot", body_names=MISSING),            "std": 0.1,            "command_name": "right_ee_pose",        },    )    left_end_effector_orientation_tracking = RewTerm(        func=mdp.orientation_command_error,        weight=-0.1,        params={            "asset_cfg": SceneEntityCfg("robot", body_names=MISSING),            "command_name": "left_ee_pose",        },    )    right_end_effector_orientation_tracking = RewTerm(        func=mdp.orientation_command_error,        weight=-0.1,        params={            "asset_cfg": SceneEntityCfg("robot", body_names=MISSING),            "command_name": "right_ee_pose",        },    )    # action penalty    action_rate = RewTerm(func=mdp.action_rate_l2, weight=-0.0001)    left_joint_vel = RewTerm(        func=mdp.joint_vel_l2,        weight=-0.0001,        params={            "asset_cfg": SceneEntityCfg(                "robot",                joint_names=[                    "left_joint.*",                ],            )        },    )    right_joint_vel = RewTerm(        func=mdp.joint_vel_l2,        weight=-0.0001,        params={            "asset_cfg": SceneEntityCfg(                "robot",                joint_names=[                    "right_joint.*",                ],            )        },    )@configclassclass TerminationsCfg:    """Termination terms for the MDP."""    time_out = DoneTerm(func=mdp.time_out, time_out=True)@configclassclass CurriculumCfg:    """Curriculum terms for the MDP."""    action_rate = CurrTerm(        func=mdp.modify_reward_weight,        params={"term_name": "action_rate", "weight": -0.005, "num_steps": 4500},    )    left_joint_vel = CurrTerm(        func=mdp.modify_reward_weight,        params={"term_name": "left_joint_vel", "weight": -0.001, "num_steps": 4500},    )    right_joint_vel = CurrTerm(        func=mdp.modify_reward_weight,        params={"term_name": "right_joint_vel", "weight": -0.001, "num_steps": 4500},    )### Environment configuration##@configclassclass Dual_NeroReachEnvCfg(ManagerBasedRLEnvCfg):    """Configuration for the reach end-effector pose tracking environment."""    # Scene settings    scene: Dual_NeroReachSceneCfg = Dual_NeroReachSceneCfg(num_envs=4096, env_spacing=2.5)    # Basic settings    observations: ObservationsCfg = ObservationsCfg()    actions: ActionsCfg = ActionsCfg()    commands: CommandsCfg = CommandsCfg()    # MDP settings    rewards: RewardsCfg = RewardsCfg()    terminations: TerminationsCfg = TerminationsCfg()    events: EventCfg = EventCfg()    curriculum: CurriculumCfg = CurriculumCfg()    def __post_init__(self):        """Post initialization."""        # general settings        self.decimation = 2        self.sim.render_interval = self.decimation        self.episode_length_s = 24.0        self.viewer.eye = (3.5, 3.5, 3.5)        # simulation settings        self.sim.dt = 1.0 / 60.0\`\`\`**### 4.3 Registering the Environment** Register the task inside:\`src/isaac_so_arm101/tasks/reach/\__init_\_.py\`\`\`\` PYTHONgym.register(    id="Isaac-Dual-Nero-Reach-v0",    entry_point="isaaclab.envs:ManagerBasedRLEnv",    kwargs={        "env_cfg_entry_point":f"{\__name_\_}.dual_nero_joint_pos_env_cfg:Dual_Nero_ReachEnvCfg",        "rsl_rl_cfg_entry_point": f"{agents.\__name_\_}.rsl_rl_ppo_cfg:ReachPPORunnerCfg",        "rl_games_cfg_entry_point": f"{agents.\__name_\_}:rl_games_ppo_cfg.yaml",    },    disable_env_checker=True,)
  1. Training the Reach PolicyStep 1.Activate the Isaac Lab Environment
conda activate env_isaaclab

Step 2.Navigate to the project directory:

cd isaac_so_arm101

Step 3.Train the whole project

Option 1:RSL-RL

  • Train:
uv run train \    --task Isaac-Dual-Nero-Reach-v0 \    --headless
  • Evaluate:
uv run play \    --task Isaac-Dual-Nero-Reach-v0
  • Training result:
dual_nero_rsl_rl.gif

Option 2: RL-Games

  • Train:
python3 scripts/rl_games/train.py \    --task Isaac-Dual-Nero-Reach-v0 \    --headless
  • Evaluate:
python3 scripts/rl_games/play.py \    --task Isaac-Dual-Nero-Reach-v0
  • Training result:
dual_nero_rl_games.gif

Both frameworks successfully learn the dual-arm reaching task.

However, in our experiments:

  • RL-Games converges faster
  • Motion trajectories appear smoother
  • Final reaching accuracy is generally higher

For relatively complex robot morphologies such as dual-arm manipulators, RL-Games currently provides more stable performance and is recommended as the default training backend.

FAQQ1: Why is my Dual-Nero robot collapsing or shaking violently after the URDF?

This is usually caused by incorrect actuator parameters or unrealistic inertial properties.

Common causes include:

  • Joint stiffness set too high
  • Damping values too low
  • Incorrect mass distribution in the URDF
  • Self-collision configuration issues
  • Unstable simulation timestep

Before starting RL training, verify that the robot can remain stable under gravity using only PD control.

Quick check: If the robot cannot stand still without RL, the issue is likely in the robot model rather than the training algorithm.

Q2: Why does the reward improve, but the robot never reaches the target accurately?

A rising reward does not always indicate successful task completion.

Typical reasons include:

  • Reward weights are unbalanced
  • Orientation rewards dominate position rewards
  • Action penalties are too strong
  • Command sampling range is too large
  • End-effector link is incorrectly configured

In most reach tasks, incorrect reward shaping is the primary reason for poor final accuracy.

Q3: How do I verify that the end-effector link is configured correctly?

One of the most common mistakes in Isaac Lab reach tasks is assigning the wrong end-effector body.

For Dual-Nero, the target link should be:

left_gripper_baseright_gripper_base

Symptoms of an incorrect configuration include:

  • Reward remains low
  • Robot moves randomly
  • Training appears to converge but fails visually
  • End-effector does not move toward the target marker

Always verify the body name in Isaac Sim before launching large-scale training.

Q4: Why does RL-Games perform better than RSL-RL for this task?

Both frameworks are PPO-based, but their implementations differ.

For large-scale manipulation environments:

RL-Games generally scales better with thousands of parallel environments

PPO updates are often more stable

Training throughput is higher on modern GPUs

For Dual-Nero reach experiments, RL-Games typically achieves smoother trajectories and faster convergence.

However, results may vary depending on reward design and task complexity.

Q5: My policy works in simulation but fails on the real robot. Why?

This is the most common Sim-to-Real issue.

Possible causes include:

  • Joint friction mismatch
  • Encoder noise
  • Latency differences
  • Payload variations
  • Inaccurate motor models
  • Unmodeled cable effects

To improve transfer performance:

  • Apply domain randomization
  • Add observation noise
  • Randomize dynamics parameters
  • Validate trajectories at low speed first

Successful simulation training is only the first step toward real-world deployment.

πŸ“· Have Question?If you encounter any issues with environment installation, parameter configuration, or RL training, feel free to leave your questions for further discussion.

[Read more](javascript:void(0))

── more in #robotics 4 stories Β· sorted by recency
── more on @nvidia isaac lab 3 stories trending now
sponsored brought to you by zahid.host 4,200+ EU-deployed projects
reading about agents? ship yours in a single git push.

Run your AI side-project on zahid.host

EU-based hosting, git-push deploys, automatic HTTPS, no cold starts. Free tier with a custom domain β€” perfect for shipping the agent you just read about.

$git push zahid main
β†’ Live at https://your-agent.zahid.host βœ“
Get free account β†’ Pricing
from €0/mo Β· no card required
LIVE [news/isaac-lab-example-du…] indexed:0 read:9min 2026-06-25 Β· β€”