FluxDAgger

A Model-Decoupled DAgger Pipeline for Dual-Arm Robotic Manipulation

Rollout → Takeover. A single DAgger episode with autonomous-to-human handover at 0:30.
Model-decoupled architecture. Policy and reward modules plug in via standardized ROS topics.
Baseline collection system. Cameras + master arm (demonstration) + slave arm (execution) on the AgileX Piper platform.
End-to-end data flow. Raw multi-modal capture → Parquet → MP4 / NumPy / takeover clips / reward annotations.
Timestamp synchronization. Per-camera image buffers matched to robot states inside a bounded sync window.
Hardware (before). Original shared CAN-bus topology of the AgileX Piper four-arm platform.
Hardware (after). Per-arm dedicated USB-CAN interfaces enabling independent enable/mode/control.

TL;DR

Deploying DAgger on a real dual-arm robot is far more than a policy-inference problem: it couples policy, hardware, teleoperation, multi-camera synchronization, and post-processing into a single brittle stack. FluxDAgger decouples these concerns behind a small set of ROS topics, so swapping the vision-language-action (VLA) model or the reward model requires no change to the collection logic.

Contributions

  • Model-decoupled architecture. Policy inference lives in an external project/service; the collector consumes a fixed action topic, so one DAgger workflow serves arbitrary VLA policies.
  • Human-in-the-loop DAgger. A single episode combines autonomous rollout, online human takeover, and per-frame source tagging (rollout vs. correction), with multi-camera frames and joint states aligned by timestamp.
  • Reward-pluggable infrastructure. A Qwen3-VL reward module is integrated via an independent interface for both online ROS publishing and offline batch annotation, enabling reward-guided dataset filtering.
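
To make per-frame source tagging and reward-guided filtering concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not FluxDAgger's actual code: the `Frame` record, the `tag_source` and `filter_frames` helpers, the reward values, and the 0.5 threshold are all hypothetical, and the filtering rule (always keep human corrections, keep rollout frames only above a reward threshold) is one plausible policy among several.

```python
from dataclasses import dataclass
from typing import List

ROLLOUT = "rollout"
CORRECTION = "correction"


@dataclass
class Frame:
    stamp: float   # seconds since episode start
    source: str    # ROLLOUT or CORRECTION
    reward: float  # per-frame score from a reward model (hypothetical scale 0..1)


def tag_source(human_in_control: bool) -> str:
    """Label a frame by who was driving the arms when it was recorded."""
    return CORRECTION if human_in_control else ROLLOUT


def filter_frames(frames: List[Frame], min_reward: float) -> List[Frame]:
    """Assumed policy: keep all corrections; keep rollout frames with reward >= min_reward."""
    return [f for f in frames if f.source == CORRECTION or f.reward >= min_reward]


# Toy episode with a human takeover at 30 s; rewards are made-up numbers.
TAKEOVER_S = 30.0
episode = [
    Frame(stamp=t, source=tag_source(t >= TAKEOVER_S), reward=r)
    for t, r in [(10.0, 0.9), (20.0, 0.2), (30.0, 0.1), (40.0, 0.8)]
]

kept = filter_frames(episode, min_reward=0.5)
print([(f.stamp, f.source) for f in kept])
# [(10.0, 'rollout'), (30.0, 'correction'), (40.0, 'correction')]
```

Keeping the source tag on every frame means the same episode can later be split into rollout and correction subsets, or filtered with a different reward threshold, without re-recording.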

Method & Hardware

FluxDAgger is organized as a set of ROS Noetic nodes (camera, sync-observation, model-inference, DAgger controller, DAgger collector, and reward) that communicate through standardized topic interfaces. The policy and reward models are therefore drop-in modules rather than code built into the collector. On the hardware side, the stock AgileX Piper platform shares one CAN bus across arms; FluxDAgger instead gives each arm a dedicated USB-CAN interface, so each arm’s enable state, mode, and commands can be managed independently.
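
The job of the sync-observation node, matching per-camera image buffers to a robot state inside a bounded window, can be sketched without ROS as a nearest-timestamp lookup. This is a hedged illustration only: the 50 ms window, the camera names, the buffer length, and the rule of dropping a sample when any camera has no frame in the window are assumptions, not values or behavior taken from FluxDAgger.

```python
from bisect import bisect_left
from collections import deque
from typing import Deque, Dict, Optional, Tuple

# Hypothetical window size; the source only says the window is bounded.
SYNC_WINDOW_S = 0.05

Stamped = Tuple[float, str]  # (timestamp in seconds, payload placeholder)


def nearest_in_window(buffer: Deque[Stamped], t_ref: float,
                      window_s: float) -> Optional[Stamped]:
    """Return the item closest in time to t_ref, or None if none is within window_s.

    Assumes the buffer is ordered by increasing timestamp.
    """
    stamps = [t for t, _ in buffer]
    i = bisect_left(stamps, t_ref)
    # Only the neighbors on either side of the insertion point can be nearest.
    candidates = [buffer[j] for j in (i - 1, i) if 0 <= j < len(buffer)]
    if not candidates:
        return None
    best = min(candidates, key=lambda item: abs(item[0] - t_ref))
    return best if abs(best[0] - t_ref) <= window_s else None


def sync_observation(camera_buffers: Dict[str, Deque[Stamped]], state_stamp: float,
                     window_s: float = SYNC_WINDOW_S) -> Optional[Dict[str, Stamped]]:
    """Match one robot state to one frame per camera; drop the sample if any camera misses."""
    matched = {}
    for name, buf in camera_buffers.items():
        hit = nearest_in_window(buf, state_stamp, window_s)
        if hit is None:
            return None
        matched[name] = hit
    return matched


# Toy buffers with made-up timestamps and frame labels.
buffers = {
    "cam_front": deque([(0.00, "f0"), (0.033, "f1"), (0.066, "f2")], maxlen=30),
    "cam_wrist": deque([(0.01, "w0"), (0.045, "w1"), (0.20, "w2")], maxlen=30),
}

print(sync_observation(buffers, state_stamp=0.04))
# {'cam_front': (0.033, 'f1'), 'cam_wrist': (0.045, 'w1')}
print(sync_observation(buffers, state_stamp=0.13))
# None
```

Rejecting the whole sample when one camera has no frame inside the window keeps every stored observation internally consistent, at the cost of dropping some data when a camera stalls.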

For interactive demo videos and the full system walkthrough, visit the project page.