FluxDAgger
A Model-Decoupled DAgger Pipeline for Dual-Arm Robotic Manipulation
TL;DR
Deploying DAgger on a real dual-arm robot is more than a policy-inference problem: it couples policy, hardware, teleoperation, multi-camera synchronization, and post-processing into a single brittle stack. FluxDAgger decouples these concerns behind a small set of ROS topics, so swapping the vision-language-action (VLA) model or the reward model never touches the collection logic.
Contributions
- Model-decoupled architecture. Policy inference lives in an external project/service; the collector consumes a fixed action topic, so one DAgger workflow serves arbitrary VLA policies.
- Human-in-the-loop DAgger. Autonomous rollout, online human takeover, and per-frame source tagging (rollout vs. correction) all happen within a single episode, with multi-camera and joint-state streams aligned by timestamp.
- Reward-pluggable infrastructure. A Qwen3-VL reward module is integrated via an independent interface for both online ROS publishing and offline batch annotation, enabling reward-guided dataset filtering.
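The per-frame tagging, timestamp alignment, and reward-guided filtering named above can be illustrated with a minimal sketch. It is not FluxDAgger's implementation: the `Frame` fields, the nearest-neighbour matching rule, the 20 ms `max_skew` tolerance, and the frame-level reward threshold are all assumptions chosen for illustration, and plain Python lists stand in for ROS message streams.

```python
from bisect import bisect_left
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Sequence


@dataclass
class Frame:
    stamp: float                   # reference timestamp (seconds)
    cameras: Dict[str, float]      # camera name -> stamp of the matched image
    joint_stamp: float             # stamp of the matched joint-state sample
    source: str                    # "rollout" or "correction"
    reward: Optional[float] = None # filled in later by a reward model


def nearest(stamps: Sequence[float], t: float) -> float:
    """Return the element of a sorted, non-empty sequence closest to t."""
    i = bisect_left(stamps, t)
    candidates = stamps[max(i - 1, 0):i + 1]
    return min(candidates, key=lambda s: abs(s - t))


def align(
    ref_stamps: Sequence[float],
    camera_stamps: Dict[str, Sequence[float]],
    joint_stamps: Sequence[float],
    human_active: Callable[[float], bool],
    max_skew: float = 0.02,  # assumed tolerance, not a value from the source
) -> List[Frame]:
    """Match every stream to each reference stamp and tag the frame source."""
    frames = []
    for t in ref_stamps:
        cams = {name: nearest(s, t) for name, s in camera_stamps.items()}
        joint = nearest(joint_stamps, t)
        matched = list(cams.values()) + [joint]
        if any(abs(m - t) > max_skew for m in matched):
            continue  # drop frames where some stream is too far from the reference
        source = "correction" if human_active(t) else "rollout"
        frames.append(Frame(t, cams, joint, source))
    return frames


def filter_by_reward(frames: List[Frame], threshold: float) -> List[Frame]:
    """Keep only frames that have a reward at or above the threshold."""
    return [f for f in frames if f.reward is not None and f.reward >= threshold]
```

In a ROS Noetic node the alignment step would more likely rely on an existing synchronizer than on hand-written matching; the sketch only makes the data flow explicit: align, tag, annotate, then filter.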
Method & Hardware
FluxDAgger is organized as a set of ROS Noetic nodes (camera, sync-observation, model-inference, DAgger controller, DAgger collector, and reward node) that communicate through standardized topic interfaces. The policy and reward models are therefore drop-in modules rather than hard dependencies of the collector. On the hardware side, the stock AgileX Piper platform shares one CAN bus across both arms; FluxDAgger introduces a per-arm USB-CAN topology so each arm's enable state, mode, and commands can be managed independently.
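To make the decoupling concrete, here is a small sketch in which an in-process publish/subscribe bus stands in for ROS topics. The `TopicBus` class, the topic names `/sync/observation` and `/policy/action`, and the list-of-floats message format are hypothetical and not taken from the FluxDAgger code; the point is only that the collector depends on a topic name, never on a policy class.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, List

# Hypothetical topic names, chosen for illustration only.
OBS_TOPIC = "/sync/observation"
ACTION_TOPIC = "/policy/action"


class TopicBus:
    """Tiny in-process stand-in for ROS publish/subscribe."""

    def __init__(self) -> None:
        self._subs: DefaultDict[str, List[Callable[[Any], None]]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callable[[Any], None]) -> None:
        self._subs[topic].append(callback)

    def publish(self, topic: str, msg: Any) -> None:
        for callback in self._subs[topic]:
            callback(msg)


class Collector:
    """Records whatever arrives on the action topic; knows nothing about the policy."""

    def __init__(self, bus: TopicBus) -> None:
        self.actions: List[Any] = []
        bus.subscribe(ACTION_TOPIC, self.actions.append)


def make_policy_node(bus: TopicBus, policy: Callable[[Any], Any]) -> None:
    """Wrap any observation -> action callable as a node on the bus."""
    bus.subscribe(OBS_TOPIC, lambda obs: bus.publish(ACTION_TOPIC, policy(obs)))


def run_episode(policy: Callable[[Any], Any], observations: List[Any]) -> List[Any]:
    """Wire up a collector and a policy node, then replay the observations."""
    bus = TopicBus()
    collector = Collector(bus)
    make_policy_node(bus, policy)
    for obs in observations:
        bus.publish(OBS_TOPIC, obs)
    return collector.actions
```

Swapping the policy means passing a different callable to `run_episode`; `Collector` is unchanged. For example, `run_episode(lambda obs: [2 * x for x in obs], [[1, 2]])` and `run_episode(lambda obs: [0.0] * len(obs), [[1, 2]])` run through the same collector code.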
For interactive demo videos and the full system walkthrough, visit the project page.