Projects
Selected robotics, embodied intelligence, and visual perception projects.
FluxDAgger
A Model-Decoupled DAgger Pipeline for Dual-Arm Robotic Manipulation
TL;DR
Deploying DAgger (Dataset Aggregation) on a real dual-arm robot is far more than a policy-inference problem: it couples policy, hardware, teleoperation, multi-camera synchronization, and post-processing into a single brittle stack. FluxDAgger decouples these concerns behind a small set of ROS topics, so swapping the vision-language-action (VLA) model or the reward model never touches the collection logic.
Contributions
- Model-decoupled architecture. Policy inference lives in an external project/service; the collector consumes a fixed action topic, so one DAgger workflow serves arbitrary VLA policies.
- Human-in-the-loop DAgger. Autonomous rollout, online human takeover, and per-frame source tagging (rollout vs. correction) all happen within a single episode, with timestamp alignment between multi-camera images and joint states.
- Reward-pluggable infrastructure. A Qwen3-VL reward module is integrated via an independent interface for both online ROS publishing and offline batch annotation, enabling reward-guided dataset filtering.
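The per-frame source tagging and timestamp alignment described above can be illustrated with a small, ROS-free sketch. It is not FluxDAgger's actual code: the `Frame` record, the `align_episode` helper, the camera names, the 20 ms skew tolerance, and the takeover window are all assumptions made for illustration. Each joint-state sample is paired with the nearest image timestamp from every camera, dropped if any camera is too far off, and tagged according to whether a human was in control at that time.

```python
from bisect import bisect_left
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple


@dataclass
class Frame:
    stamp: float               # joint-state timestamp (seconds)
    joints: List[float]        # joint positions at that time
    images: Dict[str, float]   # camera name -> timestamp of the matched image
    source: str                # "rollout" or "correction"


def nearest(stamps: Sequence[float], t: float) -> float:
    """Return the element of the sorted, non-empty sequence closest to t."""
    i = bisect_left(stamps, t)
    candidates = stamps[max(i - 1, 0):i + 1]
    return min(candidates, key=lambda s: abs(s - t))


def align_episode(
    joint_msgs: Sequence[Tuple[float, List[float]]],
    camera_stamps: Dict[str, Sequence[float]],
    human_active: Callable[[float], bool],
    max_skew: float = 0.02,    # assumed tolerance, not a FluxDAgger setting
) -> List[Frame]:
    frames = []
    for stamp, joints in joint_msgs:
        matched = {cam: nearest(ts, stamp) for cam, ts in camera_stamps.items()}
        # Drop the sample if any camera image is too far from the joint state.
        if any(abs(s - stamp) > max_skew for s in matched.values()):
            continue
        source = "correction" if human_active(stamp) else "rollout"
        frames.append(Frame(stamp, joints, matched, source))
    return frames


# Toy episode: the human takes over between t=0.15 s and t=0.25 s.
joint_msgs = [(0.00, [0.0]), (0.10, [0.1]), (0.20, [0.2]), (0.30, [0.3])]
camera_stamps = {
    "wrist": [0.005, 0.102, 0.195, 0.400],   # no wrist image near t=0.30
    "top": [0.001, 0.099, 0.210, 0.310],
}
frames = align_episode(joint_msgs, camera_stamps, lambda t: 0.15 <= t <= 0.25)
print([(f.stamp, f.source) for f in frames])
```

In this toy episode the sample at t=0.30 is discarded because the closest wrist image is 100 ms away, and only the sample at t=0.20 is tagged as a correction. In a ROS Noetic system the pairing step would more likely be handled by an approximate-time synchronizer than by hand-written code like this.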
Method & Hardware
FluxDAgger is organized as a set of ROS Noetic nodes (camera, sync-observation, model-inference, DAgger controller, DAgger collector, and reward node) that communicate through standardized topic interfaces. The policy and reward models become drop-in modules rather than hard dependencies of the collector. On the hardware side, the stock AgileX Piper platform shares one CAN bus across both arms; FluxDAgger introduces a per-arm USB-CAN topology so each arm’s enable state, mode, and commands can be managed independently.
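The decoupling idea can be shown with a toy in-process publish/subscribe bus standing in for ROS topics. Everything here is hypothetical: the `TopicBus` class, the `/policy/action` topic name, and the two placeholder policies are illustrative assumptions, not FluxDAgger's real interfaces. The point is only that the collector depends on the topic, never on the policy that publishes to it.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Action = List[float]


class TopicBus:
    """Minimal in-process stand-in for a pub/sub layer such as ROS topics."""

    def __init__(self) -> None:
        self._subs: Dict[str, List[Callable[[Action], None]]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callable[[Action], None]) -> None:
        self._subs[topic].append(callback)

    def publish(self, topic: str, msg: Action) -> None:
        for callback in self._subs[topic]:
            callback(msg)


ACTION_TOPIC = "/policy/action"  # hypothetical topic name


class Collector:
    """Records every action on the fixed topic; knows nothing about policies."""

    def __init__(self, bus: TopicBus) -> None:
        self.actions: List[Action] = []
        bus.subscribe(ACTION_TOPIC, self.actions.append)


# Two interchangeable placeholder "policies" (not real VLA models).
def policy_a(obs: List[float]) -> Action:
    return [x * 0.5 for x in obs]


def policy_b(obs: List[float]) -> Action:
    return [x + 1.0 for x in obs]


def run_policy(bus: TopicBus, policy: Callable[[List[float]], Action],
               observations: List[List[float]]) -> None:
    for obs in observations:
        bus.publish(ACTION_TOPIC, policy(obs))


bus = TopicBus()
collector = Collector(bus)
run_policy(bus, policy_a, [[2.0, 4.0]])   # swap the policy...
run_policy(bus, policy_b, [[2.0, 4.0]])   # ...without touching the collector
print(collector.actions)  # [[1.0, 2.0], [3.0, 5.0]]
```

Swapping `policy_a` for `policy_b` requires no change to `Collector`, which mirrors the claim that the collection logic stays untouched when the model changes.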
For interactive demo videos and the full system walkthrough, visit the project page.
Gesture Recognition on Horizon X3 Pi
Efficient gesture recognition and mobile-robot deployment (undergraduate thesis)
TL;DR
An undergraduate-thesis project on deploying efficient gesture recognition on the Horizon X3 Pi edge board and integrating it into a mobile-robot system. The work covers model selection, edge-side optimization, and ROS2 integration with simulation and real-robot validation.
Contributions
- Edge-deployable detector. Trained YOLOv5s on the HaGRID static-gesture dataset, reaching 86.60% mAP, and benchmarked it against other detectors from the YOLO and DETR families.
- Edge optimization. Optimized and deployed YOLOv5s on the Horizon X3 Pi, raising inference throughput from 20 FPS to 30 FPS with only a 0.36% drop in mAP.
- Robot integration. Built ROS2 motion-control nodes and integrated gesture recognition with human tracking on the lab’s mobile robot.
Method & Platform
The pipeline combines a YOLOv5s detector (HaGRID-trained) with Horizon-toolchain-based quantization and graph optimization for the X3 Pi BPU. Recognized gestures are mapped to discrete motion primitives published as ROS2 Twist messages; a parallel human-tracking module shares the detection backbone to drive the platform’s follow behavior.
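A minimal sketch of the gesture-to-motion mapping follows. It avoids a ROS2 dependency by using small dataclasses that mimic the `linear`/`angular` layout of a Twist message. The gesture labels, velocity values, and confidence threshold are illustrative assumptions and are not taken from the thesis.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class Vector3:
    x: float = 0.0
    y: float = 0.0
    z: float = 0.0


@dataclass
class Twist:
    """Stand-in with the same linear/angular fields as a ROS2 Twist message."""
    linear: Vector3 = field(default_factory=Vector3)
    angular: Vector3 = field(default_factory=Vector3)


# Hypothetical table: gesture label -> (linear.x in m/s, angular.z in rad/s).
PRIMITIVES: Dict[str, Tuple[float, float]] = {
    "like": (0.2, 0.0),      # move forward
    "dislike": (-0.2, 0.0),  # move backward
    "one": (0.0, 0.5),       # rotate left
    "peace": (0.0, -0.5),    # rotate right
    "stop": (0.0, 0.0),      # halt
}


def gesture_to_twist(label: str, confidence: float, min_conf: float = 0.6) -> Twist:
    """Map a detected gesture to a motion primitive.

    Unknown labels and low-confidence detections yield a zero Twist,
    so the robot stops rather than acting on an uncertain detection.
    """
    if confidence < min_conf or label not in PRIMITIVES:
        return Twist()
    vx, wz = PRIMITIVES[label]
    return Twist(linear=Vector3(x=vx), angular=Vector3(z=wz))


cmd = gesture_to_twist("one", 0.9)
print(cmd.linear.x, cmd.angular.z)  # 0.0 0.5
```

Defaulting to a zero command for uncertain detections is one conservative design choice for a mobile robot. A real node would publish the resulting message on a velocity topic at a fixed rate.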