Vision-Language-Action Models
Verification and progress monitoring for VLA-based robot policies.
Overview
Vision-Language-Action (VLA) models enable robots to follow natural language instructions by grounding language in visual observations and producing actions. But how do we know when these policies are failing?
My research develops verification frameworks that monitor task progress and detect failure modes in real time.
Current Projects
Progress-Monitored Verification for VLA Models
Status: Ongoing research at UC Irvine
→ Key insight: External verifiers can estimate task progress and re-rank VLA action outputs under ambiguity.
What we’re building:
- Verifier-augmented framework to estimate task progress during execution
- Failure mode detection for VLA-based robot policies
- Modular evaluation pipeline across simulated manipulation tasks
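As a minimal sketch of the re-ranking idea (all names and the toy verifier are illustrative, not the actual framework), a verifier scores each candidate action by estimated task progress and the policy executes the top-ranked one:

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical types: an "action" is any policy output; a verifier is any
# function mapping (observation, action) -> estimated task progress in [0, 1].
Action = Tuple[float, ...]
Verifier = Callable[[Dict, Action], float]

def rerank_actions(obs: Dict, candidates: List[Action], verifier: Verifier) -> List[Action]:
    """Re-rank VLA action candidates by the verifier's progress estimate."""
    scored = [(verifier(obs, a), a) for a in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [a for _, a in scored]

# Toy verifier: prefer actions that move the end-effector toward a goal pose.
def toy_verifier(obs: Dict, action: Action) -> float:
    goal = obs["goal"]
    dist = sum((g - a) ** 2 for g, a in zip(goal, action)) ** 0.5
    return 1.0 / (1.0 + dist)  # closer to goal => higher estimated progress

obs = {"goal": (1.0, 0.0)}
candidates = [(0.0, 0.0), (0.9, 0.1), (-1.0, 0.0)]
best = rerank_actions(obs, candidates, toy_verifier)[0]  # (0.9, 0.1)
```

The same scores can double as a monitoring signal: if the best candidate's estimated progress stays flat or drops over several steps, that is evidence the policy is stalled or failing.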
Why it matters: VLA models can fail silently, producing confident but incorrect actions. Progress monitoring enables early intervention before failures cascade.
Human Natural Language to Robotic Control
Status: Completed at Caltech (2022–2024) | Advisor: Prof. John Doyle
→ Key insight: LLMs + Model Predictive Control can bridge natural language and low-level robot control.
What we built:
- Human-robot collaboration framework for natural language to robotic control
- Integration of LLM task planning with MPC trajectory optimization
- Visual-language model feedback loop for improved robot performance
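A hedged sketch of the pipeline's shape, with a stub planner standing in for the LLM and a trivial proportional tracker standing in for the MPC solver (all function names and values here are illustrative assumptions, not the framework's API):

```python
from typing import List, Tuple

Waypoint = Tuple[float, float]

def plan_from_language(instruction: str) -> List[Waypoint]:
    """Stub for the LLM task planner: maps an instruction to waypoints.
    In the actual framework this step would query a language model."""
    if "pick" in instruction.lower():
        return [(0.5, 0.5), (1.0, 1.0)]
    return [(0.0, 0.0)]

def mpc_step(state: Waypoint, target: Waypoint, gain: float = 0.5) -> Waypoint:
    """Toy stand-in for one MPC iteration: a proportional step toward the
    current waypoint (a real MPC would optimize over a receding horizon)."""
    return (state[0] + gain * (target[0] - state[0]),
            state[1] + gain * (target[1] - state[1]))

def execute(instruction: str, state: Waypoint, steps_per_wp: int = 5) -> Waypoint:
    """Language -> plan -> low-level control loop."""
    for wp in plan_from_language(instruction):
        for _ in range(steps_per_wp):
            state = mpc_step(state, wp)
    return state

final = execute("pick up the block", (0.0, 0.0))  # converges toward (1.0, 1.0)
```

The separation of concerns is the point: the language model only emits task-level waypoints, while trajectory feasibility and smoothness are left to the MPC layer; a visual-language model can then close the loop by checking whether the executed motion matches the instruction.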
Research Questions I’m Exploring
- Compositional verification — Can we verify complex tasks by composing simpler sub-task verifiers?
- Learning from failures — How can VLA models improve from detected failure modes?
- Sim-to-real transfer — Do verification methods trained in simulation transfer to real robots?