Our Projects

  • RAGEN

    We introduce RAGEN to train LLM reasoning agents via RL in multi-turn, stochastic environments. RAGEN is formulated as a Markov decision process (MDP) and optimized through Reasoning-Interaction Chain Optimization (RICO). RAGEN-0.5B is trained on three agentic tasks and shows intriguing reasoning patterns.

  • VAGEN

    VAGEN is an RL framework that improves VLM agent training with the TRICO algorithm. By selectively focusing on critical tokens and enhancing cross-turn credit assignment, TRICO outperforms prior methods on visual agentic tasks.

  • Embodied Agent Interface

    Current evaluations of LLMs in embodied AI lack standardization and detailed error analysis. We introduce a unified interface (Embodied Agent Interface) that covers diverse tasks and LLM modules (planning, decomposition, etc.) and provides fine-grained metrics (identifying hallucination, affordance errors, etc.). This enables systematic assessment that pinpoints specific LLM limitations and strengths, informing more effective integration into embodied agents.

  • Long Video Haystack

    We introduce LongVideoHaystack, a 480-hour video temporal search dataset with 15,092 human-annotated instances, on which the state of the art (SOTA) reaches a Temporal F1 of 2.1%.