Our Projects

RAGEN
We introduce RAGEN, a framework for training LLM reasoning agents with reinforcement learning (RL) in multi-turn, stochastic environments. RAGEN formulates agent training as a Markov Decision Process (MDP) and optimizes it through Reasoning-Interaction Chain Optimization (RICO). RAGEN-0.5B is trained on three agentic tasks and shows intriguing reasoning patterns.
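To make the MDP framing concrete, here is a minimal Python sketch of a multi-turn rollout in a stochastic environment. It is only an illustration under stated assumptions: the toy environment, the `Step` record, the fixed policy, and the discount factor are all hypothetical and are not taken from RAGEN, and the sketch does not implement RICO.

```python
import random
from dataclasses import dataclass


@dataclass
class Step:
    state: int
    action: int
    reward: float


class ToyStochasticEnv:
    """Hypothetical 1-D walk: the chosen move succeeds with probability 0.8."""

    def __init__(self, goal=3, max_turns=10, seed=0):
        self.goal = goal
        self.max_turns = max_turns
        self.rng = random.Random(seed)
        self.state = 0
        self.turn = 0

    def reset(self):
        self.state = 0
        self.turn = 0
        return self.state

    def step(self, action):
        # action is +1 or -1; with probability 0.2 the move is flipped,
        # which is what makes the environment stochastic.
        move = action if self.rng.random() < 0.8 else -action
        self.state += move
        self.turn += 1
        done = self.state == self.goal or self.turn >= self.max_turns
        reward = 1.0 if self.state == self.goal else 0.0
        return self.state, reward, done


def rollout(env, policy):
    """Collect one multi-turn trajectory of (state, action, reward) steps."""
    state = env.reset()
    trajectory = []
    done = False
    while not done:
        action = policy(state)
        next_state, reward, done = env.step(action)
        trajectory.append(Step(state, action, reward))
        state = next_state
    return trajectory


def discounted_return(trajectory, gamma=0.9):
    """Sum of rewards, discounted by gamma per turn, computed backwards."""
    total = 0.0
    for step in reversed(trajectory):
        total = step.reward + gamma * total
    return total


def always_right(state):
    # Stand-in for an LLM policy; a real agent would reason, then act.
    return 1


traj = rollout(ToyStochasticEnv(seed=0), always_right)
print(len(traj), discounted_return(traj))
```

In an actual agent the policy would be an LLM that emits reasoning followed by an action at every turn, and an RL objective would be optimized over whole trajectories like `traj`.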

Embodied Agent Interface
Current evaluations of LLMs in embodied AI lack standardization and detailed error analysis. Our benchmark addresses both gaps with a unified interface for diverse tasks and LLM modules (planning, decomposition, etc.) and with fine-grained metrics that identify specific error types (hallucination, affordance errors, etc.). This enables systematic assessment that pinpoints where LLMs succeed and where they fail, informing more effective integration into embodied agents.