Publications
2025
- RAGEN (RL-Agent): Training Agents by Reinforcing Reasoning
Zihan Wang*, Kangrui Wang*, Qineng Wang*, Pingyue Zhang*, Linjie Li*, Zhengyuan Yang, Kefan Yu, Minh Nhat Nguyen, Monica Lam, Yiping Lu, Kyunghyun Cho, Jiajun Wu, Li Fei-Fei, Lijuan Wang, Yejin Choi, Manling Li
LLM Agent + Multi-turn RL
- VAGEN: Training VLM Agents with Multi-Turn Reinforcement Learning
Kangrui Wang*, Pingyue Zhang*, Zihan Wang*, Qineng Wang*, Linjie Li*, Zhengyuan Yang, Chi Wan, Yiping Lu, Manling Li
VLM Agent + Multi-turn RL
- Chain-of-Experts: Unlocking the Communication Power of MoEs
Zihan Wang, Rui Pan, Lu Yin, Manling Li, Shiwei Liu
- Re-thinking Temporal Search for Long-Form Video Understanding
Jinhui Ye*, Zihan Wang*, Haosen Sun, Keshigeyan Chandrasegaran, Zane Durante, Cristobal Eyzaguirre, Yonatan Bisk, Juan Carlos Niebles, Ehsan Adeli, Li Fei-Fei, Jiajun Wu, Manling Li
- LayoutVLM: Differentiable Optimization of 3D Layout via Vision-Language Models
Fan-Yun Sun, Weiyu Liu, Siyi Gu, Dylan Lim, Goutam Bhat, Federico Tombari, Manling Li, Nick Haber, Jiajun Wu
- Foundation Models Meet Embodied Agents
Manling Li, Yunzhu Li, Jiayuan Mao, Wenlong Huang
Tutorial
- Chain-of-Action: Faithful and Multimodal Question Answering through Large Language Models
Zhenyu Pan, Haozheng Luo, Manling Li, Han Liu
- Why Is Spatial Reasoning Hard for VLMs? An Attention Mechanism Perspective on Focus Areas
Shiqi Chen, Tongyao Zhu, Ruochen Zhou, Jinghan Zhang, Siyang Gao, Juan Carlos Niebles, Mor Geva, Junxian He, Jiajun Wu, Manling Li
- EmbodiedBench: Comprehensive Benchmarking Multi-modal Large Language Models for Vision-Driven Embodied Agents
Rui Yang, Hanyang Chen, Junyu Zhang, Mark Zhao, Cheng Qian, Kangrui Wang, Qineng Wang, Teja Venkat Koripella, Marziyeh Movahedi, Manling Li, Heng Ji, Huan Zhang, Tong Zhang
- SyncMind: Measuring Agent Out-of-Sync Recovery in Collaborative Software Engineering
Xuehang Guo, Xingyao Wang, Yangyi Chen, Sha Li, Chi Han, Manling Li, Heng Ji
2024
- Embodied Agent Interface: Benchmarking LLMs for Embodied Decision Making
Manling Li*, Shiyu Zhao*, Qineng Wang*, Kangrui Wang*, Yu Zhou*, Sanjana Srivastava, Cem Gokmen, Tony Lee, Li Erran Li, Ruohan Zhang, Weiyu Liu, Percy Liang, Li Fei-Fei, Jiayuan Mao, Jiajun Wu
NeurIPS 2024 DB Track (Oral Presentation), Best Paper Award at SoCalNLP 2024
- HourVideo: 1-Hour Video-Language Understanding
Keshigeyan Chandrasegaran, Agrim Gupta, Taran Kota, Lea M. Hadzic, Jimming He, Cristobal Eyzaguirre, Zane Durante, Manling Li, Jiajun Wu, Li Fei-Fei
- IKEA Manuals at Work: 4D Grounding of Assembly Instructions on Internet Videos
Yunong Liu, Weiyu Liu, Shubh Khanna, Cristobal Eyzaguirre, Manling Li, Juan Carlos Niebles, Vineeth Ravi, Saumitra Mishra, Jiajun Wu
- Visually Descriptive Language Modeling for Vector Graphics Reasoning
Zhenhailong Wang, Joy Hsu, Xingyao Wang, Kuan-Hao Huang, Manling Li, Jiajun Wu, Heng Ji
- LM-Steer: Word Embeddings Are Steers for Language Models
Chi Han, Jialiang Xu, Manling Li, Yi Fung, Chenkai Sun, Nan Jiang, Tarek Abdelzaher, Heng Ji
Outstanding Paper Award at ACL 2024
- Why Does New Knowledge Create Messy Ripple Effects in LLMs?
Jiaxin Qin, Zixuan Zhang, Chi Han, Pengfei Yu, Manling Li, Heng Ji
- Deep Concept Injection for Zero-shot Multimodal Reasoning
Xudong Lin, Manling Li, Richard Zemel, Heng Ji, Shih-Fu Chang
- Knowledge Overshadowing Causes Amalgamated Hallucination in Large Language Models
Yuji Zhang, Sha Li, Jiateng Liu, Pengfei Yu, Yi Fung, Jing Li, Manling Li, Heng Ji
- MentalArena: Self-play Training of Language Models for Diagnosis and Treatment of Mental Health Disorders
Cheng Li, May Fung, Qingyun Wang, Chi Han, Manling Li, Jindong Wang, Heng Ji
- Can LLMs Produce Faithful Explanations For Fact-checking? Towards Faithful Explainable Fact-Checking via Multi-Agent Debate
Kyungha Kim*, Sangyun Lee*, Kung-Hsiang Huang*, Hou Pong Chan, Manling Li, Heng Ji
- InfoPattern: Unveiling Information Propagation Patterns in Social Media
Chi Han*, Jialiang Xu*, Manling Li*, Hanning Zhang*, Tarek Abdelzaher, Heng Ji
- SmartBook: AI-Assisted Situation Report Generation
Revanth Gangi Reddy, Yi Fung, Qi Zeng, Manling Li, Zihan Wang, Paul Sullivan, Heng Ji
- Controlling Object Existence Hallucinations in Large Vision Language Models
Bohan Zhai, Shijia Yang, Chenfeng Xu, Sheng Shen, Kurt Keutzer, Chunyuan Li, Manling Li
2023
- Event-centric Multimodal Knowledge Acquisition
Manling Li
Thesis
- ViStruct: Visual Structural Knowledge Extraction via Curriculum Guided Code-Vision Representation
Yangyi Chen, Xingyao Wang, Manling Li, Derek Hoiem, Heng Ji
- Defining a New NLP Playground
Sha Li, Chi Han, Pengfei Yu, Carl Edwards, Manling Li, Xingyao Wang, Yi Fung, Charles Yu, Joel R. Tetreault, Eduard H Hovy, Heng Ji
- Knowledge-Driven Vision-Language Encoding
Manling Li, Xudong Lin, Jie Lei, Mohit Bansal, Carl Vondrick, Shih-Fu Chang, Heng Ji
Tutorial
- Towards Fast Adaptation of Pretrained Contrastive Models for Multi-channel Video-Language Retrieval
Xudong Lin, Simran Tiwari, Shiyuan Huang, Manling Li, Mike Zheng Shou, Heng Ji, Shih-Fu Chang
- Open Visual Knowledge Extraction via Relation-Oriented Multimodality Model Prompting
Hejie Cui, Xinyu Fang, Zihan Zhang, Ran Xu, Xuan Kan, Xin Liu, Manling Li, Yangqiu Song, Carl Yang
- Non-Sequential Graph Script Induction via Multimedia Grounding
Yu Zhou†, Sha Li, Manling Li, Xudong Lin, Shih-Fu Chang, Mohit Bansal and Heng Ji
- A Language First Approach to Procedure Planning
Jiateng Liu†, Sha Li, Zhenhailong Wang, Manling Li, Heng Ji
- Open-Domain Hierarchical Event Schema Induction by Incremental Prompting and Verification
Sha Li, Ruining Zhao†, Manling Li, Heng Ji, Chris Callison-Burch and Jiawei Han
- Multimedia Generative Script Learning for Task Planning
Qingyun Wang, Manling Li, Hou Pong Chan, Lifu Huang, Julia Hockenmaier, Girish Chowdhary and Heng Ji
- Learning to Decompose Visual Features with Latent Textual Prompts
Feng Wang†, Manling Li, Xudong Lin, Hairong Lv, Alexander Schwing, Heng Ji
- Knowledge-Driven Vision-Language Pretraining
Manling Li, Xudong Lin, Jie Lei, Mohit Bansal, Shih-Fu Chang, Heng Ji
Tutorial
- Video Event Extraction via Tracking Visual States of Arguments
Guang Yang†, Manling Li, Jiajie Zhang, Xudong Lin, Shih-Fu Chang, Heng Ji
- ADEPT: A DEbiasing PrompT Framework
Ke Yang†, Charles Yu, Yi Fung, Manling Li, Heng Ji
- Timeline Summarization based on Event Graph Compression via Time-Aware Optimal Transport
Manling Li, Tengfei Ma, Mo Yu, Lingfei Wu, Tian Gao, Heng Ji and Kathleen McKeown
- Joint Multimedia Event Extraction from Video and Article
Brian Chen, Xudong Lin, Christopher Thomas, Manling Li, Shoya Yoshida, Lovish Chum, Heng Ji and Shih-Fu Chang
- RESIN: A Dockerized Schema-Guided Cross-document Cross-lingual Cross-media Information Extraction and Event Tracking System
Manling Li, Ying Lin, Joseph Hoover, Spencer Whitehead, Clare Voss, Jiawei Han and Heng Ji
- Connecting the Dots: Event Graph Schema Induction with Path Language Modeling
Manling Li, Qi Zeng, Ying Lin, Kyunghyun Cho, Heng Ji, Jonathan May, Nathanael Chambers, Clare Voss
- Joint Extraction of Events and Entities within a Document Context
Bishan Yang, Tom Mitchell
- Improving Event Detection via Open-domain Trigger Knowledge
Xiaozhi Wang, Xu Han, Zhiyuan Liu, Maosong Sun, Peng Li
2022
- Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners
Zhenhailong Wang†*, Manling Li*, Ruochen Xu, Luowei Zhou, Jie Lei, Xudong Lin, Shuohang Wang, Ziyi Yang, Chenguang Zhu, Derek Hoiem, Shih-Fu Chang, Mohit Bansal, Heng Ji
equal contribution
- CLIP-Event: Connecting Vision and Text with Event Structures
Manling Li, Ruochen Xu, Shuohang Wang, Xudong Lin, Chenguang Zhu, Xuedong Huang, Heng Ji, Shih-Fu Chang
Oral, Top 4.1%
- COVID-19 Claim Radar: A Structured Claim Extraction and Tracking System
Manling Li, Revanth Gangi Reddy, Ziqi Wang, Yi-Shyuan Chiang, Tuan M. Lai, Pengfei Yu, Zixuan Zhang, Heng Ji
- Event Schema Induction with Double Graph Autoencoders
Xiaomeng Jin†, Manling Li and Heng Ji
- New Frontiers of Information Extraction
Muhao Chen, Lifu Huang, Manling Li, Ben Zhou, Heng Ji
Tutorial
- MuMuQA: Multimedia Multi-Hop News Question Answering via Cross-Media Knowledge Extraction and Grounding
Revanth Gangi Reddy†, Xilin Rui†, Manling Li, Xudong Lin, Haoyang Wen, Jaemin Cho, Lifu Huang, Mohit Bansal, Avi Sil, Shih-Fu Chang, Alexander Schwing, Heng Ji
2021
- The Future is not One-dimensional: Complex Event Schema Induction by Graph Modeling for Event Prediction
Manling Li, Sha Li, Zhenhailong Wang, Lifu Huang, Kyunghyun Cho, Heng Ji, Jiawei Han and Clare Voss
- Event-centric Natural Language Processing
Muhao Chen, Hongming Zhang, Qiang Ning, Manling Li, Heng Ji, Kathleen McKeown and Dan Roth
Tutorial
- COVID-19 Literature Knowledge Graph Construction and Drug Repurposing Report Generation
Qingyun Wang, Manling Li, Xuan Wang, Nikolaus Parulian, Guangxing Han, Jiawei Ma, Jingxuan Tu, Ying Lin, Haoran Zhang, Weili Liu, Aabhas Chauhan, Yingjun Guan, Bangzheng Li, Ruisong Li, Xiangchen Song, Heng Ji, Jiawei Han, Shih-Fu Chang, James Pustejovsky, David Liem, Ahmed Elsayed, Martha Palmer, Jasmine Rah, Clare Voss, Cynthia Schneider, Boyan Onyshkevych
Best Demo Paper Award at NAACL 2021
- RESIN: A Dockerized Schema-Guided Cross-document Cross-lingual Cross-media Information Extraction and Event Tracking System
Haoyang Wen, Ying Lin, Tuan M. Lai, Xiaoman Pan, Sha Li, Xudong Lin, Ben Zhou, Manling Li, Haoyu Wang, Hongming Zhang, Xiaodong Yu, Alexander Dong, Zhenhailong Wang, Yi R. Fung, Piyush Mishra, Qing Lyu, Dídac Surís, Brian Chen, Susan W. Brown, Martha Palmer, Chris Callison-Burch, Carl Vondrick, Jiawei Han, Dan Roth, Shih-Fu Chang and Heng Ji
- Event-centric Natural Language Processing
Muhao Chen, Hongming Zhang, Qiang Ning, Manling Li, Heng Ji and Dan Roth
Tutorial
- Connecting the Dots: Event Graph Schema Induction with Path Language Modeling
Manling Li, Qi Zeng, Ying Lin, Kyunghyun Cho, Heng Ji, Jonathan May, Nathanael Chambers, Clare Voss
- GAIA: A Fine-grained Multimedia Knowledge Extraction System
Manling Li, Alireza Zareian, Qi Zeng, Spencer Whitehead, Di Lu, Heng Ji, Shih-Fu Chang
Best Demo Paper Award at ACL 2020
- Cross-media Structured Common Space for Multimedia Event Extraction
Manling Li, Alireza Zareian, Ying Lin, Xiaoman Pan, Spencer Whitehead, Brian Chen, Bo Wu, Heng Ji, Shih-Fu Chang, Clare Voss, Daniel Napierski, Marjorie Freedman
- A Joint Neural Model for Information Extraction with Global Features
Ying Lin, Heng Ji, Fei Huang, Lingfei Wu
- GAIA at SM-KBP 2020 - A Dockerized Multi-media Multi-lingual Knowledge Extraction, Clustering, Temporal Tracking and Hypothesis Generation System
Manling Li, Ying Lin, Tuan Manh Lai, Xiaoman Pan, Haoyang Wen, Sha Li, et al.
Ranked 1st on the leaderboard
- Keep Meeting Summaries on Topic: Abstractive Multi-Modal Meeting Summarization
Manling Li, Lingyu Zhang, Heng Ji, Richard J. Radke
- GAIA: A Fine-grained Multimedia Knowledge Extraction System
Manling Li, Alireza Zareian, Ying Lin, Xiaoman Pan, Spencer Whitehead, Brian Chen, Bo Wu, Heng Ji, Shih-Fu Chang, Clare Voss, Daniel Napierski, Marjorie Freedman
Best Demo Award
- Cross-media Structured Common Space for Multimedia Event Extraction
Manling Li, Alireza Zareian, Qi Zeng, Spencer Whitehead, Di Lu, Heng Ji, Shih-Fu Chang
2020
- Connecting the Dots: Event Graph Schema Induction with Path Language Modeling
Manling Li, Qi Zeng, Ying Lin, Kyunghyun Cho, Heng Ji, Jonathan May, Nathanael Chambers, Clare Voss
- GAIA: A Fine-grained Multimedia Knowledge Extraction System
Manling Li, Alireza Zareian, Qi Zeng, Spencer Whitehead, Di Lu, Heng Ji, Shih-Fu Chang
Best Demo Paper Award at ACL 2020
- Cross-media Structured Common Space for Multimedia Event Extraction
Manling Li, Alireza Zareian, Ying Lin, Xiaoman Pan, Spencer Whitehead, Brian Chen, Bo Wu, Heng Ji, Shih-Fu Chang, Clare Voss, Daniel Napierski, Marjorie Freedman
- Cross-lingual Structure Transfer for Zero-resource Event Extraction
Manling Li, Haoyang Wen, Qi Zeng, Ying Lin, Bishan Yang, Nanyun Peng, Heng Ji
- RESIN: A Dockerized Schema-Guided Cross-document Cross-lingual Cross-media Information Extraction and Event Tracking System
Manling Li, Sha Li, Zhenhailong Wang, Lifu Huang, Kyunghyun Cho, Heng Ji, Jiawei Han, Clare Voss
2019
- Keep Meeting Summaries on Topic: Abstractive Multi-Modal Meeting Summarization
Manling Li, Lingyu Zhang, Heng Ji, Richard J. Radke
- GAIA: A Fine-grained Multimedia Knowledge Extraction System
Manling Li, Alireza Zareian, Ying Lin, Xiaoman Pan, Spencer Whitehead, Brian Chen, Bo Wu, Heng Ji, Shih-Fu Chang, Clare Voss, Daniel Napierski, Marjorie Freedman
Best Demo Award
- Cross-media Structured Common Space for Multimedia Event Extraction
Manling Li, Alireza Zareian, Qi Zeng, Spencer Whitehead, Di Lu, Heng Ji, Shih-Fu Chang
- GAIA at SM-KBP 2019 - A Multi-media Multi-lingual Knowledge Extraction and Hypothesis Generation System
Manling Li, Ying Lin, Ananya Subburathinam, Spencer Whitehead, Xiaoman Pan, Di Lu, Qingyun Wang, Tongtao Zhang, Lifu Huang, Heng Ji, Alireza Zareian, Hassan Akbari, Brian Chen, Bo Wu, Emily Allaway, Shih-Fu Chang, Kathleen McKeown, Yixiang Yao, Jennifer Chen, Eric Berquist, Kexuan Sun, Xujun Peng, Ryan Gabbard, Marjorie Freedman, Pedro Szekely, T.K. Satish Kumar, Arka Sadhu, Ram Nevatia, Miguel Rodriguez, Yifan Wang, Yang Bai, Ali Sadeghian, Daisy Zhe Wang
Ranked 1st, more than 10% higher than the second-place team