Key Conferences and Papers

Overview

Agent research spans multiple disciplines, with relevant work distributed across top venues in AI, NLP, robotics, and software engineering. This article surveys the most important academic conferences and foundational papers in the agent field, helping researchers quickly locate core literature.

1. Core Academic Conferences

1.1 Agent-Specific Conferences

Conference	Full Name	Founded	Characteristics
AAMAS	International Conference on Autonomous Agents and Multiagent Systems	2002	The main dedicated conference for agent and multi-agent research
AAAI	AAAI Conference on Artificial Intelligence	1980	Comprehensive AI conference with extensive agent work
IJCAI	International Joint Conference on Artificial Intelligence	1969	The earliest international AI conference

1.2 Deep Learning and NLP Conferences

LLM agent research is primarily published at the following venues:

Conference	Relevance to Agents	Representative Work
NeurIPS	Agent workshops, reasoning methods	CoT, ToT, Reflexion
ICML	RL-based agents, tool learning	PAL (ICML 2023), Language Models as Zero-Shot Planners (ICML 2022)
ICLR	LLM reasoning, agent architectures	ReAct, Self-Consistency, MetaGPT
ACL/EMNLP	Language agents, dialogue systems	ChatDev (ACL 2024), API-Bank (EMNLP 2023)
COLM	LLM agent evaluation and design	N/A (Conference on Language Modeling, first held in 2024)

1.3 Robotics and Embodied Intelligence Conferences

Conference	Relevance to Agents
ICRA	Robotic agents, embodied planning
IROS	Autonomous systems, multi-robot coordination
CoRL	Robot learning, embodied decision-making
RSS	Robotics: Science and Systems; robot planning, control, and autonomy

1.4 Important Workshops

Workshop	Host Conference	Topic
LLM Agents Workshop	NeurIPS 2023/2024	Design and evaluation of LLM agents
Foundation Models for Decision Making	NeurIPS 2023	Foundation models for decision-making
Agent Learning in Open-Endedness	ICML 2024	Agent learning in open-ended worlds
Language Agents Workshop	ICLR 2024	Language-driven agents

2. Foundational Papers

2.1 Blog Posts and Surveys (Informal but Highly Influential)

Year	Author	Title	Contribution
2023.06	Lilian Weng	LLM Powered Autonomous Agents	Defined the classic LLM agent framework: Planning + Memory + Tool Use
2024.03	Andrew Ng	Agentic Design Patterns	Summarized four agent design patterns: Reflection, Tool Use, Planning, Multi-Agent Collaboration
2024.12	Anthropic	Building Effective Agents	Proposed engineering best practices for agent systems

Recommended Starting Point

Lilian Weng's blog post is one of the most widely cited informal references in the LLM agent field and is recommended as a first read.

2.2 Reasoning and Chain-of-Thought

Year	Paper	Venue	Core Contribution
2022	Chain-of-Thought Prompting Elicits Reasoning in Large Language Models	NeurIPS 2022	Wei et al. proposed CoT, demonstrating that intermediate reasoning steps significantly improve LLM reasoning
2022	Self-Consistency Improves Chain of Thought Reasoning in Language Models	ICLR 2023	Wang et al. proposed self-consistency: sample multiple reasoning paths and take a majority vote over their final answers
2023	Tree of Thoughts: Deliberate Problem Solving with Large Language Models	NeurIPS 2023	Yao et al. extended reasoning from chains to trees, supporting backtracking and search
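
The majority-vote step of self-consistency can be illustrated with a minimal sketch. Everything named here (`self_consistent_answer`, `make_stub_sampler`, the canned answers) is hypothetical and only for illustration; a real system would replace the stub with sampled LLM completions and would extract the final answer from each reasoning path.

```python
from collections import Counter
from typing import Callable, List


def self_consistent_answer(
    sample_answer: Callable[[str], str], question: str, n_samples: int = 5
) -> str:
    """Sample several reasoning paths and return the most common final answer."""
    answers: List[str] = [sample_answer(question) for _ in range(n_samples)]
    # Majority vote over final answers; ties go to the answer sampled first.
    return Counter(answers).most_common(1)[0][0]


def make_stub_sampler(canned: List[str]) -> Callable[[str], str]:
    """Hypothetical stand-in for an LLM: replays canned final answers in order."""
    remaining = iter(canned)
    return lambda _question: next(remaining)


sampler = make_stub_sampler(["18", "26", "18", "18", "24"])
print(self_consistent_answer(sampler, "How many apples are left?", n_samples=5))
```

The vote is taken over final answers only, so paths that disagree in their intermediate steps can still support the same answer.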

2.3 Action and Tool Use

Year	Paper	Venue	Core Contribution
2021	WebGPT: Browser-assisted question-answering with human feedback	arXiv	Nakano et al. fine-tuned GPT-3 to search with a text-based browser and cite its sources
2022	ReAct: Synergizing Reasoning and Acting in Language Models	ICLR 2023	Yao et al. proposed the Thought-Action-Observation loop, unifying reasoning and action
2023	Toolformer: Language Models Can Teach Themselves to Use Tools	NeurIPS 2023	Schick et al. showed that an LLM can teach itself when and how to call tools
2023	Gorilla: Large Language Model Connected with Massive APIs	arXiv	Patil et al. fine-tuned an LLM to generate accurate calls across a large set of APIs
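
The Thought-Action-Observation loop from ReAct can be sketched roughly as follows. This is a simplified illustration, not the paper's implementation: `react_loop`, `scripted_policy`, and the `lookup` tool are hypothetical names, and the scripted policy stands in for an LLM that would normally generate each thought and action from the transcript.

```python
from typing import Callable, Dict, List, Tuple

# One step is (thought, action_name, action_input); the "finish" action ends the episode.
Step = Tuple[str, str, str]


def react_loop(
    policy: Callable[[List[str]], Step],
    tools: Dict[str, Callable[[str], str]],
    question: str,
    max_steps: int = 5,
) -> str:
    """Alternate thoughts, actions, and observations until the policy finishes."""
    transcript = [f"Question: {question}"]
    for _ in range(max_steps):
        thought, action, arg = policy(transcript)
        transcript.append(f"Thought: {thought}")
        transcript.append(f"Action: {action}[{arg}]")
        if action == "finish":
            return arg
        tool = tools.get(action)
        observation = tool(arg) if tool else f"unknown tool: {action}"
        # The observation is fed back so the next thought can condition on it.
        transcript.append(f"Observation: {observation}")
    return "no answer within step budget"


def scripted_policy(transcript: List[str]) -> Step:
    """Hypothetical stand-in for an LLM: look up once, then answer from the observation."""
    prefix = "Observation: "
    observations = [line for line in transcript if line.startswith(prefix)]
    if not observations:
        return ("I should look this up.", "lookup", "capital of France")
    return ("The observation answers the question.", "finish", observations[-1][len(prefix):])


tools = {"lookup": lambda query: {"capital of France": "Paris"}.get(query, "not found")}
print(react_loop(scripted_policy, tools, "What is the capital of France?"))
```

The step budget (`max_steps`) is an assumption added here so that a policy which never emits `finish` still terminates.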

2.4 Reflection and Self-Improvement

Year	Paper	Venue	Core Contribution
2023	Reflexion: Language Agents with Verbal Reinforcement Learning	NeurIPS 2023	Shinn et al. replaced gradient updates with verbal self-reflection on past attempts
2023	Self-Refine: Iterative Refinement with Self-Feedback	NeurIPS 2023	Madaan et al. proposed an iterative generate-feedback-refine loop
2023	Teaching Large Language Models to Self-Debug	ICLR 2024	Chen et al. showed that an LLM can debug its own code using execution feedback
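
The generate-feedback-refine loop described for Self-Refine can be sketched as below. This is a toy illustration under stated assumptions: in the paper all three roles are played by the same LLM with different prompts, whereas here `generate`, `feedback`, and `refine` are hypothetical rule-based stubs, and the stopping rule (feedback returns `None`) is a simplification.

```python
from typing import Callable, Optional


def self_refine(
    generate: Callable[[str], str],
    feedback: Callable[[str, str], Optional[str]],
    refine: Callable[[str, str, str], str],
    task: str,
    max_rounds: int = 3,
) -> str:
    """Draft once, then alternate critique and revision until no critique remains."""
    draft = generate(task)
    for _ in range(max_rounds):
        critique = feedback(task, draft)
        if critique is None:  # Nothing left to fix: stop early.
            break
        draft = refine(task, draft, critique)
    return draft


# Hypothetical stubs standing in for LLM calls.
def generate(task: str) -> str:
    return "hello world"


def feedback(task: str, draft: str) -> Optional[str]:
    if not draft[0].isupper():
        return "capitalize the first letter"
    if not draft.endswith("."):
        return "end with a period"
    return None


def refine(task: str, draft: str, critique: str) -> str:
    if critique.startswith("capitalize"):
        return draft[0].upper() + draft[1:]
    return draft + "."


print(self_refine(generate, feedback, refine, "write a greeting"))
```

No model weights change anywhere in the loop; the improvement comes entirely from feeding the critique back into the next revision.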

2.5 Agent Systems and Architectures

Year	Paper	Venue	Core Contribution
2023	Generative Agents: Interactive Simulacra of Human Behavior	UIST 2023	Park et al. simulated the social behavior of 25 generative agents in a virtual town
2023	Voyager: An Open-Ended Embodied Agent with Large Language Models	arXiv	Wang et al. built a lifelong learning agent in Minecraft
2023	MetaGPT: Meta Programming for A Multi-Agent Collaborative Framework	ICLR 2024	Hong et al. standardized a multi-agent software development workflow
2023	Cognitive Architectures for Language Agents (CoALA)	TMLR 2024	Sumers et al. proposed a cognitive architecture framework for language agents

2.6 Evaluation and Benchmarks

Year	Paper	Venue	Core Contribution
2023	AgentBench: Evaluating LLMs as Agents	ICLR 2024	Liu et al. introduced an early comprehensive benchmark that evaluates LLMs as agents across multiple environments
2023	SWE-bench: Can Language Models Resolve Real-World GitHub Issues?	ICLR 2024	Jimenez et al. built a software engineering benchmark from real GitHub issues
2023	WebArena: A Realistic Web Environment for Building Autonomous Agents	ICLR 2024	Zhou et al. provided a realistic web environment for agent evaluation

3. Classic Textbooks

Book	Author	Year (first/latest edition)	Notes
Artificial Intelligence: A Modern Approach	Russell & Norvig	1995/2020	The "AI Bible," with an agent perspective throughout
An Introduction to MultiAgent Systems	Wooldridge	2002/2009	Classic textbook on multi-agent systems
Multiagent Systems: Algorithmic, Game-Theoretic, and Logical Foundations	Shoham & Leyton-Brown	2008	Multi-agent algorithms and game theory
Speech and Language Processing	Jurafsky & Martin	2000/2024	NLP reference book with dialogue system chapters

4. Paper Reading Roadmap

Beginner Level (Recommended in Order)

Weng (2023) -- LLM Powered Autonomous Agents (blog)
Wei et al. (2022) -- Chain-of-Thought
Yao et al. (2022) -- ReAct
Park et al. (2023) -- Generative Agents
Shinn et al. (2023) -- Reflexion

Intermediate Level

Yao et al. (2023) -- Tree of Thoughts
Sumers et al. (2023) -- CoALA
Schick et al. (2023) -- Toolformer
Wang et al. (2023) -- Voyager
Hong et al. (2023) -- MetaGPT

Advanced Level

OpenAI (2024) -- o1 System Card
DeepSeek (2025) -- DeepSeek-R1
Anthropic (2024) -- Building Effective Agents
Liu et al. (2023) -- AgentBench
Jimenez et al. (2023) -- SWE-bench

5. Key Research Teams

Team/Institution	Key Researchers	Research Focus
Princeton NLP	Karthik Narasimhan, Shunyu Yao	ReAct, ToT, SWE-bench
Stanford (NLP and HCI)	Percy Liang, Joon Sung Park	Generative Agents, HELM
CMU	Graham Neubig	Code agents, software engineering
OpenAI	Research team	GPT series, Function Calling, Operator
Anthropic	Research team	Claude, Constitutional AI
Google DeepMind	Research team	Gemini, AlphaCode
Microsoft Research	Research team	AutoGen, TaskWeaver
Tsinghua KEG	Jie Tang's team	AgentBench, ChatGLM

References

Weng, L. (2023). LLM Powered Autonomous Agents. lilianweng.github.io.
Wei, J. et al. (2022). Chain-of-Thought Prompting Elicits Reasoning in Large Language Models. NeurIPS 2022.
Yao, S. et al. (2022). ReAct: Synergizing Reasoning and Acting in Language Models. ICLR 2023.
Park, J.S. et al. (2023). Generative Agents: Interactive Simulacra of Human Behavior. UIST 2023.
Shinn, N. et al. (2023). Reflexion: Language Agents with Verbal Reinforcement Learning. NeurIPS 2023.