Agent Milestones
Overview
This article reviews milestones in agent technology from the 1960s to the present. Each system achieved a breakthrough in a particular dimension. By analyzing these milestones, we can trace the evolution of agent technology from rule-based systems to learning systems, and from specialized to general-purpose designs.
Milestone Timeline
```mermaid
timeline
    title Agent Technology Milestones
    section Conversation & Understanding
        1966 ELIZA : First conversational program
        1971 SHRDLU : Language understanding in restricted worlds
        2011 Siri : Mainstream voice assistant
        2022 ChatGPT : LLM conversational agent
    section Knowledge & Reasoning
        1972 MYCIN : Medical expert system
        1987 BDI : Rational agent theory
        1991 SOAR : General cognitive architecture
        2023 CoT/ReAct : LLM reasoning + action
    section Planning & Action
        1969 Shakey : Autonomous mobile robot
        1997 Deep Blue : Game tree search
        2016 AlphaGo : Deep RL decision-making
        2023 Voyager : Open-world exploration
    section Autonomous Agents
        2023 AutoGPT : Autonomous task execution
        2023 BabyAGI : Task-driven agent
        2024 Devin : Autonomous software engineer
        2025 Operator : Browser agent
        2025 Claude Code : Programming agent
```
1. ELIZA (1966) -- The Illusion of Conversation
Developer: Joseph Weizenbaum, MIT
Technical Approach: A dialogue system based on pattern matching and substitution rules, simulating a Rogerian psychotherapist.
Breakthrough Significance:
- First demonstrated that humans project emotions onto machine conversations (the ELIZA effect)
- Revealed the philosophical gap between "understanding" and "appearing to understand"
- Pioneered human-computer dialogue research
Core Mechanism:
User Input → Keyword Matching → Template Filling → Response Output
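The keyword-matching and template-filling pipeline can be illustrated with a minimal sketch. The rule table, the pronoun reflections, and the default reply below are hypothetical examples and are not taken from Weizenbaum's original script.

```python
import re

# Hypothetical rule table: (keyword pattern, response template).
RULES = [
    (re.compile(r"\bi need (.+)", re.IGNORECASE), "Why do you need {0}?"),
    (re.compile(r"\bi am (.+)", re.IGNORECASE), "How long have you been {0}?"),
    (re.compile(r"\bmy (mother|father)\b", re.IGNORECASE), "Tell me more about your {0}."),
]

# Pronoun swaps applied to the captured fragment before it is echoed back.
REFLECTIONS = {"my": "your", "me": "you", "i": "you", "am": "are", "your": "my"}

DEFAULT_REPLY = "Please go on."


def reflect(fragment: str) -> str:
    """Swap first- and second-person words so the echo reads naturally."""
    words = fragment.lower().split()
    return " ".join(REFLECTIONS.get(word, word) for word in words)


def respond(user_input: str) -> str:
    """Keyword matching followed by template filling."""
    text = user_input.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            return template.format(reflect(match.group(1)))
    return DEFAULT_REPLY  # no keyword matched
```

Nothing in this loop models meaning: the reply is assembled entirely from the surface form of the input, which is the gap between "understanding" and "appearing to understand" noted above.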
Subsequent Impact: The spiritual predecessor of all dialogue systems, from Alexa to ChatGPT.
2. MYCIN (1972) -- The Power of Knowledge
Developer: Edward Shortliffe, Stanford
Technical Approach: A medical diagnosis expert system based on ~600 production rules, using Certainty Factors to handle uncertainty.
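Certainty Factor Computation:
\[
CF(h, e) = MB(h, e) - MD(h, e)
\]
where \(MB\) is the Measure of Belief and \(MD\) is the Measure of Disbelief in hypothesis \(h\) given evidence \(e\). Both lie in \([0, 1]\), so \(CF\) ranges from \(-1\) (certainly false) to \(+1\) (certainly true).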
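As a rough illustration, the sketch below computes a certainty factor as the classic difference between belief and disbelief, and then combines the certainty factors that two rules assign to the same hypothesis. The combination rule shown is the one commonly given for MYCIN-style systems (the mixed-sign case follows the later EMYCIN formulation); the function names are assumptions made for this example.

```python
def certainty_factor(mb: float, md: float) -> float:
    """CF = MB - MD, with MB and MD each in [0, 1], so CF lies in [-1, 1]."""
    if not (0.0 <= mb <= 1.0 and 0.0 <= md <= 1.0):
        raise ValueError("MB and MD must lie in [0, 1]")
    return mb - md


def combine(cf1: float, cf2: float) -> float:
    """Combine the certainty factors two rules assign to the same hypothesis."""
    if cf1 >= 0 and cf2 >= 0:
        # Two supporting rules: the second closes part of the remaining gap to 1.
        return cf1 + cf2 * (1 - cf1)
    if cf1 < 0 and cf2 < 0:
        # Two opposing rules: mirror image of the supporting case.
        return cf1 + cf2 * (1 + cf1)
    # Conflicting evidence: undefined when one factor is +1 and the other -1.
    denominator = 1 - min(abs(cf1), abs(cf2))
    if denominator == 0:
        raise ValueError("cannot combine total belief with total disbelief")
    return (cf1 + cf2) / denominator
```

For example, two rules supporting the same diagnosis with factors 0.6 and 0.4 combine to 0.76: agreement raises confidence without ever exceeding 1.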
Breakthrough Significance:
- One of the first AI systems shown to perform at a level comparable to human experts in a specific domain
- Supported the view that domain knowledge (rather than general-purpose reasoning algorithms) is the key to expert-level performance
- Introduced a practical framework for reasoning under uncertainty
3. Shakey (1969) -- Unifying Thought and Action
Developer: SRI International
Technical Approach: A mobile robot integrating visual perception, symbolic reasoning (STRIPS planner), and motor control.
Breakthrough Significance:
- First physical agent to integrate perception-planning-execution into a complete loop
- The STRIPS planning representation remains foundational in planning research today
- Demonstrated that symbolic reasoning can drive physical actions
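The STRIPS representation describes each action by its preconditions, an add list, and a delete list over a set of facts. The sketch below shows that representation with a plain breadth-first forward search standing in for the original planner's means-ends analysis. The two-room world and all operator names are hypothetical.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StripsOperator:
    name: str
    preconditions: frozenset  # facts that must hold before the action
    add_list: frozenset       # facts the action makes true
    delete_list: frozenset    # facts the action makes false

    def applicable(self, state: frozenset) -> bool:
        return self.preconditions <= state

    def apply(self, state: frozenset) -> frozenset:
        return (state - self.delete_list) | self.add_list


def plan(initial: frozenset, goal: frozenset, operators: list) -> Optional[list]:
    """Breadth-first forward search; returns operator names or None if no plan exists."""
    frontier = deque([(initial, [])])
    visited = {initial}
    while frontier:
        state, steps = frontier.popleft()
        if goal <= state:
            return steps
        for op in operators:
            if op.applicable(state):
                successor = op.apply(state)
                if successor not in visited:
                    visited.add(successor)
                    frontier.append((successor, steps + [op.name]))
    return None


# Hypothetical two-room world, loosely in the spirit of box-pushing tasks.
OPERATORS = [
    StripsOperator("go(A,B)", frozenset({"robot_in_A"}),
                   frozenset({"robot_in_B"}), frozenset({"robot_in_A"})),
    StripsOperator("go(B,A)", frozenset({"robot_in_B"}),
                   frozenset({"robot_in_A"}), frozenset({"robot_in_B"})),
    StripsOperator("push_box(A,B)", frozenset({"robot_in_A", "box_in_A"}),
                   frozenset({"robot_in_B", "box_in_B"}),
                   frozenset({"robot_in_A", "box_in_A"})),
]
```

Starting with the robot in room B and the box in room A, a goal of `box_in_B` yields a two-step plan: walk to A, then push the box to B.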
4. BDI Model (1987) -- Formalizing Rationality
Proposer: Michael Bratman, Stanford
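Technical Approach: A formal model of rational agents grounded in the philosophy of intention and practical reasoning.
Formal Representation: the agent's mental state is modeled as a triple \(\langle B, D, I \rangle\), where: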
- \(B\): Belief set (information about world states)
- \(D\): Desire set (goals the agent hopes to achieve)
- \(I\): Intention set (committed plans of action)
Breakthrough Significance:
- Provided a rigorous philosophical and computational foundation for agent "mental states"
- Gave rise to practical systems such as PRS, AgentSpeak(L), and Jason
- The BDI loop remains a core pattern in agent design today
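The BDI loop (perceive, deliberate, act) can be sketched in a few lines. This is a deliberately simplified illustration and not the control loop of PRS or Jason: desires are modeled as a hypothetical mapping from a goal to the single belief that must hold before the agent commits to it, and "executing" an intention simply makes the goal true.

```python
from dataclasses import dataclass, field


@dataclass
class BDIAgent:
    beliefs: set = field(default_factory=set)       # B: what the agent takes to be true
    desires: dict = field(default_factory=dict)     # D: goal -> enabling belief
    intentions: list = field(default_factory=list)  # I: goals the agent has committed to

    def perceive(self, percepts: set) -> None:
        """Belief revision, simplified here to adding new facts."""
        self.beliefs |= percepts

    def deliberate(self) -> None:
        """Commit to desires that are currently achievable and not yet achieved."""
        for goal, required_belief in self.desires.items():
            achievable = required_belief in self.beliefs
            pending = goal not in self.beliefs and goal not in self.intentions
            if achievable and pending:
                self.intentions.append(goal)

    def act(self) -> list:
        """Execute committed intentions and record their effects as beliefs."""
        done = list(self.intentions)
        self.beliefs |= set(done)
        self.intentions.clear()
        return done

    def step(self, percepts: set) -> list:
        """One pass through the perceive-deliberate-act cycle."""
        self.perceive(percepts)
        self.deliberate()
        return self.act()
```

An agent that desires `door_open` but lacks the belief `has_key` does nothing; once `has_key` is perceived, the desire becomes an intention and is carried out on that step.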
Cross-Reference
For the complete formalization and implementation of the BDI model, see BDI Model.
5. SOAR (1991) -- General Cognitive Architecture
Developers: John Laird, Allen Newell, Paul Rosenbloom
Technical Approach: A general cognitive architecture based on problem-space search, with learning through chunking.
Breakthrough Significance:
- One of the first cognitive architectures explicitly aimed at general intelligence
- Unified problem solving, learning, and knowledge representation
- Has been in continuous development for more than three decades, influencing much subsequent cognitive architecture research
6. Deep Blue (1997) -- The Pinnacle of Search
Developer: IBM
Technical Approach: Specialized hardware + Alpha-Beta pruning + opening book + endgame database, evaluating 200 million chess positions per second.
Breakthrough Significance:
- First computer system to defeat a reigning world chess champion (Garry Kasparov) in a match under standard tournament time controls
- Demonstrated the power of brute-force search combined with domain knowledge
- Critics argued this was not "true intelligence"
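Alpha-Beta pruning, the search technique named above, can be shown on a toy game tree. This is a textbook sketch and not Deep Blue's search, which ran on custom hardware with many extensions. The tree encoding (a leaf is a number, an internal node is a list of children) is an assumption made for the example.

```python
import math


def alphabeta(node, maximizing: bool,
              alpha: float = -math.inf, beta: float = math.inf) -> float:
    """Minimax value of `node` with alpha-beta pruning.

    A node is either a number (a leaf's static evaluation) or a list of child nodes.
    """
    if not isinstance(node, list):
        return node
    if maximizing:
        best = -math.inf
        for child in node:
            best = max(best, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, best)
            if alpha >= beta:
                break  # the minimizing parent already has a better option elsewhere
        return best
    best = math.inf
    for child in node:
        best = min(best, alphabeta(child, True, alpha, beta))
        beta = min(beta, best)
        if alpha >= beta:
            break  # the maximizing parent already has a better option elsewhere
    return best


# Toy depth-2 tree: the root maximizes, its three children minimize.
TOY_TREE = [[3, 5], [2, 9], [0, 1]]
```

On `TOY_TREE` the root value is 3. After the first subtree establishes that the maximizer can secure 3, the leaves 9 and 1 are never evaluated, because the subtrees containing them are already known to be worth at most 2 and 0.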
7. AlphaGo (2016) -- Learning Surpasses Knowledge
Developer: DeepMind
Technical Approach: Deep neural networks (policy network + value network) + Monte Carlo Tree Search (MCTS) + self-play reinforcement learning.
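MCTS Action Selection:
\[
a_t = \arg\max_a \big( Q(s_t, a) + u(s_t, a) \big), \qquad u(s, a) = c_{\text{puct}} \, P(s, a) \, \frac{\sqrt{\sum_b N(s, b)}}{1 + N(s, a)}
\]
where \(Q\) is the mean action value, \(P\) is the prior probability from the policy network, \(N\) is the visit count, and \(c_{\text{puct}}\) controls the amount of exploration.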
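The selection step described in Silver et al. (2016) picks, at each tree node, the action that maximizes the mean value \(Q(s, a)\) plus an exploration bonus proportional to the policy prior and decaying with the visit count. The sketch below covers only that selection rule; the `Edge` class and the default `c_puct` value are assumptions made for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class Edge:
    prior: float              # P(s, a) from the policy network
    visits: int = 0           # N(s, a)
    total_value: float = 0.0  # sum of values backed up through this edge

    @property
    def q(self) -> float:
        """Mean action value Q(s, a); zero for unvisited edges."""
        return self.total_value / self.visits if self.visits else 0.0


def select_action(edges: dict, c_puct: float = 1.0) -> str:
    """Return the action maximizing Q(s, a) + u(s, a)."""
    total_visits = sum(edge.visits for edge in edges.values())

    def score(edge: Edge) -> float:
        exploration = c_puct * edge.prior * math.sqrt(total_visits) / (1 + edge.visits)
        return edge.q + exploration

    return max(edges, key=lambda action: score(edges[action]))
```

An unvisited move with a reasonable prior can outrank a well-explored move with a higher mean value, which is how the search balances exploitation against exploration.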
Breakthrough Significance:
- Achieved superhuman performance in Go (roughly \(10^{170}\) legal positions), a result many researchers had expected to be at least a decade away
- Showed that deep RL combined with tree search scales to decision problems where exhaustive search is infeasible
- AlphaGo Zero and AlphaZero (2017) further showed that pure self-play, with no human game data, can reach superhuman level
8. GPT-3 (2020) -- Language as Intelligence
Developer: OpenAI
Technical Approach: A 175B-parameter autoregressive language model demonstrating powerful few-shot learning capabilities.
Breakthrough Significance:
- Showed that sufficiently large language models can perform diverse tasks from a handful of in-prompt examples, without fine-tuning
- In-context learning ushered in the prompt engineering era
- Laid the foundation for subsequent LLM agents (ChatGPT, AutoGPT)
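Few-shot, in-context learning amounts to placing solved examples in the prompt ahead of the actual query, with no weight updates. The helper below sketches one way to assemble such a prompt; the `Input:`/`Output:` layout and the function name are assumptions, not a format prescribed by the GPT-3 paper.

```python
def build_few_shot_prompt(instruction: str, examples: list, query: str) -> str:
    """Assemble an instruction, solved examples, and an open query into one prompt."""
    lines = [instruction, ""]
    for source, target in examples:
        lines.append(f"Input: {source}")
        lines.append(f"Output: {target}")
        lines.append("")
    # The final block is left open so the model completes the answer.
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)


prompt = build_few_shot_prompt(
    "Translate English to French.",
    [("cheese", "fromage"), ("sea otter", "loutre de mer")],
    "peppermint",
)
```

The resulting string is sent to the model as-is; the examples steer the completion toward the task.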
9. AutoGPT / BabyAGI (March 2023) -- The Awakening of Autonomy
AutoGPT
Developer: Significant Gravitas (Toran Bruce Richards)
Technical Approach: GPT-4 + task decomposition + memory storage + web access + self-prompting loop.
BabyAGI
Developer: Yohei Nakajima
Technical Approach: GPT-4 + task creation/prioritization/execution loop + vector database memory.
Shared Breakthroughs:
- Among the first widely seen demonstrations of LLMs autonomously decomposing and executing complex tasks
- Triggered worldwide attention and investment in autonomous agents
- Although limited in practical reliability, they defined the direction for subsequent research
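The task creation and execution loop shared by these systems can be sketched as follows. This is a simplified illustration and not either project's actual code: prioritization is reduced to first-in-first-out ordering, memory is a plain list in place of a vector database, and `stub_llm` is a hypothetical deterministic stand-in for a model call.

```python
from collections import deque
from typing import Callable


def run_task_loop(objective: str, first_task: str,
                  llm: Callable[[str], str], max_steps: int = 5) -> list:
    """Execute tasks, ask the model for follow-ups, and queue them until done."""
    queue = deque([first_task])
    memory = []  # (task, result) pairs
    while queue and len(memory) < max_steps:
        task = queue.popleft()
        result = llm(f"Objective: {objective}\nTask: {task}\nComplete the task.")
        memory.append((task, result))
        proposal = llm(
            f"Objective: {objective}\nLast result: {result}\nList new tasks, one per line."
        )
        seen = {done for done, _ in memory} | set(queue)
        for line in proposal.splitlines():
            new_task = line.strip()
            if new_task and new_task not in seen:  # skip blanks and duplicates
                queue.append(new_task)
                seen.add(new_task)
    return memory


def stub_llm(prompt: str) -> str:
    """Deterministic stand-in for a model call, used only to make the sketch runnable."""
    if "List new tasks" in prompt:
        return "summarize findings" if "search" in prompt else ""
    return "result for " + prompt.splitlines()[1].removeprefix("Task: ")


history = run_task_loop("write report", "search the web", stub_llm)
```

The `max_steps` cap matters in practice: without a budget, a loop that keeps generating follow-up tasks may never terminate, one of the reliability problems noted above.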
10. Voyager (2023) -- Lifelong Learning in Open Worlds
Developers: NVIDIA, Caltech, UT Austin, Stanford, and collaborators
Technical Approach: In Minecraft, a GPT-4-driven agent achieves skill acquisition, skill library accumulation, and automatic curriculum design through code generation.
Breakthrough Significance:
- Presented by its authors as the first LLM-powered embodied lifelong learning agent in Minecraft
- The Skill Library mechanism enables knowledge accumulation and reuse
- Automatic Curriculum enables autonomous exploration from simple to complex
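A skill library of this kind stores verified programs together with a description and retrieves the most relevant ones for a new task. The sketch below is a simplified illustration: retrieval uses word overlap as a stand-in for embedding similarity, and the class and skill names are hypothetical.

```python
class SkillLibrary:
    """Stores verified skills as source code, keyed by a text description."""

    def __init__(self) -> None:
        self._skills = {}  # name -> (description, source code)

    def add(self, name: str, description: str, code: str) -> None:
        """Register a skill after it has been verified in the environment."""
        self._skills[name] = (description, code)

    def retrieve(self, task: str, top_k: int = 1) -> list:
        """Rank skills by word overlap with the task description."""
        task_words = set(task.lower().split())
        scored = []
        for name, (description, _) in self._skills.items():
            overlap = len(task_words & set(description.lower().split()))
            if overlap:
                scored.append((overlap, name))
        # Highest overlap first; ties broken alphabetically for determinism.
        scored.sort(key=lambda item: (-item[0], item[1]))
        return [name for _, name in scored[:top_k]]

    def source(self, name: str) -> str:
        return self._skills[name][1]


library = SkillLibrary()
library.add("mineWood", "mine a wood log from a tree", "// skill body omitted")
library.add("craftTable", "craft a crafting table from wood planks", "// skill body omitted")
```

Retrieved skills are placed in the prompt for the next code-generation step, so later tasks can build on earlier ones without relearning them.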
11. Claude Code (2025) -- A Reliable Programming Partner
Developer: Anthropic
Technical Approach: Claude model + file system operations + command-line execution + multi-turn interaction + Human-in-the-Loop.
Breakthrough Significance:
- One of the first production-grade programming agents
- Demonstrated that LLM agents can work reliably on real software engineering tasks
- Human-in-the-Loop pattern balances autonomy with safety
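The Human-in-the-Loop pattern can be sketched as an approval gate in front of command execution. This is a generic illustration and not Claude Code's actual permission system: the allowlist, the function names, and the return strings are all assumptions.

```python
import shlex
import subprocess
from typing import Callable

# Hypothetical allowlist of read-only commands that may run without confirmation.
SAFE_COMMANDS = {"ls", "cat", "pwd"}


def run_subprocess(argv: list) -> str:
    """Default executor: run the command and return its standard output."""
    completed = subprocess.run(argv, capture_output=True, text=True, check=False)
    return completed.stdout


def run_with_approval(command: str,
                      approve: Callable[[str], bool],
                      execute: Callable[[list], str] = run_subprocess) -> str:
    """Run a command, asking `approve` first unless it is on the allowlist."""
    argv = shlex.split(command)
    if not argv:
        return "rejected: empty command"
    if argv[0] not in SAFE_COMMANDS and not approve(command):
        return "rejected: user declined"
    return execute(argv)
```

In an interactive tool, `approve` would prompt the user; passing it in as a callable keeps the gate testable and makes the trust boundary explicit.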
12. Devin (2024) -- Autonomous Software Engineer
Developer: Cognition AI
Technical Approach: A full-stack autonomous programming agent integrating IDE, browser, and terminal.
Breakthrough Significance:
- Marketed by Cognition as the first "AI software engineer"
- At launch, Cognition reported SWE-bench results well above earlier published baselines
- Sparked widespread discussion about whether AI will replace programmers
13. OpenAI Operator (January 2025) -- Autonomy in the Browser
Developer: OpenAI
Technical Approach: A browser agent powered by OpenAI's Computer-Using Agent (CUA) model, which builds on GPT-4o's vision capabilities; it can autonomously browse web pages, fill in forms, and complete shopping tasks.
Breakthrough Significance:
- One of the first widely available commercial browser agents
- Extended agents from "conversation" to "operating interfaces on behalf of humans"
- Helped popularize the Computer Use paradigm
Comparative Analysis of Milestones
| Milestone | Knowledge Source | Reasoning Method | Action Space | Learning Ability | Generality |
|---|---|---|---|---|---|
| ELIZA | Manual rules | Pattern matching | Text replies | None | Single domain |
| MYCIN | Expert knowledge | Backward chaining | Diagnostic suggestions | None | Single domain |
| Shakey | Manual model | STRIPS planning | Physical movement | None | Restricted env. |
| SOAR | Production rules | Problem-space search | Symbolic operations | Chunking | Multi-domain |
| Deep Blue | Evaluation function | Alpha-Beta search | Chess moves | None | Single domain |
| AlphaGo | Self-play | MCTS + NN | Go moves | Deep RL | Board games |
| GPT-3 | Pre-training corpora | Autoregressive generation | Text | In-context | Multi-domain |
| AutoGPT | Pre-training + tools | CoT + reflection | Text + tools | Memory accumulation | Multi-domain |
| Voyager | Pre-training + code | CoT + code generation | Minecraft | Skill library | Game world |
| Claude Code | Pre-training + tools | Multi-step reasoning | Code + files + CLI | In-context learning | Software eng. |
| Operator | Pre-training + browser | Vision + reasoning | Web operations | In-context learning | Web tasks |
Patterns of Development
1. Capability Leap Pattern
Every major breakthrough follows a similar pattern:
```mermaid
graph LR
    A[Theory Proposed] --> B[Restricted Prototype]
    B --> C[Domain Validation]
    C --> D[Engineering Optimization]
    D --> E[Large-Scale Deployment]
    E --> F[Inspires New Theory]
    F --> A
```
2. Key Turning Points
- Knowledge Acquisition Bottleneck (1980s): The excessive cost of knowledge engineering in expert systems drove the rise of machine learning
- Symbol Grounding Problem (1990s): Symbolic systems lacked perceptual and motor foundations, driving embodied agent research
- Data-Driven Revolution (2010s): Deep learning demonstrated the power of learning representations from data
- Language as Interface (2020s): LLMs turned natural language into a universal control interface for agents
3. Open Problems
- Long-term planning: Existing agents still struggle with reliable planning beyond dozens of steps
- World models: The gap between LLM "world knowledge" and true causal world models
- Continual learning: How to continuously accumulate new capabilities without forgetting old knowledge
- Safety alignment: How to ensure highly autonomous agents align with human intentions
References
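- Weizenbaum, J. (1966). ELIZA: A Computer Program for the Study of Natural Language Communication Between Man and Machine. Communications of the ACM, 9(1), 36-45.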
- Shortliffe, E.H. (1976). Computer-Based Medical Consultations: MYCIN. Elsevier.
- Nilsson, N.J. (1984). Shakey the Robot. SRI International.
- Bratman, M.E. (1987). Intention, Plans, and Practical Reason. Harvard University Press.
- Laird, J.E. (2012). The Soar Cognitive Architecture. MIT Press.
- Silver, D. et al. (2016). Mastering the Game of Go with Deep Neural Networks and Tree Search. Nature, 529, 484-489.
- Brown, T. et al. (2020). Language Models are Few-Shot Learners. NeurIPS 2020.
- Wang, G. et al. (2023). Voyager: An Open-Ended Embodied Agent with Large Language Models. arXiv:2305.16291.