Agent Milestones

Overview

This article reviews milestones in agent technology from the 1960s to the present. Each system below broke new ground along one dimension, such as conversation, knowledge, planning, or autonomy. Read together, they trace the evolution of agents from rule-based to learning systems, and from specialized to general-purpose ones.

Milestone Timeline

timeline
    title Agent Technology Milestones
    section Conversation & Understanding
        1966 ELIZA : First conversational program
        1971 SHRDLU : Language understanding in restricted worlds
        2011 Siri : Mainstream voice assistant
        2022 ChatGPT : LLM conversational agent
    section Knowledge & Reasoning
        1972 MYCIN : Medical expert system
        1987 BDI : Rational agent theory
        1991 SOAR : General cognitive architecture
        2023 CoT/ReAct : LLM reasoning + action
    section Planning & Action
        1969 Shakey : Autonomous mobile robot
        1997 Deep Blue : Game tree search
        2016 AlphaGo : Deep RL decision-making
        2023 Voyager : Open-world exploration
    section Autonomous Agents
        2023 AutoGPT : Autonomous task execution
        2023 BabyAGI : Task-driven agent
        2025 Claude Code : Programming agent
        2024 Devin : Autonomous software engineer
        2025 Operator : Browser agent

1. ELIZA (1966) -- The Illusion of Conversation

Developer: Joseph Weizenbaum, MIT

Technical Approach: A dialogue system based on pattern matching and substitution rules, simulating a Rogerian psychotherapist.

Breakthrough Significance:

First demonstrated that humans project emotions onto machine conversations (the ELIZA effect)
Revealed the philosophical gap between "understanding" and "appearing to understand"
Pioneered human-computer dialogue research

Core Mechanism:

User Input → Keyword Matching → Template Filling → Response Output
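The pipeline above can be sketched in a few lines of Python. This is a minimal illustration rather than Weizenbaum's original script: the rules, the pronoun table, and the names `RULES`, `REFLECTIONS`, and `respond` are all hypothetical.

```python
import re

# Hypothetical rules: (keyword pattern, response template). Not the original ELIZA script.
RULES = [
    (re.compile(r"\bi need (.+)", re.IGNORECASE), "Why do you need {0}?"),
    (re.compile(r"\bi am (.+)", re.IGNORECASE), "How long have you been {0}?"),
    (re.compile(r"\bmy (mother|father)\b", re.IGNORECASE), "Tell me more about your {0}."),
]

# Pronoun swaps applied to the captured fragment before it is echoed back.
REFLECTIONS = {"my": "your", "me": "you", "i": "you", "am": "are", "your": "my"}

DEFAULT_RESPONSE = "Please go on."


def reflect(fragment: str) -> str:
    """Swap first- and second-person words so the echo reads naturally."""
    return " ".join(REFLECTIONS.get(word, word) for word in fragment.lower().split())


def respond(user_input: str) -> str:
    """Keyword matching, then template filling, then response output."""
    text = user_input.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            return template.format(*(reflect(group) for group in match.groups()))
    return DEFAULT_RESPONSE  # no keyword matched


if __name__ == "__main__":
    print(respond("I need my coffee."))
```

Nothing in the sketch models meaning: the reply is assembled entirely from the user's own words, which is why the conversation only appears to involve understanding.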

Subsequent Impact: The spiritual predecessor of all dialogue systems, from Alexa to ChatGPT.

2. MYCIN (1972) -- The Power of Knowledge

Developer: Edward Shortliffe, Stanford

Technical Approach: A medical diagnosis expert system based on ~600 production rules, using Certainty Factors to handle uncertainty.

Certainty Factor Computation:

\[ CF(H, E) = MB(H, E) - MD(H, E) \]

where \(MB\) is the Measure of Belief and \(MD\) is the Measure of Disbelief.
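As a small worked example, the definition above can be written as a function. The second function is an assumption that goes beyond the text: it uses the commonly cited rule for combining two supporting certainty factors for the same hypothesis, \(CF_1 + CF_2(1 - CF_1)\). The names `certainty_factor` and `combine_positive` are hypothetical.

```python
def certainty_factor(mb: float, md: float) -> float:
    """CF(H, E) = MB(H, E) - MD(H, E), with MB and MD each in [0, 1]."""
    if not (0.0 <= mb <= 1.0 and 0.0 <= md <= 1.0):
        raise ValueError("MB and MD must lie in [0, 1]")
    return mb - md


def combine_positive(cf1: float, cf2: float) -> float:
    """Combine two supporting CFs for one hypothesis.

    Assumed rule (not stated in the text): CF1 + CF2 * (1 - CF1).
    Only valid when both inputs are in [0, 1].
    """
    if not (0.0 <= cf1 <= 1.0 and 0.0 <= cf2 <= 1.0):
        raise ValueError("this rule only covers two non-negative certainty factors")
    return cf1 + cf2 * (1.0 - cf1)


if __name__ == "__main__":
    print(certainty_factor(0.8, 0.1))   # belief 0.8, disbelief 0.1
    print(combine_positive(0.6, 0.5))   # two rules supporting the same hypothesis
```

Under the assumed combination rule, each additional supporting rule closes part of the remaining gap to 1, so the combined value never exceeds full certainty.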

Breakthrough Significance:

Among the first AI systems to reach expert-level performance in a specific domain
Supported the view that domain knowledge (rather than reasoning algorithms) is the key to intelligent performance
Introduced a practical framework for reasoning under uncertainty

3. Shakey (1969) -- Unifying Thought and Action

Developer: SRI International

Technical Approach: A mobile robot integrating visual perception, symbolic reasoning (STRIPS planner), and motor control.

Breakthrough Significance:

First physical agent to integrate perception-planning-execution into a complete loop
The STRIPS planning representation remains foundational in planning research today
Demonstrated that symbolic reasoning can drive physical actions
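To make the STRIPS representation concrete, the sketch below models an operator as preconditions, an add list, and a delete list, and applies it to a state made of facts. The two-room world and all names (`Operator`, `apply_operator`, `plan`) are hypothetical, and the breadth-first search is a simple stand-in, not the search strategy of the original STRIPS planner.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Operator:
    name: str
    preconditions: frozenset  # facts that must hold before the action
    add_list: frozenset       # facts the action makes true
    delete_list: frozenset    # facts the action makes false


def applicable(state: frozenset, op: Operator) -> bool:
    return op.preconditions <= state


def apply_operator(state: frozenset, op: Operator) -> frozenset:
    if not applicable(state, op):
        raise ValueError(f"{op.name} is not applicable in this state")
    return (state - op.delete_list) | op.add_list


def plan(initial: frozenset, goal: frozenset, operators: list):
    """Breadth-first search over states; returns operator names or None."""
    frontier = deque([(initial, [])])
    visited = {initial}
    while frontier:
        state, steps = frontier.popleft()
        if goal <= state:
            return steps
        for op in operators:
            if applicable(state, op):
                successor = apply_operator(state, op)
                if successor not in visited:
                    visited.add(successor)
                    frontier.append((successor, steps + [op.name]))
    return None


# Hypothetical two-room world: the robot must push a box from room A to room B.
OPERATORS = [
    Operator("go_a_b", frozenset({"robot_in_a"}), frozenset({"robot_in_b"}), frozenset({"robot_in_a"})),
    Operator("go_b_a", frozenset({"robot_in_b"}), frozenset({"robot_in_a"}), frozenset({"robot_in_b"})),
    Operator(
        "push_box_a_b",
        frozenset({"robot_in_a", "box_in_a"}),
        frozenset({"robot_in_b", "box_in_b"}),
        frozenset({"robot_in_a", "box_in_a"}),
    ),
]

if __name__ == "__main__":
    start = frozenset({"robot_in_b", "box_in_a"})
    print(plan(start, frozenset({"box_in_b"}), OPERATORS))
```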

4. BDI Model (1987) -- Formalizing Rationality

Proposer: Michael Bratman, Stanford

Technical Approach: A model of rational agency grounded in Bratman's philosophy of intention and practical reasoning, later formalized computationally by Rao and Georgeff.

Formal Representation:

\[ \text{Agent} = \langle B, D, I, \text{Plan Library} \rangle \]

\(B\): Belief set (information about world states)
\(D\): Desire set (goals the agent hopes to achieve)
\(I\): Intention set (committed plans of action)
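One way to read the tuple is as the state of a simple perceive-deliberate-act cycle. The sketch below is a deliberately reduced illustration, not the semantics of PRS or AgentSpeak(L): the class names, the rule of committing to the first applicable plan, and the assumption that a completed plan achieves its goal are all simplifications introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class Plan:
    goal: str
    context: frozenset  # beliefs that must hold for the plan to be applicable
    steps: list         # action names, executed one per cycle


@dataclass
class BDIAgent:
    beliefs: set = field(default_factory=set)          # B
    desires: list = field(default_factory=list)        # D
    intentions: list = field(default_factory=list)     # I: plans committed to
    plan_library: list = field(default_factory=list)

    def perceive(self, percepts) -> None:
        self.beliefs |= set(percepts)

    def deliberate(self) -> None:
        """Commit to the first unachieved desire that has an applicable plan."""
        for goal in self.desires:
            if goal in self.beliefs or any(i.goal == goal for i in self.intentions):
                continue
            for plan in self.plan_library:
                if plan.goal == goal and plan.context <= self.beliefs:
                    # Copy the steps so the library entry stays reusable.
                    self.intentions.append(Plan(goal, plan.context, list(plan.steps)))
                    return

    def act(self):
        """Execute one step of the current intention; return the action or None."""
        if not self.intentions:
            return None
        intention = self.intentions[0]
        action = intention.steps.pop(0) if intention.steps else None
        if not intention.steps:
            self.intentions.pop(0)
            self.beliefs.add(intention.goal)  # simplification: assume the plan succeeded
        return action

    def step(self, percepts=()):
        self.perceive(percepts)
        self.deliberate()
        return self.act()


if __name__ == "__main__":
    agent = BDIAgent(
        desires=["door_open"],
        plan_library=[Plan("door_open", frozenset({"has_key"}), ["walk_to_door", "unlock_door"])],
    )
    print(agent.step())             # no applicable plan yet
    print(agent.step(["has_key"]))  # belief update makes the plan applicable
```

The key design point is the commitment: once a plan becomes an intention, the agent keeps executing it across cycles instead of re-deliberating from scratch each time.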

Breakthrough Significance:

Provided a rigorous philosophical and computational foundation for agent "mental states"
Gave rise to practical systems such as PRS, AgentSpeak(L), and Jason
The BDI loop remains a core pattern in agent design today

Cross-Reference

For the complete formalization and implementation of the BDI model, see BDI Model.

5. SOAR (1991) -- General Cognitive Architecture

Developers: John Laird, Allen Newell, Paul Rosenbloom

Technical Approach: A general cognitive architecture based on problem-space search, with learning through chunking.

Breakthrough Significance:

One of the first cognitive architectures aimed at general intelligence
Unified problem solving, learning, and knowledge representation
Has been in continuous development for over 30 years and has influenced much subsequent cognitive architecture research

6. Deep Blue (1997) -- The Pinnacle of Search

Developer: IBM

Technical Approach: Specialized hardware + Alpha-Beta pruning + opening book + endgame database, evaluating 200 million chess positions per second.
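The Alpha-Beta pruning component can be illustrated on a toy game tree. This is a textbook sketch under stated assumptions, not Deep Blue's implementation: the nested-list tree, `TREE`, `tree_children`, and `leaf_value` are hypothetical stand-ins for a real move generator and evaluation function.

```python
import math


def alphabeta(node, depth, alpha, beta, maximizing, children, evaluate):
    """Minimax value of `node`, skipping branches that cannot change the result."""
    kids = children(node)
    if depth == 0 or not kids:
        return evaluate(node)
    if maximizing:
        value = -math.inf
        for child in kids:
            value = max(value, alphabeta(child, depth - 1, alpha, beta, False, children, evaluate))
            alpha = max(alpha, value)
            if alpha >= beta:
                break  # the minimizing player will never allow this line
        return value
    value = math.inf
    for child in kids:
        value = min(value, alphabeta(child, depth - 1, alpha, beta, True, children, evaluate))
        beta = min(beta, value)
        if alpha >= beta:
            break  # the maximizing player already has a better option
    return value


# Hypothetical toy tree: inner nodes are lists, leaves are static evaluations.
TREE = [[3, 5], [2, 9], [0, 1]]


def tree_children(node):
    return node if isinstance(node, list) else []


def leaf_value(node):
    return node


if __name__ == "__main__":
    print(alphabeta(TREE, 2, -math.inf, math.inf, True, tree_children, leaf_value))
```

In this tree the leaves 9 and 1 are never evaluated: once the first branch guarantees the maximizer a value of 3, a single reply worth 2 or 0 is enough to discard each remaining branch.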

Breakthrough Significance:

First computer to defeat a reigning world chess champion (Garry Kasparov) in a match under standard tournament time controls
Demonstrated the power of brute-force search combined with domain knowledge
Critics argued this was not "true intelligence"

7. AlphaGo (2016) -- Learning Surpasses Knowledge

Developer: DeepMind

Technical Approach: Deep neural networks (policy network + value network) + Monte Carlo Tree Search (MCTS) + self-play reinforcement learning.

MCTS Action Selection:

\[ a^* = \arg\max_a \left[ Q(s, a) + c_{puct} \cdot P(s, a) \cdot \frac{\sqrt{N(s)}}{1 + N(s, a)} \right] \]
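Here \(Q(s, a)\) is the current value estimate of action \(a\), \(P(s, a)\) is the policy network's prior, \(N(s, a)\) is the visit count of the action, and \(N(s)\) is the visit count of the state. The sketch below evaluates the rule for one node. It assumes \(N(s)\) is the sum of the child visit counts, and the default `c_puct=1.0` and the function name `puct_select` are illustrative choices, not values from the source.

```python
import math


def puct_select(q: dict, p: dict, n: dict, c_puct: float = 1.0):
    """Return the action maximizing Q(s,a) + c_puct * P(s,a) * sqrt(N(s)) / (1 + N(s,a))."""
    total_visits = sum(n.values())  # N(s), assumed to be the sum of child visit counts

    def score(action):
        exploration = c_puct * p[action] * math.sqrt(total_visits) / (1 + n[action])
        return q[action] + exploration

    return max(q, key=score)


if __name__ == "__main__":
    q = {"a": 0.5, "b": 0.4, "c": 0.0}   # value estimates
    p = {"a": 0.2, "b": 0.3, "c": 0.5}   # policy priors
    n = {"a": 10, "b": 5, "c": 0}        # visit counts
    print(puct_select(q, p, n))              # exploration bonus favors the unvisited move
    print(puct_select(q, p, n, c_puct=0.0))  # pure exploitation picks the best Q
```

The second term shrinks as an action accumulates visits, so search effort shifts from moves the policy network likes toward moves whose value estimates hold up.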

Breakthrough Significance:

Achieved superhuman performance in Go (roughly \(10^{170}\) legal positions), a result many researchers had expected to be at least a decade away
Demonstrated superhuman capabilities of deep RL in complex decision tasks
AlphaZero (2017) further proved that pure self-play without human knowledge can achieve superhuman level

8. GPT-3 (2020) -- Language as Intelligence

Developer: OpenAI

Technical Approach: A 175B-parameter autoregressive language model demonstrating powerful few-shot learning capabilities.

Breakthrough Significance:

Showed that sufficiently large language models can perform diverse tasks from a few in-prompt examples, without fine-tuning
In-context learning ushered in the prompt engineering era
Laid the foundation for subsequent LLM agents (ChatGPT, AutoGPT)

9. AutoGPT / BabyAGI (March 2023) -- The Awakening of Autonomy

AutoGPT

Developer: Significant Gravitas (Toran Bruce Richards)

Technical Approach: GPT-4 + task decomposition + memory storage + web access + self-prompting loop.

BabyAGI

Developer: Yohei Nakajima

Technical Approach: GPT-4 + task creation/prioritization/execution loop + vector database memory.
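The task creation, prioritization, and execution loop can be sketched with the model calls replaced by plain functions. This is an illustration of the control flow only, not BabyAGI's code: `run_task_loop`, the three stub functions, and the list used in place of a vector database are all hypothetical.

```python
from collections import deque


def run_task_loop(objective, first_task, execute, create_tasks, prioritize, max_iterations=5):
    """Run the execute -> create -> prioritize cycle; return (task, result) pairs."""
    tasks = deque([first_task])
    memory = []  # stand-in for the vector database memory
    for _ in range(max_iterations):
        if not tasks:
            break
        task = tasks.popleft()
        result = execute(objective, task, memory)   # would be an LLM call
        memory.append((task, result))
        proposed = create_tasks(objective, task, result, list(tasks))  # LLM call
        fresh = [t for t in proposed if t not in tasks]
        tasks.extend(fresh)
        tasks = deque(prioritize(objective, list(tasks)))  # LLM call
    return memory


# Deterministic stubs so the loop can run without any model.
def stub_execute(objective, task, memory):
    return f"done: {task}"


def stub_create_tasks(objective, task, result, pending):
    return ["write summary"] if task == "research topic" else []


def stub_prioritize(objective, pending):
    return sorted(pending)


if __name__ == "__main__":
    for entry in run_task_loop("explain agents", "research topic",
                               stub_execute, stub_create_tasks, stub_prioritize):
        print(entry)
```

The `max_iterations` cap is a design choice worth keeping in any real variant: without it, a model that keeps proposing new tasks would loop indefinitely.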

Shared Breakthroughs:

Among the first widely shared demonstrations that LLMs can autonomously decompose and execute complex tasks
Triggered worldwide attention and investment in autonomous agents
Although limited in practical reliability, they defined the direction for subsequent research

10. Voyager (2023) -- Lifelong Learning in Open Worlds

Developers: NVIDIA + Caltech + UT Austin

Technical Approach: In Minecraft, a GPT-4-driven agent achieves skill acquisition, skill library accumulation, and automatic curriculum design through code generation.

Breakthrough Significance:

Described by its authors as the first LLM-powered embodied lifelong learning agent in Minecraft
The Skill Library mechanism enables knowledge accumulation and reuse
Automatic Curriculum enables autonomous exploration from simple to complex
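The skill library idea, storing generated code under a description and retrieving it for later tasks, can be sketched as follows. This is a simplified illustration, not Voyager's implementation: the class `SkillLibrary` is hypothetical, and word overlap is used here as a crude stand-in for similarity search over descriptions.

```python
class SkillLibrary:
    """Stores generated skill code keyed by name, retrievable by description."""

    def __init__(self):
        self._skills = {}  # name -> (description, source code)

    def add(self, name: str, description: str, code: str) -> None:
        self._skills[name] = (description, code)

    def source(self, name: str) -> str:
        return self._skills[name][1]

    def retrieve(self, query: str, top_k: int = 1) -> list:
        """Return up to top_k skill names ranked by word overlap with the query."""
        query_words = set(query.lower().split())
        scored = []
        for name, (description, _code) in self._skills.items():
            overlap = len(query_words & set(description.lower().split()))
            if overlap:
                scored.append((overlap, name))
        # Highest overlap first; ties broken alphabetically for determinism.
        scored.sort(key=lambda item: (-item[0], item[1]))
        return [name for _, name in scored[:top_k]]


if __name__ == "__main__":
    library = SkillLibrary()
    library.add("craft_pickaxe", "craft a wooden pickaxe from planks and sticks",
                "def craft_pickaxe(bot): ...")
    library.add("mine_stone", "mine stone blocks with a pickaxe",
                "def mine_stone(bot): ...")
    print(library.retrieve("mine some stone"))
```

Because retrieved skills are code, a new task can call earlier skills as building blocks, which is how accumulated knowledge gets reused rather than relearned.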

11. Claude Code (2025) -- A Reliable Programming Partner

Developer: Anthropic

Technical Approach: Claude model + file system operations + command-line execution + multi-turn interaction + Human-in-the-Loop.

Breakthrough Significance:

One of the first production-grade programming agents
Demonstrated that LLM agents can work reliably on real software engineering tasks
Human-in-the-Loop pattern balances autonomy with safety

12. Devin (2024) -- Autonomous Software Engineer

Developer: Cognition AI

Technical Approach: A full-stack autonomous programming agent integrating IDE, browser, and terminal.

Breakthrough Significance:

First commercial product claiming to be an "AI software engineer"
Cognition reported that Devin resolved 13.86% of issues end-to-end on a sampled subset of SWE-bench
Sparked widespread discussion about whether AI will replace programmers

13. OpenAI Operator (January 2025) -- Autonomy in the Browser

Developer: OpenAI

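Technical Approach: A browser agent powered by OpenAI's Computer-Using Agent (CUA) model, which combines GPT-4o's vision capabilities with reasoning trained through reinforcement learning; it browses web pages, fills in forms, and completes shopping tasks autonomously.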

Breakthrough Significance:

One of the first commercial autonomous browser agents
Extended agents from "conversation" to "operating interfaces on behalf of humans"
Helped drive the development of the Computer Use paradigm

Comparative Analysis of Milestones

Milestone	Knowledge Source	Reasoning Method	Action Space	Learning Ability	Generality
ELIZA	Manual rules	Pattern matching	Text replies	None	Single domain
MYCIN	Expert knowledge	Backward chaining	Diagnostic suggestions	None	Single domain
Shakey	Manual model	STRIPS planning	Physical movement	None	Restricted env.
SOAR	Production rules	Problem-space search	Symbolic operations	Chunking	Multi-domain
Deep Blue	Evaluation function	Alpha-Beta search	Chess moves	None	Single domain
AlphaGo	Self-play	MCTS + NN	Go moves	Deep RL	Board games
GPT-3	Pre-training corpora	Autoregressive generation	Text	In-context	Multi-domain
AutoGPT	Pre-training + tools	CoT + reflection	Text + tools	Memory accumulation	Multi-domain
Voyager	Pre-training + code	CoT + code generation	Minecraft	Skill library	Game world
Claude Code	Pre-training + tools	Multi-step reasoning	Code + files + CLI	In-context learning	Software eng.
Operator	Pre-training + browser	Vision + reasoning	Web operations	In-context learning	Web tasks

Patterns of Development

1. Capability Leap Pattern

Every major breakthrough follows a similar pattern:

graph LR
    A[Theory Proposed] --> B[Restricted Prototype]
    B --> C[Domain Validation]
    C --> D[Engineering Optimization]
    D --> E[Large-Scale Deployment]
    E --> F[Inspires New Theory]
    F --> A

2. Key Turning Points

Knowledge Acquisition Bottleneck (1980s): The excessive cost of knowledge engineering in expert systems drove the rise of machine learning
Symbol Grounding Problem (1990s): Symbolic systems lacked perceptual and motor foundations, driving embodied agent research
Data-Driven Revolution (2010s): Deep learning demonstrated the power of learning representations from data
Language as Interface (2020s): LLMs turned natural language into a universal control interface for agents

3. Open Problems

Long-term planning: Existing agents still struggle with reliable planning beyond dozens of steps
World models: LLM "world knowledge" still falls short of a true causal world model
Continual learning: How to continuously accumulate new capabilities without forgetting old knowledge
Safety alignment: How to ensure highly autonomous agents align with human intentions

References

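Weizenbaum, J. (1966). ELIZA: A Computer Program for the Study of Natural Language Communication Between Man and Machine. Communications of the ACM, 9(1), 36-45.
Shortliffe, E.H. (1976). Computer-Based Medical Consultations: MYCIN. Elsevier.
Nilsson, N.J. (1984). Shakey the Robot. SRI International.
Bratman, M.E. (1987). Intention, Plans, and Practical Reason. Harvard University Press.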
Laird, J.E. (2012). The Soar Cognitive Architecture. MIT Press.
Silver, D. et al. (2016). Mastering the Game of Go with Deep Neural Networks and Tree Search. Nature, 529, 484-489.
Brown, T. et al. (2020). Language Models are Few-Shot Learners. NeurIPS 2020.
Wang, G. et al. (2023). Voyager: An Open-Ended Embodied Agent with Large Language Models. arXiv:2305.16291.
