Architecture Design Patterns

Overview

Agent architecture design patterns are proven organizational schemes for structuring an agent's components and control flow. From the classic Sense-Plan-Act cycle, through Brooks' Subsumption Architecture, to modern LLM agent orchestration patterns, this article surveys the key patterns and the scenarios each one suits.


1. Classic Architecture Patterns

1.1 Sense-Plan-Act (SPA)

The earliest and most intuitive agent architecture pattern:

graph LR
    S[Sense] --> P[Plan]
    P --> A[Act]
    A --> ENV[Environment]
    ENV --> S

Characteristics:

  • Sequential execution: complete sensing first, then complete planning, then execute
  • Assumes the world does not change during planning (static environment assumption)
  • Representatives: Shakey robot, STRIPS planner

Pros: Clear architecture, easy to implement

Cons:

  • Long planning time, unable to respond in real time
  • Static environment assumption often does not hold in reality
  • Lack of feedback between sensing, planning, and execution
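
The cycle can be sketched in a toy 1-D grid world; `sense`, `plan`, and `act` are illustrative stand-ins, and the full action sequence is computed up front, exactly as the static-environment assumption demands:

```python
def sense(world):
    """Read the agent's position and the goal from the environment."""
    return world["pos"], world["goal"]

def plan(pos, goal):
    """Compute the entire action sequence up front (static-world assumption)."""
    step = 1 if goal > pos else -1
    return [step] * abs(goal - pos)

def act(world, actions):
    """Execute the whole plan without re-sensing in between."""
    for a in actions:
        world["pos"] += a

def spa_cycle(world):
    pos, goal = sense(world)       # Sense
    actions = plan(pos, goal)      # Plan
    act(world, actions)            # Act
    return world["pos"]

print(spa_cycle({"pos": 0, "goal": 3}))  # prints 3
```

If the goal moved while `plan` was running, the precomputed actions would miss it; that is precisely the weakness listed under Cons.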

1.2 Subsumption Architecture

A behaviorist architecture proposed by Rodney Brooks (1986) that completely abandons internal representation:

┌──────────────────────────────────┐
│  Layer 3: Explore Behavior       │ ← Higher-level behavior
├──────────────────────────────────┤
│  Layer 2: Wander Behavior        │
├──────────────────────────────────┤
│  Layer 1: Avoid Behavior         │
├──────────────────────────────────┤
│  Layer 0: Move Behavior          │ ← Lower-level behavior
└──────────────────────────────────┘
    ↑ Sensor Input      ↓ Actuator Output

Core Principles:

  1. Layer subsumption: Higher-level behaviors can suppress (subsume) the outputs of lower-level behaviors
  2. No central control: Each layer runs independently, no central planner needed
  3. The world is its own best model: No internal world model required

Pros: Real-time response, strong robustness, suitable for dynamic environments

Cons: Difficult to achieve complex goal-directed behavior, poor scalability
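
A minimal sketch of the suppression mechanism: each layer is an independent rule mapping sensor readings to an action (or `None`), and a higher layer's output subsumes everything below it. The layer names and sensor fields are illustrative:

```python
def avoid(s):
    """Layer 1: fires only when an obstacle is close."""
    return "turn_left" if s["obstacle_dist"] < 1.0 else None

def wander(s):
    """Layer 2 (toy): occasionally changes heading."""
    return "turn_right" if s["bored"] else None

def move(s):
    """Layer 0: default behavior, always fires."""
    return "forward"

# Higher-priority layers listed first; the first non-None output
# suppresses (subsumes) all layers below it. No central planner.
LAYERS = [avoid, wander, move]

def arbitrate(sensors):
    for layer in LAYERS:
        action = layer(sensors)
        if action is not None:
            return action
```

Note there is no world model anywhere: each layer reacts directly to the current sensor readings.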

1.3 Layered Architecture

Combines the advantages of SPA and Subsumption Architecture:

graph TD
    subgraph Three-Layer Architecture
        L3[Deliberative Layer<br/>Long-term planning, goal reasoning]
        L2[Sequencing Layer<br/>Short-term plans, coordination]
        L1[Reactive Layer<br/>Immediate response, emergency behaviors]
    end

    ENV[Environment] --> L1
    ENV --> L2
    ENV --> L3
    L3 --> L2
    L2 --> L1
    L1 --> ENV

Representative Architectures:

| Architecture    | Layers   | Characteristics                                        |
| --------------- | -------- | ------------------------------------------------------ |
| InteRRaP        | 3 layers | Cooperative planning + local planning + behavior layer |
| TouringMachines | 3 layers | Modeling + planning + reactive layers + control framework |
| ATLANTIS        | 3 layers | Advisory + planning + reactive layer                   |

1.4 Blackboard Architecture

Multiple knowledge sources collaborate through a shared data structure (blackboard):

graph LR
    subgraph Knowledge Sources
        KS1[Knowledge Source 1<br/>Speech Recognition]
        KS2[Knowledge Source 2<br/>Syntactic Analysis]
        KS3[Knowledge Source 3<br/>Semantic Understanding]
        KS4[Knowledge Source 4<br/>Contextual Reasoning]
    end

    BB[Blackboard<br/>Shared Data Structure]
    CTRL[Controller<br/>Scheduler]

    KS1 <--> BB
    KS2 <--> BB
    KS3 <--> BB
    KS4 <--> BB
    CTRL --> BB

Characteristics:

  • Each knowledge source runs independently, communicating through the blackboard
  • The controller decides which knowledge source runs and when
  • Suitable for problems requiring collaboration of multiple types of expertise
  • Representative: Hearsay-II speech understanding system
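
The pattern can be sketched with a dictionary as the blackboard: each knowledge source fires when its input appears on the board, and a simple controller loops until no source contributes anything new. The two sources here are toy stand-ins for the speech-understanding stages above:

```python
def transcribe(bb):
    """Knowledge source 1 (toy speech recognition): audio -> words."""
    if "audio" in bb and "words" not in bb:
        bb["words"] = bb["audio"].split("|")

def parse(bb):
    """Knowledge source 2 (toy syntactic analysis): words -> phrase."""
    if "words" in bb and "phrase" not in bb:
        bb["phrase"] = " ".join(bb["words"])

SOURCES = [transcribe, parse]

def control(bb, max_cycles=10):
    """Controller: schedule sources until the board reaches quiescence."""
    for _ in range(max_cycles):
        before = dict(bb)
        for ks in SOURCES:
            ks(bb)               # sources communicate only via the board
        if bb == before:         # no source contributed this cycle: stop
            break
    return bb
```

A real controller would pick which source runs next based on the board's state rather than polling all of them; the quiescence check stands in for that scheduling decision.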

2. Modern LLM Agent Architecture Patterns

2.1 Augmented LLM

The simplest pattern: LLM + retrieval/tools, no autonomous loop.

graph LR
    U[User] --> LLM[LLM]
    LLM --> |Needs retrieval| RAG[Retrieval System]
    RAG --> LLM
    LLM --> |Needs tool| TOOL[Tool]
    TOOL --> LLM
    LLM --> R[Response]

Applicable Scenario: Simple Q&A, single-step tool calls

Example: Chatbot with search capability
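
A minimal sketch of the single-pass structure, with `fake_llm`, `retrieve`, and `calculator` as stubs standing in for a real model, a retriever, and a tool (a real system would put the retrieved context into the prompt):

```python
def retrieve(query):
    """Stub retriever: returns a canned document."""
    return "Paris is the capital of France."

def calculator(expr):
    """Toy tool. eval() is for illustration only; never eval untrusted input."""
    return str(eval(expr))

def fake_llm(prompt):
    """Stand-in for the model: either requests a tool or answers directly."""
    if "2 + 2" in prompt:
        return "TOOL:calculator:2 + 2"
    return "ANSWER:" + retrieve(prompt)

def answer(question):
    out = fake_llm(question)
    if out.startswith("TOOL:"):
        _, name, arg = out.split(":", 2)
        return calculator(arg)            # one tool call, then done: no loop
    return out.removeprefix("ANSWER:")
```

The defining property is that control returns to the user after at most one retrieval or tool call; there is no iteration.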

2.2 ReAct Loop

An iterative loop of Think-Act-Observe:

graph TD
    START[Task] --> THINK[Thought<br/>Analyze current situation]
    THINK --> ACT[Action<br/>Select and execute action]
    ACT --> OBS[Observation<br/>Observe execution result]
    OBS --> CHECK{Task complete?}
    CHECK -->|No| THINK
    CHECK -->|Yes| END[Final Answer]

Applicable Scenario: Tasks requiring multi-step reasoning and tool use

Key Design Decisions:

  • Format of thoughts (free text vs. structured)
  • Definition of the action space
  • Termination conditions
  • Maximum iteration count
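
The loop skeleton below makes these decisions concrete; `policy` is a scripted stand-in for an LLM call that would see the growing Thought/Action/Observation trace:

```python
def policy(trace):
    """Stub for the LLM: returns (thought, action, argument)."""
    if not trace:
        return ("Need the file list first", "list_files", "")
    return ("I have enough information", "finish", "2 files found")

def list_files(_arg):
    """Toy tool from the action space."""
    return "a.txt, b.txt"

TOOLS = {"list_files": list_files}          # the action space

def react(task, max_steps=5):               # maximum iteration count
    trace = []
    for _ in range(max_steps):
        thought, action, arg = policy(trace)
        if action == "finish":              # termination condition
            return arg
        observation = TOOLS[action](arg)    # Act, then Observe
        trace.append((thought, action, observation))
    return "max steps reached"
```

Each design decision from the list above appears explicitly: the trace format, the `TOOLS` action space, the `finish` termination condition, and the `max_steps` cap.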

2.3 Plan-and-Execute

First create a complete plan, then execute step by step:

graph TD
    TASK[Task] --> PLAN[Planner LLM<br/>Generate step list]
    PLAN --> S1[Step 1]
    S1 --> EXEC1[Executor LLM<br/>Execute Step 1]
    EXEC1 --> S2[Step 2]
    S2 --> EXEC2[Executor LLM<br/>Execute Step 2]
    EXEC2 -.->|...| SN[Step N]
    SN --> EXECN[Executor LLM<br/>Execute Step N]
    EXECN --> CHECK{Need re-planning?}
    CHECK -->|Yes| PLAN
    CHECK -->|No| DONE[Complete]

Applicable Scenario: Complex tasks requiring clear step decomposition

Advantages:

  • Use small models for execution and large models for planning, saving cost
  • Steps are visible, easy to debug and monitor
  • Supports dynamic re-planning
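
The control flow can be sketched as follows; `planner`, `executor`, and `needs_replan` are stubs in place of the planner LLM, the (cheaper) executor LLM, and a real failure check:

```python
def planner(task):
    """Stub planner LLM: returns an ordered step list."""
    return ["fetch data", "summarize data"]

def executor(step):
    """Stub executor LLM: runs one step."""
    return f"done: {step}"

def needs_replan(results):
    """Stub check; a real one would inspect failures in the results."""
    return False

def plan_and_execute(task, max_replans=2):
    results = []
    for _ in range(max_replans + 1):        # bounded re-planning
        steps = planner(task)
        results = [executor(s) for s in steps]   # steps visible: easy to log
        if not needs_replan(results):
            break
    return results
```

Because the step list is an explicit value, every advantage above falls out directly: steps can be logged, routed to a smaller model, or fed back into `planner` for re-planning.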

2.4 Router

Routes to different processing flows based on input:

graph TD
    INPUT[User Input] --> ROUTER[Router LLM<br/>Classification]
    ROUTER -->|Code question| CODE[Code Assistant]
    ROUTER -->|Data analysis| DATA[Data Analyst]
    ROUTER -->|Document writing| WRITE[Writing Assistant]
    ROUTER -->|Simple Q&A| QA[Direct Answer]

Applicable Scenario: Multiple input types requiring different processing strategies
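
A sketch of the dispatch logic, using a keyword classifier in place of the router LLM; each handler could itself be a full agent (the handlers here are illustrative lambdas):

```python
def classify(text):
    """Stand-in for the router LLM's classification step."""
    if "def " in text or "error" in text:
        return "code"
    if "chart" in text or "csv" in text:
        return "data"
    return "qa"

HANDLERS = {
    "code": lambda t: "code assistant: " + t,
    "data": lambda t: "data analyst: " + t,
    "qa":   lambda t: "direct answer: " + t,
}

def route(text):
    return HANDLERS[classify(text)](text)
```

The router itself stays cheap and fast; all heavy lifting happens in the selected handler.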

2.5 Orchestrator-Worker

A central orchestrator dynamically assigns tasks to specialized workers:

graph TD
    TASK[Complex Task] --> ORCH[Orchestrator LLM<br/>Decompose + Assign + Aggregate]
    ORCH --> W1[Worker 1<br/>Subtask A]
    ORCH --> W2[Worker 2<br/>Subtask B]
    ORCH --> W3[Worker 3<br/>Subtask C]
    W1 --> ORCH
    W2 --> ORCH
    W3 --> ORCH
    ORCH --> RESULT[Aggregated Result]

Applicable Scenario: Complex tasks requiring collaboration of multiple capabilities

Distinction from multi-agent: The orchestrator is a fixed control center; workers have no autonomous decision-making capability.
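
This division of labor can be sketched directly: `decompose` stands in for the orchestrator LLM, and the workers are fixed functions with no decision-making of their own:

```python
def decompose(task):
    """Stub orchestrator: split a task into (worker_kind, payload) pairs."""
    return [("research", task), ("draft", task)]

# Workers are passive specialists: they execute what they are given
# and make no autonomous decisions.
WORKERS = {
    "research": lambda t: f"facts about {t}",
    "draft":    lambda t: f"draft of {t}",
}

def orchestrate(task):
    subtasks = decompose(task)                              # Decompose
    partials = [WORKERS[k](payload) for k, payload in subtasks]  # Assign
    return " | ".join(partials)                             # Aggregate
```

Decompose, assign, and aggregate all live in the orchestrator, which is what keeps this a single-agent pattern rather than a multi-agent one.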

2.6 Evaluator-Optimizer

Generator and evaluator work alternately:

graph TD
    TASK[Task] --> GEN[Generator LLM<br/>Generate initial solution]
    GEN --> EVAL[Evaluator LLM<br/>Evaluate and provide feedback]
    EVAL --> CHECK{Meets criteria?}
    CHECK -->|No| REFINE[Optimizer LLM<br/>Improve based on feedback]
    REFINE --> EVAL
    CHECK -->|Yes| DONE[Output final solution]

Applicable Scenario: Generation tasks with clear quality criteria (code, copy, translation, etc.)
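
A sketch of the feedback loop; `generate`, `evaluate`, and `refine` are toy stand-ins for the three LLM roles, with solution length as a deliberately trivial quality score:

```python
def generate(task):
    """Stub generator LLM: produce an initial solution."""
    return "draft"

def evaluate(solution):
    """Stub evaluator: toy score where longer is 'better'."""
    return len(solution)

def refine(solution, score):
    """Stub optimizer: improve the solution using the feedback."""
    return solution + "!"

def evaluator_optimizer(task, threshold=8, max_rounds=10):
    solution = generate(task)
    for _ in range(max_rounds):             # cap rounds to bound cost
        score = evaluate(solution)
        if score >= threshold:              # meets criteria: stop
            return solution
        solution = refine(solution, score)
    return solution
```

The pattern only works as well as `evaluate`: a vague evaluator yields vague feedback, which is why the applicable scenarios are tasks with clear quality criteria.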


3. Architecture Pattern Comparison and Selection

3.1 Complexity vs. Capability Matrix

| Pattern             | Implementation Complexity | Task Complexity   | LLM Calls | Latency     | Controllability |
| ------------------- | ------------------------- | ----------------- | --------- | ----------- | --------------- |
| Augmented LLM       | Low                       | Low               | 1         | Low         | High            |
| ReAct Loop          | Medium                    | Medium            | 3-10      | Medium      | Medium          |
| Plan-Execute        | Medium                    | Medium-High       | 5-20      | Medium-High | High            |
| Router              | Low                       | Multi-type        | 2+        | Low         | High            |
| Orchestrator-Worker | High                      | High              | 10-50     | High        | Medium          |
| Evaluator-Optimizer | Medium                    | Quality-sensitive | 3-10      | Medium      | High            |
| Autonomous Loop     | High                      | Open-ended        | 10-100+   | High        | Low             |

3.2 Selection Decision Tree

graph TD
    START[Your Task] --> Q1{Completable in one step?}
    Q1 -->|Yes| AUG[Augmented LLM]
    Q1 -->|No| Q2{Steps predefinable?}
    Q2 -->|Yes| PE[Plan-Execute]
    Q2 -->|No| Q3{Requires multiple capabilities?}
    Q3 -->|Yes| Q4{Subtasks independent?}
    Q4 -->|Yes| OW[Orchestrator-Worker]
    Q4 -->|No| REACT[ReAct Loop]
    Q3 -->|No| Q5{Requires high-quality output?}
    Q5 -->|Yes| EO[Evaluator-Optimizer]
    Q5 -->|No| REACT
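
The decision tree above can be encoded as a plain function over boolean task properties (the parameter names are illustrative):

```python
def choose_pattern(one_step, steps_known, multi_capability,
                   subtasks_independent=False, quality_critical=False):
    """Walk the selection decision tree and return a pattern name."""
    if one_step:
        return "Augmented LLM"
    if steps_known:
        return "Plan-Execute"
    if multi_capability:
        # Independent subtasks can be farmed out; interdependent ones
        # need the step-by-step feedback of a ReAct loop.
        return "Orchestrator-Worker" if subtasks_independent else "ReAct Loop"
    return "Evaluator-Optimizer" if quality_critical else "ReAct Loop"
```

Encoding the choice as code rather than an LLM judgment is itself an instance of the "prefer determinism" principle in the next section.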

4. Design Principles

4.1 Anthropic's Design Principles

Summarized from Building Effective Agents:

  1. Keep it simple: Don't use complex patterns when simple ones suffice
  2. Prefer determinism: Use code logic instead of LLM judgment when possible
  3. Explicit state management: Make agent state observable and debuggable
  4. Graceful failure: Design proper error handling and fallback mechanisms
  5. Human fallback: Have humans confirm at critical decision points

4.2 Engineering Practice Principles

| Principle             | Description                                                                    |
| --------------------- | ------------------------------------------------------------------------------ |
| Least privilege       | Give tools and actions only the necessary permissions                          |
| Idempotent operations | The same operation should produce the same result when executed multiple times |
| Observability         | Every step should have logging and tracing                                     |
| Timeout mechanisms    | Set reasonable timeouts for each operation                                     |
| Cost control          | Limit maximum LLM calls and token consumption                                  |
| Progressive autonomy  | Start with low autonomy, gradually expand                                      |

5. Architecture Evolution Trends

graph LR
    A[Monolithic LLM<br/>2022] --> B[ReAct Loop<br/>Early 2023]
    B --> C[Plan-Execute<br/>Mid 2023]
    C --> D[Multi-Agent Orchestration<br/>Late 2023]
    D --> E[Adaptive Architecture<br/>2024-2025]
    E --> F[Self-Evolving Architecture<br/>Future]

Trend Observations:

  1. From fixed to adaptive: Architectures are no longer predefined but dynamically adjust based on tasks
  2. From single-model to multi-model: Different tasks use models of different sizes/capabilities
  3. From stateless to stateful: Increasing emphasis on memory and state management
  4. From single-agent to multi-agent: Complex tasks are decomposed among multiple specialized agents
  5. From synchronous to asynchronous: Support for long-running background tasks

References

  1. Brooks, R.A. (1986). A Robust Layered Control System For a Mobile Robot. IEEE JRA, 2(1), 14-23.
  2. Müller, J.P. (1996). The Design of Intelligent Agents. LNCS 1177. Springer.
  3. Nii, H.P. (1986). Blackboard Systems. AI Magazine, 7(2), 38-53.
  4. Anthropic. (2024). Building Effective Agents. anthropic.com.
  5. Yao, S. et al. (2022). ReAct: Synergizing Reasoning and Acting in Language Models. ICLR 2023.
  6. Wang, L. et al. (2024). A Survey on Large Language Model based Autonomous Agents. Frontiers of Computer Science.