Plan-Execute Frameworks
Overview
Plan-and-Execute is an agent architecture that separates planning (generating a step list) from execution (completing steps one by one). This separation allows different models to handle different task phases and supports dynamic re-planning. This article provides an in-depth analysis of the Plan-and-Execute pattern, the LLMCompiler parallel execution framework, and Hierarchical Task Network (HTN) planning.
1. Plan-and-Execute Pattern
1.1 Core Architecture
graph TD
TASK[User Task] --> PLANNER[Planner LLM]
PLANNER --> PLAN[Step List<br/>Step 1, 2, ..., N]
PLAN --> EXEC1[Executor executes Step 1]
EXEC1 --> RESULT1[Result 1]
RESULT1 --> EXEC2[Executor executes Step 2]
EXEC2 --> RESULT2[Result 2]
RESULT2 --> DOTS[...]
DOTS --> EXECN[Executor executes Step N]
EXECN --> RESULTN[Result N]
RESULTN --> REPLAN{Need re-planning?}
REPLAN -->|Yes| PLANNER
REPLAN -->|No| FINAL[Aggregate Final Result]
1.2 Comparison with ReAct
| Dimension | ReAct | Plan-and-Execute |
|---|---|---|
| Planning approach | Decides next step at each step (greedy) | Generates complete plan first |
| Global view | None (only sees current step) | Yes (global plan) |
| LLM usage | Large model at every step | Large model for planning, small model for execution |
| Cost | Uniformly distributed | High planning cost, low execution cost |
| Adaptability | High (real-time adjustment) | Requires explicit re-planning |
| Interpretability | Medium | High (plan steps visible) |
1.3 Two-Phase Design
Phase 1: Planning
PLANNER_PROMPT = """
Given the user task, generate a step-by-step execution plan.
Each step should be a clear, executable instruction.
Task: {task}
Please output a numbered list, one step per line:
1. ...
2. ...
"""
plan = planner_llm.generate(PLANNER_PROMPT.format(task=task))
steps = parse_plan(plan)
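The snippet above calls `parse_plan`, which is not defined in the article. A minimal sketch, assuming the planner follows the prompt's numbered-list format, could look like this:

```python
import re

def parse_plan(plan_text: str) -> list[str]:
    """Parse a numbered-list plan ("1. ...", "2. ...") into a list of step strings."""
    steps = []
    for line in plan_text.splitlines():
        # Accept "1. step" or "1) step"; ignore any non-numbered lines
        match = re.match(r"\s*\d+[.)]\s+(.*)", line)
        if match:
            steps.append(match.group(1).strip())
    return steps
```

Real planner output is messier (sub-bullets, preambles), so production parsers often ask for JSON instead and validate it against a schema.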
Phase 2: Execution
EXECUTOR_PROMPT = """
You need to execute the following step using available tools.
Current step: {step}
Previous execution results: {previous_results}
Available tools: {tools}
Please execute the current step.
"""
results = []
for step in steps:
    result = executor_llm.generate(
        EXECUTOR_PROMPT.format(
            step=step,
            previous_results=results,
            tools=tools,
        )
    )
    results.append(result)
1.4 Dynamic Re-Planning
When unexpected situations arise during execution, trigger re-planning:
REPLAN_PROMPT = """
Original plan: {original_plan}
Completed steps and results: {completed_steps}
Current issue: {issue}
Please modify the remaining plan based on the current situation:
"""
def should_replan(step_result, expected):
    """Determine whether re-planning is needed"""
    # Execution failure
    if step_result.error:
        return True
    # Result deviates significantly from expectation
    if llm.evaluate(step_result, expected) < threshold:
        return True
    # New information discovered that changes the nature of the problem
    if llm.detect_new_info(step_result):
        return True
    return False
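Tying the pieces together, a driver loop can execute steps one at a time and regenerate the remaining plan when a step fails. This is a sketch with hypothetical callables: `planner(task, completed=...)` and `executor(step, results)` returning a dict with an `"error"` key are illustrative signatures, not from any framework.

```python
def run_with_replanning(task, planner, executor, max_replans=3):
    """Execute a plan step by step; regenerate the remaining plan on failure."""
    steps = planner(task)
    results, replans, i = [], 0, 0
    while i < len(steps):
        result = executor(steps[i], results)
        if result.get("error") and replans < max_replans:
            # Re-plan from the current step, preserving completed results
            steps = steps[:i] + planner(task, completed=results)
            replans += 1
            continue
        results.append(result)
        i += 1
    return results
```

The `max_replans` cap matters in practice: without it, a persistently failing step can trap the agent in a plan/fail/re-plan loop.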
2. LLMCompiler: Parallel Execution
2.1 Core Idea
LLMCompiler, proposed by Kim et al. (2024), decomposes tasks into a directed acyclic graph (DAG), identifying steps that can be executed in parallel:
graph TD
TASK[Task: Compare weather and population of Beijing and Shanghai] --> PLAN[LLM Planner<br/>Generate Task DAG]
PLAN --> T1[Task 1: Query Beijing weather]
PLAN --> T2[Task 2: Query Shanghai weather]
PLAN --> T3[Task 3: Query Beijing population]
PLAN --> T4[Task 4: Query Shanghai population]
T1 --> JOIN[Joiner<br/>Aggregate all results]
T2 --> JOIN
T3 --> JOIN
T4 --> JOIN
JOIN --> ANS[Final comparative analysis]
2.2 Three Major Components
Planner
Generates a task list with dependency relationships:
Task: Compare weather and population of Beijing and Shanghai
1. search("Beijing current weather") # No dependencies
2. search("Shanghai current weather") # No dependencies
3. search("Beijing population 2024") # No dependencies
4. search("Shanghai population 2024") # No dependencies
5. join() # Depends on 1,2,3,4
Key: The Planner not only generates the task list but also annotates dependency relationships, enabling tasks without dependencies to be executed in parallel.
Task Fetching Unit
Identifies all tasks in the DAG whose dependencies have completed and dispatches them in parallel; as each task finishes, any newly unblocked tasks are launched immediately.
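A minimal fetching-and-dispatch loop can be sketched with a thread pool; `run_dag` and `worker` are illustrative names, not LLMCompiler's actual API. Each round launches every task whose prerequisites are done:

```python
from concurrent.futures import ThreadPoolExecutor

def run_dag(tasks, deps, worker):
    """Run tasks as soon as all their dependencies have completed.

    tasks: {task_id: payload}; deps: {task_id: set of prerequisite ids}.
    """
    done, results = set(), {}
    with ThreadPoolExecutor() as pool:
        while len(done) < len(tasks):
            # Fetch every not-yet-run task whose dependencies are all satisfied
            ready = [t for t in tasks if t not in done and deps.get(t, set()) <= done]
            futures = {t: pool.submit(worker, tasks[t], results) for t in ready}
            for t, f in futures.items():
                results[t] = f.result()
                done.add(t)
    return results
```

This sketch assumes the dependency graph is acyclic; a cycle would leave `ready` empty and loop forever, so a real scheduler should detect that case and abort.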
Joiner
Aggregates results from parallel execution, deciding whether re-planning is needed:
def joiner(results, original_task):
    # Check whether all necessary results are available
    if all_results_available(results):
        return llm.synthesize(results, original_task)
    else:
        # Partial failure: decide on a re-planning strategy
        return replan(results, original_task)
2.3 Performance Advantages
| Metric | ReAct | Plan-Execute (Serial) | LLMCompiler (Parallel) |
|---|---|---|---|
| Latency | \(N \times L\) | \(L + N \times l\) | \(L + D \times l\) |
| LLM calls | \(N\) | \(N + 1\) | \(2\) (planner + joiner) |
| Tool calls | \(N\) (serial) | \(N\) (serial) | \(N\) (parallel) |
where \(N\) is the number of steps, \(L\) is large model latency, \(l\) is small model/tool latency, and \(D\) is the longest path depth in the DAG.
3. Hierarchical Task Network (HTN)
3.1 HTN Planning Overview
Hierarchical Task Networks (HTNs) are a classical AI planning method that decomposes complex tasks top-down into subtasks:
graph TD
T0[Prepare Dinner] --> M1[Method: Cook Chinese Food]
M1 --> T1[Buy Groceries]
M1 --> T2[Cook]
M1 --> T3[Plate]
T1 --> T11[Write Shopping List]
T1 --> T12[Go to Supermarket]
T1 --> T13[Select Ingredients]
T2 --> T21[Wash and Cut Vegetables]
T2 --> T22[Stir-Fry]
T2 --> T23[Cook Rice]
3.2 Combining HTN with LLM
LLMs are naturally suited for hierarchical task decomposition:
HTN_DECOMPOSE_PROMPT = """
Decompose the following task into subtasks. Each subtask should be
either a directly executable atomic operation or a compound task
that can be further decomposed.
Task: {task}
Available atomic operations: {primitive_actions}
Please output the hierarchical decomposition:
Task: {task}
├── Subtask 1: ...
│ ├── Atomic operation: ...
│ └── Atomic operation: ...
├── Subtask 2: ...
└── Subtask 3: ...
"""
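The hierarchical decomposition the prompt asks for maps naturally onto a recursive data structure. The sketch below, with illustrative names, expands a compound task into a tree whose leaves are atomic operations:

```python
def htn_decompose(task, methods, primitives):
    """Recursively expand a task into a tree of primitive actions.

    methods: {compound_task: [subtask, ...]}; primitives: set of atomic ops.
    """
    if task in primitives:
        return task  # atomic operation: a leaf
    if task not in methods:
        raise ValueError(f"No decomposition method for task: {task}")
    return {task: [htn_decompose(sub, methods, primitives) for sub in methods[task]]}
```

In an LLM-based agent, `methods` need not be a fixed library: the model can propose a decomposition on the fly when no stored method matches, which is the hybrid the section describes.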
3.3 Advantages of HTN in Agents
| Advantage | Description |
|---|---|
| Reusability | Decomposition methods can be reused across tasks |
| Abstraction levels | Reason and monitor at appropriate levels |
| Scalability | New decomposition methods can be added incrementally |
| Interpretability | Hierarchical structure clearly shows task logic |
| Failure recovery | Can retry at the subtask level |
4. Advanced Planning Strategies
4.1 Adaptive Planning
Dynamically adjust plan granularity and content based on information gained during execution:
def adaptive_planning(task, initial_confidence):
    if initial_confidence > 0.9:
        # High confidence: generate a detailed plan and execute it in one pass
        plan = detailed_plan(task)
        return execute_all(plan)
    elif initial_confidence > 0.5:
        # Medium confidence: rough plan + incremental refinement
        plan = rough_plan(task)
        results = []
        for step in plan:
            detailed_step = refine_step(step, context)
            result = execute(detailed_step)
            context.update(result)
            results.append(result)
        return results
    else:
        # Low confidence: exploratory execution (fall back to ReAct)
        return react_loop(task)
4.2 Speculative Planning
Similar to CPU speculative execution, predict possible branches and compute ahead:
Plan Step 3: Query user's account status
Predicted Result A (80% likely): Account normal → Pre-prepare Step 4A
Predicted Result B (20% likely): Account abnormal → Pre-prepare Step 4B
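The branch-prefetching idea can be sketched with a thread pool: both branch preparations start before the condition resolves, and the non-matching result is simply discarded. `speculative_step` and its callables are illustrative names under the assumption that branch preparation is side-effect free (speculating on a destructive action would be unsafe).

```python
from concurrent.futures import ThreadPoolExecutor

def speculative_step(check_account, handle_normal, handle_abnormal):
    """Start both branch preparations before the branch condition resolves."""
    with ThreadPoolExecutor() as pool:
        # Pre-compute both branches while the status check runs
        normal_future = pool.submit(handle_normal)
        abnormal_future = pool.submit(handle_abnormal)
        status = check_account()
        # Keep the branch that matches; the other result is discarded
        return normal_future.result() if status == "normal" else abnormal_future.result()
```

As with CPU speculation, the trade-off is wasted work on the mispredicted branch in exchange for lower latency on the predicted one.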
4.3 Constrained Planning
Add explicit constraints to planning:
Constraint types:
| Constraint | Example |
|---|---|
| Time constraint | Total execution time < 60 seconds |
| Cost constraint | LLM API calls < $0.50 |
| Safety constraint | No delete/modify operations |
| Quality constraint | Per-step verification pass rate > 95% |
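The time and cost constraints in the table can be enforced with a small budget wrapper around each step execution. `ConstrainedExecutor`, `step_fn`, and `est_cost_usd` are illustrative names, not from any framework:

```python
import time

class BudgetExceeded(Exception):
    pass

class ConstrainedExecutor:
    """Enforce wall-clock and dollar budgets around each step execution."""

    def __init__(self, max_seconds=60.0, max_cost_usd=0.50):
        self.max_seconds = max_seconds
        self.max_cost_usd = max_cost_usd
        self.start = time.monotonic()
        self.spent = 0.0

    def run(self, step_fn, est_cost_usd):
        # Check both budgets before spending anything on this step
        if time.monotonic() - self.start > self.max_seconds:
            raise BudgetExceeded("time budget exhausted")
        if self.spent + est_cost_usd > self.max_cost_usd:
            raise BudgetExceeded("cost budget exhausted")
        self.spent += est_cost_usd
        return step_fn()
```

Safety and quality constraints are harder to encode as numbers; in practice they are enforced by a tool allow-list and a per-step verifier, respectively.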
5. Framework Implementation
5.1 Plan-and-Execute in LangGraph
from typing import TypedDict

from langgraph.graph import StateGraph, END

# Define state
class PlanExecuteState(TypedDict):
    task: str
    plan: list[str]
    current_step: int
    results: list[str]
    final_answer: str

# Build graph
workflow = StateGraph(PlanExecuteState)

# Add nodes
workflow.add_node("planner", plan_step)
workflow.add_node("executor", execute_step)
workflow.add_node("replanner", replan_step)

# Add edges
workflow.set_entry_point("planner")
workflow.add_edge("planner", "executor")
workflow.add_conditional_edges(
    "executor",
    should_continue,
    {"replan": "replanner", "next": "executor", "end": END},
)
workflow.add_edge("replanner", "executor")

# Compile into a runnable app
app = workflow.compile()
Cross-Reference
For a detailed introduction to the LangGraph framework, see LangChain and LangGraph.
5.2 Production Deployment Considerations
| Consideration | Recommendation |
|---|---|
| Planning model | Use the strongest model (GPT-4, Claude Opus) to ensure plan quality |
| Execution model | Can use smaller models (GPT-3.5, Claude Haiku) to reduce cost |
| Re-planning threshold | Should not be too sensitive to avoid frequent re-planning |
| Maximum steps | Set an upper limit (e.g., 20 steps) to prevent infinite loops |
| Step granularity | Each step should be a clear, verifiable operation |
| Error handling | Distinguish retryable errors from fatal errors |
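The error-handling row above can be made concrete with a retry wrapper that distinguishes transient failures (worth retrying with backoff) from fatal ones (surfaced immediately). The exception classes chosen as retryable here are illustrative:

```python
import time

RETRYABLE = (TimeoutError, ConnectionError)

def execute_with_retry(step_fn, max_retries=3, backoff=1.0):
    """Retry transient failures with exponential backoff; re-raise fatal errors."""
    for attempt in range(max_retries + 1):
        try:
            return step_fn()
        except RETRYABLE:
            if attempt == max_retries:
                raise  # transient error persisted; escalate to re-planning
            time.sleep(backoff * 2 ** attempt)
```

Fatal errors (e.g. a permission failure) fall through uncaught, so the caller can trigger re-planning instead of burning retries on an unrecoverable step.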
6. Plan Quality Evaluation
6.1 Quality Dimensions of Plans
| Dimension | Definition | Evaluation Method |
|---|---|---|
| Completeness | Does the plan cover all necessary steps? | Check if goal is reachable |
| Executability | Can each step actually be executed? | Check tool/API availability |
| Efficiency | Is the number of steps minimal? | Compare with optimal plan |
| Robustness | Tolerance for unexpected situations | Recovery ability after error injection |
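Of the four dimensions, executability is the cheapest to check automatically: before execution, verify that every tool a plan step references actually exists in the registry. A minimal sketch, assuming steps are parsed into dicts with a `"tool"` field (an illustrative representation):

```python
def check_executability(plan_steps, available_tools):
    """Flag plan steps that reference tools missing from the registry."""
    issues = []
    for i, step in enumerate(plan_steps, 1):
        tool = step.get("tool")
        if tool not in available_tools:
            issues.append((i, f"unknown tool: {tool}"))
    return issues
```

Completeness and efficiency, by contrast, usually require an LLM judge or comparison against a reference plan, and robustness is measured empirically via error injection.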
References
- Wang, L. et al. (2023). Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning by Large Language Models. ACL 2023.
- Kim, S. et al. (2024). An LLM Compiler for Parallel Function Calling. arXiv:2312.04511.
- Erol, K. et al. (1994). HTN Planning: Complexity and Expressivity. AAAI 1994.
- Huang, W. et al. (2022). Inner Monologue: Embodied Reasoning through Planning with Language Models. CoRL 2022.
- Sun, H. et al. (2023). AdaPlanner: Adaptive Planning from Feedback with Language Models. NeurIPS 2023.