
Classical Planning & Decision-Making

Overview of AI Planning

What is AI Planning

AI Planning studies how agents can automatically generate action sequences to transition from an initial state to a goal state. Unlike generic search problems, planning requires explicit representations of world states, actions, and their effects.

The core planning problem can be formalized as:

\[\text{Given } \langle S_0, G, A \rangle, \text{ find action sequence } a_1, a_2, \ldots, a_n \text{ such that } S_0 \xrightarrow{a_1} S_1 \xrightarrow{a_2} \cdots \xrightarrow{a_n} S_n \models G\]

where \(S_0\) is the initial state, \(G\) is the goal condition, and \(A\) is the set of available actions.

Taxonomy of Planning Problems

| Dimension     | Simple                | Complex                         |
| ------------- | --------------------- | ------------------------------- |
| Observability | Fully observable      | Partially observable            |
| Determinism   | Deterministic         | Stochastic                      |
| Time          | Discrete              | Continuous                      |
| Agents        | Single-agent          | Multi-agent                     |
| Objective     | Goal achievement      | Objective function optimization |

Planning vs. Search

  • Search: States are "black boxes" — only successor functions and goal tests are defined
  • Planning: States have structured representations (propositions/predicates), and actions have explicit preconditions and effects

This structured representation allows planners to exploit the internal structure of the problem for efficient solving, rather than searching blindly.

Application Domains

  • Robotics: Task planning, motion planning, manipulation planning
  • Autonomous Driving: Behavior decision-making, path planning, trajectory generation
  • Game AI: NPC behavior, strategic planning
  • Logistics & Scheduling: Supply chain optimization, resource allocation
  • Aerospace: Satellite mission planning, Mars rover autonomous planning
  • NLP: Dialogue management, story generation

Classical Planning Representations

Classical planning addresses the most fundamental planning setting: fully observable, deterministic, static, discrete, single-agent environments.

STRIPS Representation

STRIPS (Stanford Research Institute Problem Solver, 1971) is one of the earliest planning representation languages:

  • State: A set of ground atoms (propositions)
  • Action: Described by a triple \(\langle \text{Precondition}, \text{Add}, \text{Delete} \rangle\)
    • Precondition: Conditions required before execution
    • Add List: Propositions added after execution
    • Delete List: Propositions removed after execution

Example: Blocks World

Action: Move(B, x, y)
  Precondition: On(B, x) ∧ Clear(B) ∧ Clear(y)
  Add: On(B, y) ∧ Clear(x)
  Delete: On(B, x) ∧ Clear(y)
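
Under STRIPS semantics, applying an action means removing its delete list and adding its add list. A minimal sketch of the Move template above, instantiated as Move(B, A, C) (the atom strings and function names are illustrative, not from any planning library):

```python
# STRIPS semantics in miniature: a state is a set of ground atoms; an
# applicable action removes its delete list and adds its add list.

def applicable(state, precond):
    return precond <= state          # preconditions must all hold

def apply_action(state, precond, add, delete):
    assert applicable(state, precond)
    return (state - delete) | add

# The Move(B, x, y) template instantiated as Move(B, A, C):
precond = {"On(B,A)", "Clear(B)", "Clear(C)"}
add     = {"On(B,C)", "Clear(A)"}
delete  = {"On(B,A)", "Clear(C)"}

s0 = {"On(A,Table)", "On(B,A)", "On(C,Table)", "Clear(B)", "Clear(C)"}
s1 = apply_action(s0, precond, add, delete)
print(sorted(s1))
```

Representing states as sets makes progression a pure set operation, which is exactly why STRIPS-style planners can enumerate successors cheaply.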

PDDL — Planning Domain Definition Language

PDDL is the standardized representation language for classical planning, separating problems into two parts:

  1. Domain file: Defines types, predicates, and action templates
  2. Problem file: Defines specific objects, initial state, and goals
;; Domain: Logistics
(:action drive-truck
  :parameters (?truck ?from ?to ?city)
  :precondition (and (truck ?truck) (location ?from)
                     (location ?to) (in-city ?from ?city)
                     (in-city ?to ?city) (at ?truck ?from))
  :effect (and (at ?truck ?to) (not (at ?truck ?from))))

PDDL has undergone several extensions: PDDL 2.1 introduced temporal and numeric features, PDDL 3.0 added preferences and constraints.

State-Space Search

Planning problems can be converted to graph search problems:

Forward Search (Progression):

  • Start from the initial state, progressively apply actions until the goal is met
  • Large branching factor — requires good heuristics

Backward Search (Regression):

  • Start from the goal, work backward to find which actions achieve subgoals
  • Naturally considers only relevant actions, but must handle partial states
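
Forward search can be sketched as breadth-first search over STRIPS states (no heuristic, toy actions invented for illustration):

```python
# Progression planning as breadth-first graph search over STRIPS states.
# States are frozensets of atoms; actions are (name, precond, add, delete).

from collections import deque

def forward_search(init, goal, actions):
    """Return a list of action names reaching `goal` from `init`, or None."""
    frontier = deque([(frozenset(init), [])])
    visited = {frozenset(init)}
    while frontier:
        state, partial = frontier.popleft()
        if goal <= state:                      # goal test: G ⊆ S
            return partial
        for name, pre, add, dele in actions:
            if pre <= state:                   # applicable?
                succ = (state - dele) | add    # progression step
                if succ not in visited:
                    visited.add(succ)
                    frontier.append((succ, partial + [name]))
    return None

# Two-step toy problem: fetch a key, then open a door.
actions = [
    ("fetch-key", frozenset({"at-key"}), frozenset({"has-key"}), frozenset()),
    ("open-door", frozenset({"has-key"}), frozenset({"door-open"}), frozenset()),
]
plan = forward_search({"at-key"}, frozenset({"door-open"}), actions)
print(plan)  # → ['fetch-key', 'open-door']
```

The large branching factor mentioned above shows up here as the inner loop over all applicable actions — on real domains this blind expansion is hopeless without a heuristic.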

Planning Graphs

The Graphplan algorithm (Blum & Furst, 1997) introduced the planning graph data structure:

  • Alternating proposition levels and action levels
  • Each level records mutex relations — pairs of propositions/actions that cannot hold simultaneously
  • Applications:
    • Extracting valid plans
    • Serving as heuristic functions to estimate goal distance (\(h^{\max}\), \(h^{\text{add}}\), \(h^{FF}\))

Heuristic Planning

The core idea in modern classical planners: automatically extract heuristic functions from the problem description.

Common heuristics:

| Heuristic          | Method                                | Characteristics                          |
| ------------------ | ------------------------------------- | ---------------------------------------- |
| \(h^{\max}\)       | Maximum cost across subgoals          | Admissible, but imprecise                |
| \(h^{\text{add}}\) | Sum of subgoal costs                  | Inadmissible, but informative            |
| \(h^{FF}\)         | Planning graph ignoring delete effects | Balances accuracy and computation        |
| LM-cut             | Landmark-based cuts                   | Admissible, suitable for optimal planning |
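
The delete-relaxation heuristics \(h^{\max}\) and \(h^{\text{add}}\) can be sketched as a cost fixpoint over atoms (a toy illustration with unit action costs; real planners use far more efficient implementations):

```python
# h_max / h_add via the delete relaxation: iterate atom costs to a fixpoint,
# ignoring delete lists; each action costs 1 plus the combined cost of its
# preconditions (max -> h_max, sum -> h_add).

def relaxed_cost(state, goal, actions, combine):
    """combine = max over a list -> h_max; combine = sum -> h_add."""
    INF = float("inf")
    cost = {p: 0 for p in state}
    changed = True
    while changed:
        changed = False
        for _, pre, add, _ in actions:        # delete list ignored
            if all(p in cost for p in pre):
                c = 1 + combine([cost[p] for p in pre])
                for q in add:
                    if c < cost.get(q, INF):
                        cost[q] = c
                        changed = True
    return combine([cost.get(g, INF) for g in goal])

actions = [
    ("fetch-key", {"at-key"}, {"has-key"}, set()),
    ("open-door", {"has-key"}, {"door-open"}, set()),
]
state, goal = {"at-key"}, {"has-key", "door-open"}

h_max = relaxed_cost(state, goal, actions, lambda xs: max(xs, default=0))
h_add = relaxed_cost(state, goal, actions, sum)
print(h_max, h_add)  # → 2 3
```

Note how \(h^{\text{add}}\) double-counts the shared subgoal chain (3 vs. the true cost 2), which is exactly why it is informative but inadmissible.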

Representative planners: Fast Downward, LAMA, FF Planner.

Partial-Order Planning

Unlike total-order planning, partial-order planning only constrains the ordering of actions with causal dependencies:

  • More flexible, allows parallel execution
  • Maintains causal links and threats
  • Classic algorithm: POP (Partial-Order Planner)

Behavior Decision-Making

Behavior decision-making concerns how agents select high-level behavioral strategies in complex environments, serving as the bridge between perception and low-level control.

Finite State Machines (FSM)

The simplest behavior decision model, where agents switch among a finite set of states:

  • States: Represent behavioral modes (e.g., patrol, chase, flee)
  • Transitions: State switches triggered by conditions
  • Pros: Intuitive, easy to implement and debug
  • Cons: State explosion — \(n\) boolean state variables yield \(O(2^n)\) combined states

Hierarchical FSMs (HFSM) reduce complexity via nesting, but still struggle with large-scale problems.
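
The patrol/chase/flee example can be sketched as a transition table keyed by (state, event) pairs (state and event names are invented for illustration):

```python
# A patrol/chase/flee FSM as an explicit transition table:
# (current state, event) -> next state.

TRANSITIONS = {
    ("patrol", "enemy_spotted"): "chase",
    ("chase",  "enemy_lost"):    "patrol",
    ("chase",  "low_health"):    "flee",
    ("flee",   "healed"):        "patrol",
}

def step(state, event):
    # Unlisted (state, event) pairs leave the state unchanged.
    return TRANSITIONS.get((state, event), state)

state = "patrol"
for event in ["enemy_spotted", "low_health", "healed"]:
    state = step(state, event)
print(state)  # → patrol
```

The table format makes the state-explosion problem visible: every new state variable multiplies the number of (state, event) rows to maintain.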

Behavior Trees (BT)

Behavior trees are a widely used behavior decision framework in game AI and robotics:

Core Node Types:

| Node Type     | Function             | Behavior                                                  |
| ------------- | -------------------- | --------------------------------------------------------- |
| Sequence (→)  | Sequential execution | Succeeds only if all children succeed; stops on failure   |
| Selector (?)  | Fallback execution   | Stops on first success; fails only if all fail            |
| Parallel (⇉)  | Parallel execution   | Success/failure determined by policy                      |
| Condition     | Condition check      | Returns success/failure                                   |
| Action        | Execute action       | Returns success/failure/running                           |
| Decorator     | Modify child node    | Invert, repeat, timeout, etc.                             |

Pros: Modular, reusable, easy to extend and debug.

Comparison with FSMs: In behavior trees, state is implicit in the tree traversal position — no need to explicitly manage state transitions.
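
The Sequence and Selector semantics can be sketched in a few lines (statuses reduced to success/failure for brevity, omitting "running"; the guard-NPC leaves are invented):

```python
# Minimal behavior tree combinators: Sequence fails fast, Selector
# succeeds fast. Leaves are callables returning "success" or "failure".

def sequence(*children):
    def tick():
        for child in children:
            if child() != "success":
                return "failure"
        return "success"
    return tick

def selector(*children):
    def tick():
        for child in children:
            if child() == "success":
                return "success"
        return "failure"
    return tick

# Guard-NPC sketch: attack if an enemy is visible, otherwise patrol.
enemy_visible = lambda: "failure"       # condition leaf
attack        = lambda: "success"       # action leaf
patrol        = lambda: "success"       # action leaf

root = selector(
    sequence(enemy_visible, attack),    # try combat first...
    patrol,                             # ...fall back to patrolling
)
print(root())  # → "success" (combat branch fails, patrol succeeds)
```

Re-ticking `root` from the top each frame is what makes the current "state" implicit in the traversal, as noted above.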

Hierarchical Task Networks (HTN)

HTN planning solves problems through task decomposition:

  • Primitive Tasks: Directly executable actions
  • Compound Tasks: Must be decomposed into subtasks
  • Methods: Describe how to decompose compound tasks into subtask sequences
Task: Deliver(package, destination)
  Method 1: [Pick(package), Drive(destination), Drop(package)]
  Method 2: [Pick(package), Fly(destination), Drop(package)]
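
A naive decomposition procedure for the Deliver example can be sketched as follows (the primitive preconditions and effects are invented; real HTN planners like SHOP2 also interleave variable binding and backtracking over method order):

```python
# Naive HTN decomposition: compound tasks are expanded via their methods
# (tried in order, backtracking on failure); primitive tasks execute if
# their precondition holds. Delete effects are omitted for brevity.

def htn_plan(state, tasks, methods, primitives):
    if not tasks:
        return []
    head, rest = tasks[0], tasks[1:]
    if head in primitives:                     # primitive task
        pre, add = primitives[head]
        if pre <= state:
            tail = htn_plan(state | add, rest, methods, primitives)
            if tail is not None:
                return [head] + tail
        return None
    for subtasks in methods.get(head, []):     # compound task: try methods
        plan = htn_plan(state, subtasks + rest, methods, primitives)
        if plan is not None:
            return plan
    return None

primitives = {
    "Pick":  (frozenset({"at-depot"}), frozenset({"holding"})),
    "Drive": (frozenset({"holding"}),  frozenset({"at-dest"})),
    "Fly":   (frozenset(),             frozenset({"at-dest"})),
    "Drop":  (frozenset({"at-dest"}),  frozenset({"delivered"})),
}
methods = {"Deliver": [["Pick", "Drive", "Drop"], ["Pick", "Fly", "Drop"]]}

plan = htn_plan(frozenset({"at-depot"}), ["Deliver"], methods, primitives)
print(plan)  # → ['Pick', 'Drive', 'Drop'] (first method succeeds)
```

Because methods are tried in order, the planner never considers the Fly decomposition here — the domain author's ordering is itself a piece of search-space-pruning knowledge.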

HTN leverages domain knowledge to reduce the search space, well-suited for highly structured applications (logistics, manufacturing).

Representative systems: SHOP2, SIPE-2.

Decision-Theoretic Planning

When environments are uncertain, probability and utility must be incorporated:

MDP (Markov Decision Process):

\[V^*(s) = \max_a \left[ R(s, a) + \gamma \sum_{s'} T(s, a, s') V^*(s') \right]\]
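
The Bellman optimality equation can be solved by value iteration; a minimal sketch on a made-up two-state MDP (state, action, and reward values are illustrative):

```python
# Value iteration for the Bellman optimality equation on a toy MDP.
# T maps (s, a) to a {s': prob} distribution; R maps (s, a) to a reward.

GAMMA = 0.9
STATES, ACTIONS = ["s0", "s1"], ["stay", "go"]
T = {
    ("s0", "stay"): {"s0": 1.0},
    ("s0", "go"):   {"s1": 1.0},
    ("s1", "stay"): {"s1": 1.0},
    ("s1", "go"):   {"s0": 1.0},
}
R = {("s0", "stay"): 0.0, ("s0", "go"): 0.0,
     ("s1", "stay"): 1.0, ("s1", "go"): 0.0}

V = {s: 0.0 for s in STATES}
for _ in range(200):                         # iterate to (near) convergence
    V = {s: max(R[s, a] + GAMMA * sum(p * V[s2] for s2, p in T[s, a].items())
                for a in ACTIONS)
         for s in STATES}

print(V)  # V(s1) ≈ 1/(1-γ) = 10, V(s0) ≈ γ·V(s1) = 9
```

Each sweep applies the right-hand side of the equation as an update; the γ-contraction guarantees geometric convergence to \(V^*\).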

POMDP (Partially Observable MDP):

  • The agent cannot directly observe the full state
  • Maintains a belief state — a probability distribution over states
  • Computationally intractable (PSPACE-complete), typically requiring approximate solutions

Game-Theoretic Planning

In multi-agent settings, one must account for other agents' behavior:

  • Nash equilibrium: No agent has an incentive to unilaterally deviate
  • Minimax strategy: Optimal strategy in zero-sum games
  • Core challenge in multi-agent planning: coordination and communication

Optimization Methods in Planning

Many variants of planning problems can be formulated as mathematical optimization problems.

Linear Programming and Mixed-Integer Programming

Linear Programming (LP):

\[\min \mathbf{c}^T \mathbf{x} \quad \text{s.t.} \quad \mathbf{A}\mathbf{x} \leq \mathbf{b}, \quad \mathbf{x} \geq 0\]
  • Both objective function and constraints are linear
  • Simplex method and Interior Point methods
  • Applications: Resource allocation, network flows, relaxations of scheduling problems

Mixed-Integer Programming (MIP) — when some decision variables must take integer values:

\[\min \mathbf{c}^T \mathbf{x} + \mathbf{d}^T \mathbf{y} \quad \text{s.t.} \quad \mathbf{A}\mathbf{x} + \mathbf{B}\mathbf{y} \leq \mathbf{b}, \quad \mathbf{y} \in \mathbb{Z}^p\]
  • MILP (Mixed-Integer Linear Programming): NP-hard, but modern solvers (Gurobi, CPLEX) handle problems of considerable scale
  • Applications: Obstacle avoidance (binary variables for which side of obstacles), task assignment (0-1 decisions), vehicle routing (TSP/VRP)

Nonlinear Programming and Convex Optimization

Nonlinear Programming (NLP):

\[\min f(\mathbf{x}) \quad \text{s.t.} \quad g_i(\mathbf{x}) \leq 0, \quad h_j(\mathbf{x}) = 0\]

Solution methods:

  • Gradient Descent / Newton's Method: For unconstrained or simply constrained problems
  • Sequential Quadratic Programming (SQP): Approximates the problem as QP at each step
  • Interior Point Methods (IPM): Converts inequality constraints to barrier functions
  • Augmented Lagrangian Method (ALM): Combines Lagrange multipliers with penalty terms
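
As a toy illustration of the unconstrained case handled by the first bullet, gradient descent on a smooth quadratic (the objective \(f(x, y) = (x-1)^2 + 2y^2\) and step size are chosen arbitrarily):

```python
# Gradient descent on f(x, y) = (x - 1)^2 + 2*y^2, minimized at (1, 0).

def grad(x, y):
    # Hand-derived gradient of f: (df/dx, df/dy).
    return (2 * (x - 1), 4 * y)

x, y, lr = 0.0, 1.0, 0.1     # start away from the optimum
for _ in range(200):
    gx, gy = grad(x, y)
    x, y = x - lr * gx, y - lr * gy

print(x, y)  # converges to (approximately) the minimizer (1, 0)
```

For this convex objective any local minimum is global, so the plain descent iteration suffices; the constrained methods above (SQP, IPM, ALM) exist precisely because real planning problems add \(g_i\) and \(h_j\) constraints.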

Convex Optimization — when \(f\) and the feasible region are both convex, local optima are global optima:

  • Quadratic Programming (QP): \(\min \frac{1}{2}\mathbf{x}^T\mathbf{H}\mathbf{x} + \mathbf{f}^T\mathbf{x}\)
  • Second-Order Cone Programming (SOCP): Constraints are second-order cones
  • Semidefinite Programming (SDP): Constraints are positive semidefinite matrices

In trajectory optimization, many problems can be formulated or approximated as convex optimization.

Constraint Optimization in Planning

Task Assignment Problem — assign \(n\) tasks to \(m\) agents, minimizing total cost:

\[\min \sum_{i,j} c_{ij} x_{ij} \quad \text{s.t.} \quad \sum_j x_{ij} = 1 \;\;\forall i, \quad \sum_i x_{ij} \leq 1 \;\;\forall j, \quad x_{ij} \in \{0, 1\}\]

where \(x_{ij} = 1\) means task \(i\) is assigned to agent \(j\): each task gets exactly one agent, and each agent takes at most one task (exactly one when \(m = n\)).

The Hungarian algorithm solves this in polynomial time.
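
As a sanity check of the formulation, a brute-force enumeration over the \(n!\) permutations (cost matrix invented; feasible only for tiny \(n\), whereas the Hungarian algorithm finds the same optimum in polynomial time):

```python
# Brute-force assignment: enumerate all task->agent permutations of a
# 3x3 cost matrix and keep the cheapest.

from itertools import permutations

cost = [
    [4, 1, 3],
    [2, 0, 5],
    [3, 2, 2],
]

n = len(cost)
best = min(
    (sum(cost[i][perm[i]] for i in range(n)), perm)
    for perm in permutations(range(n))
)
print(best)  # → (5, (1, 0, 2)): task 0→agent 1, task 1→agent 0, task 2→agent 2
```

Restricting \(x_{ij}\) to permutation matrices is exactly the constraint set of the integer program above; here the optimum avoids the tempting zero-cost cell because it blocks a cheaper overall matching.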

Vehicle Routing Problem (VRP) — a generalization of TSP: multiple vehicles depart from a depot to serve all customers and return:

  • Capacitated VRP (CVRP)
  • VRP with Time Windows (VRPTW)
  • Typically solved using MIP + heuristics (column generation, adaptive large neighborhood search)

Scheduling Problems — Job Shop Scheduling:

  • \(n\) jobs, \(m\) machines, each job has operation precedence constraints
  • Objective: Minimize makespan
  • NP-hard; in practice solved by combining CP (Constraint Programming) and MIP

Key References

  • Stuart Russell & Peter Norvig, Artificial Intelligence: A Modern Approach (AIMA) — Part III: Planning
  • Dana Nau et al., Automated Planning: Theory and Practice
  • Steven LaValle, Planning Algorithms
