NPC Behavior Evolution
Overview
Behavior control for game NPCs (non-player characters) has evolved from simple finite state machines to LLM-driven generative systems. This article traces that evolutionary path and compares the strengths and weaknesses of each approach.
Finite State Machines (FSM)
Finite state machines are the earliest and simplest NPC behavior control method.
Basic Structure

An FSM is formally defined as the tuple \((S, s_0, \Sigma, \delta, F)\):
- \(S\): State set (e.g., Idle, Patrol, Chase, Attack)
- \(s_0\): Initial state
- \(\Sigma\): Input event set (e.g., see_enemy, lose_sight, health_low)
- \(\delta: S \times \Sigma \rightarrow S\): State transition function
- \(F\): Terminal state set
```mermaid
stateDiagram-v2
    [*] --> Idle
    Idle --> Patrol: start_patrol
    Patrol --> Chase: see_enemy
    Patrol --> Idle: patrol_complete
    Chase --> Attack: in_range
    Chase --> Patrol: lose_sight
    Attack --> Chase: out_of_range
    Attack --> Flee: health_low
    Flee --> Idle: safe_distance
```
Implementation Example
```python
class FSM_NPC:
    def __init__(self):
        self.state = "idle"
        self.health = 100
        self.attack_range = 2.0
        self.target = None

    def update(self, perception):
        if self.state == "idle":
            if perception.patrol_timer_expired:
                self.state = "patrol"
        elif self.state == "patrol":
            if perception.enemy_visible:
                self.state = "chase"
                self.target = perception.nearest_enemy
            elif perception.patrol_complete:
                self.state = "idle"
        elif self.state == "chase":
            if self.distance_to(self.target) < self.attack_range:
                self.state = "attack"
            elif not perception.enemy_visible:
                self.state = "patrol"
        elif self.state == "attack":
            if self.health < 20:
                self.state = "flee"
            elif self.distance_to(self.target) > self.attack_range:
                self.state = "chase"
```
Limitations of FSMs
| Problem | Description |
|---|---|
| State explosion | Complex behaviors require many states; the number of transitions can grow quadratically with the state count |
| Rigid transitions | Lack of flexibility; behavior patterns are fixed |
| Difficult to extend | Adding new behaviors requires modifying extensive transition logic |
| Lack of context | Cannot consider history and contextual information |
Hierarchical Finite State Machines (HFSM) partially alleviate the state explosion problem through state nesting, but fundamental limitations remain.
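A minimal sketch of the nesting idea (class, state, and event names here are invented for illustration): a composite state delegates events to its sub-machine first, and a single top-level transition can exit the entire sub-machine, so the sub-machine's internals never multiply the top-level transition count.

```python
class StateMachine:
    """Minimal HFSM: a state is either a string or a nested StateMachine."""
    def __init__(self, label, initial, transitions):
        self.label = label
        self.state = initial
        self.transitions = transitions  # {(state_label, event): next_state}

    def handle(self, event):
        # Give the innermost active sub-machine the first chance to consume it.
        if isinstance(self.state, StateMachine) and self.state.handle(event):
            return True
        label = self.state.label if isinstance(self.state, StateMachine) else self.state
        nxt = self.transitions.get((label, event))
        if nxt is not None:
            self.state = nxt
            return True
        return False

# "combat" is one composite state; its internals stay hidden from the top level.
combat = StateMachine("combat", "chase", {
    ("chase", "in_range"): "attack",
    ("attack", "out_of_range"): "chase",
})
npc = StateMachine("npc", "patrol", {
    ("patrol", "see_enemy"): combat,
    ("combat", "lose_sight"): "patrol",  # one exit edge covers the whole sub-machine
})
npc.handle("see_enemy")   # enter combat sub-machine (starts in "chase")
npc.handle("in_range")    # handled inside combat: chase -> attack
npc.handle("lose_sight")  # bubbles up to the top level: combat -> patrol
```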
Behavior Trees
Behavior trees are the industry standard for game AI, widely used in AAA titles (popularized by Halo 2; Unreal Engine ships behavior trees as a built-in feature).
Core Node Types
```mermaid
graph TD
    Root[Root<br/>->] --> Sel1[Selector<br/>?]
    Sel1 --> Seq1[Sequence<br/>-> Combat]
    Sel1 --> Seq2[Sequence<br/>-> Patrol]
    Sel1 --> Act1[Action<br/>Idle]
    Seq1 --> Cond1[Condition<br/>Enemy Visible?]
    Seq1 --> Sel2[Selector<br/>? Attack/Flee]
    Sel2 --> Seq3[Sequence<br/>-> Attack]
    Sel2 --> Act2[Action<br/>Flee]
    Seq3 --> Cond2[Condition<br/>Health > 20?]
    Seq3 --> Act3[Action<br/>Attack Enemy]
    Seq2 --> Act4[Action<br/>Move to Waypoint]
    Seq2 --> Cond3[Condition<br/>At Waypoint?]
    Seq2 --> Act5[Action<br/>Wait]
```
Node Type Details
| Node Type | Symbol | Behavior | On Child Failure | On Child Success |
|---|---|---|---|---|
| Sequence | -> | Execute all children in order | Return failure immediately | Continue to next |
| Selector | ? | Try children until one succeeds | Try next | Return success immediately |
| Decorator | ◇ | Modify child behavior | Depends on decorator | Depends on decorator |
| Condition | □ | Check a condition | - | - |
| Action | ○ | Execute an action | - | - |
Tick Mechanism
Behavior trees are driven by ticks, traversing from the root node each frame:
- SUCCESS: Node executed successfully
- FAILURE: Node execution failed
- RUNNING: Node is still executing (e.g., moving toward a target)
```python
from enum import Enum

class Status(Enum):
    SUCCESS = 1
    FAILURE = 2
    RUNNING = 3

# Sequence, Selector, Decorator, Action, and Condition are node classes
# assumed to be defined elsewhere (with children/child/modify/execute/check).
class BehaviorTree:
    def tick(self, node, context):
        if isinstance(node, Sequence):
            for child in node.children:
                result = self.tick(child, context)
                if result != Status.SUCCESS:
                    return result  # propagate FAILURE or RUNNING
            return Status.SUCCESS
        elif isinstance(node, Selector):
            for child in node.children:
                result = self.tick(child, context)
                if result != Status.FAILURE:
                    return result  # propagate SUCCESS or RUNNING
            return Status.FAILURE
        elif isinstance(node, Decorator):
            child_result = self.tick(node.child, context)
            return node.modify(child_result)
        elif isinstance(node, Action):
            return node.execute(context)
        elif isinstance(node, Condition):
            return Status.SUCCESS if node.check(context) else Status.FAILURE
```
Common Decorators
- Inverter: Invert child node result
- Repeater: Repeat child node N times
- Timeout: Return failure on timeout
- Cooldown: Do not execute during cooldown period
- ForceSuccess: Always return success
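As an illustration, two of these decorators can be written as small wrappers exposing the `modify` hook the tick loop's Decorator branch expects (the `Status` enum is redefined here so the sketch is self-contained):

```python
from enum import Enum

class Status(Enum):
    SUCCESS = 1
    FAILURE = 2
    RUNNING = 3

class Inverter:
    """Swap SUCCESS and FAILURE; RUNNING passes through unchanged."""
    def __init__(self, child):
        self.child = child

    def modify(self, child_result):
        if child_result == Status.SUCCESS:
            return Status.FAILURE
        if child_result == Status.FAILURE:
            return Status.SUCCESS
        return child_result  # RUNNING

class ForceSuccess:
    """Always report SUCCESS once the child has finished running."""
    def __init__(self, child):
        self.child = child

    def modify(self, child_result):
        return child_result if child_result == Status.RUNNING else Status.SUCCESS
```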
Goal-Oriented Action Planning (GOAP)
GOAP, inspired by the STRIPS planner, lets NPCs autonomously plan action sequences to achieve goals.
Core Concepts
- \(\mathcal{S}\): World state (set of key-value pairs)
- \(\mathcal{A}\): Action set, each action with preconditions and effects
- \(\mathcal{G}\): Goal set, each goal with desired world state
```python
class GOAPAction:
    def __init__(self, name, cost, preconditions, effects):
        self.name = name
        self.cost = cost                    # Action cost
        self.preconditions = preconditions  # Preconditions {key: value}
        self.effects = effects              # Effects {key: value}

# Define actions
actions = [
    GOAPAction("get_axe", cost=2,
               preconditions={"has_axe": False},
               effects={"has_axe": True}),
    GOAPAction("chop_tree", cost=4,
               preconditions={"has_axe": True},
               effects={"has_wood": True}),
    GOAPAction("build_fire", cost=3,
               preconditions={"has_wood": True},
               effects={"is_warm": True}),
    GOAPAction("find_shelter", cost=6,
               preconditions={},
               effects={"is_warm": True}),
]

# Goal
goal = {"is_warm": True}

# From a bare-handed start the planner compares find_shelter (cost=6)
# with get_axe -> chop_tree -> build_fire (cost=9) and picks the cheaper.
# The winner depends on the current state: an NPC that already has wood
# reaches the goal with build_fire alone (cost=3), beating find_shelter.
```
A* Planning
GOAP uses backward A* search from the goal state back to the current state:
- \(g(n)\): Actual cost from start to current node
- \(h(n)\): Heuristic estimate (typically the number of unsatisfied conditions)
```mermaid
graph RL
    G["Goal: is_warm=True"] --> A3["build_fire<br/>cost=3"]
    A3 --> A2["chop_tree<br/>cost=4"]
    A2 --> A1["get_axe<br/>cost=2"]
    A1 --> S["Current State<br/>has_axe=False"]
    G --> A4["find_shelter<br/>cost=6"]
    A4 --> S2["Current State<br/>(no preconditions)"]
```
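To make the search concrete, here is a simplified planner sketch. It runs forward from the current state using uniform-cost search (A* with \(h = 0\)), whereas production GOAP implementations typically search backward from the goal as described above; the tuple-based action encoding is an assumption of this sketch.

```python
import heapq
from itertools import count

def plan(state, goal, actions):
    """Uniform-cost forward search over world states.

    `actions` is a list of (name, cost, preconditions, effects) tuples;
    states are dicts of key -> value.
    """
    start = frozenset(state.items())
    tie = count()  # tie-breaker so heapq never compares states directly
    frontier = [(0, next(tie), start, [])]
    best = {start: 0}
    while frontier:
        cost, _, current, path = heapq.heappop(frontier)
        cur = dict(current)
        if all(cur.get(k) == v for k, v in goal.items()):
            return path, cost
        for name, acost, pre, eff in actions:
            if all(cur.get(k) == v for k, v in pre.items()):
                nxt = dict(cur)
                nxt.update(eff)
                key = frozenset(nxt.items())
                ncost = cost + acost
                if ncost < best.get(key, float("inf")):
                    best[key] = ncost
                    heapq.heappush(frontier, (ncost, next(tie), key, path + [name]))
    return None, float("inf")

actions = [
    ("get_axe", 2, {"has_axe": False}, {"has_axe": True}),
    ("chop_tree", 4, {"has_axe": True}, {"has_wood": True}),
    ("build_fire", 3, {"has_wood": True}, {"is_warm": True}),
    ("find_shelter", 6, {}, {"is_warm": True}),
]

print(plan({"has_axe": False}, {"is_warm": True}, actions))
# -> (['find_shelter'], 6): cheapest route to warmth from a bare-handed start
print(plan({"has_wood": True}, {"is_warm": True}, actions))
# -> (['build_fire'], 3): with wood in hand, a fire beats finding shelter
```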
GOAP Advantages and Limitations
Advantages:
- Decoupled actions: Adding new actions requires no modification to other logic
- Automatic planning: NPCs autonomously discover action sequences
- Dynamic adaptation: Automatically replans when the environment changes
Limitations:
- State space must be predefined
- Heuristic function design affects performance
- Not suited for continuous actions and fuzzy goals
Utility AI
Utility AI computes a utility value for each possible action through scoring functions and selects the action with the highest utility:

\[ a^* = \arg\max_{a} \sum_{i} w_i \, f_i(a) \]

where:
- \(a\) is the candidate action
- \(f_i(a)\) is the \(i\)-th evaluation factor's score for action \(a\)
- \(w_i\) is the corresponding weight
Scoring Curve Types
Common scoring function \(f_i(a)\) forms:
- Linear: \(f(x) = mx + b\)
- Quadratic: \(f(x) = ax^2 + bx + c\)
- Logistic: \(f(x) = \frac{1}{1 + e^{-k(x - x_0)}}\)
- Exponential decay: \(f(x) = e^{-\lambda x}\)
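These curves are straightforward to express directly; the parameter defaults below are illustrative choices, not canonical values:

```python
import math

# Response curves mapping a normalized input x (typically in [0, 1]) to a score.
def linear(x, m=1.0, b=0.0):
    return m * x + b

def quadratic(x, a=1.0, b=0.0, c=0.0):
    return a * x * x + b * x + c

def logistic(x, k=10.0, x0=0.5):
    return 1.0 / (1.0 + math.exp(-k * (x - x0)))

def exp_decay(x, lam=3.0):
    return math.exp(-lam * x)

# The logistic curve stays near 0 below x0 and rises sharply past it:
# useful for "ignore hunger until it becomes urgent" style scoring.
for x in (0.0, 0.5, 1.0):
    print(f"x={x:.1f}  logistic={logistic(x):.3f}  decay={exp_decay(x):.3f}")
```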
Example
```python
import math

class UtilityAI:
    def select_action(self, npc, context):
        actions = {
            "eat": self.score_eat(npc),
            "sleep": self.score_sleep(npc),
            "fight": self.score_fight(npc, context),
            "flee": self.score_flee(npc, context),
            "socialize": self.score_socialize(npc, context),
        }
        return max(actions, key=actions.get)

    def score_eat(self, npc):
        hunger = npc.hunger / 100.0  # Normalize to [0, 1]
        # Logistic curve: rises sharply once hunger passes 0.6
        return 1.0 / (1.0 + math.exp(-10 * (hunger - 0.6)))

    def score_fight(self, npc, context):
        if not context.enemy_visible:
            return 0.0
        health_factor = npc.health / 100.0
        weapon_factor = 1.0 if npc.has_weapon else 0.3
        enemy_weakness = 1.0 - context.enemy_health / 100.0
        return 0.4 * health_factor + 0.3 * weapon_factor + 0.3 * enemy_weakness

    def score_flee(self, npc, context):
        if not context.enemy_visible:
            return 0.0
        danger = 1.0 - npc.health / 100.0
        return danger * 0.8  # High flee tendency at low health

    # Placeholder scorers following the same pattern; npc.fatigue and
    # context.friendly_npc_nearby are assumed fields for illustration.
    def score_sleep(self, npc):
        return npc.fatigue / 100.0

    def score_socialize(self, npc, context):
        return 0.2 if context.friendly_npc_nearby else 0.0
```
LLM-Driven NPCs
The advent of LLMs has fundamentally changed the NPC design paradigm, shifting from predefined behavior to generative behavior.
Core Advantages
- Free dialogue: No longer limited to preset dialogue trees
- Dynamic goals: Generate reasonable goals based on context
- Personalization: Unique personalities through persona prompts
- Emergent behavior: Multi-NPC interactions produce unexpected behaviors
Architecture
```python
# MemoryStream, call_llm, and format_memories are assumed helper utilities:
# a retrieval-backed memory store, an LLM API wrapper, and a memory formatter.
class LLM_NPC:
    def __init__(self, name, personality, backstory):
        self.name = name
        self.personality = personality
        self.backstory = backstory
        self.memory_stream = MemoryStream()
        self.current_plan = []

    def generate_response(self, player_input, context):
        """Generate a dialogue response."""
        relevant_memories = self.memory_stream.retrieve(player_input)
        prompt = f"""You are {self.name}. {self.personality}
Backstory: {self.backstory}

Relevant memories:
{format_memories(relevant_memories)}

Current situation: {context}
{player_input}

Respond in character:"""
        response = call_llm(prompt)
        self.memory_stream.add(f"Player said: {player_input}")
        self.memory_stream.add(f"I responded: {response}")
        return response

    def decide_action(self, perception):
        """Decide the next action."""
        memories = self.memory_stream.retrieve(str(perception))
        prompt = f"""You are {self.name}. Given your personality,
memories, and current situation, what should you do next?

Available actions: {perception.available_actions}
Current observation: {perception.description}
Relevant memories: {format_memories(memories)}

Choose one action and explain briefly:"""
        return call_llm(prompt)
```
Complete Technology Comparison
| Feature | FSM | Behavior Tree | GOAP | Utility AI | LLM NPC |
|---|---|---|---|---|---|
| Predictability | Very high | High | Medium | Medium | Low |
| Flexibility | Low | Medium | High | High | Very high |
| Scalability | Poor | Good | Good | Good | Good |
| Dialogue ability | None | None | None | None | Native |
| Development cost | Low | Medium | Medium | Medium | Low (API calls) |
| Runtime cost | Very low | Low | Medium | Low | High (API) |
| Debugging difficulty | Easy | Medium | Medium | Medium | Hard |
| Emergent behavior | None | None | Limited | Limited | Rich |
| Use case | Simple NPCs | AAA games | Strategy games | Simulation games | Open worlds |
| Representative case | Early games | Halo 2, UE | F.E.A.R. | The Sims | Generative Agents (Smallville) |
Evolution Trend
```mermaid
graph LR
    A[FSM<br/>1990s] --> B[Behavior Trees<br/>2000s]
    B --> C[GOAP<br/>2004+]
    B --> D[Utility AI<br/>2010s]
    C --> E[Hybrid Architecture<br/>2015+]
    D --> E
    E --> F[LLM-Driven<br/>2023+]
    style F fill:#f9f,stroke:#333,stroke-width:2px
```
Hybrid Architecture Trend
Modern NPC systems increasingly adopt hybrid architectures:
- LLM + Behavior Tree: LLM handles dialogue and high-level decisions; behavior trees execute specific actions
- LLM + GOAP: LLM generates goals; GOAP plans action sequences
- LLM + Utility AI: LLM evaluates situations; Utility AI selects actions
- Layered architecture: Upper layer LLM (strategic), middle layer GOAP/Utility AI (tactical), lower layer FSM/behavior tree (execution)
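The layered idea can be sketched schematically; `fake_llm` is a stand-in for a real LLM call, and the plan lookup table stands in for a full GOAP planner or behavior tree:

```python
def fake_llm(situation):
    # Strategic layer: an LLM would map the situation to a high-level goal.
    return "get_warm" if "cold" in situation else "stay_idle"

PLANS = {
    # Tactical layer: goal -> action sequence (a planner would compute this).
    "get_warm": ["get_axe", "chop_tree", "build_fire"],
    "stay_idle": ["wait"],
}

def run_npc(situation):
    goal = fake_llm(situation)        # strategic: pick the goal
    plan = PLANS.get(goal, ["wait"])  # tactical: expand goal into actions
    for action in plan:               # execution: an FSM/BT would run each step
        print(f"executing {action}")
    return goal, plan

run_npc("the night is cold and the NPC has no fire")
```

The key design point is that the LLM is consulted rarely (and its output constrained to a known goal vocabulary), while the cheap, predictable lower layers run every frame.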
Summary
NPC behavior control has evolved from deterministic FSMs to generative LLMs, with each generation of technology solving the core limitations of the previous one while introducing new challenges. The future direction is hybrid architecture: leveraging the flexibility of LLMs and the controllability of traditional methods to build NPCs that are both intelligent and reliable.