NPC Behavior Evolution
Overview
Behavior control for game NPCs (non-player characters) has evolved from simple finite state machines to LLM-driven generative systems. This article traces that evolutionary path and compares the strengths and weaknesses of each approach.
Finite State Machines (FSM)
Finite state machines are the earliest and simplest NPC behavior control method.
Basic Structure

An FSM is formally defined as the tuple \((S, s_0, \Sigma, \delta, F)\):
- \(S\): State set (e.g., Idle, Patrol, Chase, Attack)
- \(s_0\): Initial state
- \(\Sigma\): Input event set (e.g., see_enemy, lose_sight, health_low)
- \(\delta: S \times \Sigma \rightarrow S\): State transition function
- \(F\): Terminal state set
```mermaid
stateDiagram-v2
    [*] --> Idle
    Idle --> Patrol: start_patrol
    Patrol --> Chase: see_enemy
    Patrol --> Idle: patrol_complete
    Chase --> Attack: in_range
    Chase --> Patrol: lose_sight
    Attack --> Chase: out_of_range
    Attack --> Flee: health_low
    Flee --> Idle: safe_distance
```
Implementation Example
```python
class FSM_NPC:
    def __init__(self):
        self.state = "idle"
        self.health = 100
        self.attack_range = 2.0
        self.target = None

    def update(self, perception):
        if self.state == "idle":
            if perception.patrol_timer_expired:
                self.state = "patrol"
        elif self.state == "patrol":
            if perception.enemy_visible:
                self.state = "chase"
                self.target = perception.nearest_enemy
            elif perception.patrol_complete:
                self.state = "idle"
        elif self.state == "chase":
            if self.distance_to(self.target) < self.attack_range:
                self.state = "attack"
            elif not perception.enemy_visible:
                self.state = "patrol"
        elif self.state == "attack":
            if self.health < 20:
                self.state = "flee"
            elif self.distance_to(self.target) > self.attack_range:
                self.state = "chase"
```
Limitations of FSMs
| Problem | Description |
|---|---|
| State explosion | Complex behaviors require many states; the number of transitions can grow quadratically with the state count |
| Rigid transitions | Lack of flexibility; behavior patterns are fixed |
| Difficult to extend | Adding new behaviors requires modifying extensive transition logic |
| Lack of context | Cannot consider history and contextual information |
Hierarchical Finite State Machines (HFSM) partially alleviate the state explosion problem through state nesting, but fundamental limitations remain.
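A minimal sketch of the nesting idea (class, state, and event names here are invented for illustration): a composite state delegates events to its sub-machine first, and a single top-level transition can exit the entire sub-machine, so the sub-machine's internals never multiply the top-level transition count.

```python
class StateMachine:
    """Minimal HFSM: a state is either a string or a nested StateMachine."""
    def __init__(self, label, initial, transitions):
        self.label = label
        self.state = initial
        self.transitions = transitions  # {(state_label, event): next_state}

    def handle(self, event):
        # Give the innermost active sub-machine the first chance to consume it.
        if isinstance(self.state, StateMachine) and self.state.handle(event):
            return True
        label = self.state.label if isinstance(self.state, StateMachine) else self.state
        nxt = self.transitions.get((label, event))
        if nxt is not None:
            self.state = nxt
            return True
        return False

# "combat" is one composite state; its internals stay hidden from the top level.
combat = StateMachine("combat", "chase", {
    ("chase", "in_range"): "attack",
    ("attack", "out_of_range"): "chase",
})
npc = StateMachine("npc", "patrol", {
    ("patrol", "see_enemy"): combat,
    ("combat", "lose_sight"): "patrol",  # one exit edge covers the whole sub-machine
})
npc.handle("see_enemy")   # enter combat sub-machine (starts in "chase")
npc.handle("in_range")    # handled inside combat: chase -> attack
npc.handle("lose_sight")  # bubbles up to the top level: combat -> patrol
```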
Behavior Trees
Behavior trees are the industry standard for game AI, widely used in AAA titles (popularized by Halo 2; Unreal Engine ships behavior trees as a built-in feature).
Core Node Types
```mermaid
graph TD
    Root[Root<br/>->] --> Sel1[Selector<br/>?]
    Sel1 --> Seq1[Sequence<br/>-> Combat]
    Sel1 --> Seq2[Sequence<br/>-> Patrol]
    Sel1 --> Act1[Action<br/>Idle]
    Seq1 --> Cond1[Condition<br/>Enemy Visible?]
    Seq1 --> Sel2[Selector<br/>? Attack/Flee]
    Sel2 --> Seq3[Sequence<br/>-> Attack]
    Sel2 --> Act2[Action<br/>Flee]
    Seq3 --> Cond2[Condition<br/>Health > 20?]
    Seq3 --> Act3[Action<br/>Attack Enemy]
    Seq2 --> Act4[Action<br/>Move to Waypoint]
    Seq2 --> Cond3[Condition<br/>At Waypoint?]
    Seq2 --> Act5[Action<br/>Wait]
```
Node Type Details
| Node Type | Symbol | Behavior | On Child Failure | On Child Success |
|---|---|---|---|---|
| Sequence | -> | Execute all children in order | Return failure immediately | Continue to next |
| Selector | ? | Try children until one succeeds | Try next | Return success immediately |
| Decorator | ◇ | Modify child behavior | Depends on decorator | Depends on decorator |
| Condition | □ | Check a condition | - | - |
| Action | ○ | Execute an action | - | - |
Tick Mechanism
Behavior trees are driven by ticks, traversing from the root node each frame:
- SUCCESS: Node executed successfully
- FAILURE: Node execution failed
- RUNNING: Node is still executing (e.g., moving toward a target)
```python
from enum import Enum

class Status(Enum):
    SUCCESS = 1
    FAILURE = 2
    RUNNING = 3

# Sequence, Selector, Decorator, Action, and Condition are node classes
# assumed to be defined elsewhere (with children/child/modify/execute/check).
class BehaviorTree:
    def tick(self, node, context):
        if isinstance(node, Sequence):
            for child in node.children:
                result = self.tick(child, context)
                if result != Status.SUCCESS:
                    return result  # propagate FAILURE or RUNNING
            return Status.SUCCESS
        elif isinstance(node, Selector):
            for child in node.children:
                result = self.tick(child, context)
                if result != Status.FAILURE:
                    return result  # propagate SUCCESS or RUNNING
            return Status.FAILURE
        elif isinstance(node, Decorator):
            child_result = self.tick(node.child, context)
            return node.modify(child_result)
        elif isinstance(node, Action):
            return node.execute(context)
        elif isinstance(node, Condition):
            return Status.SUCCESS if node.check(context) else Status.FAILURE
```
Common Decorators
- Inverter: Invert child node result
- Repeater: Repeat child node N times
- Timeout: Return failure on timeout
- Cooldown: Do not execute during cooldown period
- ForceSuccess: Always return success
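As an illustration, two of these decorators can be written as small wrappers exposing the `modify` hook the tick loop's Decorator branch expects (the `Status` enum is redefined here so the sketch is self-contained):

```python
from enum import Enum

class Status(Enum):
    SUCCESS = 1
    FAILURE = 2
    RUNNING = 3

class Inverter:
    """Swap SUCCESS and FAILURE; RUNNING passes through unchanged."""
    def __init__(self, child):
        self.child = child

    def modify(self, child_result):
        if child_result == Status.SUCCESS:
            return Status.FAILURE
        if child_result == Status.FAILURE:
            return Status.SUCCESS
        return child_result  # RUNNING

class ForceSuccess:
    """Always report SUCCESS once the child has finished running."""
    def __init__(self, child):
        self.child = child

    def modify(self, child_result):
        return child_result if child_result == Status.RUNNING else Status.SUCCESS
```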
Goal-Oriented Action Planning (GOAP)
GOAP, inspired by the STRIPS planner, lets NPCs autonomously plan action sequences to achieve goals.
Core Concepts
- \(\mathcal{S}\): World state (set of key-value pairs)
- \(\mathcal{A}\): Action set, each action with preconditions and effects
- \(\mathcal{G}\): Goal set, each goal with desired world state
```python
class GOAPAction:
    def __init__(self, name, cost, preconditions, effects):
        self.name = name
        self.cost = cost                    # Action cost
        self.preconditions = preconditions  # Preconditions {key: value}
        self.effects = effects              # Effects {key: value}

# Define actions
actions = [
    GOAPAction("get_axe", cost=2,
               preconditions={"has_axe": False},
               effects={"has_axe": True}),
    GOAPAction("chop_tree", cost=4,
               preconditions={"has_axe": True},
               effects={"has_wood": True}),
    GOAPAction("build_fire", cost=3,
               preconditions={"has_wood": True},
               effects={"is_warm": True}),
    GOAPAction("find_shelter", cost=6,
               preconditions={},
               effects={"is_warm": True}),
]

# Goal
goal = {"is_warm": True}

# From a bare-handed start the planner compares find_shelter (cost=6)
# with get_axe -> chop_tree -> build_fire (cost=9) and picks the cheaper.
# The winner depends on the current state: an NPC that already has wood
# reaches the goal with build_fire alone (cost=3), beating find_shelter.
```
A* Planning
GOAP uses backward A* search from the goal state back to the current state:
- \(g(n)\): Actual cost from start to current node
- \(h(n)\): Heuristic estimate (typically the number of unsatisfied conditions)
```mermaid
graph RL
    G["Goal: is_warm=True"] --> A3["build_fire<br/>cost=3"]
    A3 --> A2["chop_tree<br/>cost=4"]
    A2 --> A1["get_axe<br/>cost=2"]
    A1 --> S["Current State<br/>has_axe=False"]
    G --> A4["find_shelter<br/>cost=6"]
    A4 --> S2["Current State<br/>(no preconditions)"]
```
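To make the search concrete, here is a simplified planner sketch. It runs forward from the current state using uniform-cost search (A* with \(h = 0\)), whereas production GOAP implementations typically search backward from the goal as described above; the tuple-based action encoding is an assumption of this sketch.

```python
import heapq
from itertools import count

def plan(state, goal, actions):
    """Uniform-cost forward search over world states.

    `actions` is a list of (name, cost, preconditions, effects) tuples;
    states are dicts of key -> value.
    """
    start = frozenset(state.items())
    tie = count()  # tie-breaker so heapq never compares states directly
    frontier = [(0, next(tie), start, [])]
    best = {start: 0}
    while frontier:
        cost, _, current, path = heapq.heappop(frontier)
        cur = dict(current)
        if all(cur.get(k) == v for k, v in goal.items()):
            return path, cost
        for name, acost, pre, eff in actions:
            if all(cur.get(k) == v for k, v in pre.items()):
                nxt = dict(cur)
                nxt.update(eff)
                key = frozenset(nxt.items())
                ncost = cost + acost
                if ncost < best.get(key, float("inf")):
                    best[key] = ncost
                    heapq.heappush(frontier, (ncost, next(tie), key, path + [name]))
    return None, float("inf")

actions = [
    ("get_axe", 2, {"has_axe": False}, {"has_axe": True}),
    ("chop_tree", 4, {"has_axe": True}, {"has_wood": True}),
    ("build_fire", 3, {"has_wood": True}, {"is_warm": True}),
    ("find_shelter", 6, {}, {"is_warm": True}),
]

print(plan({"has_axe": False}, {"is_warm": True}, actions))
# -> (['find_shelter'], 6): cheapest route to warmth from a bare-handed start
print(plan({"has_wood": True}, {"is_warm": True}, actions))
# -> (['build_fire'], 3): with wood in hand, a fire beats finding shelter
```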
GOAP Advantages and Limitations
Advantages:
- Decoupled actions: Adding new actions requires no modification to other logic
- Automatic planning: NPCs autonomously discover action sequences
- Dynamic adaptation: Automatically replans when the environment changes
Limitations:
- State space must be predefined
- Heuristic function design affects performance
- Not suited for continuous actions and fuzzy goals
Utility AI
Utility AI computes a utility value for each possible action through scoring functions and selects the action with the highest utility:

\[ a^* = \arg\max_{a} \sum_{i} w_i \, f_i(a) \]

where:
- \(a\) is the candidate action
- \(f_i(a)\) is the \(i\)-th evaluation factor's score for action \(a\)
- \(w_i\) is the corresponding weight
Scoring Curve Types
Common scoring function \(f_i(a)\) forms:
- Linear: \(f(x) = mx + b\)
- Quadratic: \(f(x) = ax^2 + bx + c\)
- Logistic: \(f(x) = \frac{1}{1 + e^{-k(x - x_0)}}\)
- Exponential decay: \(f(x) = e^{-\lambda x}\)
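These curves are straightforward to express directly; the parameter defaults below are illustrative choices, not canonical values:

```python
import math

# Response curves mapping a normalized input x (typically in [0, 1]) to a score.
def linear(x, m=1.0, b=0.0):
    return m * x + b

def quadratic(x, a=1.0, b=0.0, c=0.0):
    return a * x * x + b * x + c

def logistic(x, k=10.0, x0=0.5):
    return 1.0 / (1.0 + math.exp(-k * (x - x0)))

def exp_decay(x, lam=3.0):
    return math.exp(-lam * x)

# The logistic curve stays near 0 below x0 and rises sharply past it:
# useful for "ignore hunger until it becomes urgent" style scoring.
for x in (0.0, 0.5, 1.0):
    print(f"x={x:.1f}  logistic={logistic(x):.3f}  decay={exp_decay(x):.3f}")
```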
Example
```python
import math

class UtilityAI:
    def select_action(self, npc, context):
        actions = {
            "eat": self.score_eat(npc),
            "sleep": self.score_sleep(npc),
            "fight": self.score_fight(npc, context),
            "flee": self.score_flee(npc, context),
            "socialize": self.score_socialize(npc, context),
        }
        return max(actions, key=actions.get)

    def score_eat(self, npc):
        hunger = npc.hunger / 100.0  # Normalize to [0, 1]
        # Logistic curve: rises sharply once hunger passes 0.6
        return 1.0 / (1.0 + math.exp(-10 * (hunger - 0.6)))

    def score_fight(self, npc, context):
        if not context.enemy_visible:
            return 0.0
        health_factor = npc.health / 100.0
        weapon_factor = 1.0 if npc.has_weapon else 0.3
        enemy_weakness = 1.0 - context.enemy_health / 100.0
        return 0.4 * health_factor + 0.3 * weapon_factor + 0.3 * enemy_weakness

    def score_flee(self, npc, context):
        if not context.enemy_visible:
            return 0.0
        danger = 1.0 - npc.health / 100.0
        return danger * 0.8  # High flee tendency at low health

    # Placeholder scorers following the same pattern; npc.fatigue and
    # context.friendly_npc_nearby are assumed fields for illustration.
    def score_sleep(self, npc):
        return npc.fatigue / 100.0

    def score_socialize(self, npc, context):
        return 0.2 if context.friendly_npc_nearby else 0.0
```
LLM-Driven NPCs
The advent of LLMs has fundamentally changed the NPC design paradigm, shifting from predefined behavior to generative behavior.
Core Advantages
- Free dialogue: No longer limited to preset dialogue trees
- Dynamic goals: Generate reasonable goals based on context
- Personalization: Unique personalities through persona prompts
- Emergent behavior: Multi-NPC interactions produce unexpected behaviors
Architecture
```python
# MemoryStream, call_llm, and format_memories are assumed helper utilities:
# a retrieval-backed memory store, an LLM API wrapper, and a memory formatter.
class LLM_NPC:
    def __init__(self, name, personality, backstory):
        self.name = name
        self.personality = personality
        self.backstory = backstory
        self.memory_stream = MemoryStream()
        self.current_plan = []

    def generate_response(self, player_input, context):
        """Generate a dialogue response."""
        relevant_memories = self.memory_stream.retrieve(player_input)
        prompt = f"""You are {self.name}. {self.personality}
Backstory: {self.backstory}

Relevant memories:
{format_memories(relevant_memories)}

Current situation: {context}
{player_input}

Respond in character:"""
        response = call_llm(prompt)
        self.memory_stream.add(f"Player said: {player_input}")
        self.memory_stream.add(f"I responded: {response}")
        return response

    def decide_action(self, perception):
        """Decide the next action."""
        memories = self.memory_stream.retrieve(str(perception))
        prompt = f"""You are {self.name}. Given your personality,
memories, and current situation, what should you do next?

Available actions: {perception.available_actions}
Current observation: {perception.description}
Relevant memories: {format_memories(memories)}

Choose one action and explain briefly:"""
        return call_llm(prompt)
```
Complete Technology Comparison
| Feature | FSM | Behavior Tree | GOAP | Utility AI | LLM NPC |
|---|---|---|---|---|---|
| Predictability | Very high | High | Medium | Medium | Low |
| Flexibility | Low | Medium | High | High | Very high |
| Scalability | Poor | Good | Good | Good | Good |
| Dialogue ability | None | None | None | None | Native |
| Development cost | Low | Medium | Medium | Medium | Low (API calls) |
| Runtime cost | Very low | Low | Medium | Low | High (API) |
| Debugging difficulty | Easy | Medium | Medium | Medium | Hard |
| Emergent behavior | None | None | Limited | Limited | Rich |
| Use case | Simple NPCs | AAA games | Strategy games | Simulation games | Open worlds |
| Representative case | Early games | Halo 2, UE | F.E.A.R. | The Sims | Generative Agents (Smallville) |
Evolution Trend
```mermaid
graph LR
    A[FSM<br/>1990s] --> B[Behavior Trees<br/>2000s]
    B --> C[GOAP<br/>2004+]
    B --> D[Utility AI<br/>2010s]
    C --> E[Hybrid Architecture<br/>2015+]
    D --> E
    E --> F[LLM-Driven<br/>2023+]
    style F fill:#f9f,stroke:#333,stroke-width:2px
```
Hybrid Architecture Trend
Modern NPC systems increasingly adopt hybrid architectures:
- LLM + Behavior Tree: LLM handles dialogue and high-level decisions; behavior trees execute specific actions
- LLM + GOAP: LLM generates goals; GOAP plans action sequences
- LLM + Utility AI: LLM evaluates situations; Utility AI selects actions
- Layered architecture: Upper layer LLM (strategic), middle layer GOAP/Utility AI (tactical), lower layer FSM/behavior tree (execution)
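The layered idea can be sketched schematically; `fake_llm` is a stand-in for a real LLM call, and the plan lookup table stands in for a full GOAP planner or behavior tree:

```python
def fake_llm(situation):
    # Strategic layer: an LLM would map the situation to a high-level goal.
    return "get_warm" if "cold" in situation else "stay_idle"

PLANS = {
    # Tactical layer: goal -> action sequence (a planner would compute this).
    "get_warm": ["get_axe", "chop_tree", "build_fire"],
    "stay_idle": ["wait"],
}

def run_npc(situation):
    goal = fake_llm(situation)        # strategic: pick the goal
    plan = PLANS.get(goal, ["wait"])  # tactical: expand goal into actions
    for action in plan:               # execution: an FSM/BT would run each step
        print(f"executing {action}")
    return goal, plan

run_npc("the night is cold and the NPC has no fire")
```

The key design point is that the LLM is consulted rarely (and its output constrained to a known goal vocabulary), while the cheap, predictable lower layers run every frame.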
Summary
NPC behavior control has evolved from deterministic FSMs to generative LLMs, with each generation of technology solving the core limitations of the previous one while introducing new challenges. The future direction is hybrid architecture: leveraging the flexibility of LLMs and the controllability of traditional methods to build NPCs that are both intelligent and reliable.