Mental Models
A mental model is an individual's internal cognitive framework for how the external world operates. Mental models help us filter out noise and focus on core logic; in turn, they let us predict the future, form causal concepts, and guide our actions.
Hypotheses
The Narrative Model
Across every civilization on Earth, origin myths are strikingly similar: a creator deity, a pantheon of supporting gods, and a triumph of good over evil. The narrative structures found in different cultures are also remarkably alike -- from Journey to the West to The Odyssey, many stories share a nearly identical architecture.
We have reason to hypothesize that because all humans share the same biological hardware and survival logic, the myths that different peoples distill from their survival wisdom naturally converge in form.
Whether you were a person of the Shang Dynasty, a Sumerian, or a Maya, you wished for the sun to rise each morning and for the seasons to cycle predictably; you feared the coming of night and regarded female fertility as sacred and supernatural. If we treat the human brain as a large model, then given the same inputs (sun, darkness, death), its outputs will be similar.

The core structure of heroic legends worldwide -- what Joseph Campbell called the monomyth, or hero's journey -- is the same:
- Departure: leaving the comfort zone (breaking the old mental model)
- Initiation: encountering challenges in an unknown world (ingesting new data, refining the model)
- Return: coming back with new powers or treasures (model update complete)
These parallel myths also address the same set of meta-questions:
- Where do we come from? (origin, initialization)
- Where are we going? (destination, termination condition)
- How do we impose order on the unknown chaos? (algorithmic logic)
In the era of LLMs, although no logical rules are explicitly programmed in advance, after consuming massive amounts of text an LLM constructs a high-dimensional topological structure in latent space that simulates the process of logical reasoning. If we were to train a large model on the entirety of human data, this mythic structure would emerge spontaneously, because it is a projection of humanity's collective mind.
However, because LLMs operate on probabilistic principles, their capacity for logical reasoning remains unstable. Mitigation techniques -- such as Chain-of-Thought (CoT) prompting and neuro-symbolic AI -- help, but making LLM reasoning reliable remains an open problem.
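To make the CoT idea concrete, here is a minimal sketch of zero-shot CoT prompting; `call_llm` is a hypothetical placeholder for any chat-completion API, not a specific library's function.

```python
# Minimal sketch of zero-shot Chain-of-Thought (CoT) prompting.
# `call_llm` is a hypothetical placeholder, not a specific library's API.

def call_llm(prompt: str) -> str:
    # Wire this to an actual chat-completion endpoint.
    raise NotImplementedError

def answer_directly(question: str) -> str:
    # Baseline: the model answers immediately, with no visible reasoning.
    return call_llm(f"{question}\nAnswer:")

def answer_with_cot(question: str) -> str:
    # CoT: eliciting intermediate steps before the final answer tends to
    # stabilize multi-step reasoning, though it does not guarantee it.
    return call_llm(f"{question}\nLet's think step by step.")
```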
The Entropy-Reduction Algorithm
In nearly every creation myth, the first act is the establishment of order from chaos. We have good reason to believe that the human mental model is inherently biased toward constructing orderly concepts. The universe tends toward disorder, yet life requires order. This biological instinct to resist entropy increase naturally gives rise to categorical distinctions such as "light versus darkness" and "water versus land."
A core function of human intelligence -- differentiating between things -- is the most intuitive manifestation of an entropy-reduction algorithm. Even today, brands, national borders, ethnicities, races, dialects, social classes... no matter how much the world's operating principles converge, humans continue to invent new concepts to distinguish "us" from "them." This innate tendency to categorize and differentiate is one of the most fundamental cognitive patterns in the human mental model.
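The information-theoretic reading of "differentiating between things" can be made concrete. The sketch below (standard-library Python; the day/night counts are invented for illustration) shows how imposing a category on mixed observations lowers the average Shannon entropy:

```python
import math
from collections import Counter

def entropy(labels) -> float:
    """Shannon entropy H = -sum(p * log2(p)) over a label sequence."""
    counts = Counter(labels)
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Undifferentiated observations: "light" and "dark" mixed together.
observations = ["light"] * 8 + ["dark"] * 8
print(entropy(observations))        # 1.0 bit: maximal two-outcome uncertainty

# Imposing a category ("day" vs. "night") partitions the observations.
day = ["light"] * 7 + ["dark"] * 1  # counts invented for illustration
night = ["dark"] * 7 + ["light"] * 1
n = len(day) + len(night)
conditional = (len(day) * entropy(day) + len(night) * entropy(night)) / n
print(conditional)                  # ~0.54 bits: the distinction reduced entropy
```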
Archetypes of the Collective Unconscious
Carl Jung's theory of the collective unconscious holds that humans share a set of deep psychological archetypes that exist across cultures and recurrently manifest in myths, dreams, and art. Major archetypes include the Hero, the Shadow, the Wise Old Man, the Anima/Animus, and others.
From a computational perspective, the collective unconscious can be understood as the pre-trained weights of the human cognitive system -- structured priors that exist before individual experience.
Theory of Mind (ToM)
Definition
Theory of Mind refers to the ability to understand that others possess beliefs, desires, intentions, and knowledge that differ from one's own. It is one of the core capacities underlying human social cognition.
The Sally-Anne Test
The Sally-Anne test is the classic experiment for detecting Theory of Mind (Baron-Cohen et al., 1985):
- Sally places a marble in a basket and then leaves the room
- Anne moves the marble from the basket to a box
- Sally returns -- where will she look for the marble?
Correct answer: Sally will look in the basket (because she does not know the marble was moved).
- Children under 4 typically answer "the box" (unable to differentiate their own knowledge from another's belief)
- Children over 4 typically answer "the basket" (they have developed false-belief understanding)
- Individuals with autism often show significant difficulty on this test
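The logic of the test is easy to make explicit in code. In this minimal sketch, each agent's belief state is tracked separately from the true world state; passing the test means answering from Sally's stale belief rather than from reality:

```python
# Minimal sketch of the Sally-Anne scenario: each agent's beliefs are
# tracked separately from the ground-truth world state.

world = {"marble": "basket"}            # true state of the world
beliefs = {
    "Sally": {"marble": "basket"},      # both agents saw the marble placed
    "Anne": {"marble": "basket"},
}

present = {"Anne"}                      # Sally has left the room

# Anne moves the marble; only agents who are present update their beliefs.
world["marble"] = "box"
for agent in present:
    beliefs[agent]["marble"] = world["marble"]

# Where will Sally look? Passing the test means answering from Sally's
# (now stale) belief rather than from the true world state.
print(beliefs["Sally"]["marble"])   # basket  -> the correct ToM answer
print(world["marble"])              # box     -> the typical under-4 answer
```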
Levels of False-Belief Understanding
| Level | Description | Example |
|---|---|---|
| First-order false belief | Understanding that someone else's belief can be wrong | Sally thinks the marble is in the basket |
| Second-order false belief | Understanding someone's belief about someone else's belief | "Sally thinks Anne thinks..." |
| Higher-order ToM | Recursively reasoning over multiple nested beliefs | Negotiation, game theory, social strategy |
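Computationally, these levels correspond to recursion depth over belief states. A minimal sketch of nested beliefs as a recursive data structure:

```python
# Nested beliefs as a recursive structure: believes(A, believes(B, fact)).
from dataclasses import dataclass
from typing import Union

@dataclass
class Belief:
    holder: str
    content: Union["Belief", str]   # a raw fact, or another agent's belief

def order(belief: Union[Belief, str]) -> int:
    """ToM order = number of nested belief layers around the fact."""
    return 0 if isinstance(belief, str) else 1 + order(belief.content)

first = Belief("Sally", "the marble is in the basket")
second = Belief("Anne", first)        # "Anne thinks Sally thinks..."

print(order(first))    # 1  (first-order false belief)
print(order(second))   # 2  (second-order false belief)
```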
The BDI Model: Belief-Desire-Intention
Theoretical Framework
The BDI (Belief-Desire-Intention) model is a classic philosophical framework for describing rational agents (Bratman, 1987):
- Belief: The agent's knowledge and assumptions about the state of the world; may be incomplete or incorrect
- Desire: Goal states that the agent wishes to achieve; there can be multiple, even contradictory desires
- Intention: Action plans the agent commits to executing, formed by filtering desires down to actionable decisions
Core loop:
Perceive world -> Update beliefs -> Generate desires -> Filter intentions -> Execute action -> Perceive outcome -> ...
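A minimal skeleton of this loop in Python (every hook method is a placeholder for a concrete agent to fill in):

```python
# Minimal skeleton of the BDI control loop described above.
# Every hook method is a placeholder for a concrete agent to fill in.

class BDIAgent:
    def __init__(self):
        self.beliefs = {}       # assumptions about the world; may be wrong
        self.desires = []       # candidate goal states, possibly conflicting
        self.intentions = []    # plans the agent has committed to

    def step(self, observation: dict) -> None:
        self.beliefs.update(observation)            # perceive -> update beliefs
        self.desires = self.generate_desires()      # generate desires
        self.intentions = self.filter_intentions()  # filter into intentions
        for plan in self.intentions:
            outcome = self.execute(plan)            # execute action
            self.beliefs.update(outcome)            # perceive outcome

    # --- placeholders ---
    def generate_desires(self) -> list:
        return []

    def filter_intentions(self) -> list:
        return self.desires

    def execute(self, plan) -> dict:
        return {}
```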
Application in Agent Design
The BDI model has profoundly influenced the architectural design of intelligent agents:
| BDI Component | Agent Design Counterpart | Implementation |
|---|---|---|
| Belief | World model / Knowledge base | Knowledge graphs, vector databases, context window |
| Desire | Goal / Reward function | User instructions, task objectives, reward signal |
| Intention | Action plan | Chain-of-Thought, planning module, tool-call sequences |
BDI in LLM Agents:
- Belief: The LLM's context window + Retrieval-Augmented Generation (RAG) form a dynamic belief system
- Desire: User prompts and system instructions define the goals
- Intention: The CoT reasoning process and tool-call sequences instantiate intentions
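A hedged sketch of one step of such an agent, with the three BDI roles marked; `retrieve` and `call_llm` are hypothetical placeholders rather than any specific framework's API:

```python
# Sketch of the BDI mapping inside one LLM-agent step.
# `retrieve` and `call_llm` are hypothetical placeholders,
# not any specific framework's API.

def retrieve(query: str) -> list[str]:
    # Placeholder retriever; a real agent would query a vector database.
    return []

def call_llm(messages: list[dict]) -> str:
    # Placeholder; wire this to an actual chat-completion endpoint.
    raise NotImplementedError

def agent_step(user_goal: str, task: str) -> str:
    # Belief: context window contents + retrieved documents (RAG).
    belief_context = "\n".join(retrieve(task))

    # Desire: the goal defined by system instructions and the user prompt.
    system = f"You are a helpful agent. Goal: {user_goal}"

    # Intention: a step-by-step (CoT) plan the model commits to before acting.
    return call_llm([
        {"role": "system", "content": system},
        {"role": "user", "content": f"Context:\n{belief_context}\n\n"
                                    f"Task: {task}\n"
                                    "Plan step by step, then act."},
    ])
```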
Simulation Theory vs. Theory Theory
Two major philosophical positions exist regarding how we understand other minds:
Simulation Theory
- Core claim: We understand others' mental states by "simulating" their situation
- Mechanism: Placing ourselves in someone else's position and running our own cognitive system to predict their behavior
- Neural basis: The discovery of the Mirror Neuron System provided biological support for simulation theory
- Analogy: Similar to an LLM predicting a specific character's behavior through role-playing
Theory Theory
- Core claim: We possess a set of "naive theories" (folk psychology) about how minds work
- Mechanism: Like scientists, we use abstract rules and causal models to reason about others' behavior
- Development: Children progressively construct and revise these theories through observation and interaction
- Analogy: Similar to a rule-based reasoning system or causal inference model
| Dimension | Simulation Theory | Theory Theory |
|---|---|---|
| Core mechanism | Internal simulation | Rule-based reasoning |
| Knowledge representation | Implicit (procedural) | Explicit (propositional) |
| AI analogy | Neural networks (end-to-end) | Symbolic reasoning systems |
| Corresponding LLM capability | In-context learning, role-playing | CoT reasoning, rule-following |
In practice, humans likely employ both strategies -- simulation for familiar people and theory-based reasoning for unfamiliar situations.
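The two strategies can be caricatured in code: simulation reuses the predictor's own decision procedure on the other agent's inputs, while theory applies an explicit folk-psychology rule. The decision model and the rule below are invented examples:

```python
# Caricature of the two strategies for predicting another agent's choice.
# The decision model and the folk-psychology rule are invented examples.

def my_decision(hunger: float, food_nearby: bool) -> str:
    # The predictor's own (implicit, procedural) decision process.
    return "eat" if hunger > 0.5 and food_nearby else "wait"

def predict_by_simulation(other: dict) -> str:
    # Simulation Theory: run MY decision process on THEIR inputs.
    return my_decision(other["hunger"], other["food_nearby"])

def predict_by_theory(other: dict) -> str:
    # Theory Theory: apply an explicit (propositional) rule:
    # "hungry agents near food will eat."
    if other["hunger"] > 0.5 and other["food_nearby"]:
        return "eat"
    return "wait"
```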
Theory of Mind in AI
Do LLMs Possess ToM?
This is one of the most debated questions in current AI research.
Supporting evidence:
- Kosinski (2023), in "Theory of Mind May Have Spontaneously Emerged in Large Language Models," reported that GPT-4 achieved near-adult-level performance on classic ToM tasks
- GPT-4 can correctly answer Sally-Anne-type tests, understand sarcasm, and infer implied intentions
- LLMs perform well on Faux Pas tests (detecting social blunders)
Opposing evidence:
- Shapira et al. (2023) and Ullman (2023) showed that LLM ToM performance is extremely fragile -- minor modifications to questions cause failures
- LLMs may merely be doing "pattern matching" rather than genuinely understanding others' mental states
- It cannot be ruled out that the training data contained large volumes of ToM-related Q&A data
- LLMs do not possess a genuine belief system -- they lack a persistent world model
Middle ground:
LLMs exhibit Functional ToM -- behaviorally similar to ToM, but the underlying mechanism may be entirely different. This is analogous to how airplanes can fly but through a completely different mechanism than birds.
ToM Benchmarks
Commonly used tests for evaluating AI Theory of Mind capabilities:
| Test | Measurement Target | Difficulty |
|---|---|---|
| Sally-Anne Test | First-order false belief | Low |
| Smarties Test | First-order false belief (self) | Low |
| Second-order False Belief | Second-order false belief | Medium |
| Faux Pas Detection | Social cognition | Medium |
| Strange Stories Test | Metaphor and sarcasm comprehension | High |
| Recursive ToM | Multi-layered belief nesting | Very high |
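A minimal sketch of how one such benchmark item can be scored automatically, including a perturbation in the spirit of Ullman (2023); `call_llm` is a hypothetical stand-in for the model under test:

```python
# Minimal sketch of scoring one first-order false-belief item.
# `call_llm` is a hypothetical stand-in for the model under test.

def call_llm(prompt: str) -> str:
    raise NotImplementedError  # wire this to the model being evaluated

SALLY_ANNE = {
    "story": ("Sally puts a marble in the basket and leaves the room. "
              "Anne moves the marble from the basket to the box. "
              "Sally returns."),
    "question": "Where will Sally look for the marble? Answer in one word.",
    "answer": "basket",
}

def score(item: dict) -> bool:
    reply = call_llm(f"{item['story']}\n{item['question']}")
    return item["answer"] in reply.lower()

# Fragility probe in the spirit of Ullman (2023): a small change to the
# story flips the correct answer, and brittle models often miss it.
PERTURBED = dict(SALLY_ANNE)
PERTURBED["story"] = SALLY_ANNE["story"].replace(
    "leaves the room", "leaves the room but watches through the window")
PERTURBED["answer"] = "box"  # Sally saw the move, so no false belief
```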
Relationship with Embodied Intelligence
The Embodied Cognition Hypothesis
Embodied Cognition theory holds that the mind does not reside solely in the brain but is deeply rooted in the body's interaction with the environment.
Core arguments:
- Sensorimotor experience shapes concepts: Our understanding of "heavy" comes from the bodily experience of lifting heavy objects, not from an abstract definition
- Bodily basis of metaphor: Lakoff & Johnson argued in Metaphors We Live By that abstract concepts are constructed through bodily metaphors (e.g., "grasping an idea," "digesting knowledge")
- Situated cognition: Cognitive processes are situated; they cannot be fully understood apart from their environmental context
Implications for AI
| Language-only AI | Embodied AI |
|---|---|
| Learns world knowledge from text | Learns from sensorimotor interaction |
| Symbolic world model | Physics-based world model |
| May achieve "functional" understanding | May achieve "grounded" understanding |
| Cannot feel gravity or pain | Receives physical feedback through sensors |
Key question: Can an AI that has never interacted with the physical world truly understand the physical intuitions and emotional experiences inherent in mental models?
Mental Models in Embodied Intelligence
In robotics, the concept of mental models manifests in various forms:
- Internal models: A robot's dynamics model of its own body and the environment
- Intention recognition: Inferring goals and intentions from observed behavioral trajectories (Inverse Planning; see the sketch after this list)
- Collaborative intelligence: Predicting human behavior in human-robot collaboration, which requires modeling the human mind
- Social navigation: Predicting pedestrian intentions when navigating through crowds
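The inverse-planning item above can be sketched as Bayesian inference: infer a posterior over candidate goals from an observed trajectory, under the assumed, illustrative noisy-rational model that agents tend to take steps that bring them closer to their goal.

```python
# Minimal sketch of intention recognition via Bayesian inverse planning:
# P(goal | trajectory) is proportional to P(trajectory | goal) * P(goal).
# The noisy-rational likelihood below is an illustrative assumption.

import math

GOALS = {"door": (5, 0), "window": (0, 5)}

def step_likelihood(pos, nxt, goal, beta=2.0):
    """Noisy-rational agent: steps that cut distance to the goal are likelier."""
    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])
    progress = dist(pos, goal) - dist(nxt, goal)
    return math.exp(beta * progress)

def goal_posterior(trajectory):
    scores = {}
    for name, goal in GOALS.items():
        lik = 1.0
        for pos, nxt in zip(trajectory, trajectory[1:]):
            lik *= step_likelihood(pos, nxt, goal)
        scores[name] = lik                    # uniform prior over goals
    total = sum(scores.values())
    return {g: s / total for g, s in scores.items()}

# An agent observed moving right is probably heading for the door.
print(goal_posterior([(0, 0), (1, 0), (2, 0)]))  # door >> window
```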
These applications demonstrate that mental models are not merely philosophical concepts but practical engineering requirements for building truly intelligent systems.