
AI Research Paradigms

Introduction

AI research proceeds from different philosophical stances, giving rise to three major paradigms: symbolism, connectionism, and behaviorism. This article offers an in-depth comparison of the theoretical foundations, methodologies, and applicable scenarios of these three paradigms, and discusses the development trends of modern hybrid approaches.

Related content: Symbolic AI, Machine Learning, The Master Algorithm (Domingos's Five Tribes of ML — an orthogonal, complementary view to the three paradigms below)


1. Symbolism (GOFAI)

1.1 Philosophical Foundation

Symbolism stems from the rationalist tradition, with a core assumption:

Physical Symbol System Hypothesis (Newell & Simon, 1976): a physical symbol system has the necessary and sufficient means for general intelligent action.

That is, intelligence can be achieved by operating on symbols (physical patterns) through search, reasoning, and composition.

1.2 Knowledge Representation

Knowledge representation is the central problem of symbolism:

| Representation | Description | Example |
| --- | --- | --- |
| Propositional logic | Propositions + logical connectives | \(P \wedge Q \Rightarrow R\) |
| First-order predicate logic | Variables, quantifiers, predicates | \(\forall x: \text{Human}(x) \Rightarrow \text{Mortal}(x)\) |
| Semantic networks | Nodes + relational edges | "Bird → has wings" |
| Frames | Structured attribute slots | Object(name=..., color=...) |
| Ontologies | Concept hierarchies + relations | OWL, WordNet |
| Knowledge graphs | Entity-relation-entity triples | (Einstein, born_in, Ulm) |
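A knowledge graph of the kind shown in the last row can be sketched as a set of triples with pattern-matching queries. This is a minimal illustrative sketch, not a real triple-store API; the facts and the `query` helper are invented for this example.

```python
# Minimal knowledge-graph sketch: facts are (subject, relation, object)
# triples; a query pattern uses None as a wildcard.
triples = {
    ("Einstein", "born_in", "Ulm"),
    ("Ulm", "located_in", "Germany"),
    ("Einstein", "instance_of", "Human"),
}

def query(s=None, r=None, o=None):
    """Return all triples matching the pattern; None matches anything."""
    return [(ts, tr, to) for (ts, tr, to) in triples
            if s in (None, ts) and r in (None, tr) and o in (None, to)]

print(query(s="Einstein"))          # all facts about Einstein
print(query(r="born_in", o="Ulm"))  # who was born in Ulm?
```

Real systems (e.g., SPARQL over RDF) generalize exactly this pattern-matching idea to billions of triples with indexes and joins.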

1.3 Reasoning Methods

  • Deductive reasoning: from general to specific. \(\{P \Rightarrow Q, P\} \vdash Q\)
  • Inductive reasoning: from specific to general. Observing multiple instances to induce rules
  • Abductive reasoning: from effect to cause. Given \(Q\) and \(P \Rightarrow Q\), hypothesize \(P\) as a plausible (not logically certain) explanation
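The three inference patterns above can be contrasted in a toy sketch. The rule, facts, and helper functions here are invented for illustration; real symbolic systems implement these patterns over full logical languages, not single rules.

```python
# Toy contrast of the three inference patterns over one rule "rains => wet".
rule = ("rains", "wet")  # (P, Q) for P => Q

def deduce(facts, rule):
    """Deduction: from P and P => Q, conclude Q (logically valid)."""
    p, q = rule
    return q if p in facts else None

def abduce(observation, rule):
    """Abduction: given Q and P => Q, hypothesize P (plausible, not valid)."""
    p, q = rule
    return p if observation == q else None

def induce(pairs):
    """Induction: from repeated (P, Q) observations, propose the rule P => Q."""
    antecedents = {p for p, _ in pairs}
    consequents = {q for _, q in pairs}
    if len(antecedents) == 1 and len(consequents) == 1:
        return (antecedents.pop(), consequents.pop())
    return None

print(deduce({"rains"}, rule))                       # 'wet'
print(abduce("wet", rule))                           # 'rains'
print(induce([("rains", "wet"), ("rains", "wet")]))  # ('rains', 'wet')
```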

1.4 Expert Systems

Knowledge base (IF-THEN rules)
    +
Inference engine (forward/backward chaining)
    +
Explanation module
    =
Expert system

Example rule (MYCIN-style):
IF   infection site = blood
AND  Gram stain = negative
AND  morphology = rod-shaped
AND  patient burn area > 30%
THEN pathogen = Pseudomonas aeruginosa (confidence 0.7)
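Forward chaining, as used by such an inference engine, can be sketched in a few lines. This is a hedged illustration: the rule contents and confidence values are paraphrased from the example above, and real engines (MYCIN included) combine confidences far more carefully.

```python
# Minimal forward-chaining sketch: fire every rule whose conditions are all
# satisfied, adding its conclusion to the fact base, until nothing changes.
rules = [
    # (conditions, conclusion, confidence)
    ({"site=blood", "gram=negative", "morphology=rod", "burn>30%"},
     "pathogen=Pseudomonas aeruginosa", 0.7),
    ({"gram=negative", "morphology=rod"}, "class=gram-negative rod", 0.9),
]

def forward_chain(facts, rules):
    """Return {conclusion: confidence} for all rules that eventually fire."""
    derived = {}
    changed = True
    while changed:
        changed = False
        for conditions, conclusion, cf in rules:
            if conditions <= facts and conclusion not in derived:
                derived[conclusion] = cf
                facts = facts | {conclusion}
                changed = True
    return derived

facts = {"site=blood", "gram=negative", "morphology=rod", "burn>30%"}
print(forward_chain(facts, rules))
```

Backward chaining runs the same rules in reverse: start from a goal conclusion and recursively check whether its conditions can be established.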

1.5 Limitations

  • Knowledge acquisition bottleneck: expert knowledge is difficult to fully encode
  • Common sense problem: the Cyc project remains far from complete common-sense coverage after decades of effort
  • Brittleness: cannot handle situations outside the coverage of rules
  • Perception difficulty: struggles with unstructured data like images and speech

2. Connectionism

2.1 Philosophical Foundation

Connectionism stems from the empiricist tradition, inspired by neuroscience:

Intelligence emerges from the large-scale connections and coordinated activity of many simple units (artificial neurons).

  • Knowledge is not explicitly stored symbols, but distributed across connection weights
  • Learning is adjusting connection weights

2.2 Development Trajectory

Perceptron (1958) → Multi-layer feedforward networks (1986, backpropagation)
    → CNN (1998, LeNet) → Deep learning (2012, AlexNet)
    → RNN/LSTM → Transformer (2017)
    → Pre-trained models (2018, BERT/GPT)
    → Large language models (2020+, GPT-3/4)

2.3 Core Ideas

Universal Approximation Theorem: a sufficiently wide single-hidden-layer feedforward network with a suitable nonlinear activation (e.g., a sigmoid) can approximate any continuous function on a compact domain to arbitrary precision.

\[ f(x) = \sum_{i=1}^{N} w_i \sigma(a_i^T x + b_i) \]
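The formula above can be exercised numerically: a small sum of sigmoid units fitted to \(f(x) = x^2\) by plain gradient descent. All settings (10 units, the training grid, the learning rate) are toy choices for illustration, not a proof of the theorem.

```python
import math
import random

# Fit f(x) = sum_i w_i * sigmoid(a_i * x + b_i) to the target x^2 on [0, 1]
# with full-batch gradient descent on the mean squared error.
random.seed(0)
N = 10                                   # hidden units
a = [random.uniform(-2, 2) for _ in range(N)]
b = [random.uniform(-2, 2) for _ in range(N)]
w = [random.uniform(-1, 1) for _ in range(N)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def net(x):
    return sum(w[i] * sigmoid(a[i] * x + b[i]) for i in range(N))

xs = [i / 20.0 for i in range(21)]       # training grid on [0, 1]

def mse():
    return sum((net(x) - x * x) ** 2 for x in xs) / len(xs)

lr = 0.1
before = mse()
for _ in range(500):
    for i in range(N):                   # update each unit's parameters
        gw = ga = gb = 0.0
        for x in xs:
            err = net(x) - x * x
            s = sigmoid(a[i] * x + b[i])
            gw += 2 * err * s
            ga += 2 * err * w[i] * s * (1 - s) * x
            gb += 2 * err * w[i] * s * (1 - s)
        w[i] -= lr * gw / len(xs)
        a[i] -= lr * ga / len(xs)
        b[i] -= lr * gb / len(xs)
after = mse()
print(f"MSE before: {before:.4f}, after: {after:.4f}")
```

The loss drops substantially, illustrating that even a shallow network has the capacity the theorem promises; depth buys efficiency of representation, not capacity.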

Representation Learning: deep networks automatically learn hierarchical representations from raw data to task objectives through multiple layers of abstraction:

Pixels → Edges → Textures → Parts → Objects
              ↑ Automatically learned hierarchical features

2.4 Key Achievements

| Domain | Method | Achievement |
| --- | --- | --- |
| Image recognition | CNN | Surpassed human accuracy on benchmarks |
| Machine translation | Transformer | Near human-level quality |
| Protein folding | AlphaFold | Solved a 50-year grand challenge |
| Text generation | GPT-4 | General-purpose language capability |
| Image generation | Diffusion models | Photo-realistic image synthesis |

2.5 Limitations

  • Poor interpretability: black-box models, difficult to understand decisions
  • Data hungry: requires large amounts of labeled data
  • Compute intensive: training is expensive (GPT-4's training reportedly cost on the order of $100M)
  • Fragile generalization: adversarial examples, distribution shift
  • Lacks causal reasoning: learns correlations rather than causation

3. Behaviorism (Situated AI)

3.1 Philosophical Foundation

Behaviorism is influenced by evolutionary theory and cybernetics:

Intelligence does not require internal representations, but emerges through interaction with the environment.

Brooks (1991) advocated "intelligence without representation":

"The world is its own best model."

3.2 Reinforcement Learning

Core framework: an agent learns through trial and error to maximize cumulative reward in an environment.

\[ \pi^* = \arg\max_\pi \mathbb{E}\left[\sum_{t=0}^{\infty} \gamma^t r_t \mid \pi\right] \]

Key algorithms:

| Algorithm | Type | Characteristics |
| --- | --- | --- |
| Q-Learning | Value-based | Learns the state-action value function |
| SARSA | Value-based | On-policy learning |
| Policy Gradient | Policy-based | Directly optimizes the policy |
| Actor-Critic | Hybrid | Combines value and policy learning |
| PPO | Policy-based | Stable policy optimization |
| DQN | Deep RL | Q-learning with deep networks |
| AlphaZero | Deep RL | Self-play learning |
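Tabular Q-learning, the first algorithm in the table, can be shown end to end on a tiny environment. The 5-state chain MDP, reward scheme, and hyperparameters here are all invented for illustration.

```python
import random

# Tabular Q-learning on a toy 5-state chain: actions move left/right and
# reward +1 is given only on reaching the rightmost (goal) state.
random.seed(0)
N_STATES, ACTIONS = 5, [0, 1]            # 0 = left, 1 = right
GOAL = N_STATES - 1
Q = [[0.0, 0.0] for _ in range(N_STATES)]
alpha, gamma, eps = 0.2, 0.9, 0.2        # learning rate, discount, exploration

def step(s, a):
    s2 = max(0, s - 1) if a == 0 else min(GOAL, s + 1)
    return s2, (1.0 if s2 == GOAL else 0.0)

for _ in range(500):
    s = random.randrange(GOAL)           # random starts improve coverage
    for _ in range(10):
        if random.random() < eps:        # epsilon-greedy action selection
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: Q[s][act])
        s2, r = step(s, a)
        # Core update: move Q(s, a) toward r + gamma * max_a' Q(s', a')
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        s = s2
        if s == GOAL:
            break

policy = [max(ACTIONS, key=lambda act: Q[s][act]) for s in range(GOAL)]
print(policy)                            # greedy action per non-goal state
```

After training, the greedy policy moves right in every state, and the Q-values decay geometrically with distance from the goal, reflecting the discounted return \(\gamma^t r_t\) in the objective above.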

3.3 Evolutionary Algorithms

Simulates the process of natural selection:

Initial population → Evaluate fitness → Selection → Crossover → Mutation → New population
              ↑                                                            │
              └────────────────────────────────────────────────────────────┘

Variants: Genetic Algorithms (GA), Genetic Programming (GP), Evolution Strategies (ES), Neuroevolution (NEAT).
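The selection-crossover-mutation loop above can be sketched as a minimal genetic algorithm on OneMax (maximize the number of 1-bits), a standard toy benchmark; population size, mutation rate, and the truncation-selection scheme are illustrative choices.

```python
import random

# Minimal GA: rank by fitness, keep the top half (elitist truncation
# selection), fill the rest with one-point crossover plus bit-flip mutation.
random.seed(0)
L, POP, GENS = 20, 30, 60

def fitness(ind):
    return sum(ind)                          # OneMax: count the 1-bits

pop = [[random.randint(0, 1) for _ in range(L)] for _ in range(POP)]
for _ in range(GENS):
    pop.sort(key=fitness, reverse=True)      # evaluate fitness + rank
    parents = pop[:POP // 2]                 # selection
    children = []
    while len(children) < POP - len(parents):
        p1, p2 = random.sample(parents, 2)
        cut = random.randrange(1, L)         # one-point crossover
        child = p1[:cut] + p2[cut:]
        for i in range(L):                   # bit-flip mutation, rate 1/L
            if random.random() < 1.0 / L:
                child[i] ^= 1
        children.append(child)
    pop = parents + children                 # new population (elitist)

best = max(pop, key=fitness)
print(fitness(best))
```

Genetic Programming evolves trees instead of bitstrings, and Evolution Strategies mutate real-valued vectors, but the loop is the same.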

3.4 Swarm Intelligence

| Method | Biological Inspiration | Application |
| --- | --- | --- |
| Ant Colony Optimization | Ant foraging | Path optimization |
| Particle Swarm Optimization | Bird flocking | Continuous optimization |
| Artificial Bee Colony | Bee foraging | Multi-objective optimization |
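Particle Swarm Optimization from the table can be sketched on a toy objective. The objective \(f(x, y) = x^2 + y^2\) and the coefficient values are illustrative (common textbook settings), not tied to any particular library.

```python
import random

# PSO sketch: each particle's velocity blends inertia, attraction to its own
# best position (cognitive term), and attraction to the swarm best (social).
random.seed(0)

def f(p):
    return p[0] ** 2 + p[1] ** 2             # toy objective, minimum at origin

SWARM, ITERS = 15, 80
w, c1, c2 = 0.7, 1.5, 1.5                    # inertia, cognitive, social
pos = [[random.uniform(-5, 5) for _ in range(2)] for _ in range(SWARM)]
vel = [[0.0, 0.0] for _ in range(SWARM)]
pbest = [p[:] for p in pos]                  # per-particle best position
gbest = min(pbest, key=f)[:]                 # swarm-wide best position

for _ in range(ITERS):
    for i in range(SWARM):
        for d in range(2):
            r1, r2 = random.random(), random.random()
            vel[i][d] = (w * vel[i][d]
                         + c1 * r1 * (pbest[i][d] - pos[i][d])
                         + c2 * r2 * (gbest[d] - pos[i][d]))
            pos[i][d] += vel[i][d]
        if f(pos[i]) < f(pbest[i]):
            pbest[i] = pos[i][:]
            if f(pos[i]) < f(gbest):
                gbest = pos[i][:]

print(gbest, f(gbest))                       # converges near the origin
```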

3.5 Limitations

  • Low sample efficiency: requires extensive interaction experience
  • Reward design is difficult: improper rewards lead to unexpected behavior
  • Exploration-exploitation dilemma: balancing exploring new strategies with exploiting known good ones
  • Safety: training may produce dangerous behaviors

4. Paradigm Comparison

| Dimension | Symbolism | Connectionism | Behaviorism |
| --- | --- | --- | --- |
| Knowledge source | Expert-encoded | Learned from data | Environmental interaction |
| Knowledge representation | Explicit symbols | Distributed weights | Implicit policies |
| Reasoning method | Logical reasoning | Pattern matching | Trial-and-error search |
| Interpretability | High | Low | Medium |
| Perceptual ability | Weak | Strong | Medium |
| Planning ability | Strong | Weak | Medium |
| Adaptability | Low | Medium | High |
| Representative system | Expert systems | GPT-4 | AlphaGo |

5. Modern Hybrid Approaches

5.1 Neuro-Symbolic AI

Combining connectionism's perception/learning capabilities with symbolism's reasoning/explanation capabilities:

Perception (neural network)
    → Symbol extraction (concepts, relations)
    → Symbolic reasoning (logic, planning)
    → Decision/generation

Representative work:

  • DeepProbLog: neural networks + probabilistic logic programming
  • Graph Neural Networks + Knowledge Graphs
  • LLM + external tools/knowledge bases (RAG)
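The division of labor can be sketched in the spirit of DeepProbLog's classic MNIST-addition example: a neural stage maps raw inputs to symbols, and a symbolic stage reasons over them. The "recognizer" below is a hard-coded stand-in for a trained classifier, and the image names are invented; only the pipeline shape is the point.

```python
# Neuro-symbolic pipeline sketch: perception extracts symbols, then a
# symbolic rule computes with them.

def recognize(image):
    """Stand-in for a neural digit classifier (image -> digit symbol)."""
    lookup = {"img_three": 3, "img_five": 5}   # hypothetical inputs
    return lookup[image]

def addition(img_a, img_b):
    """Symbolic layer: exact arithmetic over the extracted symbols."""
    return recognize(img_a) + recognize(img_b)

print(addition("img_three", "img_five"))
```

In DeepProbLog, the recognizer is a real network trained end to end through the logic program, so supervision on the sum alone teaches the network to classify digits.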

5.2 LLMs as Reasoning Engines

Large language models to some extent fuse all three paradigms:

  • Connectionism: Transformer architecture, learning from data
  • Symbolism: Chain-of-Thought reasoning, code generation
  • Behaviorism: RLHF optimizes behavior through feedback

5.3 World Models

Combining perception, prediction, and planning:

\[ \text{Perception}(o_t) \xrightarrow{\text{Encoding}} z_t \xrightarrow{\text{World Model}} \hat{z}_{t+1} \xrightarrow{\text{Planning}} a_t \]

Representatives: Dreamer (RL), JEPA (LeCun), Sora (video generation).


6. Paradigm Selection Guide

| Scenario | Recommended Paradigm |
| --- | --- |
| Clear rules, need for explainability | Symbolism |
| Abundant data, perceptual tasks | Connectionism |
| Interactive environment, sequential decision-making | Behaviorism |
| Perception + reasoning | Neuro-symbolic hybrid |
| Complex open-ended problems | Multi-paradigm fusion |

References

  • "Artificial Intelligence: A Modern Approach" - Russell & Norvig
  • "The Society of Mind" - Marvin Minsky
  • "Intelligence without Representation" - Rodney Brooks (1991)
  • "Neuro-Symbolic AI: The 3rd Wave" - Garcez & Lamb (2020)
