Classical Reinforcement Learning
Classical reinforcement learning studies how an agent learns an optimal policy through interaction with its environment; it forms the theoretical foundation of deep reinforcement learning.
Contents:
- Introduction to Classical RL — MDP framework, value functions, policies
- Multi-armed Bandits — Exploration vs. exploitation, UCB, Thompson sampling
- Finite MDP — Bellman equations, optimal policies
- Dynamic Programming — Policy iteration, value iteration
- Monte Carlo Methods — MC prediction, MC control, importance sampling
- TD(0) — Temporal difference learning, SARSA, Q-learning
- N-step TD — Multi-step bootstrapping, bias-variance tradeoff
- Learning & Planning — Dyna architecture, model learning
- Approximation Methods — Function approximation, linear methods
- TD(λ) — Eligibility traces, forward and backward views
- Policy Gradient — REINFORCE, baseline functions, Actor-Critic
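As a taste of what these chapters build toward, here is a minimal sketch of tabular Q-learning (covered under TD(0)) on a hypothetical 5-state chain MDP invented for illustration: the agent starts at state 0, moves left or right, and receives reward 1 for reaching state 4. The state/action layout, learning rate, and episode count are assumptions, not from any specific chapter.

```python
import random

# Hypothetical 5-state chain MDP: states 0..4, actions 0 (left) and 1 (right).
# Reaching state 4 yields reward 1 and ends the episode; all other steps give 0.
N_STATES, GOAL = 5, 4
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1  # assumed hyperparameters

def step(state, action):
    """Environment dynamics: move one position left or right along the chain."""
    next_state = max(0, state - 1) if action == 0 else min(GOAL, state + 1)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward, next_state == GOAL

random.seed(0)
Q = [[0.0, 0.0] for _ in range(N_STATES)]  # Q[state][action] value table

for episode in range(500):
    state, done = 0, False
    while not done:
        # Epsilon-greedy action selection: explore vs. exploit.
        if random.random() < EPSILON:
            action = random.randrange(2)
        else:
            action = 0 if Q[state][0] > Q[state][1] else 1
        next_state, reward, done = step(state, action)
        # Q-learning update: bootstrap from the greedy value of the next state.
        target = reward + GAMMA * max(Q[next_state])
        Q[state][action] += ALPHA * (target - Q[state][action])
        state = next_state

# The greedy policy should move right (action 1) in every non-terminal state.
policy = [0 if Q[s][0] > Q[s][1] else 1 for s in range(GOAL)]
print(policy)
```

The same scaffold extends naturally to the other tabular methods in the list: replacing the `max` in the update target with the value of the action actually taken gives SARSA, and averaging full-episode returns instead of bootstrapping gives Monte Carlo control.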