Optimization & Regularization
Optimization and regularization are central to training deep neural networks: the optimizer largely determines how fast training converges, while regularization governs how well the model generalizes.
Contents:
- Optimization Theory — Convex optimization, gradient descent, saddle point problems
- Optimizers — SGD, Adam, AdamW, LAMB
- Initialization — Xavier and He initialization, orthogonal initialization
- LR Scheduling — Learning rate warmup, cosine annealing, cyclical scheduling (a combined AdamW + warmup/cosine sketch follows this list)
- Normalization — BatchNorm, LayerNorm, GroupNorm
- Regularization — Dropout, weight decay, data augmentation
- Optimization Experiments — Empirical comparison of different optimization strategies
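
As a minimal sketch of how several of these pieces fit together, the snippet below combines AdamW (decoupled weight decay) with linear warmup followed by cosine annealing, assuming a PyTorch setup. The model, schedule lengths, and hyperparameters are illustrative placeholders, not values from the experiments in this section.

```python
import math
import torch
import torch.nn as nn

# Hypothetical toy model, for illustration only.
model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 10))

# AdamW applies decoupled weight decay: the decay is applied directly to the
# weights at update time, rather than added to the gradient as an L2 penalty.
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4, weight_decay=0.01)

warmup_steps, total_steps = 500, 10_000  # assumed schedule lengths

def lr_lambda(step: int) -> float:
    """Linear warmup to the base LR, then cosine decay toward zero."""
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * (1.0 + math.cos(math.pi * progress))

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)

# One training step on random data, showing the per-step order:
# forward -> backward -> optimizer.step() -> scheduler.step().
x, y = torch.randn(32, 128), torch.randint(0, 10, (32,))
loss = nn.functional.cross_entropy(model(x), y)
optimizer.zero_grad()
loss.backward()
optimizer.step()
scheduler.step()
```

Note that `scheduler.step()` is called once per batch here, the convention for step-based warmup schedules; epoch-based scheduling would call it once per epoch instead.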