Policy-based Reinforcement Learning Methods on Walker2d

Reinforcement Learning · PPO · Actor-Critic

V1: PPO demo with Stable-Baselines3

This is an ongoing project. In V1, a PPO agent is trained with Stable-Baselines3 on the Walker2d MuJoCo environment. From V2 onward, the focus will shift to strategies that are fully self-designed and self-implemented.
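For readers unfamiliar with the library, a V1-style training run might look roughly like the sketch below. It is not the project's actual script: the environment ID `Walker2d-v4`, the timestep budget, the save path, the hyperparameter values, and the helper names `default_config` and `train` are all assumptions chosen for illustration. Only `gymnasium.make`, `PPO("MlpPolicy", ...)`, `learn`, and `save` are standard Gymnasium and Stable-Baselines3 calls.

```python
"""Sketch: train a PPO agent on Walker2d with Stable-Baselines3."""


def default_config():
    """Return illustrative settings (assumed, not the project's real values)."""
    return {
        "env_id": "Walker2d-v4",       # assumed ID; the source names no version
        "total_timesteps": 1_000_000,  # assumed training budget
        "save_path": "ppo_walker2d",   # hypothetical output file name
        "ppo_kwargs": {
            "learning_rate": 3e-4,  # optimizer step size
            "n_steps": 2048,        # rollout length per policy update
            "batch_size": 64,       # minibatch size for each gradient step
            "gamma": 0.99,          # discount factor
            "gae_lambda": 0.95,     # GAE bias/variance trade-off
            "clip_range": 0.2,      # PPO clipping parameter
        },
    }


def train(config):
    """Train PPO with an MLP actor-critic policy and save the model."""
    # Imported here so the config can be inspected without the RL stack installed.
    import gymnasium as gym
    from stable_baselines3 import PPO

    env = gym.make(config["env_id"])
    model = PPO("MlpPolicy", env, verbose=1, **config["ppo_kwargs"])
    model.learn(total_timesteps=config["total_timesteps"])
    model.save(config["save_path"])
    env.close()
    return model
```

Calling `train(default_config())` starts a run; it needs `stable-baselines3` and Gymnasium's MuJoCo extras installed. Keeping the settings in a plain dictionary is only one possible design, picked here so the values are easy to swap when comparing against later, self-implemented versions.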

GitHub