CartPole#
CartPole is a classic control task in reinforcement learning. The goal is to keep the pole balanced by controlling the cart’s left-right movement.

Task Description#
State Space: Cart position, cart velocity, pole angle, pole angular velocity
Action Space: Apply force left or right
Reward Function: +1 reward for each step the pole stays upright
Termination Conditions: Pole angle exceeds ±15 degrees or episode length exceeds 10 seconds
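The reward and termination rules above can be sketched in a few lines of Python. This is only an illustration, not the project's actual environment code: the `CartPoleState` class, the function names, and the 0.02 s control step (`CONTROL_DT`) are assumptions made for the example. Only the ±15 degree limit, the 10 second cap, and the +1 per-step reward come from the task description.

```python
import math
from dataclasses import dataclass

ANGLE_LIMIT_RAD = math.radians(15.0)  # from the task description
MAX_EPISODE_SECONDS = 10.0            # from the task description
CONTROL_DT = 0.02                     # ASSUMED control step in seconds (hypothetical)
MAX_STEPS = round(MAX_EPISODE_SECONDS / CONTROL_DT)


@dataclass
class CartPoleState:
    """Hypothetical container for the four state variables."""
    cart_position: float          # metres
    cart_velocity: float          # metres per second
    pole_angle: float             # radians, 0 means upright
    pole_angular_velocity: float  # radians per second


def step_reward(state: CartPoleState) -> float:
    """Return +1 while the pole is within the angle limit, else 0."""
    return 1.0 if abs(state.pole_angle) <= ANGLE_LIMIT_RAD else 0.0


def is_done(state: CartPoleState, steps_completed: int) -> bool:
    """End the episode when the pole falls or the time budget is used up."""
    fell = abs(state.pole_angle) > ANGLE_LIMIT_RAD
    timed_out = steps_completed >= MAX_STEPS
    return fell or timed_out
```

Under these assumptions the best possible episode return equals `MAX_STEPS`, since the agent earns +1 on every step until the time limit ends the episode.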
Quick Start#
1. Environment Preview#
uv run scripts/view.py --env cartpole
2. Start Training#
uv run scripts/train.py --env cartpole
3. View Training Progress#
uv run tensorboard --logdir runs/cartpole
4. Test Training Results#
uv run scripts/play.py --env cartpole
Tip: The system automatically finds the latest and best policy files in the
runs/cartpole/ directory for testing. You can also specify a particular policy file with the --policy parameter.
Expected Results#
Pole angle stays within ±5 degrees most of the time
Cart displacement range is reasonable
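One way to make the "±5 degrees most of the time" criterion measurable is to compute the fraction of recorded steps in which the pole angle stays inside that band. The sketch below assumes you have logged the pole angle in radians at every step of a test episode. The function name `fraction_within` is hypothetical and not part of the project's scripts.

```python
import math
from typing import Sequence


def fraction_within(angles_rad: Sequence[float], limit_deg: float = 5.0) -> float:
    """Return the share of samples whose absolute angle is within limit_deg."""
    if not angles_rad:
        return 0.0  # no data recorded, nothing to report
    limit_rad = math.radians(limit_deg)
    inside = sum(1 for angle in angles_rad if abs(angle) <= limit_rad)
    return inside / len(angles_rad)


# Example with four hand-written samples: three lie inside the band.
samples = [0.0, math.radians(2.0), math.radians(-4.0), math.radians(7.0)]
print(fraction_within(samples))  # 0.75
```

A value close to 1.0 over a full episode would match the expected result stated above.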
Troubleshooting#
If training performance is poor, you can try:
Adjust learning rate (try 1e-4 to 1e-3)
Increase number of environments (more parallel training)
Adjust reward function weights
Check if physical parameters are reasonable
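When trying learning rates between 1e-4 and 1e-3, spacing the candidates evenly on a log scale usually covers the range better than linear steps. The helper below is a minimal sketch that only generates the candidate values. The name `log_spaced` is hypothetical, and how a learning rate is passed to the training script depends on the project's configuration, which is not shown here.

```python
from typing import List


def log_spaced(low: float, high: float, count: int) -> List[float]:
    """Return `count` values from low to high, evenly spaced on a log scale."""
    if count < 2:
        raise ValueError("count must be at least 2")
    if low <= 0 or high <= 0:
        raise ValueError("low and high must be positive")
    ratio = (high / low) ** (1.0 / (count - 1))
    return [low * ratio ** i for i in range(count)]


# Four candidates covering the suggested range.
candidates = log_spaced(1e-4, 1e-3, 4)
for lr in candidates:
    print(f"{lr:.2e}")
```

With four candidates the values are roughly 1e-4, 2.2e-4, 4.6e-4, and 1e-3. Running one training job per value and comparing the curves in TensorBoard is a simple way to pick among them.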