- removed vf clipping in pposgd - that was severely degrading performance on mujoco because it didn’t account for scale of returns - switched adam epsilon in pposgd_simple - brought back no-ops in atari wrapper (oops) - added readmes - revamped run_X_benchmark scripts to have standard form - cleaned up DDPG a little, removed deprecated SimpleMonitor and non-idiomatic usage of logger
265 B
265 B
A2C
- Original paper: https://arxiv.org/abs/1602.01783
- Baselines blog post: https://blog.openai.com/baselines-acktr-a2c/
python -m baselines.a2c.run_atari
runs the algorithm for 40M frames = 10M timesteps on an Atari game. See help (-h
) for more options.