# ACKTR

- Original paper: https://arxiv.org/abs/1708.05144
- Baselines blog post: https://blog.openai.com/baselines-acktr-a2c/
- `python -m baselines.run --alg=acktr --env=PongNoFrameskip-v4` runs the algorithm for 40M frames (10M timesteps) on Atari Pong. See help (`-h`) for more options.
- Also refer to the repo-wide [README.md](../../README.md#training-models).
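
The "40M frames = 10M timesteps" figure above implies that one agent timestep covers four emulator frames. The sketch below spells out that arithmetic. The constant and function names are hypothetical helpers for illustration and are not part of baselines; the ratio of 4 is an assumption read off the numbers above.

```python
# Assumption: one agent timestep spans 4 emulator frames, the ratio implied
# by "40M frames = 10M timesteps". These helpers are illustrative only.
FRAMES_PER_TIMESTEP = 4


def timesteps_to_frames(timesteps: int, frames_per_timestep: int = FRAMES_PER_TIMESTEP) -> int:
    """Convert a number of agent timesteps into emulator frames."""
    return timesteps * frames_per_timestep


def frames_to_timesteps(frames: int, frames_per_timestep: int = FRAMES_PER_TIMESTEP) -> int:
    """Convert emulator frames into agent timesteps (frames must divide evenly)."""
    if frames % frames_per_timestep != 0:
        raise ValueError("frame count is not a multiple of frames_per_timestep")
    return frames // frames_per_timestep


if __name__ == "__main__":
    print(timesteps_to_frames(10_000_000))
    print(frames_to_timesteps(40_000_000))
```

This is useful when comparing results reported in frames against training budgets specified in timesteps.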

## ACKTR with continuous action spaces
The ACKTR code has been refactored to handle both discrete and continuous action spaces uniformly. In the original version, discrete and continuous action spaces were handled by separate code (`acktr_disc.py` and `acktr_cont.py`) with little overlap. If you are interested in the original version of ACKTR for continuous action spaces, use the `old_acktr_cont` branch. Note that the original code performs better on the MuJoCo tasks than the refactored version; we are still investigating why.
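
Because the refactored code uses one entry point for both kinds of action space, the same command shape should work for a continuous-control task by changing only `--env`. The sketch below builds such a command without running it. The helper `acktr_command` is hypothetical, and `HalfCheetah-v2` is an assumed example of a MuJoCo environment id that requires a working MuJoCo installation; only the `--alg` and `--env` flags are taken from this README.

```python
from typing import List


def acktr_command(env_id: str) -> List[str]:
    """Build the argv for launching ACKTR on the given environment.

    Hypothetical helper: it only assembles the command described in this
    README and does not import or invoke baselines itself.
    """
    return ["python", "-m", "baselines.run", "--alg=acktr", f"--env={env_id}"]


if __name__ == "__main__":
    # Discrete action space (Atari), as in the command above.
    print(" ".join(acktr_command("PongNoFrameskip-v4")))
    # Continuous action space; the environment id is an assumed example.
    print(" ".join(acktr_command("HalfCheetah-v2")))
```

To launch a run, pass the returned list to `subprocess.run(cmd, check=True)` in an environment where baselines is installed.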