* Added required arguments to the policy builder in the ACER model to fix issue #783
* Changed the step model from nbatch to nenvs
* Updated nsteps to be 1
ACER
- Original paper: https://arxiv.org/abs/1611.01224
- `python -m baselines.run --alg=acer --env=PongNoFrameskip-v4` runs the algorithm for 40M frames = 10M timesteps on Atari Pong. See help (`-h`) for more options.
- Also refer to the repo-wide README.md.
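
The "40M frames = 10M timesteps" figure can be checked with a small sketch. It assumes the Atari wrappers repeat each action for 4 emulator frames (a frame skip of 4); that value is not stated in this README, and the helper names `frames_to_timesteps` and `timesteps_to_frames` are hypothetical, not part of baselines.

```python
# Assumption: each agent action is repeated for 4 emulator frames
# (frame skip of 4). This value is not stated in the README itself.
FRAME_SKIP = 4


def frames_to_timesteps(frames: int, frame_skip: int = FRAME_SKIP) -> int:
    """Convert emulator frames to agent timesteps (one action per timestep)."""
    if frame_skip <= 0:
        raise ValueError("frame_skip must be positive")
    return frames // frame_skip


def timesteps_to_frames(timesteps: int, frame_skip: int = FRAME_SKIP) -> int:
    """Convert agent timesteps back to emulator frames."""
    if frame_skip <= 0:
        raise ValueError("frame_skip must be positive")
    return timesteps * frame_skip


if __name__ == "__main__":
    # Default Pong run quoted above: 40M frames.
    print(frames_to_timesteps(40_000_000))
```

Under that assumption, a frame budget divides by the frame skip to give the number of timesteps the agent actually acts on.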