# A2C
- Original paper: https://arxiv.org/abs/1602.01783
- Baselines blog post: https://blog.openai.com/baselines-acktr-a2c/
```bash
python -m baselines.run --alg=a2c --env=PongNoFrameskip-v4
```
runs the algorithm for 40M frames (10M timesteps) on Atari Pong. See help (`-h`) for more options; also refer to the repo-wide README.md.
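As a sketch of a more customized run, the command below assumes the `--num_timesteps` and `--save_path` options described in the repo-wide README apply here as well (the exact flag set may differ by version):

```bash
# Train A2C on Pong for 20M timesteps and save the resulting model
# (flags assumed from the repo-wide README; check `-h` for your version).
python -m baselines.run --alg=a2c --env=PongNoFrameskip-v4 \
    --num_timesteps=2e7 --save_path=~/models/pong_20M_a2c
```

The saved model can then be reloaded with `--load_path` for further training or evaluation.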