Commit Graph

8 Commits

Author SHA1 Message Date
Roman Ring
339d5640b9 add docs for layer_norm param in DQN baseline (#107) 2018-11-14 12:22:42 -08:00
Buck Shlegeris
a75bc37a40 fix typo in a comment (#161) 2018-11-14 12:20:55 -08:00
Xiaoquan Kong
0a13da8dfe Change variable name from inpt to input_ (#297) 2018-11-13 11:08:21 -08:00
pzhokhov
858afa8d7e Refactor DDPG (#111)
* run ddpg on Mujoco benchmark RUN BENCHMARKS

* autopep8

* fixed all syntax in refactored ddpg

* a little bit more refactoring

* autopep8

* identity test with ddpg WIP

* enable test_identity with ddpg

* refactored ddpg RUN BENCHMARKS

* autopep8

* include ddpg into style check

* fixing tests RUN BENCHMARKS

* set default seed to None RUN BENCHMARKS

* run tests and benchmarks in separate buildkite steps RUN BENCHMARKS

* cleanup pdb usage

* flake8 and cleanups

* re-enabled all benchmarks in run-benchmarks-new.py

* flake8 complaints

* deepq model builder compatible with network functions returning single tensor

* remove ddpg test with test_discrete_identity

* make ppo_metal use make_vec_env instead of make_atari_env

* make ppo_metal use make_vec_env instead of make_atari_env

* fixed syntax in ppo_metal.run_atari
2018-10-03 14:38:32 -07:00
pzhokhov
9070ee7ef3 tighten flake8, autopep8 to fix trailing whitespaces and blank lines with whitespaces (#87) 2018-09-11 13:18:43 -07:00
pzhokhov
8c2aea2add refactor a2c, acer, acktr, ppo2, deepq, and trpo_mpi (#490)
* exported rl-algs

* more stuff from rl-algs

* run slow tests

* re-exported rl_algs

* re-exported rl_algs - fixed problems with serialization test and test_cartpole

* replaced atari_arg_parser with common_arg_parser

* run.py can run algos from both baselines and rl_algs

* added approximate humanoid reward with ppo2 into the README for reference

* dummy commit to RUN BENCHMARKS

* dummy commit to RUN BENCHMARKS

* dummy commit to RUN BENCHMARKS

* dummy commit to RUN BENCHMARKS

* very dummy commit to RUN BENCHMARKS

* serialize variables as a dict, not as a list

* running_mean_std uses tensorflow variables

* fixed import in vec_normalize

* dummy commit to RUN BENCHMARKS

* dummy commit to RUN BENCHMARKS

* flake8 complaints

* save all variables to make sure we save the vec_normalize normalization

* benchmarks on ppo2 only RUN BENCHMARKS

* make_atari_env compatible with mpi

* run ppo_mpi benchmarks only RUN BENCHMARKS

* hardcode names of retro environments

* add defaults

* changed default ppo2 lr schedule to linear RUN BENCHMARKS

* non-tf normalization benchmark RUN BENCHMARKS

* use ncpu=1 for mujoco sessions - gives a bit of a performance speedup

* reverted running_mean_std to user property decorators for mean, var, count

* reverted VecNormalize to use RunningMeanStd (no tf)

* reverted VecNormalize to use RunningMeanStd (no tf)

* profiling wip

* use VecNormalize with regular RunningMeanStd

* added acer runner (missing import)

* flake8 complaints

* added a note in README about TfRunningMeanStd and serialization of VecNormalize

* dummy commit to RUN BENCHMARKS

* merged benchmarks branch
2018-08-13 09:56:44 -07:00
Matthias Plappert
882251878f Parameter space noise for DQN and DDPG (#75)
* Export param noise

* Update documentation

* Final finishing touches
2017-07-27 08:10:59 -07:00
Szymon Sidor
958810ed1e Initial commit 2017-05-24 02:34:20 -07:00