* run ddpg on MuJoCo benchmark RUN BENCHMARKS
* autopep8
* fixed all syntax in refactored ddpg
* a little bit more refactoring
* autopep8
* identity test with ddpg WIP
* enable test_identity with ddpg
* refactored ddpg RUN BENCHMARKS
* autopep8
* include ddpg into style check
* fixing tests RUN BENCHMARKS
* set default seed to None RUN BENCHMARKS
* run tests and benchmarks in separate buildkite steps RUN BENCHMARKS
* cleanup pdb usage
* flake8 and cleanups
* re-enabled all benchmarks in run-benchmarks-new.py
* flake8 complaints
* deepq model builder compatible with network functions returning single tensor
* remove ddpg test with test_discrete_identity
* make ppo_metal use make_vec_env instead of make_atari_env
* make ppo_metal use make_vec_env instead of make_atari_env
* fixed syntax in ppo_metal.run_atari
* exported rl-algs
* more stuff from rl-algs
* run slow tests
* re-exported rl_algs
* re-exported rl_algs - fixed problems with serialization test and test_cartpole
* replaced atari_arg_parser with common_arg_parser
* run.py can run algos from both baselines and rl_algs
* added approximate humanoid reward with ppo2 into the README for reference
* dummy commit to RUN BENCHMARKS
* dummy commit to RUN BENCHMARKS
* dummy commit to RUN BENCHMARKS
* dummy commit to RUN BENCHMARKS
* very dummy commit to RUN BENCHMARKS
* serialize variables as a dict, not as a list
* running_mean_std uses tensorflow variables
* fixed import in vec_normalize
* dummy commit to RUN BENCHMARKS
* dummy commit to RUN BENCHMARKS
* flake8 complaints
* save all variables to make sure we save the vec_normalize normalization
* benchmarks on ppo2 only RUN BENCHMARKS
* make_atari_env compatible with mpi
* run ppo_mpi benchmarks only RUN BENCHMARKS
* hardcode names of retro environments
* add defaults
* changed default ppo2 lr schedule to linear RUN BENCHMARKS
* non-tf normalization benchmark RUN BENCHMARKS
* use ncpu=1 for mujoco sessions - gives a slight speedup
* reverted running_mean_std to use property decorators for mean, var, count
* reverted VecNormalize to use RunningMeanStd (no tf)
* reverted VecNormalize to use RunningMeanStd (no tf)
* profiling wip
* use VecNormalize with regular RunningMeanStd
* added acer runner (missing import)
* flake8 complaints
* added a note in README about TfRunningMeanStd and serialization of VecNormalize
* dummy commit to RUN BENCHMARKS
* merged benchmarks branch