* add monitor to the rollout envs in her RUN BENCHMARKS her
* Slice -> Slide in her benchmarks RUN BENCHMARKS her
* run her benchmark for 200 epochs
* dummy commit to RUN BENCHMARKS her
* her benchmark for 500 epochs RUN BENCHMARKS her
* add num_timesteps to her benchmark to be compatible with viewer RUN BENCHMARKS her
* disable saving of policies in her benchmark RUN BENCHMARKS her
* run fetch benchmarks with ppo2 and ddpg RUN BENCHMARKS Fetch
* launcher refactor wip
* wip
* her works on FetchReach
* her runner refactor RUN BENCHMARKS Fetch1M
* unit test for her
* fixing warnings in mpi_average in her, skip test_fetchreach if mujoco is not present
* pickle-based serialization in her
* remove extra import from subproc_vec_env.py
* investigating differences in rollout.py
* try with old rollout code RUN BENCHMARKS her
* temporarily use DummyVecEnv in cmd_util.py RUN BENCHMARKS her
* dummy commit to RUN BENCHMARKS her
* set info_values in rollout worker in her RUN BENCHMARKS her
* bug in rollout_new.py RUN BENCHMARKS her
* fixed bug in rollout_new.py RUN BENCHMARKS her
* do not use the last step, because vecenv calls reset and returns the obs from after the reset RUN BENCHMARKS her
* updated buffer sizes RUN BENCHMARKS her
* fixed loading/saving via joblib
* dust off learning from demonstrations in HER, docs, refactor
* add deprecation notice on her play and plot files
* address comments by Matthias
* run ddpg on Mujoco benchmark RUN BENCHMARKS
* autopep8
* fixed all syntax errors in refactored ddpg
* a little bit more refactoring
* autopep8
* identity test with ddpg WIP
* enable test_identity with ddpg
* refactored ddpg RUN BENCHMARKS
* autopep8
* include ddpg into style check
* fixing tests RUN BENCHMARKS
* set default seed to None RUN BENCHMARKS
* run tests and benchmarks in separate buildkite steps RUN BENCHMARKS
* cleanup pdb usage
* flake8 and cleanups
* re-enabled all benchmarks in run-benchmarks-new.py
* flake8 complaints
* deepq model builder compatible with network functions returning single tensor
* remove ddpg test with test_discrete_identity
* make ppo_metal use make_vec_env instead of make_atari_env
* fixed syntax in ppo_metal.run_atari
* refactor acktr
* setup.cfg now tests style/syntax in acktr as well
* flake8 complaints
* added a note about continuous action spaces for acktr to README.md