* viz docs
* writing visualization docs
* documenting plot_util
* docstrings in plot_util
* autopep8 and flake8
* spelling (using default vim spellchecker and ignoring words like dataframe, docstring, etc.)
* rephrased viz.md a little bit
* more examples of viz code usage in the docs
* make baselines run without MPI (WIP)
* squash-merged latest master
* further removing MPI references where unnecessary
* more MPI removal
* syntax and flake8
* MpiAdam becomes regular Adam if MPI is not present
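
  A minimal sketch of that fallback, assuming mpi4py is an optional import; the class and method names here are illustrative and not the exact MpiAdam API in baselines:

  ```python
  import numpy as np

  try:
      from mpi4py import MPI  # optional dependency
  except ImportError:
      MPI = None


  class MpiAdamSketch:
      """Adam on a flat parameter vector; averages gradients over MPI workers
      when mpi4py is available, otherwise behaves like plain single-process Adam."""

      def __init__(self, size, stepsize=1e-3, beta1=0.9, beta2=0.999, epsilon=1e-8):
          self.m = np.zeros(size, dtype=np.float64)
          self.v = np.zeros(size, dtype=np.float64)
          self.t = 0
          self.stepsize, self.beta1, self.beta2, self.epsilon = stepsize, beta1, beta2, epsilon
          self.comm = MPI.COMM_WORLD if MPI is not None else None

      def step(self, localgrad):
          if self.comm is not None:
              # average the gradient across all MPI workers
              globalgrad = np.zeros_like(localgrad)
              self.comm.Allreduce(localgrad, globalgrad, op=MPI.SUM)
              grad = globalgrad / self.comm.Get_size()
          else:
              # no MPI: fall back to ordinary single-process Adam
              grad = localgrad
          self.t += 1
          self.m = self.beta1 * self.m + (1 - self.beta1) * grad
          self.v = self.beta2 * self.v + (1 - self.beta2) * grad ** 2
          mhat = self.m / (1 - self.beta1 ** self.t)
          vhat = self.v / (1 - self.beta2 ** self.t)
          return -self.stepsize * mhat / (np.sqrt(vhat) + self.epsilon)
  ```
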
* autopep8
* add assertion to test in mpi_adam; fix trpo_mpi failure without MPI on cartpole
* mpiless ddpg
* sync internal changes. Make ddpg work with vecenvs
* B -> nenvs for consistency with other algos, small cleanups
* eval_done[d]==True -> eval_done[d]
* flake8 and numpy.random.random_integers deprecation warning
* Merge branch 'master' of github.com:openai/games into peterz_track_baselines_branch
* add some docstrings
* start making big changes
* state machine redesign
* sampling seems to work
* some reorg
* fixed sampling of real vals
* json conversion
* made it possible to register new commands
* got nontrivial version of Pred working
* consolidate command definitions
* add more macro blocks
* revived visualization
* rename Userdata -> CmdInterpreter
* make AlgoSmInstance subclass of SmInstance that uses appropriate userdata argument
* replace userdata by ci when appropriate
* minor test fixes
* revamped handmade dir, can run ppo_metal
* seed to avoid random test failure
* implement AlgoAgent
* Autogenerated object that performs all ops and macros
* more CmdRecorder changes
* move files around
* move MatchProb and JtftProb
* remove obsolete
* fix tests involving AlgoAgent (pending the next commit on ppo_metal code)
* ppo_metal: reduce duplication in policy_gen, make sess an attribute of PpoAgent and StochasticPolicy instead of using get_default_session everywhere.
* maze_env reformatting, move algo_search script (but still broken)
* move agent.py
* fix test on handcrafted agents
* tuning/fixing ppo_metal baseline
* minor
* Fix ppo_metal baseline
* Don’t set epcount, tcount unless they’re being used
* get rid of old ppo_metal baseline
* fixes for handmade/run.py tuning
* fix codegen ppo
* fix handmade ppo hps
* fix test, go back to safe_div
* switch to more complex filtering
* make sure all handcrafted algos have finite probability
* train to maximize logprob of provided samples
* Trex changes to avoid segfault
* AlgoSm also includes global hyperparams
* don’t duplicate global hyperparam defaults
* create generic_ob_ac_space function
* use sorted list of outkeys
* revive tsne
* todo changes
* determinism test
* todo + test fix
* remove a few deprecated files, rename other tests so they don’t run automatically, fix real test failure
* continuous control with codegen
* continuous control with codegen
* implement continuous action space algodistr
* ppo with trex RUN BENCHMARKS
* wrap trex in a monitor
* dummy commit to RUN BENCHMARKS
* adding monitor to trex env RUN BENCHMARKS
* adding monitor to trex RUN BENCHMARKS
* include monitor into trex env RUN BENCHMARKS
* generate nll and predmean using Distribution node
* dummy commit to RUN BENCHMARKS
* include pybullet into baselines optional dependencies
* dummy commit to RUN BENCHMARKS
* install games for cron rcall user RUN BENCHMARKS
* add --yes flag to install.py in rcall config for cron user RUN BENCHMARKS
* both continuous and discrete versions seem to run
* fixes to monitor to work with vecenv-like info and rewards RUN BENCHMARKS
* dummy commit to RUN BENCHMARKS
* removed shape check from one-hot encoding logic in distributions.CategoricalPd
* reset logger configuration in codegen/handmade/run.py to be in line with baselines RUN BENCHMARKS
* merged peterz_codegen_benchmarks RUN BENCHMARKS
* skip tests RUN BENCHMARKS
* working on test failures
* save benchmark dicts RUN BENCHMARK
* merged peterz_codegen_benchmark RUN BENCHMARKS
* add get_git_commit_message to the baselines.common.console_util
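
  A sketch of what such a helper might look like (the actual signature in console_util may differ):

  ```python
  import subprocess


  def get_git_commit_message():
      # message of the most recent commit in the current working directory's repo
      return subprocess.check_output(
          ["git", "log", "-1", "--format=%B"]).decode("utf-8").strip()
  ```
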
* dummy commit to RUN BENCHMARKS
* merged fixes from peterz_codegen_benchmark RUN BENCHMARKS
* fixing failure in test_algo_nll WIP
* test_algo_nll passes with both ppo and softq
* re-enabled tests
* run trex on gpus for 100k total (horizon=100k / 16) RUN BENCHMARKS
* merged latest peterz_codegen_benchmarks RUN BENCHMARKS
* fixing codegen test failures (logging-related)
* fixed name collision in run-benchmarks-new.py RUN BENCHMARKS
* fixed name collision in run-benchmarks-new.py RUN BENCHMARKS
* fixed import in node_filters.py
* test_algo_search passes
* some cleanup
* dummy commit to RUN BENCHMARKS
* merge fast fail for subprocvecenv RUN BENCHMARKS
* use SubprocVecEnv in sonic_prob
* added deprecation note to shmem_vec_env
* allow indexing of distributions
* add timeout to pipeline.yaml
* typo in pipeline.yml
* run tests with --forked option
* resolved merge conflict in rl_algs.bench.benchmarks
* re-enable parallel tests
* fix remaining merge conflicts and syntax
* Update trex_prob.py
* fixes to ResultsWriter
* take baselines/run.py from peterz_codegen branch
* actually save stuff to file in VecMonitor RUN BENCHMARKS
* enable parallel tests
* merge stricter flake8
* merge peterz_codegen_benchmark, resolve conflicts
* autopep8
* remove traces of Monitor from trex env, check shapes before encoding in CategoricalPd
* asserts and warnings to make q -> distribution change more explicit
* fixed assert in CategoricalPd
* add header to vec_monitor output file RUN BENCHMARKS
* make VecMonitor write header to the output file
* remove deprecation message from shmem_vec_env RUN BENCHMARKS
* autopep8
* proper shape test in distributions.py
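
  A hedged sketch of the kind of shape test the CategoricalPd entries above describe: one-hot encode integer actions only when they are not already one-hot (TF 1.x style; function and variable names are illustrative, not copied from baselines):

  ```python
  import tensorflow as tf


  def maybe_one_hot(actions, logits):
      """Return actions as a float tensor matching logits' last dimension.

      If `actions` already has the same rank and last dimension as `logits`,
      assume it is one-hot (or a soft distribution) and pass it through;
      otherwise treat it as integer class indices and one-hot encode it.
      """
      ncat = logits.get_shape().as_list()[-1]
      if actions.shape.ndims == logits.shape.ndims and actions.get_shape().as_list()[-1] == ncat:
          return tf.cast(actions, tf.float32)
      assert actions.shape.ndims == logits.shape.ndims - 1, \
          "actions must be class indices or one-hot vectors"
      return tf.one_hot(tf.cast(actions, tf.int32), ncat, dtype=tf.float32)
  ```
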
* ResultsWriter can take dict headers
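
  The monitor entries above revolve around writing a JSON header line followed by CSV episode rows; here is a minimal sketch of that format, assuming a '#'-prefixed JSON header and reward/length/time columns (the exact keys in baselines may differ):

  ```python
  import csv
  import json
  import time


  class ResultsWriterSketch:
      """Write monitor-style results: one '#'-prefixed JSON header line, then CSV rows."""

      def __init__(self, filename, header=None):
          header = header or {}
          if isinstance(header, dict):
              header = {"t_start": time.time(), **header}
          self.f = open(filename, "wt")
          self.f.write("#%s\n" % json.dumps(header))
          self.logger = csv.DictWriter(self.f, fieldnames=("r", "l", "t"))
          self.logger.writeheader()
          self.f.flush()

      def write_row(self, epinfo):
          # epinfo is e.g. {"r": episode_reward, "l": episode_length, "t": wall_time}
          self.logger.writerow(epinfo)
          self.f.flush()
  ```
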
* dummy commit to RUN BENCHMARKS
* replace assert len(qs)==1 with warning RUN BENCHMARKS
* removed pdb from ppo2 RUN BENCHMARKS
* re-setting up travis
* re-setting up travis
* resolved merge conflicts, added missing dependency for codegen
* removed parallel tests (workers are failing for some reason)
* try test baselines only
* added language options - some weirdness in rcall image that requires them?
* added verbosity to tests
* try tests in baselines only
* ci/runtests.sh tests codegen (some failure on baselines specifically on travis, trying to narrow down the problem)
* removed render from codegen test - maybe that's the problem?
* trying even simpler command within the image to figure out the problem
* print out system info in ci/runtests.sh
* print system info outside of docker as well
* trying single test file in codegen
* install graphviz in the docker image
* git subrepo pull baselines
  subrepo:
    subdir: "baselines"
    merged: "8c2aea2"
  upstream:
    origin: "git@github.com:openai/baselines.git"
    branch: "master"
    commit: "8c2aea2"
  git-subrepo:
    version: "0.4.0"
    origin: "git@github.com:ingydotnet/git-subrepo.git"
    commit: "74339e8"
* added graphviz to the dockerfile (need both graphviz-dev and graphviz)
* only tests in codegen/algo/test_algo_builder.py
* run baselines tests only. still no clue why collection of codegen tests fails
* update baselines setup to install filelock for tests
* run slow tests
* skip slow tests in baselines
* single test file in baselines
* try reinstalling tensorflow
* running slow tests
* try full baselines and codegen test suite
* in the test Dockerfile, reinstall tensorflow
* using fake display for codegen render tests
* fixed display-related failures by adding a custom entrypoint to the docker image
* set LC_ALL and LANG env variables in docker image
* try sequential tests
* include psutil in requirements; increase relative tolerance in test_low_level_algo_distr
* trying to fix codegen failures on travis
* git subrepo commit (merge) baselines
  subrepo:
    subdir: "baselines"
    merged: "9ce84da"
  upstream:
    origin: "git@github.com:openai/baselines.git"
    branch: "master"
    commit: "b222dd0"
  git-subrepo:
    version: "0.4.0"
    origin: "git@github.com:ingydotnet/git-subrepo.git"
    commit: "74339e8"
* syntax in install.py
* changing the order of package installation
* removed supervised-reptile from installation list
* cron uses the full games repo in rcall
* flake8 complaints
* rewrite all extras logic in baselines, install.py always uses [all]
* exported rl-algs
* more stuff from rl-algs
* run slow tests
* re-exported rl_algs
* re-exported rl_algs - fixed problems with serialization test and test_cartpole
* replaced atari_arg_parser with common_arg_parser
* run.py can run algos from both baselines and rl_algs
* added approximate humanoid reward with ppo2 to the README for reference
* dummy commit to RUN BENCHMARKS
* dummy commit to RUN BENCHMARKS
* dummy commit to RUN BENCHMARKS
* dummy commit to RUN BENCHMARKS
* very dummy commit to RUN BENCHMARKS
* serialize variables as a dict, not as a list
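
  A sketch of what name-keyed (rather than order-dependent) serialization can look like in TF 1.x; joblib is used purely as an example container and the helper names are not the actual baselines functions:

  ```python
  import joblib
  import tensorflow as tf


  def save_variables(save_path, variables, sess):
      # map variable name -> value so restore does not depend on list order
      values = sess.run(variables)
      ps = {v.name: value for v, value in zip(variables, values)}
      joblib.dump(ps, save_path)


  def load_variables(load_path, variables, sess):
      loaded = joblib.load(load_path)
      assigns = [v.assign(loaded[v.name]) for v in variables if v.name in loaded]
      sess.run(assigns)
  ```
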
* running_mean_std uses tensorflow variables
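
  For the TF-variable running statistics, a compact sketch that stores mean/var/count in non-trainable variables (so they get saved with everything else) and applies the parallel mean/variance update; the class name and exact update mechanics are illustrative:

  ```python
  import numpy as np
  import tensorflow as tf


  class TfRunningMeanStdSketch:
      """Running mean/var/count kept in non-trainable TF variables so that
      normalization statistics are saved and restored with the model."""

      def __init__(self, shape, sess, epsilon=1e-4):
          self.sess = sess
          self.mean = tf.Variable(np.zeros(shape), dtype=tf.float64, trainable=False)
          self.var = tf.Variable(np.ones(shape), dtype=tf.float64, trainable=False)
          self.count = tf.Variable(epsilon, dtype=tf.float64, trainable=False)
          # placeholders + assign ops built once, so update() does not grow the graph
          self._new_mean = tf.placeholder(tf.float64, shape)
          self._new_var = tf.placeholder(tf.float64, shape)
          self._new_count = tf.placeholder(tf.float64, ())
          self._update_ops = [self.mean.assign(self._new_mean),
                              self.var.assign(self._new_var),
                              self.count.assign(self._new_count)]
          sess.run(tf.variables_initializer([self.mean, self.var, self.count]))

      def update(self, x):
          batch_mean, batch_var, batch_count = x.mean(axis=0), x.var(axis=0), x.shape[0]
          mean, var, count = self.sess.run([self.mean, self.var, self.count])
          # parallel (Chan et al.) combination of the old and batch statistics
          delta, tot = batch_mean - mean, count + batch_count
          new_mean = mean + delta * batch_count / tot
          m2 = var * count + batch_var * batch_count + delta ** 2 * count * batch_count / tot
          self.sess.run(self._update_ops, feed_dict={self._new_mean: new_mean,
                                                     self._new_var: m2 / tot,
                                                     self._new_count: tot})
  ```
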
* fixed import in vec_normalize
* dummy commit to RUN BENCHMARKS
* dummy commit to RUN BENCHMARKS
* flake8 complaints
* save all variables to make sure we save the vec_normalize normalization
* benchmarks on ppo2 only RUN BENCHMARKS
* make_atari_env compatible with mpi
* run ppo_mpi benchmarks only RUN BENCHMARKS
* hardcode names of retro environments
* add defaults
* changed default ppo2 lr schedule to linear RUN BENCHMARKS
* non-tf normalization benchmark RUN BENCHMARKS
* use ncpu=1 for mujoco sessions - gives a bit of a performance speedup
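
  A sketch of the session configuration that note refers to, assuming TF 1.x; where exactly baselines sets this may differ:

  ```python
  import multiprocessing
  import tensorflow as tf

  # For small MuJoCo-sized networks, a single thread per pool avoids
  # oversubscription overhead; larger (e.g. Atari) jobs can use every core.
  ncpu = 1  # or multiprocessing.cpu_count() for bigger workloads
  config = tf.ConfigProto(allow_soft_placement=True,
                          intra_op_parallelism_threads=ncpu,
                          inter_op_parallelism_threads=ncpu)
  sess = tf.Session(config=config)
  ```
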
* reverted running_mean_std to use property decorators for mean, var, count
* reverted VecNormalize to use RunningMeanStd (no tf)
* reverted VecNormalize to use RunningMeanStd (no tf)
* profiling wip
* use VecNormalize with regular RunningMeanStd
* added acer runner (missing import)
* flake8 complaints
* added a note in README about TfRunningMeanStd and serialization of VecNormalize
* dummy commit to RUN BENCHMARKS
* merged benchmarks branch