* refactor acktr
* setup.cfg now tests style/syntax in acktr as well
* flake8 complaints
* added note about continuous action spaces for acktr into the README.md
* Add possibility of plotting timesteps vs episodes
* Remove leftover from personal project patch
* Auto plt.tight_layout() on resize window event
Calls `plt.tight_layout()` if a `resize_event` is issued.
This means that the plot will look good even after the user has resized the plotting window.
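The resize behaviour described above can be sketched as follows. This is a minimal illustration under assumptions, not the code from this commit: the helper name `enable_auto_tight_layout` is hypothetical, the plotted data and axis labels are made up, and the handler calls `Figure.tight_layout()` on the resized figure instead of `plt.tight_layout()` so that it does not depend on which figure is current.

```python
import matplotlib.pyplot as plt


def enable_auto_tight_layout(fig):
    """Re-run tight_layout() whenever the figure's window is resized.

    Hypothetical helper for illustration; returns the callback id so the
    handler can later be removed with fig.canvas.mpl_disconnect(cid).
    """
    def _on_resize(event):
        # event.canvas.figure is the figure whose window was resized.
        event.canvas.figure.tight_layout()
        # Schedule a redraw so the new layout becomes visible.
        event.canvas.draw_idle()

    return fig.canvas.mpl_connect("resize_event", _on_resize)


# Illustrative figure; the data and labels are placeholders.
fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4])
ax.set_xlabel("timesteps")
ax.set_ylabel("episode reward")
cid = enable_auto_tight_layout(fig)
```

Calling `plt.show()` afterwards opens the window, and each resize then triggers the handler. Keeping the returned `cid` allows the behaviour to be switched off again with `fig.canvas.mpl_disconnect(cid)`.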
* fix discovered test failures
* autopep8
* test indices up to 123
* testing from index 124 on
* add scope to logstd
* fix flakiness in test_train_mle
* autopep8
* add some docstrings
* start making big changes
* state machine redesign
* sampling seems to work
* some reorg
* fixed sampling of real vals
* json conversion
* made it possible to register new commands
got nontrivial version of Pred working
* consolidate command definitions
* add more macro blocks
* revived visualization
* rename Userdata -> CmdInterpreter
make AlgoSmInstance subclass of SmInstance that uses appropriate userdata argument
* replace userdata by ci when appropriate
* minor test fixes
* revamped handmade dir, can run ppo_metal
* seed to avoid random test failure
* implement AlgoAgent
* Autogenerated object that performs all ops and macros
* more CmdRecorder changes
* move files around
* move MatchProb and JtftProb
* remove obsolete
* fix tests involving AlgoAgent (pending the next commit on ppo_metal code)
* ppo_metal: reduce duplication in policy_gen, make sess an attribute of PpoAgent and StochasticPolicy instead of using get_default_session everywhere.
* maze_env reformatting, move algo_search script (but still broken)
* move agent.py
* fix test on handcrafted agents
* tuning/fixing ppo_metal baseline
* minor
* Fix ppo_metal baseline
* Don’t set epcount, tcount unless they’re being used
* get rid of old ppo_metal baseline
* fixes for handmade/run.py tuning
* fix codegen ppo
* fix handmade ppo hps
* fix test, go back to safe_div
* switch to more complex filtering
* make sure all handcrafted algos have finite probability
* train to maximize logprob of provided samples
Trex changes to avoid segfault
* AlgoSm also includes global hyperparams
* don’t duplicate global hyperparam defaults
* create generic_ob_ac_space function
* use sorted list of outkeys
* revive tsne
* todo changes
* determinism test
* todo + test fix
* remove a few deprecated files, rename other tests so they don’t run automatically, fix real test failure
* continuous control with codegen
* continuous control with codegen
* implement continuous action space algodistr
* ppo with trex RUN BENCHMARKS
* wrap trex in a monitor
* dummy commit to RUN BENCHMARKS
* adding monitor to trex env RUN BENCHMARKS
* adding monitor to trex RUN BENCHMARKS
* include monitor into trex env RUN BENCHMARKS
* generate nll and predmean using Distribution node
* dummy commit to RUN BENCHMARKS
* include pybullet into baselines optional dependencies
* dummy commit to RUN BENCHMARKS
* install games for cron rcall user RUN BENCHMARKS
* add --yes flag to install.py in rcall config for cron user RUN BENCHMARKS
* both continuous and discrete versions seem to run
* fixes to monitor to work with vecenv-like info and rewards RUN BENCHMARKS
* dummy commit to RUN BENCHMARKS
* removed shape check from one-hot encoding logic in distributions.CategoricalPd
* reset logger configuration in codegen/handmade/run.py to be in line with baselines RUN BENCHMARKS
* merged peterz_codegen_benchmarks RUN BENCHMARKS
* skip tests RUN BENCHMARKS
* working on test failures
* save benchmark dicts RUN BENCHMARK
* merged peterz_codegen_benchmark RUN BENCHMARKS
* add get_git_commit_message to the baselines.common.console_util
* dummy commit to RUN BENCHMARKS
* merged fixes from peterz_codegen_benchmark RUN BENCHMARKS
* fixing failure in test_algo_nll WIP
* test_algo_nll passes with both ppo and softq
* re-enabled tests
* run trex on gpus for 100k total (horizon=100k / 16) RUN BENCHMARKS
* merged latest peterz_codegen_benchmarks RUN BENCHMARKS
* fixing codegen test failures (logging-related)
* fixed name collision in run-benchmarks-new.py RUN BENCHMARKS
* fixed name collision in run-benchmarks-new.py RUN BENCHMARKS
* fixed import in node_filters.py
* test_algo_search passes
* some cleanup
* dummy commit to RUN BENCHMARKS
* merge fast fail for subprocvecenv RUN BENCHMARKS
* use SubprocVecEnv in sonic_prob
* added deprecation note to shmem_vec_env
* allow indexing of distributions
* add timeout to pipeline.yaml
* typo in pipeline.yml
* run tests with --forked option
* resolved merge conflict in rl_algs.bench.benchmarks
* re-enable parallel tests
* fix remaining merge conflicts and syntax
* Update trex_prob.py
* fixes to ResultsWriter
* take baselines/run.py from peterz_codegen branch
* actually save stuff to file in VecMonitor RUN BENCHMARKS
* enable parallel tests
* merge stricter flake8
* merge peterz_codegen_benchmark, resolve conflicts
* autopep8
* remove traces of Monitor from trex env, check shapes before encoding in CategoricalPd
* asserts and warnings to make q -> distribution change more explicit
* fixed assert in CategoricalPd
* add header to vec_monitor output file RUN BENCHMARKS
* make VecMonitor write header to the output file
* remove deprecation message from shmem_vec_env RUN BENCHMARKS
* autopep8
* proper shape test in distributions.py
* ResultsWriter can take dict headers
* dummy commit to RUN BENCHMARKS
* replace assert len(qs)==1 with warning RUN BENCHMARKS
* removed pdb from ppo2 RUN BENCHMARKS
* fixes to enjoy_cartpole, enjoy_mountaincar.py
* fixed {train,enjoy}_pong, removed enjoy_retro
* set number of timesteps to 1e7 in train_pong
* flake8 complaints
* use synchronous version of acktr in test_env_after_learn
* flake8
* implement pdfromlatent in BernoulliPdType
* remove env.close() at the end of algorithms
* test case for environment after learn
* closing env in run.py
* fixes for acktr and trpo_mpi
* add make_session with new graph for every call in test_env_after_learn
* remove extra prints from test_env_after_learn
* Add lots of docstrings
Change hyperparameter transformations for slightly better efficiency and to avoid circular dependency.
Now all parameters are stored in a “human-readable” form.
* improve pretty-print of nodes and trees
* newlines at end-of-file, return graph in render(), assert_valid() fix
* split run_algo_search.py into several simpler scripts
* add joint_train option to get_prob
* minor changes to soln_db and embedding script
* Arguments: -> Args:
* fix replay, part 1
* fix behavior when using unpickled algos
* re-add retrieve_weights
* make training scripts more consistent
* lint
* lint
* lint + remove some rendering functionality from trex env as it’s also elsewhere
* get rid of warnings
* refactor functionality for getting final q-function and losses. revive code for removing useless terms & tests for simplification.
* fix vecenv closing
* finish removing algo folder (most useful functionality has been moved out of it)
* control verbosity of trex
* fix tests
* rename spec => choice_spec, some comments, asserts, debug prints
* fix some tests
* putting instructions from README.md into a script
* install roboschool as a part of setup.py
* install roboschool from install.py
* export pkg_config_path
* remove compilation step from roboschool/setup.py
* removed roboschool install from games install due to extra compilation step
* removed unused import from roboschool/setup.py
* error if logger looks wrong
* check version of logger, call logger.configure() on import
* remove changes entry
* add version to rl-algs
* fix typo
* add comment
* switch version to string
* set logger env variable