baselines

Author	SHA1	Message	Date
uronce-cc	0a40206c6c	ncpu needs to be an integer. (#558 )	2018-08-31 09:02:18 -07:00
Alfredo Canziani	1937826784	Fix alien syntax and apply PEP 8 style (#554 )	2018-08-30 17:21:25 -07:00
pzhokhov	b29c8020d7	remove saving model as a pickle file in ppo2 (tries to pull environment in; bad idea - may need to use constructor argument pickling or somesuch if at all necessary) (#69 )	2018-08-30 13:41:38 -07:00
Peter Zhokhov	4ec308aaa4	fixed syntax	2018-08-30 13:41:38 -07:00
Peter Zhokhov	3bbf3f3511	allow_early_resets=True in create_vec_env	2018-08-30 13:41:38 -07:00
Joshua Meier	e5de29a954	instructions for tensorboard (#61 )	2018-08-30 13:41:37 -07:00
Joshua Meier	2507d335f9	Tensorboard util (#60 ) * separate_validation_set was not imported * launching tensorboard automatically	2018-08-30 13:41:37 -07:00
Damien Lancry	bdd4d385a6	Fix result_plotters in vectorized mujoco environments (#533 ) * I investigated a bit about running a training in a vectorized monitored mujoco env and found out that the 0.monitor.csv file could not be plotted using baselines.results_plotter.py functions. Moreover the seed is the same in every parallel environments due to the particular behaviour of lambda. this fixes both issues without breaking the function in other files (baselines.acktr.run_mujoco still works) * unifies make_atari_env and make_mujoco_env * redefine make_mujoco_env because of run_mujoco in acktr not compatible with DummyVecEnv and SubprocVecEnv * fix if else * Update run.py	2018-08-28 17:48:56 -07:00
Peter Zhokhov	0961f5dd94	git subrepo pull (merge) baselines subrepo: subdir: "baselines" merged: "95a81e86" upstream: origin: "git@github.com:openai/baselines.git" branch: "master" commit: "c6c0f45c" git-subrepo: version: "0.4.0" origin: "git@github.com:ingydotnet/git-subrepo.git" commit: "74339e8"	2018-08-27 16:40:14 -07:00
Christopher Hesse	337d913a8f	remove reset_task from subproc vec env (#45 )	2018-08-27 16:40:14 -07:00
Karl Cobbe	34af61a132	baselines: fix dummy vec env render mode (#42 )	2018-08-27 16:40:14 -07:00
Christopher Hesse	1ea5ec647c	export SimpleEnv and assert_envs_equal, fix minor bug in action space (#46 )	2018-08-27 16:40:14 -07:00
pzhokhov	2fc7a1cbee	Trigger benchmarks from buildkite (#40 ) * rig buildkite pipeline to run benchmarks when commit ends with RUN BENCHMARKS * fix the buildkite pipeline file * fix the buildkite pipeline file * fix the buildkite pipeline file * fix the buildkite pipeline file * fix the buildkite pipeline file * fix the buildkite pipeline file * fix the buildkite pipeline file - merge test and benchmark steps * fix the buildkite pipeline file - merge test and benchmark steps * fix buildkite pipeline file * fix buildkite pipeline file * dry RUN BENCHMARKS * dry RUN BENCHMARKS * dry not run BENCHMARKS * not run benchmarks * not running benchmarks * no running benchmarks * no running benchmarks * still not running benchmarks * dummy commit to RUN BENCHMARKS * trigger benchmarks from buildkite RUN BENCHMARKS * specifying RCALL_KUBE_CLUSTER RUN BENCHMARKS * remove rl-algs/run-benchmarks-new.py (moved to ci), merged baselines/common/console_util and baselines/common/util.py * added missing imports in console_util * clone subrepo over https	2018-08-27 16:40:14 -07:00
John Schulman	14c1d69ef4	Reduce duplication in VecEnv subclasses. (#38 ) * Reduce duplication in VecEnv subclasses. Now VecEnv base class handles rendering and closing; subclasses should provide get_images and (optionally) close_extras. * fix tests * minor docstring change * raise NotImplementedError	2018-08-27 16:40:13 -07:00
pzhokhov	c8f6d8bac7	address rl-algs issue #169 (missing util functions from rcall) (#30 ) * copied parts of util.py to baselines.common from rcall * merged fix for baselines.logger, resolved conflicts * copied ccap to baselines/baselines/common/util.py	2018-08-27 16:40:13 -07:00
pzhokhov	3a006ba50e	flake8 fixes (#35 ) * flake8 fixes * added baselines/setup.cfg * style checks using setup.cfg in baselines	2018-08-27 16:40:13 -07:00
Tom	c6c0f45cb1	fix 'async' is a reserved word in Python >= 3.7 (#495 ) (#542 )	2018-08-27 12:36:43 -07:00
wangjksjtu	e92a6ad8f4	Update README.md (#537 ) １. Delete repetitive section 2. Align the commands	2018-08-27 12:35:48 -07:00
HelgeS	92b9a37257	Updated example commands to run ppo2 (#534 ) The headline mentions PPO, but the command was for A2C	2018-08-23 15:58:27 -07:00
Armin Primadi	cb14da96ca	Fix typo on policies documentation (#535 )	2018-08-23 15:56:13 -07:00
pzhokhov	3900f2a447	baselines issue 146 (remove tensorflow from setup.py) (#34 ) * baselines does not reinstall tensorflow * fix the version check in baselines/setup.py * replace print and assert with assert, str (thanks @csh)	2018-08-21 16:59:05 -07:00
pzhokhov	20d22a5d79	Fix baselines build (fails due to lack of mujoco in public baselines container) (#29 ) * make nminibatces = min(nminibatches, nenv) * clarify the usage of lstm policy, add an example and a test * cleaned up example, added assert to the test * remove nminibatches -> min(nminibatches, num_env) * removed code snippet from the docstring, pointing to the file * add _mujoco_present flag to skip the tests that require mujoco if mujoco is not present * re-format skip message in test_doc_examples * flake8 complaints	2018-08-21 10:08:24 -07:00
pzhokhov	caf7b08b4d	Baselines issue #525 (lack of docs for recurrent policies) (#27 ) * make nminibatces = min(nminibatches, nenv) * clarify the usage of lstm policy, add an example and a test * cleaned up example, added assert to the test * remove nminibatches -> min(nminibatches, num_env) * removed code snippet from the docstring, pointing to the file	2018-08-20 13:55:35 -07:00
Peter Zhokhov	ca0165cdf5	flake8 complaints	2018-08-17 18:11:00 -07:00
pzhokhov	eb5b605f86	restore subrepo conftest.py files (#22 ) * restore conftest.py in subrepos * remove conftest files from subrepos in the docker image * remove runslow flag from baselines .travis.yml and rl-algs ci/runtests.sh * move import of rendering module into the code to fix tests that don't require a display * restore the dockerfile	2018-08-17 17:02:39 -07:00
Peter Zhokhov	a89bee3c8d	Merge commit 'refs/subrepo/baselines/fetch' into subrepo/baselines	2018-08-17 13:55:27 -07:00
pzhokhov	353bb15e90	deduplicate algorithms in rl-algs and baselines (#18 ) * move vec_env * cleaning up rl_common * tests are passing (but mosts tests are deleted as moved to baselines) * add benchmark runner for smoke tests * removed duplicated algos * route references to rl_algs.a2c to baselines.a2c * route references to rl_algs.a2c to baselines.a2c * unify conftest.py * removing references to duplicated algs from codegen * removing references to duplicated algs from codegen * alex's changes to dummy_vec_env * fixed test_carpole[deepq] testcase by decreasing number of training steps... alex's changes seemed to have fixed the bug and make it train better, but at seed=0 there is a dip in the training curve at 30k steps that fails the test * codegen tests with atol=1e-6 seem to be unstable * rl_common.vec_env -> baselines.common.vec_env mass replace * fixed reference in trpo_mpi * a2c.util references * restored rl_algs.bench in sonic_prob * fix reference in ci/runtests.sh * simplifed expression in baselines/common/cmd_util * further increased rtol to 1e-3 in codegen tests * switched vecenvs to use SimpleImageViewer from gym instead of cv2 * make run.py --play option work with num_envs > 1 * make rosenbrock test reproducible * git subrepo pull (merge) baselines subrepo: subdir: "baselines" merged: "e23524a5" upstream: origin: "git@github.com:openai/baselines.git" branch: "master" commit: "bcde04e7" git-subrepo: version: "0.4.0" origin: "git@github.com:ingydotnet/git-subrepo.git" commit: "74339e8" * updated baselines README (num-timesteps --> num_timesteps) * typo in deepq/README.md	2018-08-17 13:54:11 -07:00
pzhokhov	64c0c0a043	Setup travis (#12 ) * re-setting up travis * re-setting up travis * resolved merge conflicts, added missing dependency for codegen * removed parallel tests (workers are failing for some reason) * try test baselines only * added language options - some weirdness in rcall image that requires them? * added verbosity to tests * try tests in baselines only * ci/runtests.sh tests codegen (some failure on baselines specifically on travis, trying to narrow down the problem) * removed render from codegen test - maybe that's the problem? * trying even simpler command within the image to figure out the problem * print out system info in ci/runtests.sh * print system info outside of docker as well * trying single test file in codegen * install graphviz in the docker image * git subrepo pull baselines subrepo: subdir: "baselines" merged: "8c2aea2" upstream: origin: "git@github.com:openai/baselines.git" branch: "master" commit: "8c2aea2" git-subrepo: version: "0.4.0" origin: "git@github.com:ingydotnet/git-subrepo.git" commit: "74339e8" * added graphviz to the dockerfile (need both graphviz-dev and graphviz) * only tests in codegen/algo/test_algo_builder.py * run baselines tests only. still no clue why collection of codegen tests fails * update baselines setup to install filelock for tests * run slow tests * skip slow tests in baselines * single test file in baselines * try reinstalling tensorflow * running slow tests * try full baselines and codegen test suite * in the test Dockerfile, reinstall tensorflow * using fake display for codegen render tests * fixed display-related failures by adding a custom entrpoint to the docker image * set LC_ALL and LANG env variables in docker image * try sequential tests * include psutil in requirements; increase relative tolerance in test_low_level_algo_distr * trying to fix codegen failures on travis * git subrepo commit (merge) baselines subrepo: subdir: "baselines" merged: "9ce84da" upstream: origin: "git@github.com:openai/baselines.git" branch: "master" commit: "b222dd0" git-subrepo: version: "0.4.0" origin: "git@github.com:ingydotnet/git-subrepo.git" commit: "74339e8" * syntax in install.py * changing the order of package installation * removed supervised-reptile from installation list * cron uses the full games repo in rcall * flake8 complaints * rewrite all extras logic in baselines, install.py always uses [all]	2018-08-17 13:54:10 -07:00
pzhokhov	5fee99e771	Setup travis (#12 ) * re-setting up travis * re-setting up travis * resolved merge conflicts, added missing dependency for codegen * removed parallel tests (workers are failing for some reason) * try test baselines only * added language options - some weirdness in rcall image that requires them? * added verbosity to tests * try tests in baselines only * ci/runtests.sh tests codegen (some failure on baselines specifically on travis, trying to narrow down the problem) * removed render from codegen test - maybe that's the problem? * trying even simpler command within the image to figure out the problem * print out system info in ci/runtests.sh * print system info outside of docker as well * trying single test file in codegen * install graphviz in the docker image * git subrepo pull baselines subrepo: subdir: "baselines" merged: "8c2aea2" upstream: origin: "git@github.com:openai/baselines.git" branch: "master" commit: "8c2aea2" git-subrepo: version: "0.4.0" origin: "git@github.com:ingydotnet/git-subrepo.git" commit: "74339e8" * added graphviz to the dockerfile (need both graphviz-dev and graphviz) * only tests in codegen/algo/test_algo_builder.py * run baselines tests only. still no clue why collection of codegen tests fails * update baselines setup to install filelock for tests * run slow tests * skip slow tests in baselines * single test file in baselines * try reinstalling tensorflow * running slow tests * try full baselines and codegen test suite * in the test Dockerfile, reinstall tensorflow * using fake display for codegen render tests * fixed display-related failures by adding a custom entrpoint to the docker image * set LC_ALL and LANG env variables in docker image * try sequential tests * include psutil in requirements; increase relative tolerance in test_low_level_algo_distr * trying to fix codegen failures on travis * git subrepo commit (merge) baselines subrepo: subdir: "baselines" merged: "9ce84da" upstream: origin: "git@github.com:openai/baselines.git" branch: "master" commit: "b222dd0" git-subrepo: version: "0.4.0" origin: "git@github.com:ingydotnet/git-subrepo.git" commit: "74339e8" * syntax in install.py * changing the order of package installation * removed supervised-reptile from installation list * cron uses the full games repo in rcall * flake8 complaints * rewrite all extras logic in baselines, install.py always uses [all]	2018-08-17 13:40:02 -07:00
Youngjin Kim	5edcd6886e	Fix argument error in deepq (#508 ) * Fix argment error in deepq * Fix argment error in deepq	2018-08-16 14:55:57 -07:00
Youngjin Kim	bcde04e710	Fix argument error in deepq (#508 ) * Fix argment error in deepq * Fix argment error in deepq	2018-08-16 14:55:57 -07:00
pzhokhov	cd375ab209	update readmes (#514 ) * update per-algorithm READMEs to reflect new way of running algorithms * adding a link to repo-wide README * updated README files and deepq.train_cartpole example	2018-08-16 14:53:49 -07:00
pzhokhov	5622a09fa4	update readmes (#514 ) * update per-algorithm READMEs to reflect new way of running algorithms * adding a link to repo-wide README * updated README files and deepq.train_cartpole example	2018-08-16 14:53:49 -07:00
Pim de Haan	e2da7cd42f	Several bugfixes for #504 , #505 , #506 related to Classic Control and deepq (#507 ) * Several bugfixes * Fixed ActWrapper.step bug	2018-08-16 12:08:53 -07:00
Peter Zhokhov	b222dd0610	updated links in README to point to master	2018-08-13 16:01:24 -07:00
pzhokhov	1870685071	Publish benchmark results (#502 ) * updated benchmark pages with final rewards * use htmlpreview to render pages * use htmlpreview to render pages * use htmlpreview to render pages * updated README to reflect ppo1 being obsolete * removed navbars from published benchmark pages * fixed link in README	2018-08-13 15:59:43 -07:00
pzhokhov	8c2aea2add	refactor a2c, acer, acktr, ppo2, deepq, and trpo_mpi (#490 ) * exported rl-algs * more stuff from rl-algs * run slow tests * re-exported rl_algs * re-exported rl_algs - fixed problems with serialization test and test_cartpole * replaced atari_arg_parser with common_arg_parser * run.py can run algos from both baselines and rl_algs * added approximate humanoid reward with ppo2 into the README for reference * dummy commit to RUN BENCHMARKS * dummy commit to RUN BENCHMARKS * dummy commit to RUN BENCHMARKS * dummy commit to RUN BENCHMARKS * very dummy commit to RUN BENCHMARKS * serialize variables as a dict, not as a list * running_mean_std uses tensorflow variables * fixed import in vec_normalize * dummy commit to RUN BENCHMARKS * dummy commit to RUN BENCHMARKS * flake8 complaints * save all variables to make sure we save the vec_normalize normalization * benchmarks on ppo2 only RUN BENCHMARKS * make_atari_env compatible with mpi * run ppo_mpi benchmarks only RUN BENCHMARKS * hardcode names of retro environments * add defaults * changed default ppo2 lr schedule to linear RUN BENCHMARKS * non-tf normalization benchmark RUN BENCHMARKS * use ncpu=1 for mujoco sessions - gives a bit of a performance speedup * reverted running_mean_std to user property decorators for mean, var, count * reverted VecNormalize to use RunningMeanStd (no tf) * reverted VecNormalize to use RunningMeanStd (no tf) * profiling wip * use VecNormalize with regular RunningMeanStd * added acer runner (missing import) * flake8 complaints * added a note in README about TfRunningMeanStd and serialization of VecNormalize * dummy commit to RUN BENCHMARKS * merged benchmarks branch	2018-08-13 09:56:44 -07:00
Tony Yu Cao	366f486e34	Update README.md (#416 ) Update Atari example	2018-08-08 10:42:10 -07:00
Adam Gleave	f272969325	GAIL: bugfix in dataset loading (#447 ) * Fix silly typo * Replace ad-hoc function with NumPy code	2018-07-06 16:12:14 -07:00
pzhokhov	a6b1bc70f1	re-import internal; fix missing tile_images.py (#427 ) * import rl-algs from 2e3a166 commit * extra import of the baselines badge * exported commit with identity test * proper rng seeding in the test_identity * import internal * adding missing tile_images.py	2018-06-08 09:41:45 -07:00
pzhokhov	36ee5d1707	Import internal changes (#422 ) * import rl-algs from 2e3a166 commit * extra import of the baselines badge * exported commit with identity test * proper rng seeding in the test_identity * import internal	2018-06-06 11:39:13 -07:00
pzhokhov	24fe3d6576	Import internal repo (#409 ) * import rl-algs from 2e3a166 commit * extra import of the baselines badge * exported commit with identity test * proper rng seeding in the test_identity	2018-05-21 15:24:00 -07:00
pzhokhov	9cb7ece338	add opencv-python to the dependencies (#407 )	2018-05-14 10:52:19 -07:00
pzhokhov	9cf95a0054	setup travis ci build (#388 ) * simple .travis.yml file * added static syntax checks of common to .travis.yml * dockerizing the build * fix Dockerfile, adding build shield * cleaning up workdir in Dockerfile and .travis.yml * .travis.yml fixed common -> baselines/common for style check	2018-05-03 09:43:28 -07:00
pzhokhov	8b781038cc	put filters and running_stat files in common instead of acktr (#389 )	2018-05-02 18:42:48 -07:00
pzhokhov	69f25c6028	import internal repo (#385 )	2018-05-01 16:54:04 -07:00
pzhokhov	2b0283b9db	Readme.md detailed installation instructions (#377 ) * changes to README.md files with more detailed installation instructions * md-fying the changes better * link on the word homebrew in readme.md * typos in README.md * README.md * removed extra comma sign * removed sudo from brew command	2018-04-25 17:40:48 -07:00
Matthias Plappert	1f8a03f3a6	Update README	2018-03-26 16:50:22 +02:00
Matthias Plappert	3cc7df0608	Minor fixes to HER release (#319 ) * Fix plotting script * Add warning if num_cpu = 1	2018-03-05 11:06:17 +01:00
Alex Nichol	8b3a6c2051	fix DummyVecEnv reusing buffers	2018-03-02 17:18:07 -08:00


164 Commits