baselines

Author	SHA1	Message	Date
Youngjin Kim	bcde04e710	Fix argument error in deepq (#508) * Fix argment error in deepq * Fix argment error in deepq	2018-08-16 14:55:57 -07:00
pzhokhov	5622a09fa4	update readmes (#514) * update per-algorithm READMEs to reflect new way of running algorithms * adding a link to repo-wide README * updated README files and deepq.train_cartpole example	2018-08-16 14:53:49 -07:00
Pim de Haan	e2da7cd42f	Several bugfixes for #504, #505, #506 related to Classic Control and deepq (#507) * Several bugfixes * Fixed ActWrapper.step bug	2018-08-16 12:08:53 -07:00
Peter Zhokhov	b222dd0610	updated links in README to point to master	2018-08-13 16:01:24 -07:00
pzhokhov	1870685071	Publish benchmark results (#502) * updated benchmark pages with final rewards * use htmlpreview to render pages * use htmlpreview to render pages * use htmlpreview to render pages * updated README to reflect ppo1 being obsolete * removed navbars from published benchmark pages * fixed link in README	2018-08-13 15:59:43 -07:00
pzhokhov	8c2aea2add	refactor a2c, acer, acktr, ppo2, deepq, and trpo_mpi (#490) * exported rl-algs * more stuff from rl-algs * run slow tests * re-exported rl_algs * re-exported rl_algs - fixed problems with serialization test and test_cartpole * replaced atari_arg_parser with common_arg_parser * run.py can run algos from both baselines and rl_algs * added approximate humanoid reward with ppo2 into the README for reference * dummy commit to RUN BENCHMARKS * dummy commit to RUN BENCHMARKS * dummy commit to RUN BENCHMARKS * dummy commit to RUN BENCHMARKS * very dummy commit to RUN BENCHMARKS * serialize variables as a dict, not as a list * running_mean_std uses tensorflow variables * fixed import in vec_normalize * dummy commit to RUN BENCHMARKS * dummy commit to RUN BENCHMARKS * flake8 complaints * save all variables to make sure we save the vec_normalize normalization * benchmarks on ppo2 only RUN BENCHMARKS * make_atari_env compatible with mpi * run ppo_mpi benchmarks only RUN BENCHMARKS * hardcode names of retro environments * add defaults * changed default ppo2 lr schedule to linear RUN BENCHMARKS * non-tf normalization benchmark RUN BENCHMARKS * use ncpu=1 for mujoco sessions - gives a bit of a performance speedup * reverted running_mean_std to user property decorators for mean, var, count * reverted VecNormalize to use RunningMeanStd (no tf) * reverted VecNormalize to use RunningMeanStd (no tf) * profiling wip * use VecNormalize with regular RunningMeanStd * added acer runner (missing import) * flake8 complaints * added a note in README about TfRunningMeanStd and serialization of VecNormalize * dummy commit to RUN BENCHMARKS * merged benchmarks branch	2018-08-13 09:56:44 -07:00
Tony Yu Cao	366f486e34	Update README.md (#416) Update Atari example	2018-08-08 10:42:10 -07:00
Adam Gleave	f272969325	GAIL: bugfix in dataset loading (#447) * Fix silly typo * Replace ad-hoc function with NumPy code	2018-07-06 16:12:14 -07:00
pzhokhov	a6b1bc70f1	re-import internal; fix missing tile_images.py (#427) * import rl-algs from 2e3a166 commit * extra import of the baselines badge * exported commit with identity test * proper rng seeding in the test_identity * import internal * adding missing tile_images.py	2018-06-08 09:41:45 -07:00
pzhokhov	36ee5d1707	Import internal changes (#422) * import rl-algs from 2e3a166 commit * extra import of the baselines badge * exported commit with identity test * proper rng seeding in the test_identity * import internal	2018-06-06 11:39:13 -07:00
pzhokhov	24fe3d6576	Import internal repo (#409) * import rl-algs from 2e3a166 commit * extra import of the baselines badge * exported commit with identity test * proper rng seeding in the test_identity	2018-05-21 15:24:00 -07:00
pzhokhov	9cb7ece338	add opencv-python to the dependencies (#407)	2018-05-14 10:52:19 -07:00
pzhokhov	9cf95a0054	setup travis ci build (#388) * simple .travis.yml file * added static syntax checks of common to .travis.yml * dockerizing the build * fix Dockerfile, adding build shield * cleaning up workdir in Dockerfile and .travis.yml * .travis.yml fixed common -> baselines/common for style check	2018-05-03 09:43:28 -07:00
pzhokhov	8b781038cc	put filters and running_stat files in common instead of acktr (#389)	2018-05-02 18:42:48 -07:00
pzhokhov	69f25c6028	import internal repo (#385)	2018-05-01 16:54:04 -07:00
pzhokhov	2b0283b9db	Readme.md detailed installation instructions (#377) * changes to README.md files with more detailed installation instructions * md-fying the changes better * link on the word homebrew in readme.md * typos in README.md * README.md * removed extra comma sign * removed sudo from brew command	2018-04-25 17:40:48 -07:00
Matthias Plappert	1f8a03f3a6	Update README	2018-03-26 16:50:22 +02:00
Matthias Plappert	3cc7df0608	Minor fixes to HER release (#319) * Fix plotting script * Add warning if num_cpu = 1	2018-03-05 11:06:17 +01:00
Alex Nichol	8b3a6c2051	fix DummyVecEnv reusing buffers	2018-03-02 17:18:07 -08:00
Alex Nichol	569bd42629	Merge pull request #308 from araffin/master Bug fix in saving ACER model	2018-03-01 10:45:04 -08:00
Daniel Ziegler	f49a9c3d85	Fix bug in DDPG parameter space noise adaptation (#306) The training loop used the rollout step variable `t` rather than the training step variable `t_train` to decide when to adapt the scale of the parameter space noise.	2018-03-01 18:00:34 +01:00
Antonin RAFFIN	14f2f9328c	Bug fix in saving ACER model	2018-03-01 10:24:14 +01:00
Alex Nichol	6bdf2f55a2	Merge pull request #132 from bhatiaabhinav/bug_fixes Bug fix in saving a2c model.	2018-02-27 19:00:37 -08:00
Alex Nichol	97be70d6c8	fixes for DummyVecEnv Fixes various problems running MuJoCo tasks.	2018-02-27 18:55:10 -08:00
Matthias Plappert	b71152eea0	Adds support for Hindsight Experience Replay (HER) (#299) * Add Hindsight Experience Replay (HER) * Minor improvements	2018-02-26 17:40:16 +01:00
Christopher Hesse	df2e846ab7	export: fix accidental rename	2018-02-14 22:01:16 -08:00
Christopher Hesse	edb52c22a5	export: Fix deepq param noise refactoring, remove atari experiments and azure dependency	2018-02-14 21:42:22 -08:00
Andrei Kashin	98257ef8c9	Flush temporary file before compressing it. We need to flush the buffer after `pickle.dump`, otherwise the resulting zip archive might be incomplete (reproducible, if the state consists of a single integer).	2018-02-06 07:04:44 -08:00
Oleg Klimov	d9b36601d9	comment about loading weights in ppo2	2018-02-05 12:25:05 -08:00
Oleg Klimov	2793971c10	fix gail tf_util usage	2018-02-05 07:51:27 -08:00
John Schulman	16d7d23b7d	Merge pull request #271 from simontudo/add-requirement-cloudpickle added cloudpickle to requirements	2018-02-02 23:04:53 -08:00
John Schulman	9175b770c6	Merge pull request #273 from simontudo/videorecorder-import updated videorecorder import	2018-02-02 23:03:51 -08:00
simontudo	615870ad6b	updated videorecorder import	2018-02-01 12:09:08 +01:00
simontudo	7bd264e0e9	added cloudpickle to requirements	2018-01-31 10:43:17 +01:00
John Schulman	8d03102d4d	Merge pull request #265 from 20chase/patch-1 fix logger error for trpo_mpi	2018-01-29 00:54:51 -08:00
20chase	4a77855529	using mujoco_arg_parser as args remove origin parser	2018-01-29 16:52:01 +08:00
John Schulman	2e29b41592	Merge pull request #268 from ei-grad/master Fix fc call in AcerLstmPolicy	2018-01-27 18:42:31 -08:00
Andrew Grigorev	634e37c5b8	Fix fc call in AcerLstmPolicy The `act` keyword was removed from baselines.a2c.utils.fc in commit `9fa8e1b`.	2018-01-27 23:18:02 +03:00
20chase	452b548c2a	Merge branch 'master' into patch-1	2018-01-26 14:34:01 +08:00
John Schulman	ebb8afff2e	fix trpo_mpi bug where logstd wasn’t included	2018-01-25 21:17:40 -08:00
John Schulman	c9613b2293	Merge pull request #259 from andrewliao11/openai_gail Add gail maintainer list	2018-01-25 20:54:34 -08:00
John Schulman	459f007bcc	Merge pull request #260 from uidilr/master Add GAIL	2018-01-25 20:54:20 -08:00
John Schulman	9fa8e1baf1	Lots of cleanups Fixes for new gym version Add @olegklimov and @unixpickle to authors list	2018-01-25 18:54:24 -08:00
20chase	ac2ea4f31f	fix logger error for MPI Can't run logger.configure() if rank != 0	2018-01-25 22:09:00 +08:00
Yusuke Nakata	d8cce2309f	Add GAIL	2018-01-23 12:02:03 +09:00
andrew	0c207f0185	fix typo	2018-01-21 22:13:01 -08:00
andrew	41d41fabe3	add gail maintainer list	2018-01-21 22:12:03 -08:00
John Schulman	b5be53dc92	Merge pull request #229 from andrewliao11/gail GAIL implementation	2018-01-21 20:30:20 -05:00
Matthias Plappert	49c1a8ec26	Fix bug in parameter space noise DQN	2018-01-16 10:24:30 -08:00
andrew	e5a714b070	fix relative import	2018-01-12 15:12:45 -08:00

133 Commits