baselines

Author	SHA1	Message	Date
Peter Zhokhov	b222dd0610	updated links in README to point to master	2018-08-13 16:01:24 -07:00
pzhokhov	1870685071	Publish benchmark results (#502) * updated benchmark pages with final rewards * use htmlpreview to render pages * use htmlpreview to render pages * use htmlpreview to render pages * updated README to reflect ppo1 being obsolete * removed navbars from published benchmark pages * fixed link in README	2018-08-13 15:59:43 -07:00
pzhokhov	8c2aea2add	refactor a2c, acer, acktr, ppo2, deepq, and trpo_mpi (#490) * exported rl-algs * more stuff from rl-algs * run slow tests * re-exported rl_algs * re-exported rl_algs - fixed problems with serialization test and test_cartpole * replaced atari_arg_parser with common_arg_parser * run.py can run algos from both baselines and rl_algs * added approximate humanoid reward with ppo2 into the README for reference * dummy commit to RUN BENCHMARKS * dummy commit to RUN BENCHMARKS * dummy commit to RUN BENCHMARKS * dummy commit to RUN BENCHMARKS * very dummy commit to RUN BENCHMARKS * serialize variables as a dict, not as a list * running_mean_std uses tensorflow variables * fixed import in vec_normalize * dummy commit to RUN BENCHMARKS * dummy commit to RUN BENCHMARKS * flake8 complaints * save all variables to make sure we save the vec_normalize normalization * benchmarks on ppo2 only RUN BENCHMARKS * make_atari_env compatible with mpi * run ppo_mpi benchmarks only RUN BENCHMARKS * hardcode names of retro environments * add defaults * changed default ppo2 lr schedule to linear RUN BENCHMARKS * non-tf normalization benchmark RUN BENCHMARKS * use ncpu=1 for mujoco sessions - gives a bit of a performance speedup * reverted running_mean_std to user property decorators for mean, var, count * reverted VecNormalize to use RunningMeanStd (no tf) * reverted VecNormalize to use RunningMeanStd (no tf) * profiling wip * use VecNormalize with regular RunningMeanStd * added acer runner (missing import) * flake8 complaints * added a note in README about TfRunningMeanStd and serialization of VecNormalize * dummy commit to RUN BENCHMARKS * merged benchmarks branch	2018-08-13 09:56:44 -07:00
pzhokhov	24fe3d6576	Import internal repo (#409) * import rl-algs from 2e3a166 commit * extra import of the baselines badge * exported commit with identity test * proper rng seeding in the test_identity	2018-05-21 15:24:00 -07:00
pzhokhov	9cf95a0054	setup travis ci build (#388) * simple .travis.yml file * added static syntax checks of common to .travis.yml * dockerizing the build * fix Dockerfile, adding build shield * cleaning up workdir in Dockerfile and .travis.yml * .travis.yml fixed common -> baselines/common for style check	2018-05-03 09:43:28 -07:00
pzhokhov	2b0283b9db	Readme.md detailed installation instructions (#377) * changes to README.md files with more detailed installation instructions * md-fying the changes better * link on the word homebrew in readme.md * typos in README.md * README.md * removed extra comma sign * removed sudo from brew command	2018-04-25 17:40:48 -07:00
Matthias Plappert	b71152eea0	Adds support for Hindsight Experience Replay (HER) (#299) * Add Hindsight Experience Replay (HER) * Minor improvements	2018-02-26 17:40:16 +01:00
John Schulman	459f007bcc	Merge pull request #260 from uidilr/master Add GAIL	2018-01-25 20:54:20 -08:00
John Schulman	9fa8e1baf1	Lots of cleanups Fixes for new gym version Add @olegklimov and @unixpickle to authors list	2018-01-25 18:54:24 -08:00
Yusuke Nakata	d8cce2309f	Add GAIL	2018-01-23 12:02:03 +09:00
John Schulman	2dd7d307d7	Add ACER, PPO2, and results_plotter.py	2017-11-16 10:02:32 -08:00
John Schulman	bb40378118	change atari preprocessing to use faster opencv some logger changes	2017-10-25 09:21:29 -04:00
John Schulman	aa6e58bdf1	fix readmes	2017-08-27 22:22:14 -07:00
John Schulman	3f676f7d1e	ACKTR + A2C	2017-08-18 09:25:39 -07:00
Matthias Plappert	882251878f	Parameter space noise for DQN and DDPG (#75) * Export param noise * Update documentation * Final finishing touches	2017-07-27 08:10:59 -07:00
Jonas Schneider	5dc00628fe	readme fiddling	2017-07-20 09:00:24 -07:00
John Schulman	da99706046	ppo and trpo	2017-07-20 08:52:35 -07:00
cxx	5e73387494	Fix README since BreakOut pretrained model doesn't match the correct tensor shape. Therefore, Pong is used instead.	2017-06-16 15:38:42 +08:00
Tiago Carvalho	1f3c3e33e7	Update README.md	2017-05-31 12:14:28 +01:00
Olivier Moindrot	d2c51f5933	Correct path to script "download_model" `python -m baselines.deepq.experiments.download_model` becomes `python -m baselines.deepq.experiments.atari.download_model`	2017-05-24 13:13:30 -07:00
Szymon Sidor	958810ed1e	Initial commit	2017-05-24 02:34:20 -07:00

21 Commits