Compare commits: peterz_ubu...peterz_ben (49 commits)
SHA1:

ea68f3b7e6
ca721a4be6
72f3572a10
b9cd941471
0899b71ede
cc8c9541fb
cb32522394
1e40ec22be
701a36cdfa
5a7f9847d8
b63134e5c5
db314cdeda
b08c083d91
bfbbe66d9e
1c5c6563b7
1fa8c58da5
f6d1115ead
f6d5a47bed
c2df27bee4
974c15756e
ad43fd9a35
72c357c638
e00e5ca016
705797f2f0
fcd84aa831
390b51597a
95104a3592
3528f7b992
151e48009e
92f33335e9
af729cff15
10f815fe1d
8c4adac898
2a93ea8782
9c48f9fad5
348cbb4b71
a1602ab15f
e63e69bb14
385e7e5c0d
d112a2e49f
e662dd6409
efc6bffce3
872181d4c3
628ddecf6a
83a4a4be65
7edac38c73
a6dca44115
622915c473
a1d3c18ec0
Dockerfile

@@ -1,4 +1,4 @@
-FROM ubuntu:18.04
+FROM ubuntu:16.04
 
 RUN apt-get -y update && apt-get -y install git wget python-dev python3-dev libopenmpi-dev python-pip zlib1g-dev cmake python-opencv
 ENV CODE_DIR /root/code
README.md (17 changed lines)
@@ -112,6 +112,10 @@ This should get to the mean reward per episode about 5k. To load and visualize t
 
+*NOTE:* At the moment Mujoco training uses the VecNormalize wrapper for the environment, which is not being saved correctly; so loading models trained on Mujoco will not work well if the environment is recreated. If necessary, you can work around that by replacing RunningMeanStd with TfRunningMeanStd in [baselines/common/vec_env/vec_normalize.py](baselines/common/vec_env/vec_normalize.py#L12). That way, the mean and std of the environment-normalizing wrapper are saved in tensorflow variables and included in the model file; however, training is slower that way, hence it is not included by default.
+
 ## Subpackages
 
 - [A2C](baselines/a2c)
@@ -121,19 +125,10 @@ This should get to the mean reward per episode about 5k. To load and visualize t
 - [DQN](baselines/deepq)
 - [GAIL](baselines/gail)
 - [HER](baselines/her)
-- [PPO1](baselines/ppo1) (obsolete version, left here temporarily)
-- [PPO2](baselines/ppo2)
+- [PPO1](baselines/ppo1) (Multi-CPU using MPI)
+- [PPO2](baselines/ppo2) (Optimized for GPU)
 - [TRPO](baselines/trpo_mpi)
 
-## Benchmarks
-Results of benchmarks on Mujoco (1M timesteps) and Atari (10M timesteps) are available
-[here for Mujoco](https://htmlpreview.github.com/?https://github.com/openai/baselines/blob/master/benchmarks_mujoco1M.htm)
-and
-[here for Atari](https://htmlpreview.github.com/?https://github.com/openai/baselines/blob/master/benchmarks_atari10M.htm)
-respectively. Note that these results may not be on the latest version of the code; the particular commit hash with which the results were obtained is specified on the benchmarks page.
-
-To cite this repository in publications:
-
-    @misc{baselines,
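The NOTE added in the README hunk above describes a manual workaround rather than an API call, so a short sketch may help. Below is a minimal, hypothetical illustration, assuming the `TfRunningMeanStd` class from `baselines/common/running_mean_std.py` with an `(epsilon, shape, scope)` constructor and the `ob_rms`/`ret_rms` attributes of `VecNormalize`; exact names and signatures vary between baselines versions.

```python
# Hypothetical sketch of the NOTE's workaround: keep VecNormalize's running
# statistics in tensorflow variables (TfRunningMeanStd) instead of the
# default numpy-backed RunningMeanStd, so they are saved with the model.
# The env id, constructor arguments, and attribute names are assumptions.
import gym
from baselines.common.vec_env.dummy_vec_env import DummyVecEnv
from baselines.common.vec_env.vec_normalize import VecNormalize
from baselines.common.running_mean_std import TfRunningMeanStd

venv = VecNormalize(DummyVecEnv([lambda: gym.make('Hopper-v2')]))

# The same substitution the NOTE suggests making inside vec_normalize.py:
venv.ob_rms = TfRunningMeanStd(shape=venv.observation_space.shape, scope='ob_rms')
venv.ret_rms = TfRunningMeanStd(shape=(), scope='ret_rms')
```

Since the statistics then live in tensorflow variables, they are written into the model file along with the network weights; the NOTE's caveat about slower training applies.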
baselines/deepq/README.md

@@ -32,7 +32,7 @@ In particular notice that once `deepq.learn` finishes training it returns `act`
 
 - [baselines/deepq/experiments/custom_cartpole.py](experiments/custom_cartpole.py) - Cartpole training with more fine-grained control over the internals of the DQN algorithm.
-- [baselines/deepq/experiments/run_atari.py](experiments/run_atari.py) - more robust setup for training at scale.
+- [baselines/deepq/experiments/atari/train.py](experiments/atari/train.py) - more robust setup for training at scale.
 
 ##### Download a pretrained Atari agent
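The hunk header above notes that `deepq.learn` returns an `act` function once training finishes. A minimal usage sketch along the lines of the cartpole examples of this era follows; the exact signature and hyperparameter values are assumptions and differ between baselines versions.

```python
# Minimal sketch: deepq.learn returns an `act` callable that maps
# observations to actions and can be persisted with act.save().
# The q_func construction and hyperparameters are assumptions for illustration.
import gym
from baselines import deepq

env = gym.make("CartPole-v0")
model = deepq.models.mlp([64])        # small fully connected Q-network
act = deepq.learn(
    env,
    q_func=model,
    lr=1e-3,
    max_timesteps=100000,
    buffer_size=50000,
    exploration_fraction=0.1,
    exploration_final_eps=0.02,
)
act.save("cartpole_model.pkl")        # persist the returned act function
```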
benchmarks_atari10M.htm (12351 changed lines): file diff suppressed because it is too large.