From 10c205c1596dd58d1a9b33d423fb75228fe03953 Mon Sep 17 00:00:00 2001
From: pzhokhov <peterzhokhoff@gmail.com>
Date: Tue, 2 Oct 2018 16:33:19 -0700
Subject: [PATCH] Debug codegen ppo (#123)

* disabled tests, running benchmarks only

* dummy commit to RUN BENCHMARKS

* benchmark ppo_metal; disable all but Bullet benchmarks

* ppo2, codegen ppo and ppo_metal on Bullet RUN BENCHMARKS

* run benchmarks on Roboschool instead RUN BENCHMARKS

* run ppo_metal on Roboschool as well RUN BENCHMARKS

* install roboschool in cron rcall user_config

* dummy commit to RUN BENCHMARKS

* import roboschool in codegen/contcontrol_prob.py RUN BENCHMARKS

* re-enable tests, flake8

* get entropy from a distribution in Pred RUN BENCHMARKS

* gin for hyperparameter injection; try codegen ppo close to baselines ppo RUN BENCHMARKS

* provide default value for cg2/bmv_net_ops.py

* dummy commit to RUN BENCHMARKS

* make tests and benchmarks parallel; use relative path to gin file for rcall compatibility RUN BENCHMARKS

* syntax error in run-benchmarks-new.py RUN BENCHMARKS

* syntax error in run-benchmarks-new.py RUN BENCHMARKS

* path relative to codegen/training for gin files RUN BENCHMARKS

* another reconciliation attempt between codegen ppo and baselines ppo RUN BENCHMARKS

* value_network=copy for ppo2 on roboschool RUN BENCHMARKS

* make None seed work with torch seeding RUN BENCHMARKS

* try sequential batches with ppo2 RUN BENCHMARKS

* try ppo without advantage normalization RUN BENCHMARKS

* use Distribution to compute ema NLL RUN BENCHMARKS

* autopep8

* clip gradient norm in algo_agent RUN BENCHMARKS

* try ppo2 without vfloss clipping RUN BENCHMARKS

* trying with gamma=0.0 - assumption is, both algos should be equally bad RUN BENCHMARKS

* set gamma=0 in ppo2 RUN BENCHMARKS

* try with ppo2 with single minibatch RUN BENCHMARKS

* try with nminibatches=4, value_network=copy RUN BENCHMARKS

* try with nminibatches=1 take two RUN BENCHMARKS

* try initialization for vf=0.01 RUN BENCHMARKS

* fix the problem with min_istart >= max_istart

* i have no idea RUN BENCHMARKS

* fix non-shared variance between old and new RUN BENCHMARKS

* restored baselines.common.policies

* 16 minibatches in ppo_roboschool.gin

* fixing results of merge

* cleanups

* cleanups

* fix run-benchmarks-new RUN BENCHMARKS Roboschool8M

* fix syntax in run-benchmarks-new RUN BENCHMARKS Roboschool8M

* fix test failures

* moved gin requirement to codegen/setup.py

* remove duplicated build_softq in get_algo.py

* linting

* run softq on continuous action spaces RUN BENCHMARKS Roboschool8M
---
 baselines/common/misc_util.py | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/baselines/common/misc_util.py b/baselines/common/misc_util.py
index 451de1c..6a296d4 100644
--- a/baselines/common/misc_util.py
+++ b/baselines/common/misc_util.py
@@ -76,10 +76,9 @@ def set_global_seeds(i):
     myseed = i  + 1000 * rank if i is not None else None
     try:
         import tensorflow as tf
+        tf.set_random_seed(myseed)
     except ImportError:
         pass
-    else:
-        tf.set_random_seed(myseed)
     np.random.seed(myseed)
     random.seed(myseed)