gyunt
|
02e26fd9df
|
removed ppo_lstm_mlp.
|
2019-04-09 02:04:35 +09:00 |
|
gyunt
|
f63b09cf40
|
rename model.step_as_dict to model.step_with_dict.
|
2019-04-09 02:00:07 +09:00 |
|
gyunt
|
bb2523f54d
|
move RNN class to `baselines/ppo2/layers.py` and revert `baselines/common/models.py` to 858afa8.
|
2019-04-09 01:53:10 +09:00 |
|
gyunt
|
b6e6c5201a
|
remove states variable.
|
2019-04-09 01:32:01 +09:00 |
|
gyunt
|
9af7351071
|
revert baselines/common/tests/util.py to b875fb7.
|
2019-04-08 23:30:26 +09:00 |
|
gyunt
|
417c52bf5f
|
remove redundant lines.
|
2019-04-08 22:54:40 +09:00 |
|
gyunt
|
82ceb4461c
|
revert baselines/common/tests/util.py.
|
2019-04-08 22:54:40 +09:00 |
|
gyunt
|
354b1bda41
|
replace ppo2.step() with original interface.
|
2019-04-08 22:54:33 +09:00 |
|
gyunt
|
93232a24e1
|
revert baselines/run.py.
|
2019-04-08 21:46:44 +09:00 |
|
gyunt
|
36aadd6a4b
|
reuse RNNs in a2c.utils.
|
2019-04-08 21:03:00 +09:00 |
|
gyunt
|
703a779991
|
remove state saving.
|
2019-04-08 21:02:55 +09:00 |
|
gyunt
|
0bccb3aa27
|
rename nlstm to num_units in RNN builder functions.
|
2019-04-08 20:52:18 +09:00 |
|
gyunt
|
e6f0d98b68
|
add RNN class.
|
2019-04-08 20:52:08 +09:00 |
|
gyunt
|
1dbfbaac16
|
Merge remote-tracking branch 'upstream/master' into ppo2_rnn_pr_2nd
# Conflicts:
# baselines/ppo2/ppo2.py
|
2019-04-08 18:12:13 +09:00 |
|
Peter Zhokhov
|
fa37beb52e
|
fix commit on atari bms page to point to a public commit
|
2019-04-06 20:03:32 -07:00 |
|
Peter Zhokhov
|
8a97e0df10
|
fix shuffling bug in ppo1
|
2019-04-05 15:23:46 -07:00 |
|
pzhokhov
|
fabbf2c611
|
short-circuit framestack wrapper with size 1 (#871)
|
2019-04-05 15:18:15 -07:00 |
|
Xingdong Zuo
|
5d285b318f
|
[Update misc_util.py]: clean up unused helper functions (#751)
* Update misc_util.py
* Update misc_util.py
|
2019-04-05 15:16:26 -07:00 |
|
Tim Zaman
|
49a99c7d23
|
Add eps to normalization (#797)
|
2019-04-05 14:46:01 -07:00 |
|
Peter Zhokhov
|
c79b3373bf
|
parse colon-separated env_id's
|
2019-04-05 14:43:09 -07:00 |
|
Sridhar Thiagarajan
|
6d1c6c78d3
|
Interface for U.make_session changed (#865)
|
2019-04-01 16:24:02 -07:00 |
|
JongGyun Kim
|
62a9c76f18
|
fix the definition of TfInput.make_feed_dict. (#812)
|
2019-04-01 15:49:25 -07:00 |
|
Hao-Chih, Lin
|
282c9cc91f
|
fix small bug in plot_results() (#864)
Remove the comma behind the last input argument
|
2019-04-01 15:48:35 -07:00 |
|
Peter Zhokhov
|
096f4d9cf0
|
neaten up stacking logic in mujoco_dset in gail
|
2019-04-01 15:47:13 -07:00 |
|
Mingfei
|
16136ddca7
|
fix bugs: obs_ph normalization in adversary.py (#823)
* fix bugs: obs_ph normalization in adversary.py
* fix bug in reshape obs and acs in Mujoco_Dset
|
2019-04-01 15:44:31 -07:00 |
|
Darío Hereñú
|
b1644157d6
|
Fixed typo on #092 (#824)
|
2019-04-01 15:41:52 -07:00 |
|
Yu Feng
|
58541db226
|
MPI refer to workers as ranks, not threads. (#833)
|
2019-04-01 15:38:45 -07:00 |
|
zlsh80826
|
c02b575f01
|
ppo2: use time.perf_counter() instead of time.time() for time measurement (#847)
|
2019-04-01 15:37:32 -07:00 |
|
Pastafarianist
|
897fa31548
|
Avoid using default config while requesting available GPUs (#863)
|
2019-03-29 13:25:56 -07:00 |
|
Brett Daley
|
d51f8be8f9
|
Report episode rewards/length in A2C and ACKTR (#856)
|
2019-03-28 09:21:48 -07:00 |
|
gyunt
|
536ade10f9
|
correct a typo.
|
2019-03-27 08:21:48 +09:00 |
|
gyunt
|
58aabbeb68
|
add the result of HalfCheetah-v2 env experiment.
|
2019-03-27 07:55:29 +09:00 |
|
gyunt
|
2a4ba2b0a5
|
add RNN layers.
|
2019-03-27 07:54:15 +09:00 |
|
Jacob Hilton
|
3f2f45acef
|
Merge pull request #860 from openai/build-retro-env-framestack-fix
run.py framestack bug fix
|
2019-03-25 14:33:15 -07:00 |
|
gyunt
|
45be273776
|
update README.md.
|
2019-03-26 03:32:11 +09:00 |
|
Jacob Hilton
|
b64974eb90
|
build_env now doesn't apply frame stack to retro games twice
|
2019-03-24 12:27:14 -07:00 |
|
gyunt
|
82c7c96d77
|
remove redundant lines.
|
2019-03-23 05:50:52 +09:00 |
|
gyunt
|
243fba2dba
|
improve scopes and names of ppo2.
|
2019-03-23 05:35:08 +09:00 |
|
gyunt
|
b418b17ddd
|
wrap the initializations in ppo2.
|
2019-03-23 05:32:40 +09:00 |
|
gyunt
|
43a86980ea
|
name the memory variable of PPO RNNs more descriptively
|
2019-03-23 05:32:30 +09:00 |
|
gyunt
|
06cef53de3
|
clean the scope of ppo2 policy model.
|
2019-03-23 05:32:23 +09:00 |
|
gyunt
|
a9d3b1c727
|
improve scopes to be compatible with multiple models (i.e., other TensorFlow global/local variables)
|
2019-03-23 05:31:59 +09:00 |
|
gyunt
|
f996ffb52d
|
support the play.
|
2019-03-23 03:30:44 +09:00 |
|
gyunt
|
a9c2b79730
|
disable warning on purpose.
|
2019-03-23 00:50:52 +09:00 |
|
gyunt
|
c5474e53f8
|
fix checking of model input args in simple_test function.
|
2019-03-22 23:20:40 +09:00 |
|
gyunt
|
b766b6413e
|
adjust input shape.
|
2019-03-22 23:00:11 +09:00 |
|
gyunt
|
93c3f32a76
|
perform initialization only once.
make `test_fixed_sequence` compatible with ppo2.
|
2019-03-22 12:18:58 +09:00 |
|
gyunt
|
e5da0bd0bb
|
rename 'obs' to 'observations'.
rename 'transition' to 'transitions'.
fix forgetting `dones` in the replay buffer.
fix a misuse of `states` and `next_states` in the replay buffer.
|
2019-03-22 11:49:07 +09:00 |
|
gyunt
|
dbd9ad3f63
|
make ppo2 rnn test available.
|
2019-03-22 05:39:00 +09:00 |
|
gyunt
|
8ddb807db2
|
add initial_state variable to help test.
|
2019-03-22 05:29:32 +09:00 |
|