Jonathan Raiman
a4fba209c4
remove ref to simple_bench + rank variable not used [mpi gone]
2017-10-25 14:08:01 -07:00
John Schulman
bb40378118
change atari preprocessing to use faster opencv
some logger changes
2017-10-25 09:21:29 -04:00
John Schulman
4993286230
Merge pull request #160 from mkarutz/fixFrameStackingA2C
Fixes frame stacking in A2C and ACKTR for multi-channel observations
2017-10-09 14:12:28 -07:00
Malcolm Karutz
cc8818f49e
Fixes frame stacking in A2C and ACKTR for multi-channel observation spaces.
2017-10-09 13:08:41 +11:00
John Schulman
3eb71a0ece
Merge pull request #151 from emansim/master
Fixes the NaN issues in ACKTR + bug in run_mujoco.py
2017-09-30 14:51:56 -07:00
Elman Mansimov
f8663eaf11
fixes acktr_cont issues
2017-09-30 17:21:04 -04:00
John Schulman
699919f1cf
Merge pull request #64 from jhumplik/master
Use standardized advantages in trpo.
2017-09-07 01:57:04 -07:00
John Schulman
498b4cfead
Merge pull request #128 from louiehelm/louiehelm-patch-1
Fix command lines
2017-09-06 01:04:47 -07:00
Louie Helm
589387403b
fix ppo command in readme
2017-09-05 06:06:19 -07:00
Louie Helm
3d3ea6cb16
fix trpo command in readme
2017-09-05 06:04:37 -07:00
John Schulman
902ffcb767
Merge pull request #120 from hamzamerzic/tensorflow_global_variable
Deprecated VARIABLES -> GLOBAL_VARIABLES.
2017-08-28 21:27:23 -07:00
Hamza Merzic
a7320b80c0
Deprecated VARIABLES -> GLOBAL_VARIABLES.
2017-08-28 16:51:48 +02:00
John Schulman
4e2a570eb4
Merge pull request #104 from stevenschmatz/patch-1
Fix relative links in README.md
2017-08-27 22:54:52 -07:00
John Schulman
6f39148452
fix gym req
2017-08-27 22:49:50 -07:00
John Schulman
2f30833043
Merge branch 'master' of github.com:openai/baselines
2017-08-27 22:36:44 -07:00
John Schulman
00cdeff35e
add __init__.py
2017-08-27 22:36:24 -07:00
John Schulman
410ef38898
Merge pull request #103 from learnercys/master
Adding links to source files
2017-08-27 22:31:46 -07:00
John Schulman
aa6e58bdf1
fix readmes
2017-08-27 22:22:14 -07:00
John Schulman
d9f194f797
Fix atari wrapper (affecting a2c perf) and pposgd mujoco performance
- removed vf clipping in pposgd - that was severely degrading performance on mujoco because it didn’t account for scale of returns
- switched adam epsilon in pposgd_simple
- brought back no-ops in atari wrapper (oops)
- added readmes
- revamped run_X_benchmark scripts to have standard form
- cleaned up DDPG a little, removed deprecated SimpleMonitor and non-idiomatic usage of logger
2017-08-27 22:14:59 -07:00
Steven Schmatz
06b071c105
Fix relative links in README.md
2017-08-18 13:35:22 -04:00
John Schulman
3f676f7d1e
ACKTR + A2C
2017-08-18 09:25:39 -07:00
Carlos Hernandez
b7966b31a5
Adding links to source files
2017-08-18 01:16:00 -06:00
Matthias Plappert
882251878f
Parameter space noise for DQN and DDPG (#75)
* Export param noise
* Update documentation
* Final finishing touches
2017-07-27 08:10:59 -07:00
Jan Humplik
4862140cea
Use standardized advantages in trpo.
2017-07-23 22:42:55 +02:00
Peter Welinder
df82a15fd3
Fix broken links in DQN readme
2017-07-23 09:58:10 -07:00
Jonas Schneider
5dc00628fe
readme fiddling
2017-07-20 09:00:24 -07:00
John Schulman
79b4a8a88e
Merge pull request #60 from openai/ppo-trpo
ppo and trpo
2017-07-20 08:55:43 -07:00
John Schulman
da99706046
ppo and trpo
2017-07-20 08:52:35 -07:00
Szymon Sidor
80f94f8ec5
bump version
2017-07-12 14:48:05 -07:00
Szymon Sidor
2b1b437908
Update simple.py
2017-07-12 23:42:36 +02:00
Szymon Sidor
04cd0dcf64
Merge pull request #52 from farbeiza/patch-1
Effectively apply weights from the replay buffer
2017-07-12 23:37:28 +02:00
Szymon Sidor
248aad1c3b
Merge pull request #39 from mirceamironenco/master
Fix TF graph variables deprecation
2017-07-12 23:32:24 +02:00
Fernando Arbeiza
d76cd1297a
Effectively apply weights from the replay buffer
It seems that the weights retrieved from the replay buffer are not applied when training the model. Is there any reason for that or am I missing something?
In any case, I have added a parameter in order for them to be used; just in case it is useful.
2017-07-11 11:09:51 +02:00
MironencoMircea
91b10857d8
Fixed TF graph variables deprecation
2017-06-28 15:48:45 +02:00
Szymon Sidor
0778e9f10f
Merge pull request #28 from zach-nervana/patch-1
remove unnecessary initialization of variable resized_screen
2017-06-23 17:05:25 -07:00
Szymon Sidor
59c7887e6b
Merge pull request #26 from LinZichuan/master
Update setup.py
2017-06-23 17:02:05 -07:00
Szymon Sidor
3d235ae7b8
Merge pull request #33 from cxxgtxy/master
Fix README since BreakOut pretrained model doesn't match the correct …
2017-06-23 16:59:55 -07:00
cxx
5e73387494
Fix README since BreakOut pretrained model doesn't match the correct tensor shape. Therefore, Pong is used instead.
2017-06-16 15:38:42 +08:00
Zach Dwiel
ec38bf460e
remove unnecessary initialization of variable resized_screen
2017-06-09 08:53:10 -04:00
Zichuan Lin
ef1a2402fc
Update setup.py
2017-06-07 17:29:38 +08:00
Szymon Sidor
184440ffd3
Merge pull request #22 from ngc92/doc_fixes
docstring and comment fixes
2017-06-04 00:41:34 -07:00
Szymon Sidor
fba0ac30ca
Merge pull request #15 from tiagosgc/patch-1
Update README.md
2017-06-04 00:40:58 -07:00
Szymon Sidor
584261a94a
Merge pull request #14 from quanvuong/master
Consistent initial type (float) for episode_rewards
2017-06-04 00:40:42 -07:00
Szymon Sidor
9c10c2fc27
Merge pull request #13 from ppwwyyxx/patch-1
Update setup.py
2017-06-04 00:40:31 -07:00
ngc92
02919483f2
docstring and comment fixes
2017-06-02 01:43:51 +02:00
Tiago Carvalho
1f3c3e33e7
Update README.md
2017-05-31 12:14:28 +01:00
Quan Vuong
86054f7a98
Consistent initial type (float) for episode_rewards
2017-05-30 11:49:25 +08:00
Yuxin Wu
709c327c40
Update setup.py
`PongNoFrameskip-v4` seems to require `gym>=0.9.1`
2017-05-29 19:39:25 -07:00
Szymon Sidor
fc2bbed4da
Merge pull request #11 from yenchenlin/fix-typo
Fix typos
2017-05-28 12:56:46 -07:00
YenChenLin
4fd1d21845
Fix typo
2017-05-28 13:13:47 -04:00