Andrew
f22bee085d
Add files via upload
2017-12-12 19:03:42 -08:00
andrew
4acc71fe23
add x, y, axis name
2017-12-12 18:58:57 -08:00
andrew
2f1b629ecc
Merge branch 'gail' of https://github.com/andrewliao11/baselines into gail
2017-12-12 18:56:00 -08:00
andrew
00573cf5e9
add x, y axis name
2017-12-12 18:54:03 -08:00
Andrew
cfa1236d78
Update README.md
2017-12-11 21:21:56 -08:00
Andrew
64288f9f84
Update gail-result.md
2017-12-11 21:19:47 -08:00
Andrew
5f647d4d34
Update README.md
2017-12-11 21:18:05 -08:00
Andrew
6723455b75
Update gail-result.md
2017-12-11 21:15:30 -08:00
Andrew
45a93cf2b9
add training curve from tensorboard
2017-12-11 21:06:04 -08:00
andrew
11604f7cc9
add download link to readme and add description to python file
2017-12-07 12:08:20 -08:00
Andrew
000033973b
Update gail-result.md
2017-12-03 15:50:24 -08:00
andrew
6090ee8292
add comparison for expert/BC/gail
2017-12-03 15:46:52 -08:00
andrew
7954327c5f
add behavior cloning learn/eval code
2017-12-03 13:55:44 -08:00
andrew
8495890534
add gail, file_writer for tf.summary, and allow specifying var_list for tf.train.Saver
2017-12-03 01:49:42 -08:00
John Schulman
b05be68c55
add missing files, fix Issue #209
2017-11-16 22:14:30 -08:00
John Schulman
2dd7d307d7
Add ACER, PPO2, and results_plotter.py
2017-11-16 10:02:32 -08:00
John Schulman
6a3cbb4bc5
switch append mode to write mode
2017-10-25 22:20:30 -04:00
John Schulman
bb40378118
change atari preprocessing to use faster opencv
some logger changes
2017-10-25 09:21:29 -04:00
John Schulman
4993286230
Merge pull request #160 from mkarutz/fixFrameStackingA2C
Fixes frame stacking in A2C and ACKTR for multi-channel observations
2017-10-09 14:12:28 -07:00
Malcolm Karutz
cc8818f49e
Fixes frame stacking in A2C and ACKTR for multi-channel observation spaces.
2017-10-09 13:08:41 +11:00
John Schulman
3eb71a0ece
Merge pull request #151 from emansim/master
Fixes the NaN issues in ACKTR + bug in run_mujoco.py
2017-09-30 14:51:56 -07:00
Elman Mansimov
f8663eaf11
fixes acktr_cont issues
2017-09-30 17:21:04 -04:00
John Schulman
699919f1cf
Merge pull request #64 from jhumplik/master
Use standardized advantages in trpo.
2017-09-07 01:57:04 -07:00
John Schulman
498b4cfead
Merge pull request #128 from louiehelm/louiehelm-patch-1
Fix command lines
2017-09-06 01:04:47 -07:00
Louie Helm
589387403b
fix ppo command in readme
2017-09-05 06:06:19 -07:00
Louie Helm
3d3ea6cb16
fix trpo command in readme
2017-09-05 06:04:37 -07:00
John Schulman
902ffcb767
Merge pull request #120 from hamzamerzic/tensorflow_global_variable
Deprecated VARIABLES -> GLOBAL_VARIABLES.
2017-08-28 21:27:23 -07:00
Hamza Merzic
a7320b80c0
Deprecated VARIABLES -> GLOBAL_VARIABLES.
2017-08-28 16:51:48 +02:00
John Schulman
4e2a570eb4
Merge pull request #104 from stevenschmatz/patch-1
Fix relative links in README.md
2017-08-27 22:54:52 -07:00
John Schulman
6f39148452
fix gym req
2017-08-27 22:49:50 -07:00
John Schulman
2f30833043
Merge branch 'master' of github.com:openai/baselines
2017-08-27 22:36:44 -07:00
John Schulman
00cdeff35e
add __init__.py
2017-08-27 22:36:24 -07:00
John Schulman
410ef38898
Merge pull request #103 from learnercys/master
Adding links to source files
2017-08-27 22:31:46 -07:00
John Schulman
aa6e58bdf1
fix readmes
2017-08-27 22:22:14 -07:00
John Schulman
d9f194f797
Fix atari wrapper (affecting a2c perf) and pposgd mujoco performance
- removed vf clipping in pposgd - that was severely degrading performance on mujoco because it didn’t account for scale of returns
- switched adam epsilon in pposgd_simple
- brought back no-ops in atari wrapper (oops)
- added readmes
- revamped run_X_benchmark scripts to have standard form
- cleaned up DDPG a little, removed deprecated SimpleMonitor and non-idiomatic usage of logger
2017-08-27 22:14:59 -07:00
Steven Schmatz
06b071c105
Fix relative links in README.md
2017-08-18 13:35:22 -04:00
John Schulman
3f676f7d1e
ACKTR + A2C
2017-08-18 09:25:39 -07:00
Carlos Hernandez
b7966b31a5
Adding links to source files
2017-08-18 01:16:00 -06:00
Matthias Plappert
882251878f
Parameter space noise for DQN and DDPG (#75)
* Export param noise
* Update documentation
* Final finishing touches
2017-07-27 08:10:59 -07:00
Jan Humplik
4862140cea
Use standardized advantages in trpo.
2017-07-23 22:42:55 +02:00
Peter Welinder
df82a15fd3
Fix broken links in DQN readme
2017-07-23 09:58:10 -07:00
Jonas Schneider
5dc00628fe
readme fiddling
2017-07-20 09:00:24 -07:00
John Schulman
79b4a8a88e
Merge pull request #60 from openai/ppo-trpo
ppo and trpo
2017-07-20 08:55:43 -07:00
John Schulman
da99706046
ppo and trpo
2017-07-20 08:52:35 -07:00
Szymon Sidor
80f94f8ec5
bump version
2017-07-12 14:48:05 -07:00
Szymon Sidor
2b1b437908
Update simple.py
2017-07-12 23:42:36 +02:00
Szymon Sidor
04cd0dcf64
Merge pull request #52 from farbeiza/patch-1
Effectively apply weights from the replay buffer
2017-07-12 23:37:28 +02:00
Szymon Sidor
248aad1c3b
Merge pull request #39 from mirceamironenco/master
Fix TF graph variables deprecation
2017-07-12 23:32:24 +02:00
Fernando Arbeiza
d76cd1297a
Effectively apply weights from the replay buffer
It seems that the weights retrieved from the replay buffer are not applied when training the model. Is there any reason for that or am I missing something?
In any case, I have added a parameter in order for them to be used; just in case it is useful.
2017-07-11 11:09:51 +02:00
MironencoMircea
91b10857d8
Fixed TF graph variables deprecation
2017-06-28 15:48:45 +02:00