baselines

Author	SHA1	Message	Date
Jonathan Raiman	a4fba209c4	remove ref to simple_bench + rank variable not used [mpi gone]	2017-10-25 14:08:01 -07:00
John Schulman	bb40378118	change atari preprocessing to use faster opencv some logger changes	2017-10-25 09:21:29 -04:00
John Schulman	4993286230	Merge pull request #160 from mkarutz/fixFrameStackingA2C Fixes frame stacking in A2C and ACKTR for multi-channel observations	2017-10-09 14:12:28 -07:00
Malcolm Karutz	cc8818f49e	Fixes frame stacking in A2C and ACKTR for multi-channel observation spaces.	2017-10-09 13:08:41 +11:00
John Schulman	3eb71a0ece	Merge pull request #151 from emansim/master Fixes the NaN issues in ACKTR + bug in run_mujoco.py	2017-09-30 14:51:56 -07:00
Elman Mansimov	f8663eaf11	fixes acktr_cont issues	2017-09-30 17:21:04 -04:00
John Schulman	699919f1cf	Merge pull request #64 from jhumplik/master Use standardized advantages in trpo.	2017-09-07 01:57:04 -07:00
John Schulman	498b4cfead	Merge pull request #128 from louiehelm/louiehelm-patch-1 Fix command lines	2017-09-06 01:04:47 -07:00
Louie Helm	589387403b	fix ppo command in readme	2017-09-05 06:06:19 -07:00
Louie Helm	3d3ea6cb16	fix trpo command in readme	2017-09-05 06:04:37 -07:00
John Schulman	902ffcb767	Merge pull request #120 from hamzamerzic/tensorflow_global_variable Deprecated VARIABLES -> GLOBAL_VARIABLES.	2017-08-28 21:27:23 -07:00
Hamza Merzic	a7320b80c0	Deprecated VARIABLES -> GLOBAL_VARIABLES.	2017-08-28 16:51:48 +02:00
John Schulman	4e2a570eb4	Merge pull request #104 from stevenschmatz/patch-1 Fix relative links in README.md	2017-08-27 22:54:52 -07:00
John Schulman	6f39148452	fix gym req	2017-08-27 22:49:50 -07:00
John Schulman	2f30833043	Merge branch 'master' of github.com:openai/baselines	2017-08-27 22:36:44 -07:00
John Schulman	00cdeff35e	add __init__.py	2017-08-27 22:36:24 -07:00
John Schulman	410ef38898	Merge pull request #103 from learnercys/master Adding links to source files	2017-08-27 22:31:46 -07:00
John Schulman	aa6e58bdf1	fix readmes	2017-08-27 22:22:14 -07:00
John Schulman	d9f194f797	Fix atari wrapper (affecting a2c perf) and pposgd mujoco performance - removed vf clipping in pposgd - that was severely degrading performance on mujoco because it didn’t account for scale of returns - switched adam epsilon in pposgd_simple - brought back no-ops in atari wrapper (oops) - added readmes - revamped run_X_benchmark scripts to have standard form - cleaned up DDPG a little, removed deprecated SimpleMonitor and non-idiomatic usage of logger	2017-08-27 22:14:59 -07:00
Steven Schmatz	06b071c105	Fix relative links in README.md	2017-08-18 13:35:22 -04:00
John Schulman	3f676f7d1e	ACKTR + A2C	2017-08-18 09:25:39 -07:00
Carlos Hernandez	b7966b31a5	Adding links to source files	2017-08-18 01:16:00 -06:00
Matthias Plappert	882251878f	Parameter space noise for DQN and DDPG (#75) * Export param noise * Update documentation * Final finishing touches	2017-07-27 08:10:59 -07:00
Jan Humplik	4862140cea	Use standardized advantages in trpo.	2017-07-23 22:42:55 +02:00
Peter Welinder	df82a15fd3	Fix broken links in DQN readme	2017-07-23 09:58:10 -07:00
Jonas Schneider	5dc00628fe	readme fiddling	2017-07-20 09:00:24 -07:00
John Schulman	79b4a8a88e	Merge pull request #60 from openai/ppo-trpo ppo and trpo	2017-07-20 08:55:43 -07:00
John Schulman	da99706046	ppo and trpo	2017-07-20 08:52:35 -07:00
Szymon Sidor	80f94f8ec5	bump version	2017-07-12 14:48:05 -07:00
Szymon Sidor	2b1b437908	Update simple.py	2017-07-12 23:42:36 +02:00
Szymon Sidor	04cd0dcf64	Merge pull request #52 from farbeiza/patch-1 Effectively apply weights from the replay buffer	2017-07-12 23:37:28 +02:00
Szymon Sidor	248aad1c3b	Merge pull request #39 from mirceamironenco/master Fix TF graph variables deprecation	2017-07-12 23:32:24 +02:00
Fernando Arbeiza	d76cd1297a	Effectively apply weights from the replay buffer It seems that the weights retrieved from the replay buffer are not applied when training the model. Is there any reason for that or am I missing something? In any case, I have added a parameter in order for them to be used; just in case it is useful.	2017-07-11 11:09:51 +02:00
MironencoMircea	91b10857d8	Fixed TF graph variables deprecation	2017-06-28 15:48:45 +02:00
Szymon Sidor	0778e9f10f	Merge pull request #28 from zach-nervana/patch-1 remove unnecessary initialization of variable resized_screen	2017-06-23 17:05:25 -07:00
Szymon Sidor	59c7887e6b	Merge pull request #26 from LinZichuan/master Update setup.py	2017-06-23 17:02:05 -07:00
Szymon Sidor	3d235ae7b8	Merge pull request #33 from cxxgtxy/master Fix README since BreakOut pretrained model doesn't match the correct …	2017-06-23 16:59:55 -07:00
cxx	5e73387494	Fix README since BreakOut pretrained model doesn't match the correct tensor shape. Therefore, Pong is used instead.	2017-06-16 15:38:42 +08:00
Zach Dwiel	ec38bf460e	remove unnecessary initialization of variable resized_screen	2017-06-09 08:53:10 -04:00
Zichuan Lin	ef1a2402fc	Update setup.py	2017-06-07 17:29:38 +08:00
Szymon Sidor	184440ffd3	Merge pull request #22 from ngc92/doc_fixes docstring and comment fixes	2017-06-04 00:41:34 -07:00
Szymon Sidor	fba0ac30ca	Merge pull request #15 from tiagosgc/patch-1 Update README.md	2017-06-04 00:40:58 -07:00
Szymon Sidor	584261a94a	Merge pull request #14 from quanvuong/master Consistent initial type (float) for episode_rewards	2017-06-04 00:40:42 -07:00
Szymon Sidor	9c10c2fc27	Merge pull request #13 from ppwwyyxx/patch-1 Update setup.py	2017-06-04 00:40:31 -07:00
ngc92	02919483f2	docstring and comment fixes	2017-06-02 01:43:51 +02:00
Tiago Carvalho	1f3c3e33e7	Update README.md	2017-05-31 12:14:28 +01:00
Quan Vuong	86054f7a98	Consistent initial type (float) for episode_rewards	2017-05-30 11:49:25 +08:00
Yuxin Wu	709c327c40	Update setup.py `PongNoFrameskip-v4` seems to require `gym>=0.9.1`	2017-05-29 19:39:25 -07:00
Szymon Sidor	fc2bbed4da	Merge pull request #11 from yenchenlin/fix-typo Fix typos	2017-05-28 12:56:46 -07:00
YenChenLin	4fd1d21845	Fix typo	2017-05-28 13:13:47 -04:00

56 Commits