Last G
328ebc5c87
Bug in frozen lake implementation
...
Because of the bug, terminal states do not act as terminal
2016-12-25 01:05:31 -08:00
Tom Brown
d20a5d02ed
Bump version to 0.6.0
2016-12-23 16:28:19 -08:00
Tom Brown
2d44ed4968
Add Monitored wrapper ( #434 )
...
* Add WIP Monitored wrapper
* Remove irrelevant render after close monitor test
* py27 compatibility
* Fix test_benchmark
* Move Monitored out of wrappers __init__
* Turn Monitored into a function that returns a Monitor class
* Fix monitor tests
* Remove deprecated test
* Remove deprecated utility
* Prevent duplicate wrapping, add test
* Fix test
* close env in tests to prevent writing to nonexistent file
* Disable semisuper tests
* typo
* Fix failing spec
* Fix monitoring on semisuper tasks
* Allow disabling of duplicate check
* Rename MonitorManager
* Monitored -> Monitor
* Clean up comments
* Remove cruft
2016-12-23 16:21:42 -08:00
Jie Tang
dc07c7d414
Skip tests properly in test_env_semantics
2016-12-22 17:49:34 -08:00
Jie Tang
5ca80a3141
Fix format of solves for TotalReward
2016-12-21 20:52:38 -08:00
Jie Tang
6d5cc8fbf1
Add score from file helper
2016-12-21 20:52:23 -08:00
Greg Brockman
f8a7f5e129
Bump version
2016-12-17 08:54:56 -08:00
Joean
626643bb07
Update hex.py ( #425 )
...
I think this variable in Line 127 should be 'a', not 'action'
2016-12-15 06:07:16 -08:00
Szymon Sidor
527a73dd03
fix atari benchmarks
2016-12-14 14:02:08 -08:00
Szymon Sidor
3aae03c238
monitor keeps count of the total number of steps taken
2016-12-14 13:55:18 -08:00
Jie Tang
b33fc9fb85
Fix a bug when the benchmark result is empty
2016-12-13 22:05:29 -08:00
Jie Tang
54ead345dc
Bunch of refactoring since TotalReward and RewardPerTime scoring are quite similar
2016-12-13 22:01:50 -08:00
Jie Tang
f63bb2e1aa
Add RewardPerTime scoring function and tests
2016-12-13 22:01:01 -08:00
michalsustr
afd6888d2f
swap avconv / ffmpeg. ffmpeg is obsolete, so it shouldn't be evaled first (crashes on my pc) ( #364 )
...
Avconv seems more likely to work for more people, so switch
2016-12-13 21:09:40 -08:00
Szymon Sidor
c6215f190d
simplify Atari benchmarks, and allow finer grain control over how they are displayed on the website
2016-12-13 16:57:19 -08:00
damodei
8944480727
Merge branch 'master' of github.com:openai/gym
2016-12-07 04:51:24 +01:00
damodei
7804acefe4
add racing games benchmark
2016-12-07 04:51:05 +01:00
Matthew Chan
895155682e
Update environments.md ( #398 )
...
Added link to the 2D maze environment.
2016-12-06 17:07:48 -08:00
Avital Oliver
8676fa858f
Unbreak random_agent ( #406 )
...
The semantics of `env.reset()` have changed. It's no longer possible
to reset an environment until it's done.
Notably, this commit changes the behavior of random_agent. It used
to have a limit on number of actions per episode. If we wanted
that now, we'd have to close the environment and recreate it
on each episode, which may be slow.
2016-12-06 17:07:23 -08:00
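The reset semantics described in the "Unbreak random_agent" entry above can be illustrated with a minimal sketch. It is not the code from that commit: `StrictResetEnv` and `run_random_agent` are hypothetical names, and the stub only mimics the `reset()` / `step()` shape of a gym environment under the assumption that a mid-episode reset raises an error. The point is that the agent loop runs each episode until `done` rather than capping the number of actions.

```python
import random


class StrictResetEnv:
    """Hypothetical stand-in for an env that refuses mid-episode resets."""

    def __init__(self, episode_length=5):
        self.episode_length = episode_length
        self._steps = 0
        self._done = True  # a fresh environment may be reset right away

    def reset(self):
        # Assumed behavior: resetting is only legal once the episode is done.
        if not self._done:
            raise RuntimeError("reset() called before the episode finished")
        self._steps = 0
        self._done = False
        return 0  # dummy initial observation

    def step(self, action):
        if self._done:
            raise RuntimeError("step() called on a finished episode")
        self._steps += 1
        self._done = self._steps >= self.episode_length
        # (observation, reward, done, info), mirroring the classic step tuple
        return self._steps, 1.0, self._done, {}


def run_random_agent(env, episodes=3, num_actions=2, seed=0):
    """Run random actions, letting every episode finish before resetting."""
    rng = random.Random(seed)
    totals = []
    for _ in range(episodes):
        env.reset()
        total, done = 0.0, False
        while not done:  # no per-episode cap on the number of actions
            _, reward, done, _ = env.step(rng.randrange(num_actions))
            total += reward
        totals.append(total)
    return totals
```

With a cap on actions per episode, the loop would stop early and the next `reset()` would fail in this stub, which is the trade-off the commit message describes.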
Greg Brockman
057862d83f
Run video recorder in a separate session
2016-12-04 17:17:22 -08:00
Szymon Sidor
e97ec21acc
add ability to mark env_groups as universe/gym
2016-12-04 15:35:28 -08:00
catherio
2c0c7bfde1
README tweaks
2016-12-04 07:34:14 -08:00
Trevor Blackwell
fe002c8b7f
For autoreset envs, advance the video recorder episode
2016-12-02 18:06:01 -08:00
Trevor Blackwell
f243b55316
Cut 2 more nondeterministic envs
2016-12-01 22:03:07 -08:00
Trevor Blackwell
31855a97bc
Better error messages for env semantic changes
2016-12-01 13:31:10 -08:00
Trevor Blackwell
cd7896b3b6
Change rewards for SemisuperPendulumNoise-v0 to match env
2016-12-01 13:30:47 -08:00
Trevor Blackwell
8fad22d51a
Better error messages for env semantic changes
2016-12-01 13:21:59 -08:00
Trevor Blackwell
22f091dbc5
Remove mujoco envs from test, since we can't run on Travis
2016-12-01 13:10:09 -08:00
Tambet Matiisen
4ca0969c79
Added Minecraft benchmarks.
2016-12-01 10:30:03 -08:00
Jie Tang
5dba36c68d
Fix benchmark score compute when a monitor file is empty
2016-11-30 22:38:11 -08:00
Tambet Matiisen
214f9fb779
Added reward thresholds to ClassicControl-v0 benchmark.
2016-11-30 11:39:33 -08:00
Tambet Matiisen
460f104a1e
Renamed ClassicControl benchmark.
2016-11-29 13:58:50 -08:00
Szymon Sidor
ebd0dcc515
add reward ranges to stochastic Atari Benchmarks
2016-11-29 13:42:48 -08:00
Tambet Matiisen
305b966f38
Added new benchmark ClassicControl2-v1.
2016-11-27 23:38:57 -08:00
Greg Brockman
b187596b22
Fix typo
2016-11-24 00:51:03 -08:00
Greg Brockman
84439ab08b
Allow async envs to return None
2016-11-24 00:48:21 -08:00
Szymon Sidor
bcc68f8415
add reward bounds to Atari7PixelDeterministic benchmark
2016-11-23 20:14:30 -08:00
Szymon Sidor
3bee1000e0
add deterministic Atari v3 benchmark
2016-11-22 13:15:53 -08:00
Jie Tang
95f1248d7c
Switch from median to mean when computing binned graphs
2016-11-22 01:24:57 -08:00
Trevor Blackwell
4bd03664d0
Fix Blackjack reward (after fixing env in b5108b384e)
2016-11-21 21:51:33 -08:00
Trevor Blackwell
3265fcc8ba
Change to py3 formatting of rollout file. Add missing mujoco envs
2016-11-21 21:22:44 -08:00
Trevor Blackwell
b5108b384e
Fix compare in Blackjack. Keeping at -v0 since it was totally broken before
2016-11-20 21:15:06 -08:00
Szymon Sidor
5fd1c86865
add a deterministic Atari benchmark
2016-11-15 10:28:40 -08:00
Greg Brockman
90214baa9f
Bump version
2016-11-13 19:43:36 -08:00
Greg Brockman
faed1c6546
Get rid of separate async semantics
2016-11-13 19:42:57 -08:00
Greg Brockman
3864e42498
Bump version
2016-11-12 13:09:41 -08:00
Greg Brockman
53c074ab3b
Add period to valid env IDs
2016-11-11 21:09:25 -08:00
Greg Brockman
b7e53c371d
Update ObservationWrapper for new async semantics
2016-11-07 14:30:06 -08:00
Jie Tang
991b309cee
Bump version
2016-11-02 14:18:57 -07:00
Jie Tang
974cbf4844
Allow multiple resets in a row
2016-11-02 12:56:38 -07:00