Commit Graph

1491 Commits

Author SHA1 Message Date
Last G
328ebc5c87 Bug in frozen lake implementation
Because of a bug, terminal states do not act as terminal
2016-12-25 01:05:31 -08:00
Tom Brown
d20a5d02ed Bump version to 0.6.0 2016-12-23 16:28:19 -08:00
Tom Brown
2d44ed4968 Add Monitored wrapper (#434)
* Add WIP Monitored wrapper

* Remove irrelevant render after close monitor test

* py27 compatibility

* Fix test_benchmark

* Move Monitored out of wrappers __init__

* Turn Monitored into a function that returns a Monitor class

* Fix monitor tests

* Remove deprecated test

* Remove deprecated utility

* Prevent duplicate wrapping, add test

* Fix test

* close env in tests to prevent writing to nonexistent file

* Disable semisuper tests

* typo

* Fix failing spec

* Fix monitoring on semisuper tasks

* Allow disabling of duplicate check

* Rename MonitorManager

* Monitored -> Monitor

* Clean up comments

* Remove cruft
2016-12-23 16:21:42 -08:00
Jie Tang
dc07c7d414 Skip tests properly in test_env_semantics 2016-12-22 17:49:34 -08:00
Jie Tang
5ca80a3141 Fix format of solves for TotalReward 2016-12-21 20:52:38 -08:00
Jie Tang
6d5cc8fbf1 Add score from file helper 2016-12-21 20:52:23 -08:00
Greg Brockman
f8a7f5e129 Bump version 2016-12-17 08:54:56 -08:00
Joean
626643bb07 Update hex.py (#425)
I think this variable in Line 127 should be 'a', not 'action'
2016-12-15 06:07:16 -08:00
Szymon Sidor
527a73dd03 fix atari benchmarks 2016-12-14 14:02:08 -08:00
Szymon Sidor
3aae03c238 monitor keeps count of the total number of steps taken 2016-12-14 13:55:18 -08:00
Jie Tang
b33fc9fb85 Fix a bug when the benchmark result is empty 2016-12-13 22:05:29 -08:00
Jie Tang
54ead345dc Bunch of refactoring since TotalReward and RewardPerTime scoring are quite similar
2016-12-13 22:01:50 -08:00
Jie Tang
f63bb2e1aa Add RewardPerTime scoring function and tests 2016-12-13 22:01:01 -08:00
michalsustr
afd6888d2f swap avconv / ffmpeg. ffmpeg is obsolete, so it shouldn't be evaled first (crashes on my pc) (#364)
Avconv seems more likely to work for more people, so switch
2016-12-13 21:09:40 -08:00
Szymon Sidor
c6215f190d simplify Atari benchmarks, and allow finer grain control over how they are displayed on the website 2016-12-13 16:57:19 -08:00
damodei
8944480727 Merge branch 'master' of github.com:openai/gym 2016-12-07 04:51:24 +01:00
damodei
7804acefe4 add racing games benchmark 2016-12-07 04:51:05 +01:00
Matthew Chan
895155682e Update environments.md (#398)
Added link to the 2D maze environment.
2016-12-06 17:07:48 -08:00
Avital Oliver
8676fa858f Unbreak random_agent (#406)
The semantics of `env.reset()` have changed. It's no longer possible
to reset an environment until it's done.

Notably, this commit changes the behavior of random_agent. It used
to have a limit on number of actions per episode. If we wanted
that now, we'd have to close the environment and recreate it
on each episode, which may be slow.
2016-12-06 17:07:23 -08:00
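The reset semantics described in 8676fa858f can be illustrated with a minimal sketch. This is not the code from that commit and it does not use gym: `ToyEnv` and `run_random_agent` are hypothetical names, and the rule that a reset is rejected only after at least one step into an unfinished episode is an assumption made for illustration.

```python
import random


class ToyEnv:
    """Hypothetical stand-in for an environment; not gym's real API."""

    def __init__(self, episode_length=5):
        self.episode_length = episode_length
        self.steps = 0
        self.done = False

    def reset(self):
        # Assumed rule: once an episode has started, it must finish
        # before the environment can be reset again.
        if self.steps > 0 and not self.done:
            raise RuntimeError("cannot reset before the episode is done")
        self.steps = 0
        self.done = False
        return 0  # dummy observation

    def step(self, action):
        self.steps += 1
        self.done = self.steps >= self.episode_length
        reward = 1.0  # dummy reward
        return self.steps, reward, self.done, {}


def run_random_agent(env, episodes=3, seed=0):
    """Run random actions with no per-episode action cap."""
    rng = random.Random(seed)
    steps_per_episode = []
    for _ in range(episodes):
        env.reset()
        done = False
        count = 0
        # Keep stepping until the env itself reports done, so the
        # next reset() is always legal.
        while not done:
            action = rng.choice([0, 1])
            _, _, done, _ = env.step(action)
            count += 1
        steps_per_episode.append(count)
    return steps_per_episode
```

In this sketch, capping the number of actions per episode would leave the environment mid-episode, and the next `reset()` would raise. That is the trade-off the commit message describes.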
Greg Brockman
057862d83f Run video recorder in a separate session 2016-12-04 17:17:22 -08:00
Szymon Sidor
e97ec21acc add ability to mark env_groups as universe/gym 2016-12-04 15:35:28 -08:00
catherio
2c0c7bfde1 README tweaks 2016-12-04 07:34:14 -08:00
Trevor Blackwell
fe002c8b7f For autoreset envs, advance the video recorder episode 2016-12-02 18:06:01 -08:00
Trevor Blackwell
f243b55316 Cut 2 more nondeterministic envs 2016-12-01 22:03:07 -08:00
Trevor Blackwell
31855a97bc Better error messages for env semantic changes 2016-12-01 13:31:10 -08:00
Trevor Blackwell
cd7896b3b6 Change rewards for SemisuperPendulumNoise-v0 to match env 2016-12-01 13:30:47 -08:00
Trevor Blackwell
8fad22d51a Better error messages for env semantic changes 2016-12-01 13:21:59 -08:00
Trevor Blackwell
22f091dbc5 Remove mujoco envs from test, since we can't run on Travis 2016-12-01 13:10:09 -08:00
Tambet Matiisen
4ca0969c79 Added Minecraft benchmarks. 2016-12-01 10:30:03 -08:00
Jie Tang
5dba36c68d Fix benchmark score compute when a monitor file is empty 2016-11-30 22:38:11 -08:00
Tambet Matiisen
214f9fb779 Added reward thresholds to ClassicControl-v0 benchmark. 2016-11-30 11:39:33 -08:00
Tambet Matiisen
460f104a1e Renamed ClassicControl benchmark. 2016-11-29 13:58:50 -08:00
Szymon Sidor
ebd0dcc515 add reward ranges to stochastic Atari Benchmarks 2016-11-29 13:42:48 -08:00
Tambet Matiisen
305b966f38 Added new benchmark ClassicControl2-v1. 2016-11-27 23:38:57 -08:00
Greg Brockman
b187596b22 Fix typo 2016-11-24 00:51:03 -08:00
Greg Brockman
84439ab08b Allow async envs to return None 2016-11-24 00:48:21 -08:00
Szymon Sidor
bcc68f8415 add reward bounds to Atari7PixelDeterministic benchmark 2016-11-23 20:14:30 -08:00
Szymon Sidor
3bee1000e0 add deterministic Atari v3 benchmark 2016-11-22 13:15:53 -08:00
Jie Tang
95f1248d7c Switch from median to mean when computing binned graphs 2016-11-22 01:24:57 -08:00
Trevor Blackwell
4bd03664d0 Fix Blackjack reward (after fixing env in b5108b384e) 2016-11-21 21:51:33 -08:00
Trevor Blackwell
3265fcc8ba Change to py3 formatting of rollout file. Add missing mujoco envs 2016-11-21 21:22:44 -08:00
Trevor Blackwell
b5108b384e Fix compare in Blackjack. Keeping at -v0 since it was totally broken before 2016-11-20 21:15:06 -08:00
Szymon Sidor
5fd1c86865 add a deterministic Atari benchmark 2016-11-15 10:28:40 -08:00
Greg Brockman
90214baa9f Bump version 2016-11-13 19:43:36 -08:00
Greg Brockman
faed1c6546 Get rid of separate async semantics 2016-11-13 19:42:57 -08:00
Greg Brockman
3864e42498 Bump version 2016-11-12 13:09:41 -08:00
Greg Brockman
53c074ab3b Add period to valid env IDs 2016-11-11 21:09:25 -08:00
Greg Brockman
b7e53c371d Update ObservationWrapper for new async semantics 2016-11-07 14:30:06 -08:00
Jie Tang
991b309cee Bump version 2016-11-02 14:18:57 -07:00
Jie Tang
974cbf4844 Allow multiple resets in a row 2016-11-02 12:56:38 -07:00