Commit Graph

1491 Commits

Author SHA1 Message Date
Matt Harvey
92d5c9458a Move _reset() in constructor to below the definitions of action_space (#220) 2016-06-20 00:38:25 +02:00
Greg Brockman
a7e5a581dc scoring.py: Avoid NaN when computing standard error 2016-06-17 18:45:19 -07:00
Rafael Cosman
ad4930c812 adds timestep limit to cartpole safety envs (#215)
* add probabilistic off-switch cartpole environment

* renames prob_offswitch_cartpole to offswitch_cartpole_prob

* adds timestep_limit to cartpole safety envs
2016-06-17 18:38:36 -07:00
Rafael Cosman
330889efb4 adds README to safety/ (#209) 2016-06-17 14:54:27 -07:00
Rafael Cosman
0d5312e835 fixes error in safety envs description of pendulum problems (#208) 2016-06-17 14:43:58 -07:00
Rafael Cosman
a46972cf8d overrides step() in safety envs to pass monitor true_reward (#206) 2016-06-17 14:41:14 -07:00
Arushi Raghuvanshi
9460023398 Added test for environment semantics (#196) 2016-06-17 13:23:49 -07:00
Jie Tang
36d476224e Assert is a keyword, not a function 2016-06-16 01:17:37 -07:00
Jie Tang
5372d34b37 Fix failing tests by making safety cartpole envs contain a cartpole rather than
inheriting from it
2016-06-16 01:17:26 -07:00
Jie Tang
8a545eeb59 Tighten assertions (#197) 2016-06-16 00:12:47 -07:00
Philip Paquette
681a80f10d Doom - Only setting seed once (#194) 2016-06-14 20:00:44 -07:00
Philip Paquette
a3f89de736 Doom - Added meta-Doom as separate PR (#189)
* Doom - Added meta-Doom as separate PR

* meta-Doom - Added description to scoreboard
2016-06-14 16:21:58 -07:00
Philip Paquette
aff7a643cc Doom - Same Action Space Across Environments (#157)
* Doom - Added reward_threshold and timestep_limit for all environments

* Doom - Returning all available game variables

* Doom - Moved _seed to doom_env to avoid repetition in every environment

* Doom - Added ALT_ATTACK and made all action_space equivalent (same controls between environments).

* Doom - Actions can either be a short list of allowed actions or the full list of 41 commands

* Doom - Returning black observation space on error or is_finished, rather than empty list (which was triggering an error)

* Doom - HighLow.sample() returns the small list.

* Doom - Updated difficulty for some missions

* Doom - Fixed inconsistency between controls.md and deathmatch.cfg

* Doom - Issue #168 - Remove sleep statement from DoomEnv render

* Doom - Only using full action space (43 keys)

- Added 'normal', 'fast' and 'human' mode
- Set non-deterministic to True
- Set video.frames_per_second to 35
- Properly returning game variables

* Replaced warnings.warn by logger.warn

* Doom - Added NUM_ACTIONS and action_idx instead of x

* Doom - reset() only calls game.new_episode() after first call

* Doom is now deterministic

* Doom - Partial fix for issue #167 - DoomDeathmatch environment crashes sporadically

* Doom - Standardized envs, simplified _reset

* Doom - Removed temporary fix for issue #167

* Doom - Added scoreboard summary and description
2016-06-14 15:57:47 -07:00
Rafael Cosman
5b8603066c shortens safety env names (#193) 2016-06-14 15:57:25 -07:00
Rafael Cosman
2b503f719b improves descriptions for safety envs (#192) 2016-06-14 11:37:10 -07:00
Rafael Cosman
36cc23707b Adds summaries for safety envs (#190) 2016-06-14 10:39:23 -07:00
Oleg Klimov
af5bb400fe trivial fix for #143 (#187)
* trivial fix for #143

* #143 same fix for other environments, not really necessary, but still
2016-06-14 08:32:51 -07:00
Greg Brockman
2aa03d6088 Add configure method to Env, and support multiple displays in CartPole (#175)
* Add configure method to Env, and support multiple displays in CartPole

This allows people to pass runtime specification which doesn't affect
the environment semantics to environments created via `make`.

Also include an example of setting the display used for CartPole

* Provide full configure method

* Allow environments to require configuration

* Don't take arguments in make
2016-06-12 20:56:21 -07:00
Rafael Cosman
f7f064160e Improves safety docs (#182)
* add interruptibility method

* revise docs for interruptibility method

* improves safety docs
2016-06-12 20:55:21 -07:00
Rafael Cosman
f71a836528 fully qualifies paths in safety/__init__ (#183) 2016-06-12 20:55:05 -07:00
Rafael Cosman
b6e9a45857 makes off_switch_cartpole use a tuple instead of an np array (#177) 2016-06-12 16:58:46 -07:00
Rafael Cosman
85b9f909b4 removes unofficial cartpole testfile (#179) 2016-06-12 16:58:21 -07:00
Rafael Cosman
055f8895c2 makes semi_supervised_pendulum envs pass determinism tests (#178) 2016-06-12 16:58:13 -07:00
Rafael Cosman
0df07849e7 Fixes reset states for 2 interpretability envs (#180) 2016-06-12 16:57:58 -07:00
Philip Paquette
93e49ba2c8 Monitoring Tests - Added skip_mujoco test (#171) 2016-06-12 14:19:28 -07:00
Rafael Cosman
2020cc0966 makes test_determinism support tuples (#176) 2016-06-12 14:08:05 -07:00
Greg Brockman
a1f80e2fa9 Only create ~/.mujoco if MUJOCO_KEY_BUNDLE is set
This will fix PR builds. CF:

  37b87ccba6 (r66727379)
2016-06-12 14:06:31 -07:00
Rafael Cosman
c784b71aed Series of safety environments (#172)
* adds off_switch_cartpole.py

* adds interpretability_cartpole_actions.py

* adds semi_supervised_pendulum_noise.py

* adds semi_supervised_pendulum_random.py

* adds calls to reset()

* adds interpretability_cartpole_observations.py

* adds semi_supervised_pendulum_decay.py

* adds __init__.py

* adds registration

* removes unofficial test files
2016-06-12 13:36:50 -07:00
Rafael Cosman
f254dd197e adds example usage to spaces (#173) 2016-06-11 23:10:58 -07:00
Jie Tang
a4c35f82c0 Merge PR #170 2016-06-10 14:07:25 -07:00
Catherine Olsson
e3548d62aa Make it possible to step() in a newly created env, rather than throwing AttributeError 2016-06-10 16:26:07 -04:00
Jie Tang
7605986f09 Push up x_threshold to silence warnings: fixes #88 2016-06-09 16:22:15 -07:00
Greg Brockman
116b03144c Add docstring for monitor property 2016-06-09 11:31:35 -07:00
Greg Brockman
d257dac86e Go back to lazily creating the monitor
The refresh case is complicated enough that special handling is needed anyway
2016-06-09 11:29:34 -07:00
Greg Brockman
e759526268 Handle monitor close better 2016-06-09 11:27:27 -07:00
Greg Brockman
c183708a52 Correct self -> env 2016-06-09 09:45:55 -07:00
Greg Brockman
790f02e471 Create monitor at __new__ time
Only reason we don't already is we didn't have the
__new__ hook when we first wrote this code
2016-06-09 09:26:27 -07:00
John Schulman
9714ea8a66 add bibtex entry and reference to whitepaper 2016-06-07 00:37:31 -07:00
Oleg Klimov
72d89cb22f Faster video recording (#119)
* Faster video recording

* rendering.py: return_rgb_array default to False, for other environments not to break
2016-06-06 00:06:26 -07:00
Tom White
0079589bbc Update README.rst (#158)
Fixed LunarLander version in code sample.
2016-06-05 23:09:06 -07:00
Rafael Cosman
491cbca5af Adds instructions for adding envs (#160) 2016-06-05 20:35:53 -07:00
Jie Tang
f925af26d0 Add credit for contributed envs 2016-06-03 18:17:40 -07:00
Greg Brockman
9ce682914f Bump version 2016-06-02 10:25:10 -07:00
Greg Brockman
f6303f6a20 Only warn once about out-of-bound actions and observations (#153) 2016-06-02 10:19:33 -07:00
Jie Tang
d173281ea6 Remove algorithm id from example agents, add documentation to api.py 2016-06-01 11:16:29 -07:00
Greg Brockman
81a99013bc Fix car_dynamics import 2016-06-01 07:17:40 -07:00
Maciek
43992f4752 Make agent examples compatible with python 3 (#150)
* make cem agent example compatible with python 2 and 3

* make the keyboard_agent example compatible with python 2 and 3

Changing `xrange` to `range` should not impact performance unless we're
generating millions of elements (currently only 1000).

* remove algorithm_id from the upload call
2016-06-01 07:15:18 -07:00
Jie Tang
d167a391a4 Add dependencies for building scipy from source 2016-06-01 00:32:21 -07:00
Jie Tang
c2819fa0f6 Merge branch 'iaroslav-ai-master' 2016-06-01 00:13:59 -07:00
Jie Tang
7cbc24ba08 Fix failing DoomDeathmatch test (altattack -> alattack) 2016-06-01 00:12:14 -07:00