from __future__ import annotations

from abc import abstractmethod
from typing import Generic, Optional, SupportsFloat, Tuple, TypeVar, Union

import gym
from gym import spaces
from gym.logger import deprecation
from gym.utils import seeding
from gym.utils.seeding import RandomNumberGenerator

ObsType = TypeVar("ObsType")
ActType = TypeVar("ActType")


class Env(Generic[ObsType, ActType]):
    """The main OpenAI Gym class. It encapsulates an environment with
    arbitrary behind-the-scenes dynamics. An environment can be
    partially or fully observed.

    The main API methods that users of this class need to know are:

        step
        reset
        render
        close
        seed

    And set the following attributes:

        action_space: The Space object corresponding to valid actions
        observation_space: The Space object corresponding to valid observations
        reward_range: A tuple corresponding to the min and max possible rewards

    Note: a default reward range set to [-inf, +inf] already exists. Set it if you want a narrower range.

    The methods are accessed publicly as "step", "reset", etc.
    """

    # Set this in SOME subclasses
    metadata = {"render_modes": []}
    reward_range = (-float("inf"), float("inf"))
    spec = None

    # Set these in ALL subclasses
    action_space: spaces.Space[ActType]
    observation_space: spaces.Space[ObsType]

    # Created lazily by the np_random property, or by reset() and seed()
    _np_random: RandomNumberGenerator | None = None

    @property
    def np_random(self) -> RandomNumberGenerator:
        """Initializes the np_random field if not done already."""
        if self._np_random is None:
            self._np_random, seed = seeding.np_random()
        return self._np_random

    @np_random.setter
    def np_random(self, value: RandomNumberGenerator):
        self._np_random = value

    @abstractmethod
    def step(self, action: ActType) -> Tuple[ObsType, float, bool, dict]:
        """Run one timestep of the environment's dynamics. When end of
        episode is reached, you are responsible for calling :meth:`reset`
        to reset this environment's state.

        Accepts an action and returns a tuple ``(observation, reward, done, info)``.

        Args:
            action (object): an action provided by the agent

        Returns:
            observation (object): agent's observation of the current environment. This will be an element of the environment's :attr:`observation_space`. This may, for instance, be a numpy array containing the positions and velocities of certain objects.
            reward (float): amount of reward returned after previous action
            done (bool): whether the episode has ended, in which case further :meth:`step` calls will return undefined results. A done signal may be emitted for different reasons: the task underlying the environment was solved successfully, a certain time limit was exceeded, or the physics simulation has entered an invalid state. ``info`` may contain additional information regarding the reason for a ``done`` signal.
            info (dict): contains auxiliary diagnostic information (helpful for debugging, learning, and logging). This might, for instance, contain:

                - metrics that describe the agent's performance or
                - state variables that are hidden from observations or
                - information that distinguishes truncation and termination or
                - individual reward terms that are combined to produce the total reward
        """
        raise NotImplementedError

    @abstractmethod
    def reset(
        self,
        *,
        seed: Optional[int] = None,
        return_info: bool = False,
        options: Optional[dict] = None,
    ) -> Union[ObsType, tuple[ObsType, dict]]:
        """Resets the environment to an initial state and returns an initial
        observation.

        This method should also reset the environment's random number
        generator(s) if ``seed`` is an integer or if the environment has not
        yet initialized a random number generator. If the environment already
        has a random number generator and :meth:`reset` is called with ``seed=None``,
        the RNG should not be reset.
        Moreover, :meth:`reset` should (in the typical use case) be called with an
        integer seed right after initialization and then never again.

        Args:
            seed (int or None): The seed that is used to initialize the environment's PRNG. If the environment does not already have a PRNG and ``seed=None`` (the default option) is passed, a seed will be chosen from some source of entropy (e.g. timestamp or /dev/urandom). However, if the environment already has a PRNG and ``seed=None`` is passed, the PRNG will *not* be reset. If you pass an integer, the PRNG will be reset even if it already exists. Usually, you want to pass an integer *right after the environment has been initialized and then never again*. Please refer to the minimal example above to see this paradigm in action.
            return_info (bool): If true, return additional information along with initial observation. This info should be analogous to the info returned in :meth:`step`
            options (dict or None): Additional information to specify how the environment is reset (optional, depending on the specific environment)

        Returns:
            observation (object): Observation of the initial state. This will be an element of :attr:`observation_space` (usually a numpy array) and is analogous to the observation returned by :meth:`step`.
            info (optional dictionary): This will *only* be returned if ``return_info=True`` is passed. It contains auxiliary information complementing ``observation``. This dictionary should be analogous to the ``info`` returned by :meth:`step`.
        """
        # Initialize the RNG if the seed is manually passed
        if seed is not None:
            self._np_random, seed = seeding.np_random(seed)

    @abstractmethod
    def render(self, mode="human"):
        """Renders the environment.

        The set of supported modes varies per environment. (And some
        third-party environments may not support rendering at all.)
        By convention, if mode is:

        - human: render to the current display or terminal and
          return nothing. Usually for human consumption.
        - rgb_array: Return a numpy.ndarray with shape (x, y, 3),
          representing RGB values for an x-by-y pixel image, suitable
          for turning into a video.
        - ansi: Return a string (str) or io.StringIO containing a
          terminal-style text representation. The text can include newlines
          and ANSI escape sequences (e.g. for colors).

        Note:
            Make sure that your class's metadata 'render_modes' key includes
            the list of supported modes. It's recommended to call super()
            in implementations to use the functionality of this method.

        Args:
            mode (str): the mode to render with

        Example::

            class MyEnv(Env):
                metadata = {'render_modes': ['human', 'rgb_array']}

                def render(self, mode='human'):
                    if mode == 'rgb_array':
                        return np.array(...)  # return RGB frame suitable for video
                    elif mode == 'human':
                        ...  # pop up a window and render
                    else:
                        super(MyEnv, self).render(mode=mode)  # just raise an exception
        """
        raise NotImplementedError

    def close(self):
        """Override close in your subclass to perform any necessary cleanup.

        Environments will automatically close() themselves when
        garbage collected or when the program exits.
        """
        pass

    def seed(self, seed=None):
        """Sets the seed for this env's random number generator(s).

        Note:
            Some environments use multiple pseudorandom number generators.
            We want to capture all such seeds used in order to ensure that
            there aren't accidental correlations between multiple generators.

        Returns:
            list<bigint>: Returns the list of seeds used in this env's random
              number generators. The first value in the list should be the
              "main" seed, or the value which a reproducer should pass to
              'seed'. Often, the main seed equals the provided 'seed', but
              this won't be true if seed=None, for example.
        """
        deprecation(
            "Function `env.seed(seed)` is marked as deprecated and will be removed in the future. "
            "Please use `env.reset(seed=seed)` instead."
        )
        self._np_random, seed = seeding.np_random(seed)
        return [seed]

    @property
    def unwrapped(self) -> Env:
        """Completely unwrap this env.

        Returns:
            gym.Env: The base non-wrapped gym.Env instance
        """
        return self

    def __str__(self):
        if self.spec is None:
            return f"<{type(self).__name__} instance>"
        else:
            return f"<{type(self).__name__}<{self.spec.id}>>"

    def __enter__(self):
        """Support with-statement for the environment."""
        return self

    def __exit__(self, *args):
        """Support with-statement for the environment."""
        self.close()
        # propagate exception
        return False
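
# A minimal sketch of the seeding and with-statement behavior documented in
# Env above. It is a standalone stand-in that does not import gym: the names
# CoinFlipEnv and rollout are hypothetical, and random.Random replaces the
# generator returned by seeding.np_random. It only illustrates the
# "seed once after construction, then reset without a seed" convention.

```python
import random
from typing import Optional


class CoinFlipEnv:
    """Hypothetical one-step environment; the observation is always 0."""

    def __init__(self):
        self._rng: Optional[random.Random] = None
        self.closed = False

    def reset(self, *, seed: Optional[int] = None):
        # Re-create the RNG when a seed is given or when none exists yet;
        # otherwise keep the existing generator and its state.
        if seed is not None or self._rng is None:
            self._rng = random.Random(seed)
        return 0

    def step(self, action: int):
        # Reward 1.0 when the action matches a fair coin flip.
        flip = self._rng.randrange(2)
        reward = 1.0 if action == flip else 0.0
        return 0, reward, True, {"flip": flip}

    def close(self):
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, *args):
        self.close()
        return False  # propagate exceptions


def rollout(seed: int, episodes: int = 5):
    """Seed once right after construction, then reset without a seed."""
    flips = []
    with CoinFlipEnv() as env:
        env.reset(seed=seed)
        for _ in range(episodes):
            _, _, done, info = env.step(0)
            flips.append(info["flip"])
            if done:
                env.reset()  # seed=None: the RNG keeps its state
    return flips, env.closed
```

# Two calls to rollout with the same seed should yield the same flip
# sequence, and the environment is closed when the with-block exits.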


class Wrapper(Env[ObsType, ActType]):
    """Wraps the environment to allow a modular transformation.

    This class is the base class for all wrappers. The subclass could override
    some methods to change the behavior of the original environment without touching the
    original code.

    .. note::

        Don't forget to call ``super().__init__(env)`` if the subclass overrides :meth:`__init__`.

    """

    def __init__(self, env: Env):
        self.env = env

        self._action_space: spaces.Space | None = None
        self._observation_space: spaces.Space | None = None
        self._reward_range: tuple[SupportsFloat, SupportsFloat] | None = None
        self._metadata: dict | None = None

    def __getattr__(self, name):
        if name.startswith("_"):
            raise AttributeError(f"accessing private attribute '{name}' is prohibited")
        return getattr(self.env, name)

    @property
    def spec(self):
        return self.env.spec

    @classmethod
    def class_name(cls):
        return cls.__name__

    @property
    def action_space(self) -> spaces.Space[ActType]:
        if self._action_space is None:
            return self.env.action_space
        return self._action_space

    @action_space.setter
    def action_space(self, space):
        self._action_space = space

    @property
    def observation_space(self) -> spaces.Space:
        if self._observation_space is None:
            return self.env.observation_space
        return self._observation_space

    @observation_space.setter
    def observation_space(self, space):
        self._observation_space = space

    @property
    def reward_range(self) -> tuple[SupportsFloat, SupportsFloat]:
        if self._reward_range is None:
            return self.env.reward_range
        return self._reward_range

    @reward_range.setter
    def reward_range(self, value):
        self._reward_range = value

    @property
    def metadata(self) -> dict:
        if self._metadata is None:
            return self.env.metadata
        return self._metadata

    @metadata.setter
    def metadata(self, value):
        self._metadata = value

    def step(self, action: ActType) -> Tuple[ObsType, float, bool, dict]:
        return self.env.step(action)

    def reset(self, **kwargs) -> Union[ObsType, tuple[ObsType, dict]]:
        return self.env.reset(**kwargs)

    def render(self, **kwargs):
        return self.env.render(**kwargs)

    def close(self):
        return self.env.close()

    def seed(self, seed=None):
        return self.env.seed(seed)

    def __str__(self):
        return f"<{type(self).__name__}{self.env}>"

    def __repr__(self):
        return str(self)

    @property
    def unwrapped(self) -> Env:
        return self.env.unwrapped
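
# A minimal sketch of the delegation pattern Wrapper relies on, written
# without gym so it can run on its own. Base and Delegating are hypothetical
# names. It shows two things from the class above: __getattr__ forwards
# unknown public attributes to the wrapped object but refuses names that
# start with an underscore, and a property falls back to the wrapped value
# until an override is assigned.

```python
class Base:
    """Hypothetical wrapped object standing in for an Env."""

    action_space = "base-space"
    _secret = 42

    def describe(self):
        return "base"


class Delegating:
    """Sketch of the Wrapper fallback pattern, without gym."""

    def __init__(self, env):
        self.env = env
        self._action_space = None  # None means "defer to self.env"

    def __getattr__(self, name):
        # Only reached when normal attribute lookup on the wrapper fails.
        if name.startswith("_"):
            raise AttributeError(f"accessing private attribute '{name}' is prohibited")
        return getattr(self.env, name)

    @property
    def action_space(self):
        if self._action_space is None:
            return self.env.action_space
        return self._action_space

    @action_space.setter
    def action_space(self, space):
        self._action_space = space
```

# Assigning to action_space on the wrapper changes only the wrapper's view;
# the wrapped object keeps its own value.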


class ObservationWrapper(Wrapper):
    def reset(self, **kwargs):
        if kwargs.get("return_info", False):
            obs, info = self.env.reset(**kwargs)
            return self.observation(obs), info
        else:
            return self.observation(self.env.reset(**kwargs))

    def step(self, action):
        observation, reward, done, info = self.env.step(action)
        return self.observation(observation), reward, done, info

    @abstractmethod
    def observation(self, observation):
        raise NotImplementedError
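
# A minimal sketch of the ObservationWrapper pattern above, again without
# importing gym. CounterEnv and ScaledObservation are hypothetical names and
# the scaling rule is an arbitrary choice for illustration. The point is that
# observation() is applied both in step() and in reset(), including the
# return_info=True case where reset() returns an (observation, info) pair.

```python
class CounterEnv:
    """Hypothetical environment whose observation is a step counter."""

    def __init__(self):
        self.t = 0

    def reset(self, *, return_info=False, **kwargs):
        self.t = 0
        return (self.t, {"t": self.t}) if return_info else self.t

    def step(self, action):
        self.t += 1
        return self.t, 1.0, self.t >= 3, {"t": self.t}


class ScaledObservation:
    """Applies observation() to everything reset() and step() return."""

    def __init__(self, env, scale):
        self.env = env
        self.scale = scale

    def reset(self, **kwargs):
        if kwargs.get("return_info", False):
            obs, info = self.env.reset(**kwargs)
            return self.observation(obs), info
        return self.observation(self.env.reset(**kwargs))

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        return self.observation(obs), reward, done, info

    def observation(self, observation):
        return observation * self.scale
```

# Reward, done and info pass through unchanged; only the observation is
# transformed.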


class RewardWrapper(Wrapper):
    def reset(self, **kwargs):
        return self.env.reset(**kwargs)

    def step(self, action):
        observation, reward, done, info = self.env.step(action)
        return observation, self.reward(reward), done, info

    @abstractmethod
    def reward(self, reward):
        raise NotImplementedError


class ActionWrapper(Wrapper):
    def reset(self, **kwargs):
        return self.env.reset(**kwargs)

    def step(self, action):
        return self.env.step(self.action(action))

    @abstractmethod
    def action(self, action):
        raise NotImplementedError

    @abstractmethod
    def reverse_action(self, action):
        raise NotImplementedError
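
# A minimal sketch of the ActionWrapper pattern above, without importing gym.
# EchoEnv and RescaleAction are hypothetical names. The source does not
# document reverse_action, so treating it as the inverse of action() is an
# assumption made here for illustration only.

```python
class EchoEnv:
    """Hypothetical environment that reports the action it received."""

    def step(self, action):
        return action, 0.0, False, {}


class RescaleAction:
    """Maps agent actions in [-1, 1] to environment actions in [low, high]."""

    def __init__(self, env, low, high):
        self.env = env
        self.low = low
        self.high = high

    def step(self, action):
        # The wrapped environment only ever sees the transformed action.
        return self.env.step(self.action(action))

    def action(self, action):
        return self.low + (action + 1.0) * 0.5 * (self.high - self.low)

    def reverse_action(self, action):
        # Assumed semantics: undo action(), mapping [low, high] to [-1, 1].
        return 2.0 * (action - self.low) / (self.high - self.low) - 1.0
```

# With low=0.0 and high=10.0, an agent action of 0.0 reaches the wrapped
# environment as 5.0.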