mirror of
https://github.com/Farama-Foundation/Gymnasium.git
synced 2025-07-30 21:34:30 +00:00
Updated Wrapper docs (#173)
@@ -12,6 +12,11 @@ wrappers/observation_wrappers
wrappers/reward_wrappers
```

```{eval-rst}
.. automodule:: gymnasium.wrappers

```

## gymnasium.Wrapper

```{eval-rst}
@@ -35,6 +40,13 @@ wrappers/reward_wrappers
.. autoproperty:: gymnasium.Wrapper.spec
.. autoproperty:: gymnasium.Wrapper.metadata
.. autoproperty:: gymnasium.Wrapper.np_random
.. attribute:: gymnasium.Wrapper.env

    The environment one level underneath this wrapper.

    This may itself be a wrapped environment.
    To obtain the environment underneath all layers of wrappers, use :attr:`gymnasium.Wrapper.unwrapped`.

.. autoproperty:: gymnasium.Wrapper.unwrapped
```
@@ -124,43 +136,4 @@ wrapper in the page on the wrapper type
    * - :class:`VectorListInfo`
      - Misc Wrapper
      - This wrapper will convert the info of a vectorized environment from the `dict` format to a `list` of dictionaries where the i-th dictionary contains info of the i-th environment.
```

## Implementing a custom wrapper

Sometimes you need a wrapper that makes more complicated modifications, for example changing the
reward based on data in `info` or changing the rendering behavior.
Such wrappers can be implemented by inheriting from `gymnasium.Wrapper`.

- You can set a new action or observation space by defining `self.action_space` or `self.observation_space` in `__init__`, respectively.
- You can set new metadata and reward range by defining `self.metadata` and `self.reward_range` in `__init__`, respectively.
- You can override `step`, `render`, `close` etc. If you do this, you can access the environment that was passed
  to your wrapper (which *still* might be wrapped in some other wrapper) by accessing the attribute `self.env`.

Let's also take a look at an example for this case. Most MuJoCo environments return a reward that consists
of different terms: for instance, there might be a term that rewards the agent for completing the task and one term that
penalizes large actions (i.e. energy usage). Usually, you can pass weight parameters for those terms during
initialization of the environment. However, *Reacher* does not allow you to do this! Nevertheless, all individual terms
of the reward are returned in `info`, so let us build a wrapper for Reacher that allows us to weight those terms:

```python
import gymnasium as gym


class ReacherRewardWrapper(gym.Wrapper):
    """Re-weight the distance and control terms of Reacher's reward."""

    def __init__(self, env, reward_dist_weight, reward_ctrl_weight):
        super().__init__(env)
        self.reward_dist_weight = reward_dist_weight
        self.reward_ctrl_weight = reward_ctrl_weight

    def step(self, action):
        # Discard the environment's reward and rebuild it from the terms in `info`.
        obs, _, terminated, truncated, info = self.env.step(action)
        reward = (
            self.reward_dist_weight * info["reward_dist"]
            + self.reward_ctrl_weight * info["reward_ctrl"]
        )
        return obs, reward, terminated, truncated, info
```

```{note}
It is *not* sufficient to use a `RewardWrapper` in this case, because its `reward` method only receives the scalar reward and has no access to `info`!
```
@@ -1,22 +1,16 @@
# Action Wrappers

## Action Wrapper
## Base Class

```{eval-rst}
.. autoclass:: gymnasium.ActionWrapper

.. autofunction:: gymnasium.ActionWrapper.action
.. automethod:: gymnasium.ActionWrapper.action
```

## Clip Action

## Available Action Wrappers
```{eval-rst}
.. autoclass:: gymnasium.wrappers.ClipAction
```

## Rescale Action

```{eval-rst}
.. autoclass:: gymnasium.wrappers.RescaleAction
```

@@ -1,68 +1,15 @@
# Misc Wrappers

## Atari Preprocessing

```{eval-rst}
.. autoclass:: gymnasium.wrappers.AtariPreprocessing
```

## Autoreset

```{eval-rst}
.. autoclass:: gymnasium.wrappers.AutoResetWrapper
```

## Compatibility

```{eval-rst}
.. autoclass:: gymnasium.wrappers.EnvCompatibility
.. autoclass:: gymnasium.wrappers.StepAPICompatibility
```

## Passive Environment Checker

```{eval-rst}
.. autoclass:: gymnasium.wrappers.PassiveEnvChecker
```

## Human Rendering

```{eval-rst}
.. autoclass:: gymnasium.wrappers.HumanRendering
```

## Order Enforcing

```{eval-rst}
.. autoclass:: gymnasium.wrappers.OrderEnforcing
```

## Record Episode Statistics

```{eval-rst}
.. autoclass:: gymnasium.wrappers.RecordEpisodeStatistics
```

## Record Video

```{eval-rst}
.. autoclass:: gymnasium.wrappers.RecordVideo
```

## Render Collection

```{eval-rst}
.. autoclass:: gymnasium.wrappers.RenderCollection
```

## Time Limit

```{eval-rst}
.. autoclass:: gymnasium.wrappers.TimeLimit
```

## Vector List Info

```{eval-rst}
.. autoclass:: gymnasium.wrappers.VectorListInfo
```

@@ -1,62 +1,23 @@
# Observation Wrappers

## Observation Wrapper
## Base Class

```{eval-rst}
.. autoclass:: gymnasium.ObservationWrapper
.. autofunction:: gymnasium.ObservationWrapper.observation

.. automethod:: gymnasium.ObservationWrapper.observation
```

## Transform Observation
## Available Observation Wrappers

```{eval-rst}
.. autoclass:: gymnasium.wrappers.TransformObservation
```

## Filter Observation

```{eval-rst}
.. autoclass:: gymnasium.wrappers.FilterObservation
```

## Flatten Observation

```{eval-rst}
.. autoclass:: gymnasium.wrappers.FlattenObservation
```

## Framestack Observations

```{eval-rst}
.. autoclass:: gymnasium.wrappers.FrameStack
```

## Gray Scale Observation

```{eval-rst}
.. autoclass:: gymnasium.wrappers.GrayScaleObservation
```

## Normalize Observation

```{eval-rst}
.. autoclass:: gymnasium.wrappers.NormalizeObservation
```

## Pixel Observation Wrapper

```{eval-rst}
.. autoclass:: gymnasium.wrappers.PixelObservationWrapper
```

## Resize Observation

```{eval-rst}
.. autoclass:: gymnasium.wrappers.ResizeObservation
```

## Time Aware Observation

```{eval-rst}
.. autoclass:: gymnasium.wrappers.TimeAwareObservation
```

@@ -1,22 +1,17 @@

# Reward Wrappers

## Reward Wrapper
## Base Class

```{eval-rst}
.. autoclass:: gymnasium.RewardWrapper

.. autofunction:: gymnasium.RewardWrapper.reward
.. automethod:: gymnasium.RewardWrapper.reward
```

## Transform Reward
## Available Reward Wrappers

```{eval-rst}
.. autoclass:: gymnasium.wrappers.TransformReward
```

## Normalize Reward

```{eval-rst}
.. autoclass:: gymnasium.wrappers.NormalizeReward
```