mirror of https://github.com/Farama-Foundation/Gymnasium.git (synced 2025-08-20 22:12:03 +00:00)
@@ -18,8 +18,7 @@ class HopperEnv(mujoco_env.MujocoEnv, utils.EzPickle):
 ### Description

 This environment is based on the work done by Erez, Tassa, and Todorov in
-["Infinite Horizon Model Predictive Control for Nonlinear Periodic Tasks"]
-(http://www.roboticsproceedings.org/rss07/p10.pdf). The environment aims to
+["Infinite Horizon Model Predictive Control for Nonlinear Periodic Tasks"](http://www.roboticsproceedings.org/rss07/p10.pdf). The environment aims to
 increase the number of independent state and control variables as compared to
 the classic control environments. The hopper is a two-dimensional
 one-legged figure that consist of four main body parts - the torso at the
@@ -71,19 +70,20 @@ class HopperEnv(mujoco_env.MujocoEnv, utils.EzPickle):
 on that value. This value is hidden from the algorithm, which in turn has
 to develop an abstract understanding of it from the observed rewards.
 Therefore, observation space has shape `(11,)` instead of `(12,)` and looks like:

 | Num | Observation | Min | Max | Name (in corresponding XML file) | Joint | Unit |
 |-----|-----------------------|----------------------|--------------------|----------------------|--------------------|--------------------|
-| 0 | z-coordinate of the top (height of hopper) | -Inf | Inf | rootz | slide | position (m) |
-| 1 | angle of the top | -Inf | Inf | rooty | hinge | angle (rad) |
-| 2 | angle of the thigh joint | -Inf | Inf | thigh_joint | hinge | angle (rad) |
-| 3 | angle of the leg joint | -Inf | Inf | leg_joint | hinge | angle (rad) |
-| 4 | angle of the foot joint | -Inf | Inf | foot_joint | hinge | angle (rad) |
-| 5 | velocity of the x-coordinate of the top | -Inf | Inf | rootx | slide | velocity (m/s) |
-| 6 | velocity of the z-coordinate (height) of the top | -Inf | Inf | rootz | slide | velocity (m/s) |
-| 7 | angular velocity of the angle of the top | -Inf | Inf | rooty | hinge | angular velocity (rad/s) |
-| 8 | angular velocity of the thigh hinge | -Inf | Inf | thigh_joint | hinge | angular velocity (rad/s) |
-| 9 | angular velocity of the leg hinge | -Inf | Inf | leg_joint | hinge | angular velocity (rad/s) |
-| 10 | angular velocity of the foot hinge | -Inf | Inf | foot_joint | hinge | angular velocity (rad/s) |
+| 0 | z-coordinate of the top (height of hopper) | -Inf | Inf | rootz | slide | position (m) |
+| 1 | angle of the top | -Inf | Inf | rooty | hinge | angle (rad) |
+| 2 | angle of the thigh joint | -Inf | Inf | thigh_joint | hinge | angle (rad) |
+| 3 | angle of the leg joint | -Inf | Inf | leg_joint | hinge | angle (rad) |
+| 4 | angle of the foot joint | -Inf | Inf | foot_joint | hinge | angle (rad) |
+| 5 | velocity of the x-coordinate of the top | -Inf | Inf | rootx | slide | velocity (m/s) |
+| 6 | velocity of the z-coordinate (height) of the top | -Inf | Inf | rootz | slide | velocity (m/s) |
+| 7 | angular velocity of the angle of the top | -Inf | Inf | rooty | hinge | angular velocity (rad/s) |
+| 8 | angular velocity of the thigh hinge | -Inf | Inf | thigh_joint | hinge | angular velocity (rad/s) |
+| 9 | angular velocity of the leg hinge | -Inf | Inf | leg_joint | hinge | angular velocity (rad/s) |
+| 10 | angular velocity of the foot hinge | -Inf | Inf | foot_joint | hinge | angular velocity (rad/s) |

 ### Rewards
 The reward consists of three parts:
@@ -103,8 +103,8 @@ class HopperEnv(mujoco_env.MujocoEnv, utils.EzPickle):

 ### Starting State
 All observations start in state
-(0.0, 1.25, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0) with a uniform nois
-e in the range of [-0.005, 0.005] added to the values for stochasticity.
+(0.0, 1.25, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0) with a uniform noise
+in the range of [-0.005, 0.005] added to the values for stochasticity.

 ### Episode Termination
 The episode terminates when any of the following happens:
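
The observation table in the second hunk doubles as an index map for the 11-element observation vector. Below is a minimal sketch of that mapping in plain Python. The field labels (`rootz_pos`, `rooty_vel`, and so on) and the helper `label_observation` are hypothetical names built from the XML joint names for illustration; they are not part of Gymnasium's API, which returns a flat array.

```python
# Index -> (label, unit), transcribed from the observation table.
# The "_pos"/"_vel" labels are illustrative assumptions, not Gymnasium names.
OBSERVATION_FIELDS = (
    ("rootz_pos", "m"),            # 0: height of the top
    ("rooty_pos", "rad"),          # 1: angle of the top
    ("thigh_joint_pos", "rad"),    # 2
    ("leg_joint_pos", "rad"),      # 3
    ("foot_joint_pos", "rad"),     # 4
    ("rootx_vel", "m/s"),          # 5: x-velocity of the top
    ("rootz_vel", "m/s"),          # 6: z-velocity of the top
    ("rooty_vel", "rad/s"),        # 7
    ("thigh_joint_vel", "rad/s"),  # 8
    ("leg_joint_vel", "rad/s"),    # 9
    ("foot_joint_vel", "rad/s"),   # 10
)


def label_observation(obs):
    """Pair each entry of an 11-element observation with its label."""
    if len(obs) != len(OBSERVATION_FIELDS):
        raise ValueError(
            f"expected {len(OBSERVATION_FIELDS)} values, got {len(obs)}"
        )
    return {name: value for (name, _unit), value in zip(OBSERVATION_FIELDS, obs)}


# Made-up observation values, used only to exercise the helper.
example = [1.25, 0.0, 0.0, 0.0, 0.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0]
labelled = label_observation(example)
print(labelled["rootz_pos"], labelled["rootx_vel"])
```

In the table, `rootx` appears only as a velocity (index 5), which is consistent with the stated shape `(11,)` instead of `(12,)`.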
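
The Starting State text in the last hunk describes a fixed nominal vector with uniform noise in [-0.005, 0.005] added to each value. Below is a minimal sketch of that sampling rule using only the Python standard library. The function name `sample_start_state` and the seeded `random.Random` instance are assumptions for illustration; the environment itself may generate and apply the noise differently.

```python
import random

# Nominal starting values exactly as listed in the docstring.
NOMINAL_START = (0.0, 1.25, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0)
NOISE_SCALE = 0.005  # half-width of the uniform noise range


def sample_start_state(rng, nominal=NOMINAL_START, scale=NOISE_SCALE):
    """Add independent uniform noise in [-scale, scale] to each nominal value."""
    return [value + rng.uniform(-scale, scale) for value in nominal]


rng = random.Random(0)  # seeded only to make the sketch reproducible
state = sample_start_state(rng)
print(len(state), state[1])
```

Every sampled value stays within 0.005 of its nominal value, so the second entry always lies between 1.245 and 1.255.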