From 92d8e2a8af64f7e9e3bae765297b1b636fab65b6 Mon Sep 17 00:00:00 2001
From: Roberto Schiavone
Date: Tue, 14 Mar 2023 16:53:31 +0100
Subject: [PATCH] chore: mujoco.md :memo: (#386)

---
 docs/environments/mujoco.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/environments/mujoco.md b/docs/environments/mujoco.md
index b0956b70f..bdec2d899 100644
--- a/docs/environments/mujoco.md
+++ b/docs/environments/mujoco.md
@@ -37,7 +37,7 @@ These environments also require that the MuJoCo engine be installed. As of Octob
 
 For MuJoCo V3 environments and older the `mujoco-py` framework is required (`pip install mujoco-py`) which can be found in the [GitHub repository](https://github.com/openai/mujoco-py/tree/master/mujoco_py)
 
-There are ten Mujoco environments: Ant, HalfCheetah, Hopper, Humanoid, HumanoidStandup, IvertedDoublePendulum, InvertedPendulum, Reacher, Swimmer, and Walker. All of these environments are stochastic in terms of their initial state, with a Gaussian noise added to a fixed initial state in order to add stochasticity. The state spaces for MuJoCo environments in Gymnasium consist of two parts that are flattened and concatenated together: a position of a body part ('*mujoco-py.mjsim.qpos*') or joint and its corresponding velocity ('*mujoco-py.mjsim.qvel*'). Often, some of the first positional elements are omitted from the state space since the reward is calculated based on their values, leaving it up to the algorithm to infer those hidden values indirectly.
+There are eleven Mujoco environments: Ant, HalfCheetah, Hopper, Humanoid, HumanoidStandup, InvertedDoublePendulum, InvertedPendulum, Pusher, Reacher, Swimmer, and Walker2d. All of these environments are stochastic in terms of their initial state, with a Gaussian noise added to a fixed initial state in order to add stochasticity. The state spaces for MuJoCo environments in Gymnasium consist of two parts that are flattened and concatenated together: a position of a body part ('*mujoco-py.mjsim.qpos*') or joint and its corresponding velocity ('*mujoco-py.mjsim.qvel*'). Often, some of the first positional elements are omitted from the state space since the reward is calculated based on their values, leaving it up to the algorithm to infer those hidden values indirectly.
 
 Among Gymnasium environments, this set of environments can be considered as more difficult ones to solve by a policy.
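
Note on the paragraph touched by this patch: the state-space description can be illustrated with a minimal pure-Python sketch. It is not the Gymnasium implementation. The helper names `reset_state` and `build_observation`, the `noise_scale` value, and the `n_hidden` count are all hypothetical and chosen only for illustration; the actual environments may differ in noise scale and in how many leading positions they omit.

```python
import random


def reset_state(init_qpos, init_qvel, noise_scale=0.1, rng=None):
    """Perturb a fixed initial state with Gaussian noise (illustrative only)."""
    rng = rng or random.Random()
    qpos = [p + rng.gauss(0.0, noise_scale) for p in init_qpos]
    qvel = [v + rng.gauss(0.0, noise_scale) for v in init_qvel]
    return qpos, qvel


def build_observation(qpos, qvel, n_hidden=1):
    """Concatenate positions and velocities, omitting the first n_hidden positions."""
    return list(qpos[n_hidden:]) + list(qvel)


# Hypothetical 3-DoF system: three positions and three velocities.
init_qpos = [0.0, 1.25, 0.0]
init_qvel = [0.0, 0.0, 0.0]

qpos, qvel = reset_state(init_qpos, init_qvel, rng=random.Random(0))
obs = build_observation(qpos, qvel, n_hidden=1)
print(len(obs))  # 5: two visible positions followed by three velocities
```

In this sketch the first position is hidden from the observation, so a learning algorithm would have to infer it indirectly, for example from the velocities and the reward signal.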