diff --git a/.github/workflows/build-docs.yml b/.github/workflows/build-docs.yml
index 7b5dad395..882e39626 100644
--- a/.github/workflows/build-docs.yml
+++ b/.github/workflows/build-docs.yml
@@ -8,22 +8,22 @@ jobs:
docs:
name: Generate Website
runs-on: ubuntu-latest
-
+
steps:
- uses: actions/checkout@v3
- uses: actions/setup-python@v4
with:
- python-version: '3.9'
+ python-version: '3.9'
- name: Install dependencies
run: pip install -r docs/requirements.txt
-
+
- name: Install Gymnasium
run: pip install mujoco && pip install .[atari,accept-rom-license,box2d]
-
+
- name: Build Envs Docs
- run: python docs/scripts/gen_mds.py
+ run: python docs/scripts/gen_mds.py && python docs/scripts/gen_envs_display.py
- name: Build
run: sphinx-build -b dirhtml -v docs _build
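Note on the new build step: `gen_envs_display.py` is presumably what produces the `list.html` / `complete_list.html` fragments that the renamed category pages now pull in via `{raw} html`, and that the updated `.gitignore` below starts excluding. The script itself is not part of this diff; the sketch below is only a guess at its shape, with the category list, file layout, and card markup all assumed rather than taken from the real script:

```python
# Hypothetical sketch of docs/scripts/gen_envs_display.py (not the actual script).
# Writes environments/<category>/list.html, which the category pages include
# through the MyST ``{raw} html`` / ``:file:`` directive.
from pathlib import Path

# Assumed category set; the real script may derive this from the docs tree or the registry.
CATEGORIES = ["atari", "box2d", "classic_control", "mujoco", "toy_text"]

DOCS_ENVS = Path(__file__).resolve().parent.parent / "environments"


def write_list_html(category: str) -> None:
    category_dir = DOCS_ENVS / category
    pages = sorted(p.stem for p in category_dir.glob("*.md"))
    cards = "\n".join(
        f'<a href="{category}/{name}"><div class="env-card">'
        f'{name.replace("_", " ").title()}</div></a>'
        for name in pages
    )
    (category_dir / "list.html").write_text(
        f'<div class="env-grid">\n{cards}\n</div>\n'
    )


if __name__ == "__main__":
    for category in CATEGORIES:
        write_list_html(category)
```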
diff --git a/docs/.gitignore b/docs/.gitignore
index 0f0266979..324bc98c4 100644
--- a/docs/.gitignore
+++ b/docs/.gitignore
@@ -2,4 +2,13 @@
__pycache__
.vscode/
build/
-_build/
\ No newline at end of file
+_build/
+
+environments/**/list.html
+environments/**/complete_list.html
+
+environments/box2d/*.md
+environments/classic_control/*.md
+environments/mujoco/*.md
+environments/third_party_environments/*.md
+environments/toy_text/*.md
\ No newline at end of file
diff --git a/docs/environments/atari/index.md b/docs/environments/atari.md
similarity index 88%
rename from docs/environments/atari/index.md
rename to docs/environments/atari.md
index ccdc71c6f..2ccb52610 100644
--- a/docs/environments/atari/index.md
+++ b/docs/environments/atari.md
@@ -9,75 +9,75 @@ A set of Atari 2600 environment simulated through Stella and the Arcade Learning
```{toctree}
:hidden:
-adventure
-air_raid
-alien
-amidar
-assault
-asterix
-asteroids
-atlantis
-bank_heist
-battle_zone
-beam_rider
-berzerk
-bowling
-boxing
-breakout
-carnival
-centipede
-chopper_command
-crazy_climber
-defender
-demon_attack
-double_dunk
-elevator_action
-enduro
-fishing_derby
-freeway
-frostbite
-gopher
-gravitar
-hero
-ice_hockey
-jamesbond
-journey_escape
-kangaroo
-krull
-kung_fu_master
-montezuma_revenge
-ms_pacman
-name_this_game
-phoenix
-pitfall
-pong
-pooyan
-private_eye
-qbert
-riverraid
-road_runner
-robotank
-seaquest
-skiing
-solaris
-space_invaders
-star_gunner
-tennis
-time_pilot
-tutankham
-up_n_down
-venture
-video_pinball
-wizard_of_wor
-yars_revenge
-zaxxon
+atari/adventure
+atari/air_raid
+atari/alien
+atari/amidar
+atari/assault
+atari/asterix
+atari/asteroids
+atari/atlantis
+atari/bank_heist
+atari/battle_zone
+atari/beam_rider
+atari/berzerk
+atari/bowling
+atari/boxing
+atari/breakout
+atari/carnival
+atari/centipede
+atari/chopper_command
+atari/crazy_climber
+atari/defender
+atari/demon_attack
+atari/double_dunk
+atari/elevator_action
+atari/enduro
+atari/fishing_derby
+atari/freeway
+atari/frostbite
+atari/gopher
+atari/gravitar
+atari/hero
+atari/ice_hockey
+atari/jamesbond
+atari/journey_escape
+atari/kangaroo
+atari/krull
+atari/kung_fu_master
+atari/montezuma_revenge
+atari/ms_pacman
+atari/name_this_game
+atari/phoenix
+atari/pitfall
+atari/pong
+atari/pooyan
+atari/private_eye
+atari/qbert
+atari/riverraid
+atari/road_runner
+atari/robotank
+atari/seaquest
+atari/skiing
+atari/solaris
+atari/space_invaders
+atari/star_gunner
+atari/tennis
+atari/time_pilot
+atari/tutankham
+atari/up_n_down
+atari/venture
+atari/video_pinball
+atari/wizard_of_wor
+atari/yars_revenge
+atari/zaxxon
```
```{raw} html
- :file: index.html
+ :file: atari/list.html
```
-Atari environments are simulated via the Arcade Learning Environment (ALE) [[1]](#1).
+Atari environments are simulated via the Arcade Learning Environment (ALE) [[1]](#1).
### AutoROM (installing the ROMs)
@@ -113,12 +113,12 @@ The action space is a subset of the following discrete set of legal actions:
| 17 | DOWNLEFTFIRE |
If you use v0 or v4 and the environment is initialized via `make`, the action space will usually be much smaller since most legal actions don't have
-any effect. Thus, the enumeration of the actions will differ. The action space can be expanded to the full
+any effect. Thus, the enumeration of the actions will differ. The action space can be expanded to the full
legal space by passing the keyword argument `full_action_space=True` to `make`.
-The reduced action space of an Atari environment may depend on the "flavor" of the game. You can specify the flavor by providing
+The reduced action space of an Atari environment may depend on the "flavor" of the game. You can specify the flavor by providing
the arguments `difficulty` and `mode` when constructing the environment. This documentation only provides details on the
-action spaces of default flavor choices.
+action spaces of default flavor choices.
### Observation Space
The observation issued by an Atari environment may be:
@@ -131,26 +131,26 @@ The exact reward dynamics depend on the environment and are usually documented i
find these manuals on [AtariAge](https://atariage.com/).
### Stochasticity
-It was pointed out in [[1]](#1) that Atari games are entirely deterministic. Thus, agents could achieve
+It was pointed out in [[1]](#1) that Atari games are entirely deterministic. Thus, agents could achieve
state of the art performance by simply memorizing an optimal sequence of actions while completely ignoring observations from the environment.
To avoid this, ALE implements sticky actions: Instead of always simulating the action passed to the environment, there is a small
probability that the previously executed action is used instead.
On top of this, Gymnasium implements stochastic frame skipping: In each environment step, the action is repeated for a random
-number of frames. This behavior may be altered by setting the keyword argument `frameskip` to either a positive integer or
-a tuple of two positive integers. If `frameskip` is an integer, frame skipping is deterministic, and in each step the action is
-repeated `frameskip` many times. Otherwise, if `frameskip` is a tuple, the number of skipped frames is chosen uniformly at
+number of frames. This behavior may be altered by setting the keyword argument `frameskip` to either a positive integer or
+a tuple of two positive integers. If `frameskip` is an integer, frame skipping is deterministic, and in each step the action is
+repeated `frameskip` many times. Otherwise, if `frameskip` is a tuple, the number of skipped frames is chosen uniformly at
random between `frameskip[0]` (inclusive) and `frameskip[1]` (exclusive) in each environment step.
### Common Arguments
-When initializing Atari environments via `gymnasium.make`, you may pass some additional arguments. These work for any
+When initializing Atari environments via `gymnasium.make`, you may pass some additional arguments. These work for any
Atari environment. However, legal values for `mode` and `difficulty` depend on the environment.
- **mode**: `int`. Game mode, see [[2]](#2). Legal values depend on the environment and are listed in the table above.
-- **difficulty**: `int`. Difficulty of the game, see [[2]](#2). Legal values depend on the environment and are listed in
+- **difficulty**: `int`. Difficulty of the game, see [[2]](#2). Legal values depend on the environment and are listed in
the table above. Together with `mode`, this determines the "flavor" of the game.
- **obs_type**: `str`. This argument determines what observations are returned by the environment. Its values are:
@@ -168,7 +168,7 @@ action space will be reduced to a subset.
- **render_mode**: `str`. Specifies the rendering mode. Its values are:
- human: We'll interactively display the screen and enable game sounds. This will lock emulation to the ROM's specified FPS
- rgb_array: We'll return the `rgb` key in step metadata with the current environment RGB frame.
-> It is highly recommended to specify `render_mode` during construction instead of calling `env.render()`.
+> It is highly recommended to specify `render_mode` during construction instead of calling `env.render()`.
> This will guarantee proper scaling, audio support, and proper framerates
@@ -282,15 +282,15 @@ the available modes and difficulty levels for different Atari games:
### References
(#1)=
-[1]
-MG Bellemare, Y Naddaf, J Veness, and M Bowling.
-"The arcade learning environment: An evaluation platform for general agents."
-Journal of Artificial Intelligence Research (2012).
+[1]
+MG Bellemare, Y Naddaf, J Veness, and M Bowling.
+"The arcade learning environment: An evaluation platform for general agents."
+Journal of Artificial Intelligence Research (2012).
(#2)=
-[2]
-Machado et al.
+[2]
+Machado et al.
"Revisiting the Arcade Learning Environment: Evaluation Protocols
-and Open Problems for General Agents"
-Journal of Artificial Intelligence Research (2018)
-URL: https://jair.org/index.php/jair/article/view/11182
\ No newline at end of file
+and Open Problems for General Agents"
+Journal of Artificial Intelligence Research (2018)
+URL: https://jair.org/index.php/jair/article/view/11182
\ No newline at end of file
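Aside on the Atari page above: the hunks only touch trailing whitespace and the toctree/raw-include paths, but since the prose documents `mode`, `difficulty`, `obs_type`, `full_action_space`, `frameskip`, and `render_mode`, here is a quick illustrative sanity check of those arguments. The env id and values are examples only and assume the Atari extras installed by the workflow above:

```python
import gymnasium as gym

# Illustrative values only; legal `mode`/`difficulty` combinations depend on the game.
env = gym.make(
    "ALE/Breakout-v5",
    obs_type="rgb",           # "rgb", "grayscale", or "ram"
    frameskip=(2, 5),         # tuple -> stochastic frame skipping; int -> deterministic
    full_action_space=False,  # keep the reduced, game-specific action set
    render_mode="rgb_array",
)

obs, info = env.reset(seed=0)
for _ in range(10):
    obs, reward, terminated, truncated, info = env.step(env.action_space.sample())
    if terminated or truncated:
        obs, info = env.reset()
env.close()
```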
diff --git a/docs/environments/atari/complete_list.html b/docs/environments/atari/complete_list.html
deleted file mode 100644
index 14ea1a2b1..000000000
--- a/docs/environments/atari/complete_list.html
+++ /dev/null
@@ -1,749 +0,0 @@
-
-
-
-
\ No newline at end of file
diff --git a/docs/environments/box2d/index.md b/docs/environments/box2d.md
similarity index 86%
rename from docs/environments/box2d/index.md
rename to docs/environments/box2d.md
index bfffe20a0..2d2de10c4 100644
--- a/docs/environments/box2d/index.md
+++ b/docs/environments/box2d.md
@@ -8,17 +8,17 @@ lastpage:
```{toctree}
:hidden:
-bipedal_walker
-car_racing
-lunar_lander
-```
-
-```{raw} html
- :file: index.html
+box2d/bipedal_walker
+box2d/car_racing
+box2d/lunar_lander
```
-
+
+```{raw} html
+ :file: box2d/list.html
+```
+
These environments all involve toy games based around physics control, using [box2d](https://box2d.org/) based physics and PyGame based rendering. These environments were contributed back in the early days of Gymnasium by Oleg Klimov, and have become popular toy benchmarks ever since. All environments are highly configurable via arguments specified in each environment's documentation.
-
+
The unique dependencies for this set of environments can be installed via:
````bash
diff --git a/docs/environments/box2d/.gitkeep b/docs/environments/box2d/.gitkeep
new file mode 100644
index 000000000..e69de29bb
diff --git a/docs/environments/box2d/index.html b/docs/environments/box2d/index.html
deleted file mode 100644
index 1a16f1c17..000000000
--- a/docs/environments/box2d/index.html
+++ /dev/null
@@ -1,41 +0,0 @@
-
-
-
-
\ No newline at end of file
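Similar spot-check for the Box2D page renamed above, which notes that every environment is configurable through constructor arguments. The keyword arguments below are the ones documented on the LunarLander page, not anything introduced by this diff, and they require the `box2d` extra:

```python
import gymnasium as gym

# Illustrative only; requires `pip install gymnasium[box2d]`.
env = gym.make(
    "LunarLander-v2",
    continuous=True,   # continuous instead of discrete actions
    gravity=-10.0,
    enable_wind=True,
    wind_power=15.0,
)
obs, info = env.reset(seed=0)
env.close()
```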
diff --git a/docs/environments/mujoco/index.md b/docs/environments/mujoco.md
similarity index 98%
rename from docs/environments/mujoco/index.md
rename to docs/environments/mujoco.md
index ff2b3247a..e9bf3a11d 100644
--- a/docs/environments/mujoco/index.md
+++ b/docs/environments/mujoco.md
@@ -21,7 +21,7 @@ walker2d
```
```{raw} html
- :file: index.html
+ :file: mujoco/list.html
```
MuJoCo stands for Multi-Joint dynamics with Contact. It is a physics engine for facilitating research and development in robotics, biomechanics, graphics and animation, and other areas where fast and accurate simulation is needed.
diff --git a/docs/environments/mujoco/.gitkeep b/docs/environments/mujoco/.gitkeep
new file mode 100644
index 000000000..e69de29bb
diff --git a/docs/environments/mujoco/index.html b/docs/environments/mujoco/index.html
deleted file mode 100644
index bab47bd06..000000000
--- a/docs/environments/mujoco/index.html
+++ /dev/null
@@ -1,125 +0,0 @@
-
-
-
-
\ No newline at end of file
diff --git a/docs/environments/third_party_environments/index.md b/docs/environments/third_party_environments.md
similarity index 100%
rename from docs/environments/third_party_environments/index.md
rename to docs/environments/third_party_environments.md
diff --git a/docs/environments/toy_text/index.md b/docs/environments/toy_text.md
similarity index 71%
rename from docs/environments/toy_text/index.md
rename to docs/environments/toy_text.md
index eb7cd344b..dfd4d06b1 100644
--- a/docs/environments/toy_text/index.md
+++ b/docs/environments/toy_text.md
@@ -8,18 +8,18 @@ lastpage:
```{toctree}
:hidden:
-blackjack.md
-taxi.md
-cliff_walking.md
-frozen_lake.md
+toy_text/blackjack.md
+toy_text/taxi.md
+toy_text/cliff_walking.md
+toy_text/frozen_lake.md
```
```{raw} html
- :file: index.html
+ :file: toy_text/list.html
```
-All toy text environments were created by us using native Python libraries such as StringIO.
+All toy text environments were created by us using native Python libraries such as StringIO.
-These environments are designed to be extremely simple, with small discrete state and action spaces, and hence easy to learn. As a result, they are suitable for debugging implementations of reinforcement learning algorithms.
+These environments are designed to be extremely simple, with small discrete state and action spaces, and hence easy to learn. As a result, they are suitable for debugging implementations of reinforcement learning algorithms.
All environments are configurable via arguments specified in each environment's documentation.
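Since the toy-text page pitches these environments as small, discrete debugging targets for RL implementations, a minimal tabular Q-learning loop on FrozenLake is the kind of use it has in mind. The sketch below is illustrative only (hyperparameters are arbitrary) and is not part of the docs build:

```python
import gymnasium as gym
import numpy as np

# Minimal tabular Q-learning on a toy-text env; hyperparameters are arbitrary.
env = gym.make("FrozenLake-v1", is_slippery=False)
q = np.zeros((env.observation_space.n, env.action_space.n))
alpha, gamma, epsilon = 0.1, 0.99, 0.1

for _ in range(2000):
    state, _ = env.reset()
    done = False
    while not done:
        # Epsilon-greedy action selection over the small discrete action space.
        if np.random.rand() < epsilon:
            action = env.action_space.sample()
        else:
            action = int(np.argmax(q[state]))
        next_state, reward, terminated, truncated, _ = env.step(action)
        q[state, action] += alpha * (reward + gamma * np.max(q[next_state]) - q[state, action])
        state, done = next_state, terminated or truncated

print(q.round(2))
```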
diff --git a/docs/environments/toy_text/.gitkeep b/docs/environments/toy_text/.gitkeep
new file mode 100644
index 000000000..e69de29bb
diff --git a/docs/environments/toy_text/index.html b/docs/environments/toy_text/index.html
deleted file mode 100644
index 84adffbae..000000000
--- a/docs/environments/toy_text/index.html
+++ /dev/null
@@ -1,29 +0,0 @@
-
-