# Cliff Walking
This environment is part of the Toy Text environments. Please read that page first for general information.
| | |
|---|---|
| Action Space | Discrete(4) |
| Observation Space | Discrete(48) |
| Import | `gymnasium.make("CliffWalking-v0")` |
This is a simple implementation of the Gridworld Cliff reinforcement learning task.
Adapted from Example 6.6 (page 106) of *Reinforcement Learning: An Introduction* by Sutton and Barto, with inspiration from [dennybritz/reinforcement-learning](https://github.com/dennybritz/reinforcement-learning/blob/master/lib/envs/cliff_walking.py).
## Description
The board is a 4x12 matrix, with (using NumPy matrix indexing):
- [3, 0] as the start at bottom-left
- [3, 11] as the goal at bottom-right
- [3, 1..10] as the cliff at bottom-center
If the agent steps on the cliff, it returns to the start. An episode terminates when the agent reaches the goal.
## Actions
There are 4 discrete deterministic actions:
- 0: move up
- 1: move right
- 2: move down
- 3: move left
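For concreteness, here is a minimal sketch (assuming `gymnasium` is installed) that uses these action indices to walk the shortest path along the cliff edge; the expected return of -13 follows from the layout and per-step rewards described on this page:

```python
import gymnasium as gym

UP, RIGHT, DOWN, LEFT = 0, 1, 2, 3  # action indices as listed above

env = gym.make("CliffWalking-v0")
obs, info = env.reset()

# Shortest route: step up off the start, move 11 cells right along the
# row above the cliff, then step down onto the goal (13 moves in total).
total_reward = 0
for action in [UP] + [RIGHT] * 11 + [DOWN]:
    obs, reward, terminated, truncated, info = env.step(action)
    total_reward += reward

print(total_reward, terminated)  # expected: -13 True
```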
## Observations

There are 3x12 + 1 = 37 possible states: the agent can never be on the cliff, and it never observes the goal (reaching it ends the episode), which leaves every position in the first 3 rows plus the bottom-left cell. The observation is simply the current position encoded as a flattened index.
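To illustrate the encoding (a sketch; `to_state` is a hypothetical helper, not part of the environment's API): the flattened index of position `[row, col]` on the 4x12 grid is `row * 12 + col`, i.e. row-major flattening:

```python
import numpy as np

nrows, ncols = 4, 12

def to_state(row, col):
    # Hypothetical helper: row-major flattening of the 4x12 grid,
    # equivalent to np.ravel_multi_index((row, col), (nrows, ncols)).
    return row * ncols + col

print(to_state(3, 0))   # 36 -- the start state
print(to_state(3, 11))  # 47 -- the goal state (never actually observed)
assert to_state(2, 5) == np.ravel_multi_index((2, 5), (nrows, ncols))
```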
## Reward
Each time step incurs -1 reward, and stepping into the cliff incurs -100 reward.
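The cliff penalty is easy to observe directly (again a minimal sketch assuming `gymnasium` is installed): moving right from the start steps onto the cliff, yielding -100 and sending the agent back to the start without ending the episode:

```python
import gymnasium as gym

env = gym.make("CliffWalking-v0")
obs, info = env.reset()
print(obs)  # 36 -- the start state [3, 0]

# Action 1 (move right) from the start lands on the cliff:
obs, reward, terminated, truncated, info = env.step(1)
print(obs, reward, terminated)  # expected: 36 -100 False
```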
## Arguments

```python
gymnasium.make('CliffWalking-v0')
```
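Beyond the environment id, `gymnasium.make` accepts its standard keyword arguments. For example, assuming the text renderer behaves as in the other Toy Text environments, `render_mode="ansi"` makes `render()` return the grid as a string:

```python
import gymnasium as gym

# render_mode is a standard gymnasium.make keyword argument.
env = gym.make("CliffWalking-v0", render_mode="ansi")
env.reset()
print(env.render())  # text rendering of the 4x12 grid
```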
## Version History
- v0: Initial version release