Commit d28a719

Update v5 doc
1 parent c34659f commit d28a719

File tree

1 file changed

+12
-12
lines changed

gymnasium_robotics/envs/maze/ant_maze_v5.py

Lines changed: 12 additions & 12 deletions
@@ -1,4 +1,4 @@
-"""A maze environment with the Gymnasium Ant agent (https://github.com/Farama-Foundation/Gymnasium/blob/main/gymnasium/envs/mujoco/ant_v4.py).
+"""A maze environment with the Gymnasium Ant agent (https://github.com/Farama-Foundation/Gymnasium/blob/main/gymnasium/envs/mujoco/ant_v5.py).
 
 The code is inspired by the D4RL repository hosted on GitHub (https://github.com/Farama-Foundation/D4RL), published in the paper
 'D4RL: Datasets for Deep Data-Driven Reinforcement Learning' by Justin Fu, Aviral Kumar, Ofir Nachum, George Tucker, Sergey Levine.
@@ -40,22 +40,22 @@ class AntMazeEnv(MazeEnv, EzPickle):
 #### Maze size
 The map variations for the mazes are the same as for `PointMaze`. The ant environments with fixed goal and reset locations are the following:
 
-* `AntMaze_UMaze-v4`
-* `AntMaze_BigMaze-v4`
-* `AntMaze_HardestMaze-v4`
+* `AntMaze_UMaze-v5`
+* `AntMaze_BigMaze-v5`
+* `AntMaze_HardestMaze-v5`
 
 #### Diverse goal mazes
 The environments with fixed reset position for the ant and randomly selected goals, also known as diverse goal, are:
 
-* `AntMaze_BigMaze_DG-v4`
-* `AntMaze_HardestMaze_DG-v4`
+* `AntMaze_BigMaze_DG-v5`
+* `AntMaze_HardestMaze_DG-v5`
 
 #### Diverse goal and reset mazes
 
 Finally, the environments that select the reset and goal locations randomly are:
 
-* `AntMaze_BigMaze_DGR-v4`
-* `AntMaze_HardestMaze_DGR-v4`
+* `AntMaze_BigMaze_DGR-v5`
+* `AntMaze_HardestMaze_DGR-v5`
 
 #### Custom maze
 Also, any of the `AntMaze` environments can be initialized with a custom maze map by setting the `maze_map` argument like follows:
@@ -70,7 +70,7 @@ class AntMazeEnv(MazeEnv, EzPickle):
 [1, C, 0, C, 1],
 [1, 1, 1, 1, 1]]
 
-env = gym.make('AntMaze_UMaze-v4', maze_map=example_map)
+env = gym.make('AntMaze_UMaze-v5', maze_map=example_map)
 ```
 
 ### Action Space
@@ -153,8 +153,8 @@ class AntMazeEnv(MazeEnv, EzPickle):
 - *sparse*: the returned reward can have two values: `0` if the ant hasn't reached its final target position, and `1` if the ant is in the final target position (the ant is considered to have reached the goal if the Euclidean distance between both is lower than 0.5 m).
 - *dense*: the returned reward is the negative Euclidean distance between the achieved goal position and the desired goal.
 
-To initialize this environment with one of the mentioned reward functions the type of reward must be specified in the id string when the environment is initialized. For `sparse` reward the id is the default of the environment, `AntMaze_UMaze-v4`. However, for `dense`
-reward the id must be modified to `AntMaze_UMazeDense-v4` and initialized as follows:
+To initialize this environment with one of the mentioned reward functions the type of reward must be specified in the id string when the environment is initialized. For `sparse` reward the id is the default of the environment, `AntMaze_UMaze-v5`. However, for `dense`
+reward the id must be modified to `AntMaze_UMazeDense-v5` and initialized as follows:
 
 ```python
 import gymnasium as gym
@@ -197,7 +197,7 @@ class AntMazeEnv(MazeEnv, EzPickle):
 
 gym.register_envs(gymnasium_robotics)
 
-env = gym.make('AntMaze_UMaze-v4', max_episode_steps=100)
+env = gym.make('AntMaze_UMaze-v5', max_episode_steps=100)
 ```
 
 ### Version History
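To make the docstring's custom-maze snippet concrete, here is a small self-contained sketch of the `maze_map` layout format (the 3-row map below is an illustration, not the docstring's full `example_map`; that `1` marks a wall, `0` a free cell, and `"c"` a combined reset/goal cell is an assumption about the maze-map conventions). The `gym.make` calls are kept as comments so the snippet runs without MuJoCo installed:

```python
# Illustrative maze layout in the docstring's maze_map format.
# Assumptions: 1 = wall, 0 = free cell, "c" = cell usable as both
# reset and goal location (the docstring's `C` shorthand).
C = "c"
example_map = [
    [1, 1, 1, 1, 1],
    [1, C, 0, C, 1],
    [1, 1, 1, 1, 1],
]

# Sanity-check the layout: rectangular, with a fully walled border.
assert all(len(row) == len(example_map[0]) for row in example_map)
assert all(cell == 1 for cell in example_map[0] + example_map[-1])

# With gymnasium and gymnasium-robotics installed, the v5 ids from
# this commit would be used as in the docstring:
#
#   import gymnasium as gym
#   import gymnasium_robotics
#   gym.register_envs(gymnasium_robotics)
#   env = gym.make("AntMaze_UMaze-v5", maze_map=example_map,
#                  max_episode_steps=100)
```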
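The sparse/dense reward rules quoted in the docstring can be sketched as a tiny helper; this is only an illustration of the documented behaviour, not the library's actual implementation (the names `maze_reward` and `threshold` are made up for this sketch):

```python
import math

def maze_reward(achieved, desired, reward_type="sparse", threshold=0.5):
    """Reward rules as documented for the AntMaze environments:
    sparse -> 1.0 when the Euclidean distance between achieved and
    desired goal is lower than `threshold` (0.5 m), else 0.0;
    dense  -> negative Euclidean distance to the desired goal."""
    dist = math.dist(achieved, desired)
    if reward_type == "sparse":
        return 1.0 if dist < threshold else 0.0
    return -dist
```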
