Environment

class maze_world.envs.MazeEnv

The main maze environment class, used as the base for implementing different maze environments.

The class encapsulates maze environments with arbitrary behind-the-scenes dynamics through the step() and reset() functions.

Example:

>>> import gymnasium as gym
>>> import numpy as np
>>> from maze_world.envs import MazeEnv
>>> def generate_maze_fn():
...     # 1 marks a wall, 0 marks floor
...     maze_map = np.array(
...         [
...             [1, 1, 1, 1, 1, 1, 1],
...             [1, 0, 0, 0, 0, 0, 1],
...             [1, 0, 0, 0, 0, 0, 1],
...             [1, 0, 0, 0, 0, 0, 1],
...             [1, 1, 1, 1, 1, 1, 1],
...         ]
...     )
...     agent_loc = np.array([1, 1])
...     target_loc = np.array([3, 5])
...     return maze_map, agent_loc, target_loc
>>> env = MazeEnv(generate_maze_fn, render_mode=None, maze_width=5, maze_height=7)

metadata: dict[str, Any] = {'render_fps': 4, 'render_modes': ['human', 'rgb_array']}

__init__(generate_maze_fn, render_mode=None, maze_width=None, maze_height=None)

Parameters:

generate_maze_fn (callable) –
This function is called on every reset of the environment and is expected to return three items in the following order:
- maze map: numpy array of the map, where “1” represents a wall and “0” represents floor.
- agent location: tuple (x, y) giving the location of the agent.
- target location: tuple (x, y) giving the location of the agent’s target.
render_mode (Optional[str]) –
Specifies one of the following:
- None (default): No render is computed.
- “human”: The environment is continuously rendered in the current display or terminal, usually for human consumption. This rendering should occur during step(), so render() doesn’t need to be called. Returns None.
- “rgb_array”: Return a single frame representing the current state of the environment. A frame is a np.ndarray with shape (x, y, 3) representing RGB values for an x-by-y pixel image.
- “ansi”: Return a string (str) or StringIO.StringIO containing a terminal-style text representation for each time step. The text can include newlines and ANSI escape sequences (e.g. for colors).
- “rgb_array_list” and “ansi_list”: List-based versions of the render modes (except “human”) are available through the wrapper gymnasium.wrappers.RenderCollection, which is automatically applied during gymnasium.make(…, render_mode=“rgb_array_list”). The collected frames are popped after render() or reset() is called.
maze_width (Optional[int]) – The width of the maze
maze_height (Optional[int]) – The height of the maze
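
The contract for generate_maze_fn described above can be checked before handing the function to the environment. The following is a minimal sketch, not part of maze_world: the helper name check_maze_output is hypothetical, and it assumes that a location indexes the map as maze_map[loc[0]][loc[1]], which matches the example at the top of this page. It works on nested lists as well as numpy arrays.

```python
def check_maze_output(maze_map, agent_loc, target_loc):
    """Hypothetical helper: raise ValueError if the three items returned by a
    generate_maze_fn break the documented contract (1 = wall, 0 = floor)."""
    n_rows = len(maze_map)
    n_cols = len(maze_map[0])

    # Every cell must be a wall (1) or floor (0), and rows must be equal length.
    for row in maze_map:
        if len(row) != n_cols:
            raise ValueError("maze map rows must all have the same length")
        for cell in row:
            if cell not in (0, 1):
                raise ValueError(f"unexpected cell value {cell!r}; expected 0 or 1")

    # Assumption: a location is used as maze_map[loc[0]][loc[1]].
    for name, loc in (("agent", agent_loc), ("target", target_loc)):
        i, j = int(loc[0]), int(loc[1])
        if not (0 <= i < n_rows and 0 <= j < n_cols):
            raise ValueError(f"{name} location {(i, j)} is outside the map")
        if maze_map[i][j] != 0:
            raise ValueError(f"{name} location {(i, j)} is not on a floor cell")


# The 5-by-7 map from the class example, written as plain lists.
example_map = [
    [1, 1, 1, 1, 1, 1, 1],
    [1, 0, 0, 0, 0, 0, 1],
    [1, 0, 0, 0, 0, 0, 1],
    [1, 0, 0, 0, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 1],
]
check_maze_output(example_map, (1, 1), (3, 5))  # passes silently
```

Running such a check once on the output of generate_maze_fn() can surface a malformed map before the first reset() does.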

property action_space

Specifies the available discrete actions for the environment. The four actions are:

“right”
“up”
“left”
“down”

Returns: Discrete action space object representing the possible actions.
Return type: gym.spaces.Discrete
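
To illustrate how a four-way discrete action space is typically turned into movement, here is a small sketch. It is an assumption, not the library's implementation: it assumes the action indices 0 to 3 follow the order in which the actions are listed above, and the (dx, dy) deltas and the helper apply_action are hypothetical. Wall collisions are deliberately left out.

```python
# Assumed index order: the order in which the actions are listed above.
ACTION_NAMES = ("right", "up", "left", "down")

# Hypothetical (dx, dy) deltas; the real environment may use another convention.
ACTION_DELTAS = {
    0: (1, 0),    # right
    1: (0, 1),    # up
    2: (-1, 0),   # left
    3: (0, -1),   # down
}


def apply_action(position, action):
    """Return the position reached by taking `action` from `position`,
    ignoring walls and map boundaries."""
    dx, dy = ACTION_DELTAS[action]
    return (position[0] + dx, position[1] + dy)


new_position = apply_action((1, 1), 0)  # one step "right" under the assumed deltas
```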

property observation_space

Defines the observation space of the 2D maze environment.

The observation space consists of two elements:
- ‘agent’: Represents the position of the agent in the maze.
- ‘target’: Represents the position of the target in the maze.

In the 2D maze:

0 corresponds to an empty floor.
1 corresponds to a wall.
2 corresponds to the agent or the target.

Returns: Dictionary containing the observation space for the agent and the target.
- ‘agent’: gym.spaces.Box object representing the agent’s position.
- ‘target’: gym.spaces.Box object representing the target’s position.
Return type: gym.spaces.Dict

reset(seed=None, options=None)

Resets the environment to its initial state and generates a new maze configuration by calling generate_maze_fn().

Parameters:

seed (Optional[int]) – Seed for the random number generator. Defaults to None.
options – Unused parameter.

Returns:

observation: Agent’s observation of the initial environment state.
info (dict): Additional information about the environment.

Return type:

tuple

Raises:

ValueError – If the shape of the maze generated by generate_maze_fn() doesn’t match the specified maze width and height.

step(action)

Take a step in the environment.

Parameters:

action (int) – The action to take.

Returns:

observation: Agent’s current observation of the environment.
reward (float): Reward received after taking the step.
terminated (bool): Whether the episode has terminated or not.
truncated (bool): Whether the episode has been truncated due to max episode steps.
info (dict): Additional information about the step.

Return type:

tuple
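
The return values of reset() and step() documented above are usually consumed in a loop like the one below. This is a sketch under stated assumptions: StubEnv is a hypothetical stand-in that only mimics the documented return shapes (it is not MazeEnv, and its observations and rewards are made up), and run_episode is an illustrative helper rather than part of maze_world.

```python
class StubEnv:
    """Hypothetical stand-in with the documented reset()/step() return shapes."""

    def __init__(self, goal_steps=3):
        self.goal_steps = goal_steps
        self.t = 0

    def reset(self, seed=None, options=None):
        self.t = 0
        observation = {"agent": (1, 1), "target": (3, 5)}  # made-up observation
        return observation, {}

    def step(self, action):
        self.t += 1
        terminated = self.t >= self.goal_steps  # pretend the target is reached
        reward = 1.0 if terminated else 0.0     # made-up reward scheme
        observation = {"agent": (1, 1), "target": (3, 5)}
        return observation, reward, terminated, False, {}


def run_episode(env, policy, max_steps=100, seed=None):
    """Run one episode and return (total_reward, number_of_steps)."""
    observation, info = env.reset(seed=seed)
    total_reward = 0.0
    steps = 0
    for _ in range(max_steps):
        action = policy(observation)
        observation, reward, terminated, truncated, info = env.step(action)
        total_reward += reward
        steps += 1
        # Stop on either flag: the episode ended or the step limit was hit.
        if terminated or truncated:
            break
    return total_reward, steps


total_reward, steps = run_episode(StubEnv(goal_steps=3), policy=lambda obs: 0)
```

The same run_episode function should work with any environment whose reset() and step() return the tuples described in this section.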

render()

Compute the render frames as specified by render_mode during the initialization of the environment.

close()

Closes the environment.

This method shuts down the Pygame display if it was initialized.

class maze_world.envs.RandomMazeEnv

Extends the MazeEnv class to create random mazes of specified sizes at each reset.

Example:

>>> from maze_world.envs import RandomMazeEnv
>>> env = RandomMazeEnv(maze_width=5, maze_height=7)

__init__(render_mode=None, maze_width=11, maze_height=11)

Parameters:

render_mode (Optional[str]) –
Specifies one of the following:
- None (default): No render is computed.
- “human”: The environment is continuously rendered in the current display or terminal, usually for human consumption. This rendering should occur during step(), so render() doesn’t need to be called. Returns None.
- “rgb_array”: Return a single frame representing the current state of the environment. A frame is a np.ndarray with shape (x, y, 3) representing RGB values for an x-by-y pixel image.
- “ansi”: Return a string (str) or StringIO.StringIO containing a terminal-style text representation for each time step. The text can include newlines and ANSI escape sequences (e.g. for colors).
- “rgb_array_list” and “ansi_list”: List-based versions of the render modes (except “human”) are available through the wrapper gymnasium.wrappers.RenderCollection, which is automatically applied during gymnasium.make(…, render_mode=“rgb_array_list”). The collected frames are popped after render() or reset() is called.
maze_width (int) – The width of the maze.
maze_height (int) – The height of the maze.

Raises:

ValueError – If the width or height of the maze is not odd.