Environment

class maze_world.envs.MazeEnv

The main maze environment class, used as the base for implementing different maze environments.

The class encapsulates maze environments with arbitrary behind-the-scenes dynamics through the step() and reset() functions.

Example:
>>> import gymnasium as gym
>>> import numpy as np
>>> from maze_world.envs import MazeEnv
>>> def generate_maze_fn():
...     maze_map = np.array(
...         [
...             [1, 1, 1, 1, 1, 1, 1],
...             [1, 0, 0, 0, 0, 0, 1],
...             [1, 0, 0, 0, 0, 0, 1],
...             [1, 0, 0, 0, 0, 0, 1],
...             [1, 1, 1, 1, 1, 1, 1],
...         ]
...     )
...     agent_loc = np.array([1, 1])
...     target_loc = np.array([3, 5])
...     return maze_map, agent_loc, target_loc
>>> env = MazeEnv(generate_maze_fn, None, 5, 7)
metadata: dict[str, Any] = {'render_fps': 4, 'render_modes': ['human', 'rgb_array']}
__init__(generate_maze_fn, render_mode=None, maze_width=None, maze_height=None)
Parameters:
  • generate_maze_fn (callable) –

    This function is called on every reset of the environment and is expected to return three items in the following order:

    • maze map: a NumPy array in which 1 represents a wall and 0 represents floor.

    • agent location: tuple (x, y) giving the location of the agent.

    • target location: tuple (x, y) giving the location of the target.

  • render_mode (Optional[str]) –

    Specify one of the following:

    • None (default): no render is computed.

    • “human”: The environment is continuously rendered in the current display or terminal, usually for human consumption. This rendering should occur during step() and render() doesn’t need to be called. Returns None.

    • “rgb_array”: Return a single frame representing the current state of the environment. A frame is a np.ndarray with shape (x, y, 3) representing RGB values for an x-by-y pixel image.

    • “ansi”: Return a string (str) or StringIO.StringIO containing a terminal-style text representation for each time step. The text can include newlines and ANSI escape sequences (e.g. for colors).

    • “rgb_array_list” and “ansi_list”: List-based versions of the render modes (except “human”) are available through the wrapper gymnasium.wrappers.RenderCollection, which is automatically applied during gymnasium.make(…, render_mode="rgb_array_list"). The collected frames are popped after render() or reset() is called.

  • maze_width (Optional[int]) – The width of the maze

  • maze_height (Optional[int]) – The height of the maze
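The generate_maze_fn contract described above can be sketched as follows. This is a minimal illustration, not part of maze_world: the factory name make_open_room_fn, its arguments, and the use of a NumPy random generator are assumptions made for the example. Locations are returned as array indices into the map, so whether the library reads them as (x, y) or (row, column) should be checked before relying on this.

```python
import numpy as np


def make_open_room_fn(n_rows, n_cols, seed=None):
    """Return a hypothetical generate_maze_fn for an empty walled room."""
    rng = np.random.default_rng(seed)

    def generate_maze_fn():
        # 1 marks a wall and 0 marks floor, as in the parameter description.
        maze_map = np.ones((n_rows, n_cols), dtype=int)
        maze_map[1:-1, 1:-1] = 0

        # Pick two distinct floor cells for the agent and the target.
        floor_cells = np.argwhere(maze_map == 0)
        agent_idx, target_idx = rng.choice(len(floor_cells), size=2, replace=False)
        agent_loc = floor_cells[agent_idx]
        target_loc = floor_cells[target_idx]
        return maze_map, agent_loc, target_loc

    return generate_maze_fn


generate_maze_fn = make_open_room_fn(5, 7, seed=0)
maze_map, agent_loc, target_loc = generate_maze_fn()
```

Because the function is called on every reset, keeping the random generator in the enclosing scope gives a different agent and target placement on each call while staying reproducible for a fixed seed.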

property action_space
Specifies the available discrete actions for the environment:
  1. “right”

  2. “up”

  3. “left”

  4. “down”

Returns:

Discrete action space object representing the possible actions.

Return type:

gym.spaces.Discrete
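A discrete action is commonly decoded into a grid displacement with a lookup table. The sketch below shows one way to do that. It assumes zero-based action indices in the order listed above (gymnasium's Discrete(n) starts at 0 by default), (x, y) displacements with y increasing for “up”, and that a move into a wall leaves the agent in place. ACTION_TO_DIRECTION and move are hypothetical names, and the library's internal conventions may differ.

```python
# Hypothetical lookup table: action index -> (dx, dy) displacement.
ACTION_TO_DIRECTION = {
    0: (1, 0),   # right
    1: (0, 1),   # up
    2: (-1, 0),  # left
    3: (0, -1),  # down
}


def move(position, action, maze_map):
    """Return the new (x, y) position, staying in place if a wall blocks the move."""
    dx, dy = ACTION_TO_DIRECTION[action]
    x, y = position[0] + dx, position[1] + dy
    # maze_map is indexed as maze_map[x][y] here; 1 is a wall, 0 is floor.
    # The outer border is assumed to be all walls, so indices stay in range.
    if maze_map[x][y] == 1:
        return position
    return (x, y)


maze_map = [
    [1, 1, 1, 1],
    [1, 0, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
]
new_position = move((1, 1), 0, maze_map)      # step right onto floor
blocked_position = move((1, 1), 2, maze_map)  # step left into a wall
```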

property observation_space

Defines the observation space of the 2D maze environment.

The observation space consists of two elements:
  • ‘agent’: Represents the position of the agent in the maze.

  • ‘target’: Represents the position of the target in the maze.

In the 2D maze:
  • 0 corresponds to an empty floor.

  • 1 corresponds to a wall.

  • 2 corresponds to the agent or the target.

Returns:

Dictionary containing the observation space for the agent and the target:
  • ‘agent’: gym.spaces.Box object representing the agent’s position.

  • ‘target’: gym.spaces.Box object representing the target’s position.

Return type:

gym.spaces.Dict

reset(seed=None, options=None)

Resets the environment to its initial state and generates a new maze configuration by calling generate_maze_fn().

Parameters:
  • seed (Optional[int]) – Seed for the random number generator. Defaults to None.

  • options – Unused parameter.

Returns:

  • observation: Agent’s observation of the initial environment state.

  • info (dict): Additional information about the environment.

Return type:

tuple

Raises:

ValueError – If the shape of the maze generated by generate_maze_fn() doesn’t match the specified maze width and height.
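The shape check that leads to this ValueError can be sketched as below. This is a hedged illustration rather than the library's code: check_maze_shape is a hypothetical helper, and the comparison order (maze_width, maze_height) is inferred only from the MazeEnv example, which passes 5 and 7 for an array with 5 rows and 7 columns.

```python
import numpy as np


def check_maze_shape(maze_map, maze_width, maze_height):
    """Raise ValueError if maze_map does not have the expected shape."""
    expected = (maze_width, maze_height)  # assumed ordering, see the note above
    if maze_map.shape != expected:
        raise ValueError(
            f"maze shape {maze_map.shape} does not match expected {expected}"
        )


# A 5x7 array matches maze_width=5, maze_height=7 under this assumption.
check_maze_shape(np.zeros((5, 7)), maze_width=5, maze_height=7)

# A transposed array is rejected.
try:
    check_maze_shape(np.zeros((7, 5)), maze_width=5, maze_height=7)
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
```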

step(action)

Take a step in the environment.

Parameters:

action (int) – The action to take.

Returns:

  • observation: Agent’s current observation of the environment.

  • reward (float): Reward received after taking the step.

  • terminated (bool): Whether the episode has terminated or not.

  • truncated (bool): Whether the episode has been truncated due to max episode steps.

  • info (dict): Additional information about the step.

Return type:

tuple
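The return values of reset() and step() are typically consumed in a rollout loop like the one sketched below. To keep the example runnable without maze_world installed, it uses StubEnv, a hypothetical stand-in that only mimics the documented return shapes; run_episode and the constant policy are likewise illustrative names, not part of the library.

```python
class StubEnv:
    """Hypothetical stand-in that follows the documented reset()/step() return shapes."""

    def __init__(self, goal=3):
        self.goal = goal
        self.position = 0

    def reset(self, seed=None, options=None):
        self.position = 0
        return {"agent": self.position, "target": self.goal}, {}

    def step(self, action):
        # Action 0 moves one cell toward the goal; any other action does nothing.
        self.position += 1 if action == 0 else 0
        terminated = self.position == self.goal
        reward = 1.0 if terminated else 0.0
        observation = {"agent": self.position, "target": self.goal}
        return observation, reward, terminated, False, {}


def run_episode(env, policy, max_steps=100):
    """Roll out one episode and return (total_reward, steps_taken)."""
    observation, info = env.reset()
    total_reward, steps = 0.0, 0
    for _ in range(max_steps):
        action = policy(observation)
        observation, reward, terminated, truncated, info = env.step(action)
        total_reward += reward
        steps += 1
        # An episode ends on either termination or truncation.
        if terminated or truncated:
            break
    return total_reward, steps


total_reward, steps = run_episode(StubEnv(goal=3), policy=lambda obs: 0)
```

With a real environment, the same run_episode function would be called with a MazeEnv instance in place of StubEnv.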

render()

Compute the render frames as specified by render_mode during the initialization of the environment.

close()

Closes the environment.

This method shuts down the Pygame display if it was initialized.

class maze_world.envs.RandomMazeEnv

Extends the MazeEnv class to create random mazes of specified sizes at each reset.

Example:
>>> from maze_world.envs import RandomMazeEnv
>>> env = RandomMazeEnv(maze_width=5, maze_height=7)
__init__(render_mode=None, maze_width=11, maze_height=11)
Parameters:
  • render_mode (Optional[str]) –

    Specify one of the following:

    • None (default): No render is computed.

    • “human”: The environment is continuously rendered in the current display or terminal, usually for human consumption. This rendering should occur during step() and render() doesn’t need to be called. Returns None.

    • “rgb_array”: Return a single frame representing the current state of the environment. A frame is a np.ndarray with shape (x, y, 3) representing RGB values for an x-by-y pixel image.

    • “ansi”: Return a string (str) or StringIO.StringIO containing a terminal-style text representation for each time step. The text can include newlines and ANSI escape sequences (e.g. for colors).

    • “rgb_array_list” and “ansi_list”: List-based versions of the render modes (except “human”) are available through the wrapper gymnasium.wrappers.RenderCollection, which is automatically applied during gymnasium.make(…, render_mode="rgb_array_list"). The collected frames are popped after render() or reset() is called.

  • maze_width (int) – The width of the maze.

  • maze_height (int) – The height of the maze.

Raises:

ValueError – If the width or height of the maze is not odd.
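The odd-size requirement can be checked up front before constructing the environment. The helper below is a minimal sketch with a hypothetical name, check_odd_dimensions; it mirrors the documented condition but is not the library's implementation. Grid-based maze generators often need odd dimensions so that cells and the walls between them alternate cleanly, though whether that is the reason here is an assumption.

```python
def check_odd_dimensions(maze_width, maze_height):
    """Raise ValueError unless both dimensions are odd."""
    if maze_width % 2 == 0 or maze_height % 2 == 0:
        raise ValueError(
            f"maze_width and maze_height must be odd, got {maze_width} and {maze_height}"
        )


check_odd_dimensions(11, 11)  # the documented defaults pass

try:
    check_odd_dimensions(10, 11)
    even_rejected = False
except ValueError:
    even_rejected = True
```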