Environment
- class maze_world.envs.MazeEnv
The main Maze Environment class for implementing different maze environments.
The class encapsulates maze environments with arbitrary behind-the-scenes dynamics through the step() and reset() functions.
- Example:
>>> import gymnasium as gym
>>> import numpy as np
>>> from maze_world.envs import MazeEnv
>>> def generate_maze_fn():
...     # 1 = wall, 0 = floor
...     maze_map = np.array(
...         [
...             [1, 1, 1, 1, 1, 1, 1],
...             [1, 0, 0, 0, 0, 0, 1],
...             [1, 0, 0, 0, 0, 0, 1],
...             [1, 0, 0, 0, 0, 0, 1],
...             [1, 1, 1, 1, 1, 1, 1],
...         ]
...     )
...     agent_loc = np.array([1, 1])
...     target_loc = np.array([3, 5])
...     return maze_map, agent_loc, target_loc
...
>>> env = MazeEnv(generate_maze_fn, None, 5, 7)
- metadata: dict[str, Any] = {'render_fps': 4, 'render_modes': ['human', 'rgb_array']}
- __init__(generate_maze_fn, render_mode=None, maze_width=None, maze_height=None)
- Parameters:
generate_maze_fn (callable) –
This function is called on every reset of the environment and is expected to return three items in the following order:
maze-map: numpy array of map where “1” represents wall and “0” represents floor.
agent location: tuple (x,y) where x and y represent location of agent
target location: tuple (x,y) where x and y represent target location of the agent
render_mode (Optional[str]) –
Specifies one of the following:
None (default): No render is computed.
“human”: The environment is continuously rendered in the current display or terminal, usually for human consumption. This rendering should occur during step(), so render() doesn’t need to be called. Returns None.
“rgb_array”: Return a single frame representing the current state of the environment. A frame is a np.ndarray with shape (x, y, 3) representing RGB values for an x-by-y pixel image.
“ansi”: Return a string (str) or StringIO.StringIO containing a terminal-style text representation for each time step. The text can include newlines and ANSI escape sequences (e.g. for colors).
“rgb_array_list” and “ansi_list”: List-based versions of the render modes (except “human”) are available through the wrapper gymnasium.wrappers.RenderCollection, which is applied automatically during gymnasium.make(…, render_mode="rgb_array_list"). The collected frames are popped after render() or reset() is called.
maze_width (Optional[int]) – The width of the maze.
maze_height (Optional[int]) – The height of the maze.
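As a hedged illustration of the generate_maze_fn contract described above, the sketch below builds a walled room and samples two distinct floor cells for the agent and the target on every call. The helper name make_room_maze_fn and its parameters are hypothetical, and the sketch assumes that each location indexes the array as maze_map[loc[0], loc[1]], as the class example appears to do; check this against the actual implementation.

```python
import numpy as np


def make_room_maze_fn(n_rows, n_cols, seed=None):
    """Return a generate_maze_fn-style callable (hypothetical helper)."""
    rng = np.random.default_rng(seed)

    def generate_maze_fn():
        # 1 = wall, 0 = floor: start with all walls and carve out the interior.
        maze_map = np.ones((n_rows, n_cols), dtype=int)
        maze_map[1:-1, 1:-1] = 0

        # Pick two distinct floor cells for the agent and the target.
        floor_cells = np.argwhere(maze_map == 0)
        agent_idx, target_idx = rng.choice(len(floor_cells), size=2, replace=False)
        agent_loc = floor_cells[agent_idx]
        target_loc = floor_cells[target_idx]
        return maze_map, agent_loc, target_loc

    return generate_maze_fn


generate_maze_fn = make_room_maze_fn(5, 7, seed=0)
maze_map, agent_loc, target_loc = generate_maze_fn()
```

Because the callable is invoked on every reset, each episode would start from a fresh agent and target placement while the wall layout stays fixed.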
- property action_space
- Specifies the available discrete actions for the environment:
“right”
“up”
“left”
“down”
- Returns:
Discrete action space object representing the possible actions.
- Return type:
gym.spaces.Discrete
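To make the action indices concrete, here is a minimal sketch. It assumes the four actions are numbered 0 to 3 in the order listed above and that a location is an (x, y) pair with y increasing in the "up" direction. Both the numbering and the direction vectors are assumptions for illustration, and the names ACTION_NAMES, ACTION_DELTAS, and apply_action are hypothetical; none of this is confirmed as the library's own mapping.

```python
# Assumed numbering: the order in which the actions are listed (hypothetical).
ACTION_NAMES = {0: "right", 1: "up", 2: "left", 3: "down"}

# Assumed direction vectors as (dx, dy); the real environment may differ.
ACTION_DELTAS = {0: (1, 0), 1: (0, 1), 2: (-1, 0), 3: (0, -1)}


def apply_action(location, action):
    """Return the neighbouring cell reached by taking `action` from `location`."""
    dx, dy = ACTION_DELTAS[action]
    x, y = location
    return (x + dx, y + dy)


new_location = apply_action((1, 1), 0)  # move "right" from (1, 1)
```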
- property observation_space
Defines the observation space of the 2D maze environment.
The observation space consists of two elements:
‘agent’: Represents the position of the agent in the maze.
‘target’: Represents the position of the target in the maze.
- In the 2D maze:
0 corresponds to an empty floor.
1 corresponds to a wall.
2 corresponds to the agent or the target.
- Returns:
Dictionary containing the observation space for the agent and the target.
‘agent’: gym.spaces.Box object representing the agent’s position.
‘target’: gym.spaces.Box object representing the target’s position.
- Return type:
gym.spaces.Dict
- reset(seed=None, options=None)
Resets the environment to its initial state and obtains a new maze configuration by calling generate_maze_fn().
- Parameters:
seed (Optional[int]) – Seed for the random number generator. Defaults to None.
options – Unused parameter.
- Returns:
observation: Agent’s observation of the initial environment state.
info (dict): Additional information about the environment.
- Return type:
tuple
- Raises:
ValueError – If the shape of the maze generated by generate_maze_fn() doesn’t match the specified maze width and height.
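A minimal sketch of such a shape check follows. The helper name check_maze_shape is hypothetical, and the expected shape (maze_width, maze_height) is an assumption inferred from the class example, where a 5-by-7 array is passed together with maze_width=5 and maze_height=7; the real implementation may order the axes differently.

```python
import numpy as np


def check_maze_shape(maze_map, maze_width, maze_height):
    """Raise ValueError if maze_map does not have the expected shape.

    Assumption: the expected shape is (maze_width, maze_height), inferred
    from the class example; the real check may order the axes differently.
    """
    expected = (maze_width, maze_height)
    if maze_map.shape != expected:
        raise ValueError(
            f"maze shape {maze_map.shape} does not match expected {expected}"
        )


check_maze_shape(np.zeros((5, 7)), 5, 7)  # matching shape: no exception

try:
    check_maze_shape(np.zeros((5, 7)), 7, 7)
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
```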
- step(action)
Take a step in the environment.
- Parameters:
action (int) – The action to take.
- Returns:
observation: Agent’s current observation of the environment.
reward (float): Reward received after taking the step.
terminated (bool): Whether the episode has terminated or not.
truncated (bool): Whether the episode was truncated because the maximum number of episode steps was reached.
info (dict): Additional information about the step.
- Return type:
tuple
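The five-element return value is typically consumed in a loop like the sketch below. So that it runs without maze_world installed, it uses a hypothetical stand-in class, CountdownEnv, that only mimics the reset() and step() signatures documented above; with the real package, a MazeEnv instance would presumably take its place.

```python
import random


class CountdownEnv:
    """Hypothetical stand-in that mimics the documented reset()/step() API."""

    def __init__(self, max_steps=10):
        self.max_steps = max_steps
        self.steps_taken = 0

    def reset(self, seed=None, options=None):
        self.steps_taken = 0
        return {"agent": (1, 1), "target": (3, 5)}, {}

    def step(self, action):
        self.steps_taken += 1
        observation = {"agent": (1, 1), "target": (3, 5)}
        reward = 0.0
        terminated = False  # the stand-in never reaches the target
        truncated = self.steps_taken >= self.max_steps
        return observation, reward, terminated, truncated, {}


env = CountdownEnv(max_steps=10)
observation, info = env.reset(seed=0)

rng = random.Random(0)
total_reward = 0.0
done = False
while not done:
    action = rng.randrange(4)  # one of the four discrete actions
    observation, reward, terminated, truncated, info = env.step(action)
    total_reward += reward
    # An episode ends when it either terminates or is truncated.
    done = terminated or truncated
```

Checking terminated and truncated separately keeps the distinction between reaching an end state and running out of steps, which matters for algorithms that bootstrap from the final observation.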
- class maze_world.envs.RandomMazeEnv
Extends the MazeEnv class to create random mazes of specified sizes at each reset.
- Example:
>>> from maze_world.envs import RandomMazeEnv
>>> env = RandomMazeEnv(maze_width=5, maze_height=7)
- __init__(render_mode=None, maze_width=11, maze_height=11)
- Parameters:
render_mode (Optional[str]) –
Specify one of the following:
None (default): No render is computed.
“human”: The environment is continuously rendered in the current display or terminal, usually for human consumption. This rendering should occur during step(), so render() doesn’t need to be called. Returns None.
“rgb_array”: Return a single frame representing the current state of the environment. A frame is a np.ndarray with shape (x, y, 3) representing RGB values for an x-by-y pixel image.
“ansi”: Return a string (str) or StringIO.StringIO containing a terminal-style text representation for each time step. The text can include newlines and ANSI escape sequences (e.g. for colors).
“rgb_array_list” and “ansi_list”: List-based versions of the render modes (except “human”) are available through the wrapper gymnasium.wrappers.RenderCollection, which is applied automatically during gymnasium.make(…, render_mode="rgb_array_list"). The collected frames are popped after render() or reset() is called.
maze_width (int) – The width of the maze.
maze_height (int) – The height of the maze.
- Raises:
ValueError – If the width or height of the maze is not odd.
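A minimal sketch of the kind of validation this implies is shown below. The helper name require_odd is hypothetical, and the real constructor may perform the check differently or word its error message differently.

```python
def require_odd(name, value):
    """Raise ValueError unless `value` is an odd integer (hypothetical helper)."""
    if value % 2 == 0:
        raise ValueError(f"{name} must be odd, got {value}")
    return value


# The documented defaults (11 and 11) pass the check.
maze_width = require_odd("maze_width", 11)
maze_height = require_odd("maze_height", 11)

try:
    require_odd("maze_width", 10)
    rejected = False
except ValueError:
    rejected = True
```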