The Wumpus World is a classic problem used to illustrate knowledge-based agents in AI. It is a 4x4 grid consisting of 16 rooms. The agent starts at Room[1,1] facing right, and its goal is to retrieve the treasure while avoiding hazards such as pits and the Wumpus. The agent navigates the grid using its limited sensory input to make decisions that keep it safe, collect the treasure and exit the cave.

Key Elements:
- Pits: If the agent steps into a pit, it falls and dies. A breeze in a room means at least one adjacent room contains a pit.
- Wumpus: A creature that kills the agent if the agent enters its room. Rooms next to the Wumpus have a stench. The agent can use an arrow to kill the Wumpus.
- Treasure: The agent’s main objective is to collect the treasure (gold), which is located in one room.
- Breeze: Indicates a pit is nearby.
- Stench: Indicates the Wumpus is nearby.
The agent must navigate carefully, avoiding dangers, to collect the treasure and exit safely.
PEAS Description
PEAS stands for Performance Measures, Environment, Actuators and Sensors, which together describe the agent’s capabilities and its environment.
1. Performance measures: Rewards or punishments
- Agent gets the gold and returns safely = +1000 points
- Agent dies (pit or Wumpus) = -1000 points
- Each move of the agent = -1 point
- Agent uses the arrow = -10 points
2. Environment: The setting in which the agent operates.
- A cave with 16 (4x4) rooms.
- Rooms adjacent (not diagonally) to the Wumpus have a stench.
- Rooms adjacent (not diagonally) to a pit are breezy.
- The room with the gold glitters.
- Agent's initial position is Room[1, 1], facing right.
- The Wumpus, the gold and the 3 pits can be located anywhere except Room[1, 1].
3. Actuators: Devices that allow the agent to perform the following actions in the environment.
- Move forward: Move to the next room.
- Turn right/left: Rotate the agent 90 degrees.
- Shoot: Kill the Wumpus with the arrow.
- Grab: Take the treasure.
- Release: Drop the treasure.
4. Sensors: Devices that let the agent sense the following from the environment.
- Breeze: Detected near a pit.
- Stench: Detected near the Wumpus.
- Glitter: Detected when treasure is in the room.
- Scream: Triggered when Wumpus is killed.
- Bump: Occurs when hitting a wall.
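The performance measure above can be turned into a small scoring function. This is a minimal sketch; the function name `score_episode` and its parameters are hypothetical and do not appear in the implementation later in this article.

```python
def score_episode(moves, arrows_used, got_gold_and_exited, died):
    """Total reward under the PEAS performance measure listed above."""
    total = -1 * moves - 10 * arrows_used   # -1 per move, -10 per arrow
    if got_gold_and_exited:
        total += 1000                        # gold retrieved and agent back safe
    if died:
        total -= 1000                        # fell into a pit or met the Wumpus
    return total


# 12 moves and one arrow, ending with the gold: -12 - 10 + 1000
print(score_episode(moves=12, arrows_used=1, got_gold_and_exited=True, died=False))  # 978
```

Because every move costs a point, a shorter safe route scores higher than a longer one that reaches the same gold.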
How the Agent Operates with PEAS
- Perception: The agent uses sensory inputs (breeze, stench, glitter) to detect its surroundings and understand the environment.
- Inference: The agent applies logical reasoning to locate hazards. For example, if it detects a breeze it infers that a pit is in an adjacent room, and if there’s a stench it infers that the Wumpus is in an adjacent room.
- Planning: Based on its deductions, the agent plans its next move, avoiding risky areas such as rooms with a suspected pit or the Wumpus.
- Action: The agent performs the planned action, such as moving to a new room, shooting the arrow at the Wumpus or taking the treasure.
This process repeats until the agent finds the treasure and exits the cave, using its sensory inputs, reasoning and planning to achieve its goal safely.
By using the PEAS framework, the agent’s interactions with its environment are clearly defined, providing a structured approach to modeling intelligent behavior.
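The inference step can be illustrated with a very simple rule: if a visited room has neither breeze nor stench, all of its neighbors are safe. The sketch below uses only that rule, which is weaker than the DPLL-based reasoning implemented later; the names `adjacent` and `provably_safe` are hypothetical.

```python
def adjacent(cell, size=4):
    """Non-diagonal neighbors of a 1-based cell that lie inside the grid."""
    x, y = cell
    candidates = [(x, y + 1), (x, y - 1), (x - 1, y), (x + 1, y)]
    return [(a, b) for a, b in candidates if 1 <= a <= size and 1 <= b <= size]


def provably_safe(percepts, size=4):
    """percepts maps each visited cell to its (breeze, stench) pair."""
    safe = set(percepts)                      # the agent survived every visited cell
    for cell, (breeze, stench) in percepts.items():
        if not breeze and not stench:
            safe.update(adjacent(cell, size))  # no hazard can border this cell
    return safe


percepts = {(1, 1): (False, False), (2, 1): (False, True)}
print(sorted(provably_safe(percepts)))
# [(1, 1), (1, 2), (2, 1)]
```

The stench at (2, 1) means its unvisited neighbors stay unclassified, which is where a logical solver becomes useful.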
Implementation
Here we implement a Wumpus World AI agent:
Step 1: Initialize Explorer Agent and Environment
- Imports copy for safe state copying and deque for efficient queue-based traversal.
- Defines the Explorer class, which models an intelligent agent for the Wumpus World problem.
- Initializes a 4×4 world grid containing hazards (Wumpus, Pit) and a goal (Gold).
- Sets the agent’s starting position and tracks its survival and exit status.
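import copy
from collections import deque


class Explorer:
    def __init__(self):
        # 4x4 grid; a 1-based position [a, b] is stored at _world_map[a - 1][b - 1].
        # 'W' = Wumpus, 'P' = pit, 'G' = gold, '' = empty room.
        self._world_map = [
            ['', '', '', ''],
            ['', 'W', 'P', ''],
            ['', '', 'G', ''],
            ['', '', '', ''],
        ]
        self._position = [1, 1]   # 1-based starting room
        self._alive = True
        self._exited = False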
Step 2: Coordinate Conversion and Hazard Checking
- Converts 1-based coordinates to 0-based indices for accessing the grid.
- Checks if the current cell contains a hazard (Wumpus or Pit) and updates _alive.
- Detects if the agent has found the treasure and updates _exited.
- Returns the agent’s alive status to guide further actions.
    def _coords_to_index(self, loc):
        # Convert a 1-based location to 0-based grid indices.
        row, col = loc
        return row - 1, col - 1

    def _check_hazards(self):
        row, col = self._coords_to_index(self._position)
        cell = self._world_map[row][col]
        if 'P' in cell or 'W' in cell:
            self._alive = False
            print(f"Agent encountered hazard at {self._position}. DEAD.")
        if 'G' in cell:
            print(f"Agent found the treasure at {self._position}!")
            self._exited = True
        return self._alive
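To see how the 1-based coordinates relate to the nested list, the sketch below scans the same map and reports where each symbol sits. It is a standalone illustration; `locate` is a hypothetical helper that is not part of the Explorer class.

```python
world_map = [
    ['', '', '', ''],
    ['', 'W', 'P', ''],
    ['', '', 'G', ''],
    ['', '', '', ''],
]


def coords_to_index(loc):
    """Same conversion as _coords_to_index: 1-based location to 0-based indices."""
    first, second = loc
    return first - 1, second - 1


def locate(symbol):
    """Return the 1-based coordinates of every cell containing `symbol`."""
    return [[i + 1, j + 1]
            for i, row in enumerate(world_map)
            for j, cell in enumerate(row)
            if symbol in cell]


print(locate('W'), locate('P'), locate('G'))
# [[2, 2]] [[2, 3]] [[3, 3]]
```

With this map the Wumpus is at [2, 2], the single pit at [2, 3] and the gold at [3, 3], which is why the simulation in Step 9 uses [3, 3] as its goal.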
Step 3: Movement and Adjacent Cells
- Moves the agent in the specified direction within grid bounds.
- Prevents movement if the agent is dead or has exited.
- Checks for hazards or treasure after moving.
- _adjacent_cells returns all valid neighboring cells.
    def move(self, action):
        # directions[i] names the offset move_vectors[i] applied to the position.
        directions = ['Up', 'Down', 'Left', 'Right']
        move_vectors = [[0, 1], [0, -1], [-1, 0], [1, 0]]
        if action not in directions:
            raise ValueError(f"Invalid action: {action}")
        if not self._alive:
            print(f"Cannot move. DEAD at {self._position}")
            return False
        if self._exited:
            print(f"Cannot move. Exited at {self._position}")
            return False
        idx = directions.index(action)
        move = move_vectors[idx]
        # Clamp to the grid so a move into a wall leaves the agent in place.
        self._position = [
            min(4, max(1, self._position[0] + move[0])),
            min(4, max(1, self._position[1] + move[1]))
        ]
        print(f"Moved {action}. Current position: {self._position}")
        return self._check_hazards()

    def _adjacent_cells(self):
        adj = []
        for dr, dc in [[0, 1], [0, -1], [-1, 0], [1, 0]]:
            r, c = self._position[0] + dr, self._position[1] + dc
            if 1 <= r <= 4 and 1 <= c <= 4:
                adj.append([r, c])
        return adj
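The clamping used in move can be checked in isolation. The following sketch reproduces only the position update; `step` and `MOVE_VECTORS` are hypothetical names chosen for this illustration.

```python
MOVE_VECTORS = {'Up': (0, 1), 'Down': (0, -1), 'Left': (-1, 0), 'Right': (1, 0)}


def step(position, action, size=4):
    """Apply one move to a 1-based position, clamping it to the grid."""
    dx, dy = MOVE_VECTORS[action]
    return [min(size, max(1, position[0] + dx)),
            min(size, max(1, position[1] + dy))]


print(step([1, 1], 'Left'))   # [1, 1]  (wall: position unchanged)
print(step([1, 1], 'Right'))  # [2, 1]
print(step([4, 4], 'Up'))     # [4, 4]  (wall: position unchanged)
```

A move into a wall silently leaves the agent where it is, so this implementation does not produce the Bump percept listed in the PEAS description.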
Step 4: Perception and Current Location
- perceive detects hazards in adjacent cells, returning breeze for pits and stench for the Wumpus.
- Prevents perception if the agent is dead or has exited.
- Uses _adjacent_cells and _coords_to_index to check neighboring cells.
- current_location simply returns the agent’s current position.
    def perceive(self):
        if not self._alive:
            print(f"Cannot perceive. DEAD at {self._position}")
            return [None, None]
        if self._exited:
            print(f"Cannot perceive. Exited at {self._position}")
            return [None, None]
        breeze, stench = False, False
        for r, c in self._adjacent_cells():
            i, j = self._coords_to_index([r, c])
            cell = self._world_map[i][j]
            if 'P' in cell:
                breeze = True
            if 'W' in cell:
                stench = True
        return [breeze, stench]

    def current_location(self):
        return self._position
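As a worked example of the percept rule, the sketch below evaluates breeze and stench at a few rooms of the Step 1 map. It is a standalone function (`percepts_at` is a hypothetical name) that mirrors the logic of perceive without the agent state.

```python
world_map = [
    ['', '', '', ''],
    ['', 'W', 'P', ''],
    ['', '', 'G', ''],
    ['', '', '', ''],
]


def percepts_at(grid, position, size=4):
    """Return [breeze, stench] for a 1-based position on the given grid."""
    breeze = stench = False
    x, y = position
    for dx, dy in [(0, 1), (0, -1), (-1, 0), (1, 0)]:
        nx, ny = x + dx, y + dy
        if 1 <= nx <= size and 1 <= ny <= size:
            cell = grid[nx - 1][ny - 1]
            breeze = breeze or 'P' in cell   # a pit next door causes a breeze
            stench = stench or 'W' in cell   # the Wumpus next door causes a stench
    return [breeze, stench]


print(percepts_at(world_map, [1, 1]))  # [False, False]
print(percepts_at(world_map, [2, 1]))  # [False, True]
print(percepts_at(world_map, [1, 3]))  # [True, False]
```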
Step 5: Knowledge Base and Utility Functions
- Initializes knowledge_base and actions_taken to track the agent’s reasoning and moves.
- current_status is a 4×4 grid that marks each cell as unknown (0), safe (1) or unsafe (2).
- neighbors returns the valid adjacent cells for a given 1-based location.
- is_valid checks whether 0-based row and column indices are within grid bounds.
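knowledge_base = []   # CNF clauses; each clause is a set of (symbol, value) literals
actions_taken = []    # moves executed so far
current_status = [[0]*4 for _ in range(4)]   # 0 = unknown, 1 = safe, 2 = unsafe
allowed_moves = [[0,1],[0,-1],[1,0],[-1,0]]
directions = ['Up','Down','Right','Left']    # directions[i] names allowed_moves[i]
total_calls = 0       # number of dpll() invocations


def neighbors(loc):
    # loc and the returned cells use 1-based coordinates.
    adj = []
    for dr, dc in [[0, 1], [0, -1], [-1, 0], [1, 0]]:
        r, c = loc[0] + dr, loc[1] + dc
        if 1 <= r <= 4 and 1 <= c <= 4:
            adj.append([r, c])
    return adj


def is_valid(r, c):
    # r and c are 0-based grid indices.
    return 0 <= r < 4 and 0 <= c < 4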
Step 6: BFS Pathfinding and Literal Extraction
- bfs_path finds a path from start to goal using Breadth-First Search over the cells marked safe in current_status; both arguments are 0-based indices.
- Tracks visited cells and parent pointers to reconstruct the sequence of moves.
- Returns a list of directions to reach the goal, or an empty list if the goal is unreachable.
- literal_of returns the symbol of the first literal in a logical expression, which the DPLL solver uses as its branching variable.
def bfs_path(start, goal):
    # start and goal are 0-based [r, c] indices into current_status.
    visited = [[False]*4 for _ in range(4)]
    q = deque()
    parent = {(start[0], start[1]): None}
    q.append((start[0], start[1]))
    visited[start[0]][start[1]] = True
    while q:
        r, c = q.popleft()
        if [r, c] == goal:
            break
        for idx, (dr, dc) in enumerate(allowed_moves):
            nr, nc = r + dr, c + dc
            # Only expand into cells already marked safe (1).
            if is_valid(nr, nc) and current_status[nr][nc] == 1 and not visited[nr][nc]:
                visited[nr][nc] = True
                q.append((nr, nc))
                parent[(nr, nc)] = ((r, c), directions[idx])
    path = []
    node = (goal[0], goal[1])
    if node not in parent:
        return []
    # Walk the parent pointers back from the goal, then reverse.
    while node != (start[0], start[1]):
        if parent[node] is None:
            break
        move = parent[node][1]
        path.append(move)
        node = parent[node][0]
    path.reverse()
    return path


def literal_of(expr):
    # Return the symbol of the first literal found (None if expr is empty).
    for clause in expr:
        for literal in clause:
            return literal[0]
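The following standalone sketch shows the same BFS idea on a hand-written status grid, so the resulting move list can be checked by eye. The function name `shortest_moves` and the sample grid are hypothetical; the move names follow the convention used above, where 'Right' increases the first index and 'Up' increases the second.

```python
from collections import deque

MOVES = [((0, 1), 'Up'), ((0, -1), 'Down'), ((1, 0), 'Right'), ((-1, 0), 'Left')]


def shortest_moves(status, start, goal):
    """BFS over cells marked 1 in `status`; start and goal are 0-based tuples."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            break
        for (dr, dc), name in MOVES:
            nxt = (cell[0] + dr, cell[1] + dc)
            inside = 0 <= nxt[0] < 4 and 0 <= nxt[1] < 4
            if inside and status[nxt[0]][nxt[1]] == 1 and nxt not in parent:
                parent[nxt] = (cell, name)
                queue.append(nxt)
    if goal not in parent:
        return []            # goal not reachable through safe cells
    path = []
    node = goal
    while parent[node] is not None:
        node, name = parent[node]
        path.append(name)
    path.reverse()
    return path


status = [
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 0],
]
print(shortest_moves(status, (0, 0), (2, 2)))
# ['Right', 'Right', 'Up', 'Up']
```

Restricting the search to cells marked 1 is what keeps the planned route inside territory the agent already believes to be safe.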
Step 7: Pure Literals and Unit Clauses
- pure_literals identifies literals that appear with only one polarity in the expression, useful for simplifying the knowledge base.
- Tracks symbols and ensures consistency across clauses to detect pure literals.
- unit_clauses finds clauses with only one literal and checks for consistency.
- Returns a boolean indicating whether the unit clauses are mutually consistent, followed by the list of unit clauses.
def pure_literals(expr):
    # A literal is a (symbol, value) pair; value 1 is positive, 0 is negated.
    symbols = {lit[0] for clause in expr for lit in clause}
    pure_set, vals = {}, {}
    for s in symbols:
        pure_set[s] = True
    for clause in expr:
        for lit in clause:
            if not pure_set[lit[0]]:
                continue
            if lit[0] in vals:
                # Seen before with the other polarity: no longer pure.
                if vals[lit[0]] != lit[1]:
                    pure_set[lit[0]] = False
            else:
                vals[lit[0]] = lit[1]
    return {(k, vals[k]) for k in pure_set if pure_set[k]}


def unit_clauses(expr):
    units, consistent = [], True
    tracker = {}
    for clause in expr:
        if len(clause) == 1:
            literal = next(iter(clause))
            units.append({literal})
            if literal[0] not in tracker:
                tracker[literal[0]] = literal[1]
            elif tracker[literal[0]] != literal[1]:
                # Two unit clauses assign opposite values to the same symbol.
                consistent = False
    return consistent, units
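A small example makes the clause format concrete. The sketch below uses simplified, standalone versions of the two helpers (the name `unit_literals` and the sample clauses are hypothetical) on the formula (A ∨ ¬B) ∧ (A ∨ C) ∧ (¬C).

```python
def pure_literals(clauses):
    """Literals whose symbol appears with a single polarity across all clauses."""
    polarity = {}
    for clause in clauses:
        for symbol, value in clause:
            polarity.setdefault(symbol, set()).add(value)
    return {(s, next(iter(v))) for s, v in polarity.items() if len(v) == 1}


def unit_literals(clauses):
    """The single literal of every one-literal clause."""
    return [next(iter(c)) for c in clauses if len(c) == 1]


clauses = [
    {('A', 1), ('B', 0)},   # A or not B
    {('A', 1), ('C', 1)},   # A or C
    {('C', 0)},             # not C
]
print(sorted(pure_literals(clauses)))  # [('A', 1), ('B', 0)]
print(unit_literals(clauses))          # [('C', 0)]
```

A only appears positively and B only negatively, so both are pure; C appears with both polarities, and the unit clause forces it to be false.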
Step 8: DPLL SAT Solver
- Implements the Davis-Putnam-Logemann-Loveland (DPLL) algorithm to check if a propositional logic expression is satisfiable.
- Simplifies the expression using pure literals and unit clauses, reducing search space.
- Recursively branches on a literal, trying both True and False assignments.
- Returns True if the expression is satisfiable, False if inconsistent, and tracks recursive calls with total_calls.
def dpll(expr):
    global total_calls
    total_calls += 1
    expr = [frozenset(c) for c in expr]
    # An empty clause cannot be satisfied, so the whole formula is unsatisfiable.
    if any(not c for c in expr):
        return False
    # Pure-literal elimination: drop every clause that contains a pure literal.
    ps = pure_literals(expr)
    new_expr = []
    for clause in expr:
        satisfied = False
        for pure_lit in ps:
            if pure_lit in clause:
                satisfied = True
                break
        if not satisfied:
            new_clause = frozenset(l for l in clause if (l[0], 1-l[1]) not in ps)
            new_expr.append(new_clause)
    expr = new_expr
    # Unit propagation: apply each unit clause to the remaining clauses.
    consistent, units = unit_clauses(expr)
    if not consistent:
        return False
    for u_clause in units:
        literal = next(iter(u_clause))
        expr = [c for c in expr if literal not in c]
        negated_literal = (literal[0], 1-literal[1])
        expr = [frozenset(l for l in c if l != negated_literal) for c in expr]
    if not expr:
        return True
    if any(not c for c in expr):
        return False
    # Branch on the first remaining symbol: try True, then False.
    l = literal_of(expr)
    branch_true = [frozenset(l_item for l_item in c if l_item != (l, 0)) for c in expr if (l, 1) not in c]
    if dpll(branch_true):
        return True
    branch_false = [frozenset(l_item for l_item in c if l_item != (l, 1)) for c in expr if (l, 0) not in c]
    return dpll(branch_false)
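To show how a satisfiability check answers a question such as "can there be a pit in room [1, 2]?", here is a compact, standalone DPLL sketch. It is not the solver listed above: it omits pure-literal elimination, and the names `simplify` and `satisfiable` are hypothetical. The rule encoded is B11 ⇔ (P12 ∨ P21), written in CNF with the same (symbol, value) literal format.

```python
def simplify(clauses, literal):
    """Assume `literal` is true: drop satisfied clauses, delete its negation."""
    symbol, value = literal
    negated = (symbol, 1 - value)
    return [c - {negated} for c in clauses if literal not in c]


def satisfiable(clauses):
    clauses = [frozenset(c) for c in clauses]
    if not clauses:
        return True                      # nothing left to satisfy
    if any(not c for c in clauses):
        return False                     # an empty clause is a contradiction
    for c in clauses:                    # unit propagation
        if len(c) == 1:
            return satisfiable(simplify(clauses, next(iter(c))))
    symbol, _ = next(iter(clauses[0]))   # branch on some remaining symbol
    return (satisfiable(simplify(clauses, (symbol, 1)))
            or satisfiable(simplify(clauses, (symbol, 0))))


# B11 <=> (P12 or P21), plus the percept "no breeze at [1, 1]".
kb = [
    {('B11', 0), ('P12', 1), ('P21', 1)},
    {('P12', 0), ('B11', 1)},
    {('P21', 0), ('B11', 1)},
    {('B11', 0)},
]
print(satisfiable(kb + [{('P12', 1)}]))  # False: a pit at [1, 2] contradicts the KB
print(satisfiable(kb + [{('P12', 0)}]))  # True:  no pit at [1, 2] is consistent
```

Since adding P12 makes the knowledge base unsatisfiable while adding ¬P12 does not, the knowledge base entails that room [1, 2] has no pit. This two-query pattern is the one used in Step 9.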
Step 9: Simulation of Explorer Agent
- Uses a stack-based DFS approach to explore the Wumpus World from start (1,1) to goal (3,3) while tracking visited cells.
- Updates the knowledge_base with percepts (breeze, stench) and checks hazards using the DPLL solver.
- Marks safe (1) and unsafe (2) cells in current_status and plans moves using bfs_path.
- Records all moves in actions_taken and prints the final path and total DPLL calls after simulation.
def simulate(agent):
    stack = [[1, 1]]
    current_status[0][0] = 1
    visited = set()
    visited.add((1, 1))
    while agent.current_location() != [3, 3] and agent._alive:
        current_cell = agent.current_location()
        breeze, stench = agent.perceive()
        # Record the percepts at the current cell as unit clauses.
        knowledge_base.append({(f'B{current_cell[0]}{current_cell[1]}', int(breeze))})
        knowledge_base.append({(f'S{current_cell[0]}{current_cell[1]}', int(stench))})
        for room in neighbors(current_cell):
            room_tuple = (room[0], room[1])
            if room_tuple in visited:
                continue
            visited.add(room_tuple)
            # Wumpus query: W is certain only if KB + W is satisfiable and KB + not-W is not.
            w_literal = (f'W{room[0]}{room[1]}', 1)
            knowledge_base.append({w_literal})
            if dpll(knowledge_base):
                knowledge_base.pop()
                knowledge_base.append({(w_literal[0], 0)})
                if not dpll(knowledge_base):
                    knowledge_base.pop()
                    knowledge_base.append({w_literal})
                    current_status[room[0]-1][room[1]-1] = 2
                else:
                    knowledge_base.pop()
            else:
                # KB + W is unsatisfiable, so the room is Wumpus-free.
                knowledge_base.pop()
                knowledge_base.append({(w_literal[0], 0)})
            # Pit query, using the same two-sided test.
            if current_status[room[0]-1][room[1]-1] != 2:
                p_literal = (f'P{room[0]}{room[1]}', 1)
                knowledge_base.append({p_literal})
                if dpll(knowledge_base):
                    knowledge_base.pop()
                    knowledge_base.append({(p_literal[0], 0)})
                    if not dpll(knowledge_base):
                        knowledge_base.pop()
                        knowledge_base.append({p_literal})
                        current_status[room[0]-1][room[1]-1] = 2
                    else:
                        knowledge_base.pop()
                else:
                    knowledge_base.pop()
                    knowledge_base.append({(p_literal[0], 0)})
            # Rooms not proven unsafe are queued for exploration.
            if current_status[room[0]-1][room[1]-1] != 2 and room not in stack:
                stack.append(room)
                current_status[room[0]-1][room[1]-1] = 1
        if not stack:
            break
        next_cell = stack.pop()
        # bfs_path expects 0-based indices.
        path = bfs_path([current_cell[0]-1, current_cell[1]-1], [next_cell[0]-1, next_cell[1]-1])
        for move in path:
            if not agent._alive:
                break
            agent.move(move)
            actions_taken.append(move)
    print("\nFinal Moves Taken:", actions_taken)
    print("Total DPLL Calls:", total_calls)


if __name__ == '__main__':
    agent = Explorer()
    simulate(agent)
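As listed, simulate only adds unit clauses for percepts and for the literal being tested, so knowledge_base holds no clause that links a breeze to pits or a stench to the Wumpus. One possible way to supply those links, assuming the same clause format and symbol naming (B11, P12 and so on), is to generate the rule "percept at a room ⇔ hazard in some adjacent room" in CNF for every room. The sketch below is an illustration only; `adjacent_rooms` and `percept_rules` are hypothetical helpers.

```python
def adjacent_rooms(x, y, size=4):
    """Non-diagonal neighbors of the 1-based room (x, y) inside the grid."""
    candidates = [(x, y + 1), (x, y - 1), (x - 1, y), (x + 1, y)]
    return [(a, b) for a, b in candidates if 1 <= a <= size and 1 <= b <= size]


def percept_rules(percept, hazard, size=4):
    """CNF clauses for: percept at a room <=> hazard in some adjacent room."""
    rules = []
    for x in range(1, size + 1):
        for y in range(1, size + 1):
            p = f'{percept}{x}{y}'
            hazards = [f'{hazard}{a}{b}' for a, b in adjacent_rooms(x, y, size)]
            # The percept implies at least one adjacent hazard.
            rules.append({(p, 0)} | {(h, 1) for h in hazards})
            # Each adjacent hazard implies the percept.
            for h in hazards:
                rules.append({(h, 0), (p, 1)})
    return rules


breeze_rules = percept_rules('B', 'P')
stench_rules = percept_rules('S', 'W')
print(len(breeze_rules), len(stench_rules))                       # 64 64
print(breeze_rules[0] == {('B11', 0), ('P12', 1), ('P21', 1)})    # True
```

Extending knowledge_base with these clauses before calling simulate would give the solver something to reason with; whether the original full code does this is not shown in the listing.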
Output:
