POSGGym is a Python library for planning and reinforcement learning research in partially observable, multi-agent environments.

POSGGym provides a collection of discrete and continuous environments along with reference agents to allow for reproducible evaluations. The API aims to mimic that of Gymnasium and PettingZoo with the addition of a model API that can be used for planning.

Baseline implementations of planning and reinforcement learning algorithms for POSGGym are available in the POSGGym-Baselines library. Compatibility with other popular reinforcement learning libraries is possible using the PettingZoo wrapper (see below for an example).

Environment API

import posggym
env = posggym.make("PredatorPrey-v0")
observations, infos = env.reset(seed=42)

for t in range(100):
    actions = {i: env.action_spaces[i].sample() for i in env.agents}
    observations, rewards, terminations, truncations, all_done, infos = env.step(actions)

    if all_done:
        observations, infos = env.reset()
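
The loop above discards the rewards it receives. As a minimal sketch of how they might be collected, the helper below sums each agent's reward over a single episode. `run_episode` is a hypothetical name, not part of POSGGym, and it assumes only the `reset`, `step`, `agents`, and `action_spaces` members used in the example above.

```python
from collections import defaultdict


def run_episode(env, max_steps=100, seed=None):
    """Run one episode with random actions and return each agent's total reward.

    Hypothetical helper: `env` is assumed to follow the environment API
    shown above (reset, step, agents, action_spaces).
    """
    observations, infos = env.reset(seed=seed)
    returns = defaultdict(float)
    for _ in range(max_steps):
        # Sample one random action per active agent
        actions = {i: env.action_spaces[i].sample() for i in env.agents}
        observations, rewards, terminations, truncations, all_done, infos = env.step(actions)
        # Rewards are keyed by agent ID, so accumulate them per agent
        for i, r in rewards.items():
            returns[i] += r
        if all_done:
            break
    return dict(returns)
```

With POSGGym installed, this could be called as `run_episode(posggym.make("PredatorPrey-v0"), seed=42)`.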


Model API

import posggym
env = posggym.make("PredatorPrey-v0")
model = env.model

state = model.sample_initial_state()
observations = model.sample_initial_obs(state)

for t in range(100):
    actions = {i: model.action_spaces[i].sample() for i in model.get_agents(state)}
    timestep = model.step(state, actions)

    # timestep attributes can be accessed individually:
    state = timestep.state
    observations = timestep.observations

    # Or unpacked fully
    state, observations, rewards, terminations, truncations, all_done, infos = timestep

    if all_done:
        state = model.sample_initial_state()
        observations = model.sample_initial_obs(state)
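
Because the model takes the state explicitly, a planner can simulate from any state without touching the real environment. The sketch below illustrates this with a simple Monte Carlo estimate of one agent's return after a given first joint action. `rollout_return` and `evaluate_action` are hypothetical helpers, not part of POSGGym. They assume the `step`, `get_agents`, and `action_spaces` members shown above, and that `model.step` returns a new state rather than modifying the one passed in.

```python
def rollout_return(model, state, first_actions, agent_id, depth=10):
    """Return one agent's undiscounted return from a single random rollout.

    Hypothetical helper: the rollout starts with `first_actions` and then
    samples random joint actions until `depth` steps or the episode ends.
    """
    total = 0.0
    actions = first_actions
    for _ in range(depth):
        # Unpack the timestep in the order shown in the Model API example
        state, _, rewards, _, _, all_done, _ = model.step(state, actions)
        total += rewards[agent_id]
        if all_done:
            break
        actions = {i: model.action_spaces[i].sample() for i in model.get_agents(state)}
    return total


def evaluate_action(model, state, first_actions, agent_id, num_rollouts=20, depth=10):
    """Average the rollout return over several independent rollouts."""
    returns = [
        rollout_return(model, state, first_actions, agent_id, depth)
        for _ in range(num_rollouts)
    ]
    return sum(returns) / num_rollouts
```

Random rollouts are only a placeholder here; a real planner would typically replace them with a search procedure or a learned rollout policy.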

Agent API

import posggym
import posggym.agents as pga
env = posggym.make("PursuitEvasion-v1", grid="16x16")

policies = {
    '0': pga.make("PursuitEvasion-v1/grid=16x16/RL1_i0-v0", env.model, '0'),
    '1': pga.make("PursuitEvasion-v1/grid=16x16/ShortestPath-v0", env.model, '1'),
}

obs, infos = env.reset(seed=42)
# Reset each policy at the start of an episode
for i, policy in policies.items():
    policy.reset(seed=7)

for t in range(100):
    actions = {i: policies[i].step(obs[i]) for i in env.agents}
    obs, rewards, terminations, truncations, all_done, infos = env.step(actions)

    if all_done:
        obs, infos = env.reset()
        for i, policy in policies.items():
            policy.reset()

# Release environment and policy resources
env.close()
for policy in policies.values():
    policy.close()

Compatibility with PettingZoo

Any POSGGym environment can be converted into a PettingZoo ParallelEnv environment using the posggym.wrappers.petting_zoo.PettingZoo wrapper. This allows for easy integration with the ecosystem of libraries that support PettingZoo.

import posggym
from posggym.wrappers.petting_zoo import PettingZoo

env = posggym.make("PredatorPrey-v0")
env = PettingZoo(env)
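
Once wrapped, the environment can be driven like any other PettingZoo ParallelEnv. The following is a minimal sketch of such a loop with random actions. `run_parallel_env` is a hypothetical helper, and it assumes the wrapped environment follows the standard ParallelEnv interface, where `reset` returns observations and infos, `step` returns five dictionaries, and `env.agents` becomes empty once the episode is over.

```python
def run_parallel_env(env, max_steps=100, seed=None):
    """Step a ParallelEnv-style environment with random actions.

    Hypothetical helper: returns the number of steps taken before the
    agent list became empty or `max_steps` was reached.
    """
    observations, infos = env.reset(seed=seed)
    steps = 0
    while env.agents and steps < max_steps:
        # ParallelEnv exposes spaces through a method rather than a dict
        actions = {agent: env.action_space(agent).sample() for agent in env.agents}
        observations, rewards, terminations, truncations, infos = env.step(actions)
        steps += 1
    env.close()
    return steps
```

The `max_steps` bound keeps the loop finite even if the wrapped environment never empties `env.agents`.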