Wrappers

POSGGym includes a wrapper API (adopted from Gymnasium) as well as a collection of common wrappers that provide a convenient way to modify an existing environment without having to alter the underlying code directly. Wrappers can be applied to any posggym.Env environment, and can also be chained to combine their effects.

In order to wrap an environment, you must first initialize a base environment. Then you can pass this environment along with (possibly optional) parameters to the wrapper’s constructor.

>>> import posggym
>>> from posggym.wrappers import FlattenObservations
>>> base_env = posggym.make("PursuitEvasion-v1")
>>> base_env.observation_spaces['0']
Tuple(Tuple(Discrete(2), Discrete(2), Discrete(2), Discrete(2), Discrete(2), Discrete(2)), Tuple(Discrete(16), Discrete(16)), Tuple(Discrete(16), Discrete(16)), Tuple(Discrete(16), Discrete(16)))
>>> wrapped_env = FlattenObservations(base_env)
>>> wrapped_env.observation_spaces['0']
Box(0, 1, (108,), int64)

You can access the environment underneath the top-most wrapper by using the posggym.Wrapper.env attribute. Since the posggym.Wrapper class inherits from posggym.Env, the environment from the posggym.Wrapper.env attribute can be another wrapper.

>>> wrapped_env
<FlattenObservations<TimeLimit<OrderEnforcing<PassiveEnvChecker<PursuitEvasionEnv<PursuitEvasion-v1>>>>>>
>>> wrapped_env.env
<TimeLimit<OrderEnforcing<PassiveEnvChecker<PursuitEvasionEnv<PursuitEvasion-v1>>>>>

If you want to get to the environment underneath all of the layers of wrappers, you can use the posggym.Wrapper.unwrapped attribute. If the environment is already a bare environment, this will just return the environment itself.

>>> wrapped_env
<FlattenObservations<TimeLimit<OrderEnforcing<PassiveEnvChecker<PursuitEvasionEnv<PursuitEvasion-v1>>>>>>
>>> wrapped_env.unwrapped
<posggym.envs.grid_world.pursuit_evasion.PursuitEvasionEnv object at 0x7f4a94086d90>
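
The two attributes above can be combined to inspect every layer of a wrapped environment. The following is a minimal sketch, not part of POSGGym: wrapper_chain is a hypothetical helper, and it assumes that a bare environment has no env attribute. The small stand-in classes exist only so the sketch runs without POSGGym installed.

```python
def wrapper_chain(env):
    """List class names from the outermost wrapper down to the bare env."""
    names = [type(env).__name__]
    # Each wrapper exposes the next layer as `.env`.
    # Assumption: a bare environment has no `.env` attribute.
    while hasattr(env, "env"):
        env = env.env
        names.append(type(env).__name__)
    return names


# Hypothetical stand-ins for a bare env and two wrappers.
class BareEnv:
    pass


class InnerWrapper:
    def __init__(self, env):
        self.env = env


class OuterWrapper(InnerWrapper):
    pass


chain = wrapper_chain(OuterWrapper(InnerWrapper(BareEnv())))
print(chain)  # ['OuterWrapper', 'InnerWrapper', 'BareEnv']
```

Because the helper relies only on the env attribute, it should work the same way on any object that follows the wrapper layout described here.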

There are three common things you might want a wrapper to do:

  • Transform actions before applying them to the base environment

  • Transform observations that are returned by the base environment

  • Transform rewards that are returned by the base environment

Such wrappers can be easily implemented by inheriting from posggym.ActionWrapper, posggym.ObservationWrapper, or posggym.RewardWrapper and implementing the respective transformation. If you need a wrapper to do more complicated tasks, you can inherit from the posggym.Wrapper class directly.
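
As an illustration of the reward case, the sketch below scales every agent's reward by overriding step(). It is written against the six-element step() return layout documented on this page, but ToyEnv and ScaleRewards are hypothetical stand-ins rather than POSGGym classes, and the tuple element names are labels chosen for this example. In real code the wrapper would inherit from one of the POSGGym wrapper base classes instead.

```python
class ToyEnv:
    """Hypothetical multi-agent env following the six-element step() layout."""

    def step(self, actions):
        obs = {i: 0 for i in actions}
        rewards = {i: 1.0 for i in actions}
        terminated = {i: False for i in actions}
        truncated = {i: False for i in actions}
        all_done = False
        infos = {i: {} for i in actions}
        return obs, rewards, terminated, truncated, all_done, infos


class ScaleRewards:
    """Stand-in reward wrapper: multiplies every agent's reward by `scale`."""

    def __init__(self, env, scale):
        self.env = env
        self.scale = scale

    def step(self, actions):
        obs, rewards, terminated, truncated, all_done, infos = self.env.step(actions)
        # Only the rewards are transformed; everything else passes through.
        scaled = {agent: self.scale * r for agent, r in rewards.items()}
        return obs, scaled, terminated, truncated, all_done, infos


env = ScaleRewards(ToyEnv(), scale=0.5)
_, rewards, *_ = env.step({"0": 0, "1": 1})
print(rewards)  # {'0': 0.5, '1': 0.5}
```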

posggym.Wrapper

class posggym.Wrapper(env: Env[StateType, ObsType, ActType])

Wraps a posggym.Env to allow a modular transformation.

This class is the base class for all wrappers. Wrappers that inherit from this class can modify the action_spaces, observation_spaces, reward_ranges and metadata attributes without changing the underlying environment’s attributes.

Moreover, the behavior of the step() and reset() methods can be changed by these wrappers. Some attributes (spec, render_mode) will point back to the wrapper’s environment (i.e. to the corresponding attributes of env).

Note

If you inherit from Wrapper, don’t forget to call super().__init__(env) if the subclass overrides the __init__ method.
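
The note above matters as soon as a wrapper takes its own constructor arguments. The sketch below uses a hypothetical stand-in Wrapper base class (it only stores env and delegates step(), which is an assumption about the general pattern, not POSGGym's source), together with a hypothetical EchoEnv and OffsetActions.

```python
class Wrapper:
    """Hypothetical stand-in for a wrapper base: stores env, delegates step()."""

    def __init__(self, env):
        self.env = env

    def step(self, actions):
        return self.env.step(actions)


class EchoEnv:
    """Hypothetical env whose observations echo the actions it received."""

    def step(self, actions):
        agents = list(actions)
        return (
            dict(actions),
            {i: 0.0 for i in agents},
            {i: False for i in agents},
            {i: False for i in agents},
            False,
            {i: {} for i in agents},
        )


class OffsetActions(Wrapper):
    """Adds a fixed offset to each agent's action before stepping the env."""

    def __init__(self, env, offset):
        super().__init__(env)  # without this call, self.env is never set
        self.offset = offset

    def step(self, actions):
        shifted = {agent: a + self.offset for agent, a in actions.items()}
        return super().step(shifted)


wrapped = OffsetActions(EchoEnv(), offset=10)
obs = wrapped.step({"0": 1, "1": 2})[0]
print(obs)  # {'0': 11, '1': 12}
```

If the super().__init__(env) line were removed, the first call to step() in this sketch would fail with an AttributeError because self.env would not exist.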

Methods

posggym.Wrapper.step(self, actions: Dict[str, WrapperActType]) -> Tuple[Dict[str, WrapperObsType], Dict[str, float], Dict[str, bool], Dict[str, bool], bool, Dict[str, Dict]]

Uses the step() of the env.

Can be overridden to change the returned data.

posggym.Wrapper.reset(self, *, seed: int | None = None, options: Dict[str, Any] | None = None) -> Tuple[Dict[str, WrapperObsType], Dict[str, Dict]]

Uses the reset() of the env.

Can be overridden to change the returned data.
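
To make the return types of step() and reset() concrete, here is a minimal episode loop. CountdownEnv is a hypothetical environment written only for this sketch, and the names given to the tuple elements (terminated, truncated, all_done) are labels chosen here; the signatures above only fix their types and order.

```python
class CountdownEnv:
    """Hypothetical env: every agent terminates after `horizon` steps."""

    def __init__(self, horizon=3):
        self.agents = ["0", "1"]
        self.horizon = horizon
        self.t = 0

    def reset(self, *, seed=None, options=None):
        self.t = 0
        return {i: self.t for i in self.agents}, {i: {} for i in self.agents}

    def step(self, actions):
        self.t += 1
        done = self.t >= self.horizon
        return (
            {i: self.t for i in self.agents},   # observations
            {i: 1.0 for i in self.agents},      # rewards
            {i: done for i in self.agents},     # per-agent terminated flags
            {i: False for i in self.agents},    # per-agent truncated flags
            done,                               # single episode-level flag
            {i: {} for i in self.agents},       # infos
        )


env = CountdownEnv()
obs, infos = env.reset(seed=0)  # seed and options are keyword-only
returns = {i: 0.0 for i in env.agents}
all_done = False
while not all_done:
    actions = {i: 0 for i in env.agents}  # placeholder policy
    obs, rewards, terminated, truncated, all_done, infos = env.step(actions)
    for i, r in rewards.items():
        returns[i] += r
print(returns)  # {'0': 3.0, '1': 3.0}
```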

posggym.Wrapper.render(self) -> None | np.ndarray | str | Dict[str, np.ndarray] | Dict[str, str]

Uses the render() of the env.

Can be overridden to change the returned data.

posggym.Wrapper.close(self)

Closes the wrapper and env.

Attributes

property Wrapper.model: POSGModel

Returns the Env model.

property Wrapper.state: WrapperStateType

Returns the Env state.

property Wrapper.possible_agents: Tuple[str, ...]

Returns the Env possible_agents.

property Wrapper.agents: List[str]

Returns the Env agents.

property Wrapper.action_spaces: Dict[str, spaces.Space]

Returns the Env action_spaces.

This is the Env action_spaces unless the wrapper overrides it, in which case the wrapper’s action_spaces is used.

property Wrapper.observation_spaces: Dict[str, spaces.Space]

Returns the Env observation_spaces.

This is the Env observation_spaces unless the wrapper overrides it, in which case the wrapper’s observation_spaces is used.

property Wrapper.reward_ranges: Dict[str, Tuple[float, float]]

Returns the Env reward_ranges.

This is the Env reward_ranges unless the wrapper overrides it, in which case the wrapper’s reward_ranges is used.
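
The fallback behavior described for action_spaces, observation_spaces and reward_ranges can be pictured with a property that returns the wrapper's own value when one has been set and the environment's value otherwise. The sketch below is an illustrative pattern under that assumption, not POSGGym's source; BaseEnv and this Wrapper are hypothetical stand-ins, and spaces are represented as plain strings to keep the example dependency-free.

```python
class BaseEnv:
    """Hypothetical env with fixed per-agent spaces (plain strings here)."""

    observation_spaces = {"0": "Discrete(4)", "1": "Discrete(4)"}


class Wrapper:
    """Stand-in showing the fallback: use the override if set, else the env's value."""

    def __init__(self, env):
        self.env = env
        self._observation_spaces = None  # None means "not overridden"

    @property
    def observation_spaces(self):
        if self._observation_spaces is None:
            return self.env.observation_spaces
        return self._observation_spaces

    @observation_spaces.setter
    def observation_spaces(self, spaces):
        self._observation_spaces = spaces


base = BaseEnv()
wrapped = Wrapper(base)
before = wrapped.observation_spaces  # falls back to the env's spaces

wrapped.observation_spaces = {"0": "Box(4,)", "1": "Box(4,)"}
print(wrapped.observation_spaces["0"])  # Box(4,)
print(base.observation_spaces["0"])     # Discrete(4)
```

The override lives on the wrapper only, which is why the underlying environment's attribute is unchanged after the assignment.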

property Wrapper.metadata: Dict[str, Any]

Returns the Env metadata.

property Wrapper.spec: EnvSpec | None

Returns the Env spec attribute.

property Wrapper.render_mode: str | None

Returns the Env render_mode.

posggym.Wrapper.env

The environment one level underneath this wrapper.

This may itself be a wrapped environment. To obtain the environment underneath all layers of wrappers, use posggym.Wrapper.unwrapped.

property Wrapper.unwrapped: Env

Returns the base environment of the wrapper.

This will be the bare posggym.Env environment, underneath all layers of wrappers.

POSGGym Wrappers

POSGGym provides a number of commonly used wrappers listed below.

Name                 Type                 Description
-------------------  -------------------  -----------
DiscretizeActions    Action Wrapper       An Action wrapper that discretizes continuous action spaces
RescaleActions       Action Wrapper       An Action wrapper for rescaling actions
FlattenObservations  Observation Wrapper  An Observation wrapper that flattens the observation
RescaleObservations  Observation Wrapper  An Observation wrapper for rescaling observations
OrderEnforcing       Misc Wrapper         This will produce an error if step or render is called before reset
PassiveEnvChecker    Misc Wrapper         Checks that the step, reset and render functions follow the posggym API.
RecordVideo          Misc Wrapper         This wrapper will record videos of rollouts.
TimeLimit            Misc Wrapper         This wrapper will emit a truncated signal if the specified number of steps is exceeded in an episode.
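
To show what a truncation signal looks like in the six-element step() return, the sketch below implements a simple step limit. It is a hypothetical stand-in, not the TimeLimit wrapper itself: EndlessEnv and StepLimit are invented for this example, the sketch truncates on the step that reaches the limit, and it assumes that truncating every agent also sets the episode-level flag. The real wrapper's exact boundary behavior may differ.

```python
class EndlessEnv:
    """Hypothetical env that never terminates on its own."""

    agents = ["0", "1"]

    def reset(self, *, seed=None, options=None):
        return {i: 0 for i in self.agents}, {i: {} for i in self.agents}

    def step(self, actions):
        return (
            {i: 0 for i in self.agents},
            {i: 0.0 for i in self.agents},
            {i: False for i in self.agents},
            {i: False for i in self.agents},
            False,
            {i: {} for i in self.agents},
        )


class StepLimit:
    """Stand-in time limit: flags truncation once `max_episode_steps` is reached."""

    def __init__(self, env, max_episode_steps):
        self.env = env
        self.max_episode_steps = max_episode_steps
        self.elapsed_steps = 0

    def reset(self, *, seed=None, options=None):
        self.elapsed_steps = 0  # a new episode restarts the count
        return self.env.reset(seed=seed, options=options)

    def step(self, actions):
        obs, rewards, terminated, truncated, all_done, infos = self.env.step(actions)
        self.elapsed_steps += 1
        if self.elapsed_steps >= self.max_episode_steps:
            truncated = {i: True for i in truncated}
            all_done = True  # assumption: truncating every agent ends the episode
        return obs, rewards, terminated, truncated, all_done, infos


env = StepLimit(EndlessEnv(), max_episode_steps=2)
env.reset()
first = env.step({"0": 0, "1": 0})
second = env.step({"0": 0, "1": 0})
print(first[3], first[4])    # {'0': False, '1': False} False
print(second[3], second[4])  # {'0': True, '1': True} True
```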