Wrappers

POSGGym includes a wrapper API (adopted from Gymnasium) as well as a collection of common wrappers that provide a convenient way to modify an existing environment without having to alter the underlying code directly. Wrappers can be applied to any posggym.Env environment, and can also be chained to combine their effects.

To wrap an environment, first initialize a base environment. Then pass this environment, along with any wrapper parameters (some of which may be optional), to the wrapper’s constructor.

>>> import posggym
>>> from posggym.wrappers import FlattenObservations
>>> base_env = posggym.make("PursuitEvasion-v1")
>>> base_env.observation_spaces['0']
Tuple(Tuple(Discrete(2), Discrete(2), Discrete(2), Discrete(2), Discrete(2), Discrete(2)), Tuple(Discrete(16), Discrete(16)), Tuple(Discrete(16), Discrete(16)), Tuple(Discrete(16), Discrete(16)))
>>> wrapped_env = FlattenObservations(base_env)
>>> wrapped_env.observation_spaces['0']
Box(0, 1, (108,), int64)

You can access the environment underneath the top-most wrapper by using the posggym.Wrapper.env attribute. Since the posggym.Wrapper class inherits from posggym.Env, the environment from the posggym.Wrapper.env attribute can be another wrapper.

>>> wrapped_env
<FlattenObservations<TimeLimit<OrderEnforcing<PassiveEnvChecker<PursuitEvasionEnv<PursuitEvasion-v1>>>>>>
>>> wrapped_env.env
<TimeLimit<OrderEnforcing<PassiveEnvChecker<PursuitEvasionEnv<PursuitEvasion-v1>>>>>

If you want to get to the environment underneath all of the layers of wrappers, you can use the posggym.Wrapper.unwrapped attribute. If the environment is already a bare environment, this will just return the environment itself.

>>> wrapped_env
<FlattenObservations<TimeLimit<OrderEnforcing<PassiveEnvChecker<PursuitEvasionEnv<PursuitEvasion-v1>>>>>>
>>> wrapped_env.unwrapped
<posggym.envs.grid_world.pursuit_evasion.PursuitEvasionEnv object at 0x7f4a94086d90>
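
The relationship between env and unwrapped can be illustrated without importing posggym. The sketch below is a stand-alone approximation: ToyEnv, ToyWrapper, Outer, Inner, and wrapper_chain are hypothetical names invented for this example, and the code only mimics the documented behavior of the two attributes rather than reproducing the library's implementation.

```python
class ToyEnv:
    """Hypothetical bare environment (stand-in for a posggym.Env)."""

    @property
    def unwrapped(self):
        # A bare environment is its own base environment.
        return self


class ToyWrapper(ToyEnv):
    """Hypothetical wrapper (stand-in for posggym.Wrapper)."""

    def __init__(self, env):
        self.env = env  # the environment one level underneath

    @property
    def unwrapped(self):
        # Delegate downwards until a bare environment answers.
        return self.env.unwrapped


class Inner(ToyWrapper):
    pass


class Outer(ToyWrapper):
    pass


def wrapper_chain(env):
    """List class names from the outermost wrapper down to the base env."""
    names = [type(env).__name__]
    while isinstance(env, ToyWrapper):
        env = env.env
        names.append(type(env).__name__)
    return names


base = ToyEnv()
wrapped = Outer(Inner(base))
chain = wrapper_chain(wrapped)
print(chain)  # ['Outer', 'Inner', 'ToyEnv']
```

Following env peels off one layer at a time, while unwrapped jumps straight to the bottom, so here wrapped.env.env and wrapped.unwrapped are the same object.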

There are three common things you might want a wrapper to do:

Transform actions before applying them to the base environment
Transform observations that are returned by the base environment
Transform rewards that are returned by the base environment

Such wrappers can be easily implemented by inheriting from posggym.ActionWrapper, posggym.ObservationWrapper, or posggym.RewardWrapper and implementing the respective transformation. If you need a wrapper to do more complicated tasks, you can inherit from the posggym.Wrapper class directly.
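
The three kinds of transformation can be sketched in plain Python. This section does not show the hook methods of posggym.ActionWrapper, posggym.ObservationWrapper, or posggym.RewardWrapper, so the example below does not use them. Instead it overrides step() and reset() on a hypothetical stand-in (ToyWrapper) around a hypothetical two-agent environment (CounterEnv). It assumes only the per-agent dictionary layout and the six-element step() return value described in this document.

```python
class CounterEnv:
    """Hypothetical two-agent environment with a shared integer counter."""

    possible_agents = ("0", "1")

    def reset(self, *, seed=None, options=None):
        self.count = 0
        obs = {i: self.count for i in self.possible_agents}
        infos = {i: {} for i in self.possible_agents}
        return obs, infos

    def step(self, actions):
        # Each agent's action is added to the counter and is also its reward.
        self.count += sum(actions.values())
        ids = self.possible_agents
        obs = {i: self.count for i in ids}
        rewards = {i: float(actions[i]) for i in ids}
        terminated = {i: self.count >= 10 for i in ids}
        truncated = {i: False for i in ids}
        all_done = all(terminated.values())
        infos = {i: {} for i in ids}
        return obs, rewards, terminated, truncated, all_done, infos


class ToyWrapper:
    """Hypothetical pass-through wrapper; subclasses override one piece."""

    def __init__(self, env):
        self.env = env

    def reset(self, *, seed=None, options=None):
        return self.env.reset(seed=seed, options=options)

    def step(self, actions):
        return self.env.step(actions)


class ClipActions(ToyWrapper):
    """Transforms actions: clips each one to the range [0, 1]."""

    def step(self, actions):
        clipped = {i: max(0, min(1, a)) for i, a in actions.items()}
        return self.env.step(clipped)


class ParityObservations(ToyWrapper):
    """Transforms observations: reports only whether the counter is odd."""

    def reset(self, *, seed=None, options=None):
        obs, infos = self.env.reset(seed=seed, options=options)
        return {i: o % 2 for i, o in obs.items()}, infos

    def step(self, actions):
        obs, rewards, term, trunc, done, infos = self.env.step(actions)
        parity = {i: o % 2 for i, o in obs.items()}
        return parity, rewards, term, trunc, done, infos


class ScaleRewards(ToyWrapper):
    """Transforms rewards: multiplies each agent's reward by a constant."""

    def __init__(self, env, scale):
        super().__init__(env)
        self.scale = scale

    def step(self, actions):
        obs, rewards, term, trunc, done, infos = self.env.step(actions)
        scaled = {i: self.scale * r for i, r in rewards.items()}
        return obs, scaled, term, trunc, done, infos


# Wrappers chain: the outermost one sees the results of the inner ones.
env = ScaleRewards(ParityObservations(ClipActions(CounterEnv())), scale=0.5)
first_obs, _ = env.reset()
obs, rewards, terminated, truncated, all_done, infos = env.step({"0": 5, "1": -3})
print(obs, rewards)  # {'0': 1, '1': 1} {'0': 0.5, '1': 0.0}
```

The actions 5 and -3 are clipped to 1 and 0 before reaching the environment, the counter value 1 is reported as parity 1, and the base rewards 1.0 and 0.0 are halved on the way out.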

posggym.Wrapper

class posggym.Wrapper(env: Env[StateType, ObsType, ActType])

Wraps a posggym.Env to allow a modular transformation.

This class is the base class for all wrappers. Wrappers that inherit from this class can modify the action_spaces, observation_spaces, reward_ranges, and metadata attributes without changing the underlying environment’s attributes.

Moreover, the behavior of the step() and reset() methods can be changed by these wrappers. Some attributes (spec, render_mode) will point back to the wrapper’s environment (i.e. to the corresponding attributes of env).

Note

If you inherit from Wrapper, don’t forget to call super().__init__(env) if the subclass overrides the __init__ method.
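
The reason for this note can be seen in a small stand-alone sketch. ToyWrapper below is a hypothetical stand-in for posggym.Wrapper that stores the wrapped environment in its constructor; EchoEnv, Good, and Forgetful are likewise invented for illustration. How the real class fails when its constructor is skipped may differ in detail.

```python
class EchoEnv:
    """Hypothetical environment whose step() just echoes the actions."""

    def step(self, actions):
        return dict(actions)


class ToyWrapper:
    """Hypothetical base wrapper: the constructor stores the wrapped env."""

    def __init__(self, env):
        self.env = env

    def step(self, actions):
        return self.env.step(actions)


class Good(ToyWrapper):
    def __init__(self, env, scale):
        super().__init__(env)  # sets self.env
        self.scale = scale


class Forgetful(ToyWrapper):
    def __init__(self, env, scale):
        # super().__init__(env) is missing, so self.env is never set.
        self.scale = scale


good_result = Good(EchoEnv(), scale=2).step({"0": 1})

try:
    Forgetful(EchoEnv(), scale=2).step({"0": 1})
    forgetful_error = None
except AttributeError as exc:
    forgetful_error = type(exc).__name__

print(good_result, forgetful_error)  # {'0': 1} AttributeError
```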

Methods

posggym.Wrapper.step(self, actions: Dict[str, WrapperActType]) → Tuple[Dict[str, WrapperObsType], Dict[str, float], Dict[str, bool], Dict[str, bool], bool, Dict[str, Dict]]

Uses the step() of the env.

Can be overridden to change the returned data.

posggym.Wrapper.reset(self, *, seed: int | None = None, options: Dict[str, Any] | None = None) → Tuple[Dict[str, WrapperObsType], Dict[str, Dict]]

Uses the reset() of the env.

Can be overridden to change the returned data.

posggym.Wrapper.render(self) → None | np.ndarray | str | Dict[str, np.ndarray] | Dict[str, str]

Uses the render() of the env.

Can be overridden to change the returned data.

posggym.Wrapper.close(self): Closes the wrapper and env.

Attributes

property Wrapper.model: POSGModel: Returns the Env model.

property Wrapper.state: WrapperStateType: Returns the Env state.

property Wrapper.possible_agents: Tuple[str, ...]: Returns the Env possible_agents.

property Wrapper.agents: List[str]: Returns the Env agents.

property Wrapper.action_spaces: Dict[str, spaces.Space]

Returns the Env action_spaces.

This is the wrapped Env’s action_spaces unless the wrapper overwrites it, in which case the wrapper’s action_spaces is used.

property Wrapper.observation_spaces: Dict[str, spaces.Space]

Returns the Env observation_spaces.

This is the wrapped Env’s observation_spaces unless the wrapper overwrites it, in which case the wrapper’s observation_spaces is used.

property Wrapper.reward_ranges: Dict[str, Tuple[float, float]]

Returns the Env reward_ranges.

This is the wrapped Env’s reward_ranges unless the wrapper overwrites it, in which case the wrapper’s reward_ranges is used.

property Wrapper.metadata: Dict[str, Any]: Returns the Env metadata.

property Wrapper.spec: EnvSpec | None: Returns the Env spec attribute.

property Wrapper.render_mode: str | None: Returns the Env render_mode.

posggym.Wrapper.env

The environment one level underneath this wrapper.

This may itself be a wrapped environment. To obtain the environment underneath all layers of wrappers, use posggym.Wrapper.unwrapped.

property Wrapper.unwrapped: Env

Returns the base environment of the wrapper.

This will be the bare posggym.Env environment, underneath all layers of wrappers.
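
The fallback described above for action_spaces, observation_spaces, and reward_ranges (use the wrapped environment's value unless the wrapper sets its own) can be obtained with a property and a private override slot. The sketch below shows one way to do this for reward_ranges. It is not taken from the posggym source: ToyEnv, ToyWrapper, TenfoldRewards, and the _reward_ranges slot are hypothetical.

```python
class ToyEnv:
    """Hypothetical base environment with a fixed per-agent reward range."""

    reward_ranges = {"0": (-1.0, 1.0), "1": (-1.0, 1.0)}


class ToyWrapper:
    """Hypothetical wrapper whose reward_ranges falls back to the wrapped env."""

    def __init__(self, env):
        self.env = env
        self._reward_ranges = None  # None means "not overwritten"

    @property
    def reward_ranges(self):
        if self._reward_ranges is None:
            return self.env.reward_ranges
        return self._reward_ranges

    @reward_ranges.setter
    def reward_ranges(self, value):
        # Stored on the wrapper only; the wrapped env is left untouched.
        self._reward_ranges = value


class TenfoldRewards(ToyWrapper):
    """Hypothetical wrapper that advertises a reward range ten times wider."""

    def __init__(self, env):
        super().__init__(env)
        self.reward_ranges = {
            i: (10 * low, 10 * high)
            for i, (low, high) in env.reward_ranges.items()
        }


base = ToyEnv()
plain = ToyWrapper(base)
scaled = TenfoldRewards(base)

print(plain.reward_ranges["0"])   # (-1.0, 1.0), read through from the base env
print(scaled.reward_ranges["0"])  # (-10.0, 10.0), the wrapper's own value
print(base.reward_ranges["0"])    # (-1.0, 1.0), unchanged by the wrapper
```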

POSGGym Wrappers

POSGGym provides a number of commonly used wrappers listed below.

Name	Type	Description
`DiscretizeActions`	Action Wrapper	Discretizes continuous action spaces.
`RescaleActions`	Action Wrapper	Rescales actions.
`FlattenObservations`	Observation Wrapper	Flattens the observations.
`RescaleObservations`	Observation Wrapper	Rescales observations.
`OrderEnforcing`	Misc Wrapper	Produces an error if step or render is called before reset.
`PassiveEnvChecker`	Misc Wrapper	Checks that the step, reset and render functions follow the posggym API.
`RecordVideo`	Misc Wrapper	Records videos of rollouts.
`TimeLimit`	Misc Wrapper	Emits a truncated signal if the specified number of steps is exceeded in an episode.
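
To give a feel for how one of the miscellaneous wrappers might work, the sketch below imitates a time limit in plain Python. It is not posggym's TimeLimit: EndlessEnv and ToyTimeLimit are hypothetical, the max_episode_steps parameter name is an assumption borrowed from Gymnasium, and the choice to truncate every agent and set the all-done flag on the step that reaches the limit is also an assumption.

```python
class EndlessEnv:
    """Hypothetical environment whose episodes never end on their own."""

    possible_agents = ("0", "1")

    def reset(self, *, seed=None, options=None):
        ids = self.possible_agents
        return {i: 0 for i in ids}, {i: {} for i in ids}

    def step(self, actions):
        ids = self.possible_agents
        obs = {i: 0 for i in ids}
        rewards = {i: 0.0 for i in ids}
        terminated = {i: False for i in ids}
        truncated = {i: False for i in ids}
        infos = {i: {} for i in ids}
        return obs, rewards, terminated, truncated, False, infos


class ToyTimeLimit:
    """Hypothetical wrapper that truncates episodes after a fixed step count."""

    def __init__(self, env, max_episode_steps):
        self.env = env
        self.max_episode_steps = max_episode_steps
        self.elapsed_steps = 0

    def reset(self, *, seed=None, options=None):
        self.elapsed_steps = 0  # a new episode restarts the count
        return self.env.reset(seed=seed, options=options)

    def step(self, actions):
        obs, rewards, term, trunc, done, infos = self.env.step(actions)
        self.elapsed_steps += 1
        if self.elapsed_steps >= self.max_episode_steps:
            # Assumption: every agent is truncated and the episode ends.
            trunc = {i: True for i in trunc}
            done = True
        return obs, rewards, term, trunc, done, infos


env = ToyTimeLimit(EndlessEnv(), max_episode_steps=3)
env.reset()
steps_taken = 0
all_done = False
while not all_done:
    _, _, terminated, truncated, all_done, _ = env.step({"0": 0, "1": 0})
    steps_taken += 1
print(steps_taken, truncated)  # 3 {'0': True, '1': True}
```

Keeping truncated separate from terminated lets a learning algorithm distinguish an episode that was cut short from one that reached a terminal state.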