Pursuit Evasion
These policies are for the Pursuit Evasion environment. See the environment page for detailed information about the environment.
Generic
These policies can be used for any version of this environment.
import posggym
env = posggym.make("PursuitEvasion-v1")
| Policy | ID | Valid Agent IDs | Description |
|---|---|---|---|
| | | All | Takes the shortest path to the evader’s goal (when controlling the evader), or to the evader’s start location and then to the other possible evader start and goal locations (when controlling the pursuer). |
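The shortest-path behaviour described in the table can be illustrated with a small breadth-first search. This is a minimal sketch under stated assumptions, not the library's implementation: the grid encoding (`#` for walls), the 4-connected `MOVES`, and the helper names `shortest_path` and `pursuer_route` are all hypothetical and chosen only for illustration.

```python
from collections import deque

# Assumption: 4-connected movement, expressed as (row, col) offsets.
MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def shortest_path(grid, start, goal):
    """Return the cells from start to goal (inclusive), or None if unreachable."""
    rows, cols = len(grid), len(grid[0])
    parent = {start: None}  # doubles as the visited set
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Walk the parent links back to the start, then reverse.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        for dr, dc in MOVES:
            nxt = (cell[0] + dr, cell[1] + dc)
            in_bounds = 0 <= nxt[0] < rows and 0 <= nxt[1] < cols
            if in_bounds and grid[nxt[0]][nxt[1]] != "#" and nxt not in parent:
                parent[nxt] = cell
                queue.append(nxt)
    return None


def pursuer_route(grid, start, targets):
    """Chain shortest paths through the targets in the given order."""
    route, pos = [start], start
    for target in targets:
        leg = shortest_path(grid, pos, target)
        if leg is None:
            continue  # skip targets that cannot be reached
        route.extend(leg[1:])  # drop the first cell, it is already in the route
        pos = target
    return route


# Hypothetical 3x4 layout: '.' is free space, '#' is a wall.
GRID = [
    "....",
    ".##.",
    "....",
]
```

In this sketch an evader would follow `shortest_path` to its goal, while a pursuer would call `pursuer_route` with the evader's start location first, followed by the other candidate start and goal locations.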
grid=16x16
env = posggym.make(
"PursuitEvasion-v1",
grid="16x16",
max_obs_distance=12,
use_progress_reward=True
)
| Policy | ID | Valid Agent IDs | Description |
|---|---|---|---|
| | | | Level 0 K-Level Reasoning deep RL policy trained using PPO and the Synchronous KLR algorithm. |
| | | | Level 1 K-Level Reasoning deep RL policy trained using PPO and the Synchronous KLR algorithm. |
| | | | Level 2 K-Level Reasoning deep RL policy trained using PPO and the Synchronous KLR algorithm. |
| | | | Level 3 K-Level Reasoning deep RL policy trained using PPO and the Synchronous KLR algorithm. |
| | | | Level 4 K-Level Reasoning deep RL policy trained using PPO and the Synchronous KLR algorithm. |
| | | | Best-response to the K-Level Reasoning policies. A deep RL policy trained using PPO and the Synchronous KLR algorithm. |
| | | | Deep RL policy trained using PPO and self-play. |
| | | | Deep RL policy trained using PPO and self-play. |
| | | | Deep RL policy trained using PPO and self-play. |
| | | | Deep RL policy trained using PPO and self-play. |
| | | | Deep RL policy trained using PPO and self-play. |
| | | | Deep RL policy trained using PPO and self-play. |
| | | | Deep RL policy trained using PPO and self-play. |
| | | | Deep RL policy trained using PPO and self-play. |
| | | | Deep RL policy trained using PPO and self-play. |
| | | | Deep RL policy trained using PPO and self-play. |
| | | | Deep RL policy trained using PPO and self-play. |
| | | | Level 0 K-Level Reasoning deep RL policy trained using PPO and the Synchronous KLR algorithm. |
| | | | Level 1 K-Level Reasoning deep RL policy trained using PPO and the Synchronous KLR algorithm. |
| | | | Level 2 K-Level Reasoning deep RL policy trained using PPO and the Synchronous KLR algorithm. |
| | | | Level 3 K-Level Reasoning deep RL policy trained using PPO and the Synchronous KLR algorithm. |
| | | | Level 4 K-Level Reasoning deep RL policy trained using PPO and the Synchronous KLR algorithm. |
| | | | Best-response to the K-Level Reasoning policies. A deep RL policy trained using PPO and the Synchronous KLR algorithm. |
| | | | Deep RL policy trained using PPO and self-play. |
| | | | Deep RL policy trained using PPO and self-play. |
| | | | Deep RL policy trained using PPO and self-play. |
| | | | Deep RL policy trained using PPO and self-play. |
| | | | Deep RL policy trained using PPO and self-play. |
| | | | Deep RL policy trained using PPO and self-play. |
| | | | Deep RL policy trained using PPO and self-play. |
| | | | Deep RL policy trained using PPO and self-play. |
| | | | Deep RL policy trained using PPO and self-play. |
| | | | Deep RL policy trained using PPO and self-play. |
| | | | Deep RL policy trained using PPO and self-play. |
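The relationship between the K-Level Reasoning (KLR) policies listed above can be sketched as a training schedule. In standard K-level reasoning, a level k policy is trained as a best response to the level k-1 policy. The synchronous variant is commonly described as updating all levels in the same training iteration, rather than training each level to completion before starting the next. The sketch below only enumerates who trains against whom. The labels (`k0`, `br`, `random`), the assumption that level 0 trains against a uniform-random opponent, and the assumption that the best-response policy trains against all levels are illustrative choices, not details confirmed by this page.

```python
def klr_opponents(max_level):
    """Map each learner label to the opponent(s) it trains against."""
    pairs = {}
    for k in range(max_level + 1):
        # Assumption: level 0 best-responds to a uniform-random opponent.
        pairs[f"k{k}"] = ["random"] if k == 0 else [f"k{k - 1}"]
    # Assumption: the best-response policy trains against every KLR level.
    pairs["br"] = [f"k{k}" for k in range(max_level + 1)]
    return pairs


def synchronous_schedule(max_level, num_iterations):
    """List (iteration, learner, opponents) for every policy update.

    Synchronous here means every learner is updated in every iteration,
    each against the current snapshot of its opponent(s).
    """
    pairs = klr_opponents(max_level)
    schedule = []
    for iteration in range(num_iterations):
        for learner, opponents in pairs.items():
            schedule.append((iteration, learner, opponents))
    return schedule


if __name__ == "__main__":
    for step in synchronous_schedule(max_level=4, num_iterations=1):
        print(step)
```

With `max_level=4` this yields six learners per iteration, matching the five KLR levels and one best-response policy that appear in each half of the table.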
grid=8x8
env = posggym.make(
"PursuitEvasion-v1",
grid="8x8",
max_obs_distance=12,
use_progress_reward=True
)
| Policy | ID | Valid Agent IDs | Description |
|---|---|---|---|
| | | | Level 0 K-Level Reasoning deep RL policy trained using PPO and the Synchronous KLR algorithm. |
| | | | Level 1 K-Level Reasoning deep RL policy trained using PPO and the Synchronous KLR algorithm. |
| | | | Level 2 K-Level Reasoning deep RL policy trained using PPO and the Synchronous KLR algorithm. |
| | | | Level 3 K-Level Reasoning deep RL policy trained using PPO and the Synchronous KLR algorithm. |
| | | | Level 4 K-Level Reasoning deep RL policy trained using PPO and the Synchronous KLR algorithm. |
| | | | Best-response to the K-Level Reasoning policies. A deep RL policy trained using PPO and the Synchronous KLR algorithm. |
| | | | Deep RL policy trained using PPO and self-play. |
| | | | Deep RL policy trained using PPO and self-play. |
| | | | Deep RL policy trained using PPO and self-play. |
| | | | Deep RL policy trained using PPO and self-play. |
| | | | Deep RL policy trained using PPO and self-play. |
| | | | Deep RL policy trained using PPO and self-play. |
| | | | Deep RL policy trained using PPO and self-play. |
| | | | Level 0 K-Level Reasoning deep RL policy trained using PPO and the Synchronous KLR algorithm. |
| | | | Level 1 K-Level Reasoning deep RL policy trained using PPO and the Synchronous KLR algorithm. |
| | | | Level 2 K-Level Reasoning deep RL policy trained using PPO and the Synchronous KLR algorithm. |
| | | | Level 3 K-Level Reasoning deep RL policy trained using PPO and the Synchronous KLR algorithm. |
| | | | Level 4 K-Level Reasoning deep RL policy trained using PPO and the Synchronous KLR algorithm. |
| | | | Best-response to the K-Level Reasoning policies. A deep RL policy trained using PPO and the Synchronous KLR algorithm. |
| | | | Deep RL policy trained using PPO and self-play. |
| | | | Deep RL policy trained using PPO and self-play. |
| | | | Deep RL policy trained using PPO and self-play. |
| | | | Deep RL policy trained using PPO and self-play. |
| | | | Deep RL policy trained using PPO and self-play. |
| | | | Deep RL policy trained using PPO and self-play. |
| | | | Deep RL policy trained using PPO and self-play. |