Multi Access Broadcast Channel
This environment is part of the Classic environments. Please read that page first for general information.
Possible Agents | ('0', '1')
Action Spaces | {'0': Discrete(2), '1': Discrete(2)}
Observation Spaces | {'0': Discrete(2), '1': Discrete(2)}
Symmetric | True
Import |
The Multi-Access Broadcast Channel Environment.
A cooperative game involving control of a multi-access broadcast channel. In this problem, each agent controls a node in a network. The nodes need to broadcast messages to each other over a shared channel, and only one node can broadcast at a time. If more than one node broadcasts at the same time, there is a collision and no message is broadcast. The nodes share the common goal of maximizing the throughput of the channel.
Possible Agents
The environment supports two or more agents, although the default version uses only two. All agents are always active in the environment.
State Space
Each node has a message buffer that can store up to one message at a time.
That is, a buffer is either EMPTY=0 or FULL=1.
Action Space
At each time step, each agent can either send a message (SEND=0) or not send one (NOSEND=1).
Observation Space
At the end of each time step, each node receives a noisy observation of
whether there was a COLLISION=0 or NOCOLLISION=1.
Each agent observes the true outcome with probability observation_prob, which is
0.9 in the default version.
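The noisy observation described above can be sketched in a few lines. This is a minimal illustration, not the library's implementation: the function name sample_observation and the rng parameter are hypothetical, and the sketch assumes that an incorrect observation is simply the opposite signal and that each agent's noise is drawn independently.

```python
import random

COLLISION, NOCOLLISION = 0, 1


def sample_observation(true_outcome, observation_prob=0.9, rng=random):
    """Return the true outcome with probability observation_prob.

    Assumption: with the remaining probability the agent sees the
    opposite signal, since the observation is binary.
    """
    if rng.random() < observation_prob:
        return true_outcome
    return NOCOLLISION if true_outcome == COLLISION else COLLISION


def sample_joint_observation(true_outcome, num_nodes=2, observation_prob=0.9, rng=random):
    """Draw one noisy observation per agent (independent draws assumed)."""
    return tuple(
        sample_observation(true_outcome, observation_prob, rng)
        for _ in range(num_nodes)
    )
```

With observation_prob=0.9, each agent would receive the wrong signal on roughly one step in ten, so two agents can disagree about whether a collision occurred.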
Rewards
Each agent receives a reward of 1 when a message is successfully
broadcast and a reward of 0 otherwise.
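As a rough sketch of the reward rule, a broadcast succeeds when exactly one node transmits. The function name broadcast_reward is hypothetical, and the sketch assumes that a node only transmits when its buffer is FULL and it chooses SEND; the text above does not say how a SEND from an EMPTY buffer is treated.

```python
EMPTY, FULL = 0, 1
SEND, NOSEND = 0, 1


def broadcast_reward(buffers, actions):
    """Return the reward shared by every agent for one step.

    Assumption: a node transmits only if its buffer is FULL and its
    action is SEND. Exactly one transmitting node means success.
    """
    transmitting = sum(
        1 for buf, act in zip(buffers, actions) if buf == FULL and act == SEND
    )
    return 1.0 if transmitting == 1 else 0.0
```

For example, broadcast_reward((FULL, FULL), (SEND, NOSEND)) gives 1.0, while both nodes sending at once gives 0.0 because of the collision.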
Dynamics
If a node’s buffer is EMPTY, then at each step it becomes FULL with
probability fill_probs[i], where i is the agent ID; otherwise it remains
EMPTY. This is independent of the action performed. By default,
fill_probs is 0.9 for node 0 and 0.1 for node 1 (as per the paper).
If a node’s buffer is FULL and the node does not send a message (i.e. it uses the
NOSEND action), then the buffer remains FULL. If the node instead chooses the SEND
action and no other node sends a message (there is no COLLISION), then the buffer
will be FULL with probability fill_probs[i], otherwise it will be EMPTY.
If another node sends a message at the same time, then there is a
COLLISION and the node’s buffer remains FULL.
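The buffer rules above can be written as a small transition function. This is a sketch under stated assumptions rather than the environment's actual code: the name next_buffers and the rng parameter are hypothetical, and it assumes that only nodes with a FULL buffer that choose SEND count as sending.

```python
import random

EMPTY, FULL = 0, 1
SEND, NOSEND = 0, 1


def next_buffers(buffers, actions, fill_probs=(0.9, 0.1), rng=random):
    """Sample the next buffer state of every node."""
    # Assumption: a node sends only if its buffer is FULL and it picks SEND.
    senders = [
        i for i, (buf, act) in enumerate(zip(buffers, actions))
        if buf == FULL and act == SEND
    ]
    collision = len(senders) > 1

    result = []
    for i, (buf, act) in enumerate(zip(buffers, actions)):
        # New message arrives with probability fill_probs[i].
        refill = FULL if rng.random() < fill_probs[i] else EMPTY
        if buf == EMPTY:
            result.append(refill)   # fills regardless of the action taken
        elif act == NOSEND or collision:
            result.append(FULL)     # message was not delivered, so it stays
        else:
            result.append(refill)   # message delivered; buffer may refill
    return tuple(result)
```

Setting fill_probs to all zeros makes the function deterministic, which is convenient for checking each rule: next_buffers((FULL, FULL), (SEND, NOSEND), (0.0, 0.0)) returns (EMPTY, FULL).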
Starting State
Each node buffer starts as FULL with probability init_buffer_dist[i] (which is
1.0 by default), otherwise the buffer starts as EMPTY.
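Sampling the starting state amounts to one independent draw per node. The sketch below is illustrative only; sample_initial_buffers and the rng parameter are hypothetical names, and independence between nodes is an assumption.

```python
import random

EMPTY, FULL = 0, 1


def sample_initial_buffers(init_buffer_dist=(1.0, 1.0), rng=random):
    """Sample each node's starting buffer.

    Node i starts FULL with probability init_buffer_dist[i], else EMPTY.
    """
    return tuple(
        FULL if rng.random() < p else EMPTY for p in init_buffer_dist
    )
```

With the default distribution of (1.0, 1.0), every episode starts with both buffers FULL.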
Episode End
By default, episodes continue indefinitely. To set a step limit, specify
max_episode_steps when initializing the environment with posggym.make.
Arguments
num_nodes - the number of nodes (i.e. agents) in the network (default = 2)
fill_probs - the probability that each node’s buffer is filled. Should be a tuple with an entry for each node (default = None = (0.9, 0.1))
observation_prob - the probability of correctly observing whether there was a collision or not (default = 0.9)
init_buffer_dist - the probability that each node starts with a full buffer. Should be a tuple with an entry for each node (default = None = (1.0, 1.0))
Version History
v0: Initial version
References
Ooi, J. M., and Wornell, G. W. 1996. Decentralized control of a multiple access broadcast channel: Performance bounds. In Proceedings of the 35th Conference on Decision and Control, 293–298.
Hansen, E. A., Bernstein, D. S., and Zilberstein, S. 2004. Dynamic programming for partially observable stochastic games. In Proceedings of the 19th National Conference on Artificial Intelligence (AAAI’04), 709–715. San Jose, California: AAAI Press.