For questions about multi-agent reinforcement learning (MARL) agents, algorithms, or models.
Questions tagged [multi-agent-rl]
18 questions
4 votes, 3 answers
How do I get started with multi-agent reinforcement learning?
Is there a tutorial that walks through a multi-agent reinforcement learning implementation in Python using libraries such as OpenAI's Gym (for the environment), TF-Agents, and Stable-Baselines3?
I searched a lot, but I was not able to find any…
Rnj
3 votes, 2 answers
How to model a multi-agent reinforcement learning problem where actions of different agents can take different durations?
I am confused, at a conceptual level, about how to model a multi-agent reinforcement learning problem in which each agent's action takes a different amount of time to complete. This means that a certain action is performed…
hridayns
3 votes, 1 answer
Why can I still easily beat my Q-learning agent that was trained against another Q-learning agent to play tic-tac-toe?
I implemented the Q-learning algorithm to play tic-tac-toe. The AI plays against the same algorithm, but they don't share the same Q matrix. After 200,000 games, I still beat the AI very easily and it's rather dumb. My selection is made by epsilon…
Irindul
2 votes, 0 answers
When should a decision-making problem be modeled as a single-agent vs. a multi-agent problem?
I understand the goals and purposes of RL in the case of a single agent and the underlying model, i.e. MDPs, for RL problems (or sequential decision making with uncertainty in general).
My question is (and I know this will/may be subjective) are the…
David
2 votes, 0 answers
Which multi-agent reinforcement learning algorithm can I use when there are two types of agents with different action spaces?
Most of the papers on multi-agent RL (MARL) that I have encountered have multiple agents who have a common action space.
In my work, the scenario involves $m$ agents of one type (say type A) and $n$ agents of another type. Here,…
user3656142
2 votes, 1 answer
Is there a multi-agent deep reinforcement learning algorithm for environments with only discrete action spaces (not hybrid)?
Is there a multi-agent deep reinforcement learning algorithm that is designed for environments with only discrete action spaces (not hybrid) and uses centralized training?
I have been looking at algorithms (A2C, MADDPG, etc.) but still haven't found any…
Uur Kn
1 vote, 0 answers
Why does MARL require full history while single-agent RL (Sutton & Barto) uses state-based returns?
In single-agent RL (as in Sutton & Barto's framework), the expected return is defined from the current state $s_t$:
$$
G_t = R_t + R_{t+1} + \dots
$$
In the Bellman equation, $V(s_t)$ depends only on the current state and future rewards:
$$
V(s_t) =…
fermented_bean
1 vote, 1 answer
Which multi-agent reinforcement learning algorithms can I use for a chain reaction game?
I looked for examples online and found one that used AlphaZero. Are there any other algorithms that I can apply to this game?
Chain reaction game: we have a grid, and each player can place their pieces on any cell that is empty or occupied by the same…
angoor
1 vote, 3 answers
How do 2-player games fit under the MDP framework of reinforcement learning?
I am confused about the theoretical framework of reinforcement learning. For supervised learning, there seems to be a clear theoretical framework, e.g. as described by Wikipedia here. I am unclear about a similar framework for RL.
It seems that MDPs…
Joe C.
1 vote, 1 answer
PPO learning to achieve a non-Markovian task?
I've been trying to train agents to achieve a non-Markovian task in a modified version of PettingZoo's Waterworld. In the task, I have two pursuers (the agents I'm training) and three evaders. I won't go deep into detail, but the task consists in…
sleipnir
1 vote, 0 answers
How to correctly train policies in multi-agent RL?
I am diving into multi-agent reinforcement learning and, after reading some of the literature, would like to clarify a few approaches that I am unsure about. For the following two cases, it is clear that:
independent learning: one distinct policy…
thsolyt
1 vote, 1 answer
How can rewards and loss calculation be extended to multiple agents in a vanilla policy gradient RL setting?
Say I have a simple multi-agent reinforcement learning problem using vanilla policy gradient methods (i.e. REINFORCE) that is currently running with one network per agent. If I can say that each of my agents:
are all of the same class
have…
Josh
1 vote, 0 answers
When should I use a MARL approach instead of training one agent while keeping the others fixed?
I have built a custom multi-agent environment with PettingZoo that sets up a turn-based game with two agents, A and B.
I want to examine situations where malicious behavior may arise, given the game rules, and I am looking into training…
npit
0 votes, 0 answers
Could you please help me with my MADDPG (Multi-Agent Deep Deterministic Policy Gradient) implementation?
I am trying to implement MADDPG from scratch. I finished the code, but after 3000 episodes I still don't see any improvement in the agent's behaviour. Could someone please help me? Thank you in advance! Here is the code of my train function.
s :…
Meyssan
0 votes, 0 answers
Two-agent sequential RL
I have the following RL model that I want to train (see the diagram below). My idea is to have two agents, agent A and agent B. Agent A observes the input I1 and decides on an action, action1; then, immediately, agent B observes the input (action1, I2)…
zdm