I have read about Q-learning and am now reading about multi-agent environments. I tried to read the Friend-or-Foe Q-learning paper, but came away with only a very vague idea of what it describes.
What does Friend-or-Foe Q-learning mean, and how does it work? Could someone please explain the concept in a simple yet descriptive way that helps build the correct intuition?