What method is better to use for a two-player reinforcement learning environment?

Asked Jan 03 '22 at 19:46

Active Jan 05 '22 at 09:47

Viewed 583 times

I want to create an RL agent for a mancala-type two-player game as my first actual project in the field. I've already completed the game itself and coded a minimax algorithm.

The question is: how should I proceed? Which is the better approach: creating a custom OpenAI Gym environment and using Stable Baselines algorithms, or writing an AlphaZero-like Monte Carlo Tree Search (MCTS) algorithm from scratch?

People here suggested that it is easier to implement MCTS than to use Gym, since the latter does not natively support multiplayer games. But I thought I could incorporate my minimax algorithm into my custom environment, and since I already have both the game and the minimax algorithm, using Gym seems easier than writing MCTS.
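A minimal sketch of the Gym idea, under stated assumptions: a toy one-pile Nim game stands in for the mancala-type game (whose code is not shown here), and the names `NimVsMinimaxEnv`, `minimax_value`, and `minimax_move` are hypothetical. The class only mimics the classic Gym `reset`/`step` interface (with the 4-tuple return) and does not import `gym`. The minimax opponent replies inside `step`, so a single-agent algorithm only ever sees its own turns.

```python
import random
from functools import lru_cache

# Stand-in game (NOT the mancala variant from the question): one-pile Nim.
# Players alternate removing 1-3 stones; whoever takes the last stone wins.
MAX_TAKE = 3

@lru_cache(maxsize=None)
def minimax_value(stones):
    """Return +1 if the player to move wins with perfect play, else -1."""
    if stones == 0:
        return -1  # the previous player took the last stone, so the mover lost
    return max(-minimax_value(stones - t)
               for t in range(1, min(MAX_TAKE, stones) + 1))

def minimax_move(stones):
    """Pick the number of stones that leaves the other player worst off."""
    moves = range(1, min(MAX_TAKE, stones) + 1)
    return max(moves, key=lambda t: -minimax_value(stones - t))

class NimVsMinimaxEnv:
    """Gym-style single-agent view of a two-player game (hypothetical sketch)."""

    def __init__(self, start_stones=10):
        self.start_stones = start_stones
        self.stones = start_stones

    def reset(self):
        self.stones = self.start_stones
        return self.stones  # observation: stones left in the pile

    def step(self, action):
        take = action + 1  # actions 0..2 mean "take 1..3 stones"
        if take > self.stones:
            # Assumption: an illegal move ends the episode with a loss.
            return self.stones, -1.0, True, {"illegal": True}
        self.stones -= take
        if self.stones == 0:
            return self.stones, 1.0, True, {}  # agent took the last stone
        # The minimax opponent moves here, inside the environment.
        self.stones -= minimax_move(self.stones)
        if self.stones == 0:
            return self.stones, -1.0, True, {}  # opponent took the last stone
        return self.stones, 0.0, False, {}

# Usage: a random agent plays one episode against the minimax opponent.
env = NimVsMinimaxEnv(start_stones=10)
obs, done = env.reset(), False
while not done:
    obs, reward, done, info = env.step(random.randrange(MAX_TAKE))
print("final reward:", reward)
```

In this sketch the opponent is part of the environment dynamics, which is why no multiplayer support is needed from the interface; the choice to punish illegal moves with an immediate loss is one option among several.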

Are there any pitfalls I should avoid?

edited Jan 05 '22 at 09:47

nbro

asked Jan 03 '22 at 19:46

JollyOwl

0 Answers