
All the sources I can find give a similar explanation of each phase.

In the Selection Phase, we start at the root and choose child nodes until reaching a leaf. Once the leaf is reached (assuming the game is not terminated), we enter the Expansion Phase.

In the Expansion Phase, we expand any number of child nodes and select one of the expanded nodes. Then, we enter the Play-Out Phase.

Here is my confusion. If we choose to only expand a single node, the nodes that were not expanded will never be considered in future selections as we only select child nodes until a leaf is reached during the Selection Phase. Is this correct? If not, what am I misunderstanding about the Selection Phase?

nbro
Ralff

1 Answer


If we choose to only expand a single node, the nodes that were not expanded will never be considered in future selections as we only select child nodes until a leaf is reached during the Selection Phase. Is this correct?

No, this is not correct.

If not, what am I misunderstanding about the Selection Phase?

The selection phase does not end only when you reach a node with no expanded children. It ends as soon as you reach a node that has any unexpanded children. At that point, you typically pick one or more of the children that have not yet been expanded, expand them, and collect one or more rollout results for them. Variations are possible, such as choosing stochastically whether to expand or continue selecting, or expanding all child nodes at once and using value estimates to initialise them; the latter is what AlphaZero does.
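To make the stopping rule concrete, here is a minimal sketch in Python. It is not taken from any particular library: the toy game (a pile of stones, take 1 to 3 per turn, whoever takes the last stone wins), the names `Node`, `select`, `expand`, `rollout`, `backpropagate` and `iterate`, and the use of UCB1 as the selection rule are all assumptions chosen for illustration. The point to look at is the `while` condition in `select`: descent continues only while every legal move of the current node already has a child.

```python
import math
import random


class Node:
    """Tree node for a hypothetical toy game: take 1-3 stones, last stone wins."""

    def __init__(self, stones, parent=None):
        self.stones = stones
        self.parent = parent
        self.children = []
        # Legal moves that do not have a child node yet.
        self.untried = [m for m in (1, 2, 3) if m <= stones]
        self.visits = 0
        # Wins counted for the player who made the move leading into this node.
        self.wins = 0.0


def ucb1(child, c=1.4):
    # Assumed selection rule; every child has visits >= 1 when this is called.
    exploit = child.wins / child.visits
    explore = c * math.sqrt(math.log(child.parent.visits) / child.visits)
    return exploit + explore


def select(node):
    # Stop at the first node that is terminal OR still has untried moves.
    while node.stones > 0 and not node.untried:
        node = max(node.children, key=ucb1)
    return node


def expand(node):
    # Turn one randomly chosen untried move into a new child node.
    move = node.untried.pop(random.randrange(len(node.untried)))
    child = Node(node.stones - move, parent=node)
    node.children.append(child)
    return child


def rollout(stones):
    # Random playout; returns 1.0 if the player who moved into this state wins.
    mover_to_play = False  # the opponent of that player moves first
    while stones > 0:
        stones -= random.randint(1, min(3, stones))
        if stones == 0:
            return 1.0 if mover_to_play else 0.0
        mover_to_play = not mover_to_play
    return 1.0  # state was already terminal: the mover took the last stone


def backpropagate(node, result):
    # Flip the result at each level because the players alternate.
    while node is not None:
        node.visits += 1
        node.wins += result
        result = 1.0 - result
        node = node.parent


def iterate(root):
    node = select(root)
    if node.stones > 0:
        node = expand(node)
    backpropagate(node, rollout(node.stones))


root = Node(5)
for i in range(4):
    iterate(root)
    print(i + 1, "children of root:", len(root.children))
```

With a root of 5 stones there are three legal moves, so under these assumptions the first three iterations each stop at the root and add a sibling there. Only once `root.untried` is empty does `select` descend past the root, which is why the moves skipped in an earlier expansion are still considered later.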

Neil Slater