What is Thompson sampling in simple terms?

Asked Feb 02 '22 at 11:23

Active Feb 02 '22 at 12:50


I am studying the different methods of action selection in reinforcement learning.

I found several methods, such as epsilon-greedy, softmax, upper confidence bound (UCB), and Thompson sampling.

I managed to understand the principle of each method except Thompson sampling.

I can't understand its principle, how it works, or the steps it follows to select an action.

I would be grateful if someone could explain the principle and workings of Thompson sampling with a simple example.

edited Feb 02 '22 at 12:50

Neil Slater

asked Feb 02 '22 at 11:23

user14053977

0 Answers