
I recently tried to reproduce the results of double Q-learning, but my results are not satisfying. I compared double Q-learning with standard Q-learning in Taxi-v3, FrozenLake with slipperiness disabled, Roulette-v0, and a few other environments, and Q-learning outperforms double Q-learning in all of them.

I am not sure whether there is something wrong with my implementation, as many materials about double Q-learning actually focus on double DQN. While checking my code, I also wondered: is there any toy example that clearly demonstrates the advantage of double Q-learning?
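For context, here is a minimal sketch of the tabular double Q-learning update as I understand it from van Hasselt (2010); the dict-based tables and the four-action set are just illustrative assumptions, not my actual code:

```python
import random

def double_q_update(Q1, Q2, s, a, r, s_next, done,
                    actions=(0, 1, 2, 3), alpha=0.1, gamma=0.99):
    """One tabular double Q-learning step (van Hasselt, 2010).

    Q1, Q2: dicts mapping (state, action) -> value.
    With probability 0.5 we update Q1, using Q2 to evaluate the
    greedy next action; otherwise the roles are swapped.
    """
    if random.random() < 0.5:
        update, evaluate = Q1, Q2
    else:
        update, evaluate = Q2, Q1
    if done:
        target = r
    else:
        # Greedy action chosen by the table being updated...
        a_star = max(actions, key=lambda b: update.get((s_next, b), 0.0))
        # ...but evaluated by the other table; this decoupling is what
        # removes the maximisation bias of standard Q-learning.
        target = r + gamma * evaluate.get((s_next, a_star), 0.0)
    td_error = target - update.get((s, a), 0.0)
    update[(s, a)] = update.get((s, a), 0.0) + alpha * td_error
```

At decision time the greedy policy is taken with respect to `Q1[(s, a)] + Q2[(s, a)]` (or their average). Does this match the update everyone else is using, or am I missing a detail?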

David
