Questions tagged [double-q-learning]

For questions related to the (tabular) version of the double Q-learning algorithm, which was introduced in "Double Q-learning" (NeurIPS 2010) by Hado van Hasselt.
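For reference, a minimal sketch of the tabular update this tag refers to (variable names and hyperparameters are illustrative; the structure follows the paper, which selects the greedy action with one table and evaluates it with the other):

```python
import numpy as np

def double_q_update(Q1, Q2, s, a, r, s_next, alpha=0.1, gamma=0.99, rng=np.random):
    """One tabular double Q-learning update.

    Q1 and Q2 are (n_states, n_actions) arrays. With probability 1/2 we
    update Q1, letting Q1 select the greedy next action but Q2 evaluate it
    (and vice versa); decoupling selection from evaluation removes the
    upward bias of the single-estimator max.
    """
    if rng.random() < 0.5:
        a_star = np.argmax(Q1[s_next])               # Q1 selects
        target = r + gamma * Q2[s_next, a_star]      # Q2 evaluates
        Q1[s, a] += alpha * (target - Q1[s, a])
    else:
        b_star = np.argmax(Q2[s_next])               # Q2 selects
        target = r + gamma * Q1[s_next, b_star]      # Q1 evaluates
        Q2[s, a] += alpha * (target - Q2[s, a])
```

Behaviour actions are typically chosen $\epsilon$-greedily with respect to $Q_1 + Q_2$.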

9 questions
8
votes
2 answers

Deep Q-Learning "catastrophic drop" reasons?

I am implementing some "classical" papers in model-free RL, like DQN, Double DQN, and Double DQN with Prioritized Replay. Across the various models I'm running on CartPole-v1 with the same underlying NN, I am noticing that all three of the above exhibit a…
Virus
  • 81
  • 1
  • 5
5
votes
2 answers

Does value iteration still return the true Q-values in a stochastic environment?

I'm working with the FrozenLake environment (8x8) from Gymnasium. In the deterministic case (is_slippery=False), I understand that value iteration converges to the true Q-values, since the environment is fully observable and transitions are…
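Value iteration does still converge in the stochastic case, provided the update takes an expectation over the transition model rather than a single sampled outcome. A minimal Q-value-iteration sketch against Gymnasium's FrozenLake, assuming access to the transition table `env.unwrapped.P` that the toy-text environments expose:

```python
import gymnasium as gym
import numpy as np

env = gym.make("FrozenLake8x8-v1", is_slippery=True)
P = env.unwrapped.P            # P[s][a] -> list of (prob, s', reward, terminated)
nS, nA = env.observation_space.n, env.action_space.n

Q = np.zeros((nS, nA))
gamma, tol = 0.99, 1e-10
while True:
    Q_new = np.zeros_like(Q)
    for s in range(nS):
        for a in range(nA):
            # Expected Bellman optimality backup over all stochastic outcomes
            Q_new[s, a] = sum(p * (r + gamma * (0.0 if done else Q[s2].max()))
                              for p, s2, r, done in P[s][a])
    if np.abs(Q_new - Q).max() < tol:
        break
    Q = Q_new
```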
5
votes
1 answer

Why does regular Q-learning (and DQN) overestimate the Q values?

The motivation for the introduction of double DQN (and double Q-learning) is that regular Q-learning (or DQN) can overestimate the Q-values, but is there a brief explanation of why this overestimation occurs?
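The short answer is the max operator in the target: even when every individual estimate is unbiased, the maximum over noisy estimates is biased upward, since $\mathbb{E}[\max_a \hat{Q}(s,a)] \geq \max_a \mathbb{E}[\hat{Q}(s,a)]$. A tiny simulation (all numbers illustrative) makes this visible:

```python
import numpy as np

rng = np.random.default_rng(0)
true_q = np.zeros(10)                                  # every action is truly worth 0
noisy = true_q + rng.normal(0, 1, size=(100_000, 10))  # unbiased noisy estimates
print(noisy.max(axis=1).mean())                        # ~1.54, not 0: upward bias
```

Double Q-learning avoids this by letting one estimator pick the argmax and an independent estimator evaluate it.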
2
votes
2 answers

How to embed a game grid state with walls as input to a neural network

I've read most of the posts on here regarding this subject; however, most of them deal with game boards where there are only two categories of single pieces and no walls. My game board has walls, and multiple instances of food.…
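A common approach is to encode each category (walls, food, agent, and so on) as its own binary channel, giving a $C \times H \times W$ tensor that a convolutional network can consume directly (or that can be flattened for a dense network). A sketch, assuming the board is stored as a 2-D array of integer cell codes; the codes themselves are illustrative:

```python
import numpy as np

WALL, FOOD, AGENT = 1, 2, 3    # illustrative cell codes; 0 = empty

def encode_grid(grid, codes=(WALL, FOOD, AGENT)):
    """Turn an (H, W) integer grid into a (C, H, W) stack of binary masks."""
    return np.stack([(grid == c).astype(np.float32) for c in codes])

grid = np.array([[1, 1, 1, 1],
                 [1, 3, 0, 1],
                 [1, 2, 2, 1],
                 [1, 1, 1, 1]])
obs = encode_grid(grid)        # shape (3, 4, 4); handles many food cells naturally
```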
1
vote
1 answer

How is estimation bias quantified in reinforcement learning?

In various estimation problems, especially in RL domains where we are currently looking into Q-learning and its variants, we often encounter the term estimation bias, which refers to the systematic deviation of an estimator’s expected value from the…
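In this setting the bias is usually made precise as the expected deviation of the estimate from the true optimal value,

$$\operatorname{Bias}\big[\hat{Q}(s,a)\big] = \mathbb{E}\big[\hat{Q}(s,a)\big] - Q^{*}(s,a),$$

measured empirically by averaging $\hat{Q}(s,a) - Q^{*}(s,a)$ over visited states and independent runs; overestimation studies often report the related quantity $\mathbb{E}[\max_a \hat{Q}(s,a)] - \max_a Q^{*}(s,a)$.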
1
vote
1 answer

Q-learning achieves small reward in simple dice game

I am trying to train a Q-learning agent on the following game: the states are parametrised by an integer $S \geq 0$ (representing the sum of the previous die rolls). In each step the player can choose to roll a die or quit the game. Whenever the…
deepfloe
  • 111
  • 2
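The excerpt cuts off before the full rules, but for one common variant of this game (quitting banks the current sum as reward, while rolling a designated bust face ends the episode with nothing), a tabular Q-learning loop looks like the sketch below. The bust rule, the state cap `MAX_S`, and the reward scheme are assumptions for illustration, not necessarily the asker's exact setup:

```python
import numpy as np

rng = np.random.default_rng(0)
MAX_S, ROLL, QUIT = 60, 0, 1          # assumed state cap and action codes
Q = np.zeros((MAX_S + 1, 2))
alpha, gamma, eps = 0.1, 1.0, 0.1

for episode in range(50_000):
    s = 0
    while True:
        a = rng.integers(2) if rng.random() < eps else int(Q[s].argmax())
        if a == QUIT:
            r, s2, done = s, s, True         # assumed: quitting banks the sum
        else:
            d = rng.integers(1, 7)           # fair six-sided die
            if d == 1:                       # assumed bust face: lose everything
                r, s2, done = 0, 0, True
            else:
                r, s2, done = 0, min(s + d, MAX_S), False
        Q[s, a] += alpha * ((r if done else r + gamma * Q[s2].max()) - Q[s, a])
        s = s2
        if done:
            break
```

If the learned policy quits too early or too late, the usual suspects are insufficient exploration of the larger sums and a learning rate that never decays.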
0
votes
0 answers

What are the update equations for Double Expected Sarsa with an $\epsilon$-greedy target policy?

This is Exercise 6.13 in Sutton & Barto, page 136. What are the update equations for Double Expected Sarsa with an $\epsilon$-greedy target policy? The answer is given as follows: Let $Q_1$ and $Q_2$ be the two action-value functions and let…
DSPinfinity
  • 1,223
  • 4
  • 10
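For reference, one standard way to write the pair of updates (whether the $\epsilon$-greedy target policy is computed from the table being updated or from the sum of the two is a design choice; here $\pi_1$ is $\epsilon$-greedy with respect to $Q_1$): with probability $0.5$,

$$Q_1(S_t, A_t) \leftarrow Q_1(S_t, A_t) + \alpha \Big[ R_{t+1} + \gamma \sum_a \pi_1(a \mid S_{t+1})\, Q_2(S_{t+1}, a) - Q_1(S_t, A_t) \Big],$$

and otherwise the symmetric update with the roles of $Q_1$ and $Q_2$ swapped.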
0
votes
1 answer

Does "number of actions" refer to the number of actions taken or size of the action space?

In the original DDQN article (https://arxiv.org/pdf/1509.06461.pdf), the phrase "number of actions" is used twice: first in the following context, and secondly in Theorem 1. I have a hard time understanding the way the phrase is being used or if it…
0
votes
0 answers

Is there any toy example that can demonstrate the performance of double Q-learning?

I recently tried to reproduce the results of double Q-learning. However, the results are not satisfying. I have also tried to compare double Q-learning with Q-learning in Taxi-v3, FrozenLake with is_slippery=False, Roulette-v0, etc. But Q-learning…
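The standard toy example is the maximization-bias MDP of Sutton & Barto (Example 6.7): from start state A, action right terminates with reward 0, while action left moves to B, from which every action terminates with reward drawn from $\mathcal{N}(-0.1, 1)$; left is therefore worse in expectation, yet Q-learning keeps preferring it far longer than double Q-learning does. A minimal environment sketch (state and action encodings are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
A, B, TERM = 0, 1, 2
RIGHT, LEFT = 0, 1
N_B_ACTIONS = 10                       # B offers many noisy actions

def step(s, a):
    """Return (next_state, reward, done) for the Example 6.7 MDP."""
    if s == A:
        if a == RIGHT:                 # right: terminate immediately, reward 0
            return TERM, 0.0, True
        return B, 0.0, False           # left: move to B, no reward yet
    # every action from B terminates with reward ~ N(-0.1, 1), so the
    # noisy max over B's actions looks positive to plain Q-learning
    return TERM, rng.normal(-0.1, 1.0), True
```

Plotting the fraction of episodes in which the agent takes left from A, averaged over many independent runs, reproduces the gap between the two algorithms.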