Questions tagged [double-dqn]

For questions about the double DQN (DDQN) algorithm introduced in the paper "Deep Reinforcement Learning with Double Q-learning" (2015) by Hado van Hasselt et al.

23 questions
12
votes
1 answer

What exactly is the advantage of double DQN over DQN?

I started looking into the double DQN (DDQN). Apparently, the difference between DDQN and DQN is that in DDQN we use the main value network for action selection and the target network for outputting the Q values. However, I don't understand why…
Chukwudi
  • 369
  • 2
  • 8
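
For readers comparing the two update rules, here is a minimal sketch of the difference, assuming hypothetical PyTorch-style modules `online_net` and `target_net` that map a batch of states to per-action Q-values (names and shapes are illustrative, not taken from the question):

```python
import torch

def dqn_target(reward, next_state, done, target_net, gamma=0.99):
    # Standard DQN: the target network both selects and evaluates
    # the next action via a single max over its own Q-values.
    # reward and done are assumed to be float tensors of shape (batch,).
    with torch.no_grad():
        next_q = target_net(next_state).max(dim=1).values
    return reward + gamma * (1.0 - done) * next_q

def double_dqn_target(reward, next_state, done, online_net, target_net, gamma=0.99):
    # Double DQN: the online (main) network selects the action,
    # the target network evaluates it.
    with torch.no_grad():
        best_action = online_net(next_state).argmax(dim=1, keepdim=True)
        next_q = target_net(next_state).gather(1, best_action).squeeze(1)
    return reward + gamma * (1.0 - done) * next_q
```

Decoupling action selection from action evaluation is what reduces the upward bias introduced by the single max in the standard DQN target.
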
8
votes
2 answers

Can DQN perform better than Double DQN?

I'm training both DQN and double DQN in the same environment, but DQN performs significantly better than double DQN. As I've seen in the double DQN paper, double DQN should perform better than DQN. Am I doing something wrong, or is this possible?
Angelo
  • 211
  • 2
  • 17
5
votes
1 answer

Why does regular Q-learning (and DQN) overestimate the Q values?

The motivation for the introduction of double DQN (and double Q-learning) is that regular Q-learning (or DQN) can overestimate the Q-values, but is there a brief explanation of why they are overestimated?
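
As a rough pointer to why the bias appears: if the estimates $\hat{Q}(s,a) = Q(s,a) + \epsilon_a$ carry zero-mean noise $\epsilon_a$, then by the convexity of the max,

$$\mathbb{E}\Big[\max_a \hat{Q}(s,a)\Big] \;\ge\; \max_a \mathbb{E}\Big[\hat{Q}(s,a)\Big] = \max_a Q(s,a),$$

so the max over noisy estimates used in the target is biased upwards; and because the same max also picks the action, the error tends to compound through bootstrapping.
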
4
votes
1 answer

How can I ensure convergence of DDQN, if the true Q-values for different actions in the same state are very close?

I am applying a Double DQN algorithm to a highly stochastic environment where some of the actions in the agent's action space have very similar "true" Q-values (i.e. the expected future reward from either of these actions in the current state is…
4
votes
1 answer

Finding the true Q-values in gymnasium

I'm very interested in the true Q-values of state-action pairs in the classic control environments in gymnasium. Contrary to the usual goal, the ordering of the Q-values itself is irrelevant; a very close to accurate estimation of the Q-values is…
Mark B
  • 43
  • 3
3
votes
1 answer

Does the DoubleDQN algorithm use a target network or two separate policies?

I've been looking for ways to improve my DQN. That is when I found the Double DQN algorithm. After looking at explanatory videos and posts, I've seen conflicting information: The Double DQN algorithm has two separate policies Q1 and Q2 with…
3
votes
1 answer

Why do we minimise the loss between the target Q values and 'local' Q values?

I have a question regarding the loss function of target networks and current (online) networks. I understand the action value function. What I am unsure about is why we seek to minimise the loss between the qVal for the next state in our target…
user9317212
  • 181
  • 2
  • 13
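
In the usual formulation, only the online ("local") network is trained; it is regressed toward a bootstrapped target that is held fixed during the update:

$$\mathcal{L}(\theta) = \mathbb{E}_{(s,a,r,s')}\Big[\big(y - Q_\theta(s,a)\big)^2\Big],$$

where $y$ is built from the target network's Q-values for the next state (in the DQN or double-DQN form), no gradient flows through $y$, and the target network is only refreshed periodically from the online weights.
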
3
votes
1 answer

How to compute the target for double Q-learning update step?

I've already read the original double DQN paper, but I couldn't find a clear and practical explanation of how the target $y$ is computed, so here's how I interpreted the method (let's say I have 3 possible actions (1,2,3)): For each experience…
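
For reference, the target in the van Hasselt et al. paper combines action selection by the online network $Q_\theta$ with evaluation by the target network $Q_{\theta^-}$; with three possible actions it reads

$$y = r + \gamma \, Q_{\theta^-}\!\Big(s',\ \underset{a \in \{1,2,3\}}{\arg\max}\; Q_\theta(s', a)\Big),$$

i.e. the online network picks which of the three actions looks best in $s'$, and the target network supplies the value of that action (with $y = r$ at terminal states).
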
2
votes
0 answers

Update Rule with Deep Q-Learning (DQN) for 2-player games

I am wondering how to correctly implement the DQN algorithm for two-player games such as Tic Tac Toe and Connect 4. While my algorithm is mastering Tic Tac Toe relatively quickly, I cannot get great results for Connect 4. The agent is learning to…
2
votes
0 answers

Can DQN outperform DoubleDQN?

I found a similar post about this issue, but unfortunately I did not find a proper answer. Are there any references where DQN is better than DoubleDQN, i.e. where DoubleDQN does not improve on DQN?
2
votes
1 answer

How does the target network in double DQNs find the maximum Q value for each action?

I understand that the neural network takes the states as inputs and outputs the Q-values for state-action pairs. However, in order to compute this and update its weights, we need to calculate the maximum Q-value for the next…
1
vote
0 answers

Resulting quantiles from Quantile Regression DQN

In my QR-DQN application, the resulting quantiles for a state s and action a take the form of the blue line in the figure. The method works well in expected values and trains effectively. However, I know that in my problem the return distribution…
1
vote
0 answers

Why does a slow-changing policy invalidate the Double DQN approach in the TD3 paper?

In the paper describing TD3 (https://arxiv.org/abs/1802.09477), the authors say that they could not effectively address the Q-learning overestimation bias by using different networks for maximizing and estimating the next state Q value when…
1
vote
0 answers

DDQN Snake keeps crashing into the wall

Edit: I managed to fix this by changing the optimizer to SGD. I am very new to reinforcement learning, and I attempted to create a DDQN for the game Snake, but for some reason it keeps learning to crash into the wall. I've tried changing the…