For questions about value-based reinforcement learning (RL) methods, which first learn a value function and then derive a policy from it. Q-learning is an example of a value-based RL algorithm.
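For illustration, here is a minimal sketch of what "derive a policy from the value function" means in the tabular case; the grid-world sizes and the `Q` table below are hypothetical stand-ins, not taken from any question on this page:

```python
import numpy as np

# Hypothetical tabular action-value function Q[state, action],
# e.g. the result of running tabular Q-learning on a small grid world.
n_states, n_actions = 16, 4
Q = np.zeros((n_states, n_actions))

def greedy_policy(state: int) -> int:
    """Derive the policy from the value function: in each state,
    pick the action with the highest estimated action value."""
    return int(np.argmax(Q[state]))
```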
Questions tagged [value-based-methods]
10 questions
5 votes · 1 answer
Is reinforcement learning only about determining the value function?
I started reading some reinforcement learning literature, and it seems to me that all approaches to solving reinforcement learning problems are about finding the value function (the state-value function or the state-action value function).
Are there any…
Felix P.
4 votes · 1 answer
Why are policy gradient methods more effective in high-dimensional action spaces?
In his Reinforcement Learning course, David Silver argues that policy-based reinforcement learning (RL) is more effective than value-based RL in high-dimensional action spaces. He points out that the implicit policy (e.g., $\epsilon$-greedy) in…
Saucy Goat
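A minimal sketch of the kind of implicit policy the excerpt refers to, assuming a small discrete action space; note the argmax over every action, which is the step that becomes expensive or intractable when the action space is high-dimensional (all names below are illustrative):

```python
import numpy as np

def epsilon_greedy(Q_row: np.ndarray, epsilon: float,
                   rng: np.random.Generator) -> int:
    """Q_row holds Q(s, a) for every action a in the current state s.
    With probability epsilon, act uniformly at random; otherwise act
    greedily. The argmax enumerates all actions, which scales poorly
    when the action space is large or continuous."""
    if rng.random() < epsilon:
        return int(rng.integers(len(Q_row)))
    return int(np.argmax(Q_row))
```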
4 votes · 1 answer
What is the advantage of using MCTS with value-based methods over value-based methods alone?
I have been trying to understand why MCTS is so important to the performance of RL agents, and the best description I found comes from the paper Bootstrapping from Game Tree Search, which states:
Deterministic, two-player games such as chess provide an…
Hossam
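The benefit the excerpt hints at can be illustrated in its simplest form, one-step lookahead: rather than trusting the learned values directly, the agent simulates each move with a game model and evaluates the successors with the value function; MCTS extends this idea to a selectively grown, deeper tree. The `step` model and `V` estimate below are assumed, illustrative interfaces:

```python
def one_step_lookahead(state, actions, step, V, gamma=0.99):
    """Pick the action whose simulated successor looks best.

    step(state, action) -> (reward, next_state) is an assumed
    deterministic game model (as in chess or Go); V(next_state) is a
    learned state-value estimate. Full MCTS repeats this idea down a
    selectively expanded tree instead of stopping at depth one."""
    best_action, best_value = None, float("-inf")
    for a in actions:
        reward, next_state = step(state, a)
        value = reward + gamma * V(next_state)
        if value > best_value:
            best_action, best_value = a, value
    return best_action
```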
3 votes · 1 answer
Is it possible for value-based methods to learn stochastic policies?
Is it possible for value-based methods to learn stochastic policies? I'm trying to get a clear picture of the different categories of RL algorithms, and while doing so I started to think about settings where the optimal policy is stochastic…
Krrrl
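One construction relevant to this question: a value-based agent can behave stochastically by sampling actions from a softmax (Boltzmann) distribution over its Q-values instead of taking the greedy action. Whether such a policy counts as "learned" is part of what the question asks; the sketch below shows only the mechanics, with illustrative names:

```python
import numpy as np

def boltzmann_policy(Q_row: np.ndarray, temperature: float,
                     rng: np.random.Generator) -> int:
    """Sample an action with probability proportional to exp(Q(s,a)/T).
    High temperature gives a near-uniform policy; low temperature gives
    a near-greedy one. The policy is stochastic, but it is still derived
    from Q rather than optimized directly as a distribution."""
    logits = (Q_row - Q_row.max()) / temperature  # shift by max for stability
    probs = np.exp(logits)
    probs /= probs.sum()
    return int(rng.choice(len(Q_row), p=probs))
```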
2 votes · 0 answers
What kind of reinforcement learning method does DeepMind's AlphaGo use to beat the best human Go player?
In reinforcement learning, there are model-based and model-free methods; within model-free methods, there is a further split into policy-based and value-based methods.
DeepMind's AlphaGo RL model has beaten the best human Go player. What kind of reinforcement learning model does…
user781486
1 vote · 0 answers
Is it possible to combine two policy-based RL agents?
I am developing an RL agent for a game environment. I have found that there are two strategies for doing well in the game, so I have trained two RL agents using neural networks with distinct reward functions. Each reward function corresponds to one…
BlackBrain
1 vote · 1 answer
Why do we need two heads in D3QN to obtain value and advantage separately, if V is the average of the Q-values?
I have two questions on the Dueling DQN paper. First, I have an issue understanding the identifiability problem that the Dueling DQN paper mentions:
Here is my question: if we are given the Q-values $Q(s, a; \theta)$ for all actions, I assume we can get the value…
Afshin Oroojlooy
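For context, the aggregation used in the Dueling DQN paper (Wang et al., 2016) addresses exactly this identifiability issue: since adding a constant to $V$ and subtracting it from $A$ leaves $Q$ unchanged, the advantage stream is constrained to have zero mean over actions:

$$Q(s, a; \theta, \alpha, \beta) = V(s; \theta, \beta) + \left( A(s, a; \theta, \alpha) - \frac{1}{|\mathcal{A}|} \sum_{a'} A(s, a'; \theta, \alpha) \right)$$

Averaging both sides over actions makes the advantage term vanish, so $V(s)$ equals the mean of the $Q$-values, which is precisely the relation the question title alludes to.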
1 vote · 0 answers
What are the disadvantages of actor-only methods with respect to value-based ones?
While the advantages of actor-only algorithms, the ones that search the policy space directly without using a value function, are clear (the possibility of a continuous action space, a stochastic policy, etc.), I can't figure out the…
unter_983
1 vote · 0 answers
Are policy-based methods better than value-based methods only for large action spaces?
In different books on reinforcement learning, policy-based methods are motivated by their ability to handle large (continuous) action spaces. Is this the only motivation for policy-based methods? What if the action space is tiny (say, only 9…
tmaric
0 votes · 1 answer
Can Q-learning rewards and next states be non-deterministic?
I am working on a team developing a Q-learning-based approach to hyperparameter tuning. I have a disagreement with one of my teammates about how they defined the problem. They defined it as follows:
The states are the values of the metric we are…
Ahmed Mokhtar
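For reference, the standard tabular Q-learning update already accommodates stochastic rewards and transitions: with a step size $\alpha < 1$, each update moves the estimate only partway toward the noisy target, so repeated visits average the noise out (convergence holds under the usual step-size and exploration conditions). A minimal sketch with hypothetical names:

```python
import numpy as np

def q_update(Q: np.ndarray, s: int, a: int, r: float, s_next: int,
             alpha: float = 0.1, gamma: float = 0.99) -> None:
    """One tabular Q-learning step. r and s_next may be samples from
    stochastic reward and transition distributions; because alpha < 1,
    each update moves Q[s, a] only partway toward the noisy target,
    so the estimate converges toward the expected return."""
    target = r + gamma * np.max(Q[s_next])
    Q[s, a] += alpha * (target - Q[s, a])
```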