For questions about model-based reinforcement learning methods (or algorithms). An example of a model-based algorithm is Dyna-Q, which estimates a model of the environment (i.e. the transition function of the associated Markov decision process).
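As a rough illustration of the Dyna-Q idea mentioned above, here is a minimal tabular sketch. It assumes a deterministic environment, and the `step` function, the `chain_step` toy corridor, and all hyperparameter values are hypothetical choices made for this example only. It shows the three ingredients: a direct Q-learning update from real experience, storing the observed transition in a model, and extra planning updates replayed from that model.

```python
import random
from collections import defaultdict

def dyna_q(step, start_state, actions, episodes=50, planning_steps=10,
           alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
    """Tabular Dyna-Q sketch for a deterministic environment.

    `step(state, action)` is a hypothetical environment function that
    returns (reward, next_state, done).
    """
    rng = random.Random(seed)
    q = defaultdict(float)  # q[(state, action)] -> action-value estimate
    model = {}              # model[(state, action)] -> (reward, next_state, done)

    def greedy(state):
        # Pick a highest-valued action, breaking ties at random.
        best = max(q[(state, a)] for a in actions)
        return rng.choice([a for a in actions if q[(state, a)] == best])

    def update(s, a, r, s2, done):
        # One-step Q-learning update toward the bootstrapped target.
        target = r if done else r + gamma * max(q[(s2, b)] for b in actions)
        q[(s, a)] += alpha * (target - q[(s, a)])

    for _ in range(episodes):
        s, done = start_state, False
        while not done:
            a = rng.choice(actions) if rng.random() < epsilon else greedy(s)
            r, s2, done = step(s, a)         # real experience
            update(s, a, r, s2, done)        # direct RL update
            model[(s, a)] = (r, s2, done)    # model learning
            for _ in range(planning_steps):  # planning from simulated experience
                ps, pa = rng.choice(list(model))
                pr, ps2, pdone = model[(ps, pa)]
                update(ps, pa, pr, ps2, pdone)
            s = s2
    return q, model

def chain_step(state, action):
    """Toy 5-state corridor (assumed): action 1 moves right, action 0 moves left."""
    nxt = min(state + 1, 4) if action == 1 else max(state - 1, 0)
    done = nxt == 4
    return (1.0 if done else 0.0), nxt, done

q, model = dyna_q(chain_step, start_state=0, actions=[0, 1])
```

Setting `planning_steps=0` reduces this sketch to plain one-step Q-learning, which is one way to see what the learned model contributes.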
Questions tagged [model-based-methods]
37 questions
92 votes, 6 answers
What's the difference between model-free and model-based reinforcement learning?
It seems to me that any model-free learner, learning through trial and error, could be reframed as model-based. In that case, when would model-free learners be…
mynameisvinn
6 votes, 2 answers
Are there RL algorithms that also try to predict the next state?
So far I've developed simple RL algorithms, like Deep Q-Learning and Double Deep Q-Learning. Also, I read a bit about A3C and policy gradient but superficially.
If I remember correctly, all these algorithms focus on the value of the action and try…
Ram Rachum
6 votes, 2 answers
How can the policy iteration algorithm be model-free if it uses the transition probabilities?
I'm trying to understand policy iteration in the context of RL. I read an article presenting it and, at some point, pseudocode for the algorithm is given:
What I can't understand is this line:
From what I understand, policy…
Samuel Beaussant
5 votes, 1 answer
Is the state transition matrix known to the agent in a Markov decision process?
The question is more or less in the title.
A Markov decision process consists of a state space, a set of actions, the transition probabilities and the reward function. If I now take an agent's point of view, does this agent "know" the transition…
Felix P.
5 votes, 3 answers
Isn't a simulation a great model for model-based reinforcement learning?
Most reinforcement learning agents are trained in simulated environments. The goal is to maximize performance in (often) the same environment, preferably with a minimum number of interactions. Having a good model of the environment allows one to use…
Rustam
5 votes, 1 answer
Why are model-based methods more sample efficient than model-free methods?
Why do model-based methods use fewer samples than model-free methods? Here, I'm specifically referring to model-based methods in which we have to learn a policy and model. I can only think of two reasons for this question:
We can potentially obtain…
Maybe
5 votes, 1 answer
How do temporal-difference and Monte Carlo methods work, if they do not have access to a model?
In value iteration, we have a model of the environment's dynamics, i.e., $p(s', r \mid s, a)$, which we use to update an estimate of the value function.
In the case of temporal-difference and Monte Carlo methods, we do not use $p(s', r \mid s, a)$,…
strongguy122
4 votes, 1 answer
How does a model-based agent learn the model?
I want to build model-based RL. I am wondering about the process of building the model.
If I already have data, from real experience:
$S_1, a \rightarrow R,S_2$
$S_2, a \rightarrow R,S_3$
Can I use this information, to build model-based RL? Or it…
user46045
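One common way to turn logged transitions of the form $S_1, a \rightarrow R, S_2$ into a model is to count outcomes per state-action pair. The sketch below assumes small discrete state and action spaces; the class name `TabularModel`, its method names, and the example log are hypothetical and only serve as an illustration. It exposes both an estimated distribution over outcomes and a way to sample a single outcome.

```python
import random
from collections import Counter, defaultdict

class TabularModel:
    """Count-based estimate of p(s', r | s, a) from logged transitions."""

    def __init__(self):
        # (s, a) -> Counter mapping each observed (r, s') outcome to its count
        self.counts = defaultdict(Counter)

    def update(self, s, a, r, s_next):
        """Record one observed transition."""
        self.counts[(s, a)][(r, s_next)] += 1

    def distribution(self, s, a):
        """Return {(r, s'): estimated probability}; empty if (s, a) is unseen."""
        outcomes = self.counts[(s, a)]
        total = sum(outcomes.values())
        return {o: n / total for o, n in outcomes.items()}

    def sample(self, s, a, rng=random):
        """Draw one (r, s') outcome for a previously visited (s, a) pair."""
        outcomes = self.counts[(s, a)]
        return rng.choices(list(outcomes), weights=list(outcomes.values()))[0]

# Hypothetical log of (s, a, r, s') tuples gathered from real experience.
logged = [
    ("S1", "a", 0.0, "S2"),
    ("S2", "a", 1.0, "S3"),
    ("S1", "a", 0.0, "S2"),
    ("S1", "a", 0.0, "S1"),
]

model = TabularModel()
for s, a, r, s_next in logged:
    model.update(s, a, r, s_next)

probs = model.distribution("S1", "a")
simulated = model.sample("S2", "a")
```

A planner can then query `sample` to generate simulated experience, or use `distribution` to compute expected backups. For large or continuous spaces, the counting table would typically be replaced by a learned function approximator.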
4 votes, 1 answer
What is the difference between a distribution model and a sampling model in Reinforcement Learning?
The book by Sutton and Barto, Reinforcement Learning: An Introduction, defines a model in reinforcement learning as
something that mimics the behavior of the environment, or more generally, that allows inferences to be made about how the…
A. Pesare
4 votes, 1 answer
Is the minimax algorithm model-based?
Trying to get my head around model-free and model-based algorithms in RL. In my research, I've seen the search trees created via the minimax algorithm. I presume these trees can only be created with a model-based agent that knows the full…
mason7663
3 votes, 2 answers
Is Q-learning a type of model-based RL?
Model-based RL creates a model of the transition function.
Tabular Q-Learning does this iteratively (without directly optimizing for the transition function). So, does this make tabular Q-learning a type of model-based RL?
echo
3 votes, 1 answer
Why is learning $s'$ from $s,a$ a kernel density estimation problem but learning $r$ from $s,a$ is just regression?
In David Silver's 8th lecture he talks about model learning and says that learning $r$ from $s,a$ is a regression problem whereas learning $s'$ from $s,a$ is a kernel density estimation. His explanation for the difference is that if we are in a…
David
3 votes, 1 answer
Is there any grid world dataset or generator for reinforcement learning?
I would like to start programming a multi-task reinforcement learning model. For this, I need not just one maze or grid world (or just model-based), but many with different reward functions. So, I am wondering if there exists a dataset or a generator for…
noor
2 votes, 1 answer
If we can model the environment, wouldn't it be meaningless to use a model-free algorithm?
I am trying to understand the concept of model-free and model-based approaches. As far as I understand, having a model of the environment does not mean that an RL agent has to be model-based. It is about the policy. However, if we can model the…
Ayska
2 votes, 2 answers
Model-based RL for time series data
I have time-series data. When I take an action, it impacts the next state, because my action directly determines the next state, but it is not known what the impact is.
To be concrete: I have $X(t)$ and $a(t-1)$, where $X(t)$ is n-dimensional…
Mariam