Highest Voted 'stationary-policy' Questions - Artificial Intelligence Stack Exchange

15 votes, 4 answers

What does "stationary" mean in the context of reinforcement learning?

I think I've seen the expressions "stationary data", "stationary dynamics" and "stationary policy", among others, in the context of reinforcement learning. What do they mean? I think "stationary policy" means that the policy does not depend on time,…

asked Aug 20 '18 at 10:09

Paula Vega


9 votes, 1 answer

What is the difference between a stationary and a non-stationary policy?

In reinforcement learning, there are deterministic and non-deterministic (or stochastic) policies, but there are also stationary and non-stationary policies. What is the difference between a stationary and a non-stationary policy? How do you…

reinforcement-learning comparison policies stationary-policy

asked Jun 27 '19 at 15:14

nbro

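
To make the distinction asked about in this question concrete, here is a minimal Python sketch. The state names, the action names, and the horizon of 3 are hypothetical and chosen only for illustration. The point is that a stationary policy takes the state alone, while a non-stationary policy also takes the time step.

```python
# Hypothetical two-action example; all names are illustrative only.

def stationary_policy(state):
    """Depends on the state only: the same state gives the same action at every step."""
    return "move" if state == "A" else "stay"

def nonstationary_policy(state, t, horizon=3):
    """Also depends on the time step t: the same state can give different actions."""
    if t >= horizon - 1:
        return "stay"  # behave differently on the last step of the assumed horizon
    return "move" if state == "A" else "stay"

# Query both policies in the same state at three consecutive time steps.
stationary_actions = [stationary_policy("A") for t in range(3)]
nonstationary_actions = [nonstationary_policy("A", t) for t in range(3)]
```

In state "A" the stationary policy returns the same action at all three steps, whereas the non-stationary one switches on the final step.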

2 votes, 1 answer

Policy performance when the stationary state distribution is not unique in RL

Consider the chainworld above with two actions, move (in red) and stay (in blue). Moving in A is stochastic: the agent moves to B with probability $p$ and to C with probability $1-p$. Moving or staying in B and C is irrelevant. Clearly, there…

reinforcement-learning markov-decision-process stationary-policy

asked Dec 05 '24 at 19:29

Simon

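
The non-uniqueness mentioned in this question's title can be sketched as follows. The excerpt is truncated, so this sketch makes loud assumptions that are not in the original: B and C are treated as absorbing, the policy always chooses "move" in A, and p = 0.25 is an arbitrary value. Under those assumptions, several different state distributions are left unchanged by one step of the chain.

```python
def step(dist, P):
    """Push a state distribution through one step of a row-stochastic matrix P."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

p = 0.25  # hypothetical value of the move probability

# States ordered A, B, C under the assumed "always move" policy.
# Assumption: B and C are absorbing, whichever action is taken there.
P_move = [
    [0.0, p, 1.0 - p],  # from A: to B with probability p, to C with 1 - p
    [0.0, 1.0, 0.0],    # from B: remain in B
    [0.0, 0.0, 1.0],    # from C: remain in C
]

all_in_B = [0.0, 1.0, 0.0]
all_in_C = [0.0, 0.0, 1.0]
mixed = [0.0, 0.5, 0.5]

# A distribution is stationary if one step leaves it unchanged.
stationary_flags = [step(d, P_move) == d for d in (all_in_B, all_in_C, mixed)]
after_one_step_from_A = step([1.0, 0.0, 0.0], P_move)
```

All three candidate distributions are stationary here, so which one the chain settles into depends on where it starts and on p.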

2 votes, 0 answers

Should I use the discounted average reward as objective in a finite-horizon problem?

I am new to reinforcement learning, but, for a finite-horizon application, I am considering using the average reward instead of the sum of rewards as the objective. Specifically, there are at most $T$ time steps (e.g.,…

reinforcement-learning q-learning rewards stationary-policy

asked Aug 10 '20 at 06:06

lll

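
A small Python sketch of why the choice between the two objectives in this question can matter. It assumes, beyond what the truncated excerpt states, that episodes may end before the maximum of $T$ steps, and it ignores discounting. The reward sequences are made up. If every episode lasted exactly $T$ steps, the average would just be the sum divided by $T$ and both objectives would rank episodes identically.

```python
def total_reward(rewards):
    """Sum-of-rewards objective for one episode."""
    return sum(rewards)

def average_reward(rewards):
    """Per-step average reward for one episode."""
    return sum(rewards) / len(rewards)

# Hypothetical episodes, both within an assumed horizon of T = 5 steps.
short_episode = [3.0, 3.0]           # ends after 2 steps
long_episode = [2.0, 2.0, 2.0, 2.0]  # ends after 4 steps

# The two objectives disagree about which episode is better.
prefers_long_by_sum = total_reward(long_episode) > total_reward(short_episode)
prefers_short_by_avg = average_reward(short_episode) > average_reward(long_episode)
```

With variable episode lengths, the sum favors the longer episode and the average favors the shorter one, so the two objectives can lead to different policies.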

1 vote, 0 answers

Why do bootstrapping methods produce nonstationary targets more than non-bootstrapping methods?

The following quote is taken from the beginning of the chapter on "Approximate Solution Methods" (p. 198) in "Reinforcement Learning" by Sutton & Barto (2018): reinforcement learning generally requires function approximation methods able to handle…

reinforcement-learning monte-carlo-methods temporal-difference-methods stationary-policy bootstrapping

asked Jun 27 '20 at 13:00

Johan

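
One way to see the effect this question asks about is to compare the two kinds of target directly. The sketch below is illustrative only: the states, rewards, discount factor, and the manual change to the value table are all hypothetical, and the policy is assumed to be fixed. A TD(0) target is built from the current value estimate, so it moves whenever that estimate is updated. A Monte Carlo return is computed from observed rewards alone.

```python
gamma = 0.9  # hypothetical discount factor

def td_target(reward, next_state, V):
    """TD(0) target: bootstraps from the current estimate V[next_state]."""
    return reward + gamma * V[next_state]

def mc_return(rewards):
    """Monte Carlo target: discounted sum of the rewards actually observed."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g

V = {"s1": 0.0, "s2": 0.0}    # hypothetical value table
rewards_from_s1 = [1.0, 2.0]  # hypothetical episode: s1 -> s2 -> terminal

target_before = td_target(1.0, "s2", V)
V["s2"] = 2.0  # the estimate for s2 changes as learning proceeds
target_after = td_target(1.0, "s2", V)  # same transition, different target
mc_target = mc_return(rewards_from_s1)  # does not depend on V at all
```

The TD target for the same transition changes after V is updated, while the Monte Carlo target for the same episode stays the same.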

1 vote, 1 answer

What is the difference between the definition of a stationary policy in reinforcement learning and contextual bandit?

A stationary policy is a function that maps a state to a probability distribution over actions. In a contextual bandit problem, the state itself does not include the history. But in a reinforcement learning problem, the history can be used to define a…

machine-learning reinforcement-learning comparison definitions stationary-policy

asked Oct 03 '19 at 18:50

Hunnam

