Highest Voted 'episodes' Questions - Artificial Intelligence Stack Exchange

2

votes

0 answers

Why are agents trained in episodes, even in non-episodic tasks?

Let's consider some non-episodic problem. Maybe a game which can go on forever. My question is: Why are agents still trained in episodes? My understanding is that the agent's neural network is updated in batches depending on the batch size (so every…

asked Jun 08 '22 at 18:03

Vladimir Belik

362
3
15
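One convention that is relevant to this question, sketched here under assumptions rather than taken from the thread: when a continuing task is cut into fixed-length training segments, the cut-off is usually treated as a truncation rather than a true termination, so the value estimate of the next state is still used when forming the learning target. The function name `td_target` and its parameters are hypothetical and chosen only for illustration.

```python
def td_target(reward, next_value, terminated, gamma=0.99):
    """One-step TD target for a single transition.

    terminated=True  -> the task itself ended, so no future value remains.
    terminated=False -> the task continues (this includes a segment that was
                        merely cut off by a time limit), so we bootstrap
                        from the estimated value of the next state.
    """
    if terminated:
        return reward
    return reward + gamma * next_value


# Last step of an artificial "episode" in a task that never really ends:
# the segment stops here, but the target still includes future value.
cut_off_target = td_target(reward=1.0, next_value=10.0, terminated=False, gamma=0.5)

# Same transition if the environment had genuinely ended at this step.
true_end_target = td_target(reward=1.0, next_value=10.0, terminated=True, gamma=0.5)
```

Under this reading, the episode boundary is a bookkeeping device for collecting and batching experience, and it does not have to change what the agent is trying to estimate.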

1

vote

1 answer

Why could there be "information leak" if we do not use fixed horizons?

On the page "Limitations on horizon length" from the Imitation library, the authors recommend that users stick to fixed-horizon experiments because there could be "information leak" otherwise. I'm having trouble understanding this term: how can…

reinforcement-learning terminology imitation-learning episodes

asked Nov 18 '22 at 13:42

aletelecomm

11
1
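One common reading of the "information leak" concern can be sketched as follows, under stated assumptions: if an episode can end early, its length already correlates with how well the agent behaved, so even a reward that is constant at every step produces returns that separate good policies from bad ones. The toy setup, the function names `rollout_length` and `mean_return`, and the failure probabilities are hypothetical and are not taken from the Imitation library.

```python
import random

MAX_STEPS = 200  # assumed time limit for the toy task


def rollout_length(fail_prob, rng):
    """Steps survived in one variable-horizon episode that ends on failure."""
    for t in range(1, MAX_STEPS + 1):
        if rng.random() < fail_prob:
            return t  # early termination
    return MAX_STEPS


def mean_return(fail_prob, fixed_horizon, reward_per_step=1.0,
                episodes=2000, seed=0):
    """Average return when every step pays the same constant reward."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(episodes):
        # With a fixed horizon every episode lasts MAX_STEPS regardless of
        # behaviour, so the episode length carries no information.
        steps = MAX_STEPS if fixed_horizon else rollout_length(fail_prob, rng)
        total += reward_per_step * steps
    return total / episodes


careful = 0.01   # hypothetical policy that rarely fails
careless = 0.10  # hypothetical policy that fails often

variable_gap = mean_return(careful, False) - mean_return(careless, False)
fixed_gap = mean_return(careful, True) - mean_return(careless, True)
```

In this sketch `variable_gap` is clearly positive while `fixed_gap` is zero: with variable horizons the termination condition alone rewards the careful policy, even though the per-step reward says nothing about behaviour.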

0

votes

0 answers

What are the parameters to consider when I set the length of an episode during the training of an RL model?

I'm working on an RL algorithm that receives a list of orders and needs to find the optimal clusters, considering parameters such as due date, location, etc. I don't know what the length of the episode should be or how it can impact the…

reinforcement-learning reward-functions episodes

asked Aug 02 '24 at 14:13

Filippo Beccherle

1

0

votes

1 answer

Why is the sliding puzzle problem episodic?

Why is the sliding puzzle problem episodic and not sequential? From what I understand, an environment is episodic if each episode is independent and doesn't affect past or future episodes. The actions in the next episode don't depend on the actions…

definitions environment episodes

asked Jan 12 '23 at 16:50

numq

1
1
