
In value iteration and policy iteration we only ever consider a one-step lookahead, where the lookahead uses the value estimates from the previous iteration; we therefore need to sweep over all states and repeat this procedure until convergence.
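
For concreteness, here is a minimal value-iteration sketch (the names `P`, `n_states`, `n_actions` are hypothetical; `P[s][a]` is assumed to be a list of `(prob, next_state, reward)` triples). Each backup only looks one step ahead into the previous iterate `V`:

```python
import numpy as np

def value_iteration(P, n_states, n_actions, gamma=0.9, tol=1e-8):
    # P[s][a] -> list of (prob, next_state, reward); hypothetical layout.
    V = np.zeros(n_states)
    while True:
        V_new = np.empty(n_states)
        for s in range(n_states):
            # One-step lookahead: back up from the previous iterate V.
            V_new[s] = max(
                sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a])
                for a in range(n_actions)
            )
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new
```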

Why don't we instead do a single sweep over all states where, rather than a one-step lookahead, we build a complete scenario tree, i.e. a full-horizon lookahead?
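
To illustrate what such a scenario tree would mean, here is a sketch of a recursive full lookahead (same hypothetical `P` as above). The tree has on the order of `(|A||S|)^H` nodes for horizon `H`, and if the MDP has loops (a state reachable from itself) it never bottoms out without an explicit depth cutoff:

```python
def full_lookahead(P, s, n_actions, gamma=0.9, depth=0, max_depth=10):
    # Expands the complete scenario tree below state s.
    # Without max_depth this recursion never terminates when the
    # state space contains loops or the horizon is infinite.
    if depth == max_depth:
        return 0.0  # arbitrary truncation value
    return max(
        sum(p * (r + gamma * full_lookahead(P, s2, n_actions, gamma,
                                            depth + 1, max_depth))
            for p, s2, r in P[s][a])
        for a in range(n_actions)
    )
```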

Question: Is this avoided for efficiency reasons, because of computational limits (obviously you can't do it if the horizon is infinite), or because of loops in the state space?
