
In value iteration and policy iteration we only ever consider a one-step lookahead, where the lookahead uses the value estimates from the previous iteration; we therefore need to sweep over all states and repeat this procedure until convergence.
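
For concreteness, here is a minimal value-iteration sketch (the names `P`, `n_states`, `n_actions` are hypothetical; `P[s][a]` is assumed to be a list of `(prob, next_state, reward)` triples). Each backup only looks one step ahead into the previous iterate `V`:

```python
import numpy as np

def value_iteration(P, n_states, n_actions, gamma=0.9, tol=1e-8):
    # P[s][a] -> list of (prob, next_state, reward); hypothetical layout.
    V = np.zeros(n_states)
    while True:
        V_new = np.empty(n_states)
        for s in range(n_states):
            # One-step lookahead: back up from the previous iterate V.
            V_new[s] = max(
                sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a])
                for a in range(n_actions)
            )
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new
```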

Why don't we instead do a single sweep over all states where, rather than a one-step lookahead, we build a complete scenario tree, i.e. a full-horizon lookahead?
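
To illustrate what such a scenario tree would mean, here is a sketch of a recursive full lookahead (same hypothetical `P` as above). The tree has on the order of `(|A||S|)^H` nodes for horizon `H`, and if the MDP has loops (a state reachable from itself) it never bottoms out without an explicit depth cutoff:

```python
def full_lookahead(P, s, n_actions, gamma=0.9, depth=0, max_depth=10):
    # Expands the complete scenario tree below state s.
    # Without max_depth this recursion never terminates when the
    # state space contains loops or the horizon is infinite.
    if depth == max_depth:
        return 0.0  # arbitrary truncation value
    return max(
        sum(p * (r + gamma * full_lookahead(P, s2, n_actions, gamma,
                                            depth + 1, max_depth))
            for p, s2, r in P[s][a])
        for a in range(n_actions)
    )
```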

Question: Is this avoided for efficiency reasons, because of computational limits (obviously you can't do it if the horizon is infinite), or because of loops in the state space?
