According to the definition of a fully observable environment in Russell & Norvig, AIMA (2nd ed.), pages 41-44, an environment is fully observable only if an agent needs zero memory to perform optimally; that is, all relevant information is immediately available from sensing the environment.

From this definition and from the definition of an "episodic" environment in the same book, it seems to follow that all fully observable environments are, in fact, episodic or can be treated as episodic. That doesn't seem intuitive, but it appears to follow logically from the definitions. It would also mean that no stochastic environment can be fully observable, even if the entire state can be observed at a given point in time, because rational action may depend on a previous observation that must be remembered.

Am I wrong?

nbro

1 Answer

No, not all fully observable environments are episodic. Let's take a look again at the definitions from the book:

Fully Observable Environment (section 2.3.2)

If an agent’s sensors give it access to the complete state of the environment at each point in time, then we say that the task environment is fully observable. A task environment is effectively fully observable if the sensors detect all aspects that are relevant to the choice of action

Episodic Environment (section 2.3.2)

In an episodic task environment, the agent’s experience is divided into atomic episodes. In each episode the agent receives a percept and then performs a single action. Crucially, the next episode does not depend on the actions taken in previous episodes.

Take note of the "crucial" part at the end of the definition of an episodic environment. Chess is an example of a fully observable environment that is not episodic (and is therefore sequential in the book's taxonomy). Chess is fully observable because the player can see the positions of all active pieces on the board, and that is all the information needed to choose the optimal action. But chess is not episodic, because the position the player faces now is the result of all previous moves, and the current move will have downstream effects in later turns.

In fact, if you look at Figure 2.6 in the book on pg. 45, the authors provide three examples of fully observable sequential (i.e. not episodic) environments: crossword puzzles, chess, and backgammon. There are of course many more. Most games are sequential, and that is their main appeal: how do I best sequence my moves now to ensure victory over my opponent at a future time?
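To make the distinction concrete, here is a minimal Python sketch. Both toy environments are hypothetical (they are not from the book): `CountdownGame` and `PartInspection` each expose their complete state as the percept, so both are fully observable. The difference is only in whether the next state depends on the agent's action.

```python
import random


class CountdownGame:
    """Hypothetical fully observable, sequential environment (a single pile of stones)."""

    def __init__(self, stones=5):
        self.stones = stones

    def percept(self):
        # The percept is the complete state: fully observable.
        return self.stones

    def step(self, take):
        # The next state depends on the action taken: sequential.
        self.stones -= take


class PartInspection:
    """Hypothetical fully observable, episodic environment (inspect one part at a time)."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.part = self.rng.choice(["ok", "defective"])

    def percept(self):
        # The percept is again the complete state: fully observable.
        return self.part

    def step(self, action):
        # The next part is drawn without looking at the action: episodic.
        self.part = self.rng.choice(["ok", "defective"])


def run(env, actions):
    """Record the percept seen before each action is applied."""
    seen = []
    for action in actions:
        seen.append(env.percept())
        env.step(action)
    return seen


# Sequential: changing an early action changes what the agent sees later.
print(run(CountdownGame(5), [1, 1]))  # [5, 4]
print(run(CountdownGame(5), [2, 1]))  # [5, 3]

# Episodic: with the same seed, the percepts are identical whatever the agent does.
print(run(PartInspection(0), ["accept"] * 3) == run(PartInspection(0), ["reject"] * 3))  # True
```

In both cases the agent needs no memory to know the current state, which is what full observability is about. Whether earlier actions shape later states is a separate property, and that is the one the episodic/sequential distinction captures.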

adamconkey