Suppose we have a vacuum cleaner operating in a $1 \times 2$ rectangle consisting of locations $A$ and $B$. The cleaner's actions are Suck, Left, and Right; it can't leave the rectangle, and each square is either clean or dirty. I know this is an amateur question, but how does randomization (for instance, flipping a fair coin) avoid entering an infinite loop? Aren't we entering such a loop if the toss comes up heads on the odd tosses and tails on the even tosses?
This is the text from the book "Artificial Intelligence: A Modern Approach" by Russell and Norvig:
We can see a similar problem arising in the vacuum world. Suppose that a simple reflex vacuum agent is deprived of its location sensor and has only a dirt sensor. Such an agent has just two possible percepts: [Dirty] and [Clean]. It can Suck in response to [Dirty]; what should it do in response to [Clean]? Moving Left fails (forever) if it happens to start in square A, and moving Right fails (forever) if it happens to start in square B. Infinite loops are often unavoidable for simple reflex agents operating in partially observable environments. Escape from infinite loops is possible if the agent can randomize its actions. For example, if the vacuum agent perceives [Clean], it might flip a coin to choose between Right and Left. It is easy to show that the agent will reach the other square in an average of two steps. Then, if that square is dirty, the agent will clean it and the task will be complete. Hence, a randomized simple reflex agent might outperform a deterministic simple reflex agent.
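To get a feel for the "average of two steps" claim in the quoted passage, here is a rough simulation sketch. It assumes a move into a wall leaves the agent where it is and that each coin flip is independent and fair. The function names `steps_to_other_square` and `average_steps` are my own and not from the book. Under those assumptions each flip reaches the other square with probability 1/2, so the expected number of flips $E$ satisfies $E = 1 + \tfrac{1}{2}E$, which gives $E = 2$.

```python
import random


def steps_to_other_square(start, rng):
    """Count coin-flip moves until the agent first occupies the other square."""
    location = start
    steps = 0
    while location == start:
        move = rng.choice(["Left", "Right"])  # fair coin flip
        steps += 1
        # Assumed wall behaviour: Left from A and Right from B change nothing.
        if location == "A" and move == "Right":
            location = "B"
        elif location == "B" and move == "Left":
            location = "A"
    return steps


def average_steps(trials=100_000, seed=0):
    """Estimate the mean number of moves needed, starting in square A."""
    rng = random.Random(seed)
    total = sum(steps_to_other_square("A", rng) for _ in range(trials))
    return total / trials


if __name__ == "__main__":
    print(average_steps())  # should come out close to 2
```

A sequence of flips that never produces the useful direction is possible in principle, but a run of $n$ such flips has probability $(1/2)^n$, which goes to zero as $n$ grows.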
And this is the agent program from the same source:
function REFLEX-VACUUM-AGENT([location,status]) returns an action
 if status = Dirty then return Suck
 else if location = A then return Right
 else if location = B then return Left
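For comparison, here is a minimal Python sketch of the agent program above, next to a location-blind randomized variant of the kind the quoted passage describes. The names `reflex_vacuum_agent` and `randomized_reflex_vacuum_agent`, the string encodings of percepts and actions, and the `rng` parameter are my own assumptions and not part of the book's pseudocode.

```python
import random


def reflex_vacuum_agent(location, status):
    """Direct rendering of REFLEX-VACUUM-AGENT: needs both location and status."""
    if status == "Dirty":
        return "Suck"
    if location == "A":
        return "Right"
    if location == "B":
        return "Left"
    raise ValueError(f"unknown location: {location!r}")


def randomized_reflex_vacuum_agent(status, rng=random):
    """Variant with only a dirt sensor: flips a fair coin when the square is clean."""
    if status == "Dirty":
        return "Suck"
    return rng.choice(["Left", "Right"])


if __name__ == "__main__":
    print(reflex_vacuum_agent("A", "Clean"))         # Right
    print(randomized_reflex_vacuum_agent("Clean"))   # Left or Right, at random
```

The first function is deterministic, so removing its location input would force a single fixed response to [Clean]; the second one avoids that by choosing its move at random.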
 
     
    