When designing solutions to problems such as the Lunar Lander on OpenAI Gym, Reinforcement Learning is a tempting way to give the agent enough control over its actions to land successfully.

But in which cases would control system algorithms, such as PID controllers, do just as good a job as Reinforcement Learning, if not a better one?

Questions such as this one do a great job of addressing the theory, but do little to address the practical side.

As an Artificial Intelligence engineer, what elements of a problem domain should suggest to me that a PID controller is insufficient to solve a problem, and a Reinforcement Learning algorithm should instead be used (or vice versa)?

SeeDerekEngineer

1 Answer

I think the comments are basically on the right track.

PID controllers are useful for finding optimal policies in continuous dynamical systems, and often these domains are also used as benchmarks for RL, precisely because there is an easily derived optimal policy. However, in practice, you'd obviously prefer a PID controller for any domain in which you can easily design one: the controller's behaviors are well understood, while RL solutions are often difficult to interpret.
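To make the "easily designed and well understood" point concrete, here is a minimal sketch of a PID loop holding a point mass at a target altitude. Everything specific in it is an assumption made for illustration: the `PID` class, the `simulate_hover` helper, the hand-picked gains, the 1-D dynamics, and the gravity value are not taken from the Lunar Lander environment.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PID:
    """Textbook PID controller (illustrative, gains are hand-picked)."""
    kp: float
    ki: float
    kd: float
    integral: float = 0.0
    prev_error: Optional[float] = None

    def step(self, error: float, dt: float) -> float:
        # Accumulate the integral of the error over time.
        self.integral += error * dt
        # Backward-difference derivative; zero on the first call to avoid a spike.
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def simulate_hover(target=10.0, gravity=1.62, dt=0.01, steps=6000):
    """Unit point mass in 1-D under constant gravity (assumed toy dynamics).

    The ground and thrust limits are not modelled.
    """
    pid = PID(kp=20.0, ki=5.0, kd=10.0)
    altitude, velocity = 0.0, 0.0
    for _ in range(steps):
        thrust = pid.step(target - altitude, dt)
        velocity += (thrust - gravity) * dt
        altitude += velocity * dt
    return altitude


if __name__ == "__main__":
    print(f"final altitude: {simulate_hover():.4f}")
```

Each of the three gains has a direct physical reading (the integral term is what cancels the constant pull of gravity here), which is a large part of why such a controller is easier to reason about than a learned policy.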

Where RL shines is in tasks where we know what good behavior looks like (i.e., we know the reward function), and we know what sensor inputs look like (i.e., we can completely and accurately describe a given state numerically), but we have little or no idea what we actually want the agent to do to achieve those rewards.

Here's a good example:

  • If I wanted to make an agent to maneuver a plane from in front of an enemy plane with known movement patterns to behind it, using the least amount of fuel, I'd much prefer to use a PID controller.

  • If I wanted to make an agent to control a plane and shoot down an enemy plane with enough fuel left to land, but without a formal description of how the enemy plane might attack (perhaps a human expert will pilot it in simulations against our agent), I'd much prefer RL.
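As a rough illustration of the RL side, the sketch below uses tabular Q-learning on a toy corridor. It is a deliberately tiny stand-in, not the dogfight scenario: the corridor, the `q_learning` function, and all hyperparameters are assumptions chosen for illustration. The point is that only the state encoding and the reward are specified; the behavior ("always move right") is never written down and has to be discovered.

```python
import random


def q_learning(n_states=6, episodes=500, alpha=0.5, gamma=0.9,
               epsilon=0.2, seed=0):
    """Tabular Q-learning on a 1-D corridor (hypothetical toy environment).

    States are 0..n_states-1, the agent starts at 0, and the only reward
    is 1.0 for reaching the last state. Returns the greedy move per state.
    """
    rng = random.Random(seed)
    moves = (-1, +1)  # action 0 = left, action 1 = right
    goal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]

    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap on episode length
            if rng.random() < epsilon:
                action = rng.randrange(2)  # explore
            else:
                best = max(q[state])  # exploit, breaking ties at random
                action = rng.choice([a for a in (0, 1) if q[state][a] == best])
            next_state = min(max(state + moves[action], 0), goal)
            done = next_state == goal
            reward = 1.0 if done else 0.0
            # One-step Q-learning target; no bootstrapping from the terminal state.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break

    # Greedy move for every non-terminal state.
    return [moves[max((0, 1), key=lambda a: q[s][a])] for s in range(goal)]


if __name__ == "__main__":
    print(q_learning())
```

The learned table is just numbers, which mirrors the interpretability trade-off above: the policy works, but nothing in it explains itself the way controller gains do.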

John Doucette