
I'm working on a simulation model using RL to optimize an objective function. I'm trying to understand if I need to select my state variables such that I can write state update equations for each one using the state variables that are provided to the RL agent.

For example, I have a variable for "time to arrival" ($T_{arrival}$) that's calculated based on the vehicle's current location and destination under the hood (in the simulation). If I don’t include location and destination in the state space, can I still use $T_{arrival}$ in my RL model, even though in this case I can't write a state update equation for it directly?

Similarly, for a state variable like "earliest time a charging station is available" ($T_{charge}^{avail}$), do I need to include charge start and end times for each vehicle in the state space to be able to update/calculate $T_{charge}^{avail}$? Or can the simulation environment calculate $T_{charge}^{avail}$ and provide it to the RL agent without violating the principles of dynamic programming?

Ultimately, do I need to ensure that every state variable in my RL model can be updated directly from other state variables in the state space? Or are state variables calculated by the simulation enough?

1 Answer


The state given to an RL agent, in the naive case, just has to be a comprehensive description of the system, or at least the most comprehensive one available (in some cases, a complete description is impossible).

For this reason, you can give the agent any information, even information that is completely irrelevant (though irrelevant inputs will likely hurt performance).

In very simple terms, imagine a person doing that job and ask yourself what information that person would need to make an informed decision on the task you want to solve: that is the set of information you want to give your agent.
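Concretely, this means the environment can compute derived quantities like $T_{arrival}$ internally and expose only those to the agent, with no need for the agent-facing state to be closed under its own update equations. A minimal sketch (hypothetical class and variable names, not your actual simulator):

```python
# Hypothetical sketch: the environment computes a derived feature
# (time to arrival) from internal simulator state and exposes only
# that feature as the agent's observation.

class VehicleEnv:
    def __init__(self):
        # Internal simulator state -- hidden from the agent.
        self.location = 0.0
        self.destination = 10.0
        self.speed = 1.0

    def _get_obs(self):
        # Derived state variable: the agent never sees location or
        # destination, only the feature computed from them.
        t_arrival = (self.destination - self.location) / self.speed
        return {"t_arrival": t_arrival}

    def step(self, action):
        # The simulator advances its own internal state...
        self.location += self.speed
        # ...and the next observation is recomputed from that internal
        # state, not rolled forward from the previous observation.
        return self._get_obs()

env = VehicleEnv()
obs = env.step(action=None)  # obs["t_arrival"] == 9.0
```

The same pattern covers $T_{charge}^{avail}$: the simulator tracks per-vehicle charge start and end times internally and reports only the earliest availability time to the agent.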

Alberto