
I have a large control problem with multidimensional continuous inputs (13) and outputs (3). I tried several reinforcement learning algorithms, such as Deep Q-Networks (DQN), Proximal Policy Optimization (PPO), and Advantage Actor-Critic (A2C). Unfortunately, they all yield poor results. As far as I understand, they are all based on neural networks, so I think the neural network itself could be the problem, as it might not be able to learn the mapping between inputs and outputs (I have experienced this in several other applications).

So, are there state-of-the-art reinforcement learning algorithms for large problems with multidimensional continuous state spaces and actions?

nbro
PeterBe

1 Answer


There are many state-of-the-art reinforcement learning algorithms for large problems with multidimensional continuous state spaces and actions. All of them rely on some sort of function approximator.

You can use any RL algorithm with really any sort of function approximator, whether a neural network, support vector machine, decision tree, or any other method. Every RL algorithm you mentioned can use any of these function approximators instead of a neural network, if so desired.
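For instance, here is a minimal sketch of fitted Q-iteration using a tree-based regressor (scikit-learn's `ExtraTreesRegressor`) in place of a neural network. The transition batch and the small discretized action set are hypothetical placeholders, only meant to illustrate that the function approximator is interchangeable:

```python
# Minimal sketch: fitted Q-iteration with a tree ensemble as the
# function approximator. All data below is synthetic/hypothetical.
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor

rng = np.random.default_rng(0)

# Hypothetical batch of transitions: (state, action, reward, next_state)
n, state_dim, n_actions = 1000, 13, 5   # actions discretized for simplicity
states      = rng.normal(size=(n, state_dim))
actions     = rng.integers(0, n_actions, size=n)
rewards     = rng.normal(size=n)
next_states = rng.normal(size=(n, state_dim))

gamma = 0.99
q_model = None

for _ in range(20):                      # fitted Q-iteration sweeps
    if q_model is None:
        targets = rewards                # first sweep: Q ~ immediate reward
    else:
        # Evaluate the current Q-model for every next state / action pair
        q_next = np.column_stack([
            q_model.predict(np.column_stack([next_states, np.full(n, a)]))
            for a in range(n_actions)
        ])
        targets = rewards + gamma * q_next.max(axis=1)

    # Regress Q(s, a) on [state, action] features with a tree ensemble
    X = np.column_stack([states, actions])
    q_model = ExtraTreesRegressor(n_estimators=50).fit(X, targets)
```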

However, almost all state-of-the-art results today use neural networks. This is largely due to two reasons, one theoretical and the other empirical. The theoretical reason is the universal approximation theorem, which roughly states that a sufficiently large neural network can approximate any continuous function (on a compact domain) to arbitrary accuracy. The empirical reason is that, for complex problems, neural networks tend to outperform all other methods.

So, to more directly address your question: yes, you can use the other methods I mentioned above, but it is probably a bad idea. You are likely better off with a larger neural network, a neural network architecture better suited to your problem, or there may be another issue in your approach that you have missed. However, it is very unlikely that the problem you face is a fundamental limitation of neural networks.
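As a concrete example, if you happen to be training PPO with a library such as Stable-Baselines3 (an assumption; your setup may differ), you can enlarge the policy and value networks via `policy_kwargs` instead of abandoning neural networks. The environment id below is a placeholder for your own 13-input, 3-output environment:

```python
# Sketch assuming Stable-Baselines3 and a custom Gymnasium environment;
# "YourEnv-v0" is a placeholder, not a real registered environment.
import gymnasium as gym
from stable_baselines3 import PPO

env = gym.make("YourEnv-v0")

model = PPO(
    "MlpPolicy",
    env,
    # Wider networks than the default two 64-unit layers
    # (dict form of net_arch as used by recent SB3 versions)
    policy_kwargs=dict(net_arch=dict(pi=[256, 256], vf=[256, 256])),
    verbose=1,
)
model.learn(total_timesteps=1_000_000)
```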

chessprogrammer