I have a question about how clipping affects the training of RL agents.

In particular, I have come across code for training DDPG agents; the pseudo-code is as follows:

for i in training iterations:
    action = clip(ddpg.prediction(state) * a + b, x, y)
    state, reward = environment(action)
    store action, state, and reward
    if the number of experiences is larger than L:
        update the parameters of the agent
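For reference, here is a runnable sketch of how I understand this loop, with dummy stand-ins for the environment and the agent; all names and constants below are placeholders, not the actual code I found:

```python
import numpy as np

class DummyEnv:
    """Stand-in environment: 3-dim state, 1-dim action, random dynamics."""
    def reset(self):
        return np.random.randn(3)
    def step(self, action):
        return np.random.randn(3), -float(np.abs(action).sum())

class DummyAgent:
    """Stand-in for the DDPG agent; the actor output is already in [-1, 1] via tanh."""
    def predict(self, state):
        return np.tanh(np.random.randn(1))
    def update(self, buffer):
        pass  # one DDPG update step would go here

a, b = 2.0, 1.0   # scale and shift applied to the actor output (placeholder values)
x, y = 0.0, 3.0   # clip bounds, i.e. the valid action range (placeholder values)
L = 64            # minimum number of stored experiences before updating

env, agent, buffer = DummyEnv(), DummyAgent(), []
state = env.reset()
for i in range(1000):
    # Clipping is applied outside the network, after scaling/shifting the tanh output.
    action = np.clip(agent.predict(state) * a + b, x, y)
    next_state, reward = env.step(action)
    buffer.append((state, action, reward, next_state))
    state = next_state
    if len(buffer) > L:
        agent.update(buffer)
```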

In this case, the actor network of the DDPG agent has a $\tanh$ activation in its output layer.

My question is: could we add the clipping to the output layer of the actor (replacing $\tanh(\cdot)$ with $\operatorname{clip}(a \cdot \tanh(\cdot) + b,\, x,\, y)$) rather than clipping in the training loop? Would the training still work in that case?
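For concreteness, this is roughly what I mean, sketched with a hypothetical PyTorch actor (the layer sizes and the constants are placeholders, and I write the clip bounds as `low`/`high` so they are not confused with the $\tanh$ input):

```python
import torch
import torch.nn as nn

class ClippedActor(nn.Module):
    """Actor whose output layer applies clip(a * tanh(.) + b, low, high) directly."""
    def __init__(self, state_dim=3, action_dim=1, a=2.0, b=1.0, low=0.0, high=3.0):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(),
            nn.Linear(64, action_dim),
        )
        self.a, self.b, self.low, self.high = a, b, low, high

    def forward(self, state):
        # Scale/shift the tanh output and clip inside the network itself.
        return torch.clamp(self.a * torch.tanh(self.net(state)) + self.b,
                           self.low, self.high)

actor = ClippedActor()
print(actor(torch.randn(4, 3)))  # batch of 4 random states -> clipped actions
```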
