Questions tagged [tanh]
Use this tag for questions related to hyperbolic tangent activation functions (tanh) used in neural networks.
6 questions
9
votes
1 answer
When to use Tanh?
When and why would you not use Tanh?
I just replaced ReLU with Tanh, and my model trains about 2x faster, reaching 90% accuracy within 500 steps.
With ReLU it took more than 1,000 training steps to reach 90% accuracy.
I believe the reason it trained faster was due…
vxnuaj
- 125
- 1
- 6
3
votes
3 answers
Why is there tanh(x)*sigmoid(x) in an LSTM cell?
CONTEXT
I was wondering why there are sigmoid and tanh activation functions in an LSTM cell.
My intuition was based on the flow of tanh(x)*sigmoid(x)
and on the derivative of tanh(x)*sigmoid(x).
It seems to me that the authors wanted to choose such a…
MASTER OF CODE
- 242
- 2
- 9
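A minimal pure-Python sketch (illustrative only, not from the question or its answers) of the gating product tanh(x)*sigmoid(x) the question asks about, together with its derivative by the product rule:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gate(x):
    # The LSTM-style product the question refers to: tanh(x) * sigmoid(x)
    return math.tanh(x) * sigmoid(x)

def gate_derivative(x):
    # d/dx [tanh(x) * sigmoid(x)] by the product rule:
    # (1 - tanh(x)^2) * sigmoid(x) + tanh(x) * sigmoid(x) * (1 - sigmoid(x))
    t, s = math.tanh(x), sigmoid(x)
    return (1.0 - t * t) * s + t * s * (1.0 - s)

# The product stays bounded in (-1, 1) and saturates on both sides,
# which is the "flow" the asker's intuition is built on.
for x in (-4.0, 0.0, 4.0):
    print(f"x={x:+.1f}  gate={gate(x):+.4f}  d/dx={gate_derivative(x):+.4f}")
```

At x = 0 the product is exactly 0 with slope 0.5, since tanh(0) = 0 and sigmoid(0) = 0.5.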
3
votes
1 answer
Why is tanh a "smoothly" differentiable function?
The sigmoid, tanh, and ReLU are popular and useful activation functions in the literature.
The following excerpt, taken from p. 4 of Neural Networks and Neural Language Models, says that tanh has a couple of interesting properties.
For example, the…
hanugm
- 4,102
- 3
- 29
- 63
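A small pure-Python illustration (an assumption about which property the excerpt means) of the "smoothly differentiable" point: tanh's derivative, 1 - tanh(x)^2, is continuous everywhere, whereas ReLU's derivative jumps at 0:

```python
import math

def tanh_grad(x):
    # d/dx tanh(x) = 1 - tanh(x)^2: defined and continuous for every x
    t = math.tanh(x)
    return 1.0 - t * t

def relu_grad(x):
    # ReLU's derivative is 0 for x < 0 and 1 for x > 0: it jumps at x = 0
    return 1.0 if x > 0 else 0.0

eps = 1e-9
# tanh's gradient barely changes across 0; ReLU's flips from 0 to 1.
print(tanh_grad(-eps), tanh_grad(eps))   # both ~1.0
print(relu_grad(-eps), relu_grad(eps))   # 0.0 vs 1.0
```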
2
votes
1 answer
Why and when do we use ReLU over tanh activation function?
I was reading LeCun's Efficient Backprop, and the author repeatedly stressed the importance of averaging the input patterns at 0, thus justifying the usage of the tanh sigmoid. But if tanh is good, then how come ReLU is very popular in most NNs (which is even…
Struggling_In_Final
- 21
- 2
1
vote
0 answers
Could we add clipping in the output layer of the actor in DDPG?
I have a doubt about how clipping affects the training of RL agents.
In particular, I have come across code for training DDPG agents; the pseudo-code is the following:
for i in training iterations
    action = clip(ddpg.prediction(state)…
Leibniz
- 69
- 5
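A minimal pure-Python sketch (names and the action bound are hypothetical) of the two ways to bound an actor's action that the question contrasts: hard-clipping the prediction versus squashing with tanh in the output layer:

```python
import math

ACTION_LIMIT = 2.0  # hypothetical action bound, e.g. a [-2, 2] action space

def clip_action(raw):
    # Hard clip: constant outside the bounds, so saturated predictions
    # receive no learning signal through the clip.
    return max(-ACTION_LIMIT, min(ACTION_LIMIT, raw))

def tanh_action(raw):
    # Squash inside the actor's output layer: smooth and differentiable
    # everywhere, which is why many DDPG/SAC actors end with tanh.
    return ACTION_LIMIT * math.tanh(raw)

for raw in (-5.0, 0.5, 5.0):
    print(raw, clip_action(raw), tanh_action(raw))
```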
0
votes
1 answer
Producing nan when calculating log probability of sampled action from tanh distribution
policy.eval(); critic.eval()  # BN eval mode for rollout
with torch.no_grad():
    mean, std = policy(actor_critic_input)
    dist = TransformedDistribution(Normal(mean, std), [TanhTransform()])
…
Khushal Badhan
- 29
- 5
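A pure-Python sketch (not the asker's code) of why this log-probability can become NaN: the change-of-variables formula needs atanh(action), which hits its domain edge once tanh saturates to exactly ±1 in floating point (PyTorch's TanhTransform(cache_size=1) sidesteps the inverse by caching the pre-tanh sample). Computing the log-prob from the pre-tanh sample stays finite:

```python
import math

def log_prob_tanh_normal(u, mean=0.0, std=1.0):
    # log-prob of a = tanh(u) under tanh(Normal), via change of variables:
    # log p(a) = log N(u; mean, std) - log(1 - tanh(u)^2)
    log_normal = -0.5 * ((u - mean) / std) ** 2 - math.log(std * math.sqrt(2 * math.pi))
    # log(1 - tanh(u)^2) computed stably as 2 * (log 2 - u - log(1 + e^(-2u)))
    log_det = 2.0 * (math.log(2.0) - u - math.log1p(math.exp(-2.0 * u)))
    return log_normal - log_det

u = 20.0          # a large pre-tanh sample
a = math.tanh(u)  # rounds to exactly 1.0 in float64
# Inverting the squashed action fails: atanh(1.0) is a domain error here
# (torch returns inf), and log-probs computed from `a` go inf/nan.
# Using the cached pre-tanh sample keeps the log-prob finite:
print(a, log_prob_tanh_normal(u))
```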