
I have read this post: "How to choose an activation function?".

There is plenty of literature on activation functions, but when should I use a linear activation instead of ReLU?

What does the author mean by "ReLU when I'm dealing with positive values, and a linear function when I'm dealing with general values"?

Is there a more detailed answer to this?


2 Answers


The activation function you choose depends on the application you are building and the data you have to work with. It is hard to recommend one over the other without taking this into account.

Here is a short summary of the advantages and disadvantages of some common activation functions: https://missinglink.ai/guides/neural-network-concepts/7-types-neural-network-activation-functions-right/

"What does the author mean by ReLU when I'm dealing with positive values, and a linear function when I'm dealing with general values?"

ReLU works well for inputs > 0, since ReLU outputs 0 for any input < 0. This can "kill" the neuron, because the gradient is 0 there and the weights stop updating.

To remedy this, you could look into using a Leaky ReLU instead, which avoids killing the neuron by returning a small non-zero value (and hence a non-zero gradient) when the input <= 0.
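For concreteness, here is a minimal NumPy sketch of the two functions and their gradients (the slope `alpha=0.01` is just an illustrative choice, not a prescribed value):

```python
import numpy as np

def relu(x):
    # ReLU: passes positive inputs through, outputs 0 for negative inputs
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):
    # Leaky ReLU: keeps a small slope (alpha) for negative inputs,
    # so the gradient is never exactly 0
    return np.where(x > 0, x, alpha * x)

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(x))        # [0.    0.     0.  0.5  2. ]
print(leaky_relu(x))  # [-0.02 -0.005 0.  0.5  2. ]

# Gradients: ReLU's gradient is 0 for x <= 0 (the neuron can "die"),
# while Leaky ReLU's gradient is alpha there, so updates still flow.
relu_grad = (x > 0).astype(float)        # [0.   0.   0.   1. 1.]
leaky_grad = np.where(x > 0, 1.0, 0.01)  # [0.01 0.01 0.01 1. 1.]
```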


Nothing is set in stone here, but as a rule of thumb a linear activation is not very common. A linear activation in a hidden layer adds no expressive power, because consecutive linear layers collapse into a single linear transformation. A linear activation is typically used in the last layer when the outputs are not scaled to a fixed range, e.g. in regression. (This is the most common use I have seen.)
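As a rough illustration of the collapse (a sketch with random weight matrices, biases omitted for brevity), two stacked layers with linear activations are equivalent to a single linear layer:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4,))        # example input
W1 = rng.normal(size=(3, 4))     # first "hidden" layer with linear activation
W2 = rng.normal(size=(2, 3))     # second layer

h = W1 @ x                       # linear activation: output equals pre-activation
y = W2 @ h

# The same mapping expressed as a single linear layer:
W_combined = W2 @ W1
y_single = W_combined @ x

print(np.allclose(y, y_single))  # True: the hidden layer added no capacity
```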