
I have read this post: "How to choose an activation function?".

There is plenty of literature on activation functions, but when should I use a linear activation instead of ReLU?

What does the author mean by "ReLU when I'm dealing with positive values, and a linear function when I'm dealing with general values"?

Is there a more detailed answer to this?


2 Answers


The activation function you choose depends on the application you are building and the data you have to work with. It is hard to recommend one over the other without taking this into account.

Here is a short summary of the advantages and disadvantages of some common activation functions: https://missinglink.ai/guides/neural-network-concepts/7-types-neural-network-activation-functions-right/

"What does the author mean by ReLU when I'm dealing with positive values, and a linear function when I'm dealing with general values?"

ReLU works well for inputs > 0, since ReLU outputs 0 for any input < 0. This can "kill" the neuron, because the gradient is 0 there and the weights stop updating.

To remedy this, you could look into using a Leaky ReLU instead, which avoids killing the neuron by returning a small non-zero value (and hence a non-zero gradient) when the input <= 0.
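For concreteness, here is a minimal NumPy sketch of the two functions and their gradients (the slope `alpha=0.01` is just an illustrative choice, not a prescribed value):

```python
import numpy as np

def relu(x):
    # ReLU: passes positive inputs through, outputs 0 for negative inputs
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):
    # Leaky ReLU: keeps a small slope (alpha) for negative inputs,
    # so the gradient is never exactly 0
    return np.where(x > 0, x, alpha * x)

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(x))        # [0.    0.     0.  0.5  2. ]
print(leaky_relu(x))  # [-0.02 -0.005 0.  0.5  2. ]

# Gradients: ReLU's gradient is 0 for x <= 0 (the neuron can "die"),
# while Leaky ReLU's gradient is alpha there, so updates still flow.
relu_grad = (x > 0).astype(float)        # [0.   0.   0.   1. 1.]
leaky_grad = np.where(x > 0, 1.0, 0.01)  # [0.01 0.01 0.01 1. 1.]
```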


Nothing is set in stone here, but as a rule of thumb a linear activation is not very common. A linear activation in a hidden layer adds no expressive power, because consecutive linear layers collapse into a single linear transformation. A linear activation is typically used in the last layer when the outputs are not scaled to a fixed range, e.g. in regression. (This is the most common use I have seen.)
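As a rough illustration of the collapse (a sketch with random weight matrices, biases omitted for brevity), two stacked layers with linear activations are equivalent to a single linear layer:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4,))        # example input
W1 = rng.normal(size=(3, 4))     # first "hidden" layer with linear activation
W2 = rng.normal(size=(2, 3))     # second layer

h = W1 @ x                       # linear activation: output equals pre-activation
y = W2 @ h

# The same mapping expressed as a single linear layer:
W_combined = W2 @ W1
y_single = W_combined @ x

print(np.allclose(y, y_single))  # True: the hidden layer added no capacity
```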