I've read in F. Rosenblatt, Principles of Neurodynamics: Perceptrons and the Theory of Brain Mechanisms, that in the Multilayer Perceptron the activation functions in the second, third, ... layers are all nonlinear and can all be different, while in the first layer they are all linear.

Why?

What does this depend on?

  1. When it is said that "the neural network learns automatically", what does that mean in colloquial terms?

AFAIK, one first trains the NN, and at some point the NN learns. Where does the "automatically" enter, then?

Thanks in advance for your help.

1 Answer

Rosenblatt was probably discussing one specific architecture, of which there are many. However, for general-purpose feed-forward back-propagation ANNs used for function approximation and classification, you can use whatever activation functions you want on the input side, in the hidden layers, and on the output side. Examples are identity, logistic, tanh, exponential, Hermite, Laguerre, RBF, ReLU, softmax, etc.

"Automatically" likely refers to the iterative learning process, which tends to be similar to gradient descent, during which the partial derivatives of the prediction error with respect to the coefficients decrease as the weights settle toward a minimum.
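To make both points concrete, here is a minimal sketch (my own illustration, not Rosenblatt's algorithm or any library's API): a tiny feed-forward network whose input layer is the identity, whose hidden layer uses tanh, and whose output layer uses the logistic function, trained on XOR by plain gradient descent. The layer sizes, learning rate, step count, and seed are arbitrary choices for the example.

    # Minimal sketch: one hidden layer, a different activation per layer,
    # weights adjusted by plain gradient descent. Hyperparameters are
    # illustrative, not recommendations.
    import numpy as np

    rng = np.random.default_rng(0)
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)  # identity input layer
    y = np.array([[0], [1], [1], [0]], dtype=float)              # XOR targets

    W1 = rng.normal(scale=0.5, size=(2, 4))  # input -> hidden weights
    b1 = np.zeros(4)
    W2 = rng.normal(scale=0.5, size=(4, 1))  # hidden -> output weights
    b2 = np.zeros(1)
    lr = 0.5

    for step in range(5000):
        # Forward pass: a different activation in each layer.
        h = np.tanh(X @ W1 + b1)                   # nonlinear hidden layer
        p = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))   # logistic output layer

        # Backward pass: partial derivatives of squared error w.r.t. the weights.
        err = p - y
        d_out = err * p * (1 - p)                  # chain rule through the logistic
        d_hid = (d_out @ W2.T) * (1 - h ** 2)      # chain rule through tanh

        # Gradient-descent update: move each coefficient against its gradient.
        W2 -= lr * h.T @ d_out
        b2 -= lr * d_out.sum(axis=0)
        W1 -= lr * X.T @ d_hid
        b1 -= lr * d_hid.sum(axis=0)

    print(np.round(p, 2))  # with these settings, typically close to [0, 1, 1, 0]

Nobody hand-tunes the weights here: each pass computes the error's partial derivatives by the chain rule and nudges every coefficient downhill, which is the "automatic" part of the learning.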