0

I have made a neural network from scratch (in Java), but it refuses to move beyond linear regression. I have increased the layer sizes (it now has 2 hidden layers, both with 5 neurons), yet when it is trained on steeply sloping polynomials, it still predicts values that follow a straight line, even though this produces a high cost.

The network is working in the sense that its predictions follow the polynomial as closely as a straight line can, but why won't it actually produce predictions that curve like the polynomial it trains on?

I have checked all aspects of training: SGD is working as it should, as is the cost function (MSE), and yet the network can't find a way to minimise the cost. It can't seem to break free of linear regression.

Gamaray
  • 113
  • 2

1 Answer

0

A neural network is basically a composition of matrix multiplications (linear combinations) and non-linear activations. In classical libraries, when no activation function is specified, everything stays linear (the identity activation y = x is used), so stacking layers still yields a single linear map.
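To see why stacking layers without an activation doesn't help, here is a minimal sketch (with made-up 1-dimensional weights, not taken from your code) showing that two chained linear layers collapse into one equivalent linear layer:

```java
public class LinearCollapse {
    public static void main(String[] args) {
        // Illustrative weights and biases for two tiny "layers"
        double w1 = 2.0, b1 = 1.0;   // first layer
        double w2 = -3.0, b2 = 0.5;  // second layer
        double x = 4.0;

        // Two layers stacked with no activation in between:
        // y = w2 * (w1 * x + b1) + b2
        double stacked = w2 * (w1 * x + b1) + b2;

        // Algebraically the same as ONE linear layer:
        // y = (w2*w1) * x + (w2*b1 + b2)
        double collapsed = (w2 * w1) * x + (w2 * b1 + b2);

        System.out.println(stacked == collapsed); // prints "true"
    }
}
```

No matter how many layers you add, the composition stays a straight line unless a non-linearity is inserted between them.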

These non-linear activation functions have several properties: the hyperbolic tangent and the sigmoid are doubly saturating (they flatten out at both extremes), and they are often used in classification tasks or combined with normalization layers.
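As a sketch of what this looks like in practice (the layer shapes and names here are illustrative, not from your network), you would apply the activation to each neuron's linear combination in the forward pass; `Math.tanh` also shows the saturation mentioned above:

```java
public class TanhForward {
    // One hidden layer: out[j] = tanh( b[j] + sum_i w[j][i] * x[i] )
    static double[] hiddenLayer(double[] x, double[][] w, double[] b) {
        double[] out = new double[b.length];
        for (int j = 0; j < b.length; j++) {
            double z = b[j];                          // linear combination
            for (int i = 0; i < x.length; i++) {
                z += w[j][i] * x[i];
            }
            out[j] = Math.tanh(z);                    // non-linearity: bends the line
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Math.tanh(0.5));   // near-linear region
        System.out.println(Math.tanh(10.0));  // saturated, close to 1.0
    }
}
```

Note that the backward pass must then include the activation's derivative (1 - tanh(z)^2 for tanh) in the chain rule.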

The cost function has nothing to do with non-linearity; it is just a way to measure how far we currently are from the goal.
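For instance, the MSE you mention is just an average of squared errors; a sketch (hypothetical helper, not from your code) makes it clear that no non-linearity of the model enters here:

```java
public class Mse {
    // Mean squared error between predictions and targets
    static double mse(double[] pred, double[] target) {
        double sum = 0.0;
        for (int i = 0; i < pred.length; i++) {
            double d = pred[i] - target[i];
            sum += d * d;
        }
        return sum / pred.length;
    }

    public static void main(String[] args) {
        // errors 0 and 2 -> (0 + 4) / 2 = 2.0
        System.out.println(mse(new double[]{1.0, 2.0}, new double[]{1.0, 4.0}));
    }
}
```

So a correct MSE and a correct SGD loop can both be working perfectly and the model will still only fit a line if the layers themselves are purely linear.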

nsaura
  • 258
  • 2
  • 7