
In a neural-network regression problem, given that MSE squares the error and the error is between 0 and 1, would it be pointless to use MSE as our loss function during model training?

For example:

MSE = (y_pred - y_true) ^ 2

@ Expected model output range [0, 1]: MSE = (0.1 - 0.01) ^ 2 = 0.0081

// Significantly larger error is less pronounced in the MSE output
MSE = (0.1 - 0.0001) ^ 2 = 0.00998001

@ Expected model output range [10, 20]: MSE = (10 - 12) ^ 2 = 4

// Significantly larger error is more pronounced in the MSE output
MSE = (10 - 20) ^ 2 = 100
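In code, the same arithmetic (`squared_error` here is the per-sample squared error; MSE would be its mean over a batch):

```python
def squared_error(y_pred, y_true):
    """Per-sample squared error; MSE is the mean of this over a batch."""
    return (y_pred - y_true) ** 2

# Expected model output range [0, 1]
print(squared_error(0.1, 0.01))    # 0.0081
print(squared_error(0.1, 0.0001))  # ~0.00998001

# Expected model output range [10, 20]
print(squared_error(10, 12))   # 4
print(squared_error(10, 20))   # 100
```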

If MSE is indeed ineffective in that range, would using RMSE instead let us keep this loss function for the 0–1 range and still benefit from its outlier sensitivity during training? Or is there another loss that mimics the effect of MSE for values between 0 and 1?

2 Answers


Depending on what you want to do, there are advantages to other loss functions (cross-entropy) and other regression models (beta regression), but there is not necessarily a reason to dislike MSE as a loss function when the target is between $0$ and $1$, no. For instance, it might be that you know your outcome has a Gaussian distribution with a mean between $0$ and $1$. Then what you’re doing by minimizing MSE is equivalent to maximum likelihood estimation, which has many nice statistical properties.
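To make the maximum-likelihood connection explicit, here is the standard derivation, assuming $n$ i.i.d. observations $y_i$ with Gaussian noise of fixed variance $\sigma^2$ around the predictions $\hat{y}_i$:

$$-\log L = \frac{n}{2}\log\left(2\pi\sigma^2\right) + \frac{1}{2\sigma^2}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2.$$

With $\sigma^2$ fixed, the first term is constant, so maximizing the likelihood is exactly minimizing $\sum_i (y_i - \hat{y}_i)^2$, i.e. the MSE, and nothing in that argument cares whether the mean lies in $[0, 1]$.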

From calculus, since the square root is strictly increasing on $[0, \infty)$, $f(x)$ and $\sqrt{f(x)}$ are minimized by the same $x$. Thus, RMSE and MSE are equivalent as loss functions in the sense that whatever minimizes one minimizes the other.
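A quick numerical illustration of that equivalence, as a minimal sketch with made-up data (fitting a single constant prediction $c$ by grid search):

```python
import numpy as np

rng = np.random.default_rng(0)
y_true = rng.uniform(0.0, 1.0, size=100)  # hypothetical targets in [0, 1]

candidates = np.linspace(0.0, 1.0, 1001)  # candidate constant predictions c
mse = np.array([np.mean((c - y_true) ** 2) for c in candidates])
rmse = np.sqrt(mse)  # sqrt is monotone increasing, so the ordering is unchanged

# Both losses are minimized by the same candidate
# (the sample mean, up to grid resolution)
assert np.argmin(mse) == np.argmin(rmse)
print(candidates[np.argmin(mse)], y_true.mean())
```

(The gradients of the two losses differ by a state-dependent scale factor, so learning-rate tuning can feel different, but the minimizer is the same.)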

Dave

To summarize the comments and my original mistake:

The outlier-sensitivity effect is still there. Yes, the square of an error value less than 1 is smaller than the error itself; however:

Example:

For two error values of 0.9 and 0.2:

MAE: 0.9    MSE: 0.9 ^ 2 = 0.81

MAE: 0.2    MSE: 0.2 ^ 2 = 0.04

MAE ratio: 0.9 / 0.2 = 4.5       # less sensitive in MAE
MSE ratio: 0.81 / 0.04 = 20.25   # still significantly more sensitive in MSE
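The same comparison in a few lines of Python, just mirroring the numbers above:

```python
errors = [0.9, 0.2]

mae_vals = [abs(e) for e in errors]  # 0.9, 0.2
mse_vals = [e ** 2 for e in errors]  # 0.81, 0.04

# Ratio of the penalty for the larger error to that for the smaller one
print(mae_vals[0] / mae_vals[1])  # 4.5   -> linear growth under MAE
print(mse_vals[0] / mse_vals[1])  # 20.25 -> quadratic growth: MSE still
                                  #          punishes the larger error far harder
```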