
In a neural-network regression problem, given that MSE squares the error and the error is between 0 and 1, would it be pointless to use MSE as our loss function during model training?

For example:

MSE = (y_pred - y_true) ^ 2

@ Expected model output range [0, 1]: MSE = (0.1 - 0.01) ^ 2 = 0.0081

// Significantly larger error is less pronounced in the MSE output
MSE = (0.1 - 0.0001) ^ 2 = 0.00998001

@ Expected model output range [10, 20]: MSE = (10 - 12) ^ 2 = 4

// Significantly larger error is more pronounced in the MSE output
MSE = (10 - 20) ^ 2 = 100
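In code, the same arithmetic (`squared_error` here is the per-sample squared error; MSE would be its mean over a batch):

```python
def squared_error(y_pred, y_true):
    """Per-sample squared error; MSE is the mean of this over a batch."""
    return (y_pred - y_true) ** 2

# Expected model output range [0, 1]
print(squared_error(0.1, 0.01))    # 0.0081
print(squared_error(0.1, 0.0001))  # ~0.00998001

# Expected model output range [10, 20]
print(squared_error(10, 12))   # 4
print(squared_error(10, 20))   # 100
```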

If MSE is indeed ineffective in that range, would using RMSE instead let us keep this loss function for the 0–1 range and still benefit from its outlier sensitivity during training? Or is there another loss that mimics the effect of MSE for values between 0 and 1?

2 Answers


Depending on what you want to do, there are advantages to other loss functions (cross-entropy) and other regression models (beta regression), but there is not necessarily a reason to dislike MSE as a loss function when the target is between $0$ and $1$, no. For instance, it might be that you know your outcome has a Gaussian distribution with a mean between $0$ and $1$. Then what you’re doing by minimizing MSE is equivalent to maximum likelihood estimation, which has many nice statistical properties.
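To make the maximum-likelihood connection explicit, here is the standard derivation, assuming $n$ i.i.d. observations $y_i$ with Gaussian noise of fixed variance $\sigma^2$ around the predictions $\hat{y}_i$:

$$-\log L = \frac{n}{2}\log\left(2\pi\sigma^2\right) + \frac{1}{2\sigma^2}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2.$$

With $\sigma^2$ fixed, the first term is constant, so maximizing the likelihood is exactly minimizing $\sum_i (y_i - \hat{y}_i)^2$, i.e. the MSE, and nothing in that argument cares whether the mean lies in $[0, 1]$.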

From calculus, since the square root is strictly increasing on $[0, \infty)$, $f(x)$ and $\sqrt{f(x)}$ are minimized by the same $x$. Thus, RMSE and MSE are equivalent as loss functions in the sense that whatever minimizes one minimizes the other.
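A quick numerical illustration of that equivalence, as a minimal sketch with made-up data (fitting a single constant prediction $c$ by grid search):

```python
import numpy as np

rng = np.random.default_rng(0)
y_true = rng.uniform(0.0, 1.0, size=100)  # hypothetical targets in [0, 1]

candidates = np.linspace(0.0, 1.0, 1001)  # candidate constant predictions c
mse = np.array([np.mean((c - y_true) ** 2) for c in candidates])
rmse = np.sqrt(mse)  # sqrt is monotone increasing, so the ordering is unchanged

# Both losses are minimized by the same candidate
# (the sample mean, up to grid resolution)
assert np.argmin(mse) == np.argmin(rmse)
print(candidates[np.argmin(mse)], y_true.mean())
```

(The gradients of the two losses differ by a state-dependent scale factor, so learning-rate tuning can feel different, but the minimizer is the same.)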

Dave

To summarize the comments and my original mistake:

The outlier-sensitivity effect is still there. Yes, the square of an error value less than 1 is smaller than the error itself; however:

Example:

For two error values of 0.9 and 0.2:

MAE: 0.9    MSE: 0.9 ^ 2 = 0.81

MAE: 0.2    MSE: 0.2 ^ 2 = 0.04

MAE ratio: 0.9 / 0.2 = 4.5       # less sensitive in MAE
MSE ratio: 0.81 / 0.04 = 20.25   # still significantly more sensitive in MSE
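The same comparison in a few lines of Python, just mirroring the numbers above:

```python
errors = [0.9, 0.2]

mae_vals = [abs(e) for e in errors]  # 0.9, 0.2
mse_vals = [e ** 2 for e in errors]  # 0.81, 0.04

# Ratio of the penalty for the larger error to that for the smaller one
print(mae_vals[0] / mae_vals[1])  # 4.5   -> linear growth under MAE
print(mse_vals[0] / mse_vals[1])  # 20.25 -> quadratic growth: MSE still
                                  #          punishes the larger error far harder
```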