I am looking for some advice regarding the best choice of binary classification model based on training, validation and test set results. Model 1 (results in 1st image) shows better test set results than Model 2, but Model 2 (results in 2nd image) shows results that seem more intuitive to me with better training set performance than its test set performance. I feel as if the Model 1 test set results might have been a bit of a fluke, whereas Model 2 appears more like a well trained model with more long-term reliability.

Any advice on this is much appreciated.

Model 1 Results

Model 2 Results

1 Answer

Generally, if model A performs worse than model B on the test set while performing better on the training/validation sets, then model A has most likely overfit.

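What is overfitting? It is when a model fits its training data too closely, noise included, and so generalizes poorly to data it has not seen.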

So, in your case, Model 1 should be the more reliable choice on unseen data in the long term.
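
To make the rule of thumb concrete, here is a minimal sketch in Python. The helper names (`generalization_gap`, `likely_overfit`) and the scores are made up for illustration; they are not the values from your images, so substitute your own metric values.

```python
def generalization_gap(train_score, test_score):
    """Train score minus test score; a large positive gap hints at overfitting."""
    return train_score - test_score


def likely_overfit(model_a, model_b):
    """Rule of thumb: A is suspected of overfitting relative to B if it
    scores higher on the training set but lower on the test set."""
    return model_a["train"] > model_b["train"] and model_a["test"] < model_b["test"]


# Hypothetical scores (e.g. accuracy or AUC), NOT taken from the question.
model_1 = {"train": 0.84, "test": 0.86}
model_2 = {"train": 0.93, "test": 0.81}

# Model 2 beats Model 1 on train but loses on test, so it is flagged.
print(likely_overfit(model_2, model_1))
# The reverse comparison is not flagged.
print(likely_overfit(model_1, model_2))
# Size of Model 2's train/test gap.
print(round(generalization_gap(model_2["train"], model_2["test"]), 2))
```

The check only encodes the comparison described above. How large a gap counts as a problem depends on your metric and dataset size, so treat the output as a prompt for a closer look rather than a verdict.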