
I have split the available dataset into 70% training, 15% validation, and 15% test, using holdout validation. I trained the model and got the following results: training accuracy 100%, validation accuracy 97.83%, test accuracy 96.74%.
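For reference, a 70/15/15 holdout split like the one described can be sketched in plain Python (the function name and seed are illustrative, not from the question):

```python
import random

def holdout_split(data, train_frac=0.70, val_frac=0.15, seed=0):
    """Shuffle once, then cut into train / validation / test partitions."""
    items = list(data)
    random.Random(seed).shuffle(items)  # fixed seed: same split every run
    n = len(items)
    n_train = int(n * train_frac)
    n_val = int(n * val_frac)
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]      # remaining ~15%
    return train, val, test

train, val, test = holdout_split(range(100))
print(len(train), len(val), len(test))  # → 70 15 15
```

Fixing the seed is what makes "the same data split is used in each run" hold across trials.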

In another training run, I got the following results: training accuracy 100%, validation accuracy 97.61%, test accuracy 98.91%.

The same data split is used in each run. Which model should I choose: the first case, in which the test accuracy is lower than the validation accuracy, or the second case, in which the test accuracy is higher?

Noha

1 Answer


Evaluating on the test set after every trial defeats the point of a train-val-test split. The reason the test set matters is that you are only supposed to evaluate on it once, when you believe your model is ready and all model and hyperparameter decisions have been made.

A good description can be found in this article: https://machinelearningmastery.com/difference-test-validation-datasets/

but to sum it up: the test set should give an unbiased estimate of generalization. The more often you evaluate against it, the more you bias that estimate.
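Concretely, this means model selection should use the validation column only, and the test accuracy of the chosen model is read exactly once at the end. A minimal sketch, plugging in the numbers from the question (the dict layout is illustrative):

```python
# Validation/test accuracies reported in the question, as fractions.
trials = {
    "trial 1": {"val_acc": 0.9783, "test_acc": 0.9674},
    "trial 2": {"val_acc": 0.9761, "test_acc": 0.9891},
}

# Model selection consults only validation accuracy...
chosen = max(trials, key=lambda name: trials[name]["val_acc"])

# ...and the test accuracy is reported once, for that single choice.
report = trials[chosen]["test_acc"]
print(chosen, report)  # → trial 1 0.9674
```

Under this discipline trial 1 is selected (higher validation accuracy), and its test accuracy is simply the final, unbiased number you report; comparing the two test scores to pick a winner would itself leak test information into the choice.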

mshlis