Should I choose the model with highest validation accuracy or the model with highest mean of training and validation accuracy?

Question

I'm training a deep network in Keras for binary image classification (I have around 12K images). Every so often, I collect some false positives, add them to my training set, and retrain for higher accuracy.

I split my data 20/80 percent into training/validation sets.

Now, my question is: which resulting model should I use? Always the one with higher validation accuracy, or maybe the higher mean of training and validation accuracy? Which one of the two would you prefer?

Epoch #38: training acc: 0.924, validation acc: 0.944
Epoch #90: training acc: 0.952, validation acc: 0.932

s_bh · Accepted Answer · 2020-02-08T01:14:05.177

Neither of the criteria mentioned above is a reliable indicator of a model's performance.

A simple way to train the model just enough so that it generalizes well on unknown datasets would be to monitor the validation loss. Training should be stopped once the validation loss progressively starts increasing over multiple epochs. Beyond this point, the model learns the statistical noise within the data and starts overfitting.

A variant of early stopping can be implemented in Keras with a custom callback that stops training once the validation loss and accuracy cross chosen thresholds:

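import tensorflow as tf

class EarlyStop(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        logs = logs or {}
        # Metric keys depend on what was passed to model.compile()
        val_loss = logs.get('val_loss')
        val_acc = logs.get('val_categorical_accuracy')
        if (val_loss is not None and val_acc is not None
                and val_loss < LOSS_THRESHOLD and val_acc > ACCURACY_THRESHOLD):
            self.model.stop_training = True

early_stop = EarlyStop()
model.fit(..., callbacks=[early_stop])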

The loss and accuracy thresholds (LOSS_THRESHOLD and ACCURACY_THRESHOLD) can be estimated after a trial run of the model by inspecting the training/validation error curves.
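The rule described earlier (stop once the validation loss has failed to improve for several epochs in a row) can also be sketched without fixed thresholds. The snippet below is only an illustration in plain Python: the function name `early_stopping_epoch`, the `patience` and `min_delta` parameters, and the loss values are all assumptions made up for this example, not part of the original answer. Keras also ships a built-in `tf.keras.callbacks.EarlyStopping` callback (with `monitor`, `patience`, and `restore_best_weights` arguments) that applies the same idea during `model.fit`.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return (stop_epoch, best_epoch) as 0-based indices into val_losses.

    stop_epoch is None if the loss never stalls for `patience` epochs in a row.
    """
    best_loss = float("inf")
    best_epoch = None
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            # Validation loss improved: remember it and reset the counter.
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            # No improvement this epoch: use up one unit of patience.
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_epoch
    return None, best_epoch


# Hypothetical validation losses, invented for illustration only.
val_losses = [0.60, 0.45, 0.38, 0.35, 0.36, 0.37, 0.39, 0.41]
stop_epoch, best_epoch = early_stopping_epoch(val_losses, patience=3)
print(stop_epoch, best_epoch)  # 6 3
```

Here training would stop at index 6, and the weights worth keeping are those from index 3, where the validation loss was lowest. That is why restoring the best weights is usually preferable to keeping the weights from the final epoch.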

score 1 · Answer 2 · answered Feb 05 '20 at 21:31

Training accuracy tells you nothing about how well the model performs on data other than what it was trained on; it may score higher on the training data simply because it memorized those examples.

The validation set, on the other hand, indicates how well the model generalizes what it learned to new data (hopefully the validation set accurately represents the diversity of the data).

Since you are looking for a model that performs well on any dataset, you should not use training accuracy to choose your model, so you should choose the first option: the model with the highest validation accuracy.
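To see how the two criteria can disagree, here is a small sketch using the two epochs quoted in the question. The list layout and the helper names `by_val_acc` and `by_mean_acc` are assumptions introduced just for this illustration.

```python
# The two checkpoints quoted in the question.
checkpoints = [
    {"epoch": 38, "train_acc": 0.924, "val_acc": 0.944},
    {"epoch": 90, "train_acc": 0.952, "val_acc": 0.932},
]


def by_val_acc(checkpoint):
    """Selection criterion 1: validation accuracy only."""
    return checkpoint["val_acc"]


def by_mean_acc(checkpoint):
    """Selection criterion 2: mean of training and validation accuracy."""
    return (checkpoint["train_acc"] + checkpoint["val_acc"]) / 2


best_by_val = max(checkpoints, key=by_val_acc)
best_by_mean = max(checkpoints, key=by_mean_acc)
print(best_by_val["epoch"], best_by_mean["epoch"])  # 38 90
```

The mean favors epoch 90 only because its training accuracy is higher, which is exactly the number that can be inflated by memorization. Selecting on validation accuracy alone picks epoch 38.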
