I have a neural network with two hidden layers of 128 neurons each. The input layer has 20 inputs, and the output layer has 3 outputs.
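
In Keras, my architecture would look something like this (a sketch; the activations, optimizer, and loss shown here are illustrative choices):

```python
import tensorflow as tf

# 20 inputs -> two hidden layers of 128 neurons -> 3 outputs.
# Activation, optimizer, and loss are illustrative choices.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(20,)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(3, activation='softmax'),
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
```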

I have 1 million records of data: 80% is used to train the network and 20% for validation. I run the training for 100,000 epochs.

I see that the neural network attains 100% accuracy on the training data after only 12,000 epochs.

Should I stop training, or continue until all 100,000 epochs are complete? Please explain why.

1 Answer

First of all, as mentioned by @Neil Slater in the comments, you need three splits: a training set, a validation set, and a test set.

The difference between the validation and the test set is sometimes disregarded, but they serve different purposes. To quote https://machinelearningmastery.com/difference-test-validation-datasets/:

Validation Dataset: The sample of data used to provide an unbiased evaluation of a model fit on the training dataset while tuning model hyperparameters. The evaluation becomes more biased as skill on the validation dataset is incorporated into the model configuration.

Test Dataset: The sample of data used to provide an unbiased evaluation of a final model fit on the training dataset.
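
For concreteness, here is a minimal sketch of such a three-way split with scikit-learn (the 80/10/10 ratio and the array names are illustrative assumptions, not a recommendation specific to your data):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder arrays standing in for the 1 million records:
# 20 input features, 3 output classes.
X = np.random.rand(1_000_000, 20)
y = np.random.randint(0, 3, size=1_000_000)

# First split off 20% as a held-out pool, then halve that pool
# into validation and test sets (80/10/10 overall).
X_train, X_pool, y_train, y_pool = train_test_split(X, y, test_size=0.2, random_state=42)
X_val, X_test, y_val, y_test = train_test_split(X_pool, y_pool, test_size=0.5, random_state=42)
```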

Secondly, in order to understand what's happening, plot the training and validation loss jointly. If the performance on the validation data becomes much worse than that on the training data, it is better to terminate training, since this is an indication of overfitting.
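
With Keras, the per-epoch losses are recorded in the History object returned by fit, so such a joint plot can be sketched like this (assuming the model and arrays from the snippets above):

```python
import matplotlib.pyplot as plt

# 'model', 'X_train', 'y_train', 'X_val', 'y_val' come from the snippets above.
history = model.fit(X_train, y_train,
                    validation_data=(X_val, y_val),
                    epochs=100)  # illustrative; far fewer than 100,000

plt.plot(history.history['loss'], label='training loss')
plt.plot(history.history['val_loss'], label='validation loss')
plt.xlabel('epoch')
plt.ylabel('loss')
plt.legend()
plt.show()
```

A validation curve that flattens or rises while the training curve keeps falling is the overfitting signature described above.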

A good practice is to use early stopping; there is an implementation of this callback in TensorFlow: https://www.tensorflow.org/api_docs/python/tf/keras/callbacks/EarlyStopping. It is a kind of regularization procedure: https://en.wikipedia.org/wiki/Early_stopping.
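
A minimal usage sketch of that callback (the patience value is an illustrative choice):

```python
import tensorflow as tf

# Stop when the validation loss has not improved for 10 epochs
# and roll the weights back to the best epoch seen.
early_stopping = tf.keras.callbacks.EarlyStopping(
    monitor='val_loss',
    patience=10,
    restore_best_weights=True,
)

model.fit(X_train, y_train,
          validation_data=(X_val, y_val),
          epochs=100_000,
          callbacks=[early_stopping])
```

With this in place, you can request the full 100,000 epochs and let the callback decide when further training stops helping.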