
I am training a modified VGG-16 to classify crowd density (empty, low, moderate, high). Two dropout layers were added at the end of the network, one after each of the last two FC layers.

Network settings (a rough sketch of this setup is shown after the list):

  • training data contains 4381 images categorized into 4 classes (empty, low, moderate, high); 20% of the training data is set aside for validation. The test data has 2589 images.

  • training is done for 50 epochs (validation accuracy drops after 50 epochs)

  • lr=0.001, decay=0.0005, momentum=0.9

  • loss= categorical_crossentropy

  • augmentation for the training, validation, and testing data: rescale=1./255, brightness_range=(0.2, 0.9), horizontal_flip
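
A minimal sketch of this setup in Keras (the dropout rates and dense-layer sizes below are placeholders, not my exact values):

```python
from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import Dense, Dropout, Flatten
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import SGD

# VGG-16 backbone with a custom head: dropout after each of the last two
# fully connected layers, softmax over the 4 density classes.
base = VGG16(include_top=False, input_shape=(224, 224, 3))
x = Flatten()(base.output)
x = Dense(4096, activation="relu")(x)
x = Dropout(0.5)(x)   # placeholder rate
x = Dense(4096, activation="relu")(x)
x = Dropout(0.5)(x)   # placeholder rate
outputs = Dense(4, activation="softmax")(x)

model = Model(inputs=base.input, outputs=outputs)
model.compile(
    optimizer=SGD(learning_rate=0.001, momentum=0.9),  # decay=0.0005 was passed as a kwarg in older Keras versions
    loss="categorical_crossentropy",
    metrics=["accuracy"],
)
```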

With the above-stated settings, I get the following results:

  • training evaluation loss: 0.59, accuracy: 0.77

  • testing accuracy: 77.5% (2007 correct predictions out of 2589)

Regarding this, I have two concerns:

  1. Is there anything else I could do to improve accuracy for both training and testing?

  2. How can I know if this is the best accuracy I can get?

2 Answers


Is there anything else I could do to improve accuracy for both training and testing?

Yes, of course. There are a lot of methods you can try to improve your accuracy; some that I can mention:

  • Try to use a more complex model: ResNet, DenseNet, etc. (see the sketch after this list)
  • Try to use other optimizers: Adam, Adadelta, etc.
  • Tune your hyperparameters (e.g. change your learning rate, momentum, rescale factor, convolution size, number of feature maps, epochs, neurons, FC layers)
  • Try to analyze your data: with ~77% accuracy and 4 categories, is it possible that one category is particularly difficult to classify?
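
As a rough sketch of the first two points (the ResNet50 backbone, ImageNet initialisation, head size, and learning rate here are just illustrative choices, not something stated in the question):

```python
from tensorflow.keras.applications import ResNet50
from tensorflow.keras.layers import Dense, Dropout, GlobalAveragePooling2D
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam

# A more complex backbone (ResNet50) with a small 4-class head,
# compiled with Adam instead of SGD.
base = ResNet50(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
x = GlobalAveragePooling2D()(base.output)
x = Dropout(0.5)(x)
outputs = Dense(4, activation="softmax")(x)

model = Model(inputs=base.input, outputs=outputs)
model.compile(optimizer=Adam(learning_rate=1e-4),
              loss="categorical_crossentropy",
              metrics=["accuracy"])
```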

In essence, you have to do a lot of experiments with your model until you decide that it is good enough (for instance, if you have a deadline). If you don't have a hard deadline, you can keep improving your ML model.

How can I know if this is the best accuracy I can get?

You can't know until you compare it with other models/hyperparameters. If you run more experiments (for example, along the lines mentioned above) or compare with other people's experiments that use the same data, you'll find out which one is the best. For an academic paper, for example, you typically need to compare at least 3 to 4 similar models, or experiment with hundreds of different hyperparameter combinations.
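
A minimal sketch of what such a comparison loop might look like (build_model, train_gen, and val_gen are hypothetical placeholders for your own model-building function and data generators):

```python
# Try a few hyperparameter combinations and keep the one with the best
# validation accuracy. build_model, train_gen, val_gen are placeholders.
best_acc, best_config = 0.0, None
for lr in (1e-2, 1e-3, 1e-4):
    for dropout_rate in (0.3, 0.5):
        model = build_model(lr=lr, dropout_rate=dropout_rate)
        history = model.fit(train_gen, validation_data=val_gen,
                            epochs=20, verbose=0)
        val_acc = max(history.history["val_accuracy"])
        if val_acc > best_acc:
            best_acc, best_config = val_acc, (lr, dropout_rate)

print("best config:", best_config, "validation accuracy:", best_acc)
```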

malioboro

One option not mentioned by malioboro is getting more data. A bigger dataset almost always improves training results. If it is too hard to obtain more labeled data, you can use data augmentation on the existing data: small random transformations that keep the label the same.

For images, the most common augmentation methods are (applying padding where needed; see the sketch after the list):

  • small random zooming in/out
  • small random shifts of the image
  • adding random noise to the image
  • small random changes in brightness, contrast, color balance, and similar parameters
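
In Keras, for example, most of these are built into ImageDataGenerator (the specific ranges below are just illustrative; random noise would need a custom preprocessing_function, not shown here):

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Small random zooms, shifts and brightness changes for the training data;
# fill_mode controls how the padded border pixels are filled in.
train_datagen = ImageDataGenerator(
    rescale=1./255,
    zoom_range=0.1,               # small random zoom in/out
    width_shift_range=0.1,        # small random horizontal shift
    height_shift_range=0.1,       # small random vertical shift
    brightness_range=(0.8, 1.2),  # small random brightness change
    horizontal_flip=True,
    fill_mode="nearest",
)
```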

There are more complex methods, but they depend on the specifics of the dataset and the goal of training.

Stopping (see the sketch after this list):

  • you should stop training if you see that the training error is not decreasing - you have reached a (possibly local) minimum.
  • you should stop if the testing or validation error starts to increase - your model is overfitting (data augmentation can help in that case)
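
In Keras this can be automated with the EarlyStopping callback; a rough sketch (train_gen and val_gen are placeholders for your own generators):

```python
from tensorflow.keras.callbacks import EarlyStopping

# Stop once validation loss has not improved for a few epochs and
# roll back to the best weights seen so far.
early_stop = EarlyStopping(monitor="val_loss", patience=5,
                           restore_best_weights=True)
model.fit(train_gen, validation_data=val_gen,
          epochs=100, callbacks=[early_stop])
```
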
mirror2image