
I am training a convolutional neural network for object detection. Apart from the learning rate, what are the other hyperparameters that I should tune? And in what order of importance? Besides, I read that doing a grid search for hyperparameters is not the best way to go about training and that random search is better in this case. Is random search really that good?

hanugm
S.E.K.

3 Answers


It depends a lot on what type of architecture you are using. However, most of the standard architectures are quite stable and there is no need for much hyperparameter tuning.

  1. Choose whether you want to use a standard CNN architecture or a more advanced one like a ResNet.
  2. Depth: how many conv layers will you be using?
  3. Number of filters in each of the layers. This is where you might need a bit of tuning.
  4. You can try tuning the other parameters that are not specific to the problem, e.g. learning rate, regularization, dropout. If you already have a good feel for what works well and what doesn't, then a grid search guided by your intuition would probably work pretty well. No need for random search in this case.
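To make the grid-vs-random comparison from the question concrete, here is a minimal sketch of both strategies over the non-architecture hyperparameters mentioned above. The search space and its values are illustrative assumptions, not taken from any particular library:

```python
import itertools
import random

# Hypothetical search space (values chosen for illustration only).
space = {
    "learning_rate": [1e-4, 1e-3, 1e-2],
    "dropout": [0.2, 0.5],
    "weight_decay": [0.0, 1e-4],
}

# Grid search: evaluate every combination (3 * 2 * 2 = 12 trials).
grid = [dict(zip(space, values))
        for values in itertools.product(*space.values())]

# Random search: sample a fixed budget of configurations instead,
# which scales better when only a few hyperparameters matter.
rng = random.Random(0)

def sample_config():
    return {name: rng.choice(choices) for name, choices in space.items()}

random_trials = [sample_config() for _ in range(6)]
```

Each configuration dict would then be passed to your training routine; random search lets you cap the number of trials independently of the grid size.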
pi-tau

Firstly, when you say an object-detection CNN, there is a huge number of model architectures available. Assuming you have narrowed down your model architecture, a CNN will have a few common layers like the ones below, each with hyperparameters you can tweak:

  1. Convolution layer: number of kernels, kernel size, stride length, padding
  2. MaxPooling layer: kernel size, stride length, padding
  3. Dense layer: size
  4. Dropout: drop probability (some frameworks specify the keep probability instead)
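The kernel size, stride, and padding of the conv and pooling layers jointly determine the spatial size of each layer's output, which constrains how many such layers you can stack. A small sketch of the standard output-size formula (the layer stack below is a made-up example, not a recommended architecture):

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or max-pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1

# Hypothetical stack on a 32x32 input:
size = 32
size = conv_output_size(size, kernel=3, stride=1, padding=1)  # 3x3 conv, "same" padding -> 32
size = conv_output_size(size, kernel=2, stride=2)             # 2x2 max-pool -> 16
```

Running the formula like this before training is a cheap way to check that a chosen combination of kernel sizes, strides, and paddings doesn't shrink the feature map to nothing.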
keshav

You can play around with the batch size, the number of epochs, and the learning rate.

Start with a batch size of 8 and gradually double it: 8, 16, 32.

Start with a higher learning rate to make fast progress toward a minimum, then reduce the learning rate to avoid bouncing around it and let training converge.

Start with a low number of epochs to get quick feedback on the rest of the hyperparameters. Once you have settled on them, move to a higher number of epochs.
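The batch-size doubling and the start-high-then-decay learning rate described above can be sketched as simple schedules. The step size and decay factor here are illustrative assumptions, not prescribed values:

```python
def stepped_lr(initial_lr, epoch, drop_every=10, factor=0.5):
    """Start with a high learning rate, then halve it every `drop_every` epochs."""
    return initial_lr * factor ** (epoch // drop_every)

# Batch sizes to sweep: 8, 16, 32 (doubling each time).
batch_sizes = [8 * 2 ** i for i in range(3)]

# Learning rate over a hypothetical 30-epoch run starting at 0.1.
lr_schedule = [stepped_lr(0.1, epoch) for epoch in range(30)]
```

In practice you would hand a schedule like this to your optimizer (most frameworks ship equivalents, e.g. step-decay learning-rate schedulers), but writing it out makes the "start high, then reduce" advice explicit.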