
I learned that when CNN filters are defined, they are initialized with random weights and biases (I'm not sure about the biases).

Then, as training goes on, the weight values change and each filter produces its own feature map.

What I don't understand is this: if the filters are initialized with random values, is there any chance that

  1. different filters end up producing the same feature map, or
  2. the feature maps vary every time training is repeated?

It seems a little inefficient to initialize the weights of every filter randomly. More precisely, I think that (in most networks) the number of filters is too small to reliably capture meaningful features that way.

Is the second case the reason CNNs involve randomness?

COTHE

1 Answer


It is common to initialize parameters with "a good guess" when you have prior information, in order to help the model converge.

However, in deep learning you usually have no clue what the weights should be, so you initialize them randomly following schemes such as Xavier or Kaiming initialization. These schemes prevent gradients from vanishing or exploding (due to a bad initialization) by keeping the mean of the activations at zero and their variance constant across all layers.
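
For concreteness, here is a minimal sketch (assuming PyTorch; the layer sizes are arbitrary and purely illustrative) of applying Kaiming or Xavier initialization to a convolutional layer:

```python
import torch.nn as nn

# Minimal sketch (assuming PyTorch): an arbitrary conv layer, re-initialized
# with Kaiming (He) initialization, which keeps activation variance roughly
# constant across layers when ReLU nonlinearities are used.
conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3)

nn.init.kaiming_normal_(conv.weight, nonlinearity='relu')  # random weights with the right scale
nn.init.zeros_(conv.bias)                                  # biases are usually set to zero

# Xavier (Glorot) initialization is the analogous scheme for tanh/sigmoid layers:
# nn.init.xavier_uniform_(conv.weight)
```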

Regarding your questions:

  1. Yes, it is entirely possible that multiple filters learn the same thing. In fact, a good portion of the filters may be completely useless, which is the idea behind network pruning: deleting neurons or filters from a network while keeping the same performance.

  2. Yes, the network you obtain after training depends on the initialization. You can get different results depending on the initialization (just permuting the filters, for instance, gives a new network with the same capabilities, as shown in the sketch after this list), since you will never reach the global optimum for your task anyway, only sub-optimal weights. However, if trained properly, the final results should not vary too much.
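
To illustrate the second point, here is a minimal sketch (assuming PyTorch; the layer shapes and the random seed are arbitrary) showing that permuting the filters of one layer, together with the matching input channels of the next layer, produces different weights that compute exactly the same function:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
conv1 = nn.Conv2d(3, 8, 3)   # arbitrary two-layer stack, for illustration only
conv2 = nn.Conv2d(8, 4, 3)

x = torch.randn(1, 3, 16, 16)
out_original = conv2(conv1(x))

perm = torch.randperm(8)     # an arbitrary reordering of conv1's 8 filters
with torch.no_grad():
    conv1.weight.copy_(conv1.weight[perm])     # permute conv1's output filters
    conv1.bias.copy_(conv1.bias[perm])
    conv2.weight.copy_(conv2.weight[:, perm])  # permute conv2's input channels to match

out_permuted = conv2(conv1(x))
print(torch.allclose(out_original, out_permuted, atol=1e-6))  # True: same function
```

The two parameterizations are distinct points in weight space, yet they implement identical networks, which is one reason different random initializations can land on different but equally good solutions.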

Lelouch