I've heard that, because their architecture captures spatial relations, even untrained CNNs can be used as feature extractors. Is this true? Does anyone have sources on this that I can look at?
2 Answers
Yes, it has been demonstrated that the main factor behind CNNs working well is their architecture, which exploits locality during feature extraction. A CNN with random weights partitions the feature space randomly, but it still carries the spatial prior that works so well, so those random features are usable for classification (and sometimes even better than trained ones, as they don't introduce additional bias).
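To make the idea concrete, here is a minimal sketch in plain Python, not taken from any of the papers. It assumes a single layer of eight random 3x3 kernels followed by ReLU and global average pooling; the helper names `conv2d_valid` and `random_features` and the two toy stripe images are made up for illustration. It only shows that untrained kernels already map different spatial patterns to different feature vectors, not that those features are good enough for any particular task.

```python
import random

def conv2d_valid(image, kernel):
    """Slide the kernel over the image without padding (cross-correlation)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

def random_features(image, kernels):
    """One feature per kernel: ReLU followed by global average pooling."""
    features = []
    for kernel in kernels:
        feature_map = conv2d_valid(image, kernel)
        activated = [max(0.0, v) for row in feature_map for v in row]
        features.append(sum(activated) / len(activated))
    return features

# Untrained "layer": eight 3x3 kernels drawn from a standard normal (fixed seed).
rng = random.Random(0)
kernels = [[[rng.gauss(0, 1) for _ in range(3)] for _ in range(3)]
           for _ in range(8)]

# Two toy 8x8 images that differ only in spatial structure.
vertical_stripes = [[1.0 if c % 2 == 0 else 0.0 for c in range(8)] for r in range(8)]
horizontal_stripes = [[1.0 if r % 2 == 0 else 0.0 for c in range(8)] for r in range(8)]

features_vertical = random_features(vertical_stripes, kernels)
features_horizontal = random_features(horizontal_stripes, kernels)

print(features_vertical)
print(features_horizontal)
```

A simple classifier (for example, logistic regression) could then be trained on such feature vectors while the kernels stay fixed.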
You can read more in these papers:
I'm not sure it's possible. An untrained CNN has random kernel values. Let's say you have a 3x3 kernel like the one below:
0 0 0
0 0 0
0 0 1
I don't think that kernel can provide good information about the image. On the contrary, it eliminates a lot of information. We cannot rely on random values for feature extraction.
But if you use a CNN with "assigned" kernels, then you don't need to train the convolutional layer. For example, you can start a CNN with a kernel that is designed to extract vertical lines:
-1 2 -1
-1 2 -1
-1 2 -1
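As a rough illustration of what that kernel does, here is a small sketch in plain Python. The helper `conv2d_valid` and the two 5x5 toy images are assumptions made up for this example: one image has a one-pixel-wide vertical line, the other a horizontal line.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image without padding (cross-correlation)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

# The hand-assigned vertical-line kernel from above.
vertical_kernel = [
    [-1, 2, -1],
    [-1, 2, -1],
    [-1, 2, -1],
]

# 5x5 toy images: a vertical line in column 2, a horizontal line in row 2.
vertical_line = [[1 if c == 2 else 0 for c in range(5)] for r in range(5)]
horizontal_line = [[1 if r == 2 else 0 for c in range(5)] for r in range(5)]

response_vertical = conv2d_valid(vertical_line, vertical_kernel)
response_horizontal = conv2d_valid(horizontal_line, vertical_kernel)

for row in response_vertical:
    print(row)    # each row is [-3, 6, -3]: strong response where the line is centered
for row in response_horizontal:
    print(row)    # each row is [0, 0, 0]: the horizontal line is ignored
```

The kernel's rows each sum to zero, which is why a horizontal line (or any region that is constant along a row) produces no response.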