I was reading Gary Marcus's "Deep Learning: A Critical Appraisal". One of his criticisms is that neural networks do not incorporate prior knowledge when tackling a problem. My question is: have there been any attempts at encoding prior knowledge into deep neural networks?
5 Answers
Neural nets do incorporate prior knowledge. This can be done in two ways. The first (the most frequent and more robust) is data augmentation: for example, in convolutional networks, if we know that the "value" (whatever that is, class/regression) of the object we are looking at is rotation/translation invariant (our prior knowledge), then we augment the data with random rotations/shifts. The second is an additional term in the loss function.
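As a minimal sketch of the augmentation route, assuming torchvision is available (the dataset and the exact transforms are just illustrative choices):

```python
# Sketch: encoding rotation/translation invariance as data augmentation.
# Assumes torchvision; MNIST and the transform parameters are only illustrative.
import torchvision.transforms as T
from torchvision.datasets import MNIST

# Our prior knowledge: the label does not change under small rotations/shifts,
# so the network sees randomly rotated/shifted copies of each image during training.
augment = T.Compose([
    T.RandomRotation(degrees=15),                     # random rotation
    T.RandomAffine(degrees=0, translate=(0.1, 0.1)),  # random shift
    T.ToTensor(),
])

train_set = MNIST(root="data", train=True, download=True, transform=augment)
```

The second route would add something like `lambda * prior_penalty` to the task loss, where the penalty term measures how strongly the network's output violates the prior.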
Yes, we can do it in a deep learner.
For example, suppose we have an input vector like $(a, b)$ and from prior knowledge we know that $a^2 + b^2$ is important too. Then we can append this value to the vector: $(a, b, a^2 + b^2)$.
As another example, suppose the date/time is important in your data but is not encoded in the input vector. We can add it to the input vector as an additional dimension.
In summary, depending on the structure of the prior knowledge, we can encode it into the input vector.
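A minimal sketch of that kind of feature augmentation (the function name and the hour-of-day encoding are just illustrative choices):

```python
import numpy as np

def add_prior_features(a, b, timestamps):
    """Append hand-crafted features that encode prior knowledge.

    a, b       : arrays of the original input features
    timestamps : POSIX times in seconds (illustrative encoding of date/time)
    """
    radius_sq = a**2 + b**2                 # prior: a^2 + b^2 is informative
    hour = (timestamps % 86400) / 3600.0    # prior: time of day is informative
    return np.column_stack([a, b, radius_sq, hour])

# toy usage
a = np.array([1.0, 2.0])
b = np.array([3.0, 4.0])
t = np.array([0, 45_000])
print(add_prior_features(a, b, t))   # each row: (a, b, a^2 + b^2, hour of day)
```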
To add to Foivos's answer:
Convolutional Neural Networks are shift-invariant; Fukushima built this into his Neocognitron. There have also been attempts to introduce scale invariance into CNNs: https://arxiv.org/abs/1411.6369
Also, CNNs use the structural characteristics of the data (e.g. the 2D layout of images) as prior knowledge.
And neural networks are locally smooth, which is itself a (smoothness) prior.
It is not perfect, but neural networks do incorporate a lot of prior knowledge.
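A small numerical check of the shift-invariance claim, under simplifying assumptions (circular shifts, a single random filter, and global max pooling standing in for a CNN layer):

```python
import numpy as np
from scipy.signal import correlate2d

rng = np.random.default_rng(0)
image = rng.random((16, 16))
kernel = rng.random((3, 3))          # an arbitrary convolutional filter

def conv_maxpool(x):
    # "same"-size correlation with wrap-around padding, then global max pooling
    feature_map = correlate2d(x, kernel, mode="same", boundary="wrap")
    return feature_map.max()

shifted = np.roll(image, shift=(5, 3), axis=(0, 1))   # circularly shift the input
print(np.isclose(conv_maxpool(image), conv_maxpool(shifted)))   # True
```

Real CNNs use zero padding and local pooling, so the invariance is only approximate in practice, which is the sense in which it "is not perfect".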
It kinda depends on how exactly you define knowledge, and what you believe about what the weights in a trained NN model really represent. But to answer this question in the most straightforward possible way (hopefully without sounding glib), then yes, a NN can be pre-trained, and then you can take that model and apply additional training to it, so in a sense, it is using "prior knowledge".
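A minimal sketch of that pre-train-then-keep-training idea in PyTorch (the architecture, file name, and toy data are made up for illustration):

```python
import torch
import torch.nn as nn

# A small network: a feature extractor ("body") plus a task-specific head.
body = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 32), nn.ReLU())
head = nn.Linear(32, 2)
model = nn.Sequential(body, head)

# Pretend the body was already trained on a related task and saved earlier;
# loading those weights is the "prior knowledge" being reused.
# body.load_state_dict(torch.load("pretrained_body.pt"))

for p in body.parameters():        # freeze the pretrained part...
    p.requires_grad = False

optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)   # ...and fine-tune only the head
loss_fn = nn.CrossEntropyLoss()

x, y = torch.randn(64, 10), torch.randint(0, 2, (64,))     # toy data for the new task
optimizer.zero_grad()
loss = loss_fn(model(x), y)
loss.backward()
optimizer.step()
```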
If, OTOH, knowledge means something a little different to you, and you're thinking about the kind of knowledge that's encoded in a semantic network, or a conceptual graph, or something of that nature, then I don't know - offhand - of any direct way to integrate that into an ANN. What you might be able to do is combine the NN with a different kind of reasoner that reasons over the semantic network / conceptual graph, and then integrate the results. AFAIK, how best to do that is an unsolved research problem.
A simple example of this is token embeddings. If "prior knowledge" just means anything known prior to creation of the graph, then using pretrained vector embeddings meets this criterion. This is simply a way to provide a fixed method for projecting tokens into a continuous vector space, instead of training that projection at the same time as the rest of the model. Given that vector embeddings are somewhat interpretable and that the same embedding can be reused across tasks and models, I'd consider pretrained embeddings to be prior knowledge being incorporated.
The embeddings could also technically be handcrafted, but I'm not aware of any work like that and am skeptical of its usefulness in deep models.
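For concreteness, this is roughly how pretrained (and optionally frozen) embeddings get plugged into a PyTorch model; the random matrix below just stands in for vectors such as GloVe or word2vec loaded from disk:

```python
import torch
import torch.nn as nn

vocab_size, embed_dim = 10_000, 300

# Stand-in for a pretrained embedding matrix (e.g. GloVe/word2vec), loaded from disk in practice.
pretrained_vectors = torch.randn(vocab_size, embed_dim)

# freeze=True keeps the "prior knowledge" fixed while the rest of the model trains.
embedding = nn.Embedding.from_pretrained(pretrained_vectors, freeze=True)

token_ids = torch.tensor([[12, 305, 7]])   # a toy batch of token indices
vectors = embedding(token_ids)             # shape: (1, 3, 300)
print(vectors.shape)
```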