I was reading Gary Marcus's "Deep Learning: A Critical Appraisal". One of his criticisms is that neural networks do not incorporate prior knowledge when tackling a problem. My question is: have there been any attempts at encoding prior knowledge into deep neural networks?
5 Answers
Neural nets do incorporate prior knowledge. This can be done in two ways. The first (the most frequent and more robust) is data augmentation: for example, in convolutional networks, if we know that the "value" (whatever that is, class/regression) of the object we are looking at is rotation/translation invariant (our prior knowledge), then we augment the data with random rotations/shifts. The second is an additional term in the loss function.
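As a minimal sketch of the augmentation route, assuming torchvision is available (the dataset and the exact transforms are just illustrative choices):

```python
# Sketch: encoding rotation/translation invariance as data augmentation.
# Assumes torchvision; MNIST and the transform parameters are only illustrative.
import torchvision.transforms as T
from torchvision.datasets import MNIST

# Our prior knowledge: the label does not change under small rotations/shifts,
# so the network sees randomly rotated/shifted copies of each image during training.
augment = T.Compose([
    T.RandomRotation(degrees=15),                     # random rotation
    T.RandomAffine(degrees=0, translate=(0.1, 0.1)),  # random shift
    T.ToTensor(),
])

train_set = MNIST(root="data", train=True, download=True, transform=augment)
```

The second route would add something like `lambda * prior_penalty` to the task loss, where the penalty term measures how strongly the network's output violates the prior.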
Yes, we can do it in a deep learner.
For example, suppose we have an input vector like $(a, b)$ and from prior knowledge we know that $a^2 + b^2$ is important too. Then we can append this value to the vector: $(a, b, a^2 + b^2)$.
As another example, suppose the date/time is important in your data but is not encoded in the input vector. We can add it to the input vector as an additional dimension.
In summary, depending on the structure of the prior knowledge, we can encode it into the input vector.
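A minimal sketch of that kind of feature augmentation (the function name and the hour-of-day encoding are just illustrative choices):

```python
import numpy as np

def add_prior_features(a, b, timestamps):
    """Append hand-crafted features that encode prior knowledge.

    a, b       : arrays of the original input features
    timestamps : POSIX times in seconds (illustrative encoding of date/time)
    """
    radius_sq = a**2 + b**2                 # prior: a^2 + b^2 is informative
    hour = (timestamps % 86400) / 3600.0    # prior: time of day is informative
    return np.column_stack([a, b, radius_sq, hour])

# toy usage
a = np.array([1.0, 2.0])
b = np.array([3.0, 4.0])
t = np.array([0, 45_000])
print(add_prior_features(a, b, t))   # each row: (a, b, a^2 + b^2, hour of day)
```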
To add to Foivos's answer:
Convolutional Neural Networks are shift-invariant; Fukushima built this into his Neocognitron. There have also been attempts to introduce scale invariance into CNNs: https://arxiv.org/abs/1411.6369
Also, CNNs use the structural characteristics of the data (e.g. the 2D layout of images) as prior knowledge.
And neural networks are locally smooth, which is itself a (smoothness) prior.
It is not perfect, but neural networks do incorporate a lot of prior knowledge.
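A small numerical check of the shift-invariance claim, under simplifying assumptions (circular shifts, a single random filter, and global max pooling standing in for a CNN layer):

```python
import numpy as np
from scipy.signal import correlate2d

rng = np.random.default_rng(0)
image = rng.random((16, 16))
kernel = rng.random((3, 3))          # an arbitrary convolutional filter

def conv_maxpool(x):
    # "same"-size correlation with wrap-around padding, then global max pooling
    feature_map = correlate2d(x, kernel, mode="same", boundary="wrap")
    return feature_map.max()

shifted = np.roll(image, shift=(5, 3), axis=(0, 1))   # circularly shift the input
print(np.isclose(conv_maxpool(image), conv_maxpool(shifted)))   # True
```

Real CNNs use zero padding and local pooling, so the invariance is only approximate in practice, which is the sense in which it "is not perfect".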
It kinda depends on how exactly you define knowledge, and what you believe about what the weights in a trained NN model really represent. But to answer this question in the most straightforward possible way (hopefully without sounding glib), then yes, a NN can be pre-trained, and then you can take that model and apply additional training to it, so in a sense, it is using "prior knowledge".
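A minimal sketch of that pre-train-then-keep-training idea in PyTorch (the architecture, file name, and toy data are made up for illustration):

```python
import torch
import torch.nn as nn

# A small network: a feature extractor ("body") plus a task-specific head.
body = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 32), nn.ReLU())
head = nn.Linear(32, 2)
model = nn.Sequential(body, head)

# Pretend the body was already trained on a related task and saved earlier;
# loading those weights is the "prior knowledge" being reused.
# body.load_state_dict(torch.load("pretrained_body.pt"))

for p in body.parameters():        # freeze the pretrained part...
    p.requires_grad = False

optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)   # ...and fine-tune only the head
loss_fn = nn.CrossEntropyLoss()

x, y = torch.randn(64, 10), torch.randint(0, 2, (64,))     # toy data for the new task
optimizer.zero_grad()
loss = loss_fn(model(x), y)
loss.backward()
optimizer.step()
```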
If, OTOH, knowledge means something a little different to you, and you're thinking about the kind of knowledge that's encoded in a semantic network, or a conceptual graph, or something of that nature, then I don't know - offhand - of any direct way to integrate that into an ANN. What you might be able to do is combine the NN with a different kind of reasoner that reasons over the semantic network / conceptual graph, and then integrate the results. AFAIK, how best to do that is an unsolved research problem.
A simple example of this is token embeddings. If "prior knowledge" just means anything known prior to creation of the graph, then using pretrained vector embeddings meets this criterion. This is simply a way to provide a fixed method for projecting tokens into a continuous vector space, instead of training that projection at the same time as the rest of the model. Given that vector embeddings are somewhat interpretable and that the same embedding can be reused across tasks and models, I'd consider pretrained embeddings to be prior knowledge being incorporated.
The embeddings could also technically be handcrafted, but I'm not aware of any work like that and am skeptical of its usefulness in deep models.
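For concreteness, this is roughly how pretrained (and optionally frozen) embeddings get plugged into a PyTorch model; the random matrix below just stands in for vectors such as GloVe or word2vec loaded from disk:

```python
import torch
import torch.nn as nn

vocab_size, embed_dim = 10_000, 300

# Stand-in for a pretrained embedding matrix (e.g. GloVe/word2vec), loaded from disk in practice.
pretrained_vectors = torch.randn(vocab_size, embed_dim)

# freeze=True keeps the "prior knowledge" fixed while the rest of the model trains.
embedding = nn.Embedding.from_pretrained(pretrained_vectors, freeze=True)

token_ids = torch.tensor([[12, 305, 7]])   # a toy batch of token indices
vectors = embedding(token_ids)             # shape: (1, 3, 300)
print(vectors.shape)
```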