Weak supervision is supervised learning with uncertain labels, e.g. because the data was labeled automatically or by non-experts [1].

Distant supervision [2, 3] is a type of weak supervision that uses an auxiliary automatic mechanism to produce weak labels / reference output (in contrast to non-expert human labelers).

According to this answer:

Self-supervised learning (or self-supervision) is a supervised learning technique where the training data is automatically labelled.

In the examples of self-supervised learning I have seen so far, the labels were extracted from the input data.

What is the difference between distant supervision and self-supervision?


(Figure: setup mentioned in discussion.)

Make42

1 Answer

The main difference between distant supervision (as described in the link you provided) and self-supervision lies in the task the network is trained on.

Distant supervision focuses on generating weak labels for the very same task that would otherwise be tackled with manually annotated labels, so the trained model can be used directly for that task.
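To make this concrete, here is a minimal sketch in Python of how weak labels could be produced for relation extraction. Everything in it is made up for illustration: the toy knowledge base, the function name `distant_labels`, and the sentences. The heuristic (label a sentence with a relation whenever both entities of a known pair appear in it) is one common way of doing this, not the only one.

```python
# Hypothetical toy knowledge base: (head entity, tail entity) -> relation.
KNOWLEDGE_BASE = {
    ("Paris", "France"): "capital_of",
    ("Berlin", "Germany"): "capital_of",
}


def distant_labels(sentences, kb=KNOWLEDGE_BASE, default="no_relation"):
    """Attach a weak relation label to each sentence using an external KB."""
    labeled = []
    for sentence in sentences:
        label = default
        for (head, tail), relation in kb.items():
            # Heuristic: if both entities occur in the sentence, assume the
            # sentence expresses the relation. This is where label noise enters.
            if head in sentence and tail in sentence:
                label = relation
                break
        labeled.append((sentence, label))
    return labeled


sentences = [
    "Paris is the capital of France.",
    "Paris hosted a show attended by guests from all over France.",  # noisy match
    "The weather is nice today.",
]
weak_dataset = distant_labels(sentences)
```

Note that the labels come from a source outside the input (the knowledge base) and already belong to the target task, so `weak_dataset` could be fed straight into an ordinary supervised relation classifier. The second sentence shows why the labels are only weak: it mentions both entities without stating the relation.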

Self-supervision is a means of learning a data representation. It does so by training on a surrogate task whose inputs and labels are both derived exclusively from the original input data.
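As a rough sketch of what "labels derived from the input" can look like, the snippet below builds training pairs for a masked-word prediction task from raw text alone. The function name `masked_token_examples` and the `[MASK]` placeholder are assumptions chosen for illustration, and whitespace tokenization is a simplification.

```python
def masked_token_examples(sentence, mask="[MASK]"):
    """Build (masked input, target word) pairs from a single raw sentence."""
    tokens = sentence.split()  # naive whitespace tokenization
    examples = []
    for i, target in enumerate(tokens):
        # Hide one word; the hidden word itself becomes the label.
        masked = tokens[:i] + [mask] + tokens[i + 1:]
        examples.append((" ".join(masked), target))
    return examples


pairs = masked_token_examples("the cat sat")
```

No external labeling mechanism is involved here: both the input and the label of every pair come from the same unlabeled sentence. Predicting the hidden word is usually not the task one ultimately cares about; it only serves to train a representation that is later reused.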

I can imagine cases in which an implementation of self-supervision could be considered distant supervision (if the surrogate task happens to coincide with the target task). On the other hand, if external data sources were employed for training on a surrogate task, that would be a case of representation learning (which could incorporate self-supervision too).

David