
From my understanding, GPT-3 is trained to predict the next token in a sequence of tokens. Given this, how is it able to take commands? For instance, with the example input below, wouldn't the statistically most likely prediction be to insert a period and end the sentence?

Input: write me a beautiful sentence

Output: I cannot put into words how much I love you, so I'll just say it's infinite.
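For reference, here is roughly how I picture next-token prediction working — a minimal sketch using GPT-2 via the Hugging Face transformers library as a stand-in, since GPT-3's weights aren't publicly available:

```python
# A minimal sketch of next-token prediction, with GPT-2 standing in for GPT-3.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "write me a beautiful sentence"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

with torch.no_grad():
    logits = model(input_ids).logits  # shape: (1, seq_len, vocab_size)

# The distribution over the *next* token comes from the last position.
probs = torch.softmax(logits[0, -1], dim=-1)
top_probs, top_ids = probs.topk(5)
for p, token_id in zip(top_probs, top_ids):
    print(f"{tokenizer.decode([token_id.item()])!r}  p={p.item():.3f}")
```

My question is essentially about what that top-5 list would look like: why isn't a period the highest-probability continuation here?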

2 Answers


No. The output is still a statistical prediction, but it is a very precise one, determined by the model's learned statistics over the entire prompt rather than by simple sentence-ending frequencies.

GPT-3 is able to take commands because its training data contains many examples of commands followed by responses, so it learns to recognize the token patterns that indicate a command. If the context were an ordinary declarative sentence, inserting a period might indeed be the most likely continuation; but when the context looks like an instruction, the statistically most likely continuation is a response to that instruction rather than a full stop.
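As a rough illustration (GPT-2 again stands in for GPT-3, and the "Instruction:/Response:" prompt format below is purely illustrative, not anything GPT-3 was specifically trained on), a command-shaped context makes "a response" the likely continuation:

```python
# Sketch: when the context is shaped like command/response pairs,
# the likely continuation after "Response:" is a response, not a period.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = (
    "Instruction: say hello in French\n"
    "Response: Bonjour!\n\n"
    "Instruction: write me a beautiful sentence\n"
    "Response:"
)
inputs = tokenizer(prompt, return_tensors="pt")
output_ids = model.generate(
    **inputs,
    max_new_tokens=30,
    do_sample=True,
    top_p=0.9,
    pad_token_id=tokenizer.eos_token_id,  # gpt2 defines no pad token
)
# Print only the newly generated tokens.
print(tokenizer.decode(output_ids[0][inputs.input_ids.shape[1]:]))
```

The same next-token machinery is at work; only the conditioning context has changed.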

Faizy

The dataset has been curated in such a way that the model believes the next most likely set of tokens is an answer.

I.e., during training it has repeatedly seen that, after a command surrounded by special tokens, the next tokens are responses to that command.
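For example, the curated examples might be laid out something like this (the delimiter tokens below are hypothetical; OpenAI has not published GPT-3's actual formatting or data pipeline):

```python
# Hypothetical sketch of how command/response pairs might be formatted for
# training. The <|command|> / <|response|> delimiters are made up; the point
# is only that the tokens after the response delimiter are always an answer.
pairs = [
    ("write me a beautiful sentence",
     "I cannot put into words how much I love you, so I'll just say it's infinite."),
    ("translate 'cat' into German", "Katze"),
]

def to_training_example(command: str, response: str) -> str:
    return f"<|command|> {command} <|response|> {response} <|endoftext|>"

for cmd, resp in pairs:
    print(to_training_example(cmd, resp))
```

After enough examples in this shape, continuing a command with an answer is simply the statistically most likely thing to do.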