
From my understanding, GPT-3 is trained to predict the next token in a sequence of tokens. Given this, how is it able to take commands? For instance, with the example input below, wouldn't the statistically most likely prediction be to insert a period and end the sentence?

Input: write me a beautiful sentence

Output: I cannot put into words how much I love you, so I'll just say it's infinite.
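For reference, here is roughly how I picture next-token prediction working — a minimal sketch using GPT-2 via the Hugging Face transformers library as a stand-in, since GPT-3's weights aren't publicly available:

```python
# A minimal sketch of next-token prediction, with GPT-2 standing in for GPT-3.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "write me a beautiful sentence"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

with torch.no_grad():
    logits = model(input_ids).logits  # shape: (1, seq_len, vocab_size)

# The distribution over the *next* token comes from the last position.
probs = torch.softmax(logits[0, -1], dim=-1)
top_probs, top_ids = probs.topk(5)
for p, token_id in zip(top_probs, top_ids):
    print(f"{tokenizer.decode([token_id.item()])!r}  p={p.item():.3f}")
```

My question is essentially about what that top-5 list would look like: why isn't a period the highest-probability continuation here?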

2 Answers


No. The output is still a statistical prediction, but it is a very precise one, determined by the model's learned statistics over the entire prompt rather than by simple sentence-ending frequencies.

GPT-3 is able to take commands because its training data contains many examples of commands followed by responses, so it learns to recognize the token patterns that indicate a command. If the context were an ordinary declarative sentence, inserting a period might indeed be the most likely continuation; but when the context looks like an instruction, the statistically most likely continuation is a response to that instruction rather than a full stop.
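As a rough illustration (GPT-2 again stands in for GPT-3, and the "Instruction:/Response:" prompt format below is purely illustrative, not anything GPT-3 was specifically trained on), a command-shaped context makes "a response" the likely continuation:

```python
# Sketch: when the context is shaped like command/response pairs,
# the likely continuation after "Response:" is a response, not a period.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = (
    "Instruction: say hello in French\n"
    "Response: Bonjour!\n\n"
    "Instruction: write me a beautiful sentence\n"
    "Response:"
)
inputs = tokenizer(prompt, return_tensors="pt")
output_ids = model.generate(
    **inputs,
    max_new_tokens=30,
    do_sample=True,
    top_p=0.9,
    pad_token_id=tokenizer.eos_token_id,  # gpt2 defines no pad token
)
# Print only the newly generated tokens.
print(tokenizer.decode(output_ids[0][inputs.input_ids.shape[1]:]))
```

The same next-token machinery is at work; only the conditioning context has changed.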

Faizy

The dataset has been curated in such a way that the model believes the next most likely set of tokens is an answer.

I.e., during training it has repeatedly seen that, after a command surrounded by special tokens, the next tokens are responses to that command.
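For example, the curated examples might be laid out something like this (the delimiter tokens below are hypothetical; OpenAI has not published GPT-3's actual formatting or data pipeline):

```python
# Hypothetical sketch of how command/response pairs might be formatted for
# training. The <|command|> / <|response|> delimiters are made up; the point
# is only that the tokens after the response delimiter are always an answer.
pairs = [
    ("write me a beautiful sentence",
     "I cannot put into words how much I love you, so I'll just say it's infinite."),
    ("translate 'cat' into German", "Katze"),
]

def to_training_example(command: str, response: str) -> str:
    return f"<|command|> {command} <|response|> {response} <|endoftext|>"

for cmd, resp in pairs:
    print(to_training_example(cmd, resp))
```

After enough examples in this shape, continuing a command with an answer is simply the statistically most likely thing to do.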