I know that large language models like GPT-3 are trained simply to continue pieces of text that have been scraped from the web. But how was ChatGPT trained, which, while also having a good understanding of language, is not directly a language model but a chatbot? Do we know anything about that? I presume that a lot of conversations were needed to train it. Did they simply scrape those conversations from the web, and if so, where did they find them?
The key ingredient is Reinforcement Learning from Human Feedback (RLHF): humans rate or rank the model's answers, and that feedback is used to guide further training. The training conversations were not scraped from the web; according to OpenAI, human AI trainers wrote demonstration dialogues (playing both the user and the assistant) for an initial supervised fine-tuning step, and then ranked sampled model outputs to train a reward model.
The official blog post explains this fairly well.
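To make the reward-modelling step concrete, here is a minimal sketch in PyTorch. It is not OpenAI's actual code: random feature vectors stand in for a real language model's representations of (prompt, response) pairs, and all dimensions and hyperparameters are illustrative assumptions. It trains a small reward model on human preference pairs with the pairwise (Bradley-Terry) loss commonly used in RLHF.

```python
import torch
import torch.nn as nn

# Toy stand-in: each (prompt, response) pair is a fixed-size feature vector.
# In real RLHF these features would come from a pretrained language model.
FEAT_DIM = 64
N_PAIRS = 512

torch.manual_seed(0)
chosen = torch.randn(N_PAIRS, FEAT_DIM)    # features of human-preferred responses
rejected = torch.randn(N_PAIRS, FEAT_DIM)  # features of dispreferred responses

# Reward model: maps a (prompt, response) feature vector to a scalar score.
reward_model = nn.Sequential(nn.Linear(FEAT_DIM, 32), nn.ReLU(), nn.Linear(32, 1))
optimizer = torch.optim.Adam(reward_model.parameters(), lr=1e-3)

for step in range(200):
    r_chosen = reward_model(chosen).squeeze(-1)
    r_rejected = reward_model(rejected).squeeze(-1)
    # Pairwise Bradley-Terry loss: push the preferred response's reward
    # above the rejected response's reward.
    loss = -torch.nn.functional.logsigmoid(r_chosen - r_rejected).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# The trained reward model can then score new responses; in full RLHF that
# scalar reward drives a policy-gradient update (e.g. PPO) of the chatbot.
```

In the full pipeline, the reward model's scores replace human raters during the reinforcement-learning phase, which is what makes it feasible to optimize the chatbot over far more conversations than humans could label directly.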
