I came across a Google Research blog post (https://research.google/blog/larger-language-models-do-in-context-learning-differently/) discussing large language models (LLMs) and how their prior knowledge can be overridden through in-context learning. They call this flipped-label in-context learning: the in-context examples carry labels that contradict prior knowledge (for example, sentences with positive sentiment labeled as "negative sentiment").
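To make sure I'm describing the setup correctly, here's a minimal sketch of the kind of flipped-label prompt I have in mind (the example sentences, label names, and query are made up for illustration, not taken from the blog post):

```python
# Few-shot examples where the sentiment labels are deliberately flipped.
flipped_examples = [
    ("This movie was an absolute delight from start to finish.", "negative"),  # actually positive
    ("The food was cold and the service was terrible.", "positive"),           # actually negative
    ("I loved every minute of the concert.", "negative"),                      # actually positive
]

query = "The hotel room was spotless and the staff were wonderful."

# Assemble the prompt the way a standard few-shot classification prompt would be built.
prompt = "\n".join(f"Review: {text}\nSentiment: {label}" for text, label in flipped_examples)
prompt += f"\nReview: {query}\nSentiment:"

print(prompt)
# As I understand the blog's finding, larger models tend to follow the flipped labels
# and answer "negative" here, while smaller models stick with their prior ("positive").
```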
I'm curious: how is the model able to "learn" (override its priors) at all without changing its weights? Specifically, how does it adjust its understanding when faced with contradictory labels, like positive sentences labeled as negative?
