Narrowing down the question How does DeepSeek-R1 perform its "reasoning" part exactly?: how exactly does the generation of the <think> step work? What would an example look like using short, made-up numerical vectors (say, 4D vectors to keep things simple)?
The prompt I am using to see how it works is something simple, but not a simple math equation:
Why do some metals rust while others do not?
Every explanation I find describes DeepSeek-R1 in terms of:
- "reasoning"
- "thinking"
- "deducing"
- "understanding"
- etc.
I am trying to understand exactly what is meant by these terms in this context, so I would like to know exactly how the <think> tags are generated.
After asking ChatGPT for several days, the clearest picture I could glean was:
- Tokenize prompt into numerical vectors (I got the gist of how basic LLMs generate text from prompts, written up here).
- Generate initial "thought representation". Each token somehow attends to other tokens, somehow meaning the model calculates which words are important for reasoning. (I don't get this part.) ChatGPT says of this step: "The model learns relationships between words, before generating reasoning steps. Contrast words like 'while' and 'do not' help frame an explanation." I don't get how it figures out which words and phrases "frame an explanation", or what exactly is happening at this step at a practical, vector level.
- Generate step-by-step reasoning. Now that attention has structured the relationships, the model expands the input into multi-step reasoning (somehow?). Each token's vector is updated based on attention scores, resulting in contextualized reasoning vectors? What does that mean exactly? Then somehow, this new set of vectors is used to predict structured reasoning steps inside <think> tags.
- Expand the reasoning vectors. DeepSeek-R1 now predicts each step one by one by expanding the reasoning vectors (somehow?). This somehow involves "isolating core concepts" and "contrasting things", generating complete multi-step explanations.
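To make the steps above concrete, here is a minimal sketch of one attention pass followed by one next-token prediction, using made-up 4D vectors. Everything in it is an assumption for illustration: the embedding values, the toy vocabulary, the `unembed` matrix, and the use of identity query/key/value projections (a real model uses learned projection matrices, many heads, and many layers). It is not DeepSeek-R1's actual implementation; it only shows the general mechanism in which each token's vector becomes a weighted mix of the vectors before it, and the last mixed vector is scored against every vocabulary entry.

```python
import math

# Made-up 4D embeddings for a few prompt tokens (illustrative values only).
tokens = ["metals", "rust", "while", "others"]
embeddings = [
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    # Subtract the max for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Single-head scaled dot-product attention with a causal mask.

    Assumption: queries, keys, and values are the raw vectors themselves
    (identity projections), which a trained model would not use.
    """
    d = len(vectors[0])
    outputs, all_weights = [], []
    for i, query in enumerate(vectors):
        visible = vectors[: i + 1]  # a token only sees itself and earlier tokens
        scores = [dot(query, key) / math.sqrt(d) for key in visible]
        weights = softmax(scores)  # the "attention scores", summing to 1
        # The contextualized vector is the weighted average of visible vectors.
        mixed = [sum(w * v[j] for w, v in zip(weights, visible)) for j in range(d)]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights


contextual, attn = self_attention(embeddings)

# Hypothetical toy vocabulary and output matrix (one made-up row per token).
vocab = ["<think>", "Rust", "is", "</think>"]
unembed = [
    [0.9, 0.1, 0.8, 0.7],
    [0.1, 0.9, 0.0, 0.2],
    [0.3, 0.3, 0.3, 0.3],
    [0.0, 0.2, 0.1, 0.0],
]

# Score every vocabulary entry against the last position's contextual vector.
logits = [dot(row, contextual[-1]) for row in unembed]
probs = softmax(logits)
next_token = vocab[probs.index(max(probs))]  # greedy pick

for tok, w in zip(tokens, attn):
    print(tok, [round(x, 3) for x in w])
print("next token:", next_token)
```

With these rigged numbers the greedy pick is `<think>`, which illustrates that the tag is an ordinary vocabulary token chosen by the same next-token step as any other word. In a trained model that preference would come from the learned weights, and the loop would repeat: append the chosen token, rerun attention, and pick the next one until `</think>` and then the answer are produced.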
None of the information in the above steps is very useful or practical; it is still too vague. From it, I cannot picture the flow of numerical vectors, and I couldn't explain what they mean by "reasoning" exactly.
Can you explain in some detail, with a basic pseudo-code/pseudo-data example using my prompt or something similar, how the "reasoning" might work to generate the <think> tags?