For theoretical questions related to the DeepSeek models (R1, V3, etc.).
Questions tagged [deepseek]
7 questions
3
votes
2 answers
Which model is more accurate, ChatGPT or DeepSeek?
I have been using ChatGPT to clear up my confusion about different courses in my semester. Recently, another LLM, DeepSeek-R1, was launched.
My question is: which model has higher accuracy, ChatGPT or DeepSeek?
Also there are…
Monte_carlo
- 41
- 6
2
votes
1 answer
How exactly are the steps generated in DeepSeek-R1?
As a narrowing-in on the question "How does DeepSeek-R1 perform its "reasoning" part exactly?", how exactly does the step generation work? What is an example using short, made-up numerical vectors (like 4D vectors, to keep things simple)?
A…
Lance Pollard
- 87
- 7
1
vote
1 answer
Token compression in native sparse attention
I have a question about the token compression in the native sparse attention in https://arxiv.org/pdf/2502.11089.
When we compute the attention of $q_t$ and $K^{\sim cmp}_t$, is $K^{\sim cmp}_t$ one of the $\varphi$ in formula (7) or the vector…
HIH
- 121
- 2
1
vote
2 answers
How can I effectively weed out hallucinations as a user?
I'm interested in effectively using DeepSeek for research. If I ask DeepSeek these questions,
Has anyone in the USA ever had their permanent residency revoked for what they said?
Can you give examples of people who have their permanent residency…
Evan Carroll
- 111
- 2
1
vote
1 answer
How does DeepSeek-R1 perform its "reasoning" part exactly?
I wrote up my understanding of how LLMs generate text responses to text prompts (at a somewhat practical yet high level), focusing on example numerical vectors and how they are transformed at each step.
How can I understand the "reasoning" portion…
Lance Pollard
- 87
- 7
1
vote
1 answer
Why is DeepSeek's inference so slow?
Why is DeepSeek's inference on HuggingFace so slow (compared to Qwen, Llama, etc.)?
Geremia
- 555
- 1
- 5
- 12
0
votes
1 answer
DeepSeek's GitHub is available; is every secret of DeepSeek open?
I am not an expert in the field. There seems to be a lot of discussion about the definition of open-source AI, and I wonder where DeepSeek stands in those terms. Experts say that it is open weight. I've seen their GitHub, and they have an inference folder with Python…
math boy
- 111
- 3