Questions tagged [skip-gram]
4 questions
21 votes · 2 answers
What are the main differences between skip-gram and continuous bag of words?
Skip-gram and continuous bag of words (CBOW) are two different word2vec model architectures.
What are the main differences between them? What are the pros and cons of each method?
DRV · 1,843
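As a minimal sketch of the practical difference this question asks about, assuming the gensim library and a toy corpus: the same `Word2Vec` class trains either architecture, and the `sg` flag switches between them.

```python
# Minimal sketch using gensim (assumed installed); the toy corpus is illustrative only.
from gensim.models import Word2Vec

sentences = [["the", "quick", "brown", "fox"],
             ["the", "lazy", "dog", "sleeps"]]

# CBOW (sg=0): predicts the center word from its averaged context vectors.
cbow = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=0)

# Skip-gram (sg=1): predicts each context word from the center word.
skipgram = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1)

print(cbow.wv["fox"].shape, skipgram.wv["fox"].shape)  # (50,) (50,)
```

A common rule of thumb from the original word2vec documentation: CBOW trains faster, while skip-gram tends to represent infrequent words better.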
1 vote · 1 answer
Is my interpretation of the mathematics of the CBOW and Skip-Gram models correct?
I am a mathematics student who is learning NLP, so I have paid close attention to the mathematics used in the subject, but my interpretations may not always be right. Please correct me if any of them are incorrect or do not make…
Robert · 111
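For reference, the standard objectives against which such an interpretation is usually checked (following Mikolov et al., 2013): skip-gram maximizes the average log-probability of the context words given the center word, while CBOW predicts the center word from its context:

$$ \frac{1}{T}\sum_{t=1}^{T}\ \sum_{-c \le j \le c,\ j \neq 0} \log p(w_{t+j} \mid w_t) \quad \text{(skip-gram)} $$

$$ \frac{1}{T}\sum_{t=1}^{T} \log p(w_t \mid w_{t-c}, \dots, w_{t-1}, w_{t+1}, \dots, w_{t+c}) \quad \text{(CBOW)} $$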
0 votes · 0 answers
Skip-Gram Model description in word2vec explanation article
In his article "word2vec Parameter Learning Explained", Xin Rong says (page 7):
Each output is computed using the same hidden->output matrix:
$$ p(w_{c,j} = w_{O,c} \mid w_I) = y_{c,j} = \frac{\exp(u_{c,j})}{\sum_{j'=1}^{V}\exp(u_{j'})} \ \ \ \ (25) $$
Looking…
Damir Tenishev · 188
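A small numerical sketch of equation (25), with toy sizes and random values assumed: because every context position $c$ shares the same hidden->output matrix, the score vector $u$ (and hence the softmax denominator) is identical across positions, so the $C$ output "panels" are copies of one distribution.

```python
# Toy illustration (assumed values) of eq. (25) from Xin Rong's paper:
# with one shared hidden->output matrix W', the scores u_{c,j} = u_j are
# the same for every context position c, so one softmax serves all panels.
import numpy as np

V, N, C = 5, 3, 2                  # vocabulary size, hidden size, context positions
rng = np.random.default_rng(0)

h = rng.normal(size=N)             # hidden vector for the input word w_I
W_prime = rng.normal(size=(N, V))  # shared hidden->output matrix

u = h @ W_prime                    # u_j, identical for every position c
y = np.exp(u) / np.exp(u).sum()    # softmax: y_{c,j} = p(w_{c,j} = w_j | w_I)

assert np.allclose(y.sum(), 1.0)
print(np.tile(y, (C, 1)))          # C identical rows, one per output panel
```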
0 votes · 1 answer
Why does skip-gram use linear maps for embedding?
This question is inspired by Lilian Weng's blog post here about the skip-gram model, where she presents the model as multiplication by two matrices, $W$ and $W'$ (the embedding and word-context matrices), and $W$ maps a one-hot encoded input into a 'better'…
Rias Gremory · 113
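A hedged illustration of why this layer is linear, with toy sizes assumed: multiplying a one-hot vector by $W$ simply selects one row of $W$, so the "embedding layer" is a linear map that reduces to a table lookup.

```python
# Toy demonstration (assumed sizes): a one-hot input times the embedding
# matrix W selects row i of W, so the map is linear by construction.
import numpy as np

V, N = 6, 4                        # vocabulary size, embedding dimension
rng = np.random.default_rng(1)
W = rng.normal(size=(V, N))        # embedding matrix

i = 2                              # index of some word in the vocabulary
x = np.zeros(V)
x[i] = 1.0                         # one-hot encoding of word i

assert np.allclose(x @ W, W[i])    # matrix product == row lookup
```

This is also why implementations store $W$ as a lookup table rather than ever materializing the one-hot product.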