Highest Voted 'cosine-similarity' Questions - Artificial Intelligence Stack Exchange

2

votes

1 answer

Which metric should I use in general for semantic similarity in text embeddings?

I know this is a trivial question, but I’m really confused about which metric to choose—whether it depends on the model itself, or if there is a universally agreed-upon metric for computing semantic similarity. Suppose I have a text-to-embedding…
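One point that often helps with this kind of question: when embeddings are scaled to unit length, cosine similarity, dot product, and squared Euclidean distance all produce the same ranking, because ||u - v||^2 = 2 - 2·cos(u, v). The sketch below checks that identity in plain Python. The helper names (`normalize`, `dot`, `squared_euclidean`) and the toy vectors are illustrative assumptions, not taken from the question, and whether a given model's embeddings are already normalized depends on the model.

```python
import math

def normalize(v):
    # Scale a non-zero vector to unit length (a zero vector would divide by zero).
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def dot(u, v):
    # Dot product; equals cosine similarity when both inputs are unit vectors.
    return sum(a * b for a, b in zip(u, v))

def squared_euclidean(u, v):
    # Squared Euclidean distance between two vectors of equal length.
    return sum((a - b) ** 2 for a, b in zip(u, v))

# Toy vectors standing in for real embeddings (hypothetical values).
u = normalize([3.0, 4.0])
v = normalize([1.0, 2.0])

cos_uv = dot(u, v)
dist_sq = squared_euclidean(u, v)

# For unit vectors: ||u - v||^2 == 2 - 2 * cos(u, v)
identity_holds = math.isclose(dist_sq, 2.0 - 2.0 * cos_uv)
```

If the embeddings are not normalized, the three measures can disagree, so the choice then matters more.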

large-language-models word-embedding cosine-similarity

asked Feb 27 '25 at 03:25

Muhammad Ikhwan Perwira

800
3
10

1

vote

1 answer

JD-CV Matching: Cosine Similarity Not Performing Well

I'm working on a JD-CV matching system using Sentence Transformers (all-MiniLM-L6-v2) for embedding generation. I'm currently calculating cosine similarity between JD and CV embeddings, but the results are not very accurate.

transformer embeddings template-matching cosine-similarity

asked Feb 14 '25 at 11:38

GODGAMER

11
1

1

vote

1 answer

Cheap differentiable similarity metrics of vectors

I am looking to compute the similarity between a large set of vectors during neural network training - a process that is considerably expensive when choosing the wrong metric. So far, I am making use of cosine similarity, but I found that the…

optimization cosine-similarity

asked May 22 '23 at 11:43

postnubilaphoebus

356
2
13

1

vote

2 answers

Given embedding vector A and vector B, how to find top k embedding vectors such that they are similar to vector A and dissimilar to vector B

Which would be a better approach for getting the top k embedding vectors that are similar to embedding vector A and dissimilar to vector B? Approach 1: calculate f(V) = cosine_similarity(A,V) - cosine_similarity(B,V) for each vector V sort…
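Approach 1 can be sketched roughly as follows. This is a minimal plain-Python illustration, not the asker's code: the function names `cosine_similarity` and `top_k_contrastive` are hypothetical, and because the excerpt is cut off, the final step (sort by f(V) in descending order and keep the first k) is an assumption based on the question title.

```python
import math

def cosine_similarity(u, v):
    # Cosine of the angle between u and v; returns 0.0 if either vector is all zeros.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def top_k_contrastive(a, b, candidates, k):
    # Score each candidate V with f(V) = cos(A, V) - cos(B, V),
    # then return the indices of the k highest-scoring candidates.
    scored = [
        (cosine_similarity(a, v) - cosine_similarity(b, v), index)
        for index, v in enumerate(candidates)
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [index for _, index in scored[:k]]
```

For example, with A = [1, 0] and B = [0, 1], a candidate such as [1, -1] scores higher than [1, 0] itself, since it points away from B as well as toward A.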

machine-learning word-embedding word2vec cosine-similarity

asked Jan 06 '22 at 13:04

Shubham

11
3

0

votes

2 answers

How do I choose a good threshold for classification (using cosine similarity scores)?

I am using OpenAI's text-embedding-ada-002 embeddings model to do a semantic search on a database of articles to find articles that are most related to a given input text. I am looking for a way to define a minimum similarity score to prevent…
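The idea of a minimum similarity score can be sketched as a simple filter over precomputed embeddings. This is only an illustration in plain Python: the function name `filter_by_min_similarity`, the dictionary layout, and any particular cutoff value are assumptions, not part of the question. A suitable cutoff would normally be chosen by inspecting scores on examples known to be relevant or irrelevant.

```python
import math

def cosine_similarity(u, v):
    # Cosine of the angle between u and v; returns 0.0 if either vector is all zeros.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def filter_by_min_similarity(query, articles, min_score):
    # articles: dict mapping an article id to its embedding (hypothetical layout).
    # Keep only articles scoring at least min_score, best match first.
    scored = [
        (article_id, cosine_similarity(query, embedding))
        for article_id, embedding in articles.items()
    ]
    kept = [pair for pair in scored if pair[1] >= min_score]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept
```

An empty result then signals that nothing in the database is related enough to the input text, rather than returning the least-bad match.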

natural-language-processing classification embeddings cosine-similarity

asked May 26 '23 at 10:44

Stefan

1
1

0

votes

2 answers

How to reduce the number of clusters produced by the Markov Clustering Algorithm?

I have used the Markov Clustering Algorithm (MCL) to cluster tweets based on their similarity. However, I got too many clusters, and most of them contain only one tweet. Any suggestions to reduce the number of clusters?

natural-language-processing hyper-parameters clustering cosine-similarity

asked Sep 22 '21 at 07:05

Adnan Hussein

23
3
