If the unigram precision is (N-1)/N, then the bigram precision is:

Question

Consider the following machine translation scenario. The reference translation has N words (do not consider sentence beginner ‘hat’ and sentence finisher ‘dot’). The machine output also has N words. If the unigram precision is (N-1)/N, then the bigram precision is

(a) Also (N-1)/N
(b) At most (N-2)/(N-1)
(c) At least (N-2)/(N-1)
(d) At least (N-3)/(N-1)

Statement: If the unigram precision is (N-1)/N, then the bigram precision is at most (N-2)/(N-1).

Is there a counterexample, with a candidate translation C and a reference translation R, that proves the above statement wrong?

I have not been able to find one; any help would be appreciated.

Also, are there any generalised results?

Prof Justification: The given unigram precision implies that only one word in the candidate translation C is different from the words in the reference translation R. Under this circumstance the ‘best’ case of bigram match occurs when the unigram mismatch is at one of the boundaries and the rest of the words occur in exactly the same order as in R. Then only one of the N-1 bigrams does not match, giving a bigram precision of (N-2)/(N-1). (A mismatch in the interior of C would break two bigrams instead of one.) The worst case of bigram mismatch occurs when the order of words in C is very different from that of R. The number of bigram matches can even be 0, as in

C: people laugh loudly
R: men loudly laugh
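One way to look for a counterexample is an exhaustive search over short sentences. The sketch below is only an illustration and not part of the original question. It assumes the simple, unclipped notion of precision (an n-gram in C counts as matched if it occurs anywhere in R); BLEU-style clipped counts are not modeled and may behave differently. The helper names `ngrams`, `match_counts` and `find_counterexamples`, and the toy vocabulary "abcd", are hypothetical.

```python
from itertools import product


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) of a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def match_counts(candidate, reference, n):
    """Return (matched, total) n-gram counts for the candidate.

    Assumption: unclipped matching, i.e. a candidate n-gram is matched
    if it appears anywhere in the reference.
    """
    cand = ngrams(candidate, n)
    ref = set(ngrams(reference, n))
    return sum(1 for g in cand if g in ref), len(cand)


def find_counterexamples(N, vocab):
    """List (C, R) pairs of length N with unigram precision (N-1)/N
    whose bigram precision exceeds (N-2)/(N-1)."""
    found = []
    for ref in product(vocab, repeat=N):
        for cand in product(vocab, repeat=N):
            uni_hits, _ = match_counts(cand, ref, 1)
            if uni_hits != N - 1:
                continue  # unigram precision is not (N-1)/N
            bi_hits, _ = match_counts(cand, ref, 2)
            if bi_hits > N - 2:  # would beat (N-2)/(N-1)
                found.append((cand, ref))
    return found


for N in (2, 3, 4):
    print(N, len(find_counterexamples(N, "abcd")))
```

Under these assumptions the search finds no counterexample for N = 2, 3, 4, which is consistent with the "at most (N-2)/(N-1)" bound. It is a sanity check on small cases, not a proof.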

score 1 · Answer 1 · answered Nov 06 '23 at 08:52

Unigram and bigram precision are measures used in machine translation to assess the quality of a machine-generated (candidate) translation against a reference translation.

Unigram precision is the number of words in the candidate translation that exactly match words in the reference translation, divided by the total number of words in the candidate translation.
Bigram precision is the number of consecutive word pairs (bigrams) in the candidate translation that exactly match bigrams in the reference translation, divided by the total number of bigrams in the candidate translation.
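The two definitions above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference BLEU implementation: sentences are split on whitespace, matching is case-sensitive and unclipped (a candidate n-gram counts if it occurs anywhere in the reference), and the names `ngrams` and `ngram_precision` are hypothetical. The test sentences are the ones from the professor's justification in the question.

```python
from fractions import Fraction


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_precision(candidate, reference, n):
    """Fraction of candidate n-grams that also occur in the reference.

    Assumption: whitespace tokenisation, exact (case-sensitive) match,
    and no BLEU-style clipping of repeated n-grams.
    """
    cand = ngrams(candidate.split(), n)
    ref = set(ngrams(reference.split(), n))
    matches = sum(1 for g in cand if g in ref)
    return Fraction(matches, len(cand))


C = "people laugh loudly"
R = "men loudly laugh"
print(ngram_precision(C, R, 1))  # 2/3, i.e. (N-1)/N with N = 3
print(ngram_precision(C, R, 2))  # 0, no bigram of C occurs in R
```

Using `Fraction` keeps the results exact, so they can be compared directly with expressions such as (N-1)/N and (N-2)/(N-1).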

Now, to understand the statement, suppose we have a candidate translation C with N words and a reference translation R. If the unigram precision is (N-1)/N, then all but one of the words in C match words in R.

However, (N-2)/(N-1) is only an upper bound. The bigram precision can be much lower, because bigrams also depend on word order. For example, consider the following case:

C = "La chatte et sur le tapis"
R = "Le tapis est sous la chatte"

Here, ignoring case, four of the six words in C (la, chatte, le, tapis) also appear in R, while "et" and "sur" do not, so the unigram precision is 4/6. Only two of the five bigrams in C ("la chatte" and "le tapis") appear in R, so the bigram precision is 2/5. This shows how reordering lowers the bigram precision, but it is not a counterexample to the statement: a value below the bound is consistent with "at most (N-2)/(N-1)", and in this pair the unigram precision is not (N-1)/N in the first place.
