Consider the following machine translation scenario. The reference translation has N words (do not consider sentence beginner ‘hat’ and sentence finisher ‘dot’). The machine output also has N words. If the unigram precision is (N-1)/N, then the bigram precision is
(a) Also (N-1)/N
(b) At most (N-2)/(N-1)
(c) At least (N-2)/(N-1)
(d) At least (N-3)/(N-1)
Statement: If the unigram precision is (N-1)/N, then the bigram precision is at most (N-2)/(N-1).
Is there a counterexample, i.e. a candidate translation C and a reference translation R, that proves this statement wrong?
I have been unable to find one, so any help would be appreciated.
Also, are there any generalised results?
Professor's justification: The given unigram precision implies that only one word in the candidate translation C is different from the words in the reference translation R. Under this circumstance, the ‘best’ case of bigram match occurs when the unigram mismatch is at one of the boundaries and the rest of the words occur in exactly the same order as in R. Then only one of the N-1 bigrams in C does not match, giving a bigram precision of (N-2)/(N-1). The worst case of bigram mismatch occurs when the order of words in C is very different from that of R. The number of bigram matches can even be 0, as in
C: people laugh loudly
R: men loudly laugh
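The two cases in the justification can be checked numerically. Below is a minimal Python sketch, assuming BLEU-style clipped n-gram counts and simple whitespace tokenisation. The helper names `ngrams` and `ngram_precision` and the best-case sentence pair are illustrative assumptions, not part of the original question.

```python
from collections import Counter
from fractions import Fraction


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_precision(candidate, reference, n):
    """Clipped n-gram precision of candidate against a single reference.

    Assumption: each candidate n-gram is credited at most as many times
    as it occurs in the reference (BLEU-style clipping).
    """
    cand = Counter(ngrams(candidate.split(), n))
    ref = Counter(ngrams(reference.split(), n))
    total = sum(cand.values())
    if total == 0:
        raise ValueError("candidate has no n-grams of this order")
    matched = sum(min(count, ref[gram]) for gram, count in cand.items())
    return Fraction(matched, total)


# Worst case from the justification (N = 3): word order differs.
worst_c = "people laugh loudly"
worst_r = "men loudly laugh"
print(ngram_precision(worst_c, worst_r, 1))  # 2/3, i.e. (N-1)/N
print(ngram_precision(worst_c, worst_r, 2))  # 0

# Best case (illustrative pair): the mismatch is at the boundary
# and the remaining words keep the reference order.
best_c = "people laugh loudly"
best_r = "men laugh loudly"
print(ngram_precision(best_c, best_r, 1))  # 2/3, i.e. (N-1)/N
print(ngram_precision(best_c, best_r, 2))  # 1/2, i.e. (N-2)/(N-1)
```

Using `Fraction` keeps the results exact, so they can be compared directly with the expressions (N-1)/N and (N-2)/(N-1) for N = 3.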