In naive Bayes classification, we estimate the class of a document as follows:

$$\hat{c} = \arg \max_{c \in C} P(c \mid d) = \arg \max_{c \in C} \dfrac{ P(d \mid c)P(c) }{P(d)} $$

Page 4 of this textbook says that we can ignore the probability of the document, since it remains constant across classes:

We can conveniently simplify the above equation by dropping the denominator $P(d)$. This is possible because we will be computing $\dfrac{P(d \mid c)P(c)}{P(d)}$ for each possible class. But $P(d)$ doesn't change for each class; we are always asking about the most likely class for the same document $d$, which must have the same probability $P(d)$. Thus, we can choose the class that maximizes this simpler formula:

$$\hat{c} = \arg \max_{c \in C} P(c \mid d) = \arg \max_{c \in C} P(d \mid c)P(c) $$

Since the value of $P(d)$ does not influence the choice of the class, the naive Bayes algorithm does not compute it.

But I still want to know the value of $P(d)$. Is it $\dfrac{1}{N}$ if the total number of documents is $N$? How should I calculate $P(d)$?

nbro
hanugm

1 Answer

$P(d)$ (also known as the evidence) is the probability of your data (the observation) and is defined as follows:

$$ P(d) = \sum_i P(d|c_i)P(c_i) $$

where the sum runs over all classes $c_i$.
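As a short supporting derivation (standard probability, not quoted from the book): the classes are mutually exclusive and cover all cases, so the law of total probability gives

$$ P(d) = \sum_{c_i \in C} P(d, c_i) = \sum_{c_i \in C} P(d \mid c_i)\,P(c_i) $$

Dividing each joint term by this sum is what makes the posteriors add up to one:

$$ \sum_{c_i \in C} P(c_i \mid d) = \sum_{c_i \in C} \frac{P(d \mid c_i)\,P(c_i)}{P(d)} = 1 $$

So, under the model, $P(d)$ depends on the words in $d$ and on the estimated class parameters. In general it is not simply $\frac{1}{N}$.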

According to the book, $P(c)=\frac{N_c}{N_{doc}}$. $P(d \mid c)$ is the likelihood. Under the book's bag-of-words and conditional independence assumptions, it factorizes into per-word terms, $P(d \mid c) = \prod_i P(w_i \mid c)$, and each term is estimated with add-one smoothing as $P(w_i \mid c)=\frac{\text{count}(w_i, c)+1}{\sum_{w \in V}\text{count}(w, c) + |V|}$, where $V$ consists of the union of all the word types in all classes.

Taking the example from Section 4.3 of the book,

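| Dataset  | Cat | Documents                             |
|----------|-----|---------------------------------------|
| Training | -   | just plain boring                     |
| Training | -   | entirely predictable and lacks energy |
| Training | -   | no surprises and very few laughs      |
| Training | +   | very powerful                         |
| Training | +   | the most fun film of the summer       |
| Test     | ?   | predictable with no fun               |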

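we get the following, where $S$ is the test sentence and the word "with" is dropped because it does not occur in the training data:

$$
\begin{aligned}
P(-) &= \frac{3}{5} \\
P(+) &= \frac{2}{5} \\
P(S \mid -) &= \frac{2 \times 2 \times 1}{34^3} \\
P(S \mid +) &= \frac{1 \times 1 \times 2}{29^3} \\
P(d) &= P(S \mid -)P(-) + P(S \mid +)P(+) = 6.1 \times 10^{-5} + 3.2 \times 10^{-5} \approx 9.4 \times 10^{-5}
\end{aligned}
$$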
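To check the arithmetic, here is a minimal Python sketch that recomputes these quantities from the training table. The function names (`train`, `joint`, `evidence`) are my own and not from the book. The sketch assumes whitespace tokenization, add-one smoothing, and that test words absent from the vocabulary are skipped, and it uses exact fractions to avoid rounding.

```python
from collections import Counter
from fractions import Fraction

# Training data and test sentence from the example table.
TRAINING = [
    ("-", "just plain boring"),
    ("-", "entirely predictable and lacks energy"),
    ("-", "no surprises and very few laughs"),
    ("+", "very powerful"),
    ("+", "the most fun film of the summer"),
]
TEST_DOC = "predictable with no fun"


def train(docs):
    """Count documents per class, word tokens per class, and build the vocabulary."""
    class_docs = Counter()
    word_counts = {}
    for label, text in docs:
        class_docs[label] += 1
        word_counts.setdefault(label, Counter()).update(text.split())
    vocab = set()
    for counts in word_counts.values():
        vocab.update(counts)
    return class_docs, word_counts, vocab


def joint(doc, label, class_docs, word_counts, vocab):
    """Return P(d | c) * P(c) with add-one smoothing; unknown words are skipped."""
    p = Fraction(class_docs[label], sum(class_docs.values()))  # prior P(c)
    total_tokens = sum(word_counts[label].values())
    for word in doc.split():
        if word not in vocab:
            continue  # assumption: words unseen in training are dropped
        p *= Fraction(word_counts[label][word] + 1, total_tokens + len(vocab))
    return p


def evidence(doc, class_docs, word_counts, vocab):
    """Return P(d) as the sum of the joint terms over all classes."""
    return sum(joint(doc, c, class_docs, word_counts, vocab) for c in class_docs)


class_docs, word_counts, vocab = train(TRAINING)
p_d = evidence(TEST_DOC, class_docs, word_counts, vocab)
for c in class_docs:
    j = joint(TEST_DOC, c, class_docs, word_counts, vocab)
    print(c, "joint:", float(j), "posterior:", float(j / p_d))
print("P(d):", float(p_d))
```

Dividing each joint term by `p_d` gives the normalized posteriors, which is the only place the evidence matters. The predicted class is the same with or without that division.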

Aray Karjauv