One way to design a spam filter is to look at the words in an email. In particular, some words are more frequent in spam emails. Suppose that we have the following information: 50% of emails are spam; 1% of spam emails contain the word "refinance"; 0.01% of non-spam emails contain the word "refinance". Suppose that an email is checked and found to contain the word "refinance". What is the probability that the email is spam?

Question

Answer:

MechEngineer · Answer 1 · 2020-02-13T08:57:51+01:00

Answer:

0.99

Step-by-step explanation:

Using Bayes' theorem, let A be the event that the email is spam and B the event that the email contains the word "refinance".

A|B is the event that the email is spam, knowing that it contains the word "refinance". Its probability, P(A|B), is what we are looking for.

B|A is the event that the email contains the word "refinance", given that it is spam: P(B|A) = 0.01

P(A) is the probability that the email is spam: P(A) = 0.5

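P(B) is the probability that the email contains the word "refinance". An email is either spam or non-spam, so by the law of total probability P(B) = P(B|A)P(A) + P(B|not A)P(not A) = 0.01*0.5 + 0.0001*0.5 = 0.00505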

Bayes' formula:

[tex]P(A|B) = \frac{P(B|A)P(A)}{P(B)} = \frac{0.01 \times 0.5}{0.00505} \approx 0.99[/tex]

So the probability that the email is spam, given that it contains the word "refinance", is roughly 0.99
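
If you want to double-check the arithmetic, here is a minimal Python sketch of the same calculation. The function name posterior_spam and its parameter names are made up for illustration (they are not part of the question), and exact fractions are used only to avoid floating-point rounding.

```python
from fractions import Fraction


def posterior_spam(prior_spam, p_word_given_spam, p_word_given_ham):
    """Return P(spam | word) using Bayes' theorem.

    Hypothetical helper for illustration; "ham" means non-spam.
    """
    # Law of total probability: P(B) = P(B|A)P(A) + P(B|not A)P(not A)
    p_word = (p_word_given_spam * prior_spam
              + p_word_given_ham * (1 - prior_spam))
    # Bayes' theorem: P(A|B) = P(B|A)P(A) / P(B)
    return p_word_given_spam * prior_spam / p_word


# Values from the question: 50% spam, 1% of spam and 0.01% of non-spam
# emails contain the word "refinance".
p = posterior_spam(Fraction(1, 2), Fraction(1, 100), Fraction(1, 10000))
print(p)                  # exact value: 100/101
print(round(float(p), 2)) # rounded to two decimals: 0.99
```

The exact result is 100/101, which rounds to the 0.99 given in the answer.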