B Bayes’ Theorem

Brain teaser: A heritable disease occurs randomly in 10% of the population. If someone has the disease, it is passed on to their children with probability 50%. A mother has 1 healthy child. Given this, what’s the conditional probability that the mother has the disease?

  • Is the answer 10%? Less? More? How do we quantify it?
  • Let \(M\) be the event that the mother has the disease.
  • Let \(C\) be the event that the child has the disease.
  • We want \(P(M\mid \textrm{not } C)\). We are given \(P(M)=0.1\) and \(P(C\mid M)=0.5\), so \(P(\textrm{not }C\mid M)=1-0.5=0.5\).

Solution:

\[P(M \mid \textrm{not } C) = \frac{P(\textrm{not } C \mid M)P(M)}{P(\textrm{not } C)}\]

So we still need \(P(\textrm{not } C)\). A healthy child can arise in two ways, with a mother who has the disease or with a mother who does not, so the law of total probability gives:

\[P(\textrm{not } C)=P(\textrm{not } C \mid M)P(M) + P(\textrm{not } C \mid \textrm{not } M)P(\textrm{not } M)\]

We know \(P(\textrm{not } M)=1-P(M)=0.9\). We also assume \(P( C \mid \textrm{not } M)=0.1\): if the mother is healthy, the child can still get the disease at random, at the population rate, like anyone else. Hence \(P(\textrm{not } C \mid \textrm{not } M)=1-P( C \mid \textrm{not } M)=0.9\). Substituting these values into the first equation gives:

\[P(M \mid \textrm{not } C) = \frac{0.5 \times 0.1}{0.5\times 0.1 + 0.9 \times 0.9}\]

Sanity check: this is about 5.8%, which is less than the unconditional 10%. That matches our intuition: a healthy child is (mild) evidence that the mother is healthy too.

  • The first step of the solution is Bayes’ Theorem, which follows from the definition of conditional probability.
  • We’ve seen above that, for events \(A\) and \(B\), \(P(A,B)=P(A\mid B)P(B)\).
  • We can also write this as \(P(A,B)=P(B\mid A)P(A)\).
  • Equating the two right-hand sides and dividing by \(P(B)\) (assuming \(P(B)>0\)) gives the famous Bayes’ Theorem:

\[P(A\mid B)=\frac{P(B\mid A)P(A)}{P(B)}\]

# P(M | not C): numeric value of the expression from the solution
(0.5*0.1)/(0.5*0.1+0.9*0.9)
## [1] 0.05813953
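
As a rough check on the brain-teaser answer, the setup can also be simulated. The sketch below is only illustrative: the variable names (`mother`, `child`, `p_sim`), the seed, and the sample size are arbitrary choices, not part of the original notes. It uses the same assumptions as the solution: mothers have the disease with probability 0.1, and a child has it with probability 0.5 if the mother does and 0.1 otherwise.

```r
# Monte Carlo check of P(M | not C); all names here are illustrative
set.seed(1)
n <- 1e6

mother  <- runif(n) < 0.1              # TRUE if the mother has the disease
p_child <- ifelse(mother, 0.5, 0.1)    # child's risk depends on the mother's status
child   <- runif(n) < p_child          # TRUE if the child has the disease

# Among families with a healthy child, the share of mothers with the disease
p_sim   <- mean(mother[!child])
p_exact <- (0.5 * 0.1) / (0.5 * 0.1 + 0.9 * 0.9)

c(simulated = p_sim, exact = p_exact)
```

With a sample this large, the simulated share should land close to the exact value of about 0.058.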

Follow-up question: What’s the conditional probability that the next child will have the disease?
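
One possible way to approach the follow-up, sketched under an assumption the notes do not state explicitly: given the mother’s status, the two children’s disease statuses are independent, with the same probabilities as before (0.5 if the mother has the disease, 0.1 otherwise). The names `p_m_post` and `p_next` are made up for this illustration.

```r
# Posterior probability that the mother has the disease, given one healthy child
p_m_post <- (0.5 * 0.1) / (0.5 * 0.1 + 0.9 * 0.9)

# Law of total probability for the next child, using the updated P(M).
# Assumes the children are conditionally independent given the mother's status.
p_next <- 0.5 * p_m_post + 0.1 * (1 - p_m_post)
p_next
```

Under these assumptions the result is about 12.3%: slightly above the 10% population rate, because the mother may still carry the disease, but below the 14% we would compute for a child before observing any healthy sibling (\(0.5\times 0.1 + 0.1\times 0.9\)).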

Suppose a drug test produces a positive result with probability 0.99 for drug users, \(P(T = 1\mid D = 1) = 0.99\). It also produces a negative result with probability \(0.99\) for non-drug users, \(P(T = 0\mid D = 0) = 0.99\). The probability that a random person uses the drug is \(0.001\), so \(P(D = 1) = 0.001\). What is the probability that a random person who tests positive is not actually a user, \(P(D = 0 \mid T = 1)\)?
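
The drug-test question has the same structure as the brain teaser, so the same two steps apply: the law of total probability for \(P(T=1)\), then Bayes’ Theorem. A minimal sketch follows; the names `sens`, `spec`, and `prev` are labels chosen here for the three given probabilities, not notation from the notes.

```r
sens <- 0.99    # P(T = 1 | D = 1)
spec <- 0.99    # P(T = 0 | D = 0)
prev <- 0.001   # P(D = 1)

# Law of total probability: P(T = 1)
p_pos <- sens * prev + (1 - spec) * (1 - prev)

# Bayes' Theorem: P(D = 0 | T = 1)
p_nonuser_given_pos <- (1 - spec) * (1 - prev) / p_pos
p_nonuser_given_pos
```

The result is roughly 0.91: because users are so rare, most positive tests come from the much larger group of non-users, even though the test is 99% accurate in both directions.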