B Bayes’ Theorem

Brain teaser: A heritable disease occurs randomly in 10% of the population. If someone has the disease, it is passed on to their children with probability 50%. A mother has 1 healthy child. Given this, what’s the conditional probability that the mother has the disease?

Is the answer 10%? Less? More? How do we quantify it?
Let \(M\) be the event that the mother has the disease.
Let \(C\) be the event that the child has the disease.
We want \(P(M\mid \textrm{not } C)\). We have \(P(M)=0.1\) and \(P(\textrm{not }C\mid M)=0.5\).

Solution:

\[P(M \mid \textrm{not } C) = \frac{P(\textrm{not } C \mid M)P(M)}{P(\textrm{not } C)}\]

So we still need \(P(\textrm{not } C)\). This could happen in 2 ways (“law of total probability”)

\[P(\textrm{not } C)=P(\textrm{not } C \mid M)P(M) + P(\textrm{not } C \mid \textrm{not } M)P(\textrm{not } M)\]

We know \(P(\textrm{not } M)=1-P(M)=0.9\). And we assume \(P( C \mid \textrm{not } M)=0.1\) because the child can randomly get the disease like anyone else, so then \(P(\textrm{not } C \mid \textrm{not } M)=1-P( C \mid \textrm{not } M)=0.9\). Finally, then, we’re left with:

\[P(M \mid \textrm{not } C) = \frac{0.5 \times 0.1}{0.5\times 0.1 + 0.9 \times 0.9}\]

Sanity check: this is less than 10%. That’s what our intuition told us.

We can get what we need using Bayes’ Theorem.
We’ve seen above that, for events \(A\) and \(B\), \(P(A,B)=P(A\mid B)P(B)\).
We can also write this as \(P(A,B)=P(B\mid A)P(A)\).
Since these are equal, we get the famous Bayes’ theorem:

\[P(A\mid B)=\frac{P(B\mid A)P(A)}{P(B)}\]

(0.5*0.1)/(0.5*0.1+0.9*0.9)

## [1] 0.05813953

Follow-up question: What’s the conditional probability that the next child will have the disease?

Suppose a drug test produces a positive result with probability 0.99 for drug users, \(P(T = 1\mid D = 1) = 0.99\). It also produces a negative result with probability \(0.99\) for non-drug users, \(P(T = 0\mid D = 0) = 0.99\). The probability that a random person uses the drug is \(0.001\), so \(P(D = 1) = 0.001\). What is the probability that a random person who tests positive is not actually a user, \(P(D = 0 \mid T = 1)\)?