Conditional Probability and Chain Rule

We can also define the probability of an event conditional upon "evidence" of another event. Let's consider the following Venn diagram to illustrate this concept:

Formally: let $A$ and $B$ be events with $P(B)>0$. The conditional probability $P(A \mid B)$ (the probability of $A$ given $B$) is defined as:

P(A \mid B)=\frac{P(A, B)}{P(B)}

This represents the probability that $A$ will occur given that $B$ has occurred.

Conditioning on $B$ means that we are restricting the sample space to the outcomes contained in $B$. Within this restricted space, exactly one of $(A, B)$ or $(\bar{A}, B)$ will occur.
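As a minimal sketch of this definition, the snippet below computes a conditional probability by counting outcomes. The setting (one roll of a fair six-sided die), the events `A` and `B`, and the helper names `prob` and `cond_prob` are all assumptions introduced purely for illustration.

```python
from fractions import Fraction

# Hypothetical sample space: one roll of a fair six-sided die.
omega = {1, 2, 3, 4, 5, 6}


def prob(event):
    """P(event) for a subset of omega, assuming equally likely outcomes."""
    return Fraction(len(event & omega), len(omega))


def cond_prob(a, b):
    """P(A | B) = P(A, B) / P(B); only defined when P(B) > 0."""
    if prob(b) == 0:
        raise ValueError("P(B) must be positive")
    return prob(a & b) / prob(b)


A = {2, 4, 6}  # the roll is even
B = {4, 5, 6}  # the roll is greater than 3

p_a_given_b = cond_prob(A, B)
print(p_a_given_b)  # 2/3
```

Counting directly inside the restricted space $B$ gives the same answer: two of the three outcomes in $\{4, 5, 6\}$ are even.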

Rearranging this definition gives $P(A, B)=P(A \mid B) P(B)$, which can be generalised to the Chain Rule of probability theory. For three events:

P(A, B, C)=P(C \mid A, B) P(B \mid A) P(A)

Derivation:

We first condition $P(A, B, C)$ on $A$:

P(A, B, C)=P(B, C \mid A) P(A)

Then we condition $P(B, C \mid A)$ on $B$:

P(B, C \mid A)=P(C \mid A, B) P(B \mid A)

Substituting the second equation into the first yields the Chain Rule.
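The Chain Rule can be checked numerically on a small example. The sketch below assumes a hypothetical urn with three red and two blue balls, from which three balls are drawn without replacement; the events $A$, $B$ and $C$ are "the first, second and third ball is red", respectively. The urn, the helper `prob` and the event functions are illustrative assumptions, not part of the original text.

```python
from fractions import Fraction
from itertools import permutations

# Hypothetical urn: 3 red and 2 blue balls, three drawn without replacement.
balls = ["r1", "r2", "r3", "b1", "b2"]
omega = list(permutations(balls, 3))  # 60 equally likely ordered draws


def prob(event, given=None):
    """P(event | given), found by counting outcomes in the restricted space."""
    space = [o for o in omega if given is None or given(o)]
    return Fraction(sum(1 for o in space if event(o)), len(space))


def a(o):  # A: the first ball is red
    return o[0].startswith("r")


def b(o):  # B: the second ball is red
    return o[1].startswith("r")


def c(o):  # C: the third ball is red
    return o[2].startswith("r")


def a_and_b(o):
    return a(o) and b(o)


def a_b_c(o):
    return a(o) and b(o) and c(o)


joint = prob(a_b_c)                                          # P(A, B, C)
chain = prob(c, given=a_and_b) * prob(b, given=a) * prob(a)  # Chain Rule
print(joint, chain)  # 1/10 1/10
```

Here the three factors are $P(A)=\frac{3}{5}$, $P(B \mid A)=\frac{1}{2}$ and $P(C \mid A, B)=\frac{1}{3}$, whose product matches the directly counted joint probability.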

Note that conditioning changes the probability, not the event. In the same way that the function $P(\cdot)$ assigns a probability to any event $A \subseteq \Omega$, the function $P(\cdot \mid B)$ assigns a conditional probability (given $B$) to any event $A \subseteq \Omega$. Thus, all the identities we have encountered so far still hold for conditional probabilities, for example:

\begin{gathered} P(\bar{A} \mid B)=1-P(A \mid B) \\ P(A \cup C \mid B)=P(A \mid B)+P(C \mid B)-P(A, C \mid B) \end{gathered}
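As a short worked check, the first identity follows directly from the definition of conditional probability. The only extra fact used is that $B$ splits into the two disjoint events $(A, B)$ and $(\bar{A}, B)$, so that $P(A, B)+P(\bar{A}, B)=P(B)$:

P(A \mid B)+P(\bar{A} \mid B)=\frac{P(A, B)+P(\bar{A}, B)}{P(B)}=\frac{P(B)}{P(B)}=1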

Example: Conditional Probability

Using the survey example from the previous section, what is the probability that a person has seen Game of Thrones, given that this person is a Breaking Bad fan?

Using the same notation as before, we want to find $P(G \mid B)$. This is a straightforward application of the conditional-probability definition:

P(G \mid B)=\frac{P(G, B)}{P(B)}

Since we know the probabilities $P(G, B)$ and $P(B)$, we can compute the answer directly:

P(G \mid B)=\frac{\frac{20}{100}}{\frac{55}{100}}=\frac{20}{55}=\frac{4}{11} \approx 0.36

Note that this probability is the same as the probability of selecting a Game of Thrones viewer at random from the list of Breaking Bad viewers. Conditioning can be thought of as reducing the sample space to only those outcomes in which the conditioning event occurs.
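The arithmetic above can be reproduced with exact fractions. This is only a sketch: the variable names are hypothetical, and the second calculation assumes the fractions can be read as counts out of 100 survey respondents (20 who have seen both shows, 55 Breaking Bad viewers).

```python
from fractions import Fraction

# Probabilities taken from the worked example above.
p_g_and_b = Fraction(20, 100)  # P(G, B)
p_b = Fraction(55, 100)        # P(B)

# Definition of conditional probability: P(G | B) = P(G, B) / P(B).
p_g_given_b = p_g_and_b / p_b
print(p_g_given_b)                    # 4/11
print(round(float(p_g_given_b), 3))  # 0.364

# Restricted-sample-space view (assumed counts): 20 of the 55 Breaking Bad
# viewers have also seen Game of Thrones.
p_restricted = Fraction(20, 55)
print(p_restricted == p_g_given_b)    # True
```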

General Laws of Probability

The Total Law of Probability