Multivariate Distributions
Up to this point we have mostly focused on single (univariate) random variables (i.e. $X$). However, in certain experiments it might be appropriate to explore the relationships between several random variables, such as the relationship between blood cholesterol and heart disease. In this section, we will mostly focus on the bivariate case (i.e. pairs of random variables $(X, Y)$), as the main concepts can easily be extended to the general multivariate case.
Joint Cumulative Distribution Function
We begin by defining the joint cumulative distribution function (CDF) which, as we recall from our analysis of single (univariate) random variables, is defined in the same way for discrete and continuous random variables:
The joint cumulative distribution function (joint CDF) of the random variables $X$ and $Y$ is given by the function

$$F_{X,Y}(x, y) = P(X \le x, Y \le y)$$
Recall that the comma here represents the intersection of the events $\{X \le x\}$ and $\{Y \le y\}$ (i.e. $P(X \le x, Y \le y) = P(\{X \le x\} \cap \{Y \le y\})$). The function evaluated at a point $(x, y)$ is the probability that the random variable $X$ takes a value less than or equal to $x$, and the random variable $Y$ takes a value less than or equal to $y$.
It should be noted that this definition also applies to mixed distributions (i.e. when $X$ and $Y$ are a mixture of discrete and continuous random variables). Joint CDFs have the following properties:
- $0 \le F_{X,Y}(x, y) \le 1$ for all $(x, y)$
- $F_{X,Y}(x, y) \to 0$ as $x \to -\infty$ or $y \to -\infty$
- $F_{X,Y}(x, y) \to F_Y(y)$ as $x \to \infty$, and $F_{X,Y}(x, y) \to F_X(x)$ as $y \to \infty$
Property 3 stems from the fact that if $x \to \infty$, then the event $\{X \le x\}$ occurs with probability 1, so $F_{X,Y}(x, y) \to P(Y \le y) = F_Y(y)$ (and vice versa for the case where $y \to \infty$).
Joint Discrete Random Variables
Suppose that the random variables $X$ and $Y$ are both discrete. It is straightforward to generalise the univariate definition of the PMF to the bivariate (and multivariate) case:
The joint probability mass function (joint PMF) of the discrete random variables $X$ and $Y$ is given by the function

$$p_{X,Y}(x, y) = P(X = x, Y = y)$$
The function evaluated at a point $(x, y)$ is the probability that the random variable $X$ takes the value $x$, and the random variable $Y$ takes the value $y$.
If we know the joint PMF, we can compute the joint CDF by summation of the appropriate probabilities:

$$F_{X,Y}(x, y) = \sum_{x_i \le x} \sum_{y_j \le y} p_{X,Y}(x_i, y_j)$$
As in the univariate case, the total probability (i.e. the summation over all values of $x$ and $y$) must equal 1:

$$\sum_{x} \sum_{y} p_{X,Y}(x, y) = 1$$
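As a quick sketch of these summation relationships, we can compute a joint CDF from a joint PMF by direct summation. Here we assume, purely for illustration, a hypothetical pair of independent fair dice $X$ and $Y$:

```python
from fractions import Fraction

# Hypothetical illustration: two independent fair dice, X and Y,
# with joint PMF p(x, y) = 1/36 for all x, y in 1..6.
p = {(x, y): Fraction(1, 36) for x in range(1, 7) for y in range(1, 7)}

def joint_cdf(x, y):
    # F(x, y) = sum of p(xi, yj) over all xi <= x and yj <= y
    return sum(v for (xi, yj), v in p.items() if xi <= x and yj <= y)

assert joint_cdf(3, 2) == Fraction(6, 36)  # 6 outcomes: xi in {1,2,3}, yj in {1,2}
assert joint_cdf(6, 6) == 1                # summing over everything gives 1
```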
We can display the values of the joint PMF in a two-dimensional table, with the rows representing the values of $X$ and the columns representing the values of $Y$.
Example: Joint PMF for Two Fair Dice
Let's consider our favourite PMF example where we roll two fair dice. We will define $X$ to be the usual sum of the two dice (see the Examples in Section 2.2 and the PMF shown in Figure 3), and $Y$ to be the larger of the two numbers (i.e. $Y = \max(D_1, D_2)$, where $D_1$ and $D_2$ denote the outcomes of the two dice).
We can compute each of the entries of $p_{X,Y}(x, y)$ by fixing a value for $y$ (the largest number on either die), fixing a value for $x$ (the sum of the two dice), and then checking whether there exists an outcome $(D_1, D_2)$ such that $D_1 + D_2 = x$ and $\max(D_1, D_2) = y$.
Let's see how this applies to a few scenarios for $p_{X,Y}(x, y)$. For example, $p_{X,Y}(2, 1) = \frac{1}{36}$, since the only outcome with sum 2 and maximum 1 is $(1, 1)$; $p_{X,Y}(7, 4) = \frac{2}{36}$, since both $(3, 4)$ and $(4, 3)$ have sum 7 and maximum 4; and $p_{X,Y}(5, 1) = 0$, since a maximum of 1 forces a sum of 2.
By enumerating over all $x$ and $y$, we can compute the entire joint PMF as:
| $p_{X,Y}(x,y)$ | $y=1$ | $y=2$ | $y=3$ | $y=4$ | $y=5$ | $y=6$ |
|---|---|---|---|---|---|---|
| $x=2$ | $\frac{1}{36}$ | 0 | 0 | 0 | 0 | 0 |
| $x=3$ | 0 | $\frac{2}{36}$ | 0 | 0 | 0 | 0 |
| $x=4$ | 0 | $\frac{1}{36}$ | $\frac{2}{36}$ | 0 | 0 | 0 |
| $x=5$ | 0 | 0 | $\frac{2}{36}$ | $\frac{2}{36}$ | 0 | 0 |
| $x=6$ | 0 | 0 | $\frac{1}{36}$ | $\frac{2}{36}$ | $\frac{2}{36}$ | 0 |
| $x=7$ | 0 | 0 | 0 | $\frac{2}{36}$ | $\frac{2}{36}$ | $\frac{2}{36}$ |
| $x=8$ | 0 | 0 | 0 | $\frac{1}{36}$ | $\frac{2}{36}$ | $\frac{2}{36}$ |
| $x=9$ | 0 | 0 | 0 | 0 | $\frac{2}{36}$ | $\frac{2}{36}$ |
| $x=10$ | 0 | 0 | 0 | 0 | $\frac{1}{36}$ | $\frac{2}{36}$ |
| $x=11$ | 0 | 0 | 0 | 0 | 0 | $\frac{2}{36}$ |
| $x=12$ | 0 | 0 | 0 | 0 | 0 | $\frac{1}{36}$ |
From the joint PMF we can directly compute a number of joint queries; for example, $P(X \le 4, Y \le 2) = p_{X,Y}(2, 1) + p_{X,Y}(3, 2) + p_{X,Y}(4, 2) = \frac{4}{36}$.
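The joint PMF above can also be reproduced by brute-force enumeration of the 36 equally likely outcomes; the following sketch (purely illustrative) builds the joint PMF of the sum and the maximum:

```python
from fractions import Fraction
from itertools import product

# Joint PMF of X = sum and Y = max for two fair dice,
# built by enumerating all 36 equally likely outcomes.
pmf = {}
for d1, d2 in product(range(1, 7), repeat=2):
    key = (d1 + d2, max(d1, d2))
    pmf[key] = pmf.get(key, Fraction(0)) + Fraction(1, 36)

assert pmf[(2, 1)] == Fraction(1, 36)   # only outcome (1, 1)
assert pmf[(7, 4)] == Fraction(2, 36)   # outcomes (3, 4) and (4, 3)
assert (5, 1) not in pmf                # a maximum of 1 forces a sum of 2
assert sum(pmf.values()) == 1           # total probability
```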
Joint Continuous Random Variables
Let us now consider the case where the random variables $X$ and $Y$ are both continuous.
The random variables $X$ and $Y$ are jointly continuous if there exists a function $f_{X,Y}(x, y)$, called the joint probability density function (joint PDF) of $X$ and $Y$, with the following property:

$$P\big((X, Y) \in A\big) = \iint_A f_{X,Y}(x, y)\, dx\, dy$$
for any region $A \subseteq \mathbb{R}^2$. Integrating the PDF over a region gives the probability that $(X, Y)$ takes values within that region. The order of integration is not important and is interchangeable (as we shall see in the example below).
If the region of interest is rectangular, we have:

$$P(a \le X \le b,\; c \le Y \le d) = \int_a^b \int_c^d f_{X,Y}(x, y)\, dy\, dx$$
The total probability must be equal to 1, so if we integrate the joint PDF everywhere, we obtain:

$$\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f_{X,Y}(x, y)\, dx\, dy = 1$$
The joint CDF and PDF are linked in the usual way, but we need to integrate/differentiate twice:

$$F_{X,Y}(x, y) = \int_{-\infty}^{x} \int_{-\infty}^{y} f_{X,Y}(s, t)\, dt\, ds, \qquad f_{X,Y}(x, y) = \frac{\partial^2 F_{X,Y}(x, y)}{\partial x\, \partial y}$$
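As a numerical sketch of this two-fold integrate/differentiate link, assume a hypothetical joint PDF $f_{X,Y}(x, y) = 4xy$ on the unit square, whose joint CDF there is $F_{X,Y}(x, y) = x^2 y^2$:

```python
# Hypothetical joint PDF f(x, y) = 4xy on [0,1]^2; integrating twice
# gives the joint CDF F(x, y) = x^2 * y^2 on that square.
def f(x, y):
    return 4 * x * y

def F(x, y):
    return x**2 * y**2

# PDF -> CDF: midpoint-rule double integral of f over [0,1]^2 recovers F(1,1) = 1.
n = 200
h = 1.0 / n
approx = sum(f((i + 0.5) * h, (j + 0.5) * h) * h * h
             for i in range(n) for j in range(n))
assert abs(approx - F(1.0, 1.0)) < 1e-6

# CDF -> PDF: the mixed second finite difference of F approximates f.
e = 1e-4
mixed = (F(0.5 + e, 0.5 + e) - F(0.5 + e, 0.5 - e)
         - F(0.5 - e, 0.5 + e) + F(0.5 - e, 0.5 - e)) / (4 * e * e)
assert abs(mixed - f(0.5, 0.5)) < 1e-6
```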
Example: A Simple Joint PDF for Continuous RVs
Consider the PDF for two continuous random variables, $X$ and $Y$, given by:
(a) Find the value of the normalising constant so that $f_{X,Y}(x, y)$ is a valid joint PDF.
Solution:
The expression above is a joint PDF if and only if it is non-negative and integrates to 1 over its range. So, integrating the PDF over the correct range, and integrating over the inner variable first, we have:
Thus we obtain the normalising constant, and the complete expression for $f_{X,Y}(x, y)$ then becomes:
(b) Compute .
Solution:
As we mentioned in the definition above, we need to integrate over the area defined by the event in question, and the order of integration (i.e. whether we integrate over $x$ or $y$ first) should not matter. However, for non-rectangular areas (i.e. where there is a dependence between the bounds on $x$ and $y$) we must be careful in determining our ranges of integration.
Trick for Determining Integration Bounds for Non-rectangular Areas:
1.) First construct a table that specifies the bounds on each of the variables. For this particular example we have:
Upper bound | 1 | |
---|---|---|
Lower bound | 0 |
2.) Now we integrate the first variable over the range specified in the table above. The two possibilities for $x$ and $y$ are summarised in red below:
3.) Specifying the outer integration range for the second variable is a bit trickier, as we can no longer use the first variable in the range (it will already have been removed from the expression by the first integration). In this case, if any of the first variables appear in the range of the outer integration, we set them to the values of their corresponding bounds (as read from the table), which is shown in blue below:
Let us now compute the probability by integrating over $y$ first and then over $x$:
We leave it to the reader to verify that we get the same answer when integrating over $x$ first and then over $y$.
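As a numerical sanity check of the order-of-integration argument, consider a hypothetical joint PDF that is uniform (with density 2) on the triangle $0 \le y \le x \le 1$; both orders of integration give a total probability of 1:

```python
# Hypothetical joint PDF: f(x, y) = 2 on the triangle 0 <= y <= x <= 1,
# and 0 elsewhere. We approximate the double integrals with midpoint sums.
n = 500
h = 1.0 / n
mids = [(k + 0.5) * h for k in range(n)]

# Order 1: inner integral over y in [0, x], outer over x in [0, 1].
total_yx = sum(sum(2 * h for y in mids if y <= x) * h for x in mids)

# Order 2: inner integral over x in [y, 1], outer over y in [0, 1].
total_xy = sum(sum(2 * h for x in mids if x >= y) * h for y in mids)

assert abs(total_yx - 1.0) < 0.01   # total probability
assert abs(total_xy - 1.0) < 0.01   # same answer, order swapped
```

Note that the inner bounds depend on the outer variable; this is exactly the non-rectangular dependence discussed in the steps above.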
Marginal Distributions
We introduced the concept of marginalisation in Section 2.1.1, and we saw that it is essentially the Law of Total Probability applied to random variables. Here we provide a more formal definition of marginalisation for discrete and continuous random variables and utilise examples to illustrate.
The marginal probability mass functions (marginal PMFs) of the discrete random variables $X$ and $Y$ are the functions

$$p_X(x) = \sum_{y} p_{X,Y}(x, y), \qquad p_Y(y) = \sum_{x} p_{X,Y}(x, y)$$
The marginal of $X$ gives us the probability distribution of the random variable $X$ alone, ignoring any information about $Y$, and vice versa.
Example: Marginalisation of the Joint PMF for Two Dice
Consider again our example of a joint PMF $p_{X,Y}(x, y)$, where $X$ was defined to be the sum of the two dice and $Y$ to be the larger of the two numbers. Using the above definition, we can compute $p_X(x)$ by summing over all $y$ (i.e. over all columns) for a given $x$; the resulting values are shown in the far right column below. Likewise, we can compute $p_Y(y)$ by summing over all $x$ (i.e. over all rows) for a given $y$; the resulting values are shown in the bottom row below:
| $p_{X,Y}(x,y)$ | $y=1$ | $y=2$ | $y=3$ | $y=4$ | $y=5$ | $y=6$ | $p_X(x)$ |
|---|---|---|---|---|---|---|---|
| $x=2$ | $\frac{1}{36}$ | 0 | 0 | 0 | 0 | 0 | $\frac{1}{36}$ |
| $x=3$ | 0 | $\frac{2}{36}$ | 0 | 0 | 0 | 0 | $\frac{2}{36}$ |
| $x=4$ | 0 | $\frac{1}{36}$ | $\frac{2}{36}$ | 0 | 0 | 0 | $\frac{3}{36}$ |
| $x=5$ | 0 | 0 | $\frac{2}{36}$ | $\frac{2}{36}$ | 0 | 0 | $\frac{4}{36}$ |
| $x=6$ | 0 | 0 | $\frac{1}{36}$ | $\frac{2}{36}$ | $\frac{2}{36}$ | 0 | $\frac{5}{36}$ |
| $x=7$ | 0 | 0 | 0 | $\frac{2}{36}$ | $\frac{2}{36}$ | $\frac{2}{36}$ | $\frac{6}{36}$ |
| $x=8$ | 0 | 0 | 0 | $\frac{1}{36}$ | $\frac{2}{36}$ | $\frac{2}{36}$ | $\frac{5}{36}$ |
| $x=9$ | 0 | 0 | 0 | 0 | $\frac{2}{36}$ | $\frac{2}{36}$ | $\frac{4}{36}$ |
| $x=10$ | 0 | 0 | 0 | 0 | $\frac{1}{36}$ | $\frac{2}{36}$ | $\frac{3}{36}$ |
| $x=11$ | 0 | 0 | 0 | 0 | 0 | $\frac{2}{36}$ | $\frac{2}{36}$ |
| $x=12$ | 0 | 0 | 0 | 0 | 0 | $\frac{1}{36}$ | $\frac{1}{36}$ |
| $p_Y(y)$ | $\frac{1}{36}$ | $\frac{3}{36}$ | $\frac{5}{36}$ | $\frac{7}{36}$ | $\frac{9}{36}$ | $\frac{11}{36}$ | $1$ |
Notice that the resulting $p_X(x)$ and $p_Y(y)$ are indeed PMFs, as they satisfy the two requirements (i.e. they are bounded between zero and one, and they sum to one). The univariate PMFs $p_X(x)$ and $p_Y(y)$ are commonly written in the margins of the joint PMF table (as shown above), and are known as the marginals (hence the term "marginalisation").
Pause and Reflect 1: Recall we defined marginalisation in Section 2.1.1 as $P(X = x) = \sum_{y} P(X = x, Y = y)$. We have simply re-expressed this as $p_X(x) = \sum_{y} p_{X,Y}(x, y)$ using PMF notation for discrete random variables; these two expressions are identical.
Pause and Reflect 2: Does $p_X(x)$ look familiar? Check out Figure 3 and the table above it.
Pause and Reflect 3: Note that it is generally not possible to go the other way around; that is, to reconstruct the full joint PMF table from $p_X(x)$ and $p_Y(y)$ alone (unless the random variables are independent, which will be discussed in the next section).
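The marginalisation of the dice table can also be checked mechanically; this sketch enumerates the 36 outcomes, builds the joint PMF, and sums out each variable:

```python
from fractions import Fraction
from itertools import product
from collections import defaultdict

# Joint PMF of X = sum, Y = max for two fair dice, by enumeration.
joint = defaultdict(Fraction)
for d1, d2 in product(range(1, 7), repeat=2):
    joint[(d1 + d2, max(d1, d2))] += Fraction(1, 36)

# Marginalise: p_X(x) = sum over y, p_Y(y) = sum over x.
p_X, p_Y = defaultdict(Fraction), defaultdict(Fraction)
for (x, y), v in joint.items():
    p_X[x] += v
    p_Y[y] += v

assert p_X[7] == Fraction(6, 36)    # the familiar two-dice sum PMF
assert p_Y[6] == Fraction(11, 36)   # P(max = 6)
assert sum(p_X.values()) == 1 and sum(p_Y.values()) == 1
```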
In an analogous fashion we can define marginalisation for continuous random variables as follows:
The marginal probability density functions (marginal PDFs) of the continuous random variables $X$ and $Y$ are the functions

$$f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x, y)\, dy, \qquad f_Y(y) = \int_{-\infty}^{\infty} f_{X,Y}(x, y)\, dx$$
This is essentially the same as the previous definition; in a continuous setting, getting rid of a variable requires integration rather than summation. Let's consider a few examples to illustrate marginalising joint PDFs.
Example: Marginalisation of Joint PDFs for Continuous RVs
1.) Find $f_X(x)$ for the following joint PDF:
2.) Find the marginal density functions $f_X(x)$ and $f_Y(y)$ for the following joint PDF:
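As a general illustration of continuous marginalisation (a hypothetical density, not the exercises above), take $f_{X,Y}(x, y) = x + y$ on the unit square; integrating out $y$ gives $f_X(x) = x + \tfrac{1}{2}$, and $f_Y(y) = y + \tfrac{1}{2}$ by symmetry:

```python
# Hypothetical joint PDF f(x, y) = x + y on the unit square [0,1]^2.
# Marginalising over y gives f_X(x) = x + 1/2, since the integral of
# (x + y) dy over [0, 1] is x + 1/2.
def f(x, y):
    return x + y

def marginal_x(x, n=1000):
    # Midpoint-rule approximation of the integral over y in [0, 1].
    h = 1.0 / n
    return sum(f(x, (j + 0.5) * h) * h for j in range(n))

assert abs(marginal_x(0.3) - 0.8) < 1e-9   # x + 1/2 at x = 0.3

# f_X must itself integrate to 1, as any marginal PDF does:
h = 1.0 / 1000
total = sum(marginal_x((i + 0.5) * h) * h for i in range(1000))
assert abs(total - 1.0) < 1e-9
```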