Continuous Random Variables

The previous section on discrete random variables sets us up nicely to consider continuous random variables. Informally, a random variable is continuous if it represents a quantity that is measured (as opposed to counted). For instance, if we were to sample the heights of everyone in the class, we would obtain a range of values that is not countable (e.g. $183.32\ \mathrm{cm}$). However, we may be interested in computing the probability of observing heights greater than $182\ \mathrm{cm}$ and less than $185\ \mathrm{cm}$; this section will introduce the methods for computing such probabilities.

NOTE: There is a slight change in nomenclature between discrete and continuous random variables; instead of probability mass functions (PMFs; $f_{X}(x)$) we have probability density functions (PDFs; also $f_{X}(x)$!).

We shall find out that all our definitions for the cumulative distribution function (CDF; $F_{X}(x)$), expectation ($E[X]$) and variance ($\operatorname{Var}[X]$) are the same, except we replace summations with integrations.

Since a continuous random variable $X$ can take on an infinite number of values, we can no longer compute the probability of observing an exact value (which is equal to zero, as we shall show mathematically). Thus, we need to define the probability over a range of values:

A random variable $X$ is continuous if there exists a function $f_{X}(x)$, the probability density function (PDF) of $X$, with the property:

$$P(a \leq X \leq b)=F_{X}(b)-F_{X}(a)=\int_{a}^{b} f_{X}(x)\, dx$$

Integrating the PDF over an interval $(a, b)$ gives the probability that $X$ takes a value within that interval.
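To make this concrete, here is a minimal numerical sketch in Python (using SciPy). The normal distribution and its parameters are invented purely for illustration; the notes do not specify a distribution for the class heights.

```python
# A minimal sketch: assume, purely for illustration, that class heights
# follow a Normal(178, 8) distribution in cm (both parameters invented).
from scipy.integrate import quad
from scipy.stats import norm

mu, sigma = 178.0, 8.0       # assumed mean and standard deviation (cm)
a, b = 182.0, 185.0          # interval from the height example above

# P(a <= X <= b) by integrating the PDF over [a, b]
p_quad, _ = quad(norm(mu, sigma).pdf, a, b)

# equivalently, via the CDF: F_X(b) - F_X(a)
p_cdf = norm.cdf(b, mu, sigma) - norm.cdf(a, mu, sigma)

print(p_quad, p_cdf)         # the two agree (~0.118)
```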

If we let $a \rightarrow -\infty$, the probability $F_{X}(a)$ becomes 0 and we can write:

$$F_{X}(b)=\int_{-\infty}^{b} f_{X}(x)\, dx$$

Notice the similarities between the above equation and property 2 for discrete CDFs ($F_{X}(b)=\sum_{x \leq b} f_{X}(x)$), where we have essentially replaced the summation with an integral; this tells us that we can integrate the PDF of $X$ to obtain the CDF. But what about the probability of observing a single exact value?

$$P(X=b)=\int_{b}^{b} f_{X}(x)\, dx=0$$

From this we can conclude that the probability of a continuous random variable taking any single exact value is zero; this has important implications, in that non-zero probabilities can only be computed over an interval of values for random variable $X$. This also leads to an often relaxed interchange between strict and non-strict inequalities in the computation of probabilities (you will see different versions in different texts):

For $a<b$:

$$P(a \leq X \leq b)=P(a<X \leq b)=P(a \leq X<b)=P(a<X<b)$$

We can redefine the PDF as the derivative of the CDF:

$$f_{X}(x)=\frac{d}{dx} F_{X}(x)$$

The above definition is the one more commonly provided. Similar to a PMF, a function $f_{X}(x)$ is a PDF for a continuous random variable $X$ if and only if:

$$\begin{aligned} &\text{1. } f_{X}(x) \geq 0 \text{ for all } x \in \mathbb{R} \\ &\text{2. } \int_{-\infty}^{\infty} f_{X}(x)\, dx=1 \end{aligned}$$
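As a quick sanity check, both conditions can be verified numerically for a candidate density. The sketch below uses an exponential density, $f_X(x) = 2e^{-2x}$ for $x \geq 0$, chosen only as an example (it is not a distribution from these notes):

```python
# Sketch: numerically checking the two PDF conditions for an illustrative
# density, f_X(x) = 2*exp(-2*x) for x >= 0 and zero otherwise.
import numpy as np
from scipy.integrate import quad

def f(x):
    return 2.0 * np.exp(-2.0 * x) if x >= 0 else 0.0

# condition 1: f_X(x) >= 0 everywhere (spot-checked on a grid)
assert all(f(x) >= 0 for x in np.linspace(-5.0, 20.0, 2001))

# condition 2: the density integrates to 1 over the whole real line
# (f is zero for x < 0, so integrating from 0 suffices)
total, _ = quad(f, 0.0, np.inf)
print(total)   # ~1.0
```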

Expectation and Variance for Continuous Random Variables

Analogous to the moments we computed for discrete random variables, we can do the same for continuous RVs:

$$\begin{aligned} M_{n}^{c} &= \int_{-\infty}^{\infty}(x-c)^{n} f_{X}(x)\, dx \\ M_{0}^{0} &= \int_{-\infty}^{\infty} f_{X}(x)\, dx \quad (=1) \\ M_{1}^{0} &= \int_{-\infty}^{\infty} x f_{X}(x)\, dx \quad (=E[X]) \\ M_{2}^{E[X]} &= \int_{-\infty}^{\infty}(x-E[X])^{2} f_{X}(x)\, dx \quad (=\operatorname{Var}[X]) \end{aligned}$$

The expectation, $E[X]$, of a continuous random variable $X$ is defined as:

$$E[X]=\int_{-\infty}^{\infty} x f_{X}(x)\, dx$$

In direct analogy to the discrete case, the expectation is a weighted integral over the values $X$ can take.

Similarly, we define the expectation of a function $g(X)$ of $X$ to be:

$$E[g(X)]=\int_{-\infty}^{\infty} g(x) f_{X}(x)\, dx$$

and note that the 'formal' definition of the variance is the same as in the discrete case:

$$\operatorname{Var}[X]=E\left[(X-E[X])^{2}\right]=E\left[X^{2}\right]-E[X]^{2}$$
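The sketch below evaluates these moment integrals numerically for the same illustrative exponential density, whose exact moments are known ($E[X]=1/2$ and $\operatorname{Var}[X]=1/4$):

```python
# Sketch: expectation and variance by numerical integration, reusing the
# illustrative density f_X(x) = 2*exp(-2*x), x >= 0.
import numpy as np
from scipy.integrate import quad

f = lambda x: 2.0 * np.exp(-2.0 * x)

EX, _  = quad(lambda x: x * f(x), 0.0, np.inf)       # first moment, E[X]
EX2, _ = quad(lambda x: x**2 * f(x), 0.0, np.inf)    # second moment, E[X^2]
var = EX2 - EX**2                                    # Var[X] = E[X^2] - E[X]^2

print(EX, var)   # ~0.5, ~0.25
```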

Example: Probability Density Function from 2008 Exam

The continuous random variable $X$ has the density function:

$$f_{X}(x)=k(1+x)^{-\alpha} \quad \text{for } \alpha>2 \text{ and } 0<x<\infty$$

(a) Show that $k=\alpha-1$.

Solution:

$$\begin{aligned} \int_{0}^{\infty} \frac{k}{(1+x)^{\alpha}}\, dx &= 1 \\ \frac{-k}{(\alpha-1)}\left[\frac{1}{(1+x)^{\alpha-1}}\right]_{0}^{\infty} &= 1 \\ \frac{-k}{(\alpha-1)}\,[0-1] &= 1 \\ k &= \alpha-1 \end{aligned}$$

(b) Calculate the mean of $X$.

Solution: The mean is the expectation of that random variable. Thus we need to compute:

$$E[X]=\int_{0}^{\infty} x\, \frac{(\alpha-1)}{(1+x)^{\alpha}}\, dx$$

To solve this we can use integration by parts:

$$\begin{aligned} \int u\, dv &= uv-\int v\, du \\ \text{where: } \quad u=x \;&\rightarrow\; du=dx \\ dv=\frac{(\alpha-1)}{(1+x)^{\alpha}}\, dx \;&\rightarrow\; v=\frac{-1}{(1+x)^{\alpha-1}} \end{aligned}$$

Evaluation of the first term gives us:

$$uv=\left[\frac{-x}{(1+x)^{\alpha-1}}\right]_{0}^{\infty}$$

which is equal to zero at both limits, since $\alpha$ is strictly greater than two.

Computing the integral in the second term gives:

$$\begin{aligned} \int v\, du &= \int_{0}^{\infty} \frac{-1}{(1+x)^{\alpha-1}}\, dx \\ &= \frac{1}{(\alpha-2)}\left[\frac{1}{(1+x)^{\alpha-2}}\right]_{0}^{\infty} \end{aligned}$$

which is equal to $\frac{-1}{\alpha-2}$.

Subtracting the two terms gives us $E[X]=\frac{1}{\alpha-2}$.

(c) Compute $P(1 \leq X \leq 2)$ for $\alpha=3$.

Solution: We can simply integrate the PDF over the range $x \in [1,2]$.

$$\begin{aligned} P(1 \leq X \leq 2) &= \int_{1}^{2} f_{X}(x)\, dx \\ &= \int_{1}^{2} \frac{(\alpha-1)}{(1+x)^{\alpha}}\, dx \\ &= \left[\frac{-1}{(1+x)^{\alpha-1}}\right]_{1}^{2} \\ &= \frac{-1}{3^{2}}-\frac{-1}{2^{2}}=0.13889 \end{aligned}$$
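All three parts of this example can be spot-checked numerically; a short sketch with $\alpha = 3$:

```python
# Sketch: verifying parts (a)-(c) numerically for alpha = 3.
import numpy as np
from scipy.integrate import quad

alpha = 3.0
k = alpha - 1.0                                   # part (a): k = alpha - 1
f = lambda x: k * (1.0 + x) ** (-alpha)

print(quad(f, 0.0, np.inf)[0])                    # ~1.0: density is normalised
print(quad(lambda x: x * f(x), 0.0, np.inf)[0])   # ~1.0: E[X] = 1/(alpha-2) = 1
print(quad(f, 1.0, 2.0)[0])                       # ~0.13889: part (c)
```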

Figure 7 shows a plot of the PDF $f_{X}(x)$ and $\int_{1}^{2} f_{X}(x)\, dx$ for $\alpha=3$.

Figure 7: Plot of the probability density function $f_{X}(x)=\frac{\alpha-1}{(1+x)^{\alpha}}$ for $\alpha=3$. The area under the curve between $x=1$ and $x=2$ is equal to 0.13889, which corresponds to the probability of observing $x$ in that interval.

Example: Probabilistic Interpretation of a Wave Function

Let's take a moment to think about where we have seen these operations before in our Chem Eng courses. Recall in your Properties of Matter course that you were provided the wave function, $\Psi$, for the "particle in a box" problem that satisfies the Schrödinger equation and allegedly contains all the dynamical information about the system.

1. General Solution (Review):

Recall that the Schrödinger equation for free motion in one dimension is:

$$-\frac{\hbar^{2}}{2 m} \frac{d^{2} \Psi(x)}{dx^{2}}=E \Psi(x)$$

The general solution to this problem is:

$$\Psi_{k}(x)=A e^{i k x}+B e^{-i k x}, \qquad E_{k}=\frac{k^{2} \hbar^{2}}{2 m}$$

where the wave function $\Psi_{k}(x)$ can be written more conveniently as:

$$\Psi_{k}(x)=C \sin kx+D \cos kx \quad \text{using } e^{\pm i x}=\cos x \pm i \sin x$$

This solution must satisfy the boundary conditions $\Psi_{k}(x=0)=0$ and $\Psi_{k}(x=L)=0$. Thus the expression simplifies to:

$$\Psi_{k}(x)=C \sin kx$$

since the sine function satisfies the boundary condition $\Psi_{k}(0)=0$ while the cosine term does not (i.e. $D=0$).

Next we need $\Psi_{k}(x)$ to satisfy the other boundary condition, $\Psi_{k}(x=L)=0$, which results in the requirement that:

$$\Psi_{k}(x=L)=0=C \sin kL \quad \rightarrow \quad kL=n\pi, \quad n=1,2,\ldots$$

Solving for $k$ above, the final solution to the wave function for a particle in a box is therefore:

$$\Psi_{n}(x)=C \sin \frac{n \pi x}{L}, \qquad 0 \leq x \leq L, \quad n=1,2,\ldots$$

where we now index the wave function by $n$ (the multiple of $\pi$).

2. Normalisation of the Wave Function:

If we focus on the information regarding the location of a particle, then we can use the Born interpretation of the wave function, which states that the probability density function (PDF) for finding a particle as a function of its position $x$ is:

$$f_{X}(x)=|\Psi(x)|^{2}$$

Thus, starting from our solution to the Schrödinger equation ($\Psi(x)$ above), we have to normalise the wave function so it satisfies $\int_{-\infty}^{\infty} f_{X}(x)\, dx=1$ and can be utilised as a PDF to describe particle position:

$$\begin{aligned} \int_{0}^{L}\left(\Psi_{n}(x)\right)^{2} dx &= 1 \\ C^{2} \int_{0}^{L} \sin^{2} \frac{n \pi x}{L}\, dx &= 1 \\ C^{2}\, \frac{L}{2} &= 1 \\ C &= \sqrt{\frac{2}{L}} \end{aligned}$$

This results in the following normalised wave function for a particle in a box of length $L$ (given to us in PoM):

$$f_{X}(x)=\Psi_{n}(x)^{2}=\frac{2}{L} \sin^{2} \frac{n \pi x}{L}, \qquad 0 \leq x \leq L$$
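As a quick check, the sketch below confirms numerically that this density integrates to 1 for several values of $n$ (with $L = 1$, i.e. lengths in nm as in Figure 8):

```python
# Sketch: confirming the normalisation of the particle-in-a-box PDF for a
# few quantum numbers, with L = 1 (nm).
import numpy as np
from scipy.integrate import quad

L = 1.0                                           # box length

def pdf(x, n):
    return (2.0 / L) * np.sin(n * np.pi * x / L) ** 2

for n in (1, 2, 3):
    total, _ = quad(pdf, 0.0, L, args=(n,))
    print(n, total)                               # ~1.0 for every n
```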

(a) Probability Density Function. (b) Probability Density Maps.

Figure 8: Plot of the probability density function for the particle in a box problem described by $f_{X}(x)=\Psi_{n}(x)^{2}=\frac{2}{L} \sin^{2} \frac{n \pi x}{L}$ for $n=1$ and $n=2$, and $L=1\ \mathrm{nm}$. (a) Notice that for $n=1$, the particle has the highest probability of being found at $x=0.5\ \mathrm{nm}$ (in the middle of the box). When $n=2$, we see that the particle is more likely to be found at $x=0.25\ \mathrm{nm}$ and $x=0.75\ \mathrm{nm}$, with equal probability. As $n$ increases, we will have $n$ maxima of equal probability in our PDF. (b) Top-down representation of the PDFs, where darker shading indicates areas with a higher probability of observing the particle.

Figure 8 shows what the PDF looks like for the particle in a box for $n=1$ and $n=2$ in a box of length $L=1\ \mathrm{nm}$.

Expectation and Variance of the Wave Function

Given the wave function, we can compute the expected position of the particle using our definition of expectation ($E[X]$; often represented in physics textbooks as $\langle x \rangle$):

$$\begin{aligned} E[X] &= \int_{-\infty}^{\infty} x f_{X}(x)\, dx \\ &= \int_{0}^{L} x\, \frac{2}{L} \sin^{2} \frac{n \pi x}{L}\, dx \end{aligned}$$

As in the previous example, this requires integration by parts, along with the trigonometric identity $\sin^{2} x=\frac{1}{2}(1-\cos 2x)$:

$$\begin{aligned} \int u\, dv &= uv-\int v\, du \\ \text{where: } \quad u=x \;&\rightarrow\; du=dx \\ dv=\frac{2}{L} \sin^{2} \frac{n \pi x}{L}\, dx = \frac{1}{L}\left(1-\cos \frac{2 n \pi x}{L}\right) dx \;&\rightarrow\; v=\frac{1}{L}\left(x-\frac{L}{2 n \pi} \sin \frac{2 n \pi x}{L}\right) \end{aligned}$$

We leave it to the reader to verify that $uv=L$ and $\int v\, du=\frac{L}{2}$. Thus:

$$E[X]=L-\frac{L}{2}=\frac{L}{2}$$

Thus, the mean particle position is in the middle of the box and is independent of $n$.
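Since the integration-by-parts bookkeeping above is easy to slip on, a symbolic check can be reassuring; a sketch using sympy:

```python
# Sketch: a symbolic check of E[X] = L/2 for the particle-in-a-box PDF.
import sympy as sp

x, L = sp.symbols("x L", positive=True)
n = sp.symbols("n", positive=True, integer=True)

f = (2 / L) * sp.sin(n * sp.pi * x / L) ** 2      # normalised PDF
EX = sp.integrate(x * f, (x, 0, L))

print(sp.simplify(EX))                            # L/2, independent of n
```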

In an analogous manner, we can compute $E\left[X^{2}\right]$ to determine the variance of these distributions using $\operatorname{Var}[X]=E\left[X^{2}\right]-E[X]^{2}$.

$$\begin{aligned} E\left[X^{2}\right] &= \int_{-\infty}^{\infty} x^{2} f_{X}(x)\, dx \\ &= \int_{0}^{L} x^{2}\, \frac{2}{L} \sin^{2} \frac{n \pi x}{L}\, dx \end{aligned}$$

This expression is slightly more complicated, but again can be analytically computed using integration by parts:

$$\begin{aligned} \int u\, dv &= uv-\int v\, du \\ \text{where: } \quad u=x^{2} \;&\rightarrow\; du=2x\, dx \\ dv=\frac{2}{L} \sin^{2} \frac{n \pi x}{L}\, dx = \frac{1}{L}\left(1-\cos \frac{2 n \pi x}{L}\right) dx \;&\rightarrow\; v=\frac{1}{L}\left(x-\frac{L}{2 n \pi} \sin \frac{2 n \pi x}{L}\right) \end{aligned}$$

Evaluating the first term is simple: $uv=L^{2}$. But the second term results in:

$$\int v\, du=\frac{2 L^{2}}{3}-\frac{1}{n \pi} \int_{0}^{L} x \sin \frac{2 n \pi x}{L}\, dx$$

The latter term requires one more round of integration by parts (using $u=x$, $du=dx$, $dv=\sin \frac{2 n \pi x}{L}\, dx$, $v=-\frac{L}{2 n \pi} \cos \frac{2 n \pi x}{L}$), which in total results in the terms in the brackets below:

$$\begin{aligned} E\left[X^{2}\right] &= L^{2}-\left[\frac{2 L^{2}}{3}+\frac{L^{2}}{2 n^{2} \pi^{2}}\right] \\ &= \frac{L^{2}}{3}-\frac{L^{2}}{2 n^{2} \pi^{2}} \end{aligned}$$

Finally we can compute the variance as:

$$\begin{aligned} \operatorname{Var}[X] &= E\left[X^{2}\right]-E[X]^{2} \\ &= \frac{L^{2}}{3}-\frac{L^{2}}{2 n^{2} \pi^{2}}-\left(\frac{L}{2}\right)^{2} \\ &= \frac{L^{2}}{12}-\frac{L^{2}}{2 n^{2} \pi^{2}} \\ &= \frac{L^{2}}{2}\left(\frac{1}{6}-\frac{1}{n^{2} \pi^{2}}\right) \end{aligned}$$
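A numerical spot-check of both moments against the analytical expression (a sketch with $L = 1$):

```python
# Sketch: checking E[X] = L/2 and the analytical Var[X] expression by
# numerical integration for a couple of quantum numbers (L = 1).
import numpy as np
from scipy.integrate import quad

L = 1.0
pdf = lambda x, n: (2.0 / L) * np.sin(n * np.pi * x / L) ** 2

for n in (1, 2):
    EX, _  = quad(lambda x: x * pdf(x, n), 0.0, L)
    EX2, _ = quad(lambda x: x**2 * pdf(x, n), 0.0, L)
    var_numeric  = EX2 - EX**2
    var_analytic = (L**2 / 2.0) * (1.0 / 6.0 - 1.0 / (n**2 * np.pi**2))
    print(n, EX, var_numeric, var_analytic)       # EX ~ 0.5; variances agree
```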

Although it was quite a bit of work, we can make a number of useful inferences with this analytical expression for $\operatorname{Var}[X]$. For starters, we can see that the variance is always positive, as it should be, since it measures the dispersion of our PDF.

Pause and Reflect 1: How does the variance change as a function of $n$? Does this make intuitive sense?

Pause and Reflect 2: How do these results for expectation and variance compare to what we would expect for classical expressions?

Hint: The probability density function in classical mechanics would simply be a uniform distribution, which results in the PDF $f_{X}(x)=\frac{1}{L}$ for $0 \leq x \leq L$.

Example: Real Data for Continuous RVs - Particle Size Distribution

Finally, let's consider an example of a continuous random variable based on actual experimental data. Many such instances involve experiments where the cumulative distribution function (CDF; $F_{X}(x)$) can be measured, and we then need to fit the experimental data to some model to compute probabilities of interest. In this example we will use experimental data corresponding to the distribution of particle sizes, which has important applications in various aspects of Chemical Engineering (e.g. reaction engineering, powder processing, etc.).

In particle characterisation, we make the approximation that the particles exhibit a roughly spherical shape, with an equivalent spherical diameter of $X$. Particle size analysis equipment of various sorts (e.g. sieves of different mesh sizes) can provide us with measurements of the total number of particles found to be below a particular size (or diameter, $X$). An example data set from actual measurements is presented below in Figure 9.

This raw data bears a striking resemblance to the shape of the discrete RV CDF for our sum-of-two-dice example (see the blue step function in Figure 4)! Indeed, the normalised form of this data represents our CDF for particle sizes $X$. Of course, the data presented in Figure 9 are discrete in that they comprise only 30 data points; in practice, however, we intuitively know that the actual sizes of the particles in the experiment take on a continuum of values between 0 and 60 microns.

In order to work with the particle sizes as a continuum, we will need to fit the available experimental CDF data to a mathematical model (this is a topic we defer for now but will explore in further detail in later chapters). In the particle characterisation community, a number of PDF and CDF models have been proposed; one such model, known as the Rosin-Rammler-Sperling-Bennett (RRSB) cumulative function, is presented below:

$$F_{X}(x)=N_{3}(x)=1-\exp \left[-\left(\frac{x}{x_{63.2}}\right)^{n}\right]$$

where $N_{3}(x)$ indicates that the function models the cumulative mass (or volume) distribution, $x_{63.2}$ represents the particle size below which $63.2\%$ of the distribution lies, and $n$ is a constant called the 'uniformity index'.

Figure 9: Raw data for the measured particle size distribution. Measurements have been normalised to the total mass of the sample. This function represents the CDF for our continuous random variable $X$: $F_{X}(x)$.

Now say that, using the data in Figure 9, we have estimated the parameter values for this RRSB cumulative function (again, using methods described later in the course) to be $x_{63.2}=37.76$ microns and $n=15.165$. We can then plot the particle size CDF for all sizes using the RRSB equation to represent our cumulative distribution function, as shown by the red curve in Figure 10.

By contrasting Figures 9 and 10, we can see that the RRSB CDF (red curve in Figure 10) approximates the raw data fairly well, particularly for the diameter range that describes most of the particles (33-41 microns). Since we now have a continuous CDF, we can differentiate it to obtain our probability density function (PDF), $f_{X}(x)=\frac{d}{dx} F_{X}(x)$, as shown by the green curve in Figure 10.

Using the analytical model for our CDF $F_{X}(x)$, we can ask questions regarding the distribution of particle sizes in our sample. For example, if we were interested in quantifying the population of particles having diameters between 30 and 35 microns, we would simply evaluate the following:

$$\begin{aligned} P(30 \leq X \leq 35) &= F_{X}(35)-F_{X}(30) \\ &= 0.27116-0.03007=0.24108 \\ &= \int_{30}^{35} f_{X}(x)\, dx \quad \text{(confirmed via numerical integration)} \end{aligned}$$
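A sketch of this computation: the RRSB cumulative function has the same mathematical form as a Weibull CDF, so the fitted model and its derivative (the PDF) can be evaluated directly:

```python
# Sketch: evaluating the fitted RRSB model (parameters from the text).
import numpy as np
from scipy.integrate import quad

x632, n = 37.76, 15.165                           # fitted parameters

def F(x):                                         # RRSB CDF
    return 1.0 - np.exp(-((x / x632) ** n))

def f(x):                                         # PDF: f_X(x) = dF_X/dx
    return (n / x632) * (x / x632) ** (n - 1) * np.exp(-((x / x632) ** n))

print(F(35.0) - F(30.0))                          # ~0.24108, as computed above
print(quad(f, 30.0, 35.0)[0])                     # same value via the PDF
```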

Likewise, we can compute the expectation $E[X]$ and variance $\operatorname{Var}[X]$ for the PDF using the moment functions defined earlier in the section. Taking the first moment of $f_{X}(x)$, we find that $E[X]=36.326$, as shown in Figure 11; note that this value does not correspond to the apex (or maximum value) of $f_{X}(x)$ in Figure 10, but lies slightly to its left, as the PDF has a broader left 'shoulder'.

Figure 10: RRSB cumulative function ($F_{X}(x)$, also referred to as $N_{3}(x)$; red curve) using parameter values $x_{63.2}=37.76$ and $n=15.165$ estimated from the raw data in Figure 9. The PDF ($f_{X}(x)$, green curve) can be calculated from the CDF via differentiation: $f_{X}(x)=\frac{d}{dx} F_{X}(x)$. Note that the area under the curve of $f_{X}(x)$ is equal to 1 (via numerical integration using the trapezoid rule).

Lastly, computing the second moment of $f_{X}(x)$ about $E[X]=36.326$ provides $\operatorname{Var}[X]=8.721$, as shown in Figure 12. It is interesting to examine the individual contributions to the variance calculation, which are shown by the green curve in Figure 12. From this, we see a clear bias towards contributions from particle diameters below the mean value of 36.326, which tells us that the distribution is asymmetric.

Figure 11: Expectation of the PDF $f_{X}(x)$, computed via the first moment, is found to be $E[X]=36.326$. The green curve represents individual contributions to the expectation: $\int_{x-\delta}^{x+\delta} u\, f_{X}(u)\, du$ for a given $x$. The red curve denotes the integral value up to $x$: $\int_{0}^{x} u\, f_{X}(u)\, du$, which when $x=60$ gives us $E[X]$.

Figure 12: Variance of the PDF $f_{X}(x)$, computed via the second moment about $E[X]$, is found to be $\operatorname{Var}[X]=8.721$. The green curve represents individual contributions to the variance: $\int_{x-\delta}^{x+\delta}(u-E[X])^{2}\, f_{X}(u)\, du$. The red curve denotes the integral value up to $x$: $\int_{0}^{x}(u-E[X])^{2}\, f_{X}(u)\, du$, which when $x=60$ gives us $\operatorname{Var}[X]$. One can see from this plot that the PDF is asymmetric about $E[X]$ (i.e. its mean value).
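For completeness, a sketch recomputing the two quoted moments by numerical integration of the fitted RRSB density; small differences from the quoted values are to be expected, since the figures were produced with a different numerical scheme (trapezoid rule on the plotted grid):

```python
# Sketch: recomputing the quoted moments of the fitted RRSB density.
import numpy as np
from scipy.integrate import quad

x632, n = 37.76, 15.165
f = lambda x: (n / x632) * (x / x632) ** (n - 1) * np.exp(-((x / x632) ** n))

# the upper limit of 120 microns is effectively infinity for this density;
# results should land near the values quoted in the text
# (E[X] ~ 36.3, Var[X] ~ 8.7)
EX, _   = quad(lambda x: x * f(x), 0.0, 120.0)              # first moment
VarX, _ = quad(lambda x: (x - EX) ** 2 * f(x), 0.0, 120.0)  # second moment about E[X]
print(EX, VarX)
```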