MLE of a Normal Distribution: Sample Mean and Sample Variance

Before we begin our section on interval estimation, we will consider the MLE for the Normal parameters $\mu$ and $\sigma$. Recall the PDF for the Normal distribution, as given in Table 1:

$$f_X(x \mid \mu, \sigma) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(x-\mu)^2 / 2\sigma^2}$$

From this we can derive the likelihood function for a sample dataset $D$:

$$\begin{aligned} L(\theta = \{\mu, \sigma\}) = P(D \mid \mu, \sigma) = \prod_{i=1}^{n} f_X(D_i \mid \mu, \sigma) &= \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(D_i-\mu)^2 / 2\sigma^2} \\ &= \left(2\pi\sigma^2\right)^{-n/2} e^{-\frac{1}{2\sigma^2}\sum_{i=1}^{n}(D_i-\mu)^2} \end{aligned}$$

Taking the natural logarithm we get the corresponding log-likelihood:

$$\mathcal{L}(\mu, \sigma) = \ln L(\mu, \sigma) = -\frac{n}{2}\ln\left(2\pi\sigma^2\right) - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(D_i-\mu)^2$$
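Before differentiating, it can help to see that this log-likelihood really does peak at the familiar sample statistics. The sketch below is a minimal numerical check, assuming a small synthetic dataset and a brute-force grid rather than an analytical solver: it evaluates $\mathcal{L}(\mu, \sigma)$ over a grid of candidate parameters and reports the maximizer.

```python
import numpy as np

# Hypothetical synthetic dataset, standing in for D in the text.
rng = np.random.default_rng(0)
D = rng.normal(loc=5.0, scale=2.0, size=500)
n = len(D)

def log_likelihood(mu, sigma2):
    """The Gaussian log-likelihood written out above (parameterised by sigma^2)."""
    return -0.5 * n * np.log(2 * np.pi * sigma2) - np.sum((D - mu) ** 2) / (2 * sigma2)

# Brute-force search over a grid of (mu, sigma^2) candidates.
mus = np.linspace(3.0, 7.0, 201)
sigma2s = np.linspace(1.0, 9.0, 201)
best_mu, best_sigma2 = max(
    ((m, s) for m in mus for s in sigma2s),
    key=lambda p: log_likelihood(*p),
)
```

Up to the grid resolution, `best_mu` lands at the sample mean and `best_sigma2` at the divide-by-$n$ variance of the data, matching the closed-form solutions below.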

Note that in this case we need to solve for $\mu$ and $\sigma$ simultaneously; this requires setting the partial derivatives of the log-likelihood function with respect to each of these variables equal to zero:

$$\begin{aligned} \frac{\partial \mathcal{L}(\mu, \sigma)}{\partial \mu} &= \frac{1}{\sigma^2}\sum_{i=1}^{n}(D_i-\mu) = 0 \\ \frac{\partial \mathcal{L}(\mu, \sigma)}{\partial \sigma^2} &= -\frac{n}{2\sigma^2} + \frac{1}{2}\left(\frac{1}{\sigma^2}\right)^2 \sum_{i=1}^{n}(D_i-\mu)^2 = 0 \end{aligned}$$

We can easily solve the first equation for $\mu$:

$$\begin{aligned} \sum_{i=1}^{n}(D_i-\mu) &= 0 \\ n\mu &= \sum_{i=1}^{n} D_i \\ \hat{\mu} &= \frac{1}{n}\sum_{i=1}^{n} D_i = \bar{X} \end{aligned}$$

Once again (and perhaps not surprisingly), we find that the MLE for $\mu$ is the sample mean. We can then substitute the maximum likelihood estimate $\mu = \bar{X}$ into the equation for $\frac{\partial \mathcal{L}(\mu, \sigma)}{\partial \sigma^2}$:

$$-n\sigma^2 + \sum_{i=1}^{n}(D_i-\bar{X})^2 = 0 \qquad\Longrightarrow\qquad \hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}(D_i-\bar{X})^2$$
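These closed-form estimates are easy to verify numerically. A minimal sketch (the dataset and parameters are invented for illustration) compares them against NumPy's built-ins; note that `np.var` defaults to `ddof=0`, which is exactly this divide-by-$n$ estimator:

```python
import numpy as np

rng = np.random.default_rng(1)
D = rng.normal(loc=10.0, scale=3.0, size=1_000)  # hypothetical sample
n = len(D)

mu_hat = D.sum() / n                         # MLE of mu: the sample mean
sigma2_hat = ((D - mu_hat) ** 2).sum() / n   # MLE of sigma^2: divide by n

assert np.isclose(mu_hat, np.mean(D))
assert np.isclose(sigma2_hat, np.var(D))     # np.var uses ddof=0 by default
```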

This is a useful property of the Normal distribution: its two parameters correspond directly to the mean and variance of the population. In other words, the sample mean and sample variance are the MLE estimates of these two parameters for the Normal distribution. Before stating this formally, we shall first correct for the bias in the expression for $\sigma^2$.

Correcting the Bias in the MLE of $\sigma^2$

Recall that when we derived $MSE(\hat{\theta})$ (Section 5.2.1), we defined a term known as the bias:

$$\operatorname{Bias}(\hat{\theta}) = E[\hat{\theta}] - \theta$$

Thus, the bias of an estimator is exactly zero when $E[\hat{\theta}] = \theta$; let's analyse this for our maximum likelihood estimator of $\sigma^2$:

$$\begin{aligned} E[\hat{\theta}] = E\left[\hat{\sigma}^2\right] &= E\left[\frac{1}{n}\sum_{i=1}^{n}(D_i-\bar{X})^2\right] \\ &= E\left[\frac{1}{n}\sum_{i=1}^{n}\left(D_i^2 - 2D_i\bar{X} + \bar{X}^2\right)\right] \\ &= E\left[\frac{1}{n}\left(\sum_{i=1}^{n} D_i^2 - n\bar{X}^2\right)\right] \\ &= \frac{1}{n}\left[\sum_{i=1}^{n} E\left[D_i^2\right] - nE\left[\bar{X}^2\right]\right] \end{aligned}$$

where the third line uses $\sum_{i=1}^{n} D_i = n\bar{X}$: the cross term becomes $-2n\bar{X}^2$, which combines with $\sum_i \bar{X}^2 = n\bar{X}^2$ to leave $-n\bar{X}^2$.

At this point we make use of the definition $\operatorname{Var}[X] = E[X^2] - E[X]^2$, rearranged as $E[X^2] = \operatorname{Var}[X] + E[X]^2$, to substitute for $E[D_i^2]$ (noting that each $D_i$ is drawn from the same distribution as $X$, so $E[D_i^2] = \sigma^2 + \mu^2$) and for $E[\bar{X}^2]$ (noting our earlier results for the sample mean: $E[\bar{X}] = \mu$ and $\operatorname{Var}[\bar{X}] = \sigma^2/n$):

$$\begin{aligned} E\left[\hat{\sigma}^2\right] &= \frac{1}{n}\left[\sum_{i=1}^{n}\left(\sigma^2+\mu^2\right) - n\left(\frac{\sigma^2}{n}+\mu^2\right)\right] \\ &= \frac{n-1}{n}\,\sigma^2 \end{aligned}$$
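The factor $\frac{n-1}{n}$ can be observed directly by simulation: averaging the MLE $\hat{\sigma}^2$ over many small samples, the average settles below the true variance. A sketch, with the true parameters and sample size chosen purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)
sigma2_true = 4.0      # true population variance (sigma = 2)
n = 5                  # small n makes the (n-1)/n shrinkage visible
trials = 200_000

samples = rng.normal(loc=0.0, scale=2.0, size=(trials, n))
sigma2_mle = samples.var(axis=1, ddof=0)   # divide-by-n MLE for each trial

# The average estimate sits near (n-1)/n * sigma^2 = 0.8 * 4 = 3.2, not 4.
print(sigma2_mle.mean())
```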

Thus we see that $E[\hat{\theta}] \neq \theta$! To correct for this bias in the maximum likelihood estimator, we need to multiply by $\frac{n}{n-1}$:

$$S^2 = \frac{n}{n-1}\cdot\hat{\sigma}^2 = \frac{n}{n-1}\cdot\frac{1}{n}\sum_{i=1}^{n}(D_i-\bar{X})^2 = \frac{1}{n-1}\sum_{i=1}^{n}(D_i-\bar{X})^2$$

We refer to $S^2$ as the sample variance. Finally, we define the sample mean (as before) and the sample standard deviation as:

$$\begin{aligned} \text{Sample mean: } \bar{X} &= \frac{1}{n}\sum_{i=1}^{n} D_i \\ \text{Sample standard deviation: } S &= \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(D_i-\bar{X})^2} \end{aligned}$$
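In NumPy, the divide-by-$(n-1)$ correction is controlled by the `ddof` ("delta degrees of freedom") argument: `ddof=0` gives the MLE, `ddof=1` gives the bias-corrected sample statistics. A short sketch with an invented sample:

```python
import numpy as np

rng = np.random.default_rng(3)
D = rng.normal(loc=0.0, scale=1.0, size=50)   # hypothetical sample
n = len(D)

x_bar = D.mean()                                  # sample mean
s = np.sqrt(((D - x_bar) ** 2).sum() / (n - 1))   # sample standard deviation

assert np.isclose(s, np.std(D, ddof=1))   # ddof=1 divides by n - 1
```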

This bias-corrected estimator is quite useful for several statistical tests, which we will see in the next chapter.