Single Sample Inferences about the Population Mean:
The previous sections described a general framework for hypothesis testing. The remaining sections treat the modelling of null distributions for common hypothesis tests in more detail. We begin by discussing methods for making claims about population means.
For instance, we may be interested in finding out whether the mean of a population has changed, given some new experimental design. We are concerned with testing the hypothesis $H_0: \mu = \mu_0$, that is, that the true population parameter $\mu$ is equal to $\mu_0$. The alternative hypothesis could be either simple or composite in this case. We will base our inference on $\bar{X}$, the MLE estimator of $\mu$.
Testing for Normal Population with Known Variance, $\sigma^2$
Let's start with a population that we know is Normally distributed, with some variance $\sigma^2$ that is known to us a priori. We are interested in modelling the mean of this population, $\mu$.
In order to estimate this mean, we collect a sample $X_1, X_2, \ldots, X_n$, each drawn independently from the true population distribution: $X_i \sim \mathcal{N}(\mu, \sigma^2)$. We then compute the sample mean:

$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$$
Recalling that the sum of several Normal random variables is also Normally distributed, we know the exact distribution of $\bar{X}$:

$$\bar{X} \sim \mathcal{N}\!\left(\mu, \frac{\sigma^2}{n}\right)$$
As we discussed in the previous chapter, we will use the Z-transformed distribution of $\bar{X}$ as our test statistic, since we know it follows the standard Normal distribution. We are particularly interested in modelling the null distribution (i.e. under the null hypothesis that the true population parameter is $\mu_0$) of the test statistic:

$$Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \sim \mathcal{N}(0, 1)$$
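As a quick sanity check, we can simulate this null distribution directly. The sketch below is a minimal illustration; the values of $\mu_0$, $\sigma$ and $n$ are arbitrary choices made for the demonstration, not taken from any example in this section.

```python
import numpy as np

# Hypothetical settings chosen purely for illustration.
mu_0 = 7.0      # mean under the null hypothesis
sigma = 0.5     # known population standard deviation
n = 15          # sample size
n_sims = 100_000

rng = np.random.default_rng(0)

# Draw many samples under H0 and compute the Z statistic for each.
samples = rng.normal(loc=mu_0, scale=sigma, size=(n_sims, n))
z = (samples.mean(axis=1) - mu_0) / (sigma / np.sqrt(n))

# Under H0, Z should have mean ~0, standard deviation ~1,
# and roughly 5% of |Z| values should exceed 1.96.
print(z.mean(), z.std(), np.mean(np.abs(z) > 1.96))
```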
Recall that $\bar{X}$ will generally change from one sample to another, even when the samples are drawn from the same population. To quantify the uncertainty associated with these estimates, we previously derived confidence intervals with a probability of $1 - \alpha$ of containing the population parameter $\mu$:

$$\left(\bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\; \bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right)$$
where $z_{\alpha/2}$ is the critical value of the standard Normal distribution satisfying $\mathbb{P}(Z > z_{\alpha/2}) = \alpha/2$.
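As a small illustration, such an interval can be computed with the standard Normal quantile function; the values of $\bar{x}$, $\sigma$ and $n$ below are made up for the sketch.

```python
import numpy as np
from scipy.stats import norm

# Hypothetical values, purely for illustration.
x_bar = 7.2    # observed sample mean
sigma = 0.5    # known population standard deviation
n = 15         # sample size
alpha = 0.05

# z_{alpha/2}: the upper alpha/2 quantile of the standard Normal.
z_crit = norm.ppf(1 - alpha / 2)          # ~1.96
half_width = z_crit * sigma / np.sqrt(n)

print((x_bar - half_width, x_bar + half_width))  # (1 - alpha) confidence interval
```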
In hypothesis testing, we are essentially performing the opposite calculation: we are interested in specifying those values of the test statistic that are unlikely to be observed if the true population parameter is $\mu_0$. In other words, we seek to define the critical region assuming the null hypothesis to be true (given some simple or composite alternative hypothesis).
Consider the case where the alternative hypothesis we are testing is $H_1: \mu > \mu_0$. Here we would reject the null hypothesis if the test statistic is improbably high (i.e. falls in the critical region) given a significance level $\alpha$. Recalling our tools for defining confidence intervals, we can define the critical value $z_\alpha$ to be such that:

$$\mathbb{P}(Z > z_\alpha) = \alpha$$
That is, we will reject the null hypothesis if $Z$ is greater than the critical value $z_\alpha$ for a given significance level $\alpha$:

$$Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} > z_\alpha$$
Using the standard Normal tables, we see that the critical value for $\alpha = 0.05$ is $z_{0.05} = 1.645$; therefore we would reject the null hypothesis if $Z > 1.645$, as the critical region for the hypothesis $H_1: \mu > \mu_0$ is $(1.645, \infty)$. A symmetrical line of reasoning applies for alternative hypotheses of the form $H_1: \mu < \mu_0$ (i.e. reject if $Z < -z_\alpha$).
However, for simple alternative hypotheses $H_1: \mu \neq \mu_0$, a two-tailed test is required. Assuming $\alpha = 0.05$, the rejection region would include both large positive and large negative values of $Z$, each tail spanning $\alpha/2 = 0.025$ of the probability space. From our standard Normal table, we define our rejection region to be $(-\infty, -1.96) \cup (1.96, \infty)$, since $z_{0.025} = 1.96$. As this particular rejection region is symmetric, we can summarise the rejection rule in terms of the absolute value of the test statistic: "reject $H_0$ if $|Z| > 1.96$".
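These critical values can also be obtained programmatically rather than from printed tables. A minimal sketch, assuming the usual $\alpha = 0.05$, using `scipy.stats.norm`:

```python
from scipy.stats import norm

alpha = 0.05

# One-sided test, H1: mu > mu_0 -> reject if Z > z_alpha.
z_one_sided = norm.ppf(1 - alpha)        # ~1.645

# Two-sided test, H1: mu != mu_0 -> reject if |Z| > z_{alpha/2}.
z_two_sided = norm.ppf(1 - alpha / 2)    # ~1.960

print(z_one_sided, z_two_sided)
```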
Example: Long Jump Distances
In 15 attempts, a long jumper records the following distances (in metres):
Suppose that the distances are normally distributed and the population standard deviation $\sigma$ is known. Are the distances consistent with a mean jump length of 7 metres, given a significance level $\alpha = 0.05$? What is the P-value?
Solution:
We first formally write down the hypotheses that we are interested in testing. In this example, we assume that the jump distances are normally distributed with some unknown mean and known variance, i.e. $X_i \sim \mathcal{N}(\mu, \sigma^2)$, where we know $\sigma^2$. We want to test the hypotheses:

$$H_0: \mu = \mu_0 \quad \text{vs.} \quad H_1: \mu \neq \mu_0$$
where $\mu_0 = 7$ (metres).
Since the variance of the population is known, we will use the standard Z-test to make claims about $\mu$. Our test statistic, $Z$, will therefore follow the standard Normal distribution under the null hypothesis and is defined as:

$$Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \sim \mathcal{N}(0, 1)$$
In this particular case we are testing a simple hypothesis, therefore a two-tailed test is required (i.e. each tail has a probability of $\alpha/2 = 0.025$). From the lookup table for the standard Normal distribution, given $\alpha/2 = 0.025$, we find that $\mathbb{P}(Z > 1.96) = 0.025$ and therefore $z_{0.025} = 1.96$. Symmetrically, $-z_{0.025} = -1.96$. The resulting rejection rule for our problem at a significance level of $0.05$ is: "we reject $H_0$ if the test statistic $|Z| > 1.96$". Given this rejection rule, we compute the mean of our sample, $\bar{x}$, and obtain the corresponding value of the test statistic:
Applying the rejection rule above, we reject the null hypothesis and claim that the jumps are not consistent with a mean jump length of 7 metres at a significance level of $0.05$. The following figure illustrates our test graphically, where the blue line corresponds to our computed value of the test statistic:
By definition, the P-value is the lowest significance level at which the null hypothesis will be rejected. In order to find it, we consult the standard Normal tables again to obtain $\mathbb{P}(Z > |z|)$, where $z$ is the computed value of the test statistic. Since a two-tailed test has been used in our case, we need to be aware that, by symmetry, $\mathbb{P}(Z < -|z|) = \mathbb{P}(Z > |z|)$, and our P-value is the sum of these two probabilities: P-value $= 2\,\mathbb{P}(Z > |z|)$.
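The same calculation can be carried out in code. The sketch below is a generic helper; the sample values and $\sigma$ shown are placeholders for illustration only, not the data from this example.

```python
import numpy as np
from scipy.stats import norm

def z_test_two_sided(sample, mu_0, sigma):
    """Two-sided Z-test for a Normal mean with known standard deviation."""
    sample = np.asarray(sample, dtype=float)
    n = len(sample)
    z = (sample.mean() - mu_0) / (sigma / np.sqrt(n))
    p_value = 2 * norm.sf(abs(z))   # P(Z > |z|) + P(Z < -|z|)
    return z, p_value

# Placeholder data, purely for illustration.
jumps = [7.1, 6.8, 7.3, 7.0, 7.2, 6.9, 7.4, 7.1, 7.0, 7.2, 6.8, 7.3, 7.1, 7.0, 7.2]
print(z_test_two_sided(jumps, mu_0=7.0, sigma=0.25))
```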
Testing for Normal Population with Unknown Variance, $\sigma^2$: Student's t-test
The previous section provides a convenient framework for dealing with means of normally distributed data when the variance is known. We noted when constructing confidence intervals that this is unlikely to be the case for real-world problems, and that if we use the sample standard deviation $s$ as an approximation for $\sigma$ in computing the confidence intervals, then we require the Student $t$-distribution to accurately model the distribution of the statistic. Thus, when the population variance is not known, we will represent the null distribution for the test statistic as:

$$T = \frac{\bar{X} - \mu_0}{s/\sqrt{n}} \sim t_{n-1}$$
Again, remember that $t_{n-1}$ is the Student $t$-distribution with $n-1$ degrees of freedom, and that it converges to the standard Normal in the limit $n \to \infty$. We can therefore define the critical values $t_{n-1,\alpha}$ for a given significance level $\alpha$ by:

$$\mathbb{P}(T > t_{n-1,\alpha}) = \alpha$$
For example, to compute a one-sided test (i.e. $H_1: \mu > \mu_0$) for a significance level $\alpha = 0.05$ and a sample size of $n = 15$, we would use the lookup table for the Student $t$-distribution to find $t_{14,0.05}$ such that $\mathbb{P}(T > t_{14,0.05}) = 0.05$:

$$t_{14,0.05} = 1.761$$
Thus, the rejection region for $H_1: \mu > \mu_0$ is $(1.761, \infty)$.
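The corresponding quantiles can also be obtained with `scipy.stats.t` rather than a printed table; a minimal sketch, assuming $\alpha = 0.05$ and $n = 15$ (i.e. 14 degrees of freedom):

```python
from scipy.stats import t

alpha = 0.05
n = 15
df = n - 1   # degrees of freedom

t_one_sided = t.ppf(1 - alpha, df)       # ~1.761, for H1: mu > mu_0
t_two_sided = t.ppf(1 - alpha / 2, df)   # ~2.145, for H1: mu != mu_0

print(t_one_sided, t_two_sided)
```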
Example: Long Jump Distances, when variance is unknown
Suppose that we do not know the true population variance in the long jump distances example described in the previous section. Are the jump distances consistent with the mean jump length of 7 metres, given a significance level of 0.05? What is the P-value?
Solution:
We use the same experimental set-up as in the previous example. Our test statistic is now defined in terms of the sample variance and follows the Student $t$-distribution:

$$T = \frac{\bar{X} - \mu_0}{s/\sqrt{n}} \sim t_{14}$$
We can obtain critical values for the two-sided test under the Student $t$-distribution with 14 degrees of freedom using the lookup table procedure described above for $\alpha/2 = 0.025$:

$$\mathbb{P}(T > t_{14,0.025}) = 0.025$$
Therefore $t_{14,0.025} = 2.145$ and similarly $-t_{14,0.025} = -2.145$, allowing us to write down the rejection rule: "reject the null hypothesis if $|T| > 2.145$".
To compute the test statistic, $T$, we first need to compute the sample standard deviation $s$ from the sample:

$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2$$
The equation above gives us the value of $s$, and hence the value of the test statistic. Since $|t| < 2.145$, we cannot reject the null hypothesis that the population mean is 7 metres. Graphically, this can be illustrated as follows:
We can use the CDF of the Student $t$-distribution to compute the P-value, by noticing that $\mathbb{P}(T > |t|) = \mathbb{P}(T < -|t|)$ and summing the two tail probabilities; the resulting P-value is approximately 0.06.
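In practice, this whole calculation is a one-liner with `scipy.stats.ttest_1samp`, which returns both the t statistic and the two-sided P-value. A minimal sketch; the sample below is a placeholder, not the omitted data from this example.

```python
import numpy as np
from scipy.stats import ttest_1samp

# Placeholder data, purely for illustration.
jumps = np.array([7.1, 6.8, 7.3, 7.0, 7.2, 6.9, 7.4, 7.1,
                  7.0, 7.2, 6.8, 7.3, 7.1, 7.0, 7.2])

result = ttest_1samp(jumps, popmean=7.0)   # two-sided by default
print(result.statistic, result.pvalue)

# Reject H0 at significance level 0.05 if the P-value is below 0.05.
print("reject H0" if result.pvalue < 0.05 else "cannot reject H0")
```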
Testing for Non-Normal Population, Large Sample Size
If we do not know the distribution of the population (or, in general, we do not think the distribution of the population is Normal), but the sample size is large enough, we can assume that the sample mean is asymptotically Normal by the Central Limit Theorem. Essentially, this means that z- and t-tests can still be applied.
Similarly, at a large $n$, the sample variance $s^2$ becomes a good estimator of the true population variance $\sigma^2$ (and, in fact, the $t_{n-1}$ distribution converges to the standard Normal), meaning that we can use the $\mathcal{N}(0,1)$ distribution as our null distribution, plugging in $s$ for $\sigma$.
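A minimal sketch of this large-sample approximation, working from summary statistics; the function name, interface and numbers below are illustrative assumptions only.

```python
import numpy as np
from scipy.stats import norm

def large_sample_z_test(x_bar, s, n, mu_0):
    """Approximate two-sided Z-test using the sample standard deviation
    as a plug-in for sigma; valid for large n by the CLT."""
    z = (x_bar - mu_0) / (s / np.sqrt(n))
    p_value = 2 * norm.sf(abs(z))
    return z, p_value

# Hypothetical summary statistics, purely for illustration.
print(large_sample_z_test(x_bar=102.0, s=15.0, n=200, mu_0=100.0))
```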
Example: Apple Farmer
An apple farmer knows that a particular variety of tree should yield an average of 21.6 apples per tree. She is concerned that her trees of this variety are not performing as well as they should. From 400 trees, the sample mean apple yield this year was 20.3 apples, with a sample standard deviation of 8.4 apples. Should she be worried? What is the obvious flaw in this experiment? Assume the significance level is $\alpha = 0.05$.
Solution:
The yield of each apple tree could be modelled by a Poisson distribution. Each tree would most likely have a different parameter for this distribution, depending on the soil, height and age of the tree, but we will assume that it is the same for all the trees in the population (this is the flaw of the experiment). Whilst each of the random variables is Poisson-distributed, their mean is asymptotically Normal. We are interested in testing the hypotheses $H_0: \mu = 21.6$ (i.e. the apple trees behave as expected) and $H_1: \mu \neq 21.6$ (i.e. there is something wrong with the apple trees), given that we observed $\bar{x} = 20.3$ apples on average. A sample of $n = 400$ apple trees is large enough for the Z-test to be applicable under the CLT. Under the null hypothesis, our test statistic follows the standard Normal:

$$Z = \frac{\bar{X} - \mu_0}{s/\sqrt{n}} \sim \mathcal{N}(0, 1)$$
From our previous examples, we know that the critical values for a two-tailed Z-test at significance level $\alpha = 0.05$ are $-1.96$ and $1.96$. Therefore we would reject the null hypothesis if $|Z| > 1.96$. Substituting the numbers in, we get:

$$z = \frac{20.3 - 21.6}{8.4/\sqrt{400}} = \frac{-1.3}{0.42} \approx -3.10$$
Since $|z| = 3.10 > 1.96$, we claim that there is sufficient evidence to reject the null hypothesis at $\alpha = 0.05$.
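As a quick numeric check of the calculation above (using the figures given in the question), the corresponding approximate P-value can also be computed:

```python
import numpy as np
from scipy.stats import norm

x_bar, s, n, mu_0 = 20.3, 8.4, 400, 21.6

z = (x_bar - mu_0) / (s / np.sqrt(n))   # ~ -3.10
p_value = 2 * norm.sf(abs(z))           # two-sided P-value, ~0.002

print(z, p_value)
```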