Single Sample Inferences about the Population Mean

Single Sample Inferences about the Population Mean: $H_{0}: \mu=\mu_{0}$

The previous sections described a general framework for hypothesis testing. The remaining sections will provide a special treatment for the modelling of null distributions for common hypothesis tests. We begin by discussing methods for making claims about population means.

For instance, we may be interested in finding out whether the mean of a population has changed, given some new experimental design. We are concerned with testing the hypothesis $H_{0}: \mu=\mu_{0}$, that is, that the true population parameter is equal to $\mu_{0}$. The alternative hypothesis could be either simple or composite in this case. We will base our inference on $\bar{X}$, the MLE estimator of $\mu$.

Testing $H_{0}: \mu=\mu_{0}$ for Normal Population with Known Variance, $\sigma^{2}$

Let's start with a population that we know is Normally distributed, with some variance $\sigma^{2}$ that is known to us a priori. We are interested in modelling the mean of this population, $\mu$.

In order to estimate this mean, we collect a sample $D=\left(D_{1}, D_{2}, \ldots, D_{n}\right)$, each observation drawn independently from the true population distribution: $D_{i} \sim N\left(\mu, \sigma^{2}\right)$. We then compute the sample mean:

$$\bar{X}=\frac{1}{n} \sum_{i=1}^{n} D_{i}$$

Recalling that a sum of independent Normal random variables is also Normally distributed, we know the exact distribution of $\bar{X}$:

$$\bar{X} \sim N\left(\mu, \sigma^{2} / n\right)$$

As we discussed in the previous chapter, we will use the Z-transformed distribution of $\bar{X}$ as our test statistic $T$, since we know it follows the standard Normal distribution. We are particularly interested in modelling the null distribution of the test statistic (i.e. its distribution under the null hypothesis $H_{0}$ that the true population parameter is $\mu=\mu_{0}$):

$$T=\frac{\bar{X}-\mu_{0}}{\sigma / \sqrt{n}} \sim N(0,1)$$

Recall that $\bar{X}$ will generally differ for each sample $D$ selected from the same population. To quantify the uncertainty associated with these estimates, we previously derived confidence intervals with a probability of $(1-\alpha)$ of containing the population parameter $\mu_{0}$:

$$P\left[\bar{X}-z_{\alpha / 2} \cdot \frac{\sigma}{\sqrt{n}} \leq \mu_{0} \leq \bar{X}+z_{\alpha / 2} \cdot \frac{\sigma}{\sqrt{n}}\right]=(1-\alpha)$$

where

$$P\left(Z \leq z_{\alpha / 2}\right)=(1-\alpha / 2)$$
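Numerically, the interval above can be computed with Python's standard library, where `NormalDist().inv_cdf` plays the role of the standard Normal lookup table. This is a minimal sketch; the sample values plugged in at the end are purely illustrative.

```python
from statistics import NormalDist
from math import sqrt

def confidence_interval(x_bar, sigma, n, alpha=0.05):
    """(1 - alpha) confidence interval for the mean, with known sigma."""
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    half_width = z * sigma / sqrt(n)
    return (x_bar - half_width, x_bar + half_width)

# Illustrative values: x_bar = 7.32, sigma = 0.58, n = 15
lo, hi = confidence_interval(7.32, 0.58, 15)
print(round(lo, 3), round(hi, 3))
```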

In hypothesis testing, we are essentially performing the opposite calculation, in that we are interested in specifying those values of $T$ that are not likely to be observed assuming we know the true population parameter is $\mu=\mu_{0}$. In other words, we seek to define the critical region assuming the null hypothesis to be true (given some simple or composite alternative hypothesis).

Consider the case where the alternative hypothesis we are testing is $H_{1}: \mu>\mu_{0}$. Here we would reject the null hypothesis if the test statistic $T$ is improbably high (i.e. falls in the critical region) given a significance level $\alpha$. Recalling our tools for defining confidence intervals, we can define the critical value to be $c=z_{\alpha}$ such that:

$$P\left(T \geq z_{\alpha}\right)=P\left(T \leq-z_{\alpha}\right)=\alpha$$

That is, we will reject the null hypothesis if $T$ is greater than some critical value $c=z_{\alpha}$ for a given significance level $\alpha$:

$$\alpha=P(\text{Type I error})=P\left(\text{Reject } H_{0} \mid H_{0} \text{ is true}\right)=P\left(T>z_{\alpha} \mid \mu=\mu_{0}\right)$$

Using the standard Normal tables, we see that the critical value for $\alpha=0.05$ is $c=z_{\alpha}=1.645$; therefore we would reject the null hypothesis $H_{0}$ if $T>1.645$, as the critical region for the hypothesis is $(1.645, \infty)$. A symmetrical line of reasoning applies for alternative hypotheses of the form $H_{1}: \mu<\mu_{0}$ (i.e. reject $H_{0}$ if $T<-z_{\alpha}$, given $\mu=\mu_{0}$).

However, for a two-sided alternative hypothesis $\left(H_{1}: \mu \neq \mu_{0}\right)$, a two-tailed test is required. Assuming $\alpha=0.05$, the rejection region would include both large positive and large negative values of $T$, each tail spanning $\frac{\alpha}{2}$ of the probability space. From our standard Normal table, we define our rejection region $R$ to be $R \equiv(-\infty,-1.96) \cup(1.96, \infty)$, since $z_{\alpha / 2}=1.96$. As this particular rejection region is symmetric, we can summarise the rejection rule in terms of the absolute value of the test statistic: "reject $H_{0}$ if $|T|>1.96$".
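The tabled critical values quoted above can be reproduced with the standard library's `statistics.NormalDist`; a short sketch of the lookup for both the one- and two-tailed cases:

```python
from statistics import NormalDist

alpha = 0.05
z = NormalDist()

# One-tailed critical value: P(T >= c) = alpha
c_one_tailed = z.inv_cdf(1 - alpha)        # ~1.645
# Two-tailed critical value: each tail holds alpha/2
c_two_tailed = z.inv_cdf(1 - alpha / 2)    # ~1.96

print(round(c_one_tailed, 3), round(c_two_tailed, 2))
```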

Example: Long Jump Distances

In 15 attempts, a long jumper records the following distances (in metres):

$$\begin{array}{ccccc}
7.48 & 7.34 & 7.97 & 5.88 & 7.48 \\
7.67 & 7.49 & 7.48 & 8.21 & 6.54 \\
7.13 & 7.65 & 7.85 & 6.95 & 6.68
\end{array}$$

Suppose that the distances are normally distributed and the population standard deviation is $\sigma=0.58 \mathrm{~m}$. Are the distances consistent with a mean jump length of $7 \mathrm{~m}$, given a significance level $\alpha=0.05$? What is the P-value?

Solution:

We first formally write down the hypotheses that we are interested in testing. In this example, we assume that the jump distances are normally distributed with some unknown mean and known variance, i.e. $N\left(\mu, \sigma^{2}\right)$, where we know $\sigma=0.58$. We want to test the hypotheses:

$$H_{0}: \mu=\mu_{0} \quad H_{1}: \mu \neq \mu_{0}$$

where $\mu_{0}=7$ (metres).

Since the variance of the population is known, we will use the standard Z-test to make claims about $H_{0}$. Our test statistic, $T$, follows the standard Normal distribution under the null hypothesis and is defined as:

$$T=\frac{\bar{X}-\mu_{0}}{\sigma / \sqrt{n}} \sim N(0,1)$$

In this particular case we are testing a two-sided alternative hypothesis, therefore a two-tailed test is required (i.e. each tail has a probability of $\alpha / 2$). From the lookup table for the standard Normal distribution, given $\alpha=0.05$ we find that $z_{\alpha / 2}=1.96$ and therefore $P(X<-1.96) \approx 0.025=\frac{\alpha}{2}$. Symmetrically, $P(X>1.96) \approx 0.025$. The resulting rejection rule for our problem at a significance level of $\alpha=0.05$ is: "we reject $H_{0}$ if the test statistic $|T|>1.96$". Given this rejection rule, we compute the mean of our sample, $\bar{X}=\frac{109.8}{15}=7.32$, and therefore our test statistic $T$ is equal to:

$$T=\frac{7.32-7}{0.58 / \sqrt{15}} \approx 2.14$$

Applying the rejection rule above, we reject the null hypothesis and claim the jumps are not consistent with a mean jump length of 7 metres at a significance level of $\alpha=0.05$. The following figure illustrates our test graphically, where the blue line corresponds to our computed value $T$:

By definition, the P-value is the lowest significance level at which the null hypothesis will be rejected. In order to find it, we look at the statistical probability tables again, where we see that $P(Z>2.14)=0.0162$. Since a two-tailed test has been used in our case, we also need $P(Z<-2.14)=0.0162$, and our P-value is the sum of these two probabilities: P-value $=P(Z>2.14)+P(Z<-2.14)=0.0324$.
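The whole calculation can be checked end-to-end with Python's standard library. This is a sketch, not part of the original worked solution; the small rounding differences against the table values are expected.

```python
from statistics import NormalDist, mean
from math import sqrt

distances = [7.48, 7.34, 7.97, 5.88, 7.48,
             7.67, 7.49, 7.48, 8.21, 6.54,
             7.13, 7.65, 7.85, 6.95, 6.68]

mu0, sigma, n = 7.0, 0.58, len(distances)
x_bar = mean(distances)                   # sample mean, 7.32
T = (x_bar - mu0) / (sigma / sqrt(n))     # Z statistic, ~2.14

# Two-sided P-value: probability mass in both tails beyond |T|
p_value = 2 * (1 - NormalDist().cdf(abs(T)))

print(round(x_bar, 2), round(T, 2), round(p_value, 4))
```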

Testing $H_{0}: \mu=\mu_{0}$ for Normal Population with Unknown Variance, $\sigma^{2}$: Student's t-test

The previous section provides a convenient framework for dealing with means of normally distributed data when the variance $\sigma^{2}$ is known. We noted when constructing confidence intervals that this is unlikely to be the case for real-world problems, and that if we use the sample standard deviation $S$ as an approximation for $\sigma$ in computing the confidence intervals, then we require the Student $t$-distribution to accurately model the distribution of the statistic. Thus, when the population variance is not known, we will represent the null distribution for the test statistic as:

$$T=\frac{\bar{X}-\mu_{0}}{S / \sqrt{n}} \sim t_{n-1}$$

Again, recall that $t_{n-1}$ is the Student $t$-distribution with $n-1$ degrees of freedom, which converges to the standard Normal in the limit $n \rightarrow \infty$. We can therefore define the critical values for a given significance level $\alpha$ by:

$$c=t_{\alpha, n-1} \text{ such that } P\left(T \geq t_{\alpha, n-1}\right)=\alpha$$

For example, to conduct a one-sided test (i.e. $H_{1}: \mu<\mu_{0}$) at a significance level $\alpha=0.01$ with a sample size of $n=13$, we would use the lookup table for the Student $t$-distribution to find $t_{0.01,12}$ such that $P\left(T \geq t_{0.01,12}\right)=0.01$, which gives $t_{0.01,12}=2.681$. By the symmetry of the $t$-distribution, the critical value for this lower-tailed test is:

$$c=-t_{0.01,12}=-2.681$$

Thus, the rejection region for $T$ is $(-\infty,-2.681)$.
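Where a $t$-table is not to hand, the same critical value can be obtained programmatically. The sketch below assumes SciPy is available; `scipy.stats.t.ppf` is the inverse CDF of the Student $t$-distribution:

```python
from scipy.stats import t

alpha, df = 0.01, 12

# P(T >= t_{0.01,12}) = 0.01  =>  t_{0.01,12} = ppf(1 - alpha, df)
t_upper = t.ppf(1 - alpha, df)   # ~2.681, the tabled value
c = -t_upper                     # lower-tail critical value for H1: mu < mu0

print(round(t_upper, 3), round(c, 3))
```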

Example: Long Jump Distances, when variance is unknown

Suppose that we do not know the true population variance in the long jump distances example described in the previous section. Are the jump distances consistent with the mean jump length of 7 metres, given a significance level of 0.05 ? What is the P-value?

Solution:

We use the same experimental setup as in the previous example. Our test statistic is now defined in terms of the sample standard deviation $S$ and follows the Student $t$-distribution:

$$T=\frac{\bar{X}-\mu_{0}}{S / \sqrt{n}} \sim t_{n-1}$$

We can obtain critical values for the two-sided test under the Student $t$-distribution with 14 degrees of freedom using the lookup table procedure described above for $\alpha / 2=0.025$:

$$P\left(T \geq t_{0.025,14}\right)=0.025 \rightarrow c=t_{0.025,14}=2.1448$$

Therefore $P(T<-2.145) \approx 0.025$ and similarly $P(T>2.145) \approx 0.025$, allowing us to write down the rejection rule: "reject the null hypothesis if $|T|>2.145$".

To compute the test statistic, $T$, we first need to compute $S$ from the sample:

$$S=\sqrt{\frac{1}{n-1} \sum_{i=1}^{n}\left(X_{i}-\bar{X}\right)^{2}}$$

The equation above gives us $S \approx 0.6034$, resulting in a test statistic of $T \approx 2.05$. Since $2.05<2.145$, we cannot reject the null hypothesis that the population mean is 7 metres. Graphically, this can be illustrated as follows:

We can use the CDF of the Student $t$-distribution to compute the P-value, by noticing that $P(T>2.05) \approx 0.03$ and, similarly, $P(T<-2.05) \approx 0.03$; therefore the P-value is approximately 0.06.
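As a quick check, $S$ and $T$ can be verified with the standard library's `statistics` module, comparing $|T|$ against the tabled critical value 2.145 quoted above (this sketch sidesteps the $t$ CDF itself):

```python
from statistics import mean, stdev
from math import sqrt

distances = [7.48, 7.34, 7.97, 5.88, 7.48,
             7.67, 7.49, 7.48, 8.21, 6.54,
             7.13, 7.65, 7.85, 6.95, 6.68]

mu0, n = 7.0, len(distances)
x_bar = mean(distances)            # 7.32
S = stdev(distances)               # sample standard deviation, ~0.6034
T = (x_bar - mu0) / (S / sqrt(n))  # t statistic, ~2.05

reject = abs(T) > 2.145            # tabled critical value t_{0.025,14}
print(round(S, 4), round(T, 2), reject)
```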

Testing $H_{0}: \mu=\mu_{0}$ for Non-Normal Population, Large Sample Size

If we do not know the distribution of the population (or, in general, we do not think the distribution of the population is Normal), but the sample size $n$ is large enough, we can assume that the sample mean is asymptotically Normal by the Central Limit Theorem. Essentially, this means that z- and t-tests can still be applied.

Similarly, at large $n$, the sample standard deviation $S$ becomes a good estimator of the true population standard deviation (and, in fact, the $t_{n-1}$ distribution converges to the standard Normal), meaning that we can use the Z distribution as our null distribution, plugging in $S$ for $\sigma$.

Example: Apple Farmer

An apple farmer knows that a particular variety of tree should yield an average of 21.6 apples per tree. She is concerned that her trees of this variety are not performing as well as they should. From 400 trees, the sample mean apple yield this year was 20.3 apples with sample standard deviation 8.4 apples. Should she be worried? What is the obvious flaw in this experiment? Assume the significance level is $\alpha=0.05$.

Solution:

The yield of each apple tree could be modelled by a Poisson distribution. Each tree would most likely have a different parameter for this distribution, depending on the soil, height, and age of the tree, but we will assume that it is the same for all the trees in the population (this is the flaw of the experiment). Whilst each of the random variables is Poisson-distributed, their mean is asymptotically Normal. We are interested in testing the hypotheses $H_{0}: \mu=21.6$ (i.e. the apple trees behave as expected) and $H_{1}: \mu \neq 21.6$ (i.e. there is something wrong with the apple trees), given that we observed $\bar{X}=20.3$ apples on average. A sample of $n=400$ apple trees is large enough for the Z-test to be applicable under the CLT. Under the null hypothesis, our test statistic $T=\frac{\bar{X}-21.6}{8.4 / \sqrt{n}}$ follows the standard Normal:

$$T=\frac{\bar{X}-21.6}{8.4 / \sqrt{400}} \sim N(0,1)$$

From our previous examples, we know that the critical values for a two-tailed Z-test at significance level $\alpha=0.05$ are $c_{\text{lower}}=-1.96$ and $c_{\text{upper}}=1.96$. Therefore we would reject the null hypothesis if $|T|>1.96$. Substituting the numbers in, we get:

$$T=\frac{\bar{X}-21.6}{8.4 / \sqrt{400}}=\frac{20.3-21.6}{8.4 / \sqrt{400}} \approx-3.0952$$

Since $|T|>1.96$, we claim that there is sufficient evidence to reject the null hypothesis at $\alpha=0.05$.
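The large-sample Z-test above can be sketched with the standard library as follows; the two-sided P-value is an extra step not asked for by the question, included only for completeness:

```python
from statistics import NormalDist
from math import sqrt

mu0, x_bar, s, n = 21.6, 20.3, 8.4, 400

# Z statistic with S plugged in for sigma (justified by the large n)
T = (x_bar - mu0) / (s / sqrt(n))        # ~ -3.0952
p_value = 2 * NormalDist().cdf(-abs(T))  # two-sided P-value

reject = abs(T) > 1.96
print(round(T, 4), reject)
```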