Hypotheses about Distributions: Goodness-of-Fit Tests

The final application of statistical hypothesis testing that we will consider is testing whether sampled data could be modelled by some distribution. For instance, we may be interested to see if the model we came up with is actually representative of a given data set. The goodness-of-fit test considers the following hypotheses:

$$
\begin{aligned}
& H_0: \text{ the population follows the chosen distribution} \\
& H_1: \text{ the population does not follow this distribution}
\end{aligned}
$$

Essentially we are testing to see if there is any evidence in the sample that the chosen distribution is a bad fit. Note that evidence for rejecting a particular distributional model does not point to why it fails, and thus offers no guidance for finding an alternative (i.e. better-fitting) model.

Let's illustrate this concept with the help of an example. Suppose that for a particular experiment, we suspect that the number of particles suspended in a dusty gas could be modelled using a Poisson distribution. In order to verify this, we momentarily flash a light onto a microscope field, count the particles seen, and repeat this a number of times (143 in total). We record our findings in the following table:

| Number of particles seen $(x)$ | 0 | 1 | 2 | 3 | 4 | 5 | $\geq 6$ | Total |
|---|---|---|---|---|---|---|---|---|
| Frequency $(n_x)$ | 34 | 46 | 38 | 19 | 4 | 2 | 0 | 143 |

Please note how we have discretised the infinite range of values that $x$ can take into $k = 7$ bins. The test statistic for goodness-of-fit tests relies on this discretisation of the probability distribution. Namely, Pearson's chi-squared statistic is defined as:

$$X^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$

where:

- $k$ is the number of discrete bins for the distribution (in the example $k = 7$)
- $O_i$ is the number of observations falling into the $i$-th bin (i.e. the frequency row in the table)
- $E_i$ is the number of observations expected to fall into the $i$-th bin, given the fitted model
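The statistic translates directly into code. Here is a minimal Python sketch (the function name `pearson_chi2` is ours, chosen for illustration):

```python
def pearson_chi2(observed, expected):
    """Pearson's chi-squared statistic: sum over bins of (O_i - E_i)^2 / E_i."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))
```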

In order to compute the expected counts $(E_i)$, we first fit the distribution of choice to the data. Since we are using a Poisson distribution in our example, we only have one parameter to fit: $\lambda$. We therefore compute its maximum likelihood estimate (MLE), which for a Poisson distribution is simply the sample mean:

$$\hat{\lambda} = \frac{0 \times 34 + 1 \times 46 + 2 \times 38 + 3 \times 19 + 4 \times 4 + 5 \times 2}{143} = \frac{205}{143} \approx 1.43357$$
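As a quick sanity check, here is a minimal Python sketch of this computation (variable names are illustrative, not from the text):

```python
# Poisson MLE of lambda: the sample mean of the observations.
values = [0, 1, 2, 3, 4, 5]      # particle counts seen (the >= 6 bin is empty)
freqs = [34, 46, 38, 19, 4, 2]   # how often each count occurred

n = sum(freqs)                   # 143 observations in total
lam_hat = sum(v * f for v, f in zip(values, freqs)) / n
print(lam_hat)                   # ~1.43357
```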

We can now compute the probabilities associated with each of the bins, $P(x \in B_i \mid \lambda)$.

| Bin $(i)$ | $P(x \in B_i \mid \lambda)$ | Probability $(p_i)$ |
|---|---|---|
| 0 | $P(X = 0 \mid \lambda)$ | 0.2385 |
| 1 | $P(X = 1 \mid \lambda)$ | 0.3418 |
| 2 | $P(X = 2 \mid \lambda)$ | 0.2450 |
| 3 | $P(X = 3 \mid \lambda)$ | 0.1171 |
| 4 | $P(X = 4 \mid \lambda)$ | 0.0420 |
| 5 | $P(X = 5 \mid \lambda)$ | 0.0120 |
| $\geq 6$ | $1 - \sum_{j=0}^{5} P(X = j \mid \lambda)$ | 0.0036 |

If we were to randomly place the values into the bins according to the probabilities $p_i$ specified in the table, we would expect to obtain $n \times p_i$ items in each bin, where $n$ is the total number of items placed (i.e. $n = 143$ in this particular example). We will not prove this, but the result comes from the multinomial distribution, a generalisation of the binomial distribution to more than two outcomes.
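Assuming SciPy is available, the bin probabilities and the resulting expected counts can be sketched as follows (`scipy.stats.poisson` supplies the Poisson pmf; variable names are ours):

```python
from scipy.stats import poisson

lam_hat = 205 / 143               # the MLE computed above, ~1.43357
n = 143                           # total number of observations

# Probabilities for bins 0..5, plus the lumped ">= 6" tail bin.
p = [poisson.pmf(k, lam_hat) for k in range(6)]
p.append(1 - sum(p))              # P(X >= 6), so the p_i sum to exactly 1

expected = [n * p_i for p_i in p] # E_i = n * p_i, sums to n by construction
for i, e in enumerate(expected):
    print(i, round(e, 4))         # ~34.10, 48.88, 35.04, 16.74, 6.00, 1.72, 0.51
```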

After computing each of the $E_i$ values, we can augment the table:

| Number of particles seen $(x)$ | 0 | 1 | 2 | 3 | 4 | 5 | $\geq 6$ | Total |
|---|---|---|---|---|---|---|---|---|
| Observed $(O_i)$ | 34 | 46 | 38 | 19 | 4 | 2 | 0 | 143 |
| Expected $(E_i)$ | 34.0993 | 48.8837 | 35.0390 | 16.7436 | 6.001 | 1.7205 | 0.5131 | 143 |
| Squared difference $(O_i - E_i)^2$ | 0.0099 | 8.3156 | 8.7675 | 5.0914 | 4.0030 | 0.0781 | 0.2633 | — |

Note how the $E_i$ values are fractional, but still sum to the same total $n$. From this table we compute the value of our test statistic $X^2$:

$$X^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i} = 1.9503$$
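Plugging the tabulated values in directly reproduces this figure; note that tiny discrepancies can appear because the tabulated $E_i$ are already rounded:

```python
observed = [34, 46, 38, 19, 4, 2, 0]
expected = [34.0993, 48.8837, 35.0390, 16.7436, 6.001, 1.7205, 0.5131]

x2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(round(x2, 4))  # ~1.950 (vs. 1.9503 above; the difference is rounding in E_i)
```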

Under the null hypothesis, Pearson's chi-squared test statistic follows a chi-squared distribution with $k - p - 1$ degrees of freedom, where $k$ is the number of bins and $p$ is the number of parameters in the fitted distribution (here $p = 1$, since we only fitted $\lambda$):

$$X^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i} \sim \chi^2_{k-p-1}$$

We complete the test by consulting the tables of the chi-squared distribution. Note that this is always a right-tailed test, and therefore we reject the null hypothesis if the test statistic $X^2$ is strictly greater than the critical value for a given significance level.

In the dust particle example, the critical value for $\chi^2_{7-1-1} = \chi^2_{5}$ at the $\alpha = 0.05$ significance level is 11.070. In fact, the value $X^2 = 1.9503$ corresponds to a p-value of 0.8569 under this distribution. We therefore cannot reject the null hypothesis, and conclude that the data can plausibly be modelled using a Poisson distribution.
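For reference, the entire test can be reproduced with SciPy: `scipy.stats.chisquare` accepts a `ddof` argument that removes extra degrees of freedom for fitted parameters. A minimal sketch, assuming the expected counts are computed exactly as above (so that the observed and expected totals match, which `chisquare` requires):

```python
from scipy.stats import chi2, chisquare, poisson

observed = [34, 46, 38, 19, 4, 2, 0]
n = sum(observed)                  # 143
lam_hat = 205 / 143                # the fitted MLE from earlier

# Expected counts from the fitted Poisson model; they sum to n exactly.
p = [poisson.pmf(k, lam_hat) for k in range(6)]
p.append(1 - sum(p))               # lumped ">= 6" tail bin
expected = [n * p_i for p_i in p]

# ddof=1 accounts for the one fitted parameter, so df = k - 1 - 1 = 5.
x2, p_value = chisquare(observed, f_exp=expected, ddof=1)
critical = chi2.ppf(0.95, df=5)    # ~11.070 at alpha = 0.05

print(x2, p_value, critical)       # X^2 ~ 1.95 < 11.070, p ~ 0.857: cannot reject H0
```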