Goodness of fit test - overview

This page offers structured overviews of one or more selected methods. Add additional methods for comparisons by clicking on the dropdown button in the right-hand column. To practice with a specific method click the button at the bottom row of the table

Goodness of fit test
Paired sample $t$ test
Spearman's rho
Friedman test
Paired sample $t$ test
Independent variableIndependent variableVariable 1Independent/grouping variableIndependent variable
None2 paired groupsOne of ordinal levelOne within subject factor ($\geq 2$ related groups)2 paired groups
Dependent variableDependent variableVariable 2Dependent variableDependent variable
One categorical with $J$ independent groups ($J \geqslant 2$)One quantitative of interval or ratio levelOne of ordinal levelOne of ordinal levelOne quantitative of interval or ratio level
Null hypothesisNull hypothesisNull hypothesisNull hypothesisNull hypothesis
  • H0: the population proportions in each of the $J$ conditions are $\pi_1$, $\pi_2$, $\ldots$, $\pi_J$
or equivalently
  • H0: the probability of drawing an observation from condition 1 is $\pi_1$, the probability of drawing an observation from condition 2 is $\pi_2$, $\ldots$, the probability of drawing an observation from condition $J$ is $\pi_J$
H0: $\mu = \mu_0$

Here $\mu$ is the population mean of the difference scores, and $\mu_0$ is the population mean of the difference scores according to the null hypothesis, which is usually 0. A difference score is the difference between the first score of a pair and the second score of a pair.
H0: $\rho_s = 0$

Here $\rho_s$ is the Spearman correlation in the population. The Spearman correlation is a measure for the strength and direction of the monotonic relationship between two variables of at least ordinal measurement level.

In words, the null hypothesis would be:

H0: there is no monotonic relationship between the two variables in the population.
H0: the population scores in any of the related groups are not systematically higher or lower than the population scores in any of the other related groups

Usually the related groups are the different measurement points. Several different formulations of the null hypothesis can be found in the literature, and we do not agree with all of them. Make sure you (also) learn the one that is given in your text book or by your teacher.
H0: $\mu = \mu_0$

Here $\mu$ is the population mean of the difference scores, and $\mu_0$ is the population mean of the difference scores according to the null hypothesis, which is usually 0. A difference score is the difference between the first score of a pair and the second score of a pair.
Alternative hypothesisAlternative hypothesisAlternative hypothesisAlternative hypothesisAlternative hypothesis
  • H1: the population proportions are not all as specified under the null hypothesis
or equivalently
  • H1: the probabilities of drawing an observation from each of the conditions are not all as specified under the null hypothesis
H1 two sided: $\mu \neq \mu_0$
H1 right sided: $\mu > \mu_0$
H1 left sided: $\mu < \mu_0$
H1 two sided: $\rho_s \neq 0$
H1 right sided: $\rho_s > 0$
H1 left sided: $\rho_s < 0$
H1: the population scores in some of the related groups are systematically higher or lower than the population scores in other related groups H1 two sided: $\mu \neq \mu_0$
H1 right sided: $\mu > \mu_0$
H1 left sided: $\mu < \mu_0$
AssumptionsAssumptionsAssumptionsAssumptionsAssumptions
  • Sample size is large enough for $X^2$ to be approximately chi-squared distributed. Rule of thumb: all $J$ expected cell counts are 5 or more
  • Sample is a simple random sample from the population. That is, observations are independent of one another
  • Difference scores are normally distributed in the population
  • Sample of difference scores is a simple random sample from the population of difference scores. That is, difference scores are independent of one another
  • Sample of pairs is a simple random sample from the population of pairs. That is, pairs are independent of one another
Note: this assumption is only important for the significance test, not for the correlation coefficient itself. The correlation coefficient itself just measures the strength of the monotonic relationship between two variables.
  • Sample of 'blocks' (usually the subjects) is a simple random sample from the population. That is, blocks are independent of one another
  • Difference scores are normally distributed in the population
  • Sample of difference scores is a simple random sample from the population of difference scores. That is, difference scores are independent of one another
Test statisticTest statisticTest statisticTest statisticTest statistic
$X^2 = \sum{\frac{(\mbox{observed cell count} - \mbox{expected cell count})^2}{\mbox{expected cell count}}}$
Here the expected cell count for one cell = $N \times \pi_j$, the observed cell count is the observed sample count in that same cell, and the sum is over all $J$ cells.
$t = \dfrac{\bar{y} - \mu_0}{s / \sqrt{N}}$
Here $\bar{y}$ is the sample mean of the difference scores, $\mu_0$ is the population mean of the difference scores according to the null hypothesis, $s$ is the sample standard deviation of the difference scores, and $N$ is the sample size (number of difference scores).

The denominator $s / \sqrt{N}$ is the standard error of the sampling distribution of $\bar{y}$. The $t$ value indicates how many standard errors $\bar{y}$ is removed from $\mu_0$.
$t = \dfrac{r_s \times \sqrt{N - 2}}{\sqrt{1 - r_s^2}} $
Here $r_s$ is the sample Spearman correlation and $N$ is the sample size. The sample Spearman correlation $r_s$ is equal to the Pearson correlation applied to the rank scores.
$Q = \dfrac{12}{N \times k(k + 1)} \sum R^2_i - 3 \times N(k + 1)$

Here $N$ is the number of 'blocks' (usually the subjects - so if you have 4 repeated measurements for 60 subjects, $N$ equals 60), $k$ is the number of related groups (usually the number of repeated measurements), and $R_i$ is the sum of ranks in group $i$.

Remember that multiplication precedes addition, so first compute $\frac{12}{N \times k(k + 1)} \times \sum R^2_i$ and then subtract $3 \times N(k + 1)$.

Note: if ties are present in the data, the formula for $Q$ is more complicated.
$t = \dfrac{\bar{y} - \mu_0}{s / \sqrt{N}}$
Here $\bar{y}$ is the sample mean of the difference scores, $\mu_0$ is the population mean of the difference scores according to the null hypothesis, $s$ is the sample standard deviation of the difference scores, and $N$ is the sample size (number of difference scores).

The denominator $s / \sqrt{N}$ is the standard error of the sampling distribution of $\bar{y}$. The $t$ value indicates how many standard errors $\bar{y}$ is removed from $\mu_0$.
Sampling distribution of $X^2$ if H0 were trueSampling distribution of $t$ if H0 were trueSampling distribution of $t$ if H0 were trueSampling distribution of $Q$ if H0 were trueSampling distribution of $t$ if H0 were true
Approximately the chi-squared distribution with $J - 1$ degrees of freedom$t$ distribution with $N - 1$ degrees of freedomApproximately the $t$ distribution with $N - 2$ degrees of freedomIf the number of blocks $N$ is large, approximately the chi-squared distribution with $k - 1$ degrees of freedom.

For small samples, the exact distribution of $Q$ should be used.
$t$ distribution with $N - 1$ degrees of freedom
Significant?Significant?Significant?Significant?Significant?
  • Check if $X^2$ observed in sample is equal to or larger than critical value $X^{2*}$ or
  • Find $p$ value corresponding to observed $X^2$ and check if it is equal to or smaller than $\alpha$
Two sided: Right sided: Left sided: Two sided: Right sided: Left sided: If the number of blocks $N$ is large, the table with critical $X^2$ values can be used. If we denote $X^2 = Q$:
  • Check if $X^2$ observed in sample is equal to or larger than critical value $X^{2*}$ or
  • Find $p$ value corresponding to observed $X^2$ and check if it is equal to or smaller than $\alpha$
Two sided: Right sided: Left sided:
n.a.$C\%$ confidence interval for $\mu$n.a.n.a.$C\%$ confidence interval for $\mu$
-$\bar{y} \pm t^* \times \dfrac{s}{\sqrt{N}}$
where the critical value $t^*$ is the value under the $t_{N-1}$ distribution with the area $C / 100$ between $-t^*$ and $t^*$ (e.g. $t^*$ = 2.086 for a 95% confidence interval when df = 20).

The confidence interval for $\mu$ can also be used as significance test.
--$\bar{y} \pm t^* \times \dfrac{s}{\sqrt{N}}$
where the critical value $t^*$ is the value under the $t_{N-1}$ distribution with the area $C / 100$ between $-t^*$ and $t^*$ (e.g. $t^*$ = 2.086 for a 95% confidence interval when df = 20).

The confidence interval for $\mu$ can also be used as significance test.
n.a.Effect sizen.a.n.a.Effect size
-Cohen's $d$:
Standardized difference between the sample mean of the difference scores and $\mu_0$: $$d = \frac{\bar{y} - \mu_0}{s}$$ Cohen's $d$ indicates how many standard deviations $s$ the sample mean of the difference scores $\bar{y}$ is removed from $\mu_0.$
--Cohen's $d$:
Standardized difference between the sample mean of the difference scores and $\mu_0$: $$d = \frac{\bar{y} - \mu_0}{s}$$ Cohen's $d$ indicates how many standard deviations $s$ the sample mean of the difference scores $\bar{y}$ is removed from $\mu_0.$
n.a.Visual representationn.a.n.a.Visual representation
-
Paired sample t test
--
Paired sample t test
n.a.Equivalent ton.a.n.a.Equivalent to
-
  • One sample $t$ test on the difference scores.
  • Repeated measures ANOVA with one dichotomous within subjects factor.
--
  • One sample $t$ test on the difference scores.
  • Repeated measures ANOVA with one dichotomous within subjects factor.
Example contextExample contextExample contextExample contextExample context
Is the proportion of people with a low, moderate, and high social economic status in the population different from $\pi_{low} = 0.2,$ $\pi_{moderate} = 0.6,$ and $\pi_{high} = 0.2$?Is the average difference between the mental health scores before and after an intervention different from $\mu_0 = 0$?Is there a monotonic relationship between physical health and mental health?Is there a difference in depression level between measurement point 1 (pre-intervention), measurement point 2 (1 week post-intervention), and measurement point 3 (6 weeks post-intervention)?Is the average difference between the mental health scores before and after an intervention different from $\mu_0 = 0$?
SPSSSPSSSPSSSPSSSPSS
Analyze > Nonparametric Tests > Legacy Dialogs > Chi-square...
  • Put your categorical variable in the box below Test Variable List
  • Fill in the population proportions / probabilities according to $H_0$ in the box below Expected Values. If $H_0$ states that they are all equal, just pick 'All categories equal' (default)
Analyze > Compare Means > Paired-Samples T Test...
  • Put the two paired variables in the boxes below Variable 1 and Variable 2
Analyze > Correlate > Bivariate...
  • Put your two variables in the box below Variables
  • Under Correlation Coefficients, select Spearman
Analyze > Nonparametric Tests > Legacy Dialogs > K Related Samples...
  • Put the $k$ variables containing the scores for the $k$ related groups in the white box below Test Variables
  • Under Test Type, select the Friedman test
Analyze > Compare Means > Paired-Samples T Test...
  • Put the two paired variables in the boxes below Variable 1 and Variable 2
JamoviJamoviJamoviJamoviJamovi
Frequencies > N Outcomes - $\chi^2$ Goodness of fit
  • Put your categorical variable in the box below Variable
  • Click on Expected Proportions and fill in the population proportions / probabilities according to $H_0$ in the boxes below Ratio. If $H_0$ states that they are all equal, you can leave the ratios equal to the default values (1)
T-Tests > Paired Samples T-Test
  • Put the two paired variables in the box below Paired Variables, one on the left side of the vertical line and one on the right side of the vertical line
  • Under Hypothesis, select your alternative hypothesis
Regression > Correlation Matrix
  • Put your two variables in the white box at the right
  • Under Correlation Coefficients, select Spearman
  • Under Hypothesis, select your alternative hypothesis
ANOVA > Repeated Measures ANOVA - Friedman
  • Put the $k$ variables containing the scores for the $k$ related groups in the box below Measures
T-Tests > Paired Samples T-Test
  • Put the two paired variables in the box below Paired Variables, one on the left side of the vertical line and one on the right side of the vertical line
  • Under Hypothesis, select your alternative hypothesis
Practice questionsPractice questionsPractice questionsPractice questionsPractice questions