# One sample z test for the mean - overview

This page offers structured overviews of one or more selected methods.

Methods compared on this page:
• One sample $z$ test for the mean
• Binomial test for a single proportion
• Two sample $z$ test
• Chi-squared test for the relationship between two categorical variables
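## Independent variable

• One sample $z$ test for the mean: none
• Binomial test for a single proportion: none
• Two sample $z$ test (independent/grouping variable): one categorical with 2 independent groups
• Chi-squared test for the relationship between two categorical variables (independent/column variable): one categorical with $I$ independent groups ($I \geqslant 2$)

## Dependent variable

• One sample $z$ test for the mean: one quantitative of interval or ratio level
• Binomial test for a single proportion: one categorical with 2 independent groups
• Two sample $z$ test: one quantitative of interval or ratio level
• Chi-squared test for the relationship between two categorical variables (dependent/row variable): one categorical with $J$ independent groups ($J \geqslant 2$)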
## Null hypothesis

One sample $z$ test for the mean:
H0: $\mu = \mu_0$

Here $\mu$ is the population mean, and $\mu_0$ is the population mean according to the null hypothesis.

Binomial test for a single proportion:
H0: $\pi = \pi_0$

Here $\pi$ is the population proportion of 'successes', and $\pi_0$ is the population proportion of successes according to the null hypothesis.

Two sample $z$ test:
H0: $\mu_1 = \mu_2$

Here $\mu_1$ is the population mean for group 1, and $\mu_2$ is the population mean for group 2.

Chi-squared test for the relationship between two categorical variables:
H0: there is no association between the row and column variable

More precisely, if there are $I$ independent random samples of size $n_i$ from each of $I$ populations, defined by the independent variable:
• H0: the distribution of the dependent variable is the same in each of the $I$ populations
If there is one random sample of size $N$ from the total population:
• H0: the row and column variables are independent
## Alternative hypothesis

One sample $z$ test for the mean:
H1 two sided: $\mu \neq \mu_0$
H1 right sided: $\mu > \mu_0$
H1 left sided: $\mu < \mu_0$

Binomial test for a single proportion:
H1 two sided: $\pi \neq \pi_0$
H1 right sided: $\pi > \pi_0$
H1 left sided: $\pi < \pi_0$

Two sample $z$ test:
H1 two sided: $\mu_1 \neq \mu_2$
H1 right sided: $\mu_1 > \mu_2$
H1 left sided: $\mu_1 < \mu_2$

Chi-squared test for the relationship between two categorical variables:
H1: there is an association between the row and column variable

More precisely, if there are $I$ independent random samples of size $n_i$ from each of $I$ populations, defined by the independent variable:
• H1: the distribution of the dependent variable is not the same in all of the $I$ populations
If there is one random sample of size $N$ from the total population:
• H1: the row and column variables are dependent
## Assumptions

One sample $z$ test for the mean:
• Scores are normally distributed in the population
• Population standard deviation $\sigma$ is known
• Sample is a simple random sample from the population. That is, observations are independent of one another

Binomial test for a single proportion:
• Sample is a simple random sample from the population. That is, observations are independent of one another

Two sample $z$ test:
• Within each population, the scores on the dependent variable are normally distributed
• Population standard deviations $\sigma_1$ and $\sigma_2$ are known
• Group 1 sample is a simple random sample (SRS) from population 1, group 2 sample is an independent SRS from population 2. That is, within and between groups, observations are independent of one another

Chi-squared test for the relationship between two categorical variables:
• Sample size is large enough for $X^2$ to be approximately chi-squared distributed under the null hypothesis. Rule of thumb:
  • 2 $\times$ 2 table: all four expected cell counts are 5 or more
  • Larger than 2 $\times$ 2 tables: average of the expected cell counts is 5 or more, smallest expected cell count is 1 or more
• There are $I$ independent simple random samples, one from each of the $I$ populations defined by the independent variable, or there is one simple random sample from the total population
## Test statistic

One sample $z$ test for the mean:
$z = \dfrac{\bar{y} - \mu_0}{\sigma / \sqrt{N}}$
Here $\bar{y}$ is the sample mean, $\mu_0$ is the population mean according to the null hypothesis, $\sigma$ is the population standard deviation, and $N$ is the sample size.

The denominator $\sigma / \sqrt{N}$ is the standard deviation of the sampling distribution of $\bar{y}$. The $z$ value indicates how many of these standard deviations $\bar{y}$ is removed from $\mu_0$.
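The one sample $z$ statistic can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `one_sample_z` is hypothetical, and the sample mean of 51 and sample size of 36 are made-up numbers. Only $\mu_0 = 50$ and $\sigma = 3$ come from the example context on this page. The $p$ value uses the standard normal distribution from the standard library.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z(y_bar, mu_0, sigma, n):
    """Return the z statistic and its two sided p value."""
    # Standard deviation of the sampling distribution of the sample mean
    standard_error = sigma / sqrt(n)
    z = (y_bar - mu_0) / standard_error
    # Under H0, z follows the standard normal distribution
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_sided


# Hypothetical sample: mean 51 from N = 36 office workers (assumed values)
z, p_two_sided = one_sample_z(y_bar=51.0, mu_0=50, sigma=3, n=36)
print(f"z = {z:.2f}, two sided p = {p_two_sided:.4f}")
```

With these assumed numbers the sample mean lies two standard errors above $\mu_0$.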
Binomial test for a single proportion:
$X$ = number of successes in the sample

Two sample $z$ test:
$z = \dfrac{(\bar{y}_1 - \bar{y}_2) - 0}{\sqrt{\dfrac{\sigma^2_1}{n_1} + \dfrac{\sigma^2_2}{n_2}}} = \dfrac{\bar{y}_1 - \bar{y}_2}{\sqrt{\dfrac{\sigma^2_1}{n_1} + \dfrac{\sigma^2_2}{n_2}}}$
Here $\bar{y}_1$ is the sample mean in group 1, $\bar{y}_2$ is the sample mean in group 2, $\sigma^2_1$ is the population variance in population 1, $\sigma^2_2$ is the population variance in population 2, $n_1$ is the sample size of group 1, and $n_2$ is the sample size of group 2. The 0 represents the difference in population means according to the null hypothesis.

The denominator $\sqrt{\frac{\sigma^2_1}{n_1} + \frac{\sigma^2_2}{n_2}}$ is the standard deviation of the sampling distribution of $\bar{y}_1 - \bar{y}_2$. The $z$ value indicates how many of these standard deviations $\bar{y}_1 - \bar{y}_2$ is removed from 0.

Note: we could just as well compute $\bar{y}_2 - \bar{y}_1$ in the numerator, but then the left sided alternative becomes $\mu_2 < \mu_1$, and the right sided alternative becomes $\mu_2 > \mu_1$.
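A minimal Python sketch of the two sample $z$ statistic follows. The function name `two_sample_z`, the group means (50.5 and 49.5), and the group sizes (40 and 50) are assumptions made up for illustration. Only $\sigma_1 = 2$ and $\sigma_2 = 2.5$ come from the example context on this page. The last lines illustrate the note about swapping the groups: the statistic only changes sign.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(y_bar_1, y_bar_2, sigma_1, sigma_2, n_1, n_2):
    """Return the z statistic for H0: mu_1 = mu_2 and its two sided p value."""
    # Standard deviation of the sampling distribution of y_bar_1 - y_bar_2
    standard_error = sqrt(sigma_1**2 / n_1 + sigma_2**2 / n_2)
    z = (y_bar_1 - y_bar_2) / standard_error
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_sided


# Hypothetical group means and sizes (assumed values)
z, p_two_sided = two_sample_z(50.5, 49.5, sigma_1=2, sigma_2=2.5, n_1=40, n_2=50)

# Swapping the groups flips the sign of z, so the one sided alternatives swap too
z_swapped, _ = two_sample_z(49.5, 50.5, sigma_1=2.5, sigma_2=2, n_1=50, n_2=40)
print(f"z = {z:.3f}, swapped z = {z_swapped:.3f}, two sided p = {p_two_sided:.4f}")
```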
Chi-squared test for the relationship between two categorical variables:
$X^2 = \sum{\frac{(\mbox{observed cell count} - \mbox{expected cell count})^2}{\mbox{expected cell count}}}$
Here for each cell, the expected cell count = $\dfrac{\mbox{row total} \times \mbox{column total}}{\mbox{total sample size}}$, the observed cell count is the observed sample count in that same cell, and the sum is over all $I \times J$ cells.
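The computation of $X^2$ from a table of observed counts can be sketched as below. The function name `chi_squared_statistic` and the 2 $\times$ 2 table of counts are hypothetical. The sketch returns the statistic, the $(I - 1) \times (J - 1)$ degrees of freedom, and the expected counts, which can be inspected against the rule of thumb for sample size.

```python
def chi_squared_statistic(observed):
    """Return (X^2, degrees of freedom, expected counts) for a table of counts.

    `observed` is a list of rows, each row a list of cell counts.
    """
    row_totals = [sum(row) for row in observed]
    column_totals = [sum(column) for column in zip(*observed)]
    total = sum(row_totals)

    x2 = 0.0
    expected = []
    for i, row in enumerate(observed):
        expected_row = []
        for j, observed_count in enumerate(row):
            # Expected count = row total * column total / total sample size
            expected_count = row_totals[i] * column_totals[j] / total
            expected_row.append(expected_count)
            x2 += (observed_count - expected_count) ** 2 / expected_count
        expected.append(expected_row)

    degrees_of_freedom = (len(row_totals) - 1) * (len(column_totals) - 1)
    return x2, degrees_of_freedom, expected


# Hypothetical 2 x 2 table of counts (assumed values)
x2, df, expected = chi_squared_statistic([[30, 20], [20, 30]])
print(x2, df, expected)
```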
## Sampling distribution of the test statistic if H0 were true

• One sample $z$ test for the mean: the sampling distribution of $z$ is the standard normal distribution
• Binomial test for a single proportion: the sampling distribution of $X$ is the Binomial($n$, $P$) distribution. Here $n = N$ (total sample size), and $P = \pi_0$ (population proportion according to the null hypothesis)
• Two sample $z$ test: the sampling distribution of $z$ is the standard normal distribution
• Chi-squared test for the relationship between two categorical variables: the sampling distribution of $X^2$ is approximately the chi-squared distribution with $(I - 1) \times (J - 1)$ degrees of freedom
## Significant?

One sample $z$ test for the mean:
Two sided:
Right sided:
Left sided:
Binomial test for a single proportion:
Two sided:
• Check if $X$ observed in sample is in the rejection region, or
• Find two sided $p$ value corresponding to observed $X$ and check if it is equal to or smaller than $\alpha$
Right sided:
• Check if $X$ observed in sample is in the rejection region, or
• Find right sided $p$ value corresponding to observed $X$ and check if it is equal to or smaller than $\alpha$
Left sided:
• Check if $X$ observed in sample is in the rejection region, or
• Find left sided $p$ value corresponding to observed $X$ and check if it is equal to or smaller than $\alpha$
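For the binomial test, the $p$ values can be computed exactly from the Binomial($n$, $\pi_0$) distribution. The sketch below is an illustration under stated assumptions: the function names are hypothetical, and the observed count of 8 smokers out of 20 is made up. Only $\pi_0 = 0.2$ comes from the example context. The two sided $p$ value here sums the probabilities of all outcomes that are at most as likely as the observed one. That is one common convention, not necessarily the one a given software package uses.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binomial_p_values(x, n, pi_0):
    """Return (two sided, right sided, left sided) exact p values."""
    pmf = [binomial_pmf(k, n, pi_0) for k in range(n + 1)]
    p_right = sum(pmf[x:])      # P(X >= x) under H0
    p_left = sum(pmf[: x + 1])  # P(X <= x) under H0
    # Assumed convention: add up outcomes no more likely than the observed one
    # (small relative tolerance guards against floating point ties)
    p_two = min(1.0, sum(p for p in pmf if p <= pmf[x] * (1 + 1e-7)))
    return p_two, p_right, p_left


# Hypothetical sample: 8 smokers among 20 office workers (assumed values)
p_two, p_right, p_left = binomial_p_values(x=8, n=20, pi_0=0.2)
alpha = 0.05
print(f"two sided p = {p_two:.4f}, significant: {p_two <= alpha}")
```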
Two sample $z$ test:
Two sided:
Right sided:
Left sided:

Chi-squared test for the relationship between two categorical variables:
• Check if $X^2$ observed in sample is equal to or larger than critical value $X^{2*}$, or
• Find $p$ value corresponding to observed $X^2$ and check if it is equal to or smaller than $\alpha$
## Confidence interval

One sample $z$ test for the mean ($C\%$ confidence interval for $\mu$):
$\bar{y} \pm z^* \times \dfrac{\sigma}{\sqrt{N}}$
where the critical value $z^*$ is the value under the normal curve with the area $C / 100$ between $-z^*$ and $z^*$ (e.g. $z^*$ = 1.96 for a 95% confidence interval).

The confidence interval for $\mu$ can also be used as significance test.
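The interval for $\mu$ can be sketched in Python as follows. The function name `mean_confidence_interval`, the sample mean of 51, and the sample size of 36 are assumptions for illustration. Only $\mu_0 = 50$ and $\sigma = 3$ come from the example context. Because an area of $C / 100$ lies between $-z^*$ and $z^*$, the area to the left of $z^*$ is $0.5 + C / 200$, which is what the inverse normal function is given.

```python
from math import sqrt
from statistics import NormalDist


def mean_confidence_interval(y_bar, sigma, n, c=95):
    """Return the C% confidence interval for mu when sigma is known."""
    # Area C/100 between -z* and z* means area 0.5 + C/200 to the left of z*
    z_star = NormalDist().inv_cdf(0.5 + c / 200)
    margin = z_star * sigma / sqrt(n)
    return y_bar - margin, y_bar + margin


# Hypothetical sample: mean 51 from N = 36 (assumed values)
lower, upper = mean_confidence_interval(y_bar=51.0, sigma=3, n=36, c=95)
mu_0 = 50
# Used as a two sided test at alpha = 0.05: reject H0 if mu_0 is outside the interval
reject_h0 = not (lower <= mu_0 <= upper)
print(f"95% CI: ({lower:.2f}, {upper:.2f}), reject H0: {reject_h0}")
```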
Binomial test for a single proportion: n.a.

Two sample $z$ test ($C\%$ confidence interval for $\mu_1 - \mu_2$):
$(\bar{y}_1 - \bar{y}_2) \pm z^* \times \sqrt{\dfrac{\sigma^2_1}{n_1} + \dfrac{\sigma^2_2}{n_2}}$
where the critical value $z^*$ is the value under the normal curve with the area $C / 100$ between $-z^*$ and $z^*$ (e.g. $z^*$ = 1.96 for a 95% confidence interval).

The confidence interval for $\mu_1 - \mu_2$ can also be used as significance test.
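A similar sketch applies to the interval for $\mu_1 - \mu_2$. The function name `difference_confidence_interval`, the group means, and the group sizes are hypothetical. Only $\sigma_1 = 2$ and $\sigma_2 = 2.5$ come from the example context. Used as a two sided test, H0: $\mu_1 = \mu_2$ is rejected when 0 falls outside the interval.

```python
from math import sqrt
from statistics import NormalDist


def difference_confidence_interval(y_bar_1, y_bar_2, sigma_1, sigma_2, n_1, n_2, c=95):
    """Return the C% confidence interval for mu_1 - mu_2 with known sigmas."""
    z_star = NormalDist().inv_cdf(0.5 + c / 200)
    standard_error = sqrt(sigma_1**2 / n_1 + sigma_2**2 / n_2)
    difference = y_bar_1 - y_bar_2
    return difference - z_star * standard_error, difference + z_star * standard_error


# Hypothetical group means and sizes (assumed values)
lower, upper = difference_confidence_interval(
    50.5, 49.5, sigma_1=2, sigma_2=2.5, n_1=40, n_2=50
)
# 0 is the difference in population means according to H0
reject_h0 = not (lower <= 0 <= upper)
print(f"95% CI: ({lower:.3f}, {upper:.3f}), reject H0: {reject_h0}")
```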
Chi-squared test for the relationship between two categorical variables: n.a.
## Effect size

An effect size is listed for the one sample $z$ test for the mean only; it is n.a. for the other three methods.

Cohen's $d$:
Standardized difference between the sample mean and $\mu_0$: $$d = \frac{\bar{y} - \mu_0}{\sigma}$$ Cohen's $d$ indicates how many standard deviations $\sigma$ the sample mean $\bar{y}$ is removed from $\mu_0$.
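As a small worked sketch, Cohen's $d$ for the one sample case can be computed as below. The function name `cohens_d`, the sample mean of 51, and the sample size of 36 are assumed for illustration. Only $\mu_0 = 50$ and $\sigma = 3$ come from the example context. The last lines show how $d$ relates to the test statistic: comparing the two formulas gives $z = d \times \sqrt{N}$.

```python
from math import sqrt


def cohens_d(y_bar, mu_0, sigma):
    """Standardized difference between the sample mean and mu_0."""
    return (y_bar - mu_0) / sigma


# Hypothetical sample mean (assumed value)
d = cohens_d(y_bar=51.0, mu_0=50, sigma=3)

# z divides by sigma / sqrt(N) instead of sigma, so z = d * sqrt(N)
n = 36  # assumed sample size
z = d * sqrt(n)
print(f"d = {d:.3f}, z = {z:.2f}")
```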
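## Visual representation

Provided for the one sample $z$ test for the mean and the two sample $z$ test; n.a. for the binomial test and the chi-squared test.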
## Example context

• One sample $z$ test for the mean: Is the average mental health score of office workers different from $\mu_0 = 50$? Assume that the standard deviation of the mental health scores in the population is $\sigma = 3$.
• Binomial test for a single proportion: Is the proportion of smokers amongst office workers different from $\pi_0 = 0.2$?
• Two sample $z$ test: Is the average mental health score different between men and women? Assume that in the population, the standard deviation of the mental health scores is $\sigma_1 = 2$ amongst men and $\sigma_2 = 2.5$ amongst women.
• Chi-squared test for the relationship between two categorical variables: Is there an association between economic class and gender? Is the distribution of economic class different between men and women?
## SPSS

One sample $z$ test for the mean: n.a.

Binomial test for a single proportion:
Analyze > Nonparametric Tests > Legacy Dialogs > Binomial...
• Put your dichotomous variable in the box below Test Variable List
• Fill in the value for $\pi_0$ in the box next to Test Proportion

Two sample $z$ test: n.a.

Chi-squared test for the relationship between two categorical variables:
Analyze > Descriptive Statistics > Crosstabs...
• Put one of your two categorical variables in the box below Row(s), and the other categorical variable in the box below Column(s)
• Click the Statistics... button, and click on the square in front of Chi-square
• Continue and click OK
## Jamovi

One sample $z$ test for the mean: n.a.

Binomial test for a single proportion:
Frequencies > 2 Outcomes - Binomial test
• Put your dichotomous variable in the white box at the right
• Fill in the value for $\pi_0$ in the box next to Test value
• Under Hypothesis, select your alternative hypothesis

Two sample $z$ test: n.a.

Chi-squared test for the relationship between two categorical variables:
Frequencies > Independent Samples - $\chi^2$ test of association
• Put one of your two categorical variables in the box below Rows, and the other categorical variable in the box below Columns