Two sample $z$ test - sampling distribution of the difference between two sample means, and its standard deviation
Definition of the sampling distribution of the difference between two sample means $ \bar{y}_1 - \bar{y}_2$, and its standard deviation
Sampling distribution of the difference between two sample means $\bar{y}_1 - \bar{y}_2$: When we draw a sample of size $n_1$ from population 1 and a sample of size $n_2$ from population 2, we can compute the mean of a variable $y$ in sample 1 and in sample 2, and then compute the difference between the two sample means: $\bar{y}_1 - \bar{y}_2$. Now suppose that we repeated these steps many times. Specifically, suppose that we drew an infinite number of pairs of group 1 and group 2 samples, each time of size $n_1$ and $n_2$ respectively. For each pair of samples, we could compute the difference between the two sample means: $\bar{y}_1 - \bar{y}_2$. Different samples would give different sample means and therefore different differences. The distribution of all these differences $\bar{y}_1 - \bar{y}_2$ is the sampling distribution of $\bar{y}_1 - \bar{y}_2$. Note that this sampling distribution is purely hypothetical. We would never really draw an infinite number of group 1 and group 2 samples, but hypothetically, we could.
Standard deviation: Suppose that the assumptions of the two sample $z$ test hold:
Note that the $z$ statistic $z = \frac{(\bar{y}_1 - \bar{y}_2) - 0}{\sqrt{\frac{\sigma^2_1}{n_1} + \frac{\sigma^2_2}{n_2}}}$ thus indicates how many standard deviations $\sqrt{\frac{\sigma^2_1}{n_1} + \frac{\sigma^2_2}{n_2}}$ the observed difference $\bar{y}_1 - \bar{y}_2$ lies from 0, which is the difference $\mu_1 - \mu_2$ according to the null hypothesis.
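The hypothetical repeated sampling described above can be approximated by simulation. The sketch below is only an illustration, not part of the test itself: it assumes normally distributed populations and independent samples, and every numeric value (population means, standard deviations, sample sizes, number of repetitions) and every function name is made up for the example. It draws many pairs of samples, computes $\bar{y}_1 - \bar{y}_2$ for each pair, and compares the standard deviation of those differences with $\sqrt{\frac{\sigma^2_1}{n_1} + \frac{\sigma^2_2}{n_2}}$.

```python
import math
import random
import statistics


def simulate_mean_differences(mu1, mu2, sigma1, sigma2, n1, n2, reps, seed=0):
    """Approximate the sampling distribution of ybar1 - ybar2 with `reps` draws.

    Assumes normal populations and independent samples (illustrative choice).
    """
    rng = random.Random(seed)
    differences = []
    for _ in range(reps):
        # One hypothetical repetition: a fresh group 1 and group 2 sample.
        sample1 = [rng.gauss(mu1, sigma1) for _ in range(n1)]
        sample2 = [rng.gauss(mu2, sigma2) for _ in range(n2)]
        differences.append(statistics.fmean(sample1) - statistics.fmean(sample2))
    return differences


def theoretical_sd(sigma1, sigma2, n1, n2):
    """Standard deviation of ybar1 - ybar2: sqrt(sigma1^2/n1 + sigma2^2/n2)."""
    return math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)


def z_statistic(ybar1, ybar2, sigma1, sigma2, n1, n2):
    """Number of standard deviations the observed difference lies from 0."""
    return ((ybar1 - ybar2) - 0) / theoretical_sd(sigma1, sigma2, n1, n2)


# Hypothetical populations with equal means, so the null hypothesis is true.
diffs = simulate_mean_differences(
    mu1=50, mu2=50, sigma1=10, sigma2=8, n1=40, n2=30, reps=5000
)

print("simulated SD of differences:", statistics.pstdev(diffs))
print("theoretical SD:             ", theoretical_sd(10, 8, 40, 30))
print("z for observed means 52, 50:", z_statistic(52, 50, 10, 8, 40, 30))
```

With a finite number of repetitions the simulated standard deviation only approximates the theoretical one; the two get closer as `reps` grows, which mirrors the idea of an infinite number of samples.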