# Chi-squared test for the relationship between two categorical variables: sampling distribution of the chi-squared statistic

Definition of the sampling distribution of the chi-squared statistic

As you may know, when we perform a chi-squared test for the relationship between two categorical variables, we compute the chi-squared statistic
$$
X^2 = \sum{\frac{(\mbox{observed cell count} - \mbox{expected cell count})^2}{\mbox{expected cell count}}}
$$
based on our sample data. Now suppose that we drew many more samples. Specifically, suppose that we drew an infinite number of samples, all of the same size. In each sample, we could compute the chi-squared statistic $X^2$ with the formula above. Different samples would give different values for $X^2$. The distribution of all these $X^2$ values is the |