The chi-square test is used to test the independence of two variables cross-classified in a two-way table. (A chi-square statistic with n degrees of freedom is distributed as the sum of the squares of n independent normally distributed random variables, each with mean 0 and unit variance.)
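The definition in parentheses can be checked with a small simulation. The following is a minimal sketch, not part of any survey procedure: the function name `chi_square_draw`, the seed, and the number of draws are illustrative assumptions. It relies on the known fact that a chi-square variable with n degrees of freedom has mean n.

```python
import random

def chi_square_draw(n, rng):
    """Return one draw: the sum of squares of n independent N(0, 1) variables."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n))

rng = random.Random(0)   # fixed seed (arbitrary choice) so the run is repeatable
n = 3                    # degrees of freedom, chosen for illustration
draws = [chi_square_draw(n, rng) for _ in range(20000)]

# The sample mean should land close to n, the theoretical mean.
sample_mean = sum(draws) / len(draws)
print(f"sample mean with {n} degrees of freedom: {sample_mean:.2f}")
```

With enough draws, the sample mean should settle near the number of degrees of freedom.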

For example, suppose we wish to test the hypothesis that blood pressure cuff size is independent of gender, and that cross-classifying blood pressure cuff size by gender yields the following observed frequencies.

| | 1 | 2 | 3 | 4 | Cumulative |
|---|---|---|---|---|---|
| Men | 63 | 1387 | 2409 | 453 | 4312 |
| Women | 222 | 2065 | 2002 | 493 | 4782 |
| Both genders | 285 | 3452 | 4411 | 946 | 9094 |

In a simple random sample setting (unweighted data), the expected cell frequencies under the null hypothesis that blood pressure cuff size and gender are independent can be obtained by multiplying the marginal total for the jth column by the proportion of individuals in the ith row (the ith row total divided by the overall total).

For example, the expected frequency of blood pressure cuff size 1 for men would be 285 × (4312/9094) ≈ 135; the expected frequency of blood pressure cuff size 4 for women would be 946 × (4782/9094) ≈ 497.
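The same arithmetic can be carried out for every cell at once. The sketch below assumes unweighted data, as in the simple random sample setting just described; the variable names (`observed`, `row_totals`, `col_totals`, `expected`) are illustrative and not taken from any survey software.

```python
# Observed frequencies from the cuff size by gender table (cuff sizes 1 to 4).
observed = {
    "Men":   [63, 1387, 2409, 453],
    "Women": [222, 2065, 2002, 493],
}

# Marginal totals: one per row (gender) and one per column (cuff size).
row_totals = {gender: sum(counts) for gender, counts in observed.items()}
col_totals = [sum(col) for col in zip(*observed.values())]
n_total = sum(row_totals.values())

# Expected frequency = column total * (row total / overall total).
expected = {
    gender: [col_total * row_totals[gender] / n_total for col_total in col_totals]
    for gender in observed
}

for gender, values in expected.items():
    print(gender, [round(v) for v in values])
```

The expected frequencies in each column add up to that column's observed total, which is a quick check that the margins were computed correctly.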

Thus, if

$O_{ij}$ = the observed frequency in the ith row and jth column, where i = 1, 2, …, r and j = 1, 2, …, c (r rows and c columns), and

$E_{ij}$ = the expected frequency in the ith row and jth column,

then the formula to test the null hypothesis of independence, using the chi-square statistic, would be

$$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}} \qquad (1)$$

This statistic has degrees of freedom equal to the number of rows minus 1, multiplied by the number of columns minus 1. For the two genders and four cuff sizes in the table above, that is (2 − 1) × (4 − 1) = 3.
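As an illustration, the Pearson chi-square statistic and its degrees of freedom can be computed directly from the observed table. This is a minimal sketch that treats the counts as unweighted simple random sample data, so it is not the design-based statistic needed for NHANES analyses; the function name `pearson_chi_square` is a hypothetical helper introduced here.

```python
# Observed frequencies by gender (rows) and cuff size 1 to 4 (columns).
observed = [
    [63, 1387, 2409, 453],    # men
    [222, 2065, 2002, 493],   # women
]

def pearson_chi_square(table):
    """Return (statistic, degrees of freedom) for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / n   # expected count under independence
            stat += (obs - exp) ** 2 / exp
    df = (len(table) - 1) * (len(table[0]) - 1)       # (rows - 1) * (columns - 1)
    return stat, df

stat, df = pearson_chi_square(observed)
print(f"chi-square = {stat:.1f}, df = {df}")
```

For these counts the statistic far exceeds 7.815, the 5% critical value of the chi-square distribution with 3 degrees of freedom, so under the simple random sample assumption the hypothesis of independence would be rejected.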

In a complex sample setting, you would use a statistic similar to equation
(1) above, modified to account for the survey design, with degrees of
freedom equal to the number of PSUs minus the number of strata containing
observations. This statistic can be obtained through SAS *proc surveyfreq*
(CHISQ, based on the Rao-Scott chi-square with an adjusted F statistic). The
analogous procedure in SUDAAN version 9.0 (*proc crosstab*) provides limited
chi-square statistics based on the Wald chi-square and does not provide an
F-adjusted p-value. However, SUDAAN regression models do provide F-adjusted
chi-square statistics, which are recommended for analyzing NHANES data.

The Cochran-Mantel-Haenszel test, an extension of the Pearson chi-square, can
be applied to stratified two-way tables to test for homogeneity or independence
in a non-survey setting. For a complex sample, its analogue can be obtained in
SUDAAN *proc crosstab* (*cmh*).

**References:**

Agresti A. An Introduction to Categorical Data Analysis. Wiley Series in Probability and Statistics. 1996. New York.