Task 2a: How to Calculate a Chi-Square Test Using SUDAAN

In this task, you will use the chi-square test in SUDAAN to determine whether calcium supplement use and treatment for osteoporosis are independent of each other for men and women ages 50 years and older.


Step 1: Determine variables of interest

This example uses the demoadv dataset (download at Sample Code and Datasets). The dataset already contains the variable anycalsup, which has a value of 1 for those who report calcium supplement use and a value of 2 for those who do not. A participant was considered a non-user if his or her daily average amount of calcium supplement use was zero; otherwise, the participant was considered a supplement user (see Supplement Code under Sample Code and Module 9, Task 4 for more information).

The variable treatosteo indicates treatment for osteoporosis. It was coded “yes” if the participant responded “yes” to OSQ.070 (“{Were you/Was SP} treated for osteoporosis?”) from the osteoporosis questionnaire. It was coded “no” if the participant responded “no” to either OSQ.070 or OSQ.060 (“Has a doctor ever told {you/SP} that {you/s/he} had osteoporosis, sometimes called thin or brittle bones?”) from the same questionnaire. (The SAS code to create this variable is found in the “Supplement Program” sample SAS code.)

The demoadv dataset for this example includes only those with MEC weights (wtmec2yr>0).
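The coding rules described in this step can be sketched as a SAS data step. This is an illustration only and is not the tutorial's “Supplement Program”. The input dataset name (work.osteo_in), the output dataset name, the variable avgcalsup (daily average calcium from supplements), the variable names osq060 and osq070 for items OSQ.060 and OSQ.070, and the numeric codes 1 = yes and 2 = no are all assumptions made for the example. Leaving unresolved cases as missing is also a choice made here, not something the tutorial specifies.

```sas
/* Sketch only: not the tutorial's "Supplement Program".            */
/* Assumed names: work.osteo_in (input), avgcalsup (daily average   */
/* calcium from supplements), osq060 and osq070 (questionnaire      */
/* items). Assumed codes: 1 = yes, 2 = no.                          */
data work.demoadv_sketch;
  set work.osteo_in;

  /* Keep only participants with a positive MEC weight */
  if wtmec2yr > 0;

  /* Supplement use: a daily average of zero means non-user */
  if avgcalsup = 0 then anycalsup = 2;
  else if avgcalsup > 0 then anycalsup = 1;
  /* A missing avgcalsup leaves anycalsup missing */

  /* Osteoporosis treatment: "yes" to OSQ.070 means treated;        */
  /* "no" to OSQ.070 or to OSQ.060 means not treated                */
  if osq070 = 1 then treatosteo = 1;
  else if osq070 = 2 or osq060 = 2 then treatosteo = 2;
  /* Any other response pattern leaves treatosteo missing */
run;
```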


Step 2: Set Up SUDAAN to Perform Chi-Square Test

The chi-square statistic is requested from the SUDAAN procedure proc crosstab. The summary table below provides an example of how to code a chi-square test in SUDAAN.


These programs use variable formats listed in the sample program. You may need to format the variables in your dataset the same way to reproduce results presented in the tutorial.
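As a rough illustration of what such formats might look like, the following proc format step defines the three format names used in this task (gender., yesnos., and yesno.). The value labels are assumptions based on the coding described in Step 1 and on the usual NHANES coding of riagendr (1 = male, 2 = female). The definitions in the sample program are authoritative and may differ.

```sas
/* Sketch only: the value labels below are assumptions.             */
/* Use the definitions from the tutorial's sample program when      */
/* reproducing the published results.                               */
proc format;
  value gender 1 = 'Male'
               2 = 'Female';
  value yesnos 1 = 'Yes'   /* anycalsup: 1 = reports supplement use */
               2 = 'No';
  value yesno  1 = 'Yes'   /* treatosteo: assumed 1 = treated       */
               2 = 'No';
run;
```

The formats must be defined in the SAS session before the rformat statements in proc crosstab can refer to them.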


Calculating the Chi-Square Test Using the SUDAAN Procedure proc crosstab

Statements Explanation

proc sort data=demoadv;

  by sdmvstra sdmvpsu;

run;

Use the SAS procedure proc sort to sort the data by strata (sdmvstra) and PSUs (sdmvpsu) before running the crosstab procedure in SUDAAN.

proc crosstab data=demoadv design=wr;

Use proc crosstab to examine the relationship between two categorical variables.

nest sdmvstra sdmvpsu;

Use the nest statement with strata (sdmvstra) and PSU (sdmvpsu) to account for the design effects.

weight wtmec2yr;

Use the weight statement to account for the unequal probability of sampling and non-response. In this example, the MEC weight for 2 years of data is used.

subpopn ridageyr >= 50;

Use the subpopn statement to select those ages 50 years and older.

Note that for accurate estimates of the standard error, it is preferable to use subpopn in SUDAAN to select a subgroup for analysis, rather than selecting the study subgroup in a SAS data step when preparing the analytical data file. (See Section 5.4 of Korn and Graubard, Analysis of Data from Health Surveys, pp. 207-211.)

class riagendr anycalsup treatosteo/NoFreq;

Use the class statement for categorical variables in version 9.0. In earlier versions, you need subgroup and levels statements instead. The NoFreq option suppresses printing frequencies in the output.

table riagendr*anycalsup*treatosteo;


Use the table statement to choose the categorical variables gender (riagendr), supplement use (anycalsup) and osteoporosis treatment (treatosteo) for cross tabulation.

print nsum rowper colper/tests=all;


Use the print statement to obtain the N (nsum), row percent (rowper), and column percent (colper). Use the tests option to request all available statistics.

rformat riagendr gender.;  rformat anycalsup yesnos.;  rformat treatosteo yesno.;

Use the rformat statement to read the SAS formats into SUDAAN.

rtitle "Chi-square test for calcium supplement use and osteoporosis: NHANES 2003-2004" ;

run ;

Use the rtitle statement to title the output.
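For convenience, the statements from the table can be assembled into one program, as sketched below. It assumes that SAS-callable SUDAAN is installed, that demoadv is available in the current SAS session, and that the formats gender., yesnos., and yesno. have already been defined as in the sample program. Only the comments are additions; the statements are those listed above.

```sas
/* Sort by the design variables before calling SUDAAN */
proc sort data=demoadv;
  by sdmvstra sdmvpsu;
run;

/* Chi-square test of supplement use by osteoporosis treatment, by gender */
proc crosstab data=demoadv design=wr;
  nest sdmvstra sdmvpsu;        /* strata and PSUs                        */
  weight wtmec2yr;              /* 2-year MEC weight                      */
  subpopn ridageyr >= 50;       /* subset within SUDAAN, not a data step  */
  class riagendr anycalsup treatosteo / NoFreq;
  table riagendr*anycalsup*treatosteo;
  print nsum rowper colper / tests=all;
  rformat riagendr gender.;
  rformat anycalsup yesnos.;
  rformat treatosteo yesno.;
  rtitle "Chi-square test for calcium supplement use and osteoporosis: NHANES 2003-2004";
run;
```

The subgroup is selected with subpopn rather than by filtering demoadv beforehand, so the full design information stays available for variance estimation.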



In SUDAAN Version 9.0, proc crosstab provides only limited chi-square results (Wald), with p-values based on unadjusted F-statistics, which are not the recommended statistics for complex survey data. However, the SUDAAN regression procedures do produce the recommended F-adjusted chi-square statistics (e.g., Rao-Scott and Satterthwaite) for use in analyzing NHANES data. SUDAAN Version 10 may have additional capabilities.


Step 3: Review Output

