Task 2d: How to Perform a Chi-Square Test Using NHANES-CMS Linked Data in SUDAAN

In this task, you will use the chi-square test to determine whether obesity is associated with gender among participants who were aged 65 or older at the NHANES examination and have claims on the 2005 Carrier File.

Step 1: Set Up SUDAAN to Perform a Chi-Square Test

The chi-square statistic is requested from the SUDAAN procedure proc crosstab. The summary table below shows how to code a chi-square test in SUDAAN.

Information

These programs use variable formats listed in the Tutorial Formats page. You may need to format the variables in your dataset the same way to reproduce results presented in the tutorial.
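The rformat statements in this task refer to two formats, sexfmt. and obese_., which must exist before SUDAAN is called. Their authoritative definitions are on the Tutorial Formats page; the sketch below only illustrates the kind of proc format step involved. The value labels are assumptions: 1 = Male and 2 = Female follows the usual NHANES coding of riagendr, and the 1/2 coding shown for obese is hypothetical (it is chosen only because SUDAAN class variables are coded with positive integers).

```sas
/* Illustrative only: use the definitions from the Tutorial Formats page */
proc format;
  value sexfmt
    1 = 'Male'
    2 = 'Female';
  value obese_
    1 = 'Obese'        /* assumed coding */
    2 = 'Not obese';   /* assumed coding */
run;
```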


Calculating chi-square Using SUDAAN Procedure proc crosstab
Statements Explanation

proc sort data=DS1;
by sdmvstra sdmvpsu;
run;


Use the SAS procedure, proc sort, to sort the data by strata (sdmvstra) and PSUs (sdmvpsu) before running the procedure in SUDAAN.

proc crosstab data=DS1 design=wr;

Use proc crosstab to examine the relationship between two categorical variables. The design=wr option tells SUDAAN to treat the PSUs as sampled with replacement.

nest sdmvstra sdmvpsu;

Use the nest statement with strata (sdmvstra) and PSU (sdmvpsu) to account for the design effects.

weight wt_linkage_adj;

Use the weight statement to account for the unequal probability of sampling and nonresponse. In this example, the weight adjusted for linkage non-response (wt_linkage_adj) is used for the six years of data.

subpopn ridageyr >= 65 and cms_medicare_match=1 and on_carrier_2005=1; 

Use the subpopn statement to select the subgroup of interest. In this example, it selects people aged 65 or older (ridageyr>=65) who linked to the Medicare files at some point during 1999-2007 (cms_medicare_match=1) and were on the 2005 Carrier File (on_carrier_2005=1).
Note that, for accurate estimates of the standard error, it is preferable to select a subgroup with subpopn in SUDAAN rather than to subset the data in SAS when preparing the data file, because subsetting beforehand can drop strata and PSUs that the variance calculation needs. (See Section 5.4 of Korn and Graubard, Analysis of Health Surveys, pp. 207-211.)

class riagendr obese/NoFreq;

Use the class statement for categorical variables in SUDAAN version 9.0. In earlier versions, you need the subgroup and levels statements instead. The NoFreq option suppresses printing of frequencies in the output.

table riagendr*obese;

Use the table statement to choose the categorical variables gender (riagendr) and indicator for being obese (obese) for cross tabulation.

print nsum rowper colper/tests=all;

Use the print statement to obtain the sample size (nsum), row percent (rowper), and column percent (colper). Use the tests option to request all available test statistics.

rformat riagendr sexfmt.;

rformat obese obese_.;

Use the rformat statement to read the SAS formats into SUDAAN.

rtitle "Chi-square test for gender by obesity status and on the 2005 Carrier File: NHANES 1999-2004 linked to Medicare 1999-2007";


Use the rtitle statement to title the output.
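For convenience, the statements from the table above can be assembled into a single program. This is a sketch rather than the tutorial's official program: every statement is taken from the table, while the run; statements, the indentation, and the comments are additions, and DS1 is assumed to already contain the linked NHANES-CMS variables and to have the formats sexfmt. and obese_. defined.

```sas
/* Sort by the design variables before calling SUDAAN */
proc sort data=DS1;
  by sdmvstra sdmvpsu;
run;

/* Design-based cross tabulation of gender by obesity status */
proc crosstab data=DS1 design=wr;
  nest sdmvstra sdmvpsu;
  weight wt_linkage_adj;
  subpopn ridageyr >= 65 and cms_medicare_match=1 and on_carrier_2005=1;
  class riagendr obese / nofreq;
  table riagendr*obese;
  print nsum rowper colper / tests=all;
  rformat riagendr sexfmt.;
  rformat obese obese_.;
  rtitle "Chi-square test for gender by obesity status and on the 2005 Carrier File: NHANES 1999-2004 linked to Medicare 1999-2007";
run;
```

Keeping the subgroup selection in the subpopn statement, rather than in a where clause or a data step, leaves the full design structure available for variance estimation.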

Information

SUDAAN version 9.0 proc crosstab provides only limited chi-square results (Wald), with p-values based on unadjusted F-statistics, which are not the recommended statistics for complex survey data. However, the SUDAAN regression procedures do produce the recommended F-adjusted chi-square statistics (e.g., Rao-Scott and Satterthwaite) for use in analyzing NHANES data.
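One way to obtain an adjusted statistic might be to fit a logistic regression of obesity on gender with proc rlogist, reusing the same design statements. The sketch below is not part of the tutorial and rests on several assumptions: the data set DS2 and the variable obese01 are hypothetical, the recode assumes obese = 1 means obese and obese = 2 means not obese, and the test keywords (satadjchi, satadjf) should be checked against the SUDAAN Language Manual for your version.

```sas
/* Hypothetical recode: RLOGIST expects a 0/1 outcome.
   Assumes obese = 1 means obese and obese = 2 means not obese. */
data DS2;
  set DS1;                        /* DS1 is already sorted by sdmvstra sdmvpsu */
  if obese = 1 then obese01 = 1;
  else if obese = 2 then obese01 = 0;
  else obese01 = .;               /* keep missing values missing */
run;

proc rlogist data=DS2 design=wr;
  nest sdmvstra sdmvpsu;
  weight wt_linkage_adj;
  subpopn ridageyr >= 65 and cms_medicare_match=1 and on_carrier_2005=1;
  class riagendr;
  model obese01 = riagendr;
  test satadjchi satadjf;         /* Satterthwaite-adjusted chi-square and F */
  rformat riagendr sexfmt.;
run;
```

With a single two-level predictor, the test for riagendr addresses the same question as the cross tabulation: whether obesity status differs by gender in the subpopulation.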


Step 2: Review Output



Korn, E.L. and B.I. Graubard.  1999.  Analysis of Health Surveys. New York: Wiley.


