Task 3a: How to Create an Appropriate Subset of Your Data for NHANES Analyses in SUDAAN

The following example shows the key code needed to create subsets of your data appropriately for SUDAAN and SAS Survey procedure analyses. It highlights only the statements that create the subset. For examples of full SUDAAN and SAS Survey procedure code, please see the Logistic Regression module.

Example used throughout this task:  You are interested in analyzing only 20-49 year old females who were tested for total cholesterol in a 2-year dataset.

Step 1: Create Dataset

First, you determine that you will include all MEC-examined individuals in your dataset.

The ridstatr variable in the demographic file takes the value 1 for participants who were interviewed only and the value 2 for participants who were both interviewed and examined. Therefore, in the SAS data step, you keep only the records with ridstatr=2 to create a MEC-examined subset of data.
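A minimal sketch of such a data step follows. It assumes a hypothetical input dataset named demo_lab that already combines the demographic and total cholesterol files; the output name mec_examined is also an assumption made for illustration.

```sas
/* demo_lab and mec_examined are hypothetical dataset names. */
data mec_examined;
  set demo_lab;
  /* Subsetting IF: keep participants who were both interviewed and MEC examined. */
  if ridstatr = 2;
run;
```

Note that the data step filters only on ridstatr. The age, sex, and cholesterol criteria are left for the analysis procedure.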

Step 2: Specify correct weight in program

Next, in SUDAAN you specify the correct weight to be used in the procedure by using a weight statement. Since you are using a single 2-year cycle, use the wtmec2yr variable.

Step 3: Include selected subset

Then, in the SUDAAN procedure, you include a subpopn statement that restricts the analysis to females aged 20 through 49 years who have a valid measure for the total cholesterol variable lbxtc.

IMPORTANT NOTE

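The correct method for creating a subset of your sample population for SUDAAN analyses is to use the subpopn statement to designate the sample subdomains to analyze. In the SAS data step, subset only on variables related to the sample weight, such as ridstatr. Deleting other records in the data step can remove design information that SUDAAN needs to estimate variances correctly.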
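For the SAS Survey procedures, one common way to follow the same principle is to flag the subgroup with an indicator variable and pass it to a DOMAIN statement rather than deleting records. The sketch below is illustrative only. The dataset names mec_examined and mec_flagged and the indicator sel are hypothetical, and sdmvstra and sdmvpsu are assumed here to be the NHANES design variables (stratum and PSU), which this task does not mention.

```sas
/* mec_examined, mec_flagged, and sel are hypothetical names. */
data mec_flagged;
  set mec_examined;
  /* Flag the subgroup instead of dropping the other records. */
  if ridageyr >= 20 and ridageyr <= 49 and riagendr = 2
     and not missing(lbxtc) then sel = 1;
  else sel = 2;
run;

proc surveymeans data=mec_flagged nobs mean stderr;
  strata sdmvstra;   /* assumed design stratum variable */
  cluster sdmvpsu;   /* assumed design PSU variable */
  weight wtmec2yr;
  domain sel;        /* estimates are reported for each level of sel */
  var lbxtc;
run;
```

The estimates of interest are those reported for sel = 1. All MEC-examined records stay in the dataset, so the full design is still available for variance estimation.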

Sample Statements to Include Weight and Select Subset of Dataset in SUDAAN Procedures

Statement: if ridstatr=2;
Explanation: The ridstatr variable in the demographic file takes the value 1 for participants who were interviewed only and the value 2 for participants who were both interviewed and examined. Use this statement in the SAS data step to create a MEC-examined subset of data.

Statement: weight wtmec2yr;
Explanation: Specifies the correct weight to be used in the procedure.

Statement: subpopn ridageyr >=20 and ridageyr <=49 and riagendr=2 and lbxtc > -1 ;
Explanation: Restricts the analysis to females aged 20 through 49 years who have a valid measure for the total cholesterol variable lbxtc.
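To show how the weight and subpopn statements might sit together in one SUDAAN call, here is a hedged sketch. PROC DESCRIPT is chosen only for illustration, since this task does not name a procedure. The dataset name mec_examined is hypothetical, and the nest variables sdmvstra and sdmvpsu are assumed to be the NHANES design variables. The sort is included because SUDAAN expects the input to be sorted by the nest variables.

```sas
/* mec_examined is a hypothetical dataset already limited to ridstatr = 2. */
proc sort data=mec_examined;
  by sdmvstra sdmvpsu;
run;

proc descript data=mec_examined design=wr;
  nest sdmvstra sdmvpsu;   /* assumed design stratum and PSU variables */
  weight wtmec2yr;         /* 2-year MEC exam weight */
  /* Subset inside the procedure, not in the data step. */
  subpopn ridageyr >=20 and ridageyr <=49 and riagendr=2 and lbxtc > -1;
  var lbxtc;
run;
```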

IMPORTANT NOTE

SUDAAN does not accept some SAS syntax, such as GE for >= or LE for <=. You also cannot combine criteria in a single range expression such as 20 le ridageyr le 49, or use '.' to designate a missing value. See the SUDAAN manual for other limitations.