## Key Concepts About Specifying Sampling Parameters in NHANES I Using SUDAAN and SAS Survey Procedures

Accounting for the complex sampling design of NHANES I is critical when calculating statistical estimates and their standard errors for means, geometric means, percentages, and other statistics. Replication and linearization are two statistical methods that can properly address these complex design issues. The SAS Survey procedures and SUDAAN both use linearization to calculate standard errors for a variety of statistics, such as means, geometric means, and percentages.

### SUDAAN

Currently, SUDAAN offers six options for designating the survey design (see the SUDAAN manual for more details about the use and implications of each design option). SUDAAN assumes a with-replacement (WR) design if the design parameter is omitted.

In the next task, you will use the with-replacement (WR) design to analyze NHANES I data.

To implement the WR sampling option in SUDAAN, you need the design variables that specify the first stage of the cluster design (the strata and PSUs) and the sample weight.

For more detailed information and sample code, see "Task 2a: How to Use SUDAAN Code to Specify Sampling Parameters in NHANES I."

### SAS Survey Procedures

In SAS, a group of procedures known as the Survey procedures produces estimates from complex sample survey data. These procedures can also produce variance estimates through linearization (see the Variance Estimation module) and confidence limits for many estimates. Taylor series linearization is the default variance estimation method in the SAS Survey procedures. Unlike in SUDAAN, the sample design is not specified directly in the PROC statement; instead, the strata and PSU variables are specified in separate statements (STRATA and CLUSTER). The sample weight is likewise specified in its own WEIGHT statement.

For more detailed information and sample code, see "Task 2b: How to Use SAS Survey Code to Specify Sampling Parameters in NHANES I."

### Sample Weights

A sample weight is assigned to each sample person. It measures the number of people in the population represented by that sample person in NHANES I, and it reflects the unequal probability of selection, the nonresponse adjustment, and the adjustment to independent population controls. Because sample persons are selected with unequal probabilities, as in the NHANES I sample, the sample weights must be used to produce unbiased national estimates. More information about sample weights and how they are created can be found in the Weighting module.
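To make the role of the weights concrete, the following is a minimal Python sketch (not SUDAAN or SAS code) that compares an unweighted mean with a weighted mean. The records, values, and function names are hypothetical and invented for illustration; they are not NHANES I data or variable names.

```python
# Hypothetical records: (measured value, sample weight).
# These numbers are invented for illustration; they are not NHANES I data.
records = [
    (120.0, 1000.0),  # this person represents about 1,000 people
    (130.0, 1000.0),
    (150.0, 4000.0),  # sampled at a lower rate, so the weight is larger
]


def unweighted_mean(rows):
    """Simple average that ignores the sample weights."""
    return sum(value for value, _ in rows) / len(rows)


def weighted_mean(rows):
    """Weighted average: sum(w * y) / sum(w)."""
    total_weight = sum(weight for _, weight in rows)
    return sum(weight * value for value, weight in rows) / total_weight


def estimated_population(rows):
    """The sum of the weights estimates the size of the represented population."""
    return sum(weight for _, weight in rows)


if __name__ == "__main__":
    print("unweighted mean:", unweighted_mean(records))
    print("weighted mean:", weighted_mean(records))
    print("estimated population:", estimated_population(records))
```

In this made-up example the third person carries four times the weight of the others, so the weighted mean is pulled toward that person's value, while the unweighted mean treats all three records equally.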

### Variance Units

Unlike continuous NHANES, where masked variance units (MVUs) are used, NHANES I did not create MVUs. Instead, pseudo primary sampling units (pseudo-PSUs) and stratification variables are provided. In NHANES I, 35 pseudo strata and 235 pseudo-PSUs were created for variance estimation for the 1-65 or 1-100 sample (location) design.
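As a rough illustration of how the stratum, PSU, and weight variables enter a linearization variance estimate under a with-replacement design, here is a minimal Python sketch. It uses the standard textbook estimator for a weighted mean (between-PSU variation within each stratum, no finite population correction). It is an assumption-laden teaching example, not the code that SUDAAN or the SAS Survey procedures run, and the function name, tuple layout, and data are hypothetical.

```python
import math
from collections import defaultdict


def linearized_mean_and_se(rows):
    """Weighted mean and its Taylor-linearized standard error (WR design).

    rows: iterable of (stratum, psu, weight, value) tuples.
    Assumes PSUs are nested within strata and each stratum has >= 2 PSUs.
    """
    rows = list(rows)
    total_weight = sum(w for _, _, w, _ in rows)
    mean = sum(w * y for _, _, w, y in rows) / total_weight

    # Sum the linearized scores w * (y - mean) / total_weight within each PSU.
    psu_totals = defaultdict(lambda: defaultdict(float))
    for stratum, psu, w, y in rows:
        psu_totals[stratum][psu] += w * (y - mean) / total_weight

    # Variance = sum over strata of n_h / (n_h - 1) * sum_j (z_hj - zbar_h)^2.
    variance = 0.0
    for stratum, totals in psu_totals.items():
        n_h = len(totals)
        if n_h < 2:
            raise ValueError(f"stratum {stratum!r} has fewer than 2 PSUs")
        z_bar = sum(totals.values()) / n_h
        variance += n_h / (n_h - 1) * sum((z - z_bar) ** 2 for z in totals.values())

    return mean, math.sqrt(variance)


# Hypothetical data: two strata with two PSUs each (not NHANES I values).
hypothetical_rows = [
    (1, 1, 1.0, 1.0),
    (1, 2, 1.0, 3.0),
    (2, 1, 1.0, 2.0),
    (2, 2, 1.0, 6.0),
]

if __name__ == "__main__":
    mean, se = linearized_mean_and_se(hypothetical_rows)
    print("mean:", mean, "standard error:", se)
```

Only the differences between PSU-level totals within a stratum contribute to the variance, which is why both the pseudo-stratum and pseudo-PSU variables must be supplied along with the sample weight.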