## Key Concepts About NHANES II Survey Design

NHANES II data are NOT obtained using a simple random sample. Rather, a complex, multistage probability sampling design is used to select participants representative of the civilian, non-institutionalized U.S. population. The sample excludes persons residing in nursing homes, members of the armed forces, other institutionalized persons, and U.S. nationals living abroad.

### NHANES II Sampling Procedure

The NHANES II sampling procedure consists of four stages, described below.

#### Four Stages of the NHANES II Sampling Procedure

• Stage 1: Primary sampling units (PSUs) were selected. These were mostly single counties or, in a few cases, groups of contiguous counties, selected with probability proportional to a measure of size (PPS).
• Stage 2: The PSUs were divided into segments (generally city blocks or their equivalent). As with the PSUs, sample segments were selected with PPS.
• Stage 3: Households within each segment were listed, and a sample was randomly drawn. In geographic areas where the proportion of age groups selected for oversampling was high, the probability of selection for those groups was greater than in other areas.
• Stage 4: Individuals were chosen to participate in NHANES II from a list of all persons residing in selected households. After screening, individuals were drawn at random within designated age subdomains. Approximately one person per sample household was selected for an exam.
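The link between the four stages and a person's chance of selection can be illustrated with a minimal Python sketch. It is not the NCHS procedure: the function name, the measures of size, and the per-stage probabilities are all hypothetical, chosen only to show how PPS probabilities are formed and how stage probabilities multiply.

```python
def pps_selection_probabilities(sizes, n_selected):
    """Selection probability of each unit when n_selected units are
    drawn with probability proportional to a measure of size (PPS)."""
    total_size = sum(sizes)
    probs = [n_selected * size / total_size for size in sizes]
    if any(p > 1 for p in probs):
        raise ValueError("a unit is too large to be sampled with PPS")
    return probs


# Hypothetical measures of size for three segments in one PSU.
segment_sizes = [100, 300, 600]
segment_probs = pps_selection_probabilities(segment_sizes, n_selected=1)

# Hypothetical selection probabilities at each of the four stages
# (PSU, segment, household, person). The overall probability of
# selection is their product.
stage_probs = [0.05, segment_probs[1], 0.10, 0.50]
overall_prob = 1.0
for p in stage_probs:
    overall_prob *= p

# The base weight (before any adjustments) is the inverse of the
# overall selection probability.
base_weight = 1.0 / overall_prob
print(segment_probs, overall_prob, base_weight)
```

Larger segments receive proportionally larger selection probabilities, and a person with a small overall probability of selection receives a correspondingly large base weight.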

### What is a Sample Weight?

A sample weight is assigned to each sample person. It measures the number of people in the population represented by that sample person in NHANES II, and it reflects the unequal probability of selection, the nonresponse adjustment, and the adjustment to independent population control totals. Because sample persons are selected with unequal probabilities, as in the NHANES II sample, the sample weights must be used to produce unbiased national estimates. More information about sample weights and how they are created can be found in the Weighting module.
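As a rough illustration of why the weights matter, the Python sketch below compares an unweighted and a weighted proportion. The data and weights are invented for the example and do not come from any NHANES II file; the helper name `weighted_mean` is likewise hypothetical.

```python
def weighted_mean(values, weights):
    """Weighted mean: each value counts in proportion to its sample weight."""
    total_weight = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Hypothetical sample of four persons. The first two belong to an
# oversampled subgroup, so each represents fewer people (smaller weight).
has_condition = [1, 1, 0, 0]
sample_weights = [1000, 1000, 4000, 4000]

unweighted = sum(has_condition) / len(has_condition)
weighted = weighted_mean(has_condition, sample_weights)

print(unweighted)  # 0.5, overstates prevalence because of oversampling
print(weighted)    # 0.2, each person counted by the population represented
```

In this toy case the unweighted proportion is distorted by the oversampled subgroup, while the weighted proportion restores each person's share of the represented population.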

### Oversampling

NHANES II was designed to sample larger numbers of certain subgroups of particular public health interest. Certain subgroups were oversampled to increase the reliability and precision of estimates of health status indicators for these population subgroups.

Different subgroups have been oversampled in other survey years. For example, during the late 1960s and early 1970s, there was concern that people of very low income and women of childbearing age were at greater risk of malnutrition than the general population. Therefore, during the first National Health and Nutrition Examination Survey (NHANES I), conducted in 1971-74, these subgroups were oversampled.

The NHANES II sampling frame covered sample persons aged 6 months through 74 years. In NHANES II, preschool children (6 months-5 years), older persons (60-74 years), and the poor (persons below the poverty level defined by the U.S. Bureau of the Census using 1970 census results) were the subgroups that were oversampled.

WARNING

For your own analyses, it is critical to carefully review the documentation for each survey cycle to determine which subgroups were oversampled.

### Strata and Variance Units

The NHANES II sample represented the total civilian, non-institutionalized population, 6 months through 74 years of age, in the 50 states and the District of Columbia. The first stage of the design consisted of a sample of 64 PSUs that were mostly individual counties. The sampling frame consisted of PSUs from the National Health Interview Survey that were categorized as either self-representing or non-self-representing within each of the four main Census regions (Northeast, Midwest, South, and West). These PSUs were then stratified into 64 superstrata, and one PSU was selected from each. More detail on the stratification and selection of PSUs, as well as the selection of housing units and sample persons, can be found in the NCHS series report titled "Plan and Operation of the Second National Health and Nutrition Examination Survey 1976-80".

NHANES II was conducted over a four-year period (1976-1980). As with NHANES 1999-2004, the PSUs in NHANES II are divided into strata with two PSUs in each stratum. Together, these strata and PSUs represent the variance units (the sampling units used to estimate the sampling error). Unlike the continuous NHANES, where Masked Variance Units (MVUs) were used, MVUs were not created for NHANES II. Instead, pseudo primary sampling unit and pseudo-stratum variables are provided. In NHANES II, 32 pseudo-strata and 64 pseudo-PSUs were created for variance estimation. All data files contain the sample design and sample weight variables, and their names are designated according to the name of the data file. Please see Task 2 of the Locate Variables module for an explanation of the NHANES II variable naming conventions. For the purposes of this tutorial, the sample design variables from the Medical History Questionnaire for Ages 12-74 years will be used. The SAS code provided will create and name the pseudo-strata variable as N2AH0324 and the pseudo-PSU variable as N2AH0326.
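To show how the pseudo-strata and pseudo-PSUs enter a variance calculation, here is a minimal Python sketch of a Taylor-linearization (with-replacement) standard error for a weighted mean. It is not the tutorial's SAS code, and design-based survey software should be used for real analyses. The record layout, the `y` and `weight` field names, and the toy data are assumptions; only the variable names N2AH0324 and N2AH0326 come from the text above.

```python
from collections import defaultdict
from math import sqrt


def weighted_mean_with_se(records, y="y", weight="weight",
                          stratum="N2AH0324", psu="N2AH0326"):
    """Weighted mean and its linearized standard error, treating PSUs
    as sampled with replacement within each stratum."""
    total_weight = sum(r[weight] for r in records)
    mean = sum(r[weight] * r[y] for r in records) / total_weight

    # Sum the linearized values w * (y - mean) / W within each PSU.
    psu_totals = defaultdict(lambda: defaultdict(float))
    for r in records:
        psu_totals[r[stratum]][r[psu]] += r[weight] * (r[y] - mean) / total_weight

    # Between-PSU variability within each stratum, summed over strata.
    variance = 0.0
    for totals in psu_totals.values():
        z = list(totals.values())
        n_psu = len(z)
        if n_psu < 2:
            raise ValueError("each stratum needs at least two PSUs")
        z_bar = sum(z) / n_psu
        variance += n_psu / (n_psu - 1) * sum((zi - z_bar) ** 2 for zi in z)
    return mean, sqrt(variance)


# Toy data: 2 pseudo-strata with 2 pseudo-PSUs each (NHANES II has 32 and 64).
toy = [
    {"N2AH0324": 1, "N2AH0326": 1, "weight": 1.0, "y": 1.0},
    {"N2AH0324": 1, "N2AH0326": 2, "weight": 1.0, "y": 3.0},
    {"N2AH0324": 2, "N2AH0326": 1, "weight": 1.0, "y": 2.0},
    {"N2AH0324": 2, "N2AH0326": 2, "weight": 1.0, "y": 6.0},
]
mean, se = weighted_mean_with_se(toy)
print(mean, se)
```

With two PSUs per stratum, each stratum's contribution reduces to the squared difference between its two PSU totals, which is why the paired pseudo-PSU design makes variance estimation straightforward.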