
## Variance Estimation

### Purpose

This module introduces the basic concepts of variance (sampling error) estimation for NHANES data.  You will learn how the complex survey design of NHANES and clustering of the data affect variance estimation, which methods are appropriate to use when calculating variance for NHANES data, and how to calculate degrees of freedom and construct confidence limits for NHANES estimates.

### Comparison of Variance Estimation in NHANES III and Continuous NHANES

The content of the Variance Estimation module in the Continuous NHANES tutorial also applies to NHANES III data, so links to the main tutorial are provided here for your reference. The following text points out the key differences between NHANES III and Continuous NHANES in variance estimation.

#### Methods of Variance Estimation

As with Continuous NHANES, the Taylor Linearization Method is the recommended method for variance estimation that incorporates the complex survey sample design. Replication approaches are also acceptable, and replicate weights for the Balanced Repeated Replication (BRR) Method are provided on the NHANES III files. For the combined 6-year NHANES III sample, 52 replicate weights were created for both the interviewed and MEC-examined samples using Fay's Method, a variant of the BRR Method. For more details on Fay's Method, refer to Judkins (1990) (see the Analytic Guidelines for the full reference).
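To illustrate how replicate weights are used, the following is a minimal sketch of the standard Fay-adjusted BRR variance formula, Var = Σ(θ_r − θ)² / (R(1 − k)²), where θ is the full-sample estimate, θ_r is the estimate recomputed with replicate weight r, R is the number of replicates, and k is the Fay coefficient. The function names and the `fay_k` parameter are illustrative assumptions; this module does not state the value of k used for NHANES III, so take it from the survey documentation rather than from this example.

```python
def weighted_mean(values, weights):
    """Weighted mean of values using the supplied weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def fay_brr_variance(full_estimate, replicate_estimates, fay_k):
    """Fay-adjusted BRR variance of an estimate.

    full_estimate: estimate computed with the full-sample weight.
    replicate_estimates: same estimate recomputed with each replicate weight.
    fay_k: Fay coefficient, 0 <= k < 1 (k = 0 gives standard BRR).
           Hypothetical parameter; look up the value in the survey docs.
    """
    if not 0 <= fay_k < 1:
        raise ValueError("fay_k must satisfy 0 <= k < 1")
    n_reps = len(replicate_estimates)
    if n_reps == 0:
        raise ValueError("at least one replicate estimate is required")
    # Sum of squared deviations of replicate estimates from the full estimate.
    squared_dev = sum((est - full_estimate) ** 2 for est in replicate_estimates)
    return squared_dev / (n_reps * (1 - fay_k) ** 2)
```

In practice you would call `weighted_mean` once with the full-sample weight and once with each of the 52 replicate weights, then pass the results to `fay_brr_variance`. The standard error is the square root of the returned variance.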

#### Survey Design Variables

Pseudo primary sampling units (pseudo-PSUs) and pseudo-stratum variables are provided in NHANES III, as opposed to the masked variance units and strata provided in Continuous NHANES. However, this difference does not affect how you specify the sample design in your statistical program for variance estimation.

#### Estimate Stability and Sample Design

As mentioned in the basic tutorial, variance stability is greatly increased by including more years of data.  For the same reasons it is suggested you combine survey cycles in NHANES 1999-2004, it is also recommended to combine both phases of NHANES III data when looking at estimates for detailed sub-domains.  Combining phases increases the stability of the estimate and the stability of the estimated variance.

In addition, because the Phase 1 and Phase 2 samples are not statistically independent, an analyst cannot compare estimates from Phase 1 with estimates from Phase 2. There is no correct way to determine whether the differences seen are real or are due to sampling variability.

Other than the differences pointed out above, the rest of the principles covered in the basic tutorial apply to NHANES III as well. Please consult the basic tutorial on this topic.

#### Variables

The variables needed to specify the sample design are SDPPSU6 (the PSU variable) and SDPSTRA6 (the stratum variable). Use these when analyzing the combined 6-year, two-phase NHANES III survey. Separate stratum and PSU variables are also available for analyzing an individual phase when necessary: SDPSTRA1 and SDPPSU1 for Phase 1, and SDPSTRA2 and SDPPSU2 for Phase 2.
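As a rough illustration of how the stratum and PSU variables enter a Taylor Linearization calculation, the sketch below estimates a weighted mean, its linearized variance, and the design degrees of freedom (number of PSUs minus number of strata). It assumes PSUs are treated as sampled with replacement within strata, which is the usual simplification in survey software. The function name, the record layout, and the `"weight"` and `"y"` keys are hypothetical; only SDPSTRA6 and SDPPSU6 come from the NHANES III files. For real analyses, use survey software rather than this sketch.

```python
from collections import defaultdict


def linearized_mean_variance(records, y_key, w_key,
                             stratum_key="SDPSTRA6", psu_key="SDPPSU6"):
    """Return (weighted mean, linearized variance, degrees of freedom).

    records: list of dicts, one per sample person (hypothetical layout).
    """
    total_w = sum(r[w_key] for r in records)
    mean = sum(r[w_key] * r[y_key] for r in records) / total_w

    # Sum the linearized values w * (y - mean) / W within each PSU.
    psu_totals = defaultdict(lambda: defaultdict(float))
    for r in records:
        z = r[w_key] * (r[y_key] - mean) / total_w
        psu_totals[r[stratum_key]][r[psu_key]] += z

    variance = 0.0
    n_psus = 0
    for psus in psu_totals.values():
        n_h = len(psus)
        if n_h < 2:
            raise ValueError("each stratum needs at least two PSUs")
        mean_z = sum(psus.values()) / n_h
        # Between-PSU variability within this stratum.
        variance += n_h / (n_h - 1) * sum((z - mean_z) ** 2
                                          for z in psus.values())
        n_psus += n_h

    degrees_of_freedom = n_psus - len(psu_totals)
    return mean, variance, degrees_of_freedom


# Toy data only: two strata, two PSUs each, one person per PSU.
toy = [
    {"SDPSTRA6": 1, "SDPPSU6": 1, "weight": 1.0, "y": 1.0},
    {"SDPSTRA6": 1, "SDPPSU6": 2, "weight": 1.0, "y": 3.0},
    {"SDPSTRA6": 2, "SDPPSU6": 1, "weight": 1.0, "y": 5.0},
    {"SDPSTRA6": 2, "SDPPSU6": 2, "weight": 1.0, "y": 7.0},
]
mean, variance, df = linearized_mean_variance(toy, "y", "weight")
```

To analyze a single phase, you would pass the phase-specific names instead, for example `stratum_key="SDPSTRA1"` and `psu_key="SDPPSU1"`, together with the matching phase weight.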

Go to Continuous NHANES Web Tutorial Variance Estimation module