Key Concepts About Variance Estimation Methods in NHANES


Brief description of variance estimation procedures used with NHANES data

Variances of estimates (sampling errors) should be calculated for all survey estimates to aid in determining statistical reliability. For complex sample surveys, exact mathematical formulas for variance estimates are usually not available, so variance approximation procedures are required to provide reasonable estimates of sampling error. Two variance approximation procedures that account for the complex sample design and allow design effects to be computed are replication methods and Taylor Series Linearization. Initially, the delete-1 jackknife method, a replication method, was used to estimate variances based on data from the NHANES 1999-2000 survey. Balanced repeated replication (BRR) was used for NHANES III.


Currently, NCHS recommends the use of Taylor Series Linearization for variance estimation in all NHANES surveys. SUDAAN, Stata, and the SAS Survey procedures can be used to obtain variance estimates by this method. Survey design variables identifying the strata and PSUs are required in order to use these software packages. If replication methods are used, you must compute your own replicate weights.
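To illustrate what computing your own replicate weights can involve, the following is a minimal Python sketch of a generic delete-one-PSU jackknife for a weighted mean. It is not the replicate-weight scheme NCHS used for NHANES 1999-2000. The function names (jackknife_replicate_weights, weighted_mean, jackknife_variance) are hypothetical, the sketch assumes at least two PSUs in every stratum, and the four-record data set is made up purely for illustration.

```python
from collections import defaultdict


def jackknife_replicate_weights(w, strata, psu):
    """Build one replicate weight vector per PSU (delete-one-PSU jackknife).

    Returns a list of (stratum, n_h, weights) tuples, where n_h is the
    number of PSUs in the stratum the dropped PSU belongs to.
    """
    psus_in_stratum = defaultdict(set)
    for h, j in zip(strata, psu):
        psus_in_stratum[h].add(j)

    replicates = []
    for h in sorted(psus_in_stratum):
        n_h = len(psus_in_stratum[h])
        if n_h < 2:
            raise ValueError(f"stratum {h} needs at least two PSUs")
        for dropped in sorted(psus_in_stratum[h]):
            rep = []
            for wi, hi, ji in zip(w, strata, psu):
                if hi != h:
                    rep.append(float(wi))              # other strata: unchanged
                elif ji == dropped:
                    rep.append(0.0)                    # dropped PSU: weight zero
                else:
                    rep.append(wi * n_h / (n_h - 1))   # same stratum: scaled up
            replicates.append((h, n_h, rep))
    return replicates


def weighted_mean(y, w):
    """Weighted mean of y using weights w."""
    return sum(wi * yi for wi, yi in zip(w, y)) / sum(w)


def jackknife_variance(y, w, strata, psu):
    """Jackknife variance of the weighted mean of y."""
    full = weighted_mean(y, w)
    variance = 0.0
    for _, n_h, rep in jackknife_replicate_weights(w, strata, psu):
        variance += (n_h - 1) / n_h * (weighted_mean(y, rep) - full) ** 2
    return variance


# Made-up toy data: two strata with two PSUs each, one record per PSU.
y = [1, 3, 2, 4]
w = [1, 1, 1, 1]
strata = [1, 1, 2, 2]
psu = [1, 2, 1, 2]
print(jackknife_variance(y, w, strata, psu))  # prints 0.5
```

In practice, survey software that supports replication methods takes the replicate weights as input, so only the weight construction step would normally need to be done by hand.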


Taylor Linearization Procedures

For either linearization or replication, strata and PSU variables must be available on the survey data file. Because of confidentiality issues associated with a two-year data release, true PSUs cannot be released. In order to use the Taylor Series Linearization approach for variance estimation, Masked Variance Units (MVUs) were created and provided on the demographic data files. MVUs are equivalent to the Pseudo-PSUs used to estimate sampling errors in past NHANES. These MVUs on the data file are not the "true" design PSUs, but they produce variance estimates that closely approximate the variances that would have been estimated using the "true" design. See Sampling module: Key Concepts about NHANES Survey Design.


These MVUs have been created and provided for the continuous NHANES (e.g., NHANES 1999-2000, NHANES 2001-2002, and so on) and will be added to the demographic data files for all two-year survey cycles. They can also be used when combining four or more years of data. The stratum variable is sdmvstra and the PSU variable is sdmvpsu. See Sampling module: Key Concepts about NHANES Survey Design.
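To show how the stratum and PSU variables enter the calculation, the following is a minimal Python sketch of Taylor Series Linearization for a weighted mean. It is not the implementation used by SUDAAN, Stata, or SAS. The function name taylor_mean_se is hypothetical, the sketch assumes PSUs are treated as sampled with replacement within strata and that every stratum has at least two PSUs, and the toy values stand in for sdmvstra, sdmvpsu, and a sample weight rather than coming from NHANES.

```python
from collections import defaultdict
from math import sqrt


def taylor_mean_se(y, w, strata, psu):
    """Weighted mean of y and its linearized standard error.

    strata and psu play the roles of sdmvstra and sdmvpsu.
    """
    total_w = sum(w)
    mean = sum(wi * yi for wi, yi in zip(w, y)) / total_w

    # Linearized score for each record, summed to the PSU level
    # within each stratum.
    psu_totals = defaultdict(lambda: defaultdict(float))
    for yi, wi, h, j in zip(y, w, strata, psu):
        psu_totals[h][j] += wi * (yi - mean) / total_w

    # Between-PSU variation within each stratum, summed over strata.
    variance = 0.0
    for h, totals in psu_totals.items():
        n_h = len(totals)
        if n_h < 2:
            raise ValueError(f"stratum {h} needs at least two PSUs")
        zbar = sum(totals.values()) / n_h
        variance += n_h / (n_h - 1) * sum((z - zbar) ** 2 for z in totals.values())

    return mean, sqrt(variance)


# Made-up toy data: two strata with two PSUs each, one record per PSU.
y = [1, 3, 2, 4]
w = [1, 1, 1, 1]
sdmvstra = [1, 1, 2, 2]
sdmvpsu = [1, 2, 1, 2]

mean, se = taylor_mean_se(y, w, sdmvstra, sdmvpsu)
print(mean, se ** 2)
```

Note that the PSU codes are only compared within a stratum, so the same sdmvpsu value appearing in two different strata refers to two different units.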


Software such as SUDAAN, Stata, SPSS, and SAS Survey procedures can all be used to estimate sampling errors by the Taylor series (linearization) method.

