Supplementary Resources for National Health Interview Survey Public Use Files With Variance Estimation Singleton PSUs
The National Health Interview Survey (NHIS) sample is selected by a multistage process, beginning with the selection of geographic areas called primary sampling units (PSUs) that are defined within sampling strata. The public use file variance estimation structure for the NHIS consists of variance estimation PSUs and variance estimation strata. These resemble the sampling PSUs and sampling strata but are deliberately not identical to them, in order to limit disclosure risk.
In general, variance calculations require at least 2 variance estimation PSUs in each variance estimation stratum. If a variance estimation stratum contains only one variance estimation PSU, that PSU is referred to as a “singleton PSU” or by a similar term. If a singleton PSU is present in a variance estimation stratum, special techniques are required to obtain appropriate variance estimates.
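As a minimal sketch of the definition above, the following Python function scans (stratum, PSU) pairs and reports every stratum that contains exactly one distinct PSU. The function name, the input layout, and the sample values are illustrative assumptions; they are not NHIS variable names or NHIS data.

```python
from collections import defaultdict


def find_singleton_strata(records):
    """Return {stratum: psu} for every stratum with exactly one distinct PSU.

    `records` is an iterable of (stratum, psu) pairs, one pair per sample
    record. This layout is a hypothetical one chosen for the illustration.
    """
    psus_by_stratum = defaultdict(set)
    for stratum, psu in records:
        psus_by_stratum[stratum].add(psu)
    # A stratum is a singleton stratum when only one distinct PSU appears in it.
    return {
        stratum: next(iter(psus))
        for stratum, psus in psus_by_stratum.items()
        if len(psus) == 1
    }


# Made-up values for illustration only; stratum 2 has a single PSU.
sample = [(1, 1), (1, 2), (2, 1), (2, 1), (3, 1), (3, 2)]
singletons = find_singleton_strata(sample)
```

Running a check of this kind on the stratum and PSU variables of an analysis file, after any subsetting, is one way to see whether special handling is needed.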
The public use file variance estimation structure for the 1997-present NHIS almost always provides 2 variance estimation PSUs for each variance estimation stratum. This page provides supplementary resources for the small number of occurrences of singleton PSUs in the 1997-present NHIS public use files so that data users can obtain appropriate variance estimates.
Singleton PSUs occur in the following public use files:
- 2003 Sample Child File (1 singleton PSU)
- 2005 Sample Child File (2 singleton PSUs)
- 2007 Sample Child File, 2007 Child Alternative Medicine File (2 singleton PSUs)
Some complex sample design software packages (e.g., SUDAAN, Stata 10, and R with the survey add-on package) have options available to compute appropriate variance estimates when singleton PSUs are present. Other complex sample design software packages (e.g., SPSS, Stata 9, and the SAS survey procedures) do not compute appropriate variance estimates when singleton PSUs are present.
NCHS has created supplemental files to enable users to compute appropriate variance estimates with all contemporary complex sample design packages. It is not necessary to use these supplemental files with SUDAAN, Stata 10, or R, but the files can be used if a user chooses to do so. If the supplemental files are not used, Stata 9 will generate missing values for standard error estimates. SPSS and the SAS survey procedures will instead produce standard error estimates that are slightly smaller than they should be.
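The two failure modes described above can be illustrated with a small sketch. It uses a common textbook between-PSU variance estimator for an estimated total, in which each stratum with n PSUs contributes n/(n-1) times the sum of squared deviations of its weighted PSU totals from their stratum mean. This is an assumption made for illustration: it is not a description of how any particular package works internally, and the function name, the `singleton` option, and the data values are hypothetical.

```python
import math


def stratified_total_variance(psu_totals_by_stratum, singleton="error"):
    """Between-PSU variance estimate for an estimated total.

    `psu_totals_by_stratum` maps each stratum to a list of weighted PSU
    totals. `singleton` is a hypothetical option for this sketch:
      "error"   - raise, because n / (n - 1) is undefined when n == 1
      "missing" - return NaN, i.e. a missing variance estimate
      "skip"    - silently drop the stratum's contribution
    """
    variance = 0.0
    for stratum, totals in psu_totals_by_stratum.items():
        n = len(totals)
        if n < 2:
            if singleton == "skip":
                continue
            if singleton == "missing":
                return math.nan
            raise ValueError(f"stratum {stratum!r} has fewer than 2 PSUs")
        mean = sum(totals) / n
        variance += n / (n - 1) * sum((t - mean) ** 2 for t in totals)
    return variance


# Made-up weighted PSU totals; stratum 2 is a singleton stratum.
with_singleton = {1: [10.0, 14.0], 2: [7.0]}
# The same design if stratum 2 had a second PSU.
two_psus_each = {1: [10.0, 14.0], 2: [7.0, 9.0]}

v_skip = stratified_total_variance(with_singleton, singleton="skip")
v_missing = stratified_total_variance(with_singleton, singleton="missing")
v_full = stratified_total_variance(two_psus_each)
```

In this sketch, dropping the singleton stratum leaves out a contribution that can never be negative, so the resulting variance, and hence the standard error, cannot exceed the value obtained when every stratum contributes. Returning NaN corresponds to reporting a missing standard error.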