As stated in the module on sampling in NHANES, NHANES has a complex, multistage, probability cluster design. Typically, individuals within a cluster (e.g., a county, school, city, or census block) are more similar to one another than to those in other clusters, and this homogeneity of individuals within a given cluster is measured by the intracluster correlation. When working with a complex sample, you ideally want to decrease the amount of correlation between sample persons within clusters. To achieve this, you want to sample fewer people within each cluster but sample more clusters. However, because of operational limitations (e.g., the cost of moving the survey's mobile examination centers [MECs], geographic distances between primary sampling units [PSUs], etc.), NHANES can only sample 30 PSUs within a 2-year survey cycle. The sample size in each PSU is roughly equal, and the design is intended to yield about 5,000 examined persons per year.
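The trade-off between cluster size and number of clusters can be illustrated with a small sketch. It uses Kish's textbook approximation, DEFF ≈ 1 + (m - 1) × ρ, for equal-sized clusters of m persons with intracluster correlation ρ. This approximation, the function name `kish_deff`, and the numbers below are assumptions chosen for illustration; they are not NHANES values and not a formula given in this module.

```python
def kish_deff(cluster_size: float, icc: float) -> float:
    """Approximate design effect for equal-sized clusters: 1 + (m - 1) * rho.

    This is Kish's approximation, used here only as an illustration.
    """
    return 1.0 + (cluster_size - 1.0) * icc


# Hypothetical scenario: a fixed total of 3,000 sample persons and an
# assumed intracluster correlation of 0.02, spread over more or fewer clusters.
total_n = 3000
icc = 0.02
for n_clusters in (15, 30, 60):
    persons_per_cluster = total_n / n_clusters
    deff = kish_deff(persons_per_cluster, icc)
    print(f"{n_clusters} clusters x {persons_per_cluster:.0f} persons: DEFF ~ {deff:.2f}")
```

Under these assumed values, spreading the same number of persons over more clusters lowers the approximate design effect, which is the motivation for sampling fewer people per cluster when operations allow it.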
In a complex sample survey setting such as NHANES, variance estimates computed using standard statistical software packages that assume simple random sampling are biased and generally too low (i.e., statistical significance is overstated), because they account for neither the differential weighting nor the correlation among sample persons within a cluster.
The impact of the complex sample design upon variance estimates is measured by the design effect (DEFF). It is defined as the ratio of the variance of a statistic which accounts for the complex sample design to the variance of the same statistic based on a hypothetical simple random sample of the same size.
Design Effect = Variance estimate (cluster) / Variance estimate (simple random sample)
If the DEFF is 1, the variance for the estimate under the cluster sampling is the same as the variance under simple random sampling. The DEFFs for NHANES are typically greater than 1.
When the DEFF is greater than 1, the effective sample size is less than the number of sample persons but greater than the number of clusters. The effective sample size is calculated by dividing the sample size in a subgroup by the DEFF. Another way to think about clustering is that there is a loss of precision and a reduction in the effective sample size because individuals are chosen within clusters instead of being sampled randomly throughout the population.
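The two definitions above (the DEFF as a ratio of variances, and the effective sample size as the sample size divided by the DEFF) can be written as a minimal sketch. The function names and the input values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def design_effect(var_complex: float, var_srs: float) -> float:
    """Ratio of the design-based variance to the simple-random-sample variance."""
    if var_srs <= 0:
        raise ValueError("var_srs must be positive")
    return var_complex / var_srs


def effective_sample_size(n: float, deff: float) -> float:
    """Sample size divided by the design effect."""
    if deff <= 0:
        raise ValueError("deff must be positive")
    return n / deff


# Hypothetical variance estimates for the same statistic in one subgroup.
deff = design_effect(var_complex=4.5, var_srs=3.0)
n_eff = effective_sample_size(n=1200, deff=deff)
print(f"DEFF = {deff}, effective sample size = {n_eff}")
```

With these made-up inputs the DEFF is 1.5, so a subgroup of 1,200 sample persons carries about as much information as a simple random sample of 800.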
Moving from a 6-year (i.e., NHANES III) to a 2-year data release in the continuous NHANES, both the number of persons sampled and the number of geographic areas (PSUs) sampled are smaller. Because of the smaller sample sizes in each 2-year cycle of the continuous NHANES, the data are subject to larger sampling variation. For example, standard errors for a variable in NHANES 1999-2000 will be approximately 70% greater than for the corresponding variable in NHANES III (or when combining three cycles of the continuous NHANES).
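One way to see where a figure of roughly 70% could come from is the usual rule that a standard error scales with one over the square root of the sample size. The sketch below assumes equal sample sizes per survey year and the same design effect in both releases; these are simplifying assumptions for illustration, not statements about how the NHANES figure was derived.

```python
import math


def se_inflation(years_long: float, years_short: float) -> float:
    """Ratio of standard errors (short release / long release).

    Assumes sample size is proportional to the number of survey years
    and that the design effect is the same in both releases.
    """
    return math.sqrt(years_long / years_short)


ratio = se_inflation(years_long=6, years_short=2)
print(f"SE ratio ~ {ratio:.2f}, i.e. about {100 * (ratio - 1):.0f}% larger")
```

Under these assumptions the ratio is the square root of 3, about 1.73, which is in line with the approximate 70% increase quoted above.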
Design effects for a variable can differ across race/ethnicity or age groups. Within the continuous NHANES survey, the DEFF can also differ widely from one variable to another because of differences in variation by geography, in household intraclass correlation, and in demographic heterogeneity. Because DEFFs vary so much across variables within each 2-year cycle of the continuous NHANES, it is difficult to set a single minimum sample size for analysis. The general statistical consideration is that an estimated proportion should have a relative standard error of 30% or less. The NHANES III Analytic Guidelines contain sample sizes required for reliable estimates and for testing differences between subdomains. The required sample size depends on the DEFF for the variable of interest. The sample size tables in Appendix B of the NHANES III Analytic Guidelines provide guidance, but it is best to compute an estimate of the sampling error of a statistic and use a reliability cut-point such as a 30% relative standard error.
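The reliability check described above can be sketched as follows. The relative standard error is taken as the standard error divided by the estimate, expressed as a percentage. The function names and example numbers are hypothetical, and the standard error passed in is assumed to come from a design-based procedure.

```python
def relative_standard_error(estimate: float, standard_error: float) -> float:
    """Relative standard error in percent: 100 * SE / |estimate|."""
    if estimate == 0:
        raise ValueError("RSE is undefined for an estimate of zero")
    return 100.0 * standard_error / abs(estimate)


def is_reliable(estimate: float, standard_error: float, cutoff_pct: float = 30.0) -> bool:
    """True when the RSE is at or below the chosen cut-point (30% by default)."""
    return relative_standard_error(estimate, standard_error) <= cutoff_pct


# Hypothetical prevalence estimates with design-based standard errors.
for prevalence, se in [(0.08, 0.02), (0.03, 0.012)]:
    rse = relative_standard_error(prevalence, se)
    print(f"estimate={prevalence}, SE={se}, RSE={rse:.0f}%, reliable={is_reliable(prevalence, se)}")
```

In this made-up example the first estimate has an RSE of 25% and passes the cut-point, while the second has an RSE of 40% and would be flagged as unreliable.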
Software that accounts for the complex sample design, such as SUDAAN or the SAS survey procedures, must be used to calculate an asymptotically unbiased estimate of the variance and should be used for all statistical tests and the construction of confidence limits. These procedures require information on the first stage of the sample design (identification of the PSU and stratum) for each sample person.
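To show why the stratum and PSU identifiers are needed, here is a minimal sketch of one common design-based approach: Taylor series linearization for a weighted mean, treating PSUs as sampled with replacement within strata. It is an illustration under those assumptions, not a description of what SUDAAN or SAS do internally, and it is not a substitute for them. The record layout `(stratum, psu, weight, y)` and the function name are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def weighted_mean_with_se(records):
    """Weighted mean and linearized standard error.

    records: iterable of (stratum, psu, weight, y) tuples (hypothetical layout).
    Assumes with-replacement sampling of PSUs within strata.
    """
    records = list(records)
    total_w = sum(w for _, _, w, _ in records)
    mean = sum(w * y for _, _, w, y in records) / total_w

    # Linearized score for each person, summed to PSU totals within strata.
    psu_totals = defaultdict(lambda: defaultdict(float))
    for stratum, psu, w, y in records:
        psu_totals[stratum][psu] += w * (y - mean) / total_w

    # Between-PSU variability within each stratum drives the variance.
    variance = 0.0
    for stratum, totals in psu_totals.items():
        n_psu = len(totals)
        if n_psu < 2:
            raise ValueError(f"stratum {stratum!r} needs at least two PSUs")
        stratum_mean = sum(totals.values()) / n_psu
        ss = sum((t - stratum_mean) ** 2 for t in totals.values())
        variance += n_psu / (n_psu - 1) * ss
    return mean, sqrt(variance)


# Toy data: 2 strata x 2 PSUs x 2 persons, all weights equal to 1.
toy = [
    (1, 1, 1.0, 1.0), (1, 1, 1.0, 2.0), (1, 2, 1.0, 3.0), (1, 2, 1.0, 4.0),
    (2, 1, 1.0, 5.0), (2, 1, 1.0, 6.0), (2, 2, 1.0, 7.0), (2, 2, 1.0, 8.0),
]
mean, se = weighted_mean_with_se(toy)
print(f"mean = {mean}, SE = {se:.4f}")
```

Note that the variance is built entirely from differences between PSU totals within each stratum, which is why the first-stage identifiers must be available for every sample person.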
Park, I. and Lee, H. (2004). "Design effects for the weighted mean and total estimators under complex survey sampling." Survey Methodology 30:183-193.