## Task 2: Key Concepts about Generating Confidence Intervals

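Typically, point estimates from a sufficiently large probability sample are
approximately normally distributed. The endpoints of the confidence interval,
then, are a function of the estimate ($\hat{\theta}$), its standard error
($SE(\hat{\theta})$), and a percentile of the normal distribution with zero mean
and unit variance, referred to as the standard normal deviate (z score), and are
given by:

#### Equation for Confidence Interval Endpoints

$$
\hat{\theta} \pm z_{1-\alpha/2} \cdot SE(\hat{\theta})
$$

where $z_{1-\alpha/2}$ is the $100(1-\alpha/2)$th percentile of the standard
normal distribution (about 1.96 for a 95% confidence interval, $\alpha = 0.05$).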
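As a minimal sketch of the large-sample calculation described above, the following Python snippet computes the two endpoints from an estimate and its standard error. The function name `z_confidence_interval` and the input values are hypothetical and chosen only for illustration; the standard normal percentile comes from the standard library's `statistics.NormalDist`.

```python
from statistics import NormalDist

def z_confidence_interval(estimate, std_error, level=0.95):
    """Return (lower, upper) endpoints: estimate +/- z * standard error."""
    # Percentile of the standard normal distribution (zero mean, unit variance);
    # for level=0.95 this is the 97.5th percentile, about 1.96.
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return estimate - z * std_error, estimate + z * std_error

# Hypothetical estimate (120.0) and standard error (2.0), not NHANES results.
lower, upper = z_confidence_interval(120.0, 2.0)
print(round(lower, 2), round(upper, 2))
```

This version ignores the survey design entirely, so it is only appropriate where the normal approximation with a z score applies.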

The continuous NHANES sample is a multistage, area probability sample. The
number of independent pieces of information, or degrees of freedom, depends on
the number of primary sampling units (PSUs) rather than on the number of sample
persons. Sample persons within a given PSU are not independent. Therefore, a
t-statistic with degrees
of freedom equal to the difference between the number of PSUs and the number of
strata containing observations is used instead of a z-statistic, which would
otherwise be used in a large sample. The endpoints for a confidence interval
for the continuous NHANES are given by:

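#### Equations for Confidence Interval Endpoints in Continuous NHANES

$$
\text{Lower endpoint} = \hat{\theta} - t_{df,\,1-\alpha/2} \cdot SE(\hat{\theta})
$$

$$
\text{Upper endpoint} = \hat{\theta} + t_{df,\,1-\alpha/2} \cdot SE(\hat{\theta})
$$

where $\hat{\theta}$ is the estimate, $SE(\hat{\theta})$ is its standard error,
and $t_{df,\,1-\alpha/2}$ is the $100(1-\alpha/2)$th percentile of the t
distribution with
$df = (\text{number of PSUs}) - (\text{number of strata containing observations})$
degrees of freedom.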
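The degrees-of-freedom rule and the t-based endpoints can be sketched in Python as follows. This is an illustrative sketch, not the tutorial's own program: all function names are hypothetical, the design (15 strata with 2 PSUs each) and the estimate are made up, and the t percentile is obtained with a simple numerical routine (Simpson's rule plus bisection) so that the example depends only on the standard library. In practice a statistics library, for example SciPy's `scipy.stats.t.ppf`, would normally supply the percentile.

```python
import math

def design_degrees_of_freedom(design_pairs):
    """df = (number of PSUs) - (number of strata) among the observed records."""
    psus = set(design_pairs)                    # distinct (stratum, psu) pairs
    strata = {stratum for stratum, _ in psus}   # strata containing observations
    return len(psus) - len(strata)

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x) for x >= 0, integrating the density on [0, x] by Simpson's rule."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_quantile(p, df):
    """Percentile for 0.5 < p < 1 by bisection (assumes the answer is below 100)."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def t_confidence_interval(estimate, std_error, df, level=0.95):
    """Return (lower, upper) endpoints: estimate -/+ t * standard error."""
    t = t_quantile(1 - (1 - level) / 2, df)
    return estimate - t * std_error, estimate + t * std_error

# Hypothetical design: 15 strata with 2 PSUs each, so 30 PSUs and df = 15.
pairs = [(s, p) for s in range(1, 16) for p in (1, 2)]
df = design_degrees_of_freedom(pairs)
lower, upper = t_confidence_interval(120.0, 2.0, df)
print(df, round(lower, 2), round(upper, 2))
```

Because the t percentile exceeds the corresponding z score when the degrees of freedom are small, the resulting interval is wider than the z-based interval for the same estimate and standard error.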

Sample weights and other design features (e.g., strata, PSUs) must be
incorporated when calculating an estimate and its standard error (see “Module 5:
Overview of NHANES Survey Design and Weighting” for more information). Taylor
Series Linearization is one design-based method for estimating standard errors.
The design variables needed to obtain standard errors through this method are
provided on the demographic files for the continuous NHANES (see below for an
example of a program).
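To make the role of the weights, strata, and PSUs concrete, here is a minimal Python sketch of Taylor Series Linearization for a weighted mean. It assumes a with-replacement approximation at the PSU level, which is a common simplification and not necessarily the exact method used by any particular survey package. The function name, the record layout `(stratum, psu, weight, y)`, and the data are all hypothetical.

```python
import math
from collections import defaultdict

def linearized_mean_se(records):
    """Weighted mean and its linearized standard error.

    records: iterable of (stratum, psu, weight, y) tuples (hypothetical layout).
    """
    records = list(records)
    w_total = sum(w for _, _, w, _ in records)
    mean = sum(w * y for _, _, w, y in records) / w_total

    # Linearized value for each person is w * (y - mean) / w_total;
    # sum these within each PSU, keeping PSUs grouped by stratum.
    psu_totals = defaultdict(lambda: defaultdict(float))
    for stratum, psu, w, y in records:
        psu_totals[stratum][psu] += w * (y - mean) / w_total

    # Between-PSU variability within each stratum, summed over strata.
    variance = 0.0
    for totals in psu_totals.values():
        n_h = len(totals)
        if n_h < 2:
            raise ValueError("each stratum needs at least two PSUs")
        z_bar = sum(totals.values()) / n_h
        variance += n_h / (n_h - 1) * sum((z - z_bar) ** 2 for z in totals.values())
    return mean, math.sqrt(variance)

# Made-up records: 2 strata, 2 PSUs per stratum, 2 persons per PSU.
data = [
    (1, 1, 1500.0, 10.0), (1, 1, 1200.0, 12.0),
    (1, 2, 1800.0, 14.0), (1, 2, 1000.0, 16.0),
    (2, 1, 2000.0, 20.0), (2, 1, 1700.0, 22.0),
    (2, 2, 1300.0, 24.0), (2, 2, 1600.0, 26.0),
]
estimate, std_error = linearized_mean_se(data)
print(round(estimate, 3), round(std_error, 3))
```

The standard error here is driven by differences between PSU totals within each stratum, not by differences between individual persons, which is why the degrees of freedom depend on the number of PSUs and strata.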

#### Interpretation

Confidence intervals, as constructed above, are based on one possible sample
from a finite population. Many other samples of the same size could be obtained
using the same procedures and measurements, and a confidence interval could be
constructed for each of them. For a 95% CI, 95% of these intervals would be
expected to contain the true value of the population parameter.
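The repeated-sampling interpretation can be checked with a small simulation. The sketch below draws many simple random samples from a made-up finite population, builds a 95% interval from each, and reports the share of intervals that contain the true population mean. It deliberately uses simple random sampling and a z score for brevity, which is an assumption of this illustration and not the NHANES design; the function name and all parameter values are hypothetical.

```python
import random
from statistics import NormalDist, mean, stdev

def simulate_coverage(n_samples=2000, n=100, level=0.95, seed=1):
    """Share of sample-based intervals that contain the true population mean."""
    rng = random.Random(seed)
    # Made-up finite population of 100,000 values.
    population = [rng.gauss(50, 10) for _ in range(100_000)]
    true_mean = sum(population) / len(population)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)

    hits = 0
    for _ in range(n_samples):
        sample = rng.sample(population, n)      # one possible sample
        estimate = mean(sample)
        std_error = stdev(sample) / n ** 0.5
        if estimate - z * std_error <= true_mean <= estimate + z * std_error:
            hits += 1
    return hits / n_samples

print(simulate_coverage())
```

The printed share should land close to 0.95. Any single interval either contains the true value or does not; the 95% figure describes the procedure across repeated samples.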