Task 2: Key Concepts About the NHANES Sample Weights

Each sampled person in NHANES is assigned a numerical sample weight that measures the number of people in the population represented by that specific person. Sample weights for NHANES participants incorporate adjustments for unequal selection probabilities and certain types of non-response, as well as an adjustment to independent estimates (called control totals) of population sizes for specific age, sex, and race/ethnicity categories. These adjustments are made at the aggregate level for an NHANES sample, so that estimates computed from that sample are nationally representative. Because not all sampled persons completed all portions of the survey, each individual represented in a public release data file may have several different sample weights assigned, depending on the nature of the non-response adjustments required.

Two sets of sample weight variables are included in the demographics data file: an interview weight and an exam weight. Two further sets are included in the files for the 24-hour dietary recall data: a Day 1 dietary weight and (for 2003-04 and later surveys) a two-day dietary weight. For most analyses, a user should use one of these four versions (see the “Choosing the correct weight” section). However, special subsamples have their own sets of sample weights that differ from the main four.

The set of all individuals who have nonzero values for a particular version of sample weights comprises a nationally representative sample, so long as those sample weights are incorporated into statistical analyses. Append and merge operations that pull together data elements from many data files may produce missing or zero values for sample weight variables, so it is important to ensure that the data set for a particular analysis does not include individuals with zero or missing values for the desired version of sample weight.
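The check described above can be sketched in a few lines of Python. This is a minimal illustration rather than NHANES-supplied code: the records are invented, `has_usable_weight` is a hypothetical helper, and the variable names `SEQN` (respondent identifier) and `WTMEC2YR` (two-year exam weight) are assumed here for illustration; substitute the weight variable appropriate to your analysis.

```python
import math


def has_usable_weight(record, weight_var):
    """Return True when the record carries a positive, non-missing weight."""
    w = record.get(weight_var)
    if w is None:
        return False
    if isinstance(w, float) and math.isnan(w):
        return False
    return w > 0


# Hypothetical merged records; None marks a value that went missing in a merge.
merged = [
    {"SEQN": 1, "WTMEC2YR": 25000.0},
    {"SEQN": 2, "WTMEC2YR": 0.0},
    {"SEQN": 3, "WTMEC2YR": None},
    {"SEQN": 4, "WTMEC2YR": 41000.5},
]

# Keep only individuals with a nonzero, non-missing value for the chosen weight.
analysis_set = [r for r in merged if has_usable_weight(r, "WTMEC2YR")]
kept_ids = [r["SEQN"] for r in analysis_set]
```

Running the same filter once per weight version makes it easy to see how many records each version would retain before any estimates are computed.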



Due to the way NHANES participants are selected, sample weights must always be used to produce an unbiased national estimate.


Sample weights are assigned in three steps (for more detailed information about these steps, see the Sample Design module in the Continuous NHANES Web Tutorial):

  1. the base weight is calculated;
  2. adjustments for unit non-response are made; and
  3. adjustments to population sizes (control totals) for selected age, gender, and race/ethnicity groups are made.

1. Calculating the base weight

In general, a sample person is assigned a weight equal to the reciprocal of his/her probability of selection. In other words:

Sample person's weight equals 1 divided by the probability of selection


However, calculating the base weight for a sample person in NHANES is much more complicated due to the survey's complex, multistage design. In NHANES, the following equation, which takes into account the survey design, is used to determine the base weight for a sample person:

Base weight equals 1 divided by the final probability



Final probability equals probability of the PSU being selected multiplied by probability of the segment of the PSU being selected multiplied by probability of a household being selected multiplied by probability of an individual being selected
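The two equations above can be expressed as a short Python sketch. The function name `base_weight` and the stage probabilities are hypothetical values chosen for illustration; they are not actual NHANES selection probabilities.

```python
def base_weight(p_psu, p_segment, p_household, p_person):
    """Base weight = 1 / (product of the selection probabilities at each stage)."""
    final_probability = p_psu * p_segment * p_household * p_person
    if not 0 < final_probability <= 1:
        raise ValueError("final probability must be in the interval (0, 1]")
    return 1.0 / final_probability


# Illustrative (made-up) probabilities for the four sampling stages.
w = base_weight(p_psu=0.5, p_segment=0.25, p_household=0.125, p_person=0.5)
print(w)  # 128.0: this sample person represents 128 people in the population
```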


2. Adjusting for unit non-response

In NHANES, an individual may be broadly categorized as a non-respondent to two major components of the survey: the in-home interview and the MEC exam. An individual is considered a non-respondent to the interview if he/she was selected to be in the sample but did not participate in the in-home interview. Similarly, an individual who agreed to complete the interview but did not agree to, or did not come in for, the MEC portion of the survey is considered a non-respondent to the exam. Adjustments made for these types of survey non-response account only for sample person interview or exam non-response, not for component/item non-response (e.g., a sample person declined to have their blood pressure measured in the examination component but completed all other examination components).

The base weights were adjusted for non-response to the in-home interview when creating interview weights and further adjusted for non-response to the MEC exam when creating exam weights.
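This section does not give the formula used for the non-response adjustment. As an illustration only, the sketch below uses a common weighting-class approach, which is an assumption rather than the documented NHANES procedure: within each adjustment cell, respondents' weights are scaled up so that they also account for the weight of non-respondents, and non-respondents receive a weight of zero. The cell labels, weights, and the function name `adjust_for_nonresponse` are hypothetical.

```python
from collections import defaultdict


def adjust_for_nonresponse(people):
    """Weighting-class adjustment (assumed method, for illustration).

    Each person is a dict with keys 'cell', 'weight', and 'responded'.
    Respondents' weights are multiplied by
    (sum of all weights in the cell) / (sum of respondent weights in the cell);
    non-respondents are assigned a weight of zero.
    """
    total = defaultdict(float)
    responding = defaultdict(float)
    for p in people:
        total[p["cell"]] += p["weight"]
        if p["responded"]:
            responding[p["cell"]] += p["weight"]

    adjusted = []
    for p in people:
        if not p["responded"]:
            adjusted.append({**p, "weight": 0.0})
            continue
        if responding[p["cell"]] == 0:
            raise ValueError("cell has no respondents: " + str(p["cell"]))
        factor = total[p["cell"]] / responding[p["cell"]]
        adjusted.append({**p, "weight": p["weight"] * factor})
    return adjusted


# Hypothetical base weights in two adjustment cells.
people = [
    {"cell": "A", "weight": 100.0, "responded": True},
    {"cell": "A", "weight": 100.0, "responded": False},
    {"cell": "A", "weight": 200.0, "responded": True},
    {"cell": "B", "weight": 150.0, "responded": True},
    {"cell": "B", "weight": 50.0, "responded": False},
]
adjusted = adjust_for_nonresponse(people)
```

Under this approach the total weight in each cell is unchanged by the adjustment; it is simply redistributed from non-respondents to respondents. Applying the same step again to exam respondents would mirror the "further adjusted" wording above.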

Most individuals in NHANES have nonzero interview and nonzero exam weights. These two sample weight versions are the most commonly used for statistical analyses, but there are two more versions of sample weights, used specifically for dietary analyses, that incorporate additional non-response adjustments.

At the MEC interview, each individual was asked to complete a 24-hour dietary recall. An individual who did not agree to complete the recall, or who provided unreliable data for the recall, is considered a non-respondent for the Day 1 dietary recall. In addition to adjustments for interview and MEC exam non-response, the base weights were adjusted for non-response to the Day 1 dietary recall when creating the Day 1 dietary recall weights.

For 2003-04 and later surveys, each individual who completed the 24-hour dietary recall in the MEC was scheduled to complete a second recall at a later date by telephone. An individual who did not agree to complete the second recall, or who provided unreliable data for the recall, is considered a non-respondent for the second dietary recall. In addition to adjustments for interview, MEC exam, and Day 1 dietary recall non-response, the base weights were adjusted for non-response to the second dietary recall when creating the two-day dietary recall weights.

Special subsamples

Statistically defined (or random) subsamples of MEC exam participants are asked to take part in additional survey components, including lab, nutrition/dietary, environmental, and mental health components. (Please see the respective survey protocol/documentation for more specific information.) For example, approximately one-half of participants are selected to give a fasting blood sample on the morning of their MEC exams. The subsamples selected for these components are chosen at random with a specified sampling fraction (for example, 1/2 or 1/3 of the total examined group) according to the protocol for that component. Each component subsample has its own designated sample weight, which accounts for the additional probability of selection into the subsample, as well as the additional non-response.
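The effect of the additional selection probability can be illustrated with a small Python sketch. It covers only the selection part and assumes the subsample weight starts from the exam weight; the additional non-response adjustment is omitted. The function name and the numbers are hypothetical. In practice, analysts use the subsample weight variable supplied with the data file rather than computing one.

```python
def subsample_selection_weight(exam_weight, sampling_fraction):
    """Divide the exam weight by the subsample sampling fraction.

    Illustrates only the extra probability of selection into a subsample;
    it does not include the additional non-response adjustment.
    """
    if not 0 < sampling_fraction <= 1:
        raise ValueError("sampling fraction must be in the interval (0, 1]")
    return exam_weight / sampling_fraction


# A made-up exam weight of 30,000 and a one-half subsample.
w_sub = subsample_selection_weight(30000.0, 0.5)
print(w_sub)  # 60000.0: each subsampled person stands in for twice as many people
```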


3. Adjustment to population control totals

In addition to accounting for sample person non-response, weights are also post-stratified to match the population control totals for each sampling subdomain. This additional adjustment makes the weighted counts match independent population counts from the Current Population Survey (CPS) of the U.S. Census Bureau.



Please see the CPS website for more information: http://www.bls.gov/cps/home.htm.

For the Day 1 and two-day dietary weights, an additional post-stratification step was performed to balance recalls across days of the week.
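The basic idea of post-stratifying to control totals can be sketched as follows. This is a simplified illustration under stated assumptions: one scaling factor per cell, invented cell labels and control totals, and a hypothetical function name `poststratify`. The actual NHANES procedure is not described in detail in this section.

```python
def poststratify(people, control_totals):
    """Scale weights so that each cell's weights sum to its control total.

    Each person is a dict with keys 'cell' and 'weight';
    control_totals maps a cell label to an independent population count.
    """
    weight_sums = {}
    for p in people:
        weight_sums[p["cell"]] = weight_sums.get(p["cell"], 0.0) + p["weight"]

    return [
        {**p, "weight": p["weight"] * control_totals[p["cell"]] / weight_sums[p["cell"]]}
        for p in people
    ]


# Hypothetical non-response-adjusted weights and control totals.
people = [
    {"cell": "A", "weight": 100.0},
    {"cell": "A", "weight": 300.0},
    {"cell": "B", "weight": 200.0},
]
control_totals = {"A": 500.0, "B": 150.0}
final = poststratify(people, control_totals)
```

After the adjustment, the weights in cell A sum to 500 and the weight in cell B equals 150, matching the control totals by construction.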


Weights for over-sampled subgroups

The average sample weight within specific subgroups of the overall population may vary because of over-sampling. For certain subgroups of particular public health interest, the proportion of individuals in the NHANES sample is larger than the corresponding proportion in the U.S. population. This over-sampling increases the reliability and precision of estimates of health status indicators for these population subgroups. Weighting schemes allow estimates from these subgroups to be combined to obtain a national estimate that reflects the relative proportions of these groups in the population as a whole.
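A toy example shows why the weights matter when a subgroup is over-sampled. All numbers below are invented: group A makes up 900 of 1,000 people in a hypothetical population and group B makes up 100, but the sample contains four people from each group, so group B is over-sampled and its members carry smaller weights.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum of value * weight divided by the sum of weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# 1 = has the condition, 0 = does not.
# First four people are from group A (weight 225 each, 900 in total);
# last four are from over-sampled group B (weight 25 each, 100 in total).
has_condition = [1, 0, 0, 0, 1, 1, 0, 0]
weights = [225.0] * 4 + [25.0] * 4

unweighted = sum(has_condition) / len(has_condition)
weighted = weighted_mean(has_condition, weights)
```

Here the unweighted prevalence is 0.375, which overstates the influence of group B. The weighted prevalence is 0.275, which equals 0.9 × 0.25 + 0.1 × 0.5, the value implied by the groups' shares of the hypothetical population.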

Examples of over-sampled subgroups in surveys since 1999 include:

Different subgroups have been over-sampled in other survey years. For example, during the late 1960s and early 1970s, there was concern that people of very low income and women of childbearing age were at greater risk of malnutrition than the general population. Therefore, during the first National Health and Nutrition Examination Survey (NHANES I), conducted in 1971-74, these subgroups were over-sampled.



For your own analyses, it is critical to carefully review the General Data Release Documentation, found on the home page of each survey cycle, to determine which subgroups are over-sampled.


