Task 1: Key Concepts about Measurement Error

The concept of “usual” or long-term average intake is important because dietary recommendations are intended to be met over time and diet-health hypotheses are based on dietary intakes over the long term. However, there is no perfect dietary assessment tool to measure usual intake; all self-report dietary assessment instruments are prone to error. 

In statistics, an “error” is the deviation of an observed value from the true mean.  Because the true mean is rarely known, each error is estimated by a residual (i.e., the difference between an observation and the sample mean).  The sample variance is the sum of the squared residuals divided by N-1, where N is the sample size; dividing by N-1 rather than N gives an unbiased estimate.  Lots of error leads to a large variance; a small amount of error leads to a small variance.
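As a small illustration of the residual and variance calculation described above, the following sketch computes the sample variance by hand and checks it against Python's standard library. The five intake values are hypothetical numbers chosen for the example, not data from any survey.

```python
import statistics

def sample_variance(values):
    """Sum of squared residuals about the sample mean, divided by N - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(values) / n
    residuals = [v - mean for v in values]   # estimated errors
    return sum(r * r for r in residuals) / (n - 1)

# Hypothetical reported energy intakes (kcal) on five days.
intakes = [1800, 2100, 1950, 2400, 1750]
variance = sample_variance(intakes)
print(variance)                        # 68750.0
print(statistics.variance(intakes))    # same value from the standard library
```

The sample mean here is 2000 kcal, so the residuals are -200, 100, -50, 400 and -250; their squares sum to 275000, and dividing by N-1 = 4 gives 68750.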

When considering variation in dietary intake data, it is important to distinguish variation between people from variation within people.  Between-person variability is a function of the difference between a person’s usual intake and the population’s usual intake.  However, within a person, we also expect variation around his/her usual intake.  This type of variation usually takes two forms:  day-to-day variability and measurement error.  These are depicted graphically in Figure 1.  However, we cannot usually distinguish between these two sources of error, so they are jointly referred to as “within-person variation”.
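The distinction between the two kinds of variation can be sketched with a small simulation. This is only an illustration under assumed conditions: normally distributed usual intakes and daily deviations, two recall days per person, and invented values for the mean and standard deviations. The function names are hypothetical and are not part of any NHANES software.

```python
import random
import statistics

def simulate_recalls(n_people, n_days, mu, sd_between, sd_within, seed=0):
    """Return one list of daily reported intakes per person."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_people):
        usual = rng.gauss(mu, sd_between)           # person's usual intake
        days = [usual + rng.gauss(0, sd_within)     # deviation around it
                for _ in range(n_days)]
        data.append(days)
    return data

def variance_components(data):
    """Estimate (between-person, within-person) variance from repeated days."""
    n_days = len(data[0])
    # Average of each person's variance around that person's own mean.
    within = statistics.fmean([statistics.variance(days) for days in data])
    # The variance of person means still contains within / n_days; remove it.
    person_means = [statistics.fmean(days) for days in data]
    between = statistics.variance(person_means) - within / n_days
    return between, within

data = simulate_recalls(n_people=5000, n_days=2, mu=2000,
                        sd_between=300, sd_within=500, seed=1)
between_var, within_var = variance_components(data)
print(f"between-person variance: {between_var:.0f} (simulated with 300**2 = 90000)")
print(f"within-person variance:  {within_var:.0f} (simulated with 500**2 = 250000)")
```

In this sketch the within-person term lumps together day-to-day variability and random measurement error, mirroring the point that the two usually cannot be separated.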

Figure 1.  Between-person and within-person variation.  Between-person variation is represented by the difference between Person A’s and Person B’s usual intake and the population’s usual intake.  The dark blue dots (and jagged line) represent day-to-day variation in intake, whereas the light blue dots represent the measurement of intake.  Taken together, these comprise within-person variation.

line chart showing difference between Person A's (top) and Person B's (bottom) usual intake and the population's usual intake (dashed line in the middle) over time.

Within-person variation may be random, resulting in an estimate of usual intake that is unbiased (Figure 2a), meaning that a person’s true usual intake is estimated accurately on average, although with some error.  However, measurement errors also may be systematic, which leads to bias (Figure 2b).  These are “mistakes” in the measurement.  For example, a person may not report intake of sugar in coffee, but drink many cups of coffee per day, resulting in a biased estimate of sugar intake.
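The sugar-in-coffee example can be sketched for a single person. The numbers below (a usual intake of 60 g/day, 15 g/day of unreported sugar in coffee, and the standard deviations) are hypothetical, and normally distributed errors are an assumption made only for this illustration.

```python
import random
import statistics

rng = random.Random(42)
USUAL_SUGAR_G = 60.0     # hypothetical true usual intake, grams per day
COFFEE_SUGAR_G = 15.0    # hypothetical sugar in coffee that is never reported
N_DAYS = 2000            # many days, to show the long-run average

# True intake varies from day to day around the usual intake.
true_daily = [USUAL_SUGAR_G + rng.gauss(0, 20) for _ in range(N_DAYS)]

# Random error only: reports scatter around the true daily intake.
random_reports = [x + rng.gauss(0, 10) for x in true_daily]

# Systematic error: the same reports, but sugar in coffee is always omitted.
systematic_reports = [r - COFFEE_SUGAR_G for r in random_reports]

mean_random = statistics.fmean(random_reports)
mean_systematic = statistics.fmean(systematic_reports)
print(f"random error only: {mean_random:.1f} g/day (usual intake is 60)")
print(f"with omission:     {mean_systematic:.1f} g/day (usual intake is 60)")
```

Averaging over more days brings the first estimate closer to the usual intake, but no number of days removes the shortfall in the second.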


Figure 2a. Random within-person errors

line graph showing random variation resulting in unbiased usual intake

Figure 2b. Systematic within-person errors


Systematic within-person errors may arise in different ways: 

Like within-person error, between-person error may be random or systematic.  When error is random between people, it results in an unbiased estimate of usual intake for the population.  Even with random measurement error within a person, it is possible to calculate an unbiased estimate for the population, by balancing out overestimation of some individuals with underestimation for others.  With random error, the mean is estimated without bias, but the variance is inflated.

Systematic between-person error may arise if systematic within-person error occurs non-randomly.  For example, if the database was in error for collard greens, but people reported consumption of collard greens to varying degrees, systematic between-person bias could occur.  Of course, with self-report tools like the 24-hour recall and food propensity questionnaire, systematic between-person error also can result from person-specific bias and intake-related bias. 

Various types of bias have different effects on the estimated mean and distribution of usual intakes for a population. When systematic errors are only additive, the mean of the distribution is shifted, but the distribution is otherwise unchanged.  Although person-specific bias results in a biased estimate of the individual’s mean intake, it does not lead to a biased estimate of the group mean.  At the group level, the person-specific bias cancels out, but results in a distribution with a larger variance and a decreased correlation with true intake.  Systematic intake-related bias, however, can shift the mean, and may also change the correlation with true intake.  Depending on the direction of the bias (whether it increases or decreases with intake), the correlation may be stronger or weaker.
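These effects can be explored with a rough simulation. The sketch below assumes normally distributed true usual intakes and invented bias sizes (a 200-unit additive shift, person-specific bias with a standard deviation of 200, and a reporting slope of 0.7); the helper names are hypothetical. Each type of bias is applied in isolation, which is a simplification.

```python
import math
import random
import statistics

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def summarize(true, reported):
    """Compare a reported distribution with the true one."""
    return {
        "mean_shift": statistics.fmean(reported) - statistics.fmean(true),
        "variance_ratio": statistics.variance(reported) / statistics.variance(true),
        "correlation": pearson(true, reported),
    }

rng = random.Random(7)
true = [rng.gauss(2000, 300) for _ in range(5000)]

reports = {
    # Everyone under-reports by the same amount.
    "additive": [t - 200 for t in true],
    # Each person has their own bias; the biases average to zero.
    "person_specific": [t + rng.gauss(0, 200) for t in true],
    # Reporting is proportional to intake with a slope below 1.
    "intake_related": [0.7 * t for t in true],
}
results = {name: summarize(true, rep) for name, rep in reports.items()}
for name, stats in results.items():
    print(name, {k: round(v, 3) for k, v in stats.items()})
```

In this simplified setup the additive case shifts the mean and leaves the spread alone, while the person-specific case leaves the mean roughly unchanged but inflates the variance and lowers the correlation with truth. The purely proportional intake-related case shifts the mean and rescales the spread; taken by itself it leaves the correlation at 1, so any change in correlation would come from its combination with other errors.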

Importantly, these types of systematic errors do not usually occur in isolation.  When the interest is in relating diet to a health parameter, what is often observed is a “flattened slope” effect (Figure 3).  Those with the lowest levels of intake tend to overreport, and those with the highest levels of intake tend to underreport; this results from a combination of intake-related bias and other systematic error.  These errors are often accompanied by person-specific bias, so the direction of the shift of the mean and the correlation between the assessment tool and truth are not always clear.

Figure 3.  The effects of random error on the relationship between usual intake and a health parameter.  The black dots and solid regression line represent the true relationship, and the blue triangles and dashed line represent the observed attenuated relationship.
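The attenuation shown in Figure 3 can be reproduced in a minimal simulation. The sketch assumes a linear relationship between usual intake and a health parameter, normally distributed random reporting error with the same variance as true intake, and an invented true slope of 0.01; none of these values come from the tutorial.

```python
import random
import statistics

def ols_slope(x, y):
    """Slope of the least-squares line of y on x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

rng = random.Random(3)
TRUE_SLOPE = 0.01   # hypothetical effect of intake on the health parameter
N = 5000

usual = [rng.gauss(2000, 300) for _ in range(N)]
health = [50 + TRUE_SLOPE * t + rng.gauss(0, 1) for t in usual]

# Reported intake = usual intake + random error (no systematic bias).
reported = [t + rng.gauss(0, 300) for t in usual]

slope_true = ols_slope(usual, health)         # regression on true intake
slope_observed = ols_slope(reported, health)  # regression on reported intake
print(f"slope using true intake:     {slope_true:.4f}")
print(f"slope using reported intake: {slope_observed:.4f}")
```

Because the error variance equals the variance of true intake in this setup, the observed slope comes out at roughly half the true slope.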



Statisticians have proposed measurement error models to separate the different sources of error. When an unbiased estimate of truth is available, the different types of errors may be estimated.

equation showing estimate of error equal to the sum of additive errors, intake-related errors, person-specific errors, and random errors.
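One way to write such a decomposition in symbols is sketched below. The notation is an assumption chosen for illustration, following a form common in the measurement error literature; it is not taken from the original equation image.

```latex
Q_{ij} - T_i = \beta_0 + (\beta_1 - 1)\,T_i + r_i + \epsilon_{ij}
```

Here $Q_{ij}$ is the intake reported by person $i$ on occasion $j$ and $T_i$ is that person's true usual intake. The term $\beta_0$ is the additive error, $(\beta_1 - 1)\,T_i$ is the intake-related error, $r_i$ is the person-specific error (with mean zero across people), and $\epsilon_{ij}$ is the random within-person error. An instrument with $\beta_0 = 0$ and $\beta_1 = 1$ has no additive or intake-related error.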


Among the most frequently used methods of assessing dietary intake are the 24-hour recall and the food frequency questionnaire (FFQ).  The FFQ administered in NHANES 2003-2006 does not have portion size assessment. The 24-hour recall and the FFQ have key differences. The FFQ is focused on intake over an extended period. It captures the majority of a person's diet, but is limited to foods on the instrument. Because of this limitation and the cognitive difficulty of recalling typical intake over a long period, FFQ reports fail to truly reflect a person's long-term average daily intake.

In contrast to the FFQ, during a recall, people are asked to report everything eaten and drunk during the previous 24 hours. Therefore, 24-hour recalls are generally preferred to the FFQ due to their ability to capture rich details about daily intake of every item consumed (when, how, how much, with what). Validation studies have shown that the 24-hour recall is less prone to measurement error than an FFQ.  However, the biggest strength of the 24-hour recall also may be considered its biggest limitation.  Because food intake is only captured for one day, and most individuals’ diets vary from day to day, one day of intake is not sufficient to capture usual intake for an individual.  That is, a single recall does not reflect a person's long-term average daily intake; it represents only a "snapshot in time." 

Validation studies have examined reported intakes on 24-hour recalls and FFQs and compared them to biomarkers for energy and protein to try to understand the structure of measurement error for these self-report instruments.  Both 24-hour recalls and food frequency questionnaires have been shown to be prone to all of the systematic and random sources of measurement error discussed above when measuring energy and protein (Kipnis et al., 2003; Neuhouser et al., 2008). Because total energy is prone to error, at least some foods are subject to being reported with error on 24-hour recalls.  However, it is not possible to know the impact of measurement error on other nutrients or individual foods because unbiased biomarkers are not available for other nutrients.  In spite of this, in all of the methods described in the Dietary Tutorial, we make the assumption that the 24-hour recall is an unbiased instrument, i.e., that it is subject only to random within-person and between-person error, but not additive and intake-related error.  It is important to acknowledge this limitation of the 24-hour recall data when reporting the results of NHANES dietary intake analyses.

Even random error, however, may affect the estimates of usual intake from one or two 24-hour recalls.  Figure 4 illustrates the distribution curves from one 24-hour recall, the average of two recalls, and true usual intake.  In surveillance, one may be interested in examining mean intakes or estimating the fraction of the population above or below a cutpoint.  If the interest is in estimating the mean intake, recall data for one day will be adequate because, with random error, the mean is unbiased.  However, random error results in inflated variance. Thus, if the interest is in estimating the percentage of the population whose intakes fall above or below a cutpoint, biased estimates of the prevalence of inadequate or excess intake will be obtained with only one day of data.  Even using the mean of two days will lead to biased estimates of the prevalence of inadequate or excess intake.  Therefore, statistical methods are needed to adjust for measurement error.

Figure 4.  Hypothetical distribution of usual intake of a nutrient (black solid line), contrasted with the estimated distribution from one 24-hour recall (gray dotted dashed line) or two day average of 24-hour recalls (blue dashed line).  The vertical dashed line represents a hypothetical cutpoint of interest.
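The pattern in Figure 4 can be imitated with a short simulation. The sketch assumes normally distributed usual intakes (mean 100, standard deviation 15), purely random within-person error (standard deviation 30), and a cutpoint of 80; all of these values are invented for illustration.

```python
import random
import statistics

def fraction_below(values, cutpoint):
    """Share of values that fall below the cutpoint."""
    return sum(v < cutpoint for v in values) / len(values)

rng = random.Random(11)
N = 20000
CUTPOINT = 80.0   # hypothetical requirement

usual = [rng.gauss(100, 15) for _ in range(N)]      # true usual intake
day1 = [t + rng.gauss(0, 30) for t in usual]        # first 24-hour recall
day2 = [t + rng.gauss(0, 30) for t in usual]        # second 24-hour recall
two_day_mean = [(a + b) / 2 for a, b in zip(day1, day2)]

prev_true = fraction_below(usual, CUTPOINT)
prev_one_day = fraction_below(day1, CUTPOINT)
prev_two_day = fraction_below(two_day_mean, CUTPOINT)

print(f"mean, usual intake: {statistics.fmean(usual):.1f}")
print(f"mean, one recall:   {statistics.fmean(day1):.1f}")
print(f"below cutpoint, usual intake: {prev_true:.3f}")
print(f"below cutpoint, one recall:   {prev_one_day:.3f}")
print(f"below cutpoint, two-day mean: {prev_two_day:.3f}")
```

The means agree closely, but the fraction below the cutpoint is overstated with one recall and still overstated, though less so, with the two-day mean, because averaging two days removes only part of the extra within-person variance.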



With only 2 days of recall data, statistical modeling is needed to account for random measurement error.  This course describes statistical methods for estimating the effects of individual variables on usual intake, estimating the distribution of usual intake, and estimating usual intake for use in relating it to health parameters.



Kipnis V, Subar AF, Midthune D, Freedman LS, Ballard-Barbash R, Troiano RP, Bingham S, Schoeller DA, Schatzkin A, Carroll RJ. Structure of dietary measurement error: results of the OPEN biomarker study. American Journal of Epidemiology 2003 Jul 1;158(1):14-21; discussion 22-6.

Neuhouser ML, Tinker L, Shaw PA, Schoeller D, Bingham SA, Van Horn L, Beresford SA, Caan B, Thomson C, Satterfield S, Kuller L, Heiss G, Smit E, Sarto G, Ockene J, Stefanick ML, Assaf A, Runswick S, Prentice RL. Use of recovery biomarkers to calibrate nutrient consumption self-reports in the Women's Health Initiative. American Journal of Epidemiology 2008 May 15;167(10):1247-1259.



