Key Concepts about Outliers in NHANES Data

Outliers, or extreme values in the data, are common in surveys such as NHANES. They can occur as a result of errors in data collection or recording, or for other reasons. Because the data were reviewed carefully before release, data collection and recording errors should be minimal within the publicly available NHANES CVX data. Problematic outliers include estimated values that are far outside the range of other values in the data. Examples of these might include estimated VO2 values less than 20 or greater than 90 mL/kg/min.
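
As a rough illustration of screening for such values, the sketch below flags estimated VO2 values outside the 20 to 90 range mentioned above. The record layout, the function name flag_outliers, and the sample values are hypothetical and are not taken from the NHANES files; the cutoffs are the examples given in the text, not official limits.

```python
# Hypothetical records: (respondent id, estimated VO2 value).
# These are made-up numbers for illustration, not real NHANES data.
records = [(1, 38.2), (2, 45.1), (3, 17.5), (4, 52.9), (5, 95.3)]

# Example cutoffs taken from the text; they are not official limits.
LOW, HIGH = 20.0, 90.0


def flag_outliers(rows, low=LOW, high=HIGH):
    """Return the rows whose value is below `low` or above `high`."""
    return [(rid, value) for rid, value in rows if value < low or value > high]


flagged = flag_outliers(records)
print(flagged)
```

Flagging a value this way only marks it for review; whether to keep, transform, or drop it remains an analytic decision.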

Consider outliers carefully, as their presence may substantially affect your results, especially if the sample weight associated with the outlying value is large. In some types of analysis, outliers have the potential to distort statistical estimates, alter apparent relationships, and lead to faulty conclusions. In these cases, the outliers may be deleted or the data transformed to lessen their impact. On the other hand, if the data are assumed to be correct and the statistical methods are robust in dealing with outlying values, outliers may sometimes be accommodated.
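
To see why the sample weight matters, the following minimal sketch compares a weighted mean computed with a single extreme value under a typical weight, under a large weight, and with the extreme value excluded. The values, the weights, and the helper name weighted_mean are all hypothetical, and the sketch covers only a point estimate; it does not address variance estimation for a complex survey design.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum(w * x) / sum(w)."""
    total_weight = sum(weights)
    return sum(w * x for x, w in zip(values, weights)) / total_weight


# Hypothetical values; the last one (95.0) plays the role of the outlier.
values = [40.0, 42.0, 44.0, 95.0]

# Same values under two hypothetical weighting scenarios.
equal_weights = [1000.0, 1000.0, 1000.0, 1000.0]
large_outlier_weight = [1000.0, 1000.0, 1000.0, 9000.0]

mean_equal = weighted_mean(values, equal_weights)
mean_large = weighted_mean(values, large_outlier_weight)
mean_excluded = weighted_mean(values[:3], equal_weights[:3])

print(mean_equal, mean_large, mean_excluded)
```

With these made-up numbers the weighted mean is 42.0 without the extreme value, 55.25 when it carries the same weight as the others, and 81.75 when its weight is nine times larger, which shows how a heavily weighted outlier can dominate an estimate.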

Please consult the Analytical Guidelines for more information on this topic.