Overview of Analysis of Pooled Serum Samples for Select Chemicals, NHANES 2005-2016
In the National Biomonitoring Program, samples are pooled for chemical measurement to address two needs:
- The need to improve the sensitivity of the measurement; that is, the chemical concentrations are so low that a larger sample volume is necessary to achieve lower limits of detection and greater likelihood of detectable results; and
- The need to reduce the number of samples analyzed, based on weighing the costs of the analysis against a low frequency of detectable results.
Pooled Samples in the National Report on Human Exposure to Environmental Chemicals
Analysis of Pooled Serum Samples for Select Chemicals, NHANES 2005-2016 includes results for chemicals that were measured in pools of serum samples from NHANES 2005-2006, 2007-2008, 2009-2010, 2011-2012, 2013-2014, and 2015-2016. Some of these chemicals were previously measured in individual samples from the general U.S. population; these data can be found in Analysis of Whole Blood, Serum, and Urine Samples, NHANES 1999-2018.
Overview of Sample Pooling
Beginning with NHANES 2005-2006, the CDC used a weighted pooled-sample design to measure serum concentrations of dioxins, furans, polychlorinated biphenyls (PCBs), organochlorine pesticides and metabolites, and the brominated flame retardants (polybrominated diphenyl ethers and the polybrominated biphenyl PBB 153). In previous NHANES survey periods, measurements were made on individual serum samples from participants selected from the U.S. population by stratified multistage sampling.
Although NHANES sampling weights can be incorporated to make pooled sample estimates representative of the non-institutionalized U.S. population, there are statistical estimation challenges. Measurements of these chemicals in individuals tend to have a log-normal distribution, with central tendency best estimated using a geometric mean. However, the measured value for a pooled sample is comparable to an arithmetic average of measurements in individuals. Because the arithmetic mean of a set of positive values is never smaller than their geometric mean, and is larger whenever the values differ, the pooled sample result is expected to be higher than the geometric mean of multiple individual results. Another challenge is that the design effects required for accurate standard error and confidence interval estimation cannot be calculated directly, because samples are pooled across the design cells of the original survey. For these reasons, data tables showing the pooled sample results present only weighted arithmetic means and unadjusted standard errors for each category.
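The gap between a pooled result and a geometric mean can be illustrated with a small simulation. The sketch below is not the CDC's estimation method. It assumes equal-volume aliquots, so that the pooled measurement behaves like a simple arithmetic mean, and it draws eight hypothetical individual concentrations from a log-normal distribution. The names and the parameters MU and SIGMA are illustrative assumptions, not values from NHANES.

```python
import math
import random


def arithmetic_mean(values):
    """Simple average; stands in for the measured value of an equal-volume pool."""
    return sum(values) / len(values)


def geometric_mean(values):
    """Exponential of the mean log value; the usual summary for log-normal data."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical log-scale mean and standard deviation (assumed, not from NHANES).
MU, SIGMA = 0.0, 1.0

# Eight simulated individual serum concentrations (arbitrary units).
individuals = [random.lognormvariate(MU, SIGMA) for _ in range(8)]

pooled_like = arithmetic_mean(individuals)  # what a pooled measurement approximates
gm = geometric_mean(individuals)            # what individual-level tables report

# For a log-normal population the arithmetic mean exceeds the geometric mean
# by the factor exp(SIGMA**2 / 2).
theoretical_ratio = math.exp(SIGMA ** 2 / 2)

print(f"pooled-like arithmetic mean: {pooled_like:.3f}")
print(f"geometric mean:              {gm:.3f}")
print(f"theoretical AM/GM ratio:     {theoretical_ratio:.3f}")
```

In this setting the gap widens as SIGMA grows, which is why pooled results for highly variable chemicals can sit well above a geometric mean computed from individual results.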
Interpreting the Data
The following cautions and suggestions apply to the pooled sample results and data tables:
- As explained above, individual sample data tend to be log-normally distributed, with central tendency best estimated using a geometric mean, whereas a pooled result is comparable to an arithmetic mean of individual sample results. The weighted arithmetic means from pooled samples are therefore expected to be higher than the geometric means provided in the historical (companion) data tables or in other publications based on individual results.
- The standard errors are unadjusted and therefore do not reflect the design effects of the survey. In many cases, the standard errors are based on very few pooled sample measurements and cannot be expected to accurately reflect the true imprecision of the weighted arithmetic mean estimates. When the unadjusted standard error is more than 30% of the weighted arithmetic mean, the estimate is marked with a double asterisk (**) and footnoted.
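The flagging rule in the last bullet above amounts to comparing the ratio of the standard error to the mean against 0.30. The following is a minimal sketch of that check, using hypothetical function names and made-up values rather than data from the tables; the output layout is also an assumption and does not reproduce the report's table format.

```python
def is_imprecise(mean, standard_error, threshold=0.30):
    """Return True when the standard error is more than `threshold` times the mean."""
    if mean <= 0:
        raise ValueError("mean must be positive")
    return standard_error / mean > threshold


def format_estimate(mean, standard_error):
    """Format 'mean (SE)', appending '**' when the estimate meets the flagging rule."""
    flag = "**" if is_imprecise(mean, standard_error) else ""
    return f"{mean:.1f} ({standard_error:.1f}){flag}"


# Made-up values for illustration only.
print(format_estimate(12.0, 2.0))  # ratio is about 0.17, so no flag
print(format_estimate(12.0, 5.0))  # ratio is about 0.42, so flagged
```

A flagged estimate is still reported; the marker only signals that the unadjusted standard error is large relative to the mean.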
Additional information about sample pooling is available from the following sources:
Geometric mean estimation from pooled samples (2007)
Characterizing populations of individuals using pooled samples (2010)
Use of pooled samples from the National Health and Nutrition Examination Survey (2012)
Estimation of exposure distributions from pooled samples (2014)