Confidence intervals reflect the range of variation in the estimated cancer rates. The width of a confidence interval depends on the amount of variability in the data. A narrow confidence interval tends to imply greater certainty in the estimate, while a wide confidence interval tends to imply more variability in the data and therefore less certainty. Sources of variability include the underlying occurrence of cancer as well as uncertainty about when the cancer is detected and diagnosed, when a death from cancer occurs, and when the data about the cancer are sent to the registry or the state health department.
In any given year, when large numbers of a particular cancer are diagnosed or when large numbers of cancer patients die, the effects of random variability are small relative to those numbers, and the confidence interval is likely to be narrow. In small populations, the same chance changes cause greater variation in the rates. Likewise, with rare cancers the rates are small, and the chance occurrence of a few more or fewer cases or deaths in a given year can markedly affect those rates. Under these circumstances, the confidence interval will be wide to indicate uncertainty or instability in the cancer rate.
The Poisson Process
To estimate the extent of this uncertainty, a statistical framework is applied [1]. The standard model for vital statistics rates is the Poisson process [2], which assigns more uncertainty to rare events, relative to the size of the rate, than it does to common events.
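The relationship between the count and the relative uncertainty can be illustrated with a minimal sketch. It assumes the number of cases is Poisson distributed, so that the variance of the count equals its mean, and it uses a simple normal approximation for the interval, which is not the method used in the Data Visualizations tool and is unreliable for very small counts. The function name, the case counts, and the population size are hypothetical.

```python
import math

def poisson_rate_summary(cases, population, per=100_000):
    """Crude rate with a normal-approximation 95% interval.

    Assumes cases ~ Poisson, so the standard error of the count is
    sqrt(cases). Illustration only; requires cases > 0.
    """
    rate = cases / population * per
    se = math.sqrt(cases) / population * per
    relative_se = 1.0 / math.sqrt(cases)  # equals se / rate
    half_width = 1.96 * se
    return {
        "rate": rate,
        "se": se,
        "relative_se": relative_se,
        "ci": (rate - half_width, rate + half_width),
    }

# Hypothetical example: a rare and a common cancer in the same population.
rare = poisson_rate_summary(cases=10, population=1_000_000)
common = poisson_rate_summary(cases=10_000, population=1_000_000)
for label, result in (("rare", rare), ("common", common)):
    low, high = result["ci"]
    print(f"{label}: rate={result['rate']:.1f}, "
          f"relative SE={result['relative_se']:.3f}, "
          f"95% CI=({low:.1f}, {high:.1f})")
```

Under this model the relative standard error is 1 divided by the square root of the number of cases, so the interval for the rare cancer is far wider in proportion to its rate than the interval for the common cancer.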
Parameters are estimated for the underlying disease process. For this report, we estimated a single parameter to represent the incidence rate and its variability. Of note, the Poisson model is capable of estimating separate parameters that represent contributions to the rate from various population risk factors, the effects of cancer control interventions, and other attributes of the population risk profile in any particular year.
Modified Gamma Intervals
The confidence intervals used in the Data Visualizations tool are expected to include the true underlying rate 95% of the time. They are modified gamma intervals [3] computed using SEER*Stat. The modified gamma intervals are more efficient than the gamma intervals of Fay and Feuer [4] in that they are less conservative while still retaining the nominal coverage level. Various factors such as population heterogeneity can sometimes lead to “extra-Poisson” variation in which the rates are more variable than would be predicted by a Poisson model. No attempt was made to correct for this. In addition, the confidence intervals do not account for systematic (in other words, nonrandom) biases in the incidence rates.
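The difference between the two interval types can be sketched for a directly standardized rate. This sketch is an illustration, not the SEER*Stat implementation. It assumes the commonly stated forms of the formulas: both methods share the same lower limit, the Fay and Feuer upper limit adds the largest stratum weight to the rate (and its square to the variance), and the modification replaces those terms with the average weight and the average squared weight. To stay within the standard library, the chi-square quantile uses the Wilson-Hilferty approximation rather than an exact routine. All function names and input data are hypothetical.

```python
from statistics import NormalDist

def chi2_quantile(p, df):
    """Approximate chi-square quantile (Wilson-Hilferty); not exact."""
    z = NormalDist().inv_cdf(p)
    c = 2.0 / (9.0 * df)
    return df * (1.0 - c + z * c ** 0.5) ** 3

def gamma_intervals(counts, person_years, std_weights, alpha=0.05):
    """Return (rate, lower, upper_fay_feuer, upper_modified).

    std_weights are standard-population proportions that sum to 1.
    Assumes at least one nonzero count so that the rate is positive.
    """
    # Stratum weights: standard proportion divided by person-years.
    w = [s / n for s, n in zip(std_weights, person_years)]
    rate = sum(wi * d for wi, d in zip(w, counts))
    var = sum(wi ** 2 * d for wi, d in zip(w, counts))

    # Lower limit, shared by both methods under the assumed formulas.
    lower = var / (2 * rate) * chi2_quantile(alpha / 2, 2 * rate ** 2 / var)

    def upper(add_rate, add_var):
        y, v = rate + add_rate, var + add_var
        return v / (2 * y) * chi2_quantile(1 - alpha / 2, 2 * y ** 2 / v)

    w_max = max(w)
    upper_ff = upper(w_max, w_max ** 2)          # largest weight
    upper_mod = upper(sum(w) / len(w),           # average weight
                      sum(wi ** 2 for wi in w) / len(w))
    return rate, lower, upper_ff, upper_mod

# Hypothetical data for three age groups.
rate, lower, upper_ff, upper_mod = gamma_intervals(
    counts=[5, 20, 60],
    person_years=[100_000, 80_000, 50_000],
    std_weights=[0.5, 0.3, 0.2],
)
scale = 100_000  # express results per 100,000
print(f"rate={rate * scale:.2f}")
print(f"Fay-Feuer: ({lower * scale:.2f}, {upper_ff * scale:.2f})")
print(f"modified:  ({lower * scale:.2f}, {upper_mod * scale:.2f})")
```

With these inputs the modified upper limit falls slightly below the Fay and Feuer upper limit, which is the sense in which the modified interval is less conservative. For real work, an exact chi-square or gamma quantile function from a statistics library would be preferable to the approximation used here.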
Considerations When Comparing Rates
The use of overlapping confidence intervals to determine significant differences between two rates presented in the Data Visualizations tool is discouraged because this practice fails to detect significant differences more often than standard hypothesis testing does [5].
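A small numerical sketch shows how the overlap check and a standard test can disagree. It assumes two independent rates with known standard errors and uses normal-approximation intervals and a two-sided z-test, which are simplifications chosen for illustration rather than the methods used in the tool. The function name and the rates are hypothetical.

```python
import math
from statistics import NormalDist

def compare_rates(rate1, se1, rate2, se2, alpha=0.05):
    """Return (intervals_overlap, z, p_value) for two independent rates."""
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)
    ci1 = (rate1 - z_crit * se1, rate1 + z_crit * se1)
    ci2 = (rate2 - z_crit * se2, rate2 + z_crit * se2)
    intervals_overlap = ci1[0] <= ci2[1] and ci2[0] <= ci1[1]

    # The test uses the standard error of the difference, which is
    # smaller than the sum of the two individual standard errors.
    z = (rate1 - rate2) / math.sqrt(se1 ** 2 + se2 ** 2)
    p_value = 2 * (1 - norm.cdf(abs(z)))
    return intervals_overlap, z, p_value

# Hypothetical rates per 100,000 with equal standard errors.
overlap, z, p = compare_rates(100.0, 5.0, 116.0, 5.0)
print(f"intervals overlap: {overlap}, z={z:.2f}, p={p:.4f}")
```

In this example the two 95% intervals overlap, yet the test of the difference is significant at the 0.05 level, so judging by overlap alone would miss the difference.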
Another consideration when comparing differences between rates is their public health importance. For some rates presented in the Data Visualizations tool, numerators and denominators are large and standard errors are therefore small, resulting in statistically significant differences that may be so small as to lack importance for decisions related to population-based public health programs.
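The following sketch illustrates this point with hypothetical counts. It assumes Poisson-distributed counts and a normal-approximation z-test for the difference between two crude rates; the function name, counts, and person-years are invented for illustration.

```python
import math
from statistics import NormalDist

def rate_difference_test(cases1, pop1, cases2, pop2, per=100_000):
    """Return (rate1, rate2, rate_ratio, p_value) for two crude rates."""
    rate1 = cases1 / pop1 * per
    rate2 = cases2 / pop2 * per
    # Poisson assumption: the variance of each count equals the count.
    se1 = math.sqrt(cases1) / pop1 * per
    se2 = math.sqrt(cases2) / pop2 * per
    z = (rate2 - rate1) / math.sqrt(se1 ** 2 + se2 ** 2)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return rate1, rate2, rate2 / rate1, p_value

# Hypothetical: very large numerators and denominators, 1% difference.
r1, r2, ratio, p = rate_difference_test(250_000, 50_000_000,
                                        252_500, 50_000_000)
print(f"rates: {r1:.1f} vs {r2:.1f}, ratio={ratio:.3f}, p={p:.5f}")
```

Here a 1% difference in the rates is statistically significant because the counts are so large, but whether a difference of that size matters for program decisions is a separate judgment.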
References
1. Särndal C-E, Swensson B, Wretman J. Model-Assisted Survey Sampling. New York (NY): Springer-Verlag; 1992.
2. Brillinger DR. The natural variability of vital rates and associated statistics. Biometrics 1986;42(4):693–734.
3. Tiwari RC, Clegg LX, Zou Z. Efficient interval estimation for age-adjusted cancer rates. Statistical Methods in Medical Research 2006;15(6):547–569.
4. Fay MP, Feuer EJ. Confidence intervals for directly standardized rates: a method based on the gamma distribution. Statistics in Medicine 1997;16(7):791–801.
5. Schenker N, Gentleman JF. On judging the significance of differences by examining the overlap between confidence intervals. The American Statistician 2001;55(3):182–186.