Sample Text for Manuscripts

U.S. Cancer Statistics Public Use Database

You can use the following sample text in manuscripts to describe the methods for U.S. Cancer Statistics incidence data from CDC’s National Program of Cancer Registries (NPCR) and NCI’s Surveillance, Epidemiology, and End Results (SEER) Program.

Methods

Cancer Incidence

U.S. Cancer Statistics data, which combine cancer registry data from the Centers for Disease Control and Prevention’s (CDC’s) National Program of Cancer Registries (NPCR) and the National Cancer Institute’s (NCI’s) Surveillance, Epidemiology, and End Results (SEER) Program, were analyzed (1). This dataset includes cancer incidence data from central cancer registries reported to NPCR in 46 states, the District of Columbia, [IF APPLICABLE] and Puerto Rico (2) and to SEER in 4 states. Data about all new diagnoses of cancer from patient records at medical facilities such as hospitals, physicians’ offices, therapeutic radiation facilities, freestanding surgical centers, and pathology laboratories are reported to central cancer registries, which collate these data and use state vital records to collect information about any cancer deaths that were not reported as cases. The central cancer registries use uniform data items and codes as documented by the North American Association of Central Cancer Registries. These data are submitted annually to CDC and NCI and combined into one dataset (3). Cancer registries demonstrated that their data were of high quality by meeting U.S. Cancer Statistics publication criteria (1); during [YEARX–YEARY], data from [X] cancer registries met these criteria, covering [X%] of the United States population. This report includes new cases of primary invasive [CANCER TYPE] cancer (International Classification of Diseases for Oncology, Third Edition code [CXX.X–CXX.X]) (4) diagnosed during [YEARX–YEARY]; [IF APPLICABLE] excluding histology codes 9050–9055, 9140, and 9590–9992 [OR] restricted to histology codes [XXXX–XXXX].

[IF APPLICABLE] Race and Ethnicity (U.S. data only)

Data were analyzed by five major racial/ethnic groups: White, Black, American Indian and Alaska Native (AI/AN), Asian/Pacific Islander (API), and Hispanic. Information about race and Hispanic ethnicity was collected separately. An algorithm was applied to Hispanic ethnicity data to reduce misclassification of Hispanic persons as being of unknown ethnicity (5). To reduce misclassification of AI/AN race, some central cancer registries link case data with the Indian Health Service (IHS) patient registration database, which contains records of individuals who are members of federally recognized tribes; cases linked with the IHS database were coded as AI/AN (6).

Because states can opt not to present state-specific counts and rates for [AS APPLICABLE: API, Hispanic, and AI/AN populations], these data are not shown for the following states: [CHECK STATE LIST AT https://www.cdc.gov/cancer/uscs/public-use/cautionary-notes.htm]. [FOR EXAMPLE: Because states can opt not to present state-specific counts and rates for AI/AN populations, these data are not shown for Illinois, Kansas, New Jersey, and New York.]

[IF APPLICABLE] Histology

Analyses by histology included only cases that were microscopically confirmed ([X%] of cases).

[IF APPLICABLE] Stage

Stage is classified using a merged variable that spans the time periods when three different staging schemes were used: SEER Summary Stage 2000, Derived Summary Stage, and Summary Stage 2018. The staging criteria characterize cancers as localized, regional, distant, or unknown stage. Localized cancer is confined to the primary site; regional cancer has spread directly beyond the primary site (regional extension) or to regional lymph nodes; and distant cancer has spread to other organs (distant extension) or remote lymph nodes (7).

Population Estimates

Population estimates for rate denominators were a modification of annual county population estimates by age, sex, bridged race, and ethnicity produced by the U.S. Census Bureau in collaboration with CDC and with support from NCI (8). Modifications incorporated bridged, single-race estimates that were derived from multiple-race categories in the Census and accounted for known issues in certain counties (8). The modified county-level population estimates, summed to the state and national levels, were used as denominators in rate calculations (8).
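The aggregation step described above (county estimates summed to state and national levels) can be sketched as follows. This is only an illustration: the record layout, state and county codes, age groups, and counts are hypothetical and are not taken from the Census Bureau or NCI files.

```python
from collections import defaultdict

# Hypothetical county records: (state, county, age_group, population).
county_estimates = [
    ("AA", "001", "40-44", 12_000),
    ("AA", "003", "40-44", 8_000),
    ("BB", "001", "40-44", 20_000),
    ("AA", "001", "45-49", 10_000),
]


def sum_to_state(records):
    """Sum county populations within each (state, age_group) cell."""
    totals = defaultdict(int)
    for state, _county, age_group, population in records:
        totals[(state, age_group)] += population
    return dict(totals)


def sum_to_nation(state_totals):
    """Sum state-level cells to one national total per age group."""
    totals = defaultdict(int)
    for (_state, age_group), population in state_totals.items():
        totals[age_group] += population
    return dict(totals)


state_totals = sum_to_state(county_estimates)
national_totals = sum_to_nation(state_totals)
```

Keeping the age group in the key at every level matters because age-adjusted rates need age-specific denominators, not a single total per state.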

Statistical Analysis

Incidence and Death Rates

Average annual rates for [YEARX–YEARY] per 100,000 population were age-adjusted (using 19 age groups) by the direct method to the 2000 U.S. standard population (9). Corresponding 95% confidence intervals (CIs) were calculated as modified gamma intervals (10). Rates based on fewer than 16 cases tend to have poor reliability and were not presented. To determine differences between subgroups, rate ratios were calculated; rates were considered statistically different if the 95% CIs of the rate ratios excluded 1 (11). Rates were calculated using SEER*Stat software version [X.X.X] (12).
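For readers who want to see the arithmetic behind direct age adjustment and the fewer-than-16-cases rule, here is a minimal sketch. It is not SEER*Stat's implementation. The three age groups, counts, and standard weights are hypothetical (the text above uses 19 age groups and the 2000 U.S. standard population), and the modified gamma intervals and rate-ratio intervals are not computed here.

```python
def age_adjusted_rate(cases, populations, std_weights, per=100_000):
    """Directly standardized rate: weighted sum of age-specific rates.

    cases, populations, and std_weights are parallel lists, one entry
    per age group. Weights are normalized so they sum to 1.
    """
    if not (len(cases) == len(populations) == len(std_weights)):
        raise ValueError("inputs must have one entry per age group")
    total_weight = sum(std_weights)
    rate = sum(
        (c / p) * (w / total_weight)
        for c, p, w in zip(cases, populations, std_weights)
    )
    return rate * per


def reportable_rate(cases, populations, std_weights, min_cases=16):
    """Return the age-adjusted rate, or None when it rests on too few cases."""
    if sum(cases) < min_cases:
        return None  # suppressed: fewer than min_cases cases
    return age_adjusted_rate(cases, populations, std_weights)


# Hypothetical inputs: three age groups instead of the 19 used in practice.
cases = [10, 40, 150]
populations = [100_000, 100_000, 100_000]
std_weights = [0.5, 0.3, 0.2]  # illustrative, not the 2000 U.S. standard

print(f"{reportable_rate(cases, populations, std_weights):.1f} per 100,000")
```

With these inputs the age-specific rates are 10, 40, and 150 per 100,000, so the weighted sum is 5 + 12 + 30 = 47 per 100,000.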

[IF APPLICABLE] Trends in Rates

Annual percentage change (APC) was used to quantify the change in rates during [YEARX–YEARY] and was calculated using weighted least squares regression (13). A two-sided t-test was used to test whether the APC was statistically different from zero (P <.05). Rates were considered to increase or decrease if P <.05; otherwise rates were considered stable. APCs were calculated using SEER*Stat software version [X.X.X] (12).
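As a rough illustration of how an APC can be derived, the sketch below fits a weighted least squares line to the natural log of the annual rates and converts the slope b to a percentage, APC = (e^b − 1) × 100. The weighting scheme is an assumption left to the caller (equal weights by default), since the text does not state the weights SEER*Stat uses; the rates are hypothetical, and the t-test is not implemented.

```python
import math


def weighted_slope(x, y, w):
    """Slope of the weighted least squares line of y on x."""
    total = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total
    s_xy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    s_xx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    return s_xy / s_xx


def annual_percent_change(years, rates, weights=None):
    """APC from a log-linear fit: ln(rate) = a + b * year."""
    if weights is None:
        weights = [1.0] * len(years)  # assumption: equal weights
    slope = weighted_slope(years, [math.log(r) for r in rates], weights)
    return (math.exp(slope) - 1.0) * 100.0


# Hypothetical rates that grow by exactly 2% per year.
years = [2015, 2016, 2017, 2018, 2019]
rates = [100.0 * 1.02 ** i for i in range(len(years))]
apc = annual_percent_change(years, rates)
```

Because the fit is on the log scale, a constant percentage change per year appears as a straight line, which is why the slope converts directly to an APC.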
[OR]
Change in rates during [YEARX–YEARY] was calculated using joinpoint regression, which involves fitting a series of joined straight lines on a logarithmic scale to the trends in the annual age-standardized rates (14); up to [X] joinpoints ([X] line segments) were allowed. The trend of the line segment was used to quantify the annual percentage change (APC). A two-sided t-test was used to test whether the APC was statistically different from zero (P <.05). The average annual percentage change (AAPC) for [YEARX–YEARY] was calculated using a weighted average of the slope coefficients of the underlying joinpoint regression line with the weights equal to the length of each segment over the interval. To determine whether the AAPC was statistically different from zero (P <.05), a two-sided t-test was used for 0 joinpoints, and a two-sided z-test was used for 1 or more joinpoints. Rates were considered to increase or decrease if P <.05; otherwise rates were considered stable. Trends were calculated using Joinpoint regression program version [X.X.X] (15).
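The AAPC calculation described above can be sketched in a few lines, assuming the segment slopes have already been estimated on the log scale. The joinpoint model fit itself and the t- and z-tests are not shown; the segment years and APCs in the example are hypothetical.

```python
import math


def slope_from_apc(apc):
    """Log-scale slope that corresponds to a segment's APC (in percent)."""
    return math.log(1.0 + apc / 100.0)


def aapc(slopes, segment_lengths):
    """Average annual percent change over an interval.

    slopes: log-scale slope of each joinpoint segment.
    segment_lengths: years each segment spans within the interval (weights).
    """
    if not slopes or len(slopes) != len(segment_lengths):
        raise ValueError("need one length per segment slope")
    total_length = sum(segment_lengths)
    mean_slope = sum(b * w for b, w in zip(slopes, segment_lengths)) / total_length
    return (math.exp(mean_slope) - 1.0) * 100.0


# Hypothetical fit: 2001-2005 rising 1% per year, 2005-2011 falling 2% per year.
slopes = [slope_from_apc(1.0), slope_from_apc(-2.0)]
lengths = [4, 6]
example_aapc = aapc(slopes, lengths)
```

Averaging the slopes rather than the APCs keeps the summary consistent with the log-linear model; with a single segment the AAPC equals that segment's APC.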

Footnotes for Tables

We recommend using standard footnotes from U.S. Cancer Statistics, or slight variations of them, for tables and figures.

For Population Coverage

Data are from population-based registries that participate in CDC’s National Program of Cancer Registries and/or NCI’s Surveillance, Epidemiology, and End Results Program and meet high-quality data criteria. These registries cover approximately [XX]% of the United States population.

For Age-Adjusted Rates

Rates are per 100,000 persons and are age-adjusted to the 2000 U.S. standard population (19 age groups – Census P25–1130).

References

1. Centers for Disease Control and Prevention. U.S. Cancer Statistics. https://www.cdc.gov/uscs/ Accessed [ENTER DATE].

2. Singh SD, Henley SJ, Ryerson AB. Surveillance for cancer incidence and mortality—United States, 2012. MMWR 2016;63(55):17–58.

3. National Program of Cancer Registries and Surveillance, Epidemiology, and End Results SEER*Stat Database: [ENTER DATABASE TITLE], United States Department of Health and Human Services, Centers for Disease Control and Prevention. Released [DATE], based on the November [YEAR] submissions. Available at www.cdc.gov/cancer/uscs/public-use/.

4. Fritz A, Percy C, Jack A, Shanmugarathnam K, Sobin L, Parkin D, et al., editors. International Classification of Diseases for Oncology. 3rd edition. Geneva, Switzerland: World Health Organization; 2000.

5. NAACCR Race and Ethnicity Work Group. NAACCR Guideline for Enhancing Hispanic/Latino Identification: Revised NAACCR Hispanic/Latino Identification Algorithm [NHIA v2.2.1]. Springfield (IL): North American Association of Central Cancer Registries. September 2011.

6. Jim MA, Arias E, Seneca DS, Hoopes MJ, Jim CC, Johnson NJ, Wiggins CL. Racial misclassification of American Indians and Alaska Natives by Indian Health Service Contract Health Service Delivery Area. American Journal of Public Health 2014;104:S295–S302.

7. Young JL Jr, Roffers SD, Ries LAG, Fritz AG, Hurlbut AA, editors. SEER Summary Staging Manual – 2000: Codes and Coding Instructions. Bethesda, MD: National Cancer Institute; 2001.

8. National Cancer Institute. Surveillance, Epidemiology, and End Results (SEER) Program. Modifications to Census Bureau’s County Population Data.

9. Anderson R, Rosenberg H. Age standardization of death rates: implementation of the year 2000 standard. National Vital Statistics Report 1998;47:1–16.

10. Tiwari RC, Clegg LX, Zou Z. Efficient interval estimation for age-adjusted cancer rates. Statistical Methods in Medical Research 2006;15:547–569.

11. Fay MP. Approximate confidence intervals for rate ratios from directly standardized rates with sparse data. Communications in Statistics: Theory and Methods 1999;28(9):2141–2160.

12. National Cancer Institute. SEER*Stat software. Bethesda, MD: National Cancer Institute, Surveillance Research Program; [YEAR]. Available at https://seer.cancer.gov/seerstat/.

13. Kleinbaum DG, Kupper LL, Muller KE. Applied Regression Analysis and Other Multivariable Methods. 2nd ed. Boston, Mass: PWS-Kent; 1988.

14. Kim H-J, Fay MP, Feuer EJ, Midthune DN. Permutation tests for joinpoint regression with applications to cancer rates. Statistics in Medicine 2000;19:335–351.

15. National Cancer Institute. Joinpoint Trend Analysis Software. Bethesda, MD: National Cancer Institute, Surveillance Research Program, Statistical Methodology and Applications Branch; [YEAR]. Available at https://surveillance.cancer.gov/joinpoint/.