About the Databases

U.S. Cancer Statistics public use databases include cancer incidence and population data for all 50 states, the District of Columbia, and Puerto Rico, providing information on more than 35 million cancer cases.

The databases include data by demographic characteristics (for example, age, sex, and race) and tumor characteristics (for example, year of diagnosis, primary tumor site, histology, behavior, and stage at diagnosis).

Hospitals, physicians, and laboratories across the nation report these data to central cancer registries supported by CDC and the National Cancer Institute (NCI). The databases are intended for researchers to conduct focused analyses beyond what is available through the U.S. Cancer Statistics Data Visualizations tool.

Researchers, public health professionals, clinicians, decision makers, and others can use these data to inform scientific inquiries, programs, and policies by identifying disparities in cancer burden, investigating trends and geographic distributions in cancer incidence, and evaluating and monitoring cancer prevention activities.

The current data come from the 2022 National Program of Cancer Registries (NPCR) and Surveillance, Epidemiology, and End Results (SEER) program submissions, which include cancer cases diagnosed from January 1, 2001, through December 31, 2020. Each year, NPCR- and SEER-supported central cancer registries submit data from a referent year to the close of the most current diagnosis year. The submitted data include information from previous years and are updated with information from the newly submitted records to ensure case completeness and high quality.

CDC and NCI support the data collection and quality standards in the North American Association of Central Cancer Registries (NAACCR) consensus documents. During data collection, CDC and NCI also apply additional rigorous quality control edits, data completeness evaluations, and data quality assessments. For a registry’s data to be included in the U.S. Cancer Statistics public research data file, they must meet the U.S. Cancer Statistics publication standard.

Two public use databases are available, as listed in the table below: U.S. (2001–2020) and U.S. and Puerto Rico (2005–2020).

Number of Records in the Databases

The table below shows the number of cases available for the most recent U.S. Cancer Statistics data release.*
Database                            All Cases**    Malignant Cases†    Malignant and In Situ Cases†
U.S. (2001–2020)                    35,152,269     31,746,107          34,193,813
U.S. and Puerto Rico (2005–2020)    29,385,497     26,373,609          28,460,152
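
The three columns appear to nest: malignant cases fall within malignant and in situ cases, which in turn fall within all cases. Under that reading, subtracting adjacent columns gives a rough breakdown of each database. The Python sketch below uses the counts from the table; the dictionary keys and the function name are illustrative, and treating the remainder as the benign and borderline brain and other nervous system tumors is an assumption drawn from the table notes, not a published figure.

```python
# Counts copied from the table above. The keys ("us", "us_pr", "all", ...)
# are illustrative names, not identifiers used in the databases themselves.
COUNTS = {
    "us": {  # U.S. (2001-2020)
        "all": 35_152_269,
        "malignant": 31_746_107,
        "malignant_in_situ": 34_193_813,
    },
    "us_pr": {  # U.S. and Puerto Rico (2005-2020)
        "all": 29_385_497,
        "malignant": 26_373_609,
        "malignant_in_situ": 28_460_152,
    },
}


def derived_counts(row):
    """Difference the nested columns of one table row.

    Assumes malignant <= malignant_in_situ <= all, as the column
    names suggest.
    """
    return {
        # Cases counted as in situ only.
        "in_situ": row["malignant_in_situ"] - row["malignant"],
        # Remainder of "All Cases"; assumed here to be the benign and
        # borderline brain and other nervous system tumors.
        "benign_borderline": row["all"] - row["malignant_in_situ"],
    }


DERIVED = {name: derived_counts(row) for name, row in COUNTS.items()}

for name, values in DERIVED.items():
    print(name, values)
```

These differences are derived arithmetic only. Because in situ urinary bladder cancers are re-coded as invasive, they count toward the malignant column rather than the in situ difference.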

*The following criteria apply to the U.S. Cancer Statistics public use databases—

NPCR- and SEER-supported cancer registries report all incident cases coded as in situ (non-malignant) and invasive (malignant; primary site only), and non-malignant (including borderline and benign) central nervous system tumors according to the International Classification of Diseases for Oncology, Third Edition (ICD-O-3), with the following exceptions—

  • In situ cancers of the cervix are not reported.
  • Basal and squamous cell carcinomas of the skin are not reported, except when these occur on the skin of the genital organs.
  • Additionally, in situ urinary bladder cancers were re-coded as invasive behavior.

**The “All Cases” column includes benign and borderline brain and other nervous system tumors.

†Malignant and in situ cases are defined using the ICD-O-3 behavior code.
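
As an illustration of how the behavior code could drive the column definitions, the Python sketch below maps the standard ICD-O-3 behavior codes (0 benign, 1 borderline, 2 in situ, 3 malignant primary site) to the table columns a case would count toward. The function name and the column labels are hypothetical, and the mapping is inferred from the notes above; it does not apply the reporting exceptions or the bladder re-coding, which are assumed to be already reflected in the coded data.

```python
# Standard ICD-O-3 behavior codes reported by central cancer registries.
BEHAVIOR_LABELS = {
    0: "benign",
    1: "borderline",
    2: "in situ",
    3: "malignant",
}


def table_columns(behavior_code):
    """Return the table columns a case with this behavior code counts toward.

    Hypothetical helper: the column names mirror the table headers, and
    the membership rules are an assumption based on the table notes.
    """
    if behavior_code not in BEHAVIOR_LABELS:
        raise ValueError(f"unexpected behavior code: {behavior_code!r}")

    columns = ["All Cases"]  # every reported case is in this column
    if behavior_code == 3:
        columns.append("Malignant Cases")
    if behavior_code in (2, 3):
        columns.append("Malignant and In Situ Cases")
    return columns


for code, label in BEHAVIOR_LABELS.items():
    print(code, label, table_columns(code))
```

Under this sketch, a benign or borderline case (code 0 or 1) contributes only to the "All Cases" column, which matches the note that this column includes benign and borderline brain and other nervous system tumors.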