Interpreting Race and Ethnicity in Cancer Data
The North American Association of Central Cancer Registries (NAACCR) Race and Ethnicity Identifier Assessment Project confirmed the importance of publishing cancer rates by race and ethnicity (specifically, Hispanic origin).1 When reporting cancer incidence, race and ethnicity information is abstracted from medical records and grouped into race and ethnicity categories.2 Although registries use standardized data items and codes for both race and ethnicity, the initial collection of this information by health care facilities and practitioners and the procedures for assigning and verifying codes for race and ethnicity are not well standardized.1 Thus, some inconsistency is expected in this information.
When reporting cancer mortality, race and Hispanic origin are recorded separately on the death certificate by the funeral director as provided by an informant or, in the absence of an informant, on the basis of observation.3 Inconsistencies in the collection and coding of data on race and Hispanic origin and their effect on mortality statistics have been described.4 The net effect of misclassification is greatest for American Indian and Alaska Native people; misclassification is smaller for Asian and Pacific Islander people and Hispanic people, and minimal for Black people and White people. Therefore, incidence and/or mortality data published in this report may be underestimated for Asian and Pacific Islander, American Indian and Alaska Native, and Hispanic people, possibly due to racial and Hispanic origin misclassification. CDC’s National Center for Health Statistics is working with states to improve the reporting of race and ethnicity on death certificates.
The Data Visualizations tool presents cancer incidence and mortality data for all races combined and by race and ethnicity (Hispanic).
Although central cancer registries have designated codes for race that allow them to document the occurrence of cancer in 23 Asian and Pacific Islander subpopulations,2 the subpopulations are grouped into a single Asian and Pacific Islander category because of small numbers and concerns regarding possible misclassification of race data.
Studies show excellent agreement (κ=0.90) between Asian and Pacific Islander race in Surveillance, Epidemiology, and End Results (SEER) registry data and self-reported data from the U.S. Census.5 Other studies have examined the misclassification of race for Asian and Pacific Islander subpopulations in cancer registries.6 7 8 Nearly all National Program of Cancer Registries (NPCR) and SEER registries reassigned cases coded as "Asian, not otherwise specified" to a more specific Asian race through the standardized use of the NAACCR Asian and Pacific Islander Identification Algorithm (NAPIIA) version 1.2.
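The agreement statistic quoted above is a kappa coefficient, which discounts the agreement two sources would reach by chance. The sketch below shows how Cohen's kappa is computed from a cross-tabulation of two sources. The function name and the counts are hypothetical and are not taken from the cited study.

```python
def cohens_kappa(table):
    """Cohen's kappa for a square agreement table.

    table[i][j] is the number of records placed in category i by
    source A (for example, a registry) and in category j by source B
    (for example, self-report).
    """
    n = sum(sum(row) for row in table)
    if n == 0:
        raise ValueError("table has no observations")
    k = len(table)
    # Observed agreement: share of records on the diagonal.
    p_observed = sum(table[i][i] for i in range(k)) / n
    # Chance agreement: products of the marginal totals, summed over categories.
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    p_expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / (n * n)
    if p_expected == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical counts for illustration only.
example = [
    [45, 5],    # source A: in group     -> source B: in group, not in group
    [5, 945],   # source A: not in group -> source B: in group, not in group
]
kappa = cohens_kappa(example)
print(round(kappa, 3))
```

In this made-up table the two sources agree on 99% of records, yet kappa is lower than 0.99 because most of that agreement comes from the large "not in group" cell. This is why a kappa value and a percent-agreement figure are not directly comparable.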
Kansas opted not to present state- and county-specific Asian and Pacific Islander counts and rates. The national rates presented include data for Kansas.
A study reported 90% agreement between Asian and Pacific Islander race reported on death certificates and self-reported data from the U.S. Census.4
The overall agreement between Hispanic ethnicity collected by SEER registries and self-reported ethnicity from the U.S. Census was substantial (κ=0.61). Hispanic people were underclassified in the SEER data compared with self-reports.5 Nearly all NPCR and SEER registries assigned Hispanic ethnicity through the standardized use of the NAACCR Hispanic Identification Algorithm version 2 (NHIA v2). After NHIA v2 is applied, cases not classified as Hispanic are classified as non-Hispanic, leaving no cases with unknown Hispanic status.
Massachusetts opted not to present state- and county-specific, NHIA-classified Hispanic counts and rates for all years. The national rates presented include data for Massachusetts.
A study reported an 88% record-by-record agreement between Hispanic origin on death certificates and self-reported data.4
Death counts and rates for Hispanic people are presented at the national and state levels for all 50 states and for the District of Columbia. Hispanic origin is assigned to cancer mortality data on the basis of information collected from death certificates.
More American Indian and Alaska Native patients are misclassified as another race in cancer registry records than patients in other racial groups. Studies have found that this racial misclassification contributes to underestimates of cancer incidence and death rates among the American Indian and Alaska Native population.4 9 Accurate determination of disease burden is a critical first step toward identifying health disparities. Methods that can improve the accuracy of cancer burden estimates among the American Indian and Alaska Native population are described below.
Method 1: Linkage with Indian Health Service administrative records
The Indian Health Service (IHS) provides medical services to American Indian and Alaska Native people who are enrolled members of federally recognized tribes. The IHS provides health care to about 2.2 million people, a number equivalent to about 64% of the U.S. American Indian and Alaska Native population.9 IHS coverage of these populations varies by region. It does not include American Indian and Alaska Native people who are members of non-federally recognized tribes, and it underrepresents those who live in certain urban areas. People who are eligible to receive IHS services have sufficient native ancestry in a federally recognized tribe to be classified accurately as an American Indian or Alaska Native person.
As a standard practice, central cancer registries classify race as coded in the medical record. To address American Indian and Alaska Native misclassification in cancer registry data, selected registries in CDC’s NPCR and all registries in the National Cancer Institute’s (NCI’s) SEER program linked their central cancer registry data to the IHS administrative records database. The NPCR linkage covered cases diagnosed from 1995 to 2019, and the SEER linkage covered cases diagnosed from 1988 to 2019. Results of the linkage were captured in the data element IHS Link (NAACCR data item 192).2 Central cancer registries include race and IHS Link in their annual data submissions to CDC or NCI. Using the race and IHS Link data elements, CDC and NCI created a recoded race variable. If a cancer case had an IHS Link value that indicated a match to IHS and its race was coded as White, other, or unknown, then the recoded race variable was coded as American Indian and Alaska Native. Although the linkage with IHS does not completely resolve the classification of race for American Indian and Alaska Native cases, it helps provide a more comprehensive and accurate picture of the cancer burden in this population.
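The recoding rule described above can be sketched as a small function. This is an illustration only: the plain-text race labels, the boolean match flag, and the function name are assumptions, and actual registry data use NAACCR numeric codes that are not reproduced here.

```python
AIAN = "American Indian and Alaska Native"

# Race values that the rule allows to be reassigned after an IHS match.
# Labels are illustrative stand-ins for the registry's coded values.
REASSIGNABLE = {"White", "Other", "Unknown"}


def recode_race(race, ihs_link_matched):
    """Return the recoded race for one case.

    race: recorded race label (hypothetical plain-text form).
    ihs_link_matched: True when IHS Link indicates a match to IHS records.
    """
    if ihs_link_matched and race in REASSIGNABLE:
        return AIAN
    # Under the rule as stated, every other case keeps its recorded race.
    return race


print(recode_race("White", True))    # reassigned
print(recode_race("White", False))   # unchanged
```

Keeping the original race field alongside the recoded one, rather than overwriting it, lets an analyst count how many cases the linkage reclassified.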
Method 2: Restriction to IHS Purchased/Referred Care Delivery Areas
The IHS Purchased/Referred Care Delivery Area (PRCDA) is the geographic area within which the IHS makes purchased/referred care available to members of an identified Indian community who reside in the area. The IHS uses it to determine eligibility for services not directly available within the IHS. The IHS PRCDA consists of counties that include all or part of an American Indian or Alaska Native reservation or have a common boundary with a federally recognized tribal land, as defined in the October 10, 2017 Federal Register (82 FR 47004). There are 36 states that have at least one PRCDA-designated county. The PRCDA counties have higher proportions of American Indian and Alaska Native people in relation to the total population than non-PRCDA counties, with 53.2% of the U.S. American Indian and Alaska Native population residing in the 651 counties designated as PRCDA. Linkage studies have indicated more accurate race classification for American Indian and Alaska Native persons in PRCDA counties.9 10 11 12 13
Method 3: Restriction to non-Hispanic populations
Updated bridged intercensal population estimates significantly overestimated the number of American Indian and Alaska Native persons of Hispanic origin.14 15 Because these population estimates are used as denominators in rate calculations, larger than expected denominators can result in underestimation of rates. Studies demonstrate that restricting analysis to non-Hispanic populations can improve the accuracy of cancer incidence and death rate estimates among American Indian and Alaska Native people.15
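The effect of an inflated denominator can be seen with a crude rate calculation. The numbers below are hypothetical and chosen only to make the arithmetic easy to follow. They are not estimates for any population.

```python
def rate_per_100k(cases, population):
    """Crude rate per 100,000 population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return cases * 100_000 / population


# Hypothetical values for illustration only.
cases = 50
actual_population = 100_000
overestimated_population = 125_000   # denominator inflated by 25%

rate_with_actual = rate_per_100k(cases, actual_population)
rate_with_inflated = rate_per_100k(cases, overestimated_population)

print(rate_with_actual, rate_with_inflated)
```

With the same case count, the inflated denominator yields a lower rate. Removing the group whose population is overestimated from both the numerator and the denominator avoids this bias.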
Data for American Indian and Alaska Native people in the U.S. Cancer Statistics Data Visualizations tool
The U.S. Cancer Statistics Data Visualizations tool presents national, state, and county data by race, including American Indian and Alaska Native. The national data include American Indian and Alaska Native populations in all U.S. counties. These data use the results of the IHS linkage to classify race. Because no other restrictions were applied, they may still underestimate the cancer burden relative to previously published cancer incidence rates that focus on American Indian and Alaska Native people. State- and county-specific American Indian and Alaska Native data are not presented for four states that opted not to present these data (Illinois, Kansas, New Jersey, and New York).
Data for American Indian and Alaska Native people in the At a Glance section
The U.S. Cancer Statistics Data Visualizations tool’s American Indian and Alaska Native restricted to PRCDA only module presents data from the United States Cancer Statistics American Indian and Alaska Native Incidence Analytic Database (USCS AIAD) in the tool’s At a Glance section. This database uses the three methods described above to improve the accuracy of cancer burden estimates among American Indian and Alaska Native people—
- First, this database uses the recoded race variable to classify race; only people of American Indian and Alaska Native race or White race (as comparison) are included in the module.
- Second, the database is restricted to persons residing in PRCDA counties.
- Third, the database is restricted to persons of non-Hispanic origin.
This database includes data elements specific to the American Indian and Alaska Native population, such as IHS Region and PRCDA county.
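The three restrictions listed above amount to a filter applied to each case record. The sketch below assumes a simplified record with three hypothetical fields. It does not reflect the actual layout of the USCS AIAD.

```python
from dataclasses import dataclass

AIAN = "American Indian and Alaska Native"


@dataclass
class Case:
    # Field names are hypothetical and chosen for readability.
    recoded_race: str     # race after applying the IHS linkage recode
    prcda_county: bool    # True if the county of residence is PRCDA-designated
    hispanic: bool        # True if classified as Hispanic


def in_restricted_set(case):
    """Apply the three restrictions described above to one case."""
    return (
        case.recoded_race in {AIAN, "White"}   # 1. AIAN, or White as comparison
        and case.prcda_county                  # 2. PRCDA counties only
        and not case.hispanic                  # 3. non-Hispanic only
    )


# Made-up records for illustration only.
cases = [
    Case(AIAN, True, False),      # kept
    Case("White", True, False),   # kept as comparison group
    Case(AIAN, False, False),     # dropped: not a PRCDA county
    Case(AIAN, True, True),       # dropped: Hispanic
    Case("Black", True, False),   # dropped: race not included in the module
]
selected = [c for c in cases if in_restricted_set(c)]
print(len(selected))
```

The same restrictions would also need to be applied to the population denominators so that numerators and denominators describe the same group.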
The USCS AIAD data can be displayed for all IHS regions combined or by six IHS regions: Alaska, Pacific Coast, Southwest, Northern Plains, Southern Plains, and East. The states grouped by IHS region are—
- Alaska: Alaska.
- Pacific Coast: California, Idaho, Oregon, and Washington.
- Southwest: Arizona, Colorado, Nevada, New Mexico, and Utah.
- Northern Plains: Indiana, Iowa, Michigan, Minnesota, Montana, Nebraska, North Dakota, South Dakota, Wisconsin, and Wyoming.
- Southern Plains: Kansas, Oklahoma, and Texas.
- East: Alabama, Connecticut, Florida, Louisiana, Massachusetts, Maine, Mississippi, New York, North Carolina, Pennsylvania, Rhode Island, South Carolina, and Virginia.
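The grouping above can be held as a simple lookup table. The sketch below copies the state lists from this section. The variable and function names are illustrative.

```python
# Region-to-state grouping as listed in this section.
IHS_REGION_STATES = {
    "Alaska": ["Alaska"],
    "Pacific Coast": ["California", "Idaho", "Oregon", "Washington"],
    "Southwest": ["Arizona", "Colorado", "Nevada", "New Mexico", "Utah"],
    "Northern Plains": [
        "Indiana", "Iowa", "Michigan", "Minnesota", "Montana",
        "Nebraska", "North Dakota", "South Dakota", "Wisconsin", "Wyoming",
    ],
    "Southern Plains": ["Kansas", "Oklahoma", "Texas"],
    "East": [
        "Alabama", "Connecticut", "Florida", "Louisiana", "Massachusetts",
        "Maine", "Mississippi", "New York", "North Carolina", "Pennsylvania",
        "Rhode Island", "South Carolina", "Virginia",
    ],
}

# Invert the table so a state name maps to its region.
STATE_TO_REGION = {
    state: region
    for region, states in IHS_REGION_STATES.items()
    for state in states
}


def ihs_region(state):
    """Return the IHS region for a state, or None if the state is not listed."""
    return STATE_TO_REGION.get(state)


print(ihs_region("Oklahoma"))
print(len(STATE_TO_REGION))
```

States that do not appear in any list return `None`, which keeps them out of region-level summaries instead of assigning them to a region by default.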
The percentages of the American Indian and Alaska Native population living in PRCDA-designated counties by IHS region from 2015–2019 were—
- Pacific Coast=60.7%.
- Northern Plains=54.0%.
- Southern Plains=56.6%.
- Total United States=53.2%.
Studies have shown substantial variation in rates in the American Indian and Alaska Native population by IHS region.16 17 IHS regions have been presented in several publications focusing on American Indian and Alaska Native people, and this approach was determined to be preferable to the use of smaller jurisdictions, such as IHS Administrative Areas, which yielded less stable estimates.10
References
1. O’Malley C, Hu KU, West DW. North American Association of Central Cancer Registries: Race and Ethnicity Identifier Assessment Project. Springfield (IL): North American Association of Central Cancer Registries; 2001.
2. Havener L, Hulstrom D. Standards for Cancer Registries Vol. II: Data Standards and Data Dictionary. 10th ed., version 11. Springfield (IL): North American Association of Central Cancer Registries; 2004.
3. Miniño AM, Heron MP, Smith BL, Kochanek K. Deaths: Final data for 2004. National Vital Statistics Reports 2007;55(19).
4. Arias E, Heron M, Hakes JK. The validity of race and Hispanic-origin reporting on death certificates in the United States: An update. Vital and Health Statistics 2016;2(172).
5. Clegg LX, Reichman ME, Hankey BF, Miller BA, Lin YD, Johnson NJ, Schwartz SM, Bernstein L, Chen VW, Goodman MT, Gomez SL, Graff JJ, Lynch CF, Lin CC, Edwards BK. Quality of race, Hispanic ethnicity, and immigrant status in population-based cancer registry data: implications for health disparity studies. Cancer Causes and Control 2007;18(2):177–187.
6. NAACCR Race and Ethnicity Work Group. NAACCR Asian Pacific Islander Identification Algorithm [NAPIIA v1.2.1]. Springfield (IL): North American Association of Central Cancer Registries; 2008.
7. Boscoe FP. Issues with the coding of Asian race in cancer registration. Journal of Registry Management 2007;34(4):135–139.
8. Boscoe FP, Schymura MJ, Hsieh M, Williams MA, Henry KA. Issues with the coding of Pacific Islanders in central cancer registries. Journal of Registry Management 2008;35(2):47–51.
9. Jim MA, Arias E, Seneca DS, Hoopes MJ, Jim CC, Johnson NJ, Wiggins CL. Racial misclassification of American Indians and Alaska Natives by Indian Health Service Contract Health Service Delivery Area. American Journal of Public Health 2014;104(6 suppl 3):S295–S302.
10. Espey DK, Wiggins CL, Jim MA, Miller BA, Johnson CJ, Becker TM. Methods for improving cancer surveillance data in American Indian and Alaska Native populations. Cancer 2008;113(5 suppl):1120–1130.
11. Sugarman JR, Holliday M, Ross A, Castorina J, Hui Y. Improving American Indian cancer data in the Washington State Cancer Registry using linkages with the Indian Health Service and tribal records. Cancer 1996;78(7 suppl):1564–1568.
12. Frost F, Taylor V, Fries E. Racial misclassification of Native Americans in a Surveillance, Epidemiology, and End Results cancer registry. Journal of the National Cancer Institute 1992;84(12):957–962.
13. Kwong SL, Perkins CL, Snipes KP, Wright WF. Improving American Indian cancer data in the California Cancer Registry by linkage with the Indian Health Service. Journal of Registry Management 1998;25(1):17–20.
14. Arias E, Xu J, Jim MA. Period life tables for the non-Hispanic American Indian and Alaska Native population, 2007–2009. American Journal of Public Health 2014;104:S312–S319.
15. Espey DK, Jim MA, Richards TB, Begay C, Haverkamp D, Roberts D. Methods for improving the quality and completeness of mortality data for American Indians and Alaska Natives. American Journal of Public Health 2014;104:S286–S294.
16. Wiggins CL, Espey DK, Wingo PA, Kaur JS, Wilson RT, Swan J, Miller BA, Jim MA, Kelly JJ, Lanier AP. Cancer among American Indians and Alaska Natives in the United States, 1999–2004. Cancer 2008;113(5 suppl):1142–1152.
17. White MC, Espey DK, Swan J, Wiggins CL, Eheman C, Kaur JS. Disparities in cancer mortality and incidence among American Indians and Alaska Natives in the United States. American Journal of Public Health 2014;104:S377–S387.