Interpreting Race and Ethnicity in Cancer Data

The North American Association of Central Cancer Registries (NAACCR) Race and Ethnicity Identifier Assessment Project confirmed the importance of publishing cancer rates by race and ethnicity (specifically, Hispanic origin).1 When reporting cancer incidence, race and ethnicity information is abstracted from medical records and grouped into race and ethnicity categories.2 Although registries use standardized data items and codes for both race and ethnicity, the initial collection of this information by health care facilities and practitioners and the procedures for assigning and verifying codes for race and ethnicity are not well standardized.1 Thus, some inconsistency is expected in this information.

When reporting cancer mortality, race and Hispanic origin are recorded separately on the death certificate by the funeral director as provided by an informant or, in the absence of an informant, on the basis of observation.3 Inconsistencies in the collection and coding of data on race and Hispanic origin and their effect on mortality statistics have been described.4 The net effect of misclassification is greatest for American Indians/Alaska Natives; misclassification is smaller for Asians/Pacific Islanders and Hispanics and minimal for blacks and whites. Therefore, incidence and/or mortality data published in this report may be underestimated for Asians/Pacific Islanders, American Indians/Alaska Natives, and Hispanics, possibly due to racial and Hispanic origin misclassification. CDC’s National Center for Health Statistics is working with states to improve the reporting of race and ethnicity on death certificates.

The Data Visualizations tool presents cancer incidence and mortality data for all races combined and by race and ethnicity (Hispanics). Data for Asians/Pacific Islanders and American Indians/Alaska Natives are presented only for the nation and for states with at least 50,000 population because of concerns regarding possible misclassification of race data and the relatively small sizes of these populations in the United States.

Asians/Pacific Islanders

Although state cancer registries have designated codes for race that allow them to document the occurrence of cancer in 23 Asian/Pacific Islander subpopulations,2 the subpopulations are grouped into a single Asian/Pacific Islander category because of small numbers and concerns regarding possible misclassification of race data.

Studies show excellent agreement (κ = 0.90) between Asian/Pacific Islander race in Surveillance, Epidemiology, and End Results (SEER) registry data and self-reported data from the U.S. Census.5 Studies are underway to examine the misclassification of race for Asian/Pacific Islander subpopulations in cancer registries.6, 7, 8 Nearly all National Program of Cancer Registries (NPCR) and SEER registries reassigned cases coded as "Asian, not otherwise specified" to a more specific Asian race through the standardized use of the NAACCR Asian/Pacific Islander Identification Algorithm (NAPIIA) version 1.2.
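The agreement figures in this document are reported either as a kappa statistic (κ, printed as k in some versions) or as a simple percent agreement. The sketch below shows how Cohen's kappa is computed from a cross-tabulation of two sources and why it can be lower than the raw percent agreement. The counts are hypothetical and are not taken from any cited study; the function name is an illustration only.

```python
def cohen_kappa(table):
    """Cohen's kappa for a square agreement table.

    table[i][j] = number of people coded category i by source A
    (for example, a registry) and category j by source B
    (for example, self-report).
    """
    total = sum(sum(row) for row in table)
    k = len(table)
    # Observed agreement: share of people on the diagonal.
    observed = sum(table[i][i] for i in range(k)) / total
    # Marginal shares for each source.
    row_share = [sum(row) / total for row in table]
    col_share = [sum(table[i][j] for i in range(k)) / total for j in range(k)]
    # Agreement expected by chance if the two sources were independent.
    expected = sum(r * c for r, c in zip(row_share, col_share))
    return (observed - expected) / (1 - expected)


# Hypothetical 2x2 table: rows = registry (in group, not in group),
# columns = self-report (in group, not in group).
hypothetical_table = [
    [40, 10],
    [5, 945],
]

percent_agreement = (40 + 945) / 1000
kappa = cohen_kappa(hypothetical_table)
print(f"percent agreement = {percent_agreement:.3f}, kappa = {kappa:.3f}")
```

Because kappa discounts agreement expected by chance, a small group can show very high percent agreement and a noticeably lower kappa, which is one reason the two kinds of figures are not directly comparable.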

The following NPCR registries opted not to present state- and county-specific Asian/Pacific Islander counts and rates: Delaware, Kansas, and Kentucky. The national rates presented include data for these registries.

A study reported 90% agreement between Asian/Pacific Islander race reported on death certificates and self-reported data from the U.S. Census.4

Hispanics

The overall agreement between Hispanic ethnicity collected by SEER registries and self-reported ethnicity from the U.S. Census was substantial (κ = 0.61). Hispanics were found to be underclassified in the SEER data compared with self-reports.5 Nearly all NPCR and SEER registries assigned Hispanic ethnicity through the standardized use of the NAACCR Hispanic Identification Algorithm version 2 (NHIA v2). After NHIA v2 is applied, cases not classified as Hispanic are classified as non-Hispanic, leaving no cases with unknown Hispanic status.

The following NPCR registries opted not to present state- and county-specific, NHIA-classified Hispanic counts and rates for all years: Delaware, Kentucky, and Massachusetts. The national rates presented include data for these registries.

A study reported an 88% record-by-record agreement between Hispanic origin on death certificates and self-reported data.4

Death counts and rates for Hispanics are presented at the national and state levels for all 50 states and for the District of Columbia. Hispanic origin is assigned to cancer mortality data on the basis of information collected from death certificates.

Improving Estimation of Cancer Burden among American Indians and Alaska Natives

More American Indian and Alaska Native patients are misclassified as another race in cancer registry records than patients in other racial groups. Studies have found that this racial misclassification contributes to underestimates of cancer incidence and death rates among the American Indian/Alaska Native (AI/AN) population.4, 10 Accurate determination of disease burden is a critical first step toward identifying health disparities. Methods that can improve the accuracy of cancer burden estimates among the AI/AN population are described below.

Method 1: Linkage with Indian Health Service administrative records

The Indian Health Service (IHS) provides medical services to American Indians and Alaska Natives who are enrolled members of federally recognized tribes. The IHS provides health care to about 2.2 million people, a number equivalent to about 64% of the U.S. AI/AN population.10 While IHS coverage of these populations varies by region, it does not include American Indians and Alaska Natives who are members of non-federally recognized tribes, and underrepresents those who live in certain urban areas. People who are eligible to receive IHS services have sufficient native ancestry in a federally recognized tribe to be classified accurately as an American Indian or an Alaska Native.

As a standard practice, central cancer registries classify race as coded in the medical record. To address AI/AN misclassification in cancer registry data, selected registries in CDC’s National Program of Cancer Registries and all registries in the National Cancer Institute’s Surveillance, Epidemiology, and End Results (SEER) program linked their central cancer registry data to the IHS administrative records database for cases diagnosed from 1995 to 2015 and 1988 to 2015, respectively. Results of the linkage were captured in the data element IHS Link (NAACCR data element 192).2 Central cancer registries include race and IHS Link in their annual data submissions to CDC or NCI. Using the race and IHS Link data elements, CDC and NCI created a recoded race variable. If a cancer case had an IHS Link value that indicated a match to IHS, the recoded race variable was coded as AI/AN. Although the linkage with IHS does not completely resolve the classification of race for AI/AN cases, it helps provide a more comprehensive and accurate picture of the cancer burden in this population.
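The recoding rule described above can be sketched as a small function. This is an illustration of the stated logic only, not the CDC or NCI implementation. The specific code values (race code "03" for AI/AN and IHS Link value "1" for a match) are assumptions made for the example and should be checked against the NAACCR data dictionary before use.

```python
# Assumed code values, for illustration only.
AIAN_RACE_CODE = "03"   # assumed race code for American Indian/Alaska Native
IHS_LINK_MATCH = "1"    # assumed IHS Link value meaning "matched to IHS records"


def recode_race(race, ihs_link):
    """Return the recoded race for one case.

    If the IHS Link value indicates a match to IHS records, the case is
    recoded as AI/AN regardless of the race in the medical record.
    Otherwise the originally reported race is kept.
    """
    if ihs_link == IHS_LINK_MATCH:
        return AIAN_RACE_CODE
    return race


# Hypothetical cases: (race as coded in the medical record, IHS Link value).
cases = [("01", "1"), ("01", "0"), ("03", ""), ("02", "1")]
recoded = [recode_race(race, link) for race, link in cases]
print(recoded)
```

The rule only moves cases into the AI/AN category. A case already coded as AI/AN stays AI/AN when there is no IHS match, which is consistent with the statement that linkage improves, but does not fully resolve, race classification.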

Method 2: Restriction to IHS Purchased/Referred Care Delivery Areas

The IHS Purchased/Referred Care Delivery Area (PRCDA) is the geographic area within which the IHS makes purchased/referred care available to members of an identified Indian community who reside in the area. The IHS uses it to determine eligibility for services not directly available within the IHS. The IHS PRCDA consists of counties that include all or part of an AI/AN reservation or have a common boundary with a federally recognized tribal land, as defined in the April 7, 2016 Federal Register (81 FR 20388). There are 35 states that have at least one PRCDA-designated county. The PRCDA counties have higher proportions of AI/AN persons in relation to the total population than non-PRCDA counties, with 53.1% of the U.S. AI/AN population residing in the 643 counties designated as PRCDA (these counties represent 20.5% of the 3,141 counties in the United States). Linkage studies have indicated more accurate race classification for AI/AN persons in PRCDA counties.10, 11, 12, 13, 14

Method 3: Restriction to non-Hispanic populations

Updated bridged intercensal population estimates significantly overestimated the number of AI/AN persons of Hispanic origin.15, 16 Because these population estimates are used as denominators in rate calculations, larger-than-expected denominators can result in underestimation of rates. Studies demonstrate that restricting analysis to non-Hispanic populations can improve the accuracy of cancer incidence and death rate estimates among American Indians and Alaska Natives.16
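The effect of an inflated denominator can be seen with simple arithmetic. The sketch below uses a crude rate per 100,000 and entirely hypothetical numbers. Published rates are typically age-adjusted, so this is a simplification that shows the direction of the bias only.

```python
def crude_rate_per_100k(cases, population):
    """Crude rate: cases per 100,000 persons in the population."""
    return cases * 100_000 / population


# Hypothetical values, chosen only to make the arithmetic easy to follow.
cases = 50
true_population = 100_000
overestimated_population = 125_000  # denominator inflated by 25%

rate_with_true_denominator = crude_rate_per_100k(cases, true_population)
rate_with_inflated_denominator = crude_rate_per_100k(cases, overestimated_population)

print(rate_with_true_denominator, rate_with_inflated_denominator)
```

With the same number of cases, the inflated denominator yields a lower rate, which is the underestimation that restricting the analysis to non-Hispanic populations is meant to reduce.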

AI/AN data in the U.S. Cancer Statistics Data Visualizations tool

The U.S. Cancer Statistics Data Visualizations tool presents national, state, and county data by race, including AI/AN. The national data include AI/AN populations in all U.S. counties. These data use the results of the IHS linkage to classify race. Because no other restrictions were applied, they may still underestimate the cancer burden, and may do so to a greater degree than previously published cancer incidence rates that focus on the AI/AN population. State- and county-specific AI/AN data are not presented for some states that opted not to present these data (Delaware, Illinois, Kansas, Kentucky, New Jersey, and New York).

AI/AN data in the More Topics Section

The U.S. Cancer Statistics Data Visualizations tool’s AI/AN Incidence data module presents data from the United States Cancer Statistics AI/AN Incidence Analytic Database (USCS AIAD) in the tool’s More Topics section. This database uses the three methods described above to improve the accuracy of cancer burden estimates among AI/AN:

  • First, this database uses the recoded race variable to classify race; only persons of AI/AN race or white race (as comparison) are included in the module.
  • Second, the database is restricted to persons residing in PRCDA counties.
  • Third, the database is restricted to persons of non-Hispanic origin.
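The three restrictions listed above can be read as successive filters on case records. The sketch below is an illustration under assumed field names (`recoded_race`, `prcda_county`, `hispanic`) and hypothetical records. It does not reproduce the actual structure of the USCS AIAD.

```python
def restrict_cases(records):
    """Apply the three restrictions in order and return the cases kept."""
    kept = []
    for rec in records:
        # 1. Keep only AI/AN cases and white cases (the comparison group),
        #    using the recoded race variable.
        if rec["recoded_race"] not in ("AI/AN", "White"):
            continue
        # 2. Keep only persons residing in PRCDA counties.
        if not rec["prcda_county"]:
            continue
        # 3. Keep only persons of non-Hispanic origin.
        if rec["hispanic"]:
            continue
        kept.append(rec)
    return kept


# Hypothetical records for illustration.
records = [
    {"id": 1, "recoded_race": "AI/AN", "prcda_county": True, "hispanic": False},
    {"id": 2, "recoded_race": "AI/AN", "prcda_county": False, "hispanic": False},
    {"id": 3, "recoded_race": "White", "prcda_county": True, "hispanic": True},
    {"id": 4, "recoded_race": "Black", "prcda_county": True, "hispanic": False},
    {"id": 5, "recoded_race": "White", "prcda_county": True, "hispanic": False},
]

kept_ids = [rec["id"] for rec in restrict_cases(records)]
print(kept_ids)
```

In a rate calculation, the same restrictions would also need to be applied to the population denominators so that numerators and denominators describe the same group.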

This database includes data elements specific to the AI/AN population such as IHS Region and PRCDA county.

The AIAD data can be displayed for all IHS regions combined or by six IHS regions: Alaska, Pacific Coast, Southwest, Northern Plains, Southern Plains, and East. The states grouped by IHS region are:

  • Alaska: Alaska.
  • Pacific Coast: California, Idaho, Oregon, and Washington.
  • Southwest: Arizona, Colorado, Nevada, New Mexico, and Utah.
  • Northern Plains: Indiana, Iowa, Michigan, Minnesota, Montana, Nebraska, North Dakota, South Dakota, Wisconsin, and Wyoming.
  • Southern Plains: Kansas, Oklahoma, and Texas.
  • East: Alabama, Connecticut, Florida, Louisiana, Massachusetts, Maine, Mississippi, New York, North Carolina, Pennsylvania, Rhode Island, and South Carolina.
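For analysts who need to assign a case to an IHS region, the grouping above can be expressed as a lookup table. The sketch below transcribes the state lists from this section; the variable and function names are illustrative, and states not listed return `None`.

```python
# State lists transcribed from the grouping above.
IHS_REGION_STATES = {
    "Alaska": ["Alaska"],
    "Pacific Coast": ["California", "Idaho", "Oregon", "Washington"],
    "Southwest": ["Arizona", "Colorado", "Nevada", "New Mexico", "Utah"],
    "Northern Plains": [
        "Indiana", "Iowa", "Michigan", "Minnesota", "Montana", "Nebraska",
        "North Dakota", "South Dakota", "Wisconsin", "Wyoming",
    ],
    "Southern Plains": ["Kansas", "Oklahoma", "Texas"],
    "East": [
        "Alabama", "Connecticut", "Florida", "Louisiana", "Massachusetts",
        "Maine", "Mississippi", "New York", "North Carolina", "Pennsylvania",
        "Rhode Island", "South Carolina",
    ],
}

# Invert the table so each state maps to its region.
STATE_TO_IHS_REGION = {
    state: region
    for region, states in IHS_REGION_STATES.items()
    for state in states
}


def ihs_region(state):
    """Return the IHS region for a state name, or None if the state is not listed."""
    return STATE_TO_IHS_REGION.get(state)


print(ihs_region("Texas"), len(STATE_TO_IHS_REGION))
```

The lookup covers 35 states in total, and each state appears in exactly one region.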

The percentages of the AI/AN population living in PRCDA-designated counties, by IHS region, for 2011–2016 were:

  • Alaska=100%.
  • Pacific Coast=60.1%.
  • Southwest=84.0%.
  • Northern Plains=54.3%.
  • Southern Plains=56.8%.
  • East=16.4%.
  • Total United States=53.1%.

Studies have shown substantial variation in rates in the AI/AN population by IHS region.17, 18 IHS regions have been presented in several publications focusing on AI/AN, and this approach was determined to be preferable to the use of smaller jurisdictions, such as IHS Administrative Areas, which yielded less stable estimates.11

References

  1. O’Malley C, Hu KU, West DW. North American Association of Central Cancer Registries: Race and Ethnicity Identifier Assessment Project. Springfield (IL): North American Association of Central Cancer Registries; 2001.
  2. Havener L, Hulstrom D. Standards for Cancer Registries Vol. II: Data Standards and Data Dictionary. 10th ed., version 11. Springfield (IL): North American Association of Central Cancer Registries; 2004.
  3. Miniño AM, Heron MP, Smith BL, Kochanek K. Deaths: Final data for 2004. National Vital Statistics Reports 2007;55(19).
  4. Arias E, Heron M, Hakes JK. The validity of race and Hispanic-origin reporting on death certificates in the United States: An update. Vital and Health Statistics 2016;2(172).
  5. Clegg LX, Reichman ME, Hankey BF, Miller BA, Lin YD, Johnson NJ, Schwartz SM, Bernstein L, Chen VW, Goodman MT, Gomez SL, Graff JJ, Lynch CF, Lin CC, Edwards BK. Quality of race, Hispanic ethnicity, and immigrant status in population-based cancer registry data: implications for health disparity studies. Cancer Causes and Control 2007;18(2):177–187.
  6. NAACCR Race and Ethnicity Work Group. NAACCR Asian Pacific Islander Identification Algorithm [NAPIIA v1.2.1]. Springfield (IL): North American Association of Central Cancer Registries; 2008.
  7. Boscoe FP. Issues with the coding of Asian race in cancer registration. Journal of Registry Management 2007;34(4):135–139.
  8. Boscoe FP, Schymura MJ, Hsieh M, Williams MA, Henry KA. Issues with the coding of Pacific Islanders in central cancer registries. Journal of Registry Management 2008;35(2):47–51.
  9. NAACCR Race and Ethnicity Work Group. NAACCR Guideline for Enhancing Hispanic/Latino Identification: Revised NAACCR Hispanic/Latino Identification Algorithm [NHIA v2.2.1]. Springfield (IL): North American Association of Central Cancer Registries. September 2011.
  10. Jim MA, Arias E, Seneca DS, Hoopes MJ, Jim CC, Johnson NJ, Wiggins CL. Racial misclassification of American Indians and Alaska Natives by Indian Health Service Contract Health Service Delivery Area. American Journal of Public Health 2014;104(6 suppl 3):S295–S302.
  11. Espey DK, Wiggins CL, Jim MA, Miller BA, Johnson CJ, Becker TM. Methods for improving cancer surveillance data in American Indian and Alaska Native populations. Cancer 2008;113(5 suppl):1120–1130.
  12. Sugarman JR, Holliday M, Ross A, Castorina J, Hui Y. Improving American Indian cancer data in the Washington State Cancer Registry using linkages with the Indian Health Service and tribal records. Cancer 1996;78(7 suppl):1564–1568.
  13. Frost F, Taylor V, Fries E. Racial misclassification of Native Americans in a Surveillance, Epidemiology, and End Results cancer registry. Journal of the National Cancer Institute 1992;84(12):957–962.
  14. Kwong SL, Perkins CL, Snipes KP, Wright WF. Improving American Indian cancer data in the California Cancer Registry by linkage with the Indian Health Service. Journal of Registry Management 1998;25(1):17–20.
  15. Arias E, Xu J, Jim MA. Period life tables for the non-Hispanic American Indian and Alaska Native population, 2007–2009. American Journal of Public Health 2014;104:S312–S319.
  16. Espey DK, Jim MA, Richards TB, Begay C, Haverkamp D, Roberts D. Methods for improving the quality and completeness of mortality data for American Indians and Alaska Natives. American Journal of Public Health 2014;104:S286–S294.
  17. Wiggins CL, Espey DK, Wingo PA, Kaur JS, Wilson RT, Swan J, Miller BA, Jim MA, Kelly JJ, Lanier AP. Cancer among American Indians and Alaska Natives in the United States, 1999–2004. Cancer 2008;113(5 suppl):1142–1152.
  18. White MC, Espey DK, Swan J, Wiggins CL, Eheman C, Kaur JS. Disparities in cancer mortality and incidence among American Indians and Alaska Natives in the United States. American Journal of Public Health 2014;104:S377–S387.