Mapping Census Tract Clusters of Type 2 Diabetes in a Primary Care Population

This map shows areas with significantly high and significantly low prevalence of diabetes and prediabetes in a health center population in Chicago. Prevalence was determined by ICD (International Classification of Diseases) codes and measured hemoglobin A1c (HbA1c). The map highlights regional clusters and isolated areas of diabetes prevalence that could be targeted with interventions to improve health outcomes. Diagnoses determined by ICD codes are shown in colors as hot and cold spot cluster cores corresponding to “high–high” (HH) and “low–low” (LL) LISA (local indicator of spatial autocorrelation) statistics, where selected census tracts and neighboring tracts both have high rates (HH) or both have low rates (LL) of diabetes. Hot spot outliers have high diabetes rates compared with neighboring tracts (“high–low” [HL]), whereas cold spot outliers have low diabetes rates compared with neighboring tracts (“low–high” [LH]). Hot spots of prediabetes and diabetes determined by measured HbA1c levels are also shown. LISA significance set at P < .05. Supermarket data are from Kolak et al (1). Census tract and community area boundary data are from the Chicago Data Portal (2,3). Basemap imagery is from OpenStreetMap, Leaflet, and Carto.


Background
Although much effort has been made by public health agencies to geographically plot chronic diseases at the national, state, and city levels, information is limited on how disease is distributed in smaller geographic areas or populations, such as a health center population (4). Our objective was to identify patterns of type 2 diabetes in a patient population of a large urban federally qualified health center by using census tracts as a proxy for neighborhoods. This approach expands on other longitudinal studies that found associations between rates of type 2 diabetes and neighborhood social and physical environment characteristics such as access to healthy foods (5-7).
As public health data have become more available, spatial analysis of disease prevalence at local levels using clinical records has emerged as a powerful population-based health tool (8,9). Adapting these best methodological practices to an individual health center may provide additional strategies for identifying localized areas of health risk for targeting interventions and improving care in a health center's population. Ultimately, merging GIS (geographic information systems) capabilities with primary care patient panels is a novel approach to guide disease management strategies and community outreach in a local health care system.

Data Sources and Map Logistics
We generated maps from data extracted from the electronic medical records (N = 10,523) of a primary care patient population seen from August 1, 2015, to September 30, 2017, at a health center in Chicago. Residential addresses of patients were geocoded and converted to spatial data points. Points were then joined and aggregated to Chicago census tract boundaries (2). We created a subset of census tracts as the core service area; 31% of all patient-residing census tracts (140 of 455 tracts) included most of the health center's patients (n = 9,126) and were located within 5 miles of the health center. This selection process reduced the number of spurious census tract outliers (ie, those with few health center patients). The study was approved by the Northwestern University Institutional Review Board.
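The aggregation and core-service-area selection described above can be illustrated with a short Python sketch. It is not the authors' workflow (they report using R). It assumes each geocoded patient already carries a census tract ID and that distance is measured from the tract centroid to the health center, which the text does not specify. All function names, coordinates, and tract IDs are hypothetical.

```python
import math
from collections import Counter

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))

def patients_per_tract(patient_tract_ids):
    """Aggregate geocoded patients to tracts by counting tract IDs."""
    return Counter(patient_tract_ids)

def core_service_area(tract_centroids, center, max_miles=5.0):
    """IDs of tracts whose centroid lies within max_miles of the center.

    Assumption: centroid distance stands in for 'located within 5 miles'.
    """
    lat0, lon0 = center
    return sorted(
        tract_id
        for tract_id, (lat, lon) in tract_centroids.items()
        if haversine_miles(lat0, lon0, lat, lon) <= max_miles
    )

# Hypothetical data for illustration only
health_center = (41.88, -87.63)
tract_centroids = {"A": (41.90, -87.65), "B": (42.00, -87.70)}
patient_tract_ids = ["A", "A", "B", "A"]

counts = patients_per_tract(patient_tract_ids)
core = core_service_area(tract_centroids, health_center)
core_patients = sum(counts[t] for t in core)
```

Restricting the analysis to tracts near the health center drops tracts with very few patients, which is the stated reason for the selection.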
We identified 1,246 patients with a type 2 diabetes diagnosis using the International Classification of Diseases (ICD), Ninth Revision, Clinical Modification codes 250.xx and ICD, Tenth Revision, Clinical Modification codes E11.xx; 854 of these patients resided in the core service area. We also classified patients by measured hemoglobin A1c (HbA1c) levels; patients with levels from 5.4% to 6.4% were classified as having prediabetes, and patients with levels of 6.5% or higher were classified as having diabetes. Fifteen percent of patients in the core service area had HbA1c levels ranging from 5.4% to more than 14%; in the wider population, 25% of patients were classified as such. On the basis of HbA1c levels, 959 patients (of the total population) had type 2 diabetes, of whom 49 did not have an ICD code for diabetes. We used 5-year averages from the 2016 American Community Survey, as prepared by the Centers for Disease Control and Prevention's Social Vulnerability Index database (10), for the following social determinants of health: percentage of persons living in poverty, rate of unemployment, per capita income, percentage of persons with no high school diploma, percentage of single parents, percentage limited in speaking English, percentage living in crowded housing, percentage having no vehicles, and percentage uninsured. Data for these covariates were extracted and joined to data for census tract areas.
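The HbA1c classification rule stated above can be written as a short Python sketch. The cut points (5.4% and 6.5%) come from the text. The handling of values between 6.4% and 6.5%, the function names, and the sample records are assumptions made for illustration.

```python
def classify_hba1c(hba1c_percent):
    """Classify one measured HbA1c value (in percent) using the stated cut points."""
    if hba1c_percent >= 6.5:
        return "diabetes"
    if hba1c_percent >= 5.4:
        # Assumption: values between 6.4 and 6.5 fall in the prediabetes band.
        return "prediabetes"
    return "below range"

def lab_only_diabetes(hba1c_by_patient, icd_diabetes_ids):
    """Patient IDs classified as diabetes by HbA1c that have no diabetes ICD code."""
    lab_ids = {
        pid for pid, value in hba1c_by_patient.items()
        if classify_hba1c(value) == "diabetes"
    }
    return lab_ids - set(icd_diabetes_ids)

# Hypothetical records for illustration only
hba1c_by_patient = {"p1": 5.2, "p2": 5.9, "p3": 6.5, "p4": 9.1}
icd_diabetes_ids = {"p4"}
unflagged = lab_only_diabetes(hba1c_by_patient, icd_diabetes_ids)
```

The set difference mirrors the comparison reported in the text between patients classified by HbA1c and patients with an ICD code for diabetes.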
We used exploratory spatial data analysis techniques on raw and population-adjusted data to analyze variation in type 2 diabetes distribution by census tract. We conducted an empirical Bayes smoothed univariate cluster and outlier detection analysis using the local indicator of spatial autocorrelation (LISA) statistic to determine type 2 diabetes prevalence (11). Empirical Bayes smoothing analysis uses a prior distribution, in this case, the average value of the sample, to correct for the variance instability associated with rates that have a small population base. We determined hot and cold spot clusters of type 2 diabetes prevalence. Hot spot clusters refer to areas that are significantly similar to their neighbors in high disease prevalence ("high-high" [HH] statistic), and cold spot clusters refer to areas that are significantly similar to their neighbors in low disease prevalence ("low-low" [LL] statistic). Clusters are composed of both cluster cores and nearby neighbors. Hot and cold spot clusters were visually identified by their cluster core in our analysis; these cores are surrounded by areas with proportionally high or low values. We also determined hot and cold spot outliers ("high-low" [HL] and "low-high" [LH]), which refer to areas that are significantly different (at P < .05) from their neighbors.
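The two steps described above, empirical Bayes smoothing followed by a local Moran (LISA) classification, can be sketched in plain Python. This is not the authors' implementation. It assumes the common global method-of-moments form of empirical Bayes smoothing, row-standardized neighbor weights, and a conditional-permutation pseudo P value; the text states only that the prior is the sample average and that significance was set at P < .05. The function names and the example neighbor structure are hypothetical.

```python
import random
from statistics import mean, pstdev

def eb_smooth(cases, population):
    """Shrink raw tract rates toward the overall rate; small tracts shrink most."""
    total_pop = sum(population)
    prior_mean = sum(cases) / total_pop
    raw = [c / p for c, p in zip(cases, population)]
    # Method-of-moments estimate of the prior variance, floored at zero
    prior_var = (sum(p * (r - prior_mean) ** 2 for r, p in zip(raw, population))
                 / total_pop - prior_mean / mean(population))
    prior_var = max(prior_var, 0.0)
    smoothed = []
    for r, p in zip(raw, population):
        denom = prior_var + prior_mean / p
        weight = prior_var / denom if denom > 0 else 0.0
        smoothed.append(weight * r + (1 - weight) * prior_mean)
    return smoothed

def local_moran(values, neighbors, permutations=999, seed=0, alpha=0.05):
    """Local Moran statistic, quadrant label, and pseudo P value per tract.

    Assumes every tract has at least one neighbor and values are not all equal.
    neighbors maps each index to a list of neighboring indices.
    """
    mu, sd = mean(values), pstdev(values)
    z = [(v - mu) / sd for v in values]
    rng = random.Random(seed)
    results = []
    for i, zi in enumerate(z):
        k = len(neighbors[i])
        lag = sum(z[j] for j in neighbors[i]) / k  # row-standardized spatial lag
        stat = zi * lag
        # Conditional permutation: hold tract i fixed, shuffle the other tracts
        others = z[:i] + z[i + 1:]
        larger = sum(
            zi * sum(rng.sample(others, k)) / k >= stat
            for _ in range(permutations)
        )
        larger = min(larger, permutations - larger)
        p_value = (larger + 1) / (permutations + 1)
        quadrant = ("H" if zi > 0 else "L") + ("H" if lag > 0 else "L")
        results.append({"I": stat, "quadrant": quadrant, "p": p_value,
                        "significant": p_value < alpha})
    return results

# Hypothetical example: four tracts in a row, each adjacent to the next
rates = eb_smooth(cases=[10, 9, 2, 1], population=[100, 100, 100, 100])
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
lisa = local_moran(rates, neighbors, permutations=99)
```

In this sketch, HH and LL labels with a significant P value correspond to hot and cold spot cluster cores, and HL and LH labels correspond to hot and cold spot outliers.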
We compared the mean values of social determinants of health indicators in diabetes hot and cold spot tract clusters with the mean values in all other core-service-area tracts using analysis of variance. We selected cluster cores and their neighboring tracts to represent complete clusters in this descriptive analysis. We added supermarket locations to the map as a proxy for access to healthy foods to demonstrate how incorporating environmental features may be useful in evaluating disease distribution (6). Although features other than supermarkets may be associated with disease occurrence, our analysis explored how clusters of type 2 diabetes may intersect with physical locations for food access. We is (12,13). We generated final maps by using R and Adobe Illustrator version 22.1.
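The analysis of variance comparison described above can be illustrated with a minimal one-way F statistic in Python. This is a sketch under stated assumptions, not the authors' code: the function name and the sample values are hypothetical, and the P value is left to a statistics library (for example, SciPy's scipy.stats.f_oneway) rather than computed here.

```python
from statistics import mean

def one_way_anova_f(*groups):
    """Return (F statistic, between-group df, within-group df) for one-way ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    # Variation of group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical tract-level values of one indicator (eg, percentage uninsured)
hot_spot_tracts = [22.0, 25.0, 19.0, 24.0]
other_tracts = [15.0, 18.0, 14.0, 17.0, 16.0]
f_stat, df_between, df_within = one_way_anova_f(hot_spot_tracts, other_tracts)
```

With two groups, as in a comparison of hot spot tracts with all other tracts, the F statistic is the square of the equal-variance t statistic.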

Highlights
Among 140 census tracts, we found 31 hot spot clusters and 25 cold spot clusters of patients with type 2 diabetes, along with 4 hot outliers and 3 cold outliers. Compared with all census tracts in the core service area, the census tracts in the hot spot clusters had significantly lower income levels and high school graduation rates and a significantly greater percentage of households with vehicles, single parents, residents with limited English proficiency, crowded housing, and persons without health insurance (Table). The raw type 2 diabetes rate in hot spot census tracts was significantly higher than in all other tracts (0.14 in 36 hot spots vs 0.10 in 104 non-hot spots; P = .005) and double the rate in cold spot census tracts (0.14 vs 0.07). We also found consistent overlap between hot spot census tracts and higher HbA1c ranges; conversely, some tracts with high HbA1c rates were not identified as hot spot tracts.

Action
Our analysis demonstrates that stable calculation at the census tract level of a health center's patient population can facilitate spatial analysis and identify patient groups with health risks in neighborhood clusters. This research can set a foundation for developing targeted interventions in a health center population at the neighborhood level to improve health outcomes. Our maps also identified neighborhoods at lower risk for type 2 diabetes and created opportunities to explore possible neighborhood resiliency factors.
Our data and findings represent a patient population and are not meant to serve as a true sample of the actual population. However, comparing the differences between hot and cold spot clusters can open avenues for mobilizing outreach at community health centers or federally qualified health centers and partnering among local resources to support interventions for the clinical population. Findings can be shared with policy makers and community advocates to influence what resources are needed and where they are needed most. Although geographic plotting of chronic diseases at the national, state, and city levels provides important information on the distribution of disease among populations, our map illustrates the use of exploratory data analysis techniques to expand the use of patient panels or registries and identify and address the health needs of vulnerable patient populations within a health system at the local level. This methodology can build bridges and partnerships between community health centers, federally qualified health centers, public health officials, and community organizations to develop neighborhood initiatives and outreach.