CDC - Methods for County-Level Estimates - Interactive Atlas - Diabetes DDT

Methodology for County-Level Estimates

Data Sources and Methodology for County-Level Estimates of Diagnosed Diabetes and Selected Risk Factors

How to Read the Maps of County-Level Estimates of Diagnosed Diabetes and Selected Risk Factors

Methodology for Mapping County-Level Estimates of Diagnosed Diabetes and Selected Risk Factors

What method was used to create county-level estimates?

The prevalence and incidence of diagnosed diabetes and selected risk factors, by county and by county and sex, were estimated using data from CDC's Behavioral Risk Factor Surveillance System (BRFSS)1 and from the U.S. Census Bureau's Population Estimates Program.2 The BRFSS is an ongoing, monthly, state-based telephone survey of the adult population. The survey provides state-specific information on behavioral risk factors and preventive health practices. Respondents were considered to have diabetes if they responded "yes" to the question, "Has a doctor ever told you that you have diabetes?" Women who indicated that they had diabetes only during pregnancy were not considered to have diabetes. Respondents' incident diabetes status was calculated by comparing their age at diagnosis to their current age. Age at diagnosis was obtained from the Diabetes Module question, "How old were you when you were told you have diabetes?" Respondents were considered obese if their body mass index was 30 or greater. Body mass index (weight [kg] / height [m]²) was derived from self-reported height and weight. Respondents were considered to be physically inactive if they answered "no" to the question, "During the past month, other than your regular job, did you participate in any physical activities or exercises such as running, calisthenics, golf, gardening, or walking for exercise?" Major changes to BRFSS survey methods began in 2011.
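The obesity definition can be illustrated with a short sketch. Body mass index is conventionally weight in kilograms divided by the square of height in meters, and the cutoff of 30 comes from the text above. The function names are hypothetical and are not part of any BRFSS software.

```python
def body_mass_index(weight_kg: float, height_m: float) -> float:
    """Return BMI: weight in kilograms divided by height in meters squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def is_obese(weight_kg: float, height_m: float, cutoff: float = 30.0) -> bool:
    """Apply the 'BMI of 30 or greater' definition to self-reported values."""
    return body_mass_index(weight_kg, height_m) >= cutoff
```

For example, a respondent reporting 100 kg and 1.75 m has a BMI of about 32.7 and would be classified as obese, whereas one reporting 90 kg at the same height (BMI about 29.4) would not.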

Three years of data were used to improve the precision of the year-specific county-level estimates of diagnosed diabetes and selected risk factors. For example, 2003, 2004, and 2005 were used for the 2004 estimate and 2004, 2005, and 2006 were used for the 2005 estimate. Estimates were restricted to adults 20 years of age or older to be consistent with population estimates from the U.S. Census Bureau. The U.S. Census Bureau provides year-specific county population estimates by demographic characteristics—age, sex, race, and Hispanic origin.
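The two examples above suggest a window centered on the estimate year. A minimal sketch of that pattern follows; the function name is hypothetical, and the assumption that every estimate year follows the same centered pattern is inferred from the two examples rather than stated outright.

```python
from typing import Tuple


def pooled_survey_years(estimate_year: int) -> Tuple[int, int, int]:
    """Return the three survey years pooled for one estimate year.

    Assumes a window centered on the estimate year, as in the
    2004 and 2005 examples given in the text.
    """
    return (estimate_year - 1, estimate_year, estimate_year + 1)
```

Here `pooled_survey_years(2004)` returns `(2003, 2004, 2005)`, matching the first example in the text.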

The county-level estimates for more than 3,200 counties or county equivalents (e.g., parish, borough, municipio) in the 50 U.S. states, Puerto Rico, and the District of Columbia were developed using modern small area estimation techniques.3 This approach employs a statistical model that “borrows strength” in making an estimate for one county from BRFSS data collected in other counties. Bayesian multilevel modeling techniques were used to obtain these estimates. Multilevel logistic regression models with random effects of demographic variables (age 20–44, 45–64, 65+; race; sex) at the county level and at the state level were developed. Models were fit using a simulation method known as Markov chain Monte Carlo. The model specification for prevalence is given in Cadwell et al.4

The model specification for incidence is given in Barker et al.5

For all years, rates were age adjusted by calculating age-specific rates for the following three age groups: 20–44, 45–64, and 65+. A weighted sum based on the distribution of these three age groups from the 2000 census was then used to adjust the rates by age. The weights used were 0.52, 0.31, and 0.17, respectively.
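The weighted sum can be sketched as follows. The weights are the ones stated above; the function name and the example age-specific rates are hypothetical and are used only to show the arithmetic.

```python
from typing import Dict

# Weights from the 2000 census age distribution, as given in the text.
STANDARD_WEIGHTS: Dict[str, float] = {"20-44": 0.52, "45-64": 0.31, "65+": 0.17}


def age_adjusted_rate(age_specific_rates: Dict[str, float]) -> float:
    """Return the weighted sum of age-specific rates over the three age groups."""
    missing = set(STANDARD_WEIGHTS) - set(age_specific_rates)
    if missing:
        raise KeyError(f"missing age groups: {sorted(missing)}")
    return sum(
        weight * age_specific_rates[group]
        for group, weight in STANDARD_WEIGHTS.items()
    )


# Hypothetical age-specific prevalence (percent) for one county.
example_rates = {"20-44": 3.0, "45-64": 12.0, "65+": 20.0}
adjusted = age_adjusted_rate(example_rates)
```

With these made-up rates, the adjusted value is 0.52 × 3.0 + 0.31 × 12.0 + 0.17 × 20.0, or approximately 8.68 percent.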

Ranks for county-level data of diagnosed diabetes and selected risk factors were based on age-adjusted prevalence rates. As part of the model fitting process, we generated and saved 2,000 draws from the posterior distribution of each county's age-adjusted prevalence rate. For each of these draws, we sorted the counties by prevalence and saved the counties' ranks. This gave us 2,000 draws from the posterior distribution of each county's rank. We then used the median for the rank estimate and the 5th and 95th percentiles for a 90% confidence interval.
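The ranking procedure can be sketched on simulated draws. This is an illustration under stated assumptions, not the code used to produce the published ranks: the draws below are random numbers rather than model output, rank 1 is assumed to mean the lowest prevalence, ties are broken by input order, and percentiles use the nearest-rank definition. All names are hypothetical.

```python
import math
import random
import statistics
from typing import Dict, List, Tuple


def nearest_rank_percentile(sorted_values: List[int], pct: float) -> int:
    """Nearest-rank percentile of an already sorted list (assumed definition)."""
    k = max(1, math.ceil(pct * len(sorted_values) / 100))
    return sorted_values[k - 1]


def summarize_ranks(draws: Dict[str, List[float]]) -> Dict[str, Tuple[float, int, int]]:
    """Map each county to (median rank, 5th percentile, 95th percentile)."""
    counties = list(draws)
    n_draws = len(draws[counties[0]])
    ranks: Dict[str, List[int]] = {county: [] for county in counties}
    for i in range(n_draws):
        # Sort counties by prevalence within this draw; rank 1 = lowest (assumption).
        ordered = sorted(counties, key=lambda county: draws[county][i])
        for rank, county in enumerate(ordered, start=1):
            ranks[county].append(rank)
    summary = {}
    for county, county_ranks in ranks.items():
        county_ranks.sort()
        summary[county] = (
            statistics.median(county_ranks),
            nearest_rank_percentile(county_ranks, 5),
            nearest_rank_percentile(county_ranks, 95),
        )
    return summary


# Simulated draws for three made-up counties (not real estimates).
rng = random.Random(0)
simulated = {
    name: [rng.gauss(mean, 0.5) for _ in range(2000)]
    for name, mean in [("A", 5.0), ("B", 10.0), ("C", 15.0)]
}
summary = summarize_ranks(simulated)
```

Because the three simulated counties are widely separated, each one's rank interval collapses to a single value. With real draws, counties with similar prevalence would swap places across draws and produce wider intervals.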

Are the same types of data available for all years?

Age-adjusted rates and rankings for county-level estimates of diabetes prevalence and related risk factors are available for all years from 2004 through the most recent year of available data. For Puerto Rico municipios, however, only diagnosed diabetes prevalence data are available for those years.

Age-adjusted rates for county-level estimates of diabetes prevalence and related risk factors by sex are available starting in 2009. Ranks were not computed for county-level estimates of diabetes prevalence by sex because of high variability in the estimates.

Age-adjusted rates for county-level estimates of diabetes incidence are available for all years from 2004 through the most recent year of available data. However, incidence data are not available for Puerto Rico municipios. Ranks were not computed for county-level estimates of diabetes incidence because of high variability in the estimates.


Can I download the map data for county-level estimates?

Excel files with county estimates for the entire nation and for each state are available for download. Click the Download Data button, then select an indicator. Next, select either the nation, which contains data for all the states, or the individual state whose data you want to download. The files are saved in XML format but can be opened and viewed in Excel. If you wish to import the data into statistical software, you will need to save the XML file as an XLS file in Excel.

How do I access the various types of maps available?

Both national-level and state-level views are available for county estimates. When you open the county data report, the national map is displayed. For a state-level view, click the "Select State" button and select a state; the map will zoom in to that state. To remove the selected state, click the "Deselect State" button. To return to the national map, click the "Zoom Out" button. To select an indicator, click the "Indicator" button, which displays all available indicators, and choose one to display.

What factors do I need to choose to display a map?

Click the "Indicator" button, then click an indicator. Next, select the data type (percentage, age-adjusted percentage, rate, or age-adjusted rate) and then the year. To change the data classification, click the "Legend Settings" button. You can change the number of classes, from a minimum of 2 to a maximum of 10, and you can change the data classifier to equal interval, quantile, natural break, or continuous. The quartile cut-offs may differ from those presented elsewhere on the Data & Trends Web site because different software was used for data classification. To return to the original map settings, return to the "Indicator" button and select the indicator, data type, and year.
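To show why the choice of classifier and software matters, here is a minimal sketch of two of the classifiers named above, equal interval and quantile. It is not the atlas's implementation. The function names and sample values are hypothetical, and the quantile cut points use the default interpolation of Python's `statistics.quantiles`, which is one of several conventions.

```python
import statistics
from typing import List


def check_class_count(n_classes: int) -> None:
    """Enforce the 2 to 10 class range described in the text."""
    if not 2 <= n_classes <= 10:
        raise ValueError("number of classes must be between 2 and 10")


def equal_interval_breaks(values: List[float], n_classes: int) -> List[float]:
    """Cut points that split the data range into equally wide classes."""
    check_class_count(n_classes)
    low, high = min(values), max(values)
    width = (high - low) / n_classes
    return [low + width * i for i in range(1, n_classes)]


def quantile_breaks(values: List[float], n_classes: int) -> List[float]:
    """Cut points that place roughly equal numbers of values in each class."""
    check_class_count(n_classes)
    return statistics.quantiles(values, n=n_classes)


# Made-up prevalence values with one high outlier.
sample = [4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 30.0]
interval_cuts = equal_interval_breaks(sample, 4)
quantile_cuts = quantile_breaks(sample, 4)
```

With the outlier present, the equal-interval cut points are 10.5, 17.0, and 23.5, so seven of the eight values fall in the lowest class, while the quantile cut points all fall below 10.5 and spread the values across classes. Different quantile interpolation rules shift the cut points slightly, which is one way that cut-offs from different software can disagree.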

How can I use the maps to look at trends?

For the national map, click the play button on the time slider above the map to view trends over time for the nation. For a state map, select a state and then click the play button on the time slider to display trends at the state level.

How do I interpret the different colors in the maps of county-level estimates?

Colors used in the shaded area maps represent the different levels of the scale. The lighter color represents the lowest level of the scale whereas the darker color represents the highest level of the scale.

Can I use the county maps and estimates to make comparisons or rank counties?

Caution should be exercised in making comparisons based on the county maps and estimates. The estimates are intended as individual point estimates. Significance testing or hypothesis testing may be inappropriate. The maps are presented for displaying possible geographic patterns and stimulating further investigation, but are not intended as formal representations of similarities and differences.

Bayesian 95% confidence intervals and standard deviations are provided as precision indicators of the individual county-level point estimates and should be used in data analyses.

One should not assume that counties mapped in different colors have significantly different prevalence. The county estimates are grouped in categories by various methods to produce a state or national map. This grouping does not incorporate the standard deviation or confidence interval and does not imply any formal comparison between counties.

How were ranks created for the data?

Ranks for county-level data of diagnosed diabetes and selected risk factors were based on age-adjusted prevalence rates. Models were fit using a Bayesian simulation method known as Markov Chain Monte Carlo.3-5 As part of the model fitting process we generated and saved 2,000 draws from the distribution of each county's age-adjusted prevalence rate. For each of these draws we sorted the counties by prevalence and saved the counties' ranks. This gave us 2,000 draws from the distribution of each county's rank. We then used the median for the rank estimate and the 5th and 95th percentiles for a 90% confidence interval. Note that ranks for Puerto Rico were not included with the national dataset because Puerto Rico ranks were not generated using the national data. Ranks for Puerto Rico are specific to that territory.

How can we use the county ranks?

A county's rank is a reflection of relative burden. The associated confidence interval quantifies the uncertainty associated with a county's rank and determines the extent to which conclusions may be based on ranks. For example, if a county's rank confidence interval is entirely below 1571, which is the median rank for all counties, we could confidently place that county in the lower half of counties.
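The comparison against the median rank can be written as a simple rule. In this sketch the median rank of 1571 comes from the text; the function name and category labels are hypothetical, and intervals that touch or span the median are treated as inconclusive.

```python
def classify_by_median(rank_lower: int, rank_upper: int, median_rank: int = 1571) -> str:
    """Classify a county by where its rank interval sits relative to the median rank."""
    if rank_lower > rank_upper:
        raise ValueError("rank_lower must not exceed rank_upper")
    if rank_upper < median_rank:
        return "below median"
    if rank_lower > median_rank:
        return "above median"
    # The interval includes the median rank, so no confident placement is made.
    return "indeterminate"
```

For instance, a county with a rank interval of 200 to 900 would be labeled "below median", whereas an interval of 1400 to 1800 spans the median and would be labeled "indeterminate".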

How can we map the county ranks?

For each indicator, confidence intervals of counties' ranks were used to identify counties that were either below the median rank for all counties or above the median rank for all counties. You can obtain the maps showing counties above and below the median rank by selecting either the County Ranks report or by clicking on the "Low-High Rank Maps" button in the County Data report. State-level maps are not available for ranks because the counties' ranks are based on the national estimates. For more information about mapping county ranks, see the related Morbidity and Mortality Weekly Report.

Methodology for Mapping County-Level Estimates of Diagnosed Diabetes and Selected Risk Factors

What method was used to create the maps of county-level estimates?

The maps were created by merging the modeled estimates, in database format, with geographic boundary files called shapefiles. In this manner, the statistical data in the database were spatially referenced with their associated state and county boundaries. As a result, the data can be viewed as a map, and the user can interactively map the geospatially based data. The Albers Equal-Area (Continental United States) projection was used for the national maps, and the NAD 1983 UTM Zone 14N map projection was used for the state maps.


What color sequences were used for the maps?

Color schemes were chosen based on the number of data classes or categories, the types of data being mapped (e.g., number of adults versus percentage of adults), consideration of the display devices to be used for the resulting maps, and the need to avoid colors that cannot be differentiated by individuals with impaired color vision.7 The color schemes for the maps were selected by referring to ColorBrewer (http://www.colorbrewer.org), an online tool for selecting color schemes.

References

1. CDC. Behavioral Risk Factor Surveillance System Web site. http://www.cdc.gov/brfss/index.htm. Accessed October 10, 2012.
2. U.S. Census Bureau. Population Estimates Web site. http://www.census.gov/popest/index.html. Accessed October 10, 2012.
3. Rao JNK. Small Area Estimation. Hoboken, New Jersey: John Wiley & Sons, Inc; 2003.
4. Cadwell BL, Thompson TJ, Boyle JP, Barker LE. Bayesian small area estimates of diabetes prevalence by U.S. county, 2005. Journal of Data Science 2010;8(1):173-188.
5. Barker LE, Thompson TJ, Kirtland KA, Boyle JP, Geiss LS, McCauley MM, Albright AL. Bayesian small area estimates of diabetes incidence by United States county, 2009. Journal of Data Science 2013;11:249-269.
6. Malec D, Sedransk J, Moriarity CL, LeClere FB. Small area inference for binary variables in the National Health Interview Survey. Journal of the American Statistical Association 1997;92(439):815-826.
7. Brewer CA. Basic mapping principles for visualizing cancer data using geographic information systems (GIS). American Journal of Preventive Medicine 2006;30(2S):S25-S36.
