Statistical Methods

Learn more about the statistical methods used for the maps created with the Interactive Atlas of Heart Disease and Stroke.

All mortality, Medicare hospitalization, and Medicare hospitalization discharge data have been directly age-standardized using the 2000 US Standard Population. An age-standardized rate is a weighted average of age-specific rates, where each weight is the proportion of people in the corresponding age group of a standard population. Age adjustment allows for comparison of rates between counties with different age distributions.

Reference: Klein RJ, Schoenborn CA. Age adjustment using the 2000 projected U.S. population. Healthy People Statistical Notes, no. 20. Hyattsville, Maryland: National Center for Health Statistics; January 2001.
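As an illustration of the direct method described above, the following minimal sketch computes a weighted average of age-specific rates. The function name, the three age groups, and the weights are hypothetical; the weights shown are not the published 2000 US Standard Population values.

```python
def age_standardized_rate(events, population, standard_weights, per=100_000):
    """Directly age-standardized rate per `per` people.

    events, population: counts for each age group in the county.
    standard_weights: standard-population counts or proportions for the
    same age groups (normalized here so they sum to 1).
    """
    if not (len(events) == len(population) == len(standard_weights)):
        raise ValueError("inputs must cover the same age groups")
    total_weight = sum(standard_weights)
    rate = 0.0
    for d, n, w in zip(events, population, standard_weights):
        rate += (w / total_weight) * (d / n)  # weight x age-specific rate
    return rate * per


# Hypothetical weights for three age groups (NOT the published 2000 values).
weights = [0.50, 0.35, 0.15]

# Two counties with identical age-specific rates but different age mixes.
county_a = age_standardized_rate([10, 20, 30], [1000, 1000, 1000], weights)
county_b = age_standardized_rate([20, 20, 15], [2000, 1000, 500], weights)
```

Because both hypothetical counties have the same age-specific rates, their standardized rates agree even though their crude rates differ, which is the comparison that age adjustment is meant to support.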

The following is a list of inclusion criteria applied to the beneficiaries in the Master Beneficiary Summary File (MBSF). These inclusion criteria are used to determine what months an individual beneficiary could be eligible for inclusion. A beneficiary is considered eligible for any given month if he or she is eligible within that month for at least 1 day.

  1. Non-Null and valid gender.
  2. Non-Null and valid race/ethnicity.
  3. Beneficiary who turns 65 in the extract year or who turned 65 before January 1st of the extract year.
  4. Beneficiary who is at least 65 years old on January 1st of the extract year and is living on January 1st of the extract year, or beneficiary who turns 65 in the extract year and is living on his or her birthday.
  5. For the months of eligibility determined by elements above, the beneficiary satisfies the following:
    1. During required months, enrolled in Part A Only, or enrolled in Part A in concert with other parts:
      • Part B.
      • State buy-in.
      • Part B and state buy-in.
    2. Was enrolled as a fee-for-service beneficiary (including those enrolled in a case- or disease-management demonstration project).
    3. Was not a member of a Health Maintenance Organization (HMO).
    4. Exception rule: If the beneficiary was eligible for all 12 months of the extraction year per items 1-4 above, then the beneficiary must fulfill requirements 5a-5c for 11 of those 12 months to be included. If the beneficiary was eligible for fewer than 12 months of the extraction year per items 1-4 above, then the beneficiary must meet the criteria in 5a-5c for the full period of his or her eligibility.
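The exception rule in item 5 can be sketched as a small check on month sets. This is only an illustration: the function and argument names are hypothetical, and reading "11 months" as "at least 11 months" is an assumption, not something the criteria state explicitly.

```python
def meets_enrollment_rule(eligible_months, qualifying_months):
    """Apply the exception rule to one beneficiary for one extract year.

    eligible_months: month numbers (1-12) in which the beneficiary is
        eligible under items 1-4.
    qualifying_months: month numbers in which the beneficiary meets the
        enrollment requirements in 5a-5c.
    """
    eligible = set(eligible_months)
    if not eligible:
        return False
    met = len(eligible & set(qualifying_months))
    if len(eligible) == 12:
        # Assumption: "11 months" is read as "at least 11 of the 12 months".
        return met >= 11
    # Eligible for part of the year: every eligible month must qualify.
    return met == len(eligible)


full_year = range(1, 13)
print(meets_enrollment_rule(full_year, range(1, 12)))    # one month missed
print(meets_enrollment_rule(range(7, 13), range(8, 13)))  # partial year
```

Under these assumptions, the first call passes (11 of 12 months qualify) and the second fails (a partial-year beneficiary missed one eligible month).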

Inclusion criteria for MedPAR mirror all criteria of the MBSF extraction. In addition, there are the following inclusion criteria:

  1. Beneficiary’s primary ICD-9-CM code is one of the ICD-9-CM codes for a selected health indicator.
  2. A Group Health Organization has not paid the provider for the claim(s).
  3. The hospital stay is classified as a short stay.
  4. Beneficiary’s age as of date of admission is 65 or older.

Data for counties with small populations are not displayed when a reliable rate could not be generated.

Mortality and Medicare Hospitalization Data

For spatially smoothed data, county-level rates were generated when the following criteria were met over a 3-year time period within each of the filters (e.g., age, race, and gender).

At least one of the following 3 criteria:

  • At least 20 events occurred within the county and its adjacent neighbors.
  • At least 16 events occurred within the county.
  • At least 5,000 population years within the county.

AND all 3 of the following criteria:

  • At least 6 population years for each age group used for age adjustment if that age group had 1 or more events.
  • The number of population years in an age group was greater than the number of events.
  • At least 100 population years within the county.

For unsmoothed data, county-level rates were generated when the following criteria were met over a 3-year time period with respect to the applied filters (e.g., age, race, and gender):

  • At least 100 population years within the county.
  • At least 20 events occurred within the county.
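The criteria for smoothed and unsmoothed rates can be expressed as two predicates. The sketch below is illustrative only: the function names and input layout are hypothetical, all counts are assumed to be totals over the 3-year period for one filter combination, and the rule that population years must exceed events is applied to every age group as written (the text does not say whether empty age groups are exempt).

```python
def smoothed_rate_is_reportable(county_events, neighborhood_events,
                                county_pop_years, age_groups):
    """Check the criteria for spatially smoothed data.

    neighborhood_events: events in the county plus its adjacent neighbors.
    age_groups: list of (events, population_years) pairs for the county,
        one pair per age group used for age adjustment.
    """
    at_least_one = (
        neighborhood_events >= 20
        or county_events >= 16
        or county_pop_years >= 5000
    )
    all_three = (
        all(pop >= 6 for ev, pop in age_groups if ev >= 1)
        # Assumption: applied to every age group, including empty ones.
        and all(pop > ev for ev, pop in age_groups)
        and county_pop_years >= 100
    )
    return at_least_one and all_three


def unsmoothed_rate_is_reportable(county_events, county_pop_years):
    """Check the criteria for unsmoothed data."""
    return county_pop_years >= 100 and county_events >= 20


# Hypothetical county: few events of its own, but enough with neighbors.
groups = [(1, 400), (2, 800)]
print(smoothed_rate_is_reportable(3, 25, 1200, groups))
print(unsmoothed_rate_is_reportable(3, 1200))
```

In this hypothetical case, the county qualifies for a smoothed rate because of events in its neighborhood, but not for an unsmoothed rate.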

Medicare Hospitalization Discharge Data

Suppression criteria for Medicare hospitalization discharge data were based on the relative standard error (RSE): RSE = 100 * sqrt(p * (1 - p) / N) / p, where p is the national proportion of discharges to each destination and N is the population associated with the county.

Counties with an RSE >= 30% were suppressed.

For spatially smoothed data, N was the sum of the population in the county and within its adjacent neighbors.

For unsmoothed data, N was the population in the county only.
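A minimal sketch of the RSE calculation, with N chosen according to whether the data are smoothed. The function names and the example populations are hypothetical.

```python
import math


def relative_standard_error(p, n):
    """RSE (in percent) of a proportion p estimated on a population of n."""
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    if n <= 0:
        raise ValueError("n must be positive")
    return 100 * math.sqrt(p * (1 - p) / n) / p


def population_for_rse(county_population, neighbor_populations, smoothed):
    """N is the county alone (unsmoothed) or county plus neighbors (smoothed)."""
    if smoothed:
        return county_population + sum(neighbor_populations)
    return county_population


# Hypothetical county of 60 people with two adjacent neighbors.
p = 0.5  # hypothetical national proportion for one discharge destination
n_unsmoothed = population_for_rse(60, [20, 20], smoothed=False)
n_smoothed = population_for_rse(60, [20, 20], smoothed=True)
print(relative_standard_error(p, n_unsmoothed))
print(relative_standard_error(p, n_smoothed))
```

Because N is larger for smoothed data, the smoothed RSE for a given county is never larger than its unsmoothed RSE.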

We limited the analyses to Medicare fee-for-service beneficiaries who had both Medicare Part A and Part B coverage and were aged 65 years or older. We excluded beneficiaries enrolled in Medicare Advantage, in Medicare Part A only, or in Part B only. Maps are based on beneficiary-years, or the number of years people were covered by Medicare fee-for-service health insurance.

Data for counties with small populations are not displayed. County-level health care costs were suppressed when counties had <20 beneficiaries.

All cost data have been directly age-standardized using the 2000 US Standard Population.

To make Medicare payments across geographic areas comparable, payments have been standardized to remove geographic differences in payment rates for individual services, such as those that account for local wages or input prices.

Costs per capita are the average costs incurred by a Medicare beneficiary with diagnosed heart disease in a given county. Incremental costs are calculated by taking the difference between mean annual costs per capita for beneficiaries with and without diagnosed heart disease for each county.
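The incremental cost calculation can be illustrated with a short sketch. The function names and dollar amounts are hypothetical, and the sketch omits the age standardization and payment standardization steps described above.

```python
from statistics import mean


def cost_per_capita(annual_costs):
    """Mean annual cost across a group of beneficiaries in one county."""
    return mean(annual_costs)


def incremental_cost(costs_with_hd, costs_without_hd):
    """Difference in mean annual cost per capita between beneficiaries
    with and without diagnosed heart disease in the same county."""
    return cost_per_capita(costs_with_hd) - cost_per_capita(costs_without_hd)


# Hypothetical annual costs (in dollars) for one county.
with_hd = [12000, 18000]
without_hd = [6000, 8000, 10000]
print(cost_per_capita(with_hd))
print(incremental_cost(with_hd, without_hd))
```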

Learn more about the Health Care Costs data.

  • This tool is available when a map of a single state is being viewed.
  • The Hot Spot Analysis tool calculates the Getis-Ord Gi* statistic for each county displayed on the map. The resultant Z score tells you where counties with either high or low values cluster spatially. This tool works by looking at each county within the context of neighboring counties. A county with a high value is interesting, but may not be a statistically significant hot spot. To be a statistically significant hot spot, a county will have a high value and be surrounded by other counties with high values as well. The local sum for a county and its neighbors is compared proportionally with the sum of all counties. When the local sum is much different than the expected local sum, and that difference is too large to be the result of random chance, a statistically significant Z score results.
  • To determine if the Z score is statistically significant, it is compared with the range of values for a particular confidence level. For example, at a 95% confidence level, a Z score would have to be less than –1.96 or greater than 1.96 to be statistically significant.
  • The higher the confidence level, the stronger the association. For statistically significant positive Z scores, the larger the Z score is, the more intense the clustering of high values (hot spot). For statistically significant negative Z scores, the smaller the Z score is, the more intense the clustering of low values (cold spot). A “Not Significant” value for a county means that the calculated Z score is near zero: the county and its neighbors have a range of values, with no apparent concentration of high or low values.
  • Within the Interactive Atlas of Heart Disease and Stroke, the Getis-Ord Gi* is calculated based on a distance that ensures that all counties have at least one neighbor. When running the Hot Spot Analysis for counties within a state, the counties within 100 miles in surrounding states are included in the calculation of clustering, though only those counties within the selected state are shown in the final results.
  • Note: Results are not reliable with fewer than 30 features. If your map includes fewer than 30 counties, you should expand the view to include additional areas.
  • Learn more.
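To make the comparison of local and expected sums concrete, the sketch below computes the standard Getis-Ord Gi* Z score from a list of county values and a weights matrix. It is an illustration under stated assumptions, not the tool's implementation: the function names are hypothetical, binary weights are assumed (1 for the county itself and each neighbor within the chosen distance, 0 otherwise), and the five-county example is far below the 30-feature minimum noted above.

```python
import math


def getis_ord_gi_star(values, weights):
    """Return the Gi* Z score for each feature.

    values: list of n county values.
    weights: n x n matrix; weights[i][j] is the spatial weight between
        counties i and j, with weights[i][i] included (the "star").
    """
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - mean ** 2)
    if s == 0:
        raise ValueError("all values are identical; Gi* is undefined")
    z_scores = []
    for i in range(n):
        w = weights[i]
        sum_w = sum(w)
        sum_w2 = sum(wij * wij for wij in w)
        local_sum = sum(wij * x for wij, x in zip(w, values))
        expected = mean * sum_w  # expected local sum under no clustering
        spread = s * math.sqrt((n * sum_w2 - sum_w ** 2) / (n - 1))
        z_scores.append((local_sum - expected) / spread if spread else float("nan"))
    return z_scores


def classify(z, critical=1.96):
    """Label a Z score at the 95% confidence level."""
    if z > critical:
        return "hot spot"
    if z < -critical:
        return "cold spot"
    return "not significant"


# Five hypothetical counties in a row; each neighbors the next one over.
values = [10, 9, 1, 1, 1]
weights = [
    [1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1],
]
z = getis_ord_gi_star(values, weights)
labels = [classify(score) for score in z]
```

In this toy example the first county has a high value and a high-valued neighbor, so its Z score exceeds 1.96, while the last county's Z score is negative but not extreme enough to be labeled a cold spot.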

Quantiles divide the data into categories that each contain approximately the same number of items. For most of the maps on this website, data are displayed using quintiles (five equally sized categories) to help show the spatial distribution of the data on the maps.

The cutpoints that define the quantile categories are shown in the legend for each map.
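As a simple illustration of how cutpoints define equally sized categories, the sketch below derives quintile cutpoints from a list of county values and assigns each value to a category. The function names are hypothetical, and the cutpoint convention used here (the largest value in each category) is one of several common choices; the convention used for the maps is not stated here.

```python
import bisect
import math


def quantile_cutpoints(values, k=5):
    """Return k - 1 cutpoints; each is the largest value in its category."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[math.ceil(c * n / k) - 1] for c in range(1, k)]


def assign_category(value, cutpoints):
    """Return the category index (0 = lowest) for a value."""
    return bisect.bisect_left(cutpoints, value)


# Ten hypothetical county rates.
rates = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
cuts = quantile_cutpoints(rates)
categories = [assign_category(r, cuts) for r in rates]
```

With ten values and five categories, each category receives two values; ties or a count not divisible by five make the categories only approximately equal in size.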

The distribution of the data values for each map can be displayed using the histogram tool.

Exceptions: Income inequality is displayed using quintiles calculated at the national level; state-specific maps do not display quintiles. Some health services data (e.g., hospitals) are displayed as the number per county.

All county-level heart disease and stroke rates and percentages can be displayed as either “spatially smoothed” or not. Spatial smoothing is a technique used to make estimates more statistically reliable and to generate stable estimates for counties with small populations. County-level data were spatially smoothed using a Local Empirical Bayes algorithm to stabilize risk by borrowing information from neighboring geographic areas.

Reference: Marshall RJ. Mapping disease and mortality rates using Empirical Bayes estimators. Journal of the Royal Statistical Society. Series C (Applied Statistics). 1991;40(2):283-294.
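The idea of borrowing information from neighboring areas can be sketched with a method-of-moments local Empirical Bayes estimator in the style of the reference above. This is an illustration under assumptions, not the exact algorithm used for the maps: the function name, the neighbor structure, and the counts are hypothetical.

```python
def local_eb_rate(i, events, population, neighbors):
    """Shrink county i's raw rate toward the rate of its neighborhood.

    events, population: per-county counts, indexed by county.
    neighbors: dict mapping each county index to its adjacent counties.
    """
    hood = [i] + [j for j in neighbors[i] if j != i]
    total_pop = sum(population[j] for j in hood)
    local_mean = sum(events[j] for j in hood) / total_pop
    # Population-weighted variance of raw rates around the local mean.
    variance = sum(
        population[j] * (events[j] / population[j] - local_mean) ** 2
        for j in hood
    ) / total_pop
    mean_pop = total_pop / len(hood)
    # Estimated between-county variance, truncated at zero.
    between = max(variance - local_mean / mean_pop, 0.0)
    raw = events[i] / population[i]
    denom = between + local_mean / population[i]
    shrink = between / denom if denom > 0 else 0.0
    return local_mean + shrink * (raw - local_mean)


# Hypothetical small county (index 0) next to two large counties.
events = [2, 50, 60]
population = [100, 10000, 10000]
neighbors = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
smoothed = local_eb_rate(0, events, population, neighbors)
```

In this example the small county's raw rate (2 per 100) is pulled most of the way toward the neighborhood rate, because a rate based on 100 people carries little information on its own; a county with a large population would be shrunk far less.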