Data Questions

Correct. We are not producing estimates for individuals, only for census tracts and cities. There is one estimate per measure for the entire population of each census tract and of each city. The modeling process uses individual-level responses and includes county- and state-level contextual effects (fixed and random) to estimate the probability of an individual developing an outcome, given that individual's age, race/ethnicity, sex, education, and county-level poverty. We then apply these probabilities to the target population (e.g., city or census tract) to derive the estimated prevalence. So, the Project uses a combination of individual characteristics and responses, as well as county and state context.
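The step of applying individual-level probabilities to a target population can be illustrated with a minimal sketch. It assumes the target area has been split into demographic cells, each with a population count and a model-predicted probability; the `Cell` structure, the function name, and all numbers are hypothetical and are not taken from the Project's actual code or data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """One demographic cell of a target area (hypothetical structure)."""
    population: int      # residents in this age/sex/race-ethnicity/... cell
    probability: float   # model-predicted probability of the outcome


def estimated_prevalence(cells):
    """Return the population-weighted average of the predicted probabilities."""
    total = sum(c.population for c in cells)
    if total == 0:
        raise ValueError("target area has no population")
    expected_cases = sum(c.population * c.probability for c in cells)
    return expected_cases / total


# Made-up cells for a single census tract, for illustration only.
tract = [
    Cell(population=400, probability=0.10),
    Cell(population=350, probability=0.20),
    Cell(population=250, probability=0.40),
]

print(f"{estimated_prevalence(tract):.3f}")  # prints 0.210
```

The result is a single number for the whole tract, which matches the point that no estimate is produced for any individual.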

There is one estimate per measure for the entire population of each census tract and of each city.

It is a priority for CDC to ensure that the data are made available in a manner that facilitates local use; therefore, we have provided “GIS friendly” data files for the city-level estimates and for the census tract-level estimates. In addition, we have provided the GIS boundary files (also known as shapefiles) that enable users of geographic information systems to create their own maps. The files can be accessed at the Chronic Data Portal.

For the static maps that CDC made available, we based the data classifications on the entire dataset and chose to represent nine classes of data, which helps ensure that geographic variation is perceptible in the map of any particular city. For some cities where the range of estimated prevalence across census tracts is narrow, the resulting map may have only two or three classes. For other cities with a greater range in tract-level estimates, more classes are mapped. Had we chosen fewer classes overall, an individual city map could have ended up with only one data class. The data class breaks are the same across all maps, which facilitates map-to-map comparisons.
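A small sketch may help show why one shared set of class breaks yields few classes in some cities and many in others. The break values and tract estimates below are invented for illustration, and the classification method used to derive the real breaks is not stated here, so this is only an assumed setup with eight break points defining nine classes.

```python
from bisect import bisect_right

# Hypothetical shared breaks (percent prevalence): 8 break points -> 9 classes.
SHARED_BREAKS = [5.0, 10.0, 15.0, 20.0, 25.0, 30.0, 35.0, 40.0]


def classify(value, breaks):
    """Return the 0-based class index for value, given ascending breaks."""
    return bisect_right(breaks, value)


def classes_used(values, breaks):
    """Return the sorted list of distinct classes that appear in one city map."""
    return sorted({classify(v, breaks) for v in values})


# Invented tract-level estimates for two cities.
narrow_city = [11.2, 12.8, 14.1, 16.0]
wide_city = [4.0, 9.5, 18.3, 27.7, 41.2]

print(classes_used(narrow_city, SHARED_BREAKS))  # [2, 3]
print(classes_used(wide_city, SHARED_BREAKS))    # [0, 1, 3, 5, 8]
```

Because both cities are classified against the same breaks, class 3 means the same prevalence range on either map.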

Confidence intervals are presented alongside the data estimates.

Details on the validation are available in the 2015 AJE paper and in the 2017 PCD paper that is cited on the 500 Cities website. Sensitivity and specificity analyses were not applicable to this type of modeling procedure and thus were not conducted.


There are no individual-level data. The data estimates are aggregated to the census tract and the city levels.

The small area estimates (SAEs) for each city depend mainly on the demographic characteristics of that city, but they are also affected by the county- and state-level context that was included as random effects in the modeling procedure.

We cannot include census tract as a random effect. However, we do not assume that there is no variation across census tracts. In the prediction step, we incorporate tract-level poverty; in addition, differences in the population demographics of the blocks that make up the census tracts are also considered in the prediction step.

County-level poverty was used in the first step of the modeling procedure because the county is the smallest geographic level that corresponds to the geocode available for BRFSS survey respondents. In the prediction step of the modeling process, we do use census-tract poverty estimates.
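The two-step use of poverty can be sketched as follows. This is a simplified illustration that assumes a logistic form with a single poverty term and one county-level effect; the coefficients, the function name, and the tract poverty rates are hypothetical and do not come from the fitted model.

```python
import math


def predict_probability(intercept, poverty_coef, poverty_rate, area_effect=0.0):
    """Inverse-logit of a linear predictor with one poverty term (assumed form)."""
    linear = intercept + poverty_coef * poverty_rate + area_effect
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical values standing in for a model fitted with COUNTY-level poverty.
INTERCEPT = -2.0
POVERTY_COEF = 3.0
COUNTY_EFFECT = 0.1   # shared by every tract in the same county

# In the prediction step, TRACT-level poverty is substituted for county poverty.
tract_poverty = {"tract_A": 0.05, "tract_B": 0.35}

predictions = {
    name: predict_probability(INTERCEPT, POVERTY_COEF, rate, COUNTY_EFFECT)
    for name, rate in tract_poverty.items()
}

for name, prob in predictions.items():
    print(name, round(prob, 3))
```

Even though both tracts share the same county effect, their predicted probabilities differ because their tract-level poverty differs, which is one way variation across tracts enters the estimates.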

We hope that individual cities, which are more familiar with local definitions and conceptualizations of neighborhoods, may make use of these data in their own public health and outreach efforts. It would be technically possible, for instance, for a city to download its data from 500 Cities and incorporate the data into a local website, perhaps even with GIS-enabled maps that include overlays of the boundaries of local neighborhoods as defined by the community. The CDC 500 Cities interactive mapping application includes a feature that allows users to change the basemap (e.g., aerial imagery, street maps) and to increase the data layer transparency so that they can “see” areas of interest within the city. This provides an improved spatial context for understanding where perceived geographic variations in a measure are taking place.

The preventive measures and core unhealthy behaviors were selected based on the following factors:

  • Are amenable to public health intervention.
  • Reflect public health priorities to address leading causes of morbidity and mortality.
  • For use of preventive services, are consistent with US Preventive Services Task Force recommendations.
  • Exhibit substantial, meaningful variation at the city and census-tract levels.
  • Can be estimated for small areas from existing, regularly collected surveillance data (BRFSS).
  • Fill a niche for health data at the city and census-tract levels that are not presently available, without duplicating health-related data that are available elsewhere.
  • Complement similar state-level measures that are available elsewhere.