Data Questions

It sounds like the data will only allow for ecological-level analyses and individual-level indicators will not be available. Is that accurate?

Correct. We are not producing estimates for individuals, only for census tracts and cities. There is one estimate per measure for the entire population of each census tract and of each city. The modeling process uses individual-level responses and includes county- and state-level contextual effects (fixed and random) to estimate the probability of an outcome at the individual level, given the respondent's age, race/ethnicity, sex, education, and county-level poverty. We then apply these probabilities to the target population (e.g., a city or census tract) to derive the estimated prevalence. The Project therefore uses a combination of individual characteristics and responses, as well as county and state context.
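
The last step described above, applying predicted probabilities to a target population, can be sketched as a population-weighted average. The snippet below is only an illustration and not CDC's code: the function name, the demographic cells, and every number in it are hypothetical.

```python
# Hypothetical illustration: all counts and probabilities are made up,
# not CDC estimates.

def poststratified_prevalence(cells):
    """Return the population-weighted average of predicted probabilities.

    `cells` is a list of (population_count, predicted_probability) pairs,
    one pair per demographic group in the target area.
    """
    total_population = sum(pop for pop, _ in cells)
    expected_cases = sum(pop * prob for pop, prob in cells)
    return expected_cases / total_population

# One hypothetical census tract split into three demographic groups.
tract_cells = [
    (1200, 0.10),  # e.g., younger adults
    (800, 0.20),   # e.g., middle-aged adults
    (500, 0.30),   # e.g., older adults
]

prevalence = poststratified_prevalence(tract_cells)
print(f"Estimated tract prevalence: {prevalence:.1%}")
```

The result is a single number for the whole tract, which is why only one estimate per measure exists for each tract and city.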

Are the measures presented only as one map per measure, or will there be breakdowns by age group, gender, race, or poverty?

There is one estimate per measure for the entire population of each census tract and of each city.

Will the data also be available as GIS-ready, downloadable files for those who would like to use it with other data sources?

It is a priority for CDC to ensure that the data are made available in a manner that facilitates local use; therefore, we have provided “GIS friendly” data files for the city-level estimates and for the census tract-level estimates. In addition, we have provided the GIS boundary files (also known as shapefiles) that enable users of geographic information systems to create their own maps. The files can be accessed at the Chronic Data Portal.

Why are there so many data classes?

For the static maps that CDC made available, we based the data classifications on the entire dataset and chose to represent nine classes of data, which helps ensure that geographic variation is perceptible in the maps of any particular city. For some cities where the range of estimated prevalence at the census tract level is narrow, the resulting map may have only two or three classes. For other cities with a greater range of estimates at the tract level, more classes are mapped. Had we chosen fewer classes overall, an individual city map could have had only one data class. The data class breaks are the same across all maps, which facilitates map-to-map comparisons.
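
To see why shared class breaks produce few classes for some cities and many for others, consider the sketch below. The break values and tract estimates are hypothetical and chosen only for illustration; eight breaks are used so that the scheme has nine classes, as in the static maps.

```python
from bisect import bisect_right

# Hypothetical breaks (in percent) shared by every map.
# Eight breaks divide the value range into nine classes (indices 0 to 8).
SHARED_BREAKS = [5, 10, 15, 20, 25, 30, 35, 40]

def class_index(value, breaks=SHARED_BREAKS):
    """Return the index of the class that `value` falls into."""
    return bisect_right(breaks, value)

def classes_used(tract_values):
    """Return the sorted list of distinct classes that appear on one city map."""
    return sorted({class_index(v) for v in tract_values})

# Made-up tract-level estimates for two cities.
narrow_city = [11.2, 12.8, 14.1, 16.0]      # narrow range of estimates
wide_city = [4.0, 9.5, 18.3, 27.9, 41.2]    # wide range of estimates

print(classes_used(narrow_city))  # only two classes appear
print(classes_used(wide_city))    # five classes appear
```

Because both cities are classified against the same breaks, a given class index means the same prevalence range on either map.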

How will uncertainty or margins of error be represented in the data?

Confidence intervals are presented alongside the data estimates.

Will you share measures from the validity studies, for example sensitivity or specificity?

Details on the validation are available in the 2015 AJE paper and in the 2017 PCD paper that is cited on the 500 Cities website. Sensitivity and specificity analyses were not applicable to this type of modeling procedure and thus were not conducted.

Will the measures include confidence levels?

Confidence intervals are presented alongside the data estimates.

Will there be individual-level data with tract level geocodes available?

There are no individual-level data. The data estimates are aggregated to the census tract and the city levels.

How will the small area estimates (SAE) for a city be affected if the population characteristics of that city are very different from those of the rest of the state?

The SAE for each city depend mainly on the demographic characteristics of that city, but they are also affected by the county- and state-level context that was included as random effects in the modeling procedure.

BRFSS does not have census tract ID. Do you have to make an assumption that there is no variation across census tract?

We cannot include census tract as a random effect. However, we do not assume that there is no variation across census tracts. In the prediction step, we incorporate tract-level poverty; in addition, differences in the population demographics of the blocks that make up the census tracts are also considered in the prediction step.

Why did you use county-level poverty for the small area estimate? Is there no stable, census tract-level poverty data available?

County-level poverty was used in the first step of the modeling procedure because the county is the smallest geographic level that corresponds to the geocode available for the BRFSS survey respondent. In the prediction step of the modeling process, we do use census tract-level poverty estimates.
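
One way to picture the two steps is a model whose poverty coefficient is estimated with county-level poverty and then evaluated with tract-level poverty. The sketch below assumes a simple logistic form and reuses a single poverty coefficient across both steps; both are simplifying assumptions for illustration, and the coefficients, tract names, and poverty rates are hypothetical rather than taken from the Project's models.

```python
import math

# Hypothetical coefficients, standing in for a model that was fitted
# using county-level poverty (the first step).
INTERCEPT = -2.0
BETA_AGE_65_PLUS = 0.8
BETA_POVERTY = 0.03  # change in log-odds per percentage point of poverty

def predicted_probability(age_65_plus, poverty_pct):
    """Logistic prediction for one demographic group at a given poverty rate."""
    logit = (INTERCEPT
             + BETA_AGE_65_PLUS * age_65_plus
             + BETA_POVERTY * poverty_pct)
    return 1.0 / (1.0 + math.exp(-logit))

# Prediction step: plug in tract-level poverty instead of the county value.
tract_poverty = {"tract_A": 5.0, "tract_B": 35.0}  # made-up poverty rates (%)

predictions = {
    tract: predicted_probability(age_65_plus=1, poverty_pct=poverty)
    for tract, poverty in tract_poverty.items()
}

for tract, prob in predictions.items():
    print(f"{tract}: predicted probability {prob:.3f}")
```

Two tracts in the same county can therefore receive different predictions for the same demographic group, which is how tract-level variation enters the estimates.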

Can you communicate results at a neighborhood level for those unfamiliar with their census tract?

We hope that individual cities, which are more familiar with local definitions and conceptualizations of neighborhoods, will make use of these data in their own public health and outreach efforts. It would be technically possible, for instance, for a city to download its data from 500 Cities and incorporate the data into a local website, perhaps even with GIS-enabled maps that include overlays of the boundaries of local neighborhoods as defined by the community. The CDC 500 Cities interactive mapping application includes a feature that allows users to change the basemap (e.g., aerial imagery, street maps) and to increase the data layer transparency so that they can “see” areas of interest within the city. This provides improved spatial context for understanding where geographic variations in a measure occur.

Can you clarify how "preventive measures" listed on the CDC website are connected to goals of RWJF and the 500 Cities Project?

The preventive measures and core unhealthy behaviors were selected based on the following factors:

  • Amenable to public health intervention.
  • Reflect public health priorities to address leading causes of morbidity and mortality.
  • Reflect use of preventive services consistent with US Preventive Services Task Force recommendations.
  • Exhibit substantial, meaningful variation at the city and census-tract level.
  • Can be estimated for small area levels from existing, regularly collected surveillance data (BRFSS).
  • Fill a niche for health data at the city and census-tract levels, which are not presently available, without duplicating health-related data that are available elsewhere.
  • Complement similar state-level measures that are available elsewhere.