Using the Data
There are a number of methods for small area estimation (SAE). The multilevel regression modeling with poststratification framework, which was used in the PLACES and 500 Cities Projects, is one methodology that communities might consider when generating their own small area estimates. Additional information on the methodology is available on this website. Some communities have already generated their own direct survey estimates or small area estimates, and they are encouraged to use their local estimates as their primary data. However, the estimates from the PLACES Project may give those communities additional insights into the health issues affecting their residents.
The SAE code used in the PLACES Project was developed specifically for the project outcomes, using the entire BRFSS dataset for all states and Washington, DC, and includes variables in the model for state and local levels. Use of the code by Washington, DC and other communities may or may not be appropriate without some modification.
In addition, use of the SAE code assumes that the end user has access to geocoded survey data (for the PLACES Project, geocoded to the county). Restricted BRFSS data, which include substate geographic identifiers (county), are available through the Research Data Centers (RDCs) by way of a formal data hosting agreement, on a case-by-case basis, for research purposes. Learn more about the proposal process.
Unfortunately, CDC does not currently have the capacity or resources to respond to all individual requests for technical assistance on the modeling process, modifying the code, or running special data analyses. Requests for such assistance will have to be handled on a case-by-case basis and will depend on existing resources and workload.
We cannot include policy or program intervention effects, which would occur locally, in the modeling process. Therefore, the estimates for local areas are the statistically expected prevalence of the risk factor, health outcome, or preventive service use based on the associations that we observe through the overall model. It is possible that a community may have a program intervention that has a substantial effect, such that the resulting prevalence of a health risk factor (for example) is lower, or preventive service use is higher, than what is statistically predicted by our model. In that case, if a community relies solely on the small area estimates, the effect of that local intervention would be underestimated. Thus, without reliable local information about public health programs, model-based local estimates should not be used to evaluate the effect of local public health programs, policies, or interventions. We suggest that communities use the model-based estimates as a baseline and conduct their own surveys to evaluate the effect of their interventions.
The data can be used to:
- Identify the health issues facing a local area or neighborhood.
- Establish key health objectives.
- Develop and implement effective and targeted prevention activities.
Because these are modeled and not direct estimates, and because the PLACES Project does not provide a weighted composite score for the included counties, places, census tracts, or ZIP Code Tabulation Areas (ZCTAs), the data should not be used to rank the overall health of a local area. However, counties, places, census tracts, or ZCTAs can be compared on individual measures.
The current modeling procedure does not support using the estimates to track changes at the local level over time.
Estimates depend on two main components: 1) the survey responses in the given survey year; and 2) the detailed population distribution within the local area. Because we use the 2010 US Census as the poststratification dataset for estimates at the place, census tract, and ZCTA levels, we cannot incorporate year-to-year population change in the modeled results. The assumption for any given point-in-time estimate is therefore that the place, census tract, or ZCTA population in that year is the same as it was measured in 2010. For county-level estimates, we do use census annual population estimates as the poststratification dataset, but time is not included in the model as a variable; therefore, the data cannot be used to assess trends for counties.
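The role of the population distribution can be illustrated with a minimal sketch of the poststratification step. This is not the PLACES code: the function name, the age groups, the predicted probabilities, and the population counts below are all hypothetical and chosen only to show the arithmetic. The same model-predicted probabilities yield a different local estimate when the population counts change, which is why an estimate tied to fixed 2010 counts cannot reflect later population shifts.

```python
# Illustrative sketch only. All names and numbers are hypothetical,
# not values or code from the PLACES Project.

def poststratify(cell_probs, cell_counts):
    """Return the population-weighted average of predicted probabilities.

    cell_probs:  predicted probability of the outcome for each demographic cell
    cell_counts: number of residents in each demographic cell
    """
    if cell_probs.keys() != cell_counts.keys():
        raise ValueError("probabilities and counts must cover the same cells")
    total = sum(cell_counts.values())
    if total <= 0:
        raise ValueError("total population must be positive")
    weighted = sum(cell_probs[cell] * cell_counts[cell] for cell in cell_probs)
    return weighted / total


# Hypothetical model predictions for three age groups in one census tract.
cell_probs = {"18-44": 0.10, "45-64": 0.20, "65+": 0.30}

# Hypothetical counts as measured in the census year, and after the
# tract's population has aged.
counts_census_year = {"18-44": 500, "45-64": 300, "65+": 200}
counts_later_year = {"18-44": 400, "45-64": 300, "65+": 300}

estimate_fixed = poststratify(cell_probs, counts_census_year)
estimate_updated = poststratify(cell_probs, counts_later_year)

print(f"Estimate using census-year counts: {estimate_fixed:.3f}")
print(f"Estimate using later-year counts:  {estimate_updated:.3f}")
```

In this toy example the two estimates differ even though the predicted probabilities are identical, so a change between years could come from the population data rather than from health behavior.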
The modeling process uses individual-level responses, including age, race/ethnicity, sex, and education, along with county-level poverty and county- and state-level contextual effects (random effects), to estimate the probability of developing an outcome. Because these variables are built into the estimates, their associations with the outcome are expected; in a secondary analysis, we therefore recommend adjusting for them in the model. See Amanda Y. Kong and Xingyou Zhang, “The Use of Small Area Estimates in Place-Based Health Research,” American Journal of Public Health 2020;110(6):829–832, for a more detailed discussion.
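As a rough sketch of how such a model turns its inputs into a probability, the example below assumes a logistic link, which is a common choice for multilevel models of binary outcomes but is an assumption here rather than a statement about the PLACES code. The function name and all coefficient values are hypothetical.

```python
import math

# Illustrative sketch only. The logistic link, the function name, and every
# coefficient below are assumptions, not values from the PLACES model.

def predicted_probability(fixed_effect_terms, county_effect, state_effect):
    """Combine fixed-effect terms and random effects on the log-odds scale,
    then convert the result to a probability with the inverse logit."""
    log_odds = sum(fixed_effect_terms) + county_effect + state_effect
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical log-odds contributions for one respondent profile:
# intercept, age group, race/ethnicity, sex, education, county poverty.
terms = [-2.0, 0.6, 0.1, -0.2, 0.3, 0.25]

# The same profile in two hypothetical counties of the same state.
p_county_a = predicted_probability(terms, county_effect=-0.3, state_effect=0.1)
p_county_b = predicted_probability(terms, county_effect=0.3, state_effect=0.1)

print(f"County A: {p_county_a:.3f}")
print(f"County B: {p_county_b:.3f}")
```

Because the demographic and county-level terms enter the prediction directly, any secondary analysis that relates the estimates to those same variables will recover associations that the model itself put there.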