BRFSS Maps: Methods and Frequently Asked Questions (FAQs)

BRFSS Maps is an interactive mapping application that graphically displays the prevalence of behavioral risk factors at the state and MMSA level. Using GIS (geographic information systems) mapping technology and BRFSS data, it allows users to visually compare prevalence data for states, territories, and local areas. Features include multiple data classification methods, map panning and zooming, related prevalence tables, downloadable map images, and the capability to download the BRFSS data in a GIS shapefile data format for more detailed analysis.

The information for this site is obtained from the Behavioral Risk Factor Surveillance System (BRFSS) and its Selected Metropolitan/Micropolitan Area Risk Trends (SMART). Metropolitan/micropolitan statistical area (MMSA) populations were obtained from the U.S. Census Bureau – 2000 Census. For more information, see the BRFSS FAQs and SMART BRFSS FAQs. To access the data and corresponding documentation, see the BRFSS Annual Survey Data or the SMART: BRFSS City and County Data and Documentation.

Yes, you can download the BRFSS data in a GIS shapefile data format. To download the entire data file for a given year, visit the GIS Data and Documentation page. From the Download GIS Data and Documentation page, you may select a data year. Also see the BRFSS suggested citation.

MMSAs with at least 500 completed interviews in the BRFSS data were selected for inclusion in this project. The MMSAs included in the project met certain weighting criteria for a given year. Some MMSAs, especially micropolitan areas, may not be able to attain a large enough sample size to be included every year.

Contact your state health department using the information available on the BRFSS state coordinators page.

Some MMSAs are geographically large, comprising many cities and counties. The circle representing an MMSA on the map is placed at the MMSA’s geographic center, or centroid, occasionally placing the circle outside the actual city for which the MMSA is named.

For example, the circle representing the Washington-Arlington-Alexandria, DC-VA-MD-WV MMSA is not located within the District of Columbia on the map, but in northern Virginia, because that is where the centroid of this statistical area falls. Additionally, because the Washington-Arlington-Alexandria, DC-VA-MD-WV MMSA encompasses a different and larger area than the actual boundaries of the District itself, the prevalence data for the MMSA and for the District itself will differ.

Yes. By default, the original map that is displayed after selecting a question/answer combination includes only the 50 states and the District of Columbia (DC). The data classification for this map is based only on the data estimates for those 50 states and Washington, DC. When the Outlying Territories are selected for display, the map refreshes and the data for the three territories are included in the dataset. The data are then classified and displayed based on all 54 data estimates (the 50 states, Washington, DC, and the three territories). Conversely, when the Outlying Territories are deselected for display, the data classification reverts to the 51-estimate dataset (the 50 states and Washington, DC).

The Information and Print/Save Map windows are pop-ups. Pop-ups must be enabled in your browser, or your pop-up-blocking software must be disabled.

The maps were created by merging BRFSS data, in database format, with geographic boundary files, called shapefiles. In this manner, the statistical data in the BRFSS database are spatially referenced with their associated administrative boundaries (e.g., states and MMSAs), which allows the data to be displayed on a map. Users can specify the number of data classes into which the data are categorized, as well as the statistical method of determining the class break values (e.g., equal-interval, quantiles, natural breaks, and standard deviations).
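As a rough illustration of the merge described above, the sketch below attaches prevalence estimates to boundary records through a shared key. It is not the application's actual code: the record layout, the use of state FIPS codes as the join key, the function name, and all prevalence values are assumptions made for the example, and the geometries are placeholders rather than real shapefile contents.

```python
# Hypothetical inputs: boundary records keyed by state FIPS code, and
# prevalence rows as they might come out of a database query.
boundaries = {
    "06": {"name": "California", "geometry": "<polygon placeholder>"},
    "13": {"name": "Georgia", "geometry": "<polygon placeholder>"},
}
prevalence_rows = [
    {"fips": "06", "prevalence": 24.1},  # made-up estimates
    {"fips": "13", "prevalence": 29.3},
    {"fips": "99", "prevalence": 18.0},  # no matching boundary
]


def join_to_boundaries(boundaries, rows, key="fips"):
    """Attach each row's estimate to the boundary record with the same key."""
    joined = {}
    for row in rows:
        shape = boundaries.get(row[key])
        if shape is None:
            continue  # skip rows that cannot be placed on the map
        joined[row[key]] = {**shape, "prevalence": row["prevalence"]}
    return joined


mapped = join_to_boundaries(boundaries, prevalence_rows)
```

Only records that have both an estimate and a boundary end up in the result, which is what makes each estimate "spatially referenced."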

Several different map projections were used to present the information in BRFSS Maps.

Maps of the continental United States, Alaska, and Hawaii were projected to the Albers Equal-Area (Continental United States) projection.
Puerto Rico was projected to the Albers Equal-Area (North America) projection.
Guam was projected to the World Mercator projection.

In these maps, Alaska, Hawaii, Puerto Rico, and Guam are not drawn at the same geographic scale as one another or as the continental United States.

BRFSS Maps provides several data classification methods so that users can choose the one they consider most appropriate. There is no single best data classification method; each classification method has advantages and disadvantages. When creating a map, the map user should consider the purpose of the map, the data distribution (if known), and the knowledge level (i.e., mapping and statistical awareness) of the intended audience. The following are brief descriptions of the four data classification methods available to users of the SMART and BRFSS data used in the BRFSS Maps application.

Equal-interval: In equal-interval classifications, the data ranges for all classes are the same. In other words, the range of the entire dataset is divided by the desired number of data classes, such that each class occupies an equal interval along the range of data values. The major advantage of the equal-interval classification is that the resulting equal intervals may be easy for many map users to interpret. The major disadvantage of the equal-interval classification is that the data distribution is not considered when determining class breaks for the intervals (only the lower and upper data values are used).
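A minimal sketch of the equal-interval calculation described above, assuming the class breaks are expressed as the upper value of each class. The function name and the sample prevalence values are made up for illustration.

```python
def equal_interval_breaks(values, n_classes):
    """Return the upper value of each class for an equal-interval scheme."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_classes
    # Only the minimum and maximum are used; the distribution is ignored.
    # The last break is pinned to the maximum to avoid floating-point drift.
    return [lo + width * i for i in range(1, n_classes)] + [hi]


prevalence = [15.0, 18.2, 21.5, 24.9, 30.1, 35.0]  # made-up values
breaks = equal_interval_breaks(prevalence, 4)
# breaks == [20.0, 25.0, 30.0, 35.0]
```

Note that the four values between the minimum and the maximum have no influence on the result, which is the disadvantage named above.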

Quantiles: In quantile classifications, an equal number of observations is placed in each class. For example, if there are 50 observations, 10 observations would be placed in each class of a five-class (quintile) quantile map. The data are first rank-ordered, and then the appropriate observations are assigned to each class (class 1, class 2, class 3, etc.). The number of classes also determines the specific type of quantile map (three classes = tertile; four classes = quartile; five classes = quintile). Two major advantages of the quantile classification are that it is useful for ordinal data (because the data are rank-ordered) and that it can help facilitate map comparisons (as long as the same number of classes is used for all maps). The major disadvantage of the quantile classification is that it does not consider how the data are distributed. Therefore, if the data have a highly skewed distribution (e.g., many outliers), this classification will force data observations into the same class (either the lowest or highest, in this case) where this may not be appropriate; as a result, the quantile classification may give a false impression that there is a relatively normal data distribution.
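The rank-and-assign procedure can be sketched as follows. This is an illustrative sketch rather than the application's implementation: the function name and the generated observations are assumptions, and ties are simply left in sorted order, which may differ from how the application handles them.

```python
from collections import Counter


def quantile_classes(values, n_classes):
    """Assign each value a 0-based class index with equal counts per class."""
    # Rank-order the observations, remembering their original positions.
    order = sorted(range(len(values)), key=lambda i: values[i])
    classes = [0] * len(values)
    for rank, idx in enumerate(order):
        classes[idx] = rank * n_classes // len(values)
    return classes


# 50 distinct made-up observations in scrambled order.
observations = [10.0 + (i * 37) % 50 for i in range(50)]
classes = quantile_classes(observations, 5)
counts = Counter(classes)  # 10 observations in each of the 5 classes
```

Because only the rank matters, an extreme outlier lands in the same class as its nearest neighbors in rank, no matter how far away its value is.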

Standard Deviations: In standard deviations classifications, the data are assigned to classes based on where they fall relative to the mean and standard deviations of the data distribution. The major advantage of this classification method is that by using the mean as a dividing point, a contrast of values above and below the mean is readily seen. This method only works well for a dataset that is normally distributed. An even number of classes should be used, such that the mean of the data serves as the dividing point between an even number of classes above and below the mean. The major disadvantage of the standard deviations classification is that it requires a basic understanding of statistical concepts, and hence may be difficult for some map users to interpret.
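One way to place breaks around the mean is sketched below. The one-standard-deviation class width and the use of the population standard deviation are assumptions made for this example; the application may use different choices. The function name and sample values are likewise illustrative.

```python
import statistics


def std_dev_breaks(values, n_classes):
    """Return the interior class breaks, spaced one SD apart around the mean."""
    if n_classes % 2:
        raise ValueError("use an even number of classes so the mean is a break")
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # population SD (an assumption)
    half = n_classes // 2
    # For 4 classes this yields mean - 1 SD, the mean, and mean + 1 SD.
    return [mean + k * sd for k in range(-(half - 1), half)]


values = [10.0, 20.0, 30.0, 40.0, 50.0]  # made-up values
breaks = std_dev_breaks(values, 4)
```

With an even number of classes, the middle break is always the mean, so half of the classes lie below it and half above.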

Natural Breaks: In this classification method (also variously known as Optimal Breaks and Jenks’ Method), the data are assigned to classes based upon their position along the data distribution relative to all other data values. This classification uses an iterative algorithm to optimally assign data to classes such that the variances within all classes are minimized, while the variances among classes are maximized. In this manner, the data distribution is explicitly considered for determining class breaks; this is the major advantage of the Natural Breaks classification method. The major disadvantage is that the concept behind the classification may not be easily understood by all map users, and the legend values for the class breaks (e.g., the data ranges) may not be intuitive.
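To make the objective concrete, the sketch below finds the class breaks that minimize the total within-class squared deviation. It uses an exact dynamic-programming search rather than an iterative algorithm, so it should be read as an illustration of the same goal and not as the method the application uses. The function name and sample values are assumptions.

```python
def natural_breaks(values, n_classes):
    """Return class upper values minimizing within-class squared deviation."""
    data = sorted(values)
    n = len(data)
    if not 1 <= n_classes <= n:
        raise ValueError("need between 1 and len(values) classes")
    # Prefix sums give the squared deviation of any slice in constant time.
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for i, v in enumerate(data):
        s1[i + 1] = s1[i] + v
        s2[i + 1] = s2[i] + v * v

    def ssd(i, j):
        """Sum of squared deviations from the mean for data[i:j]."""
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # cost[k][j]: best total cost of splitting data[:j] into k classes.
    cost = [[inf] * (n + 1) for _ in range(n_classes + 1)]
    split = [[0] * (n + 1) for _ in range(n_classes + 1)]
    cost[0][0] = 0.0
    for k in range(1, n_classes + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                c = cost[k - 1][i] + ssd(i, j)
                if c < cost[k][j]:
                    cost[k][j] = c
                    split[k][j] = i
    # Walk back through the split table to recover each class's upper value.
    breaks, j = [], n
    for k in range(n_classes, 0, -1):
        breaks.append(data[j - 1])
        j = split[k][j]
    return breaks[::-1]


values = [10.1, 10.4, 10.9, 20.2, 20.8, 35.5, 36.0, 36.3]  # made-up values
breaks = natural_breaks(values, 3)
# breaks == [10.9, 20.8, 36.3]
```

The breaks fall in the gaps between the three clusters of values, which is also why the resulting legend ranges can look irregular.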

The values that are entered in the text boxes correspond to the upper value of each data class. For example, assume that the lowest data value for a particular dataset is “15.0.” If you enter “25.0” in the first text box and “35.0” in the second text box, then the data ranges for the two lowest data classes will be “15.0 to 25.0” and “25.1 to 35.0.” The number in the rightmost text box is automatically entered and cannot be manually changed; this value corresponds to the maximum value for the dataset, which by default corresponds to the upper value of the last data class.
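The relationship between the entered upper values and the resulting data ranges can be sketched as follows. The function name is hypothetical, the 0.1 step assumes values are displayed to one decimal place as in the example above, and the dataset maximum of 42.3 is made up.

```python
def custom_class_ranges(minimum, upper_values, step=0.1):
    """Build 'lower to upper' labels from the upper value of each class."""
    ranges = []
    lower = minimum
    for upper in upper_values:
        ranges.append(f"{lower:.1f} to {upper:.1f}")
        lower = upper + step  # the next class starts just above this one
    return ranges


# Lowest data value 15.0; user enters 25.0 and 35.0; 42.3 stands in for
# the automatically entered dataset maximum (a made-up figure).
ranges = custom_class_ranges(15.0, [25.0, 35.0, 42.3])
# ranges == ["15.0 to 25.0", "25.1 to 35.0", "35.1 to 42.3"]
```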

Class break values are determined for the states’ data and are concurrently applied to the states’ and MMSAs’ data. Therefore, when displayed simultaneously, the class breaks are the same for both states and MMSAs. The exception to this rule is for the last class (e.g., the fifth class in a five-class map): the upper values for each dataset are usually different; therefore, the data range for the last class varies between the states and the MMSAs. However, the lower data value for that class is identical for states and MMSAs. The opposite applies for the first (lowest) data class. In this case, the lower data values usually differ between states and MMSAs; however, the class break point is identical for the upper range of the first class.

To use one legend for both states and MMSAs, the values for the first class are indicated as, for example, “≤ 23.0” (the upper value of the lowest class). By inference, the lowest values for the states and MMSAs are lower than this stated value, but are not specifically depicted in the legend. Similarly, the legend label for the last class is, for example, “≥ 55.2” (the lower value of the last class). By inference, the highest values for the states and MMSAs are higher than this stated value, but are not specifically depicted in the legend.

The reason that class breaks for both states and MMSAs are determined based upon the dataset for the states is that in the majority of question and answer combinations, the range of the MMSAs’ prevalence estimates is greater than that for the states. If the class breaks were determined for the dataset with the greater range (i.e., MMSAs), it is conceivable that when these class breaks are applied to the states’ data, there may be classes to which no states are assigned (usually the first and the last classes). In order to avoid this issue, the breaks are determined for the states’ data, and then applied to the MMSAs. This usually results in a map in which MMSAs and states are assigned to and depicted for all data classes.
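The shared legend can be sketched as follows: the breaks come from the states’ data, the first and last labels are left open-ended, and the same breaks classify both states and MMSAs. The function names, the 0.1 step (one displayed decimal place), and every value other than the “≤ 23.0” and “≥ 55.2” labels are assumptions made for this example.

```python
from bisect import bisect_left


def shared_legend(upper_values, step=0.1):
    """Build legend labels from the upper value of every class but the last."""
    labels = [f"≤ {upper_values[0]:.1f}"]  # open-ended first class
    for lo, hi in zip(upper_values, upper_values[1:]):
        labels.append(f"{lo + step:.1f} to {hi:.1f}")
    labels.append(f"≥ {upper_values[-1] + step:.1f}")  # open-ended last class
    return labels


def classify(value, upper_values):
    """Return the 0-based class index of a value under the shared breaks."""
    return bisect_left(upper_values, value)


# Hypothetical breaks derived from the states' data for a five-class map.
state_breaks = [23.0, 30.0, 40.0, 55.1]
legend = shared_legend(state_breaks)
# legend == ["≤ 23.0", "23.1 to 30.0", "30.1 to 40.0", "40.1 to 55.1", "≥ 55.2"]

# MMSA estimates outside the states' range still fall in the end classes.
low_mmsa_class = classify(12.4, state_breaks)   # first class, index 0
high_mmsa_class = classify(61.0, state_breaks)  # last class, index 4
```

Because the end classes are open-ended, an MMSA estimate below the lowest state value or above the highest state value still receives a class and a color.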

Color schemes were chosen based upon the number of data classes, the types of data being mapped (e.g., sequential or diverging data), consideration of the display devices to be used for the resulting maps (e.g., computer CRT monitor, computer LCD monitor, LCD projector, and print copy), and the need to avoid colors that cannot be differentiated by individuals with impaired color vision (e.g., red-green color combinations).

The two color schemes for the BRFSS Maps were selected by consulting ColorBrewer, an online tool for selecting color schemes.

The color scheme chosen for natural breaks, quantile, and equal interval maps is the Sequential Oranges scheme. This scheme works well with ordinal, interval, and ratio data, such as the prevalence data in the BRFSS. The color scheme chosen for the standard deviation maps is the Diverging Purple-Orange scheme. This scheme emphasizes the natural midpoint of a diverging dataset (e.g., the mean) and the diverging values from the mean (e.g., positive and negative standard deviations). The color schemes are automatically selected based upon the user’s statistical classification method selection for categorizing the data.

The following Web sites feature reference maps with background data. You can use these resources to compare other sociodemographic data to health statistics.

NationalAtlas.gov includes maps at the county, state, and other geographic levels for a variety of data including income, crime rates, cancer mortality, election results, race/ethnicity, population density, etc.

American FactFinder shows 2000 Census (race, household size, Hispanic ethnicity, age, sex, household and family structure, income, education, commuting, ancestry, etc.) and other Census Bureau data sources via both reference and thematic maps. Maps can be created at the block, census tract, county, MSA, city, ZIP code equivalent, state, and other geographic levels.

What is the BRFSS Maps application?

What are the data sources for this site?

Can I download the GIS data files?

How were the MMSAs selected? Why are different MMSAs available for different years?

Whom do I contact for more information about my state?

The circle representing a certain MMSA falls outside the geographic boundary of its namesake city. Why is this?

Does selecting/deselecting the display of “Outlying Territories” affect how the data are classified for display in the map?

When I click the Information icon or the Print/Save Map link, nothing happens. What can I do to address this?

Describe the methodology used to create the maps.

How were the maps projected?

How should one choose a data classification method?

How do I enter the break points for the “Custom” class breaks method?

Because there are two datasets being simultaneously mapped (i.e., states and MMSAs), how do I interpret the legend, and to which dataset are the class breaks assigned?

How were the color schemes chosen for the maps?

Where can I find additional GIS resources?