Lesson 4: Displaying Public Health Data

Answers to Self-Assessment Quiz

  1. B, C, D. Tables and graphs are important tools for summarizing, analyzing, and presenting data. While data are occasionally collected using a table (for example, counting observations by putting tick marks into particular cells in a table), this is not a common epidemiologic practice.
  2. A, B, C, D. A table in a printed publication should be self-explanatory. If a table is taken out of its original context, it should still convey all the information necessary for the reader to understand the data. Therefore, a table should include, in addition to the data, a proper title, row and column labels, source of the data, and footnotes that explain abbreviations, symbols, and exclusions, if any. Tables generally present the data, while the accompanying text of the report may contain an explanation of key findings.
  3. B (False). Rounding that results in totals of 99.9% or 100.1% is common in tables that show percentages. Nonetheless, the total percentage should be displayed as 100.0%, and a footnote explaining that the difference is due to rounding should be included.
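The rounding effect described in answer 3 can be reproduced with a short sketch. The counts below are hypothetical and chosen only so that the rounded percentages fail to sum to exactly 100.0%; the function name is likewise an illustration, not part of the lesson.

```python
def rounded_percentages(counts, ndigits=1):
    """Convert category counts to percentages rounded to `ndigits` decimals."""
    total = sum(counts)
    return [round(100 * c / total, ndigits) for c in counts]


# Hypothetical counts: three categories of equal size.
counts = [1, 1, 1]
pcts = rounded_percentages(counts)

# Sum of the displayed (already rounded) percentages, rounded again only
# to suppress floating-point noise.
displayed_sum = round(sum(pcts), 1)

print(pcts)           # each category rounds to 33.3
print(displayed_sum)  # the rounded values add up to 99.9, not 100.0
```

In a table built from these values, the total row would still be shown as 100.0%, with a footnote noting that the column does not add up exactly because of rounding.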
  4. C. In the two-by-two table presented in Question 4, the total number of cases is shown as the total of the left column (labeled “Cases”). That column total number is 25.
  5. D. A table shell is the skeleton of a table, complete with titles and labels, but without the data. It is created when designing the analysis phase of an investigation. Table shells help guide what data to collect and how to analyze the data.
  6. B. Creation of table shells should be part of the overall study plan or protocol. Creation of table shells requires the investigator to decide how to analyze the data, which dictates what questions should be asked on the questionnaire.
  7. A, B, C, D, E. All of the methods listed in Question 7 are appropriate and commonly used by epidemiologists.
  8. B (False). The number of observations with missing values is important when interpreting the data, particularly for making generalizations.
  9. B (False). The limits of the class intervals must not overlap. For example, would a 70-year-old be counted in the 65–70 category or in the 70–75 category?
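One way to see why overlapping limits are a problem is to assign ages to intervals in code. The sketch below assumes half-open intervals (each interval includes its lower limit and excludes the next interval's lower limit), so every age falls in exactly one category. The age groups and the function name are hypothetical.

```python
from bisect import bisect_right


def assign_class_interval(value, lower_limits):
    """Return the index of the interval containing `value`.

    Intervals are defined by their sorted lower limits, and each interval
    runs from its lower limit up to (but not including) the next one.
    """
    idx = bisect_right(lower_limits, value) - 1
    if idx < 0:
        raise ValueError(f"{value} is below the lowest class limit")
    return idx


# Hypothetical age groups with non-overlapping limits.
lower_limits = [0, 5, 15, 25, 45, 65, 70, 75]
labels = ["0-4", "5-14", "15-24", "25-44", "45-64", "65-69", "70-74", "75+"]

label_for_69 = labels[assign_class_interval(69, lower_limits)]
label_for_70 = labels[assign_class_interval(70, lower_limits)]

print(label_for_69)  # 65-69
print(label_for_70)  # 70-74
```

With labels such as 65-69 and 70-74, a 70-year-old has exactly one possible category, which is the property the answer asks for.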
  10. A (True). In general, before you create a graph, you should observe the data in a table. By reviewing the data in the table, you can anticipate the range of values that must be covered by the axes of a graph. You can also get a sense of the patterns in the data, so you can anticipate what the graph should look like.
  11. B, C. On an arithmetic-scale line graph, the axes and tick marks should be clearly labeled. For both the x- and y-axis, a particular distance anywhere along the axis should represent the same increase in quantity, although the x- and y-axis usually differ in what is measured. The y-axis, measuring frequency, should begin at zero. But the x-axis, which often measures time, need not start at zero.
    1. B. One of the key advantages of a semilogarithmic-scale line graph is that it can display a wide range of values clearly.
    2. A. A starting value of, say, 100,000 and a constant rate of change of, say, 10%, would result in observations of 100,000, 110,000, 121,000, 133,100, 146,410, 161,051, etc. The resulting plotted line on an arithmetic-scale line graph would curve upwards. The resulting plotted line on a semilogarithmic-scale line graph would be a straight line.
    3. B. Values of 0.1, 1, 10, and 100 represent orders of magnitude typical of the y-axis of a semilogarithmic-scale line graph.
    4. C. Both arithmetic-scale and semilogarithmic-scale line graphs can be used to plot numbers or rates.
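The constant-rate example in the answers above can be checked numerically. This sketch uses the same starting value (100,000) and rate (10%) as the text; the function name is an illustration. It compares the step sizes on an arithmetic scale with the step sizes after taking base-10 logarithms, which is what a semilogarithmic y-axis plots.

```python
import math


def constant_rate_series(start, rate, n):
    """Return n observations that grow by a constant proportion each step."""
    return [start * (1 + rate) ** k for k in range(n)]


values = constant_rate_series(100_000, 0.10, 6)
rounded_values = [round(v) for v in values]

# Differences between consecutive points on an arithmetic scale.
arithmetic_steps = [b - a for a, b in zip(values, values[1:])]

# Differences between consecutive points on a log10 scale.
log_steps = [math.log10(b) - math.log10(a) for a, b in zip(values, values[1:])]

print(rounded_values)    # 100000, 110000, 121000, 133100, 146410, 161051
print(arithmetic_steps)  # steps grow each period, so the line curves upward
print(log_steps)         # steps are all about log10(1.1), so the line is straight
```

Equal steps on the log scale mean a constant slope, which is why a constant rate of change appears as a straight line on a semilogarithmic-scale graph.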
    1. B. A bar chart is used to graph the frequency of events of a categorical variable such as sex or geographic region.
    2. C. The columns of either a histogram or a bar chart can be shaded to distinguish subgroups. Note that a bar chart with shaded subgroups is called a stacked bar chart.
    3. A. A histogram is used to graph the frequency of events of a continuous variable such as time.
    4. A. An epidemic curve is a particular type of histogram in which the number of cases (on the y-axis) that occur during an outbreak or epidemic is graphed over time (on the x-axis).
  12. C. A typical population pyramid usually displays the youngest age group at the bottom and the oldest age group at the top, with males on one side and females on the other side. A young population would therefore have a wide bar at the bottom with gradually narrowing bars above.
  13. A, B. A frequency polygon differs from a line graph in that a frequency polygon represents a frequency distribution, with the area under the curve proportionate to the frequency. Because the total area must represent 100%, the ends of the frequency polygon must be closed. Although a line graph is commonly used to display frequencies over time, a frequency polygon can display the frequency distribution of a given period of time as well. Similarly, the y-axis of both types of graph can measure percentages.
    1. C. The y-axis of both cumulative frequency curves and survival curves typically displays percentages from 0% at the bottom to 100% at the top. The main difference is that a cumulative frequency curve begins at 0% and increases, whereas a survival curve begins at 100% and decreases.
    2. B. Because a survival curve begins at 100%, the plotted curve begins at the top of the y-axis and at the beginning time interval (sometimes referred to as time-zero) of the x-axis, i.e., in the upper left corner.
    3. A. Because a cumulative frequency curve begins at 0%, the plotted curve begins at the base of the y-axis and at the beginning time interval (sometimes referred to as time-zero) of the x-axis, i.e., in the lower left corner.
    4. C. Because the y-axis represents proportions, a horizontal line drawn from the 50% tick mark to the plotted curve will indicate 50% survival or 50% cumulative frequency. The median is another name for the 50% mark of a distribution of data.
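The relationship between the two curves, and the way the 50% mark locates the median, can be sketched as follows. The event counts are hypothetical, the function names are illustrations, and the sketch assumes complete follow-up, so survival is simply 100% minus the cumulative percentage of events.

```python
def cumulative_percent(events_per_interval):
    """Cumulative percentage of events, starting from 0% at time-zero."""
    total = sum(events_per_interval)
    curve = [0.0]
    running = 0
    for count in events_per_interval:
        running += count
        curve.append(100 * running / total)
    return curve


def first_index_where(curve, condition):
    """Index of the first point on the curve that satisfies `condition`."""
    for i, pct in enumerate(curve):
        if condition(pct):
            return i
    return None


# Hypothetical number of events (e.g., deaths) in five consecutive intervals.
events = [2, 3, 5, 6, 4]

cumulative = cumulative_percent(events)      # rises from 0% to 100%
survival = [100 - p for p in cumulative]     # falls from 100% to 0%

# The median is where either curve crosses the 50% line.
median_from_cumulative = first_index_where(cumulative, lambda p: p >= 50)
median_from_survival = first_index_where(survival, lambda p: p <= 50)

print(cumulative)  # [0.0, 10.0, 25.0, 50.0, 80.0, 100.0]
print(survival)    # [100.0, 90.0, 75.0, 50.0, 20.0, 0.0]
print(median_from_cumulative, median_from_survival)  # 3 3
```

Both curves cross 50% at the same point on the x-axis, which is why the median can be read from either one.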
  14. A, C. A scatter diagram graphs simultaneous data points of two continuous variables for individuals or communities. Drug levels, infant mortality, and mean annual income are all examples of continuous variables. Eye color, at least as presented in the question, is a categorical variable.
  15. D. A frequency distribution, one-variable table, pie chart, and simple bar chart are all used to display the frequency of categories of a single variable. A scatter diagram requires two variables.
  16. B. A scatter diagram graphs simultaneous data points of two continuous variables for individuals or communities, whereas a dot plot graphs data points of a continuous variable according to categories of a second, categorical variable.
  17. B (False). The spots on a spot map usually reflect one or more cases, i.e., numbers. The shading on an area map may represent numbers, proportions, rates, or other measures.
  18. B (False). Shading should be consistent with frequency. So rather than using different colors of the same intensity, increasing shades of the same color or family of colors should be used.
  19. B (False). The primary purpose of any visual is to communicate information clearly. 3-D columns, bars, and pies may have pizzazz, but they rarely help communicate information, and sometimes they mislead.
  20. B (False). The difference between a stacked bar chart and a 100% component bar chart is that the bars of a 100% component bar chart are all pulled to the top of the y-axis (100%). The units on the x-axis are the same.
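The "pulled to the top" description can be illustrated by converting stacked-bar counts into 100% component values. The regions, subgroups, and counts below are hypothetical, as is the function name; only the y-axis values change, while the categories on the x-axis stay the same.

```python
def to_percent_components(bars):
    """Rescale each bar's components so that every bar totals 100%."""
    result = {}
    for label, parts in bars.items():
        total = sum(parts.values())
        result[label] = {name: 100 * n / total for name, n in parts.items()}
    return result


# Hypothetical stacked-bar data: cases by region, subdivided by sex.
stacked = {
    "Region A": {"male": 30, "female": 20},
    "Region B": {"male": 10, "female": 30},
}

component = to_percent_components(stacked)

stacked_heights = {label: sum(parts.values()) for label, parts in stacked.items()}
component_heights = {label: sum(parts.values()) for label, parts in component.items()}

print(stacked_heights)    # bar heights differ: 50 and 40
print(component_heights)  # every bar reaches 100.0
print(component)
```

The stacked version shows how many cases each region had; the 100% component version hides that and shows only how each region's cases divide between the subgroups.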
  21. D. Any bar chart can be oriented vertically or horizontally. The creator of the chart can choose, and often does so on the basis of consistency with other graphs in a series, opinion about which orientation looks better or fits better, and whether the labels fit adequately below vertical bars or need to be placed beside horizontal bars.
    1. B, C. Both line graphs and histograms are commonly used to graph numbers of cases over time. Line graphs are commonly used to graph secular trends over longer time periods; histograms are often used to graph cases over a short period of observation, such as during an epidemic.
    2. A. A grouped bar chart (or a stacked bar chart) is ideal for graphing frequency over two categorical variables. A pie chart is used for a single variable.
    3. D. A pie chart (or a simple bar chart) is used for graphing the frequency of categories of a single categorical variable such as breed of dog.
    4. C. Rates over time are traditionally plotted by using a line graph.