
Lesson 2: Summarizing Data

Exercise Answers

Exercise 2.1

  1. C
  2. A
  3. D
  4. A
  5. D

Exercise 2.2

Previous Years    Frequency
0                 2
1                 5
2                 4
3                 3
4                 1
5                 1
6                 1
7                 0
8                 1
9                 0
10                0
11                0
12                1
Total             19

Exercise 2.3

  1. Create frequency distribution (done in Exercise 2.2, above)
  2. Identify the value that occurs most often.
    Most common value is 1, so mode is 1 previous vaccination.
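The frequency distribution and mode can also be checked with a short script. This is a minimal sketch, assuming the 19 observations are the values listed in Exercise 2.5; the variable names are illustrative only.

```python
from collections import Counter

# Number of previous vaccinations for the 19 observations (values from Exercise 2.5).
vaccinations = [2, 0, 3, 1, 0, 1, 2, 2, 4, 8, 1, 3, 3, 12, 1, 6, 2, 5, 1]

counts = Counter(vaccinations)

# Frequency for every value from 0 to the maximum, including values never observed.
frequency = {value: counts.get(value, 0) for value in range(max(vaccinations) + 1)}

# The mode is the value with the highest frequency.
mode, mode_count = counts.most_common(1)[0]

for value, freq in frequency.items():
    print(f"{value:>2}  {freq}")
print("Total", sum(frequency.values()))
print("Mode:", mode, "previous vaccination(s)")
```

Listing every value up to the maximum keeps the zero-frequency rows (7, 9, 10, 11) visible, as in the table for Exercise 2.2.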

Exercise 2.4

  1. Arrange the observations in increasing order.
    0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 4, 5, 6, 8, 12
  2. Find the middle position of the distribution with 19 observations.
    Middle position = (19 + 1) ⁄ 2 = 10
  3. Identify the value at the middle position.
    0, 0, 1, 1, 1, 1, 1, 2, 2, *2*, 2, 3, 3, 3, 4, 5, 6, 8, 12
    Counting from the left or right to the 10th position, the value is 2. So the median = 2 previous vaccinations.
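The three steps above can be sketched in code. This assumes an odd number of observations, so that (n + 1) ⁄ 2 is a whole-number position; the variable names are illustrative.

```python
# Number of previous vaccinations for the 19 observations (values from Exercise 2.5).
vaccinations = [2, 0, 3, 1, 0, 1, 2, 2, 4, 8, 1, 3, 3, 12, 1, 6, 2, 5, 1]

# Step 1: arrange the observations in increasing order.
ordered = sorted(vaccinations)

# Step 2: find the middle position (1-based). Assumes n is odd.
n = len(ordered)
middle_position = (n + 1) // 2

# Step 3: read the value at that position (Python lists are 0-based).
median = ordered[middle_position - 1]

print("Middle position:", middle_position)
print("Median:", median)
```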

Exercise 2.5

  1. Add all of the observed values in the distribution.
    2 + 0 + 3 + 1 + 0 + 1 + 2 + 2 + 4 + 8 + 1 + 3 + 3 + 12 + 1 + 6 + 2 + 5 + 1 = 57
  2. Divide the sum by the number of observations.
    57 ⁄ 19 = 3.0

    So the mean is 3.0 previous vaccinations.

Exercise 2.6

Using Method A:

  1. Take the log (in this case, to base 2) of each value.
    ID #    Convalescent    Log base 2
    1       1:512           9
    2       1:512           9
    3       1:128           7
    4       1:512           9
    5       1:1024          10
    6       1:1024          10
    7       1:2048          11
    8       1:128           7
    9       1:4096          12
    10      1:1024          10
  2. Calculate the mean of the log values by summing and dividing by the number of observations (10).
    Mean of log₂(xᵢ) = (9 + 9 + 7 + 9 + 10 + 10 + 11 + 7 + 12 + 10) ⁄ 10 = 94 ⁄ 10 = 9.4
  3. Take the antilog of the mean of the log values to get the geometric mean.
    Antilog₂(9.4) = 2^9.4 = 675.59. Therefore, the geometric mean dilution titer is 1:675.6.
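Method A can be reproduced in a few lines. This is a minimal sketch that works with the titer denominators (512 for 1:512, and so on); the variable names are illustrative.

```python
import math

# Convalescent titer denominators for IDs 1 through 10 (1:512 is stored as 512).
titers = [512, 512, 128, 512, 1024, 1024, 2048, 128, 4096, 1024]

# Step 1: take the log (base 2) of each value.
log_values = [math.log2(t) for t in titers]

# Step 2: calculate the mean of the log values.
mean_log = sum(log_values) / len(log_values)

# Step 3: take the antilog of the mean to get the geometric mean.
geometric_mean = 2 ** mean_log

print("Mean of log2 values:", mean_log)
print("Geometric mean titer: 1:%.1f" % geometric_mean)
```

Base 2 is convenient here because every titer is a power of 2, so each log is a whole number. Any other base would give the same geometric mean as long as the antilog uses the same base.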

Exercise 2.7

  1. E or A; equal number of patients in 1999 and 1998.
  2. C or B; mean and median are very close, so either would be acceptable.
  3. E or A; for a nominal variable, the most frequent category is the mode.
  4. D
  5. B; mean is skewed, so median is better choice.
  6. B; mean is skewed, so median is better choice.

Exercise 2.8

  1. Arrange the observations in increasing order.
    0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 4, 5, 6, 8, 12
  2. Find the position of the 1st and 3rd quartiles. Note that the distribution has 19 observations.
    Position of Q1 = (n + 1) ⁄ 4 = (19 + 1) ⁄ 4 = 5
    Position of Q3 = 3(n + 1) ⁄ 4 = 3(19 + 1) ⁄ 4 = 15
  3. Identify the value of the 1st and 3rd quartiles.
    Value at Q1 (position 5) = 1
    Value at Q3 (position 15) = 4
  4. Calculate the interquartile range as Q3 minus Q1.
    Interquartile range = 4 − 1 = 3
  5. The median (at position 10) is 2. Note that the distance between Q1 and the median is 2 − 1 = 1. The distance between Q3 and the median is 4 − 2 = 2. This indicates that the vaccination data is skewed slightly to the right (tail points to greater number of previous vaccinations).
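The quartile positions and interquartile range can be checked as follows. This sketch assumes (n + 1) is divisible by 4, as it is for 19 observations, so both quartile positions are whole numbers; it does not handle fractional positions.

```python
# Number of previous vaccinations for the 19 observations (values from Exercise 2.5).
vaccinations = [2, 0, 3, 1, 0, 1, 2, 2, 4, 8, 1, 3, 3, 12, 1, 6, 2, 5, 1]

ordered = sorted(vaccinations)
n = len(ordered)

# 1-based positions of the quartiles and the median.
# Assumes (n + 1) is a multiple of 4, so no interpolation is needed.
q1_position = (n + 1) // 4
median_position = (n + 1) // 2
q3_position = 3 * (n + 1) // 4

q1 = ordered[q1_position - 1]
median = ordered[median_position - 1]
q3 = ordered[q3_position - 1]

interquartile_range = q3 - q1

print("Q1:", q1, "Median:", median, "Q3:", q3)
print("Interquartile range:", interquartile_range)
# A larger Q3-to-median distance than median-to-Q1 distance suggests right skew.
print("Median - Q1:", median - q1, "Q3 - Median:", q3 - median)
```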

Exercise 2.9

  1. Calculate the arithmetic mean.
    Mean = (2 + 0 + 3 + 1 + 0 + 1 + 2 + 2 + 4 + 8 + 1 + 3 + 3 + 12 + 1 + 6 + 2 + 5 + 1) ⁄ 19
    = 57 ⁄ 19
    = 3.0
  2. Subtract the mean from each observation. Square the difference.
  3. Sum the squared differences.
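    Value Minus Mean    Difference    Difference Squared
    2 − 3.0             −1.0          1.0
    0 − 3.0             −3.0          9.0
    3 − 3.0             0.0           0.0
    1 − 3.0             −2.0          4.0
    0 − 3.0             −3.0          9.0
    1 − 3.0             −2.0          4.0
    2 − 3.0             −1.0          1.0
    2 − 3.0             −1.0          1.0
    4 − 3.0             1.0           1.0
    8 − 3.0             5.0           25.0
    1 − 3.0             −2.0          4.0
    3 − 3.0             0.0           0.0
    3 − 3.0             0.0           0.0
    12 − 3.0            9.0           81.0
    1 − 3.0             −2.0          4.0
    6 − 3.0             3.0           9.0
    2 − 3.0             −1.0          1.0
    5 − 3.0             2.0           4.0
    1 − 3.0             −2.0          4.0
    57 − 57.0 = 0       0.0           162.0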
  4. Divide the sum of the squared differences by n − 1.
    Variance = 162 ⁄ (19 − 1) = 162 ⁄ 18 = 9.0 previous vaccinations squared
  5. Take the square root of the variance. This is the standard deviation.
    Standard deviation = √9.0 = 3.0 previous vaccinations
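The five steps above translate directly into code. This is a minimal sketch using the 19 observations from step 1; the variable names are illustrative.

```python
import math

# Number of previous vaccinations for the 19 observations.
vaccinations = [2, 0, 3, 1, 0, 1, 2, 2, 4, 8, 1, 3, 3, 12, 1, 6, 2, 5, 1]
n = len(vaccinations)

# Step 1: calculate the arithmetic mean.
mean = sum(vaccinations) / n

# Step 2: subtract the mean from each observation and square the difference.
squared_differences = [(x - mean) ** 2 for x in vaccinations]

# Step 3: sum the squared differences.
sum_of_squares = sum(squared_differences)

# Step 4: divide by n - 1 to get the sample variance.
variance = sum_of_squares / (n - 1)

# Step 5: take the square root to get the standard deviation.
standard_deviation = math.sqrt(variance)

print("Mean:", mean)
print("Sum of squared differences:", sum_of_squares)
print("Variance:", variance)
print("Standard deviation:", standard_deviation)
```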

Exercise 2.10

Standard error of the mean = 42 divided by the square root of 4,462 = 0.629
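The same calculation as a short sketch, taking 42 as the standard deviation and 4,462 as the number of observations, as in the answer above; the variable names are illustrative.

```python
import math

standard_deviation = 42
n = 4462

# Standard error of the mean = standard deviation / square root of n.
standard_error = standard_deviation / math.sqrt(n)

print(round(standard_error, 3))
```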

Exercise 2.11

  1. Summarize the blood level data with a frequency distribution.
    Table 2.14 Frequency Distribution (1 mcg/dL Intervals) of Blood Lead Levels — Rural Village, 1996 (Intervals with No Observations Not Shown)
    Blood Lead Level (mcg/dL)    Frequency
    17                           1
    26                           2
    35                           1
    38                           1
    39                           1
    44                           1
    45                           1
    46                           1
    49                           1
    50                           1
    54                           1
    56                           1
    57                           2
    58                           3
    61                           1
    63                           1
    64                           1
    67                           1
    68                           1
    69                           1
    72                           1
    73                           1
    74                           1
    76                           2
    78                           3
    79                           1
    84                           1
    86                           1
    103                          1
    104                          1
    Unknown                      48
    To summarize the data further you could use intervals of 5, 10, or perhaps even 20 mcg/dL. Table 2.15 below uses 10 mcg/dL intervals.
    Table 2.15 Frequency Distribution (10 mcg/dL Intervals) of Blood Lead Levels — Rural Village, 1996
    Blood Lead Level (mcg/dL)    Frequency
    0–9                          0
    10–19                        1
    20–29                        2
    30–39                        3
    40–49                        6
    50–59                        8
    60–69                        6
    70–79                        9
    80–89                        2
    90–99                        0
    100–110                      2
    Total                        39
  2. Calculate the arithmetic mean.
    Arithmetic mean = sum ⁄ n = 2,363 ⁄ 39 = 60.6 mcg/dL
  3. Identify the median and interquartile range.
    Median at (39 + 1) ⁄ 2 = 20th position. Median = value at 20th position = 58
    Q1 at (39 + 1) ⁄ 4 = 10th position. Q1 = value at 10th position = 48
    Q3 at 3 × Q1 position = 30th position. Q3 = value at 30th position = 76
  4. Calculate the standard deviation.
    Square of sum = 2,363² = 5,583,769
    Sum of squares × n = 157,743 × 39 = 6,151,977
    Difference = 6,151,977 − 5,583,769 = 568,208
    Variance = 568,208 ⁄ (39 × 38) = 383.4062
    Standard deviation = square root (383.4062) = 19.58 mcg/dL
  5. Calculate the geometric mean using the log lead levels provided.
    Geometric mean = 10^(68.45 ⁄ 39) = 10^(1.7551) = 56.9 mcg/dL
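The arithmetic in steps 2, 4, and 5 can be verified from the summary totals alone. This sketch takes the totals quoted in the answer (n = 39, sum = 2,363, sum of squares = 157,743, sum of log₁₀ values = 68.45) as given rather than recomputing them from the individual observations; the variable names are illustrative.

```python
import math

# Summary totals quoted in the answer above (not recomputed from raw data).
n = 39
total = 2363            # sum of the blood lead levels
sum_of_squares = 157743  # sum of the squared blood lead levels
sum_of_logs = 68.45      # sum of the log10 blood lead levels

# Step 2: arithmetic mean.
mean = total / n

# Step 4: variance by the computational formula
#   [n * (sum of squares) - (square of sum)] / [n * (n - 1)]
difference = n * sum_of_squares - total ** 2
variance = difference / (n * (n - 1))
standard_deviation = math.sqrt(variance)

# Step 5: geometric mean is the antilog (base 10) of the mean log value.
geometric_mean = 10 ** (sum_of_logs / n)

print("Mean:", round(mean, 1))
print("Difference:", difference)
print("Variance:", round(variance, 4))
print("Standard deviation:", round(standard_deviation, 2))
print("Geometric mean:", round(geometric_mean, 1))
```

The computational formula gives the same variance as summing squared deviations from the mean, but needs only the running totals, which is why the answer can work from the sum and sum of squares.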
