Lesson 2: Summarizing Data

Section 1: Organizing Data

Whether you are conducting routine surveillance, investigating an outbreak, or conducting a study, you must first compile information in an organized manner. One common method is to create a line list or line listing. Table 2.1 is a typical line listing from an epidemiologic investigation of an apparent cluster of hepatitis A.

A variable can be any characteristic that differs from person to person, such as height, sex, smallpox vaccination status, or physical activity pattern. The value of a variable is the number or descriptor that applies to a particular person, such as 5'6" (168 cm), female, and never vaccinated.

The line listing is one type of epidemiologic database, and is organized like a spreadsheet with rows and columns. Typically, each row is called a record or observation and represents one person or case of disease. Each column is called a variable and contains information about one characteristic of the individual, such as race or date of birth. The first column or variable of an epidemiologic database usually contains the person's name, initials, or identification number. Other columns might contain demographic information, clinical details, and exposures possibly related to illness.

Table 2.1 Line Listing of Hepatitis A Cases, County Health Department, January–February 2004

ID  Date of Diagnosis  Town  Age (Years)  Sex  Hosp  Jaundice  Outbreak  IV Drugs  IgM Pos  Highest ALT*
01  01/05              B     74           M    Y     N         N         N         Y        232
02  01/06              J     29           M    N     Y         N         Y         Y        285
03  01/08              K     37           M    Y     Y         N         N         Y        3250
04  01/19              J     3            F    N     N         N         N         Y        1100
05  01/30              C     39           M    N     Y         N         N         Y        4146
06  02/02              D     23           M    Y     Y         N         Y         Y        1271
07  02/03              F     19           M    Y     Y         N         N         Y        300
08  02/05              I     44           M    N     Y         N         N         Y        766
09  02/19              G     28           M    Y     N         N         Y         Y        23
10  02/22              E     29           F    N     Y         Y         N         Y        543
11  02/23              A     21           F    Y     Y         Y         N         Y        1897
12  02/24              H     43           M    N     Y         Y         N         Y        1220
13  02/26              B     49           F    N     N         N         N         Y        644
14  02/26              H     42           F    N     N         Y         N         Y        2581
15  02/27              E     59           F    Y     Y         Y         N         Y        2892
16  02/27              E     18           M    Y     N         Y         N         Y        814
17  02/27              A     19           M    N     Y         Y         N         Y        2812
18  02/28              E     63           F    Y     Y         Y         N         Y        4218
19  02/28              E     61           F    Y     Y         Y         N         Y        3410
20  02/29              A     40           M    N     Y         Y         N         Y        4297

* ALT = Alanine aminotransferase
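
The row-and-column layout of a line listing maps naturally onto a list of typed records. The following is a minimal Python sketch, not part of the original lesson; the names `CaseRecord` and `parse_row` are hypothetical, and the field names are informal renderings of the column headings in Table 2.1. It parses the first three rows of the table and shows that one row corresponds to one person, while one column corresponds to one variable across all persons.

```python
from dataclasses import dataclass


@dataclass
class CaseRecord:
    """One row (record) of the line listing; each field is one variable."""
    case_id: str
    date: str          # MM/DD as printed in the table's date column
    town: str
    age_years: int
    sex: str
    hospitalized: bool
    jaundice: bool
    outbreak: bool
    iv_drugs: bool
    igm_pos: bool
    highest_alt: int


def parse_row(line: str) -> CaseRecord:
    """Split one whitespace-delimited table row into a typed record."""
    f = line.split()
    # Columns 6-10 (Hosp .. IgM Pos) hold Y/N flags; convert them to booleans.
    flags = [value == "Y" for value in f[5:10]]
    return CaseRecord(f[0], f[1], f[2], int(f[3]), f[4], *flags, int(f[10]))


# First three rows of Table 2.1, copied verbatim.
rows = [
    "01 01/05 B 74 M Y N N N Y 232",
    "02 01/06 J 29 M N Y N Y Y 285",
    "03 01/08 K 37 M Y Y N N Y 3250",
]
records = [parse_row(r) for r in rows]

# A record is read across a row; a variable is read down a column.
ages = [r.age_years for r in records]
print(records[0])
print(ages)
```

Storing the Y/N columns as booleans and the numeric columns as integers is a design choice of this sketch; it makes later counting and summarizing simpler than working with raw strings.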


Some epidemiologic databases, such as line listings for a small cluster of disease, may have only a few rows (records) and a limited number of columns (variables). Such small line listings are sometimes maintained by hand on a single sheet of paper. Other databases, such as birth or death records for the entire country, might have thousands of records and hundreds of variables and are best handled with a computer. However, even when records are computerized, a line listing with key variables is often printed to facilitate review of the data.

Epi Info

One computer software package that is widely used by epidemiologists to manage data is Epi Info, a free package developed at CDC. Epi Info allows the user to design a questionnaire, enter data right into the questionnaire, edit the data, and analyze the data. Two versions are available:

Epi Info 3 (formerly Epi Info 2000 or Epi Info 2002) is Windows-based, and continues to be supported and upgraded. It is the recommended version and can be downloaded from the CDC website.

Epi Info 6 is DOS-based and widely used, but is being phased out.

This lesson includes Epi Info commands for creating frequency distributions and calculating some of the measures of central location and spread described in the lesson. Since Epi Info 3 is the recommended version, only commands for this version are provided in the text; corresponding commands for Epi Info 6 are offered at the end of the lesson.
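
For readers without Epi Info at hand, the same kinds of summaries can be approximated in a general-purpose language. The sketch below is an illustration in Python using only the standard library; it is not an Epi Info command and is not taken from the lesson. It builds a frequency distribution of the Sex column and computes a few measures of central location and spread for the Age (Years) column, using the 20 values copied from Table 2.1 in row order.

```python
from collections import Counter
from statistics import mean, median

# Sex and Age (Years) columns of Table 2.1, in row order (IDs 01 to 20).
sex = "M M M F M M M M M F F M F F F M M F F M".split()
ages = [74, 29, 37, 3, 39, 23, 19, 44, 28, 29,
        21, 43, 49, 42, 59, 18, 19, 63, 61, 40]

# Frequency distribution: how many records take each value of the variable.
sex_counts = Counter(sex)
for value, count in sorted(sex_counts.items()):
    percent = 100 * count / len(sex)
    print(f"{value}: {count} ({percent:.0f}%)")

# Measures of central location.
age_mean = mean(ages)
age_median = median(ages)

# A simple measure of spread: the range (maximum minus minimum).
age_range = max(ages) - min(ages)

print("mean age:", age_mean)
print("median age:", age_median)
print("age range:", age_range)
```

With only 20 records the same figures can be checked by hand, which is a useful way to confirm that the columns were transcribed correctly before analyzing a larger database.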