Lesson 2: Summarizing Data

Section 1: Organizing Data

Whether you are conducting routine surveillance, investigating an outbreak, or carrying out a study, you must first compile information in an organized manner. One common method is to create a line list, also called a line listing. Table 2.1 is a typical line listing from an epidemiologic investigation of an apparent cluster of hepatitis A.

A variable can be any characteristic that differs from person to person, such as height, sex, smallpox vaccination status, or physical activity pattern. The value of a variable is the number or descriptor that applies to a particular person, such as 5’6″ (168 cm), female, and never vaccinated.

The line listing is one type of epidemiologic database and is organized like a spreadsheet, with rows and columns. Typically, each row is called a record or observation and represents one person or case of disease. Each column is called a variable and contains information about one characteristic of the individual, such as race or date of birth. The first column or variable of an epidemiologic database usually contains the person’s name, initials, or identification number. Other columns might contain demographic information, clinical details, and exposures possibly related to illness.
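The row-and-column layout described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, the variable names (id, town, age, sex, hosp, jaundice), and the helper functions column and count_values are hypothetical and are not taken from Table 2.1.

```python
from collections import Counter

# Hypothetical line listing. Each dictionary is one record (row);
# each key is a variable (column). All values are invented for
# illustration and do not come from Table 2.1.
line_listing = [
    {"id": 1, "town": "A", "age": 34, "sex": "F", "hosp": "Y", "jaundice": "Y"},
    {"id": 2, "town": "B", "age": 27, "sex": "M", "hosp": "N", "jaundice": "Y"},
    {"id": 3, "town": "A", "age": 51, "sex": "M", "hosp": "N", "jaundice": "N"},
]


def column(records, variable):
    """Return the values of one variable (column) across all records."""
    return [record[variable] for record in records]


def count_values(records, variable):
    """Count how many records take each value of a variable."""
    return dict(Counter(column(records, variable)))


n_records = len(line_listing)          # number of rows (persons or cases)
variables = list(line_listing[0])      # column names; the identifier comes first
ages = column(line_listing, "age")     # one column read across all records
sex_counts = count_values(line_listing, "sex")
```

Storing each record as a dictionary keeps the value of every variable attached to its person, so reading a single column or counting its values never separates a value from the record it belongs to.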


Table 2.1 Line Listing of Hepatitis A Cases, County Health Department, January–February 2004

ID | Date of | Town | Age | Sex | Hosp | Jaundice | Outbreak | IV Drugs | IgM Pos