Lesson 2: Summarizing Data
Section 2: Types of Variables
Look again at the variables (columns) and values (individual entries in each column) in Table 2.1. If you were asked to summarize these data, how would you do it?
First, notice that for certain variables, the values are numeric; for others, the values are descriptive. The type of values influences the way in which the variables can be summarized. Variables can be classified into one of four types, depending on the type of scale used to characterize their values (Table 2.2).
Table 2.2 Types of Variables
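|Scale||Type||Example||Values|
|Nominal||"categorical" or "qualitative"|| ||yes / no|
|Ordinal||"categorical" or "qualitative"|| ||Stage I, II, III, or IV|
|Interval||"continuous" or "quantitative"||date of birth||any date from recorded time to current|
|Ratio||"continuous" or "quantitative"||tuberculin skin test||0 – ??? of induration|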
- A nominal-scale variable is one whose values are categories without any numerical ranking, such as county of residence. In epidemiology, nominal variables with only two categories are very common: alive or dead, ill or well, vaccinated or unvaccinated, or did or did not eat the potato salad. A nominal variable with two mutually exclusive categories is sometimes called a dichotomous variable.
- An ordinal-scale variable has values that can be ranked but are not necessarily evenly spaced, such as stage of cancer (see Table 2.3).
- An interval-scale variable is measured on a scale of equally spaced units, but without a true zero point, such as date of birth.
- A ratio-scale variable is an interval variable with a true zero point, such as height in centimeters or duration of illness.
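The four definitions above can be sketched as a small lookup table. This is only an illustrative sketch: the names `Scale`, `ScaleProperties`, `PROPERTIES`, `EXAMPLES`, and `is_dichotomous` are hypothetical and do not come from the lesson; the example variables are the ones named in the list.

```python
from dataclasses import dataclass
from enum import Enum


class Scale(Enum):
    NOMINAL = "nominal"
    ORDINAL = "ordinal"
    INTERVAL = "interval"
    RATIO = "ratio"


@dataclass(frozen=True)
class ScaleProperties:
    ranked: bool         # values can be put in a meaningful order
    equal_spacing: bool  # units are guaranteed to be equally spaced
    true_zero: bool      # zero means "none of the quantity"


# Properties read off the four definitions in the list above.
PROPERTIES = {
    Scale.NOMINAL: ScaleProperties(ranked=False, equal_spacing=False, true_zero=False),
    Scale.ORDINAL: ScaleProperties(ranked=True, equal_spacing=False, true_zero=False),
    Scale.INTERVAL: ScaleProperties(ranked=True, equal_spacing=True, true_zero=False),
    Scale.RATIO: ScaleProperties(ranked=True, equal_spacing=True, true_zero=True),
}

# Example variables named in the text, mapped to their scale.
EXAMPLES = {
    "county of residence": Scale.NOMINAL,
    "stage of cancer": Scale.ORDINAL,
    "date of birth": Scale.INTERVAL,
    "height in centimeters": Scale.RATIO,
    "duration of illness": Scale.RATIO,
}


def is_dichotomous(scale: Scale, categories: set) -> bool:
    """A nominal variable with exactly two categories is dichotomous."""
    return scale is Scale.NOMINAL and len(categories) == 2


if __name__ == "__main__":
    for name, scale in EXAMPLES.items():
        print(f"{name}: {scale.value}, {PROPERTIES[scale]}")
    print(is_dichotomous(Scale.NOMINAL, {"alive", "dead"}))
```

Reading the table row by row shows that each scale adds one property to the one before it: ranking, then equal spacing, then a true zero.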
Nominal- and ordinal-scale variables are considered qualitative or categorical variables, whereas interval- and ratio-scale variables are considered quantitative or continuous variables. Sometimes the same variable can be measured using both a nominal scale and a ratio scale. For example, the tuberculin skin tests of a group of persons potentially exposed to a co-worker with tuberculosis can be measured as "positive" or "negative" (nominal scale) or in millimeters of induration (ratio scale).
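As a rough illustration of the tuberculin example, the sketch below converts a ratio-scale reading (millimeters of induration) into a nominal "positive" or "negative" result. The function name `classify_skin_test`, the readings, and the cutoff value are all assumptions made for illustration; the lesson does not state a cutoff, so it is passed in as a required parameter rather than built in.

```python
def classify_skin_test(induration_mm: float, cutoff_mm: float) -> str:
    """Collapse a ratio-scale reading into a nominal (dichotomous) result.

    cutoff_mm is supplied by the caller; no particular value is implied
    by the lesson text.
    """
    if induration_mm < 0:
        raise ValueError("induration cannot be negative (ratio scale, true zero)")
    return "positive" if induration_mm >= cutoff_mm else "negative"


if __name__ == "__main__":
    # Made-up readings and a hypothetical cutoff, for illustration only.
    readings_mm = [0.0, 4.0, 12.0, 18.0]
    hypothetical_cutoff_mm = 10.0
    results = [classify_skin_test(r, hypothetical_cutoff_mm) for r in readings_mm]
    print(results)  # ['negative', 'negative', 'positive', 'positive']
```

The conversion only goes one way: the millimeter readings can always be reduced to positive or negative, but the original measurements cannot be recovered from the nominal result.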
Table 2.3 Example of Ordinal-Scale Variable: Stages of Breast Cancer*
|Stage||Tumor Size||Lymph Node Involvement||Metastasis (Spread)|
|I||Less than 2 cm||No||No|
|II||Between 2 and 5 cm||No, or on same side of breast||No|
|III||More than 5 cm||Yes, on same side of breast||No|
|IV||Not applicable||Not applicable||Yes|
* This table describes the stages of breast cancer. Note that each stage is more extensive than the previous one and generally carries a less favorable prognosis, but you cannot say that the difference between Stages I and III is the same as the difference between Stages II and IV.
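The point of the footnote, that stages can be ranked but their differences cannot be compared, can be sketched in code. The `Stage` class below is a hypothetical illustration, not part of the lesson: it supports ordering comparisons but deliberately defines no subtraction, so asking for a "difference" between stages fails.

```python
from functools import total_ordering


@total_ordering
class Stage:
    """An ordinal value: comparable and sortable, but with no arithmetic."""

    _ORDER = ("I", "II", "III", "IV")

    def __init__(self, label: str):
        if label not in self._ORDER:
            raise ValueError(f"unknown stage: {label!r}")
        self.label = label

    def _rank(self) -> int:
        # The rank is used only for ordering, never exposed as a quantity.
        return self._ORDER.index(self.label)

    def __eq__(self, other):
        if not isinstance(other, Stage):
            return NotImplemented
        return self.label == other.label

    def __lt__(self, other):
        if not isinstance(other, Stage):
            return NotImplemented
        return self._rank() < other._rank()

    def __hash__(self):
        return hash(self.label)

    def __repr__(self):
        return f"Stage({self.label!r})"


if __name__ == "__main__":
    stages = [Stage("III"), Stage("I"), Stage("IV")]
    print(sorted(stages))   # ranking is meaningful
    print(max(stages))      # so is "most advanced stage"
    try:
        Stage("III") - Stage("I")  # a "difference" is not defined
    except TypeError as exc:
        print("not meaningful:", exc)
```

Leaving out subtraction is a design choice that mirrors the ordinal scale: the order of the labels carries information, but the spacing between them does not.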
For each of the variables listed below from the line listing in Table 2.1, identify what type of variable it is.
- ____ Date of diagnosis
- ____ Town of residence
- ____ Age (years)
- ____ Sex
- ____ Highest alanine aminotransferase (ALT)