Centers for Disease Control and Prevention

National Center for Chronic Disease Prevention and Health Promotion
Healthy Youth
The Handbook for Evaluating HIV Education: Booklet 1

Evaluating HIV Education Programs

 
On this Page:
A data-gathering design for program-improvement evaluations
A data-gathering design for program-continuation evaluations
Sampling
Final thoughts about Guideline 3
   

In addition to selecting appropriate assessment instruments, you must obtain the needed permissions and enhance student anonymity. Be sure that you attend to all three of the elements in this critically important guideline.

Guideline 3: Use a data-gathering design consistent with the orientation of the evaluation.

Once you have identified the assessment instruments you will use in your evaluation study, you must next determine your data-gathering design. Putting it more simply, you must decide how and when to administer the assessment instruments.

To keep these guidelines simple, we will consider one data-gathering strategy for program-improvement evaluation studies and one for program-continuation studies. If you want to explore other options, you can find a wide array of choices in almost any textbook on research methods for the behavioral sciences.

A data-gathering design for program-improvement evaluations

Let's assume you are carrying out a program-improvement evaluation of a district-level HIV education program. The chief decision makers involved are the teachers and curriculum specialists who planned and implemented the program. You must secure evidence to help these decision makers make their program more effective. As an evaluator, you are not trying to prove that the HIV education program works. Rather, you intend to provide your colleagues with data-based insights to help them improve their program. Your choice of a data-gathering design, then, should be consistent with that orientation.

The recommended data-gathering design for program-improvement evaluations of HIV education programs, presented in Figure 1, is known as the one-group pretest-posttest design. As seen in Figure 1, this design involves a preprogram measurement and a postprogram measurement. If one of your instruments is an anonymous questionnaire regarding students' HIV-risk behaviors, for example, you would administer that questionnaire to students before and after the program. Differences between the pretest and the posttest data would be credited to the program's effects.

Measurement -> HIV Education Program (or a segment of the program) -> Measurement

Figure 1. A one-group pretest-posttest design
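The comparison that this design supports can be sketched in a few lines of Python. This is only an illustration: the scores and the function name `mean_change` are hypothetical, and because the questionnaires are assumed to be anonymous, the sketch compares group means rather than matching each student's pretest to his or her posttest.

```python
from statistics import mean

def mean_change(pretest_scores, posttest_scores):
    """Return the posttest group mean minus the pretest group mean."""
    return mean(posttest_scores) - mean(pretest_scores)

# Hypothetical knowledge-test scores, invented for illustration only.
pretest = [8, 10, 9, 11, 12]
posttest = [13, 15, 14, 15, 18]

change = mean_change(pretest, posttest)
print(f"Mean pretest-to-posttest change: {change:+.1f}")
```

A positive change is consistent with a program effect, but in a one-group design it cannot by itself rule out other explanations.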

The HIV education program, of course, is not the only possible reason for a change between students' pretest and posttest questionnaire responses. As students grow older, increased maturity may alter their approach to HIV-risk situations. Similarly, if they discover that one of their classmates is infected with HIV, the news may have a tremendous impact on their responses. These events, unrelated to the program, can pose interpretive problems for program-continuation evaluators, who must often prove a program's effectiveness to incredulous decision makers and must, therefore, use data-gathering designs that control for such factors.

The program-improvement evaluator, however, usually has no such constraints and often needs only to point out that extraneous factors may have influenced the results.

You will note in Figure 1 that the pretest and posttest measurements may be used not only with the HIV education program in its entirety, but also with segments of the program. Suppose an HIV education program devoted three class periods to promoting students' refusal skills in situations that might involve high-risk sexual activity. If the program's staff were eager to improve this segment of the program, you could gather presegment and postsegment evidence from students to see if the three-day treatment of refusal skills led to increases in students' measured ability to apply those skills. If the presegment-to-postsegment gains were as large as the staff hoped, the segment would not need modifying. On the other hand, if the presegment-to-postsegment gains were too small or nonexistent, alterations would be in order.

Here is a more detailed illustration. You are assigned to evaluate a school district's HIV education program for improvement purposes. Although the program has been in place for several years, the district's school board has asked administrators to ensure that the program is as effective as possible. Your job is to help teachers identify the parts of the program in need of revision.

You meet with the district's HIV education teachers and agree on five assessment instruments consistent with the program's stated objectives. The five instruments are: (1) an HIV knowledge test, (2) a test of students' refusal skills, (3) an attitudinal inventory assessing students' perceptions of their vulnerability to HIV infection, (4) an attitudinal inventory reflecting students' belief that they can take actions to reduce their likelihood of HIV infection, and (5) a questionnaire regarding the extent to which students engage in HIV-risk behaviors.

The district's HIV education program consists of fifteen hours of HIV-specific instruction during a required tenth-grade health education class. You administer the five assessment instruments before and after the classes and discover that students display substantial progress on the knowledge and skill instruments but almost no change on the behavioral questionnaire, your most important instrument, or on the two attitudinal inventories. Such results would place you in a position to suggest that program alterations are warranted. Because the promotion of students' skill and knowledge appears to be successful, you might suggest that parts of the program be modified to better address the two attitudinal dimensions (students' perceived vulnerability and self-efficacy) and students' behavior. If you are familiar with instructional psychology, you might suggest particular modifications in the instructional procedures used by the teachers. If you do not possess such knowledge, you could suggest that the HIV education staff rethink the dimensions on which little student progress is evident. You might also, at this point, seek qualitative data from student interviews (individual or focus group sessions) about which program components the students thought did or did not work.

One disadvantage of this design, as we have discussed, is the possibility that factors other than the HIV education program have influenced students' pretest-versus-posttest responses. You will have to be attentive to such possibilities. If other events, such as the release of a popular film about AIDS, occur during the period in which the HIV education program takes place, you will need to describe them in your report.

Another potential disadvantage of this data-gathering design stems from the use of the same assessment instruments before and after the program. The use of a pretest may result in a reactive effect by alerting students to what they are expected to get out of the program. Students may react to the program differently than they otherwise would have, merely because the pretest let them know "what's important" in the program. If you fear that the assessment instruments you are considering would be reactive, you may wish to consider alternative data-gathering approaches such as those described in the additional readings at the end of this booklet.



A data-gathering design for program-continuation evaluations

The initial consideration in selecting a data-gathering design for program-continuation evaluations of HIV education is the confidence with which you can supply convincing evidence about the program's effectiveness. Although a data-gathering scheme such as the one-group pretest-posttest design might prove satisfactory for program-improvement purposes, it does not fill the needs of program-continuation evaluators wishing to supply evidence about whether a program really worked. You need a data-gathering design that allows you to make defensible statements about an HIV education program's success or lack of it. And because the evaluation of school-based HIV education programs must take place in the midst of ongoing education programs, you must select a data-gathering design that can be realistically implemented in a school setting.

The pretest-posttest two-group design, portrayed in Figure 2, provides the strongest basis for a program-continuation data-collection scheme to address these considerations. This design involves two groups, with only Group 1 initially receiving HIV education. Group 2 begins as an untreated control group.* This data-gathering design requires that a preprogram measurement be given to both groups.

Group 1: Measurement -> HIV Education Program -> Measurement
Group 2: Measurement -> No HIV Education Program -> Measurement -> HIV Education Program

Figure 2. A pretest-posttest two-group design


*If Group 2 is not receiving any HIV instruction, it is termed the "control group." Sometimes, however, Group 2 is receiving a different intervention (perhaps an earlier version). Group 2 is then called the "comparison group." 


After Group 1 has completed the HIV education program, both groups are posttested. Because an effective HIV education program will provide students with content that can quite literally save their lives, the prospect of employing a data-gathering design in which an "untreated control group" of students receives no HIV education runs counter to our sense of educational responsibility. Therefore, enough time must be set aside during the school year to ensure that Group 2 also receives the HIV education program after the posttest.

The key comparisons in this two-group design are those between the pretest-to-posttest changes made in Group 1 (the treated group) and those made in Group 2 (the untreated group). If Group 1 shows larger pretest-to-posttest gains than Group 2, the program appears to be effective. If there is no difference between the two groups' pretest-to-posttest changes, or if Group 2 outperforms Group 1, a lack of program effectiveness is indicated.
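To make the key comparison concrete, here is a minimal Python sketch that subtracts the control group's pretest-to-posttest change from the treated group's change. The function names `gain` and `difference_in_gains` and all scores are hypothetical, chosen only for illustration; a real evaluation would also apply an appropriate statistical test.

```python
from statistics import mean

def gain(pre, post):
    """Pretest-to-posttest change in a group's mean score."""
    return mean(post) - mean(pre)

def difference_in_gains(g1_pre, g1_post, g2_pre, g2_post):
    """Group 1 (treated) gain minus Group 2 (untreated) gain."""
    return gain(g1_pre, g1_post) - gain(g2_pre, g2_post)

# Hypothetical scores, invented for illustration only.
group1_pre, group1_post = [10, 12, 11, 9], [15, 17, 16, 14]
group2_pre, group2_post = [11, 10, 12, 11], [12, 11, 13, 12]

diff = difference_in_gains(group1_pre, group1_post, group2_pre, group2_post)
print(f"Group 1 gain exceeds Group 2 gain by {diff:.1f} points")
```

A value near zero or below zero would correspond to the "lack of program effectiveness" case described above.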

Interpretations of the effectiveness of the HIV education program, however, are totally dependent on the degree to which students in the two groups are similar. If the groups are essentially the same, you can draw meaningful conclusions as to whether the HIV education program worked. As the two groups become less similar, the conclusions to be drawn become less meaningful. For example, one of the concerns with classroom-based evaluations is that students in one classroom may differ from students in another classroom. One reason for this is that students may be assigned to particular classes on the basis of their ability or interests. When classroom assignments are not made randomly, you cannot assume that students within those classrooms will be similar. Therefore, if the two groups (treated and untreated) are composed of only two different classrooms, it is nearly impossible to determine whether posttest differences are due to the intervention program or to differences among students in the individual classrooms.

One solution to this problem is to increase the number of classrooms to at least two per intervention group and two per control group. The more classrooms that can be included per group, and the more randomly those classrooms can be selected from all possible classrooms, the more likely it is that the students in the intervention and control groups will be equivalent at the pretest. If a large number of classrooms (e.g., 20) can be randomly selected from the school district and randomly assigned to treatment or control situations, then a "posttest only" design may be used, in which only differences between posttest scores are examined (because we feel confident that students' scores were equivalent to begin with).

Of course, it is not always possible to study many classrooms at one time, or to select classrooms randomly from the school district. In that case, it is important to use pretest scores in the analysis to control for possible initial differences between the groups. This more common situation is analyzed in a "pretest-posttest (nonequivalent) two-group design," which is shown in Figure 2.
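One common way to use pretest scores as a control is an analysis-of-covariance style adjustment, sketched below. This is an illustrative technique rather than a procedure prescribed by this booklet. It assumes that each value is a hypothetical classroom mean, so that no individual student needs to be identified, and the function names are invented for the example.

```python
from statistics import mean

def pooled_within_slope(groups):
    """Pooled within-group slope of posttest on pretest.

    `groups` is a list of (pre, post) pairs of equal-length lists.
    """
    sxy = 0.0
    sxx = 0.0
    for pre, post in groups:
        mx, my = mean(pre), mean(post)
        sxy += sum((x - mx) * (y - my) for x, y in zip(pre, post))
        sxx += sum((x - mx) ** 2 for x in pre)
    return sxy / sxx

def adjusted_posttest_difference(t_pre, t_post, c_pre, c_post):
    """Posttest difference (treated minus control), corrected for the
    groups' difference in pretest means."""
    slope = pooled_within_slope([(t_pre, t_post), (c_pre, c_post)])
    raw_diff = mean(t_post) - mean(c_post)
    pre_diff = mean(t_pre) - mean(c_pre)
    return raw_diff - slope * pre_diff

# Hypothetical classroom means, invented for illustration only.
treated_pre, treated_post = [12, 14, 16], [17, 19, 21]
control_pre, control_post = [10, 12, 14], [10, 12, 14]

adjusted = adjusted_posttest_difference(
    treated_pre, treated_post, control_pre, control_post
)
print(f"Adjusted posttest difference: {adjusted:.1f}")
```

In this invented data the treated classrooms started two points higher, so the raw posttest difference of seven points overstates the program's apparent effect; the adjustment removes that head start.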

It may also be important to consider that the intervention could have different effects on different types of students. Age, gender, or ethnicity, for example, may be key indicators of a student's receptivity to some or all intervention components. Therefore, it may be important to analyze the results on the basis of key student characteristics. This is a somewhat more complicated design. It will require a bit more work to set up, and will need more students than the simpler designs described above. Data analysis could also be more complex. However, if particular student characteristics are responsible for different reactions to the intervention program, then it would be well worth the effort to examine those differences in the analysis, which should lead to more meaningful results.

Finally, the location of the schools within a given district may also affect the results for a number of reasons. Therefore, it may be desirable to match classrooms from schools in similar neighborhoods, with similar student populations, or both, and then randomly assign one classroom of each pair to the treatment condition and the other to the control condition.
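The matched-pair assignment just described can be sketched as follows. The classroom labels and the function name `assign_matched_pairs` are hypothetical, and the sketch assumes the matching itself has already been done by hand.

```python
import random

def assign_matched_pairs(pairs, seed=None):
    """For each matched pair of classrooms, randomly pick one for the
    treatment condition; the other becomes its control."""
    rng = random.Random(seed)  # a seed makes the assignment reproducible
    assignments = []
    for first, second in pairs:
        if rng.random() < 0.5:
            treated, control = first, second
        else:
            treated, control = second, first
        assignments.append({"treatment": treated, "control": control})
    return assignments

# Hypothetical classrooms, already matched on neighborhood type.
matched_pairs = [
    ("School A, Room 1", "School B, Room 4"),
    ("School C, Room 2", "School D, Room 7"),
]

for row in assign_matched_pairs(matched_pairs, seed=1):
    print(row)
```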



Sampling

Whenever possible, several schools should be randomly sampled from the school district for inclusion in the study. Random sampling can be as simple as drawing school names from a hat that contains the name of every school in the district, and then randomly selecting one or two classes from each school drawn. These classes can be randomly assigned to treatment or control conditions by the flip of a coin.
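The "names from a hat" procedure can be mimicked with a random number generator. The following Python sketch is illustrative only: the school and class names, the function name `sample_and_assign`, and its parameters are all hypothetical.

```python
import random

def sample_and_assign(classes_by_school, n_schools, classes_per_school, seed=None):
    """Randomly draw schools, then classes within each drawn school,
    then assign each class to a condition by a simulated coin flip."""
    rng = random.Random(seed)
    # Sort the names first so a given seed always yields the same draw.
    schools = rng.sample(sorted(classes_by_school), n_schools)
    plan = {}
    for school in schools:
        chosen = rng.sample(classes_by_school[school], classes_per_school)
        for class_name in chosen:
            plan[(school, class_name)] = rng.choice(["treatment", "control"])
    return plan

# Hypothetical district roster, invented for illustration only.
district = {
    "School A": ["Period 1", "Period 2", "Period 3"],
    "School B": ["Period 1", "Period 2"],
    "School C": ["Period 1", "Period 2", "Period 4"],
    "School D": ["Period 2", "Period 3"],
}

for key, condition in sample_and_assign(district, 2, 2, seed=7).items():
    print(key, condition)
```

Note that independent coin flips can leave the two conditions unequal in size, especially when only a few classes are drawn.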

If you are interested in matching schools on a key set of characteristics, the school district office may have relevant information on school location and student composition. You may then want to group all district schools into different types, such as urban versus suburban, and then randomly sample from within each group. Preselecting groups of schools from which to draw your random sample is known as stratified sampling. These and other sampling procedures are described in most standard research-oriented textbooks.
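Stratified sampling can be sketched in the same way. In this minimal Python example the strata, the school names, and the function name `stratified_sample` are hypothetical; the sketch simply draws the same number of schools at random from each preselected group.

```python
import random

def stratified_sample(schools_by_stratum, n_per_stratum, seed=None):
    """Randomly draw `n_per_stratum` schools from each stratum."""
    rng = random.Random(seed)
    return {
        stratum: rng.sample(schools, n_per_stratum)
        for stratum, schools in schools_by_stratum.items()
    }

# Hypothetical grouping of district schools, for illustration only.
strata = {
    "urban": ["School A", "School B", "School C"],
    "suburban": ["School D", "School E", "School F", "School G"],
}

print(stratified_sample(strata, n_per_stratum=2, seed=3))
```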


Final thoughts about Guideline 3

There are many more data-gathering strategies than the two basic models presented here. In the evaluation of HIV education programs, however, you will find that these two designs will satisfy almost all of your data-gathering requirements.

The one-group pretest-posttest design is recommended for program-improvement evaluations. A two-group variation of that design is recommended for program-continuation evaluations. Although it is certainly possible to use a one-group design in program-continuation evaluations, its results will not be as convincing as those obtained with a control group. It is equally possible to use a control-group design in program-improvement evaluations. You may find, however, that control groups often add needless complications to an evaluation focusing on program improvement.


This page last updated April 29, 2005

United States Department of Health and Human Services
Centers for Disease Control and Prevention
National Center for Chronic Disease Prevention and Health Promotion
Division of Adolescent and School Health