How to Be an Informed Consumer of Evidence Ratings: It’s in the Details


Abstract
What are evidence-based strategies and how can public health practitioners find evidence without conducting extensive literature reviews? We developed an inventory of clearinghouses and other resources that disseminate research on evidence of effectiveness. We examined differences in evidence classification among 6 evidence clearinghouses that rate the effectiveness of community-level strategies to address determinants of health. Most evidence clearinghouses clearly defined their scope, but only a few clearinghouses explicitly defined the types of strategies they assess (eg, programs, policies, practices). The term "evidence-based" was widely used, but definitions and standards were inconsistent across organizations and disciplines. Evidence clearinghouses varied in the way they used evidence rating classifications and criteria for assigning ratings. Attention to detail is important. The criteria for the top rating of some evidence clearinghouses, for example, require a more thorough literature review with more robust results than the criteria for the top rating of others. In addition, some clearinghouses report only on strategies considered to be evidence-based, whereas others also report on strategies that have no effect, mixed evidence, or no qualifying studies, demonstrating that a listing of a strategy by an evidence clearinghouse does not necessarily mean that it is effective. We conclude by providing guidance for users of evidence clearinghouses about how to interpret and effectively apply rating criteria across platforms: look closely at the details of how clearinghouses assign their ratings and be aware of similarities and differences when you are aligning potential strategies with your local priorities. We encourage communities to balance evidence with local needs, resources, and culture in strategy selection and funding decisions.
What Is Evidence and Why Is It Important?
Since the early 1990s, evidence-based decision making has gained prominence in the field of medicine, followed by the field of public health. In medicine and public health, evidence typically refers to research evidence, rather than experiential or contextual evidence (1,2). Our study examines best available research evidence in terms of both strength of evidence and effectiveness. "Strength of evidence" refers to how rigorously a program, policy, or practice has been evaluated and to the quality and quantity of evidence available to determine whether the program or policy is producing the desired outcomes. Effectiveness considers whether the outcomes observed are, in fact, a product of the program, policy, or practice itself and whether those outcomes are desirable or not desirable (2). Systematic reviews of randomized controlled trials (RCTs) are widely recognized as the gold standard of intervention research. Such reviews follow an established process for searching, critically appraising, and summarizing results of research studies, accounting for all relevant qualifying studies and their results and establishing whether research findings are consistent and generalizable across populations and settings. Individual studies (for example, an RCT, a cohort study, a case-control study, a case series, and a case report) vary in strength of evidence. Sometimes, however, no study is available, and practitioners might turn to expert opinion (3,4). Researchers acknowledge that best evidence can exist in various forms (5), often in tandem with contextual factors such as clinical expertise, patient preference, and environmental and organizational context (6). Medical literature describes various methods for assessing evidence to support clinical practice recommendations, such as the Grading of Recommendations Assessment, Development and Evaluation (GRADE) system (7); however, these methods are primarily designed to evaluate clinical practice rather than community-based interventions.
Community leaders and practitioners have numerous approaches to finding evidence and varying criteria for considering evidence in decisions (8). Understanding the details of approaches to synthesizing and rating evidence can help practitioners harness best evidence to implement locally applicable, effective solutions. By showing what has been successful in research, evidence can drive smart investments and support wise allocation of scarce dollars and other resources. Knowing whether strategies exist to address local priorities can also inform decisions about when to innovate and when to adopt strategies that have already been tested and shown to be effective. When strategies that support local priorities have strong evidence of effectiveness, practitioners have a solid starting place for action. When strategies that support local priorities do not have strong evidence of effectiveness, or cannot be implemented with the fidelity that increases the likelihood of expected results (9), innovation using new, untested strategies can be a better approach, especially when combined with evaluation.
Where Can Communities Find Evidence?
Searching the scientific literature for RCTs or other studies is often not feasible for public health practitioners or community members, largely because of limited time and access to scientific literature (10). Evidence clearinghouses offer registries of strategies that communities can implement to address local priorities. Some evidence clearinghouses also review and assess evidence to rate strategies on the basis of the strength of the evidence of effectiveness. All aim to help guide local strategy selection or design decisions, but approaches differ. Our use of the term "evidence clearinghouse" refers to all clearinghouses that support this aim, incorporating a spectrum of methods and content areas.
We developed a comprehensive, but not exhaustive, inventory of evidence clearinghouses and other resources that summarize evidence on strategies that address the multiple determinants of health (Table 1). We focused on clearinghouses that regularly update content and make it available through searchable web-based platforms. We identified these clearinghouses through a general internet search using terms such as "evidence ratings" and "research clearinghouses" and by reviewing inventories compiled by groups such as the Results First Initiative (14), the Bridgespan Group (15), and the Corporation for National and Community Service (16). Some clearinghouses, such as healthevidence.org and Strengthening Families Evidence Review, focus on the quality of individual studies. Others, such as The Guide to Community Preventive Services (The Community Guide), conduct systematic reviews and provide a summary rating. Clearinghouses such as What Works for Health (WWFH) (the authors' clearinghouse, part of the County Health Rankings & Roadmaps program) consider study quality and rate intervention effectiveness. Our inventory notes 21 clearinghouses that rate intervention effectiveness (Table 1).
Each clearinghouse has its own scope of interest, methods, and rating classifications. Many clearinghouses also provide additional content to accompany evidence ratings and support effective decision making. Some, such as The Community Guide, WWFH, and Social Programs that Work (SPTW), provide cost-related information. This information ranges from The Community Guide's economic effectiveness analysis (conducted for strategies they rate as "recommended") to study details noted by WWFH and SPTW. Some also emphasize tools or content that can bolster efforts to increase equity or reduce disparities in health-related outcomes. WWFH, for example, assesses the likely effect of each strategy among socioeconomic, racial/ethnic, and geographic groups. Many clearinghouses that assess and rate evidence also provide examples, stories, or other action-focused resources to support implementation.

How Is Evidence Rated?
To understand how evidence clearinghouses rate evidence, we selected a sample of clearinghouses that provide evidence of effectiveness ratings for strategies that affect multiple determinants of health. Multiple determinants of health are defined in several ways, for example, "genetics, behavior, social circumstances, environmental and physical influences, and medical care" (17). The County Health Rankings model (18), on which WWFH is based, and this analysis, exclude genetics. In selecting our sample, we excluded clearinghouses that rate the quality of individual studies about an intervention but do not assess the effectiveness of that intervention overall. We also excluded clearinghouses that indicated their content is no longer updated. We minimized the inclusion of clearinghouses that are part of the Results First Clearinghouse Database, "an online resource that brings together information on the effectiveness of social policy programs from nine national clearinghouses" (12), because Results First provides tables to help users compare and contrast these ratings (11).
Our focused review examined the work of the following 6 evidence clearinghouses: Best Evidence Encyclopedia (BEE); The Community Guide; Healthy Communities Institute (HCI); Rural Health Information Hub (RHIhub); SPTW (formerly the Coalition for Evidence-Based Policy); and WWFH.
We conducted a qualitative analysis of the scope, methods, and ratings as described on the website of each of the 6 selected clearinghouses, with particular attention to the literature assessment (eg, literature review, systematic review), the criteria used to assess the quality of individual studies, and the type and number of studies required to establish each rating. We also considered scope of interest and the types of strategies assessed. We completed reviews in September 2018 and confirmed our information in October 2018. We invited staff members from each of the 6 clearinghouses to provide feedback on the accuracy of our information.
Each evidence clearinghouse has its own scope of interest (Table 2). The types of strategies (eg, programs, policies) assessed also vary, and selection of these strategies is largely tied to scope of interest and approach to compiling and assessing the literature. BEE, SPTW, and WWFH monitor topic-relevant research to identify potential strategies for assessment; SPTW and WWFH consult with experts. The Community Guide has a set process and priority-setting criteria to determine which strategies will be assessed. HCI accepts submissions and reviews them for inclusion on all community sites; local site administrators can decide whether to include submissions that are not selected for inclusion on all HCI sites. RHIhub also accepts submissions and includes programs that address rural health issues, are implemented in a rural US community, and include a program contact.
Scope of interest and type of strategies assessed. WWFH, HCI, and The Community Guide address multiple determinants of health; the latter two also address several diseases and injuries. RHIhub focuses on programs and interventions in rural communities. SPTW focuses on social programs and BEE on education programs. Some, such as The Community Guide and WWFH, emphasize broadly defined policy, systems, and environmental change (PSE) strategies; WWFH also includes some named programs, such as Nurse Family Partnership and Reach Out and Read. Other clearinghouses, such as SPTW and HCI, focus more heavily on named programs.
Approach to compiling and assessing literature. The websites of these 6 clearinghouses indicate various approaches to compiling and assessing available literature in support of their evidence ratings. The Community Guide and SPTW conduct systematic reviews, and BEE conducts systematic reviews with meta-analysis. WWFH conducts an extensive literature review, informed by the principles of systematic review methods, to capture and assess available evidence in a shorter time frame than systematic reviews, allowing inclusion of more strategies than the aforementioned clearinghouses. HCI and RHIhub seek and accept submissions from evaluators, practitioners, and others, fostering dissemination of early practice-based results. Review criteria for submissions were not apparent in our search of these 2 websites; it was also unclear whether a formal literature review process is used to inform evidence ratings.
Study types considered. Studies vary in their ability to determine causality; reviewed clearinghouses vary in the types of studies required to support evidence rating assignments. SPTW and BEE include RCTs and strong quasi-experimental designs (QEDs) as the foundation for their rating assignments. The Community Guide and WWFH include RCTs, QEDs, and some weaker study designs in their reviews. Strong QEDs are based on sound theory, use comparison groups, and typically include multiple measurement points; weaker study designs are also based on sound theory but do not have comparison groups and might not include multiple measurement points (2). Although HCI and RHIhub require peer-reviewed studies, overall, these clearinghouses do not specify the types of studies required for each rating. HCI and RHIhub describe pre-post designs for their highest rating categories and appear to assign this rating to strategies studied with or without comparison groups.
Replicability. The 6 clearinghouses also vary in their approach to replication, or demonstrations of generalizability, which is important to ensure a study's results are valid in different settings, with different populations, or at different times (2). BEE, SPTW, and WWFH require multiple strong studies, a strong study implemented in multiple sites, or systematic review(s) of strong studies for their highest evidence ratings. The Community Guide conducts an applicability assessment process to evaluate generalizability along with the criteria used to assign their highest evidence rating. RHIhub requires successful implementation in more than 1 community via peer-reviewed program evaluations as a means to gauge replicability. HCI does not appear to require a demonstration of replication; its highest rating category can be assigned on the basis of 1 study that demonstrates program success in 1 or more locations.
Rating categories. Each of the 6 clearinghouses has a unique scale for rating evidence and a unique number of ratings (Table 3). Most ratings indicate degree of effectiveness, and some ratings indicate additional evidence is needed. Most rating categories are favorable (eg, "strong," "recommended," "effective"), but WWFH and The Community Guide also assign unfavorable ratings: WWFH assigns "evidence of ineffectiveness," and The Community Guide assigns "recommended against." WWFH is the only organization with the rating "expert opinion." "Expert opinion" is assigned to new strategies or innovations that have limited or no qualifying research but are recommended by credible, impartial experts. Additionally, this category may be indicated for strategies with benefits that are not described in empirical literature (eg, adding a dental clinic in a rural area without dental providers improves access to oral health care for at least some residents) or are difficult to test. RCTs are not always practical, as clearly pointed out by Smith and Pell in their systematic review of studies examining parachute use (19). WWFH also differentiates between "mixed evidence" (when strategies have been tested more than once in strong studies and results are inconsistent) and "insufficient evidence" (when too few studies assess the strategy of interest), whereas other clearinghouses might not; for example, The Community Guide covers both categories under "insufficient evidence."
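The differences in literature approach, replication, and unfavorable ratings described in this section can be collected in a small side-by-side structure. The Python sketch below is one illustrative way to do so and is not a tool used by any clearinghouse. The class name ClearinghouseProfile, its field names, and the helper names_where are hypothetical, and the values only paraphrase the descriptions given in this section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClearinghouseProfile:
    """Illustrative summary of one clearinghouse (field names are hypothetical)."""
    name: str
    literature_approach: str
    replication_for_top_rating: str
    assigns_unfavorable_rating: bool


# Values paraphrase the text of this section; they are not official metadata.
PROFILES = [
    ClearinghouseProfile("BEE", "systematic review with meta-analysis",
                         "required", False),
    ClearinghouseProfile("The Community Guide", "systematic review",
                         "applicability assessment", True),
    ClearinghouseProfile("HCI", "accepts submissions",
                         "not apparent", False),
    ClearinghouseProfile("RHIhub", "accepts submissions",
                         "required", False),
    ClearinghouseProfile("SPTW", "systematic review",
                         "required", False),
    ClearinghouseProfile("WWFH", "extensive literature review",
                         "required", True),
]


def names_where(predicate):
    """Return the names of profiles for which predicate(profile) is true."""
    return [p.name for p in PROFILES if predicate(p)]


if __name__ == "__main__":
    # Clearinghouses that also assign unfavorable ratings.
    print(names_where(lambda p: p.assigns_unfavorable_rating))
    # Clearinghouses that require replication for their top rating.
    print(names_where(lambda p: p.replication_for_top_rating == "required"))
```

A user comparing platforms could extend such a structure with the fields that matter locally (for example, cost information or equity content) rather than rely on rating labels alone.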

Key Lessons in Considering Evidence of Effectiveness Ratings Provided by Evidence Clearinghouses
Look for information about the scope of interest and types of strategies included. Most evidence clearinghouses in our review clearly define their scope of interest and outline a framework for the topics covered. However, the types of strategies assessed (eg, policy, program) might not be so well defined. Understanding the scope and types of strategies covered can help users search appropriately for strategies to address local priorities.
Ascertain what constitutes "evidence-based" for each clearinghouse, because no consensus exists. Among the clearinghouses we examined, there is no universal definition of "evidence-based." Clearinghouses vary in the terminology they use to describe levels of evidence and effectiveness and the criteria used to assign their ratings. Although evidence clearinghouses provide a streamlined way to learn about evidence, it is important for practitioners to pay attention to how each clearinghouse defines each term used in their rating classifications.
Understand that evidence clearinghouses weight research designs differently. Some, but not all, clearinghouses give greater weight to evidence from systematic reviews, RCTs, and strong QEDs than to other study types, particularly in their highest evidence rating categories. Systematic reviews and RCTs are recognized as the gold standard of effectiveness; seeking out interventions with this level of evidence can be important when a community is scaling up an intervention or investing substantial time or money, or when political stakes for success are high.
Recognize differences in evidence clearinghouses' requirements for literature review and their considerations of study quality and quantity. Some clearinghouses search for evidence more systematically and judge study quality and design more strictly than others. Some also emphasize replicability more heavily. Yet others focus more on dissemination of early practice-based results. Understanding the breadth and replicability of studies provides practitioners with critical information as they consider deploying interventions in their own community.
Be aware that most evidence clearinghouses do not assign ratings for ineffectiveness, expert opinion, or mixed results. Only 2 clearinghouses that we examined closely include information about strategies with evidence of ineffectiveness, and WWFH is the only one that has the category "expert opinion." Exploring evidence along the entire continuum of effectiveness can provide practitioners with information about ineffective policies or programs that might need to end, strategies with mixed evidence that may need a closer look, and strategies rated "insufficient evidence" or "expert opinion" that may especially benefit from more rigorous evaluation designs.
In general, more focus appears to be on what works rather than on what does not or is unknown. This discrepancy is likely due, at least in part, to the fact that more literature is available for what works than for what does not, which is partially a result of publication bias (20). This focus on what works raises 2 important caveats. First, inclusion of a strategy in an evidence clearinghouse should not be considered a recommendation for implementation, because included strategies are sometimes ineffective. Second, little is known about strategies that are not listed in evidence clearinghouses. Are they ineffective, or have they simply not been studied or reviewed for inclusion?
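The first caveat, that a listing is not a recommendation, can be made concrete with a simple lookup that separates favorable ratings from the rest. The Python sketch below is a minimal illustration limited to rating labels named in this article for WWFH and The Community Guide. The grouping into 3 bins, the dictionary RATING_BINS, and the function interpret_rating are assumptions made for illustration, not an official crosswalk or part of either clearinghouse's methods.

```python
# Rating labels as named in this article; the three bins are an assumption
# made for illustration, not an official crosswalk between clearinghouses.
RATING_BINS = {
    ("WWFH", "scientifically supported"): "favorable",
    ("WWFH", "some evidence"): "favorable",
    ("WWFH", "expert opinion"): "needs more evidence",
    ("WWFH", "insufficient evidence"): "needs more evidence",
    ("WWFH", "mixed evidence"): "needs more evidence",
    ("WWFH", "evidence of ineffectiveness"): "unfavorable",
    ("The Community Guide", "recommended"): "favorable",
    ("The Community Guide", "insufficient evidence"): "needs more evidence",
    ("The Community Guide", "recommended against"): "unfavorable",
}


def interpret_rating(clearinghouse: str, rating: str) -> str:
    """Map a listed strategy's rating to a bin; unknown pairs are flagged."""
    key = (clearinghouse, rating.strip().lower())
    return RATING_BINS.get(key, "check the clearinghouse's criteria")


if __name__ == "__main__":
    print(interpret_rating("WWFH", "Evidence of ineffectiveness"))
    print(interpret_rating("The Community Guide", "Recommended"))
    print(interpret_rating("HCI", "Good idea"))
```

Pairs that are not in the lookup deliberately fall through to a prompt to read the clearinghouse's own criteria, because the same label can carry different requirements on different platforms.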

Guidance for Public Health Practitioners, Community Members, and Policy Makers
What knowledge do community leaders and policy makers need to be informed consumers of evidence clearinghouses that summarize evidence about health improvement efforts? As demonstrated in our qualitative review of publicly available data and in a 2016 assessment of education-related evidence resources, "the methods used in these syntheses vary in fundamental ways" (20). In using any evidence clearinghouse, paying attention to the fine print is important. Each clearinghouse has a unique approach to assessing evidence and communicating effectiveness. In particular, the top evidence rating, which identifies the strategies considered most effective, requires a less thorough literature search and less robust results in some clearinghouses than in others. This variability reflects different choices in search methods, replication requirements, and often, the scope of strategies included. Users of such clearinghouses can consult our list of key lessons as they examine the criteria of each clearinghouse to ensure that they understand the ratings and confirm ratings align with their local expectations and goals. Going forward, evaluation is needed to ensure that selected strategies work in the local population, setting, and context, as well as to add new examples to the evidence base.
Caution should be taken in implementing strategies that are found to have no effect or mixed results; communities interested in such strategies should consider study results, possible modifications to the strategy, and implications of implementation fidelity. Strategies for which literature reviews yield no qualifying studies might simply be too new to determine likely effectiveness. In these situations, conducting a pilot or implementing a rigorous evaluation to be sure that these strategies do, in fact, achieve expected outcomes is a wise approach.
Finally, evidence clearly matters to decision making, but so do other factors. Knowledge building is a continuous process, and the creativity of local communities in addressing perplexing challenges, accompanied by a "test and see" approach, is often a source of new evidence. Local culture, potential effect on disparities, feasibility, and cost are also important considerations. Purposeful approaches to balance these factors, along with evidence of effectiveness, can best support efforts to select strategies that will appropriately address local priorities.

The Community Guide

Recommended against
The systematic review of available studies provides strong or sufficient evidence that the intervention is harmful or not effective.

Healthy Communities Institute

Evidence-based practice
At minimum, the program description includes information on the sponsoring organization, program goals, program implementation steps, and outcomes that demonstrated success in achieving the program goal in one or more localities. Results from an evaluation of the program include quantitative measures showing improvement in the outcome of interest after the implementation of the program (eg, increase in smoking cessation, not just the delivery of a smoking cessation program). The outcome measure is compared at relevant periods before and after the intervention or program implementation. Alternatively, the evaluation study compares the outcome between an intervention group and an appropriate control group. The study is of peer-review quality and presents data in a scientific manner; measurements of precision and reliability are included (eg, confidence intervals, standard errors), results from statistical tests show a significant difference or change in the outcome measure and relevant point estimates and P values. If results from an evaluation of a program are presented in a scientific manner and the outcome measure improved from baseline or in the control group but the difference was not significant, the practice is classified as effective and not evidence-based.

Effective practice
At minimum, the program description includes information on the sponsoring organization, program goals, program implementation steps, and outcomes that demonstrated program success and/or promise in achieving the program goal in one or more localities. The results from an evaluation of the program include quantitative measures of improvement in outcome of interest (ie, increase in voter registration, not just delivery of voter registration drive) and/or the outcome measure increased or improved from baseline or in the control group but the difference was not significant.

Good idea
The program description includes information on the sponsoring organization, program goals, program funding source, program implementation steps, and outcomes. The program evaluation is limited to descriptive measure(s) of success/accomplishment (eg, program participation rates, number of services/education sessions/radio messages provided). Programs that have not yet been evaluated, but which show promise in improving health or quality of life, are classified as Good Ideas until an evaluation is conducted. These programs are often newly implemented, and a program evaluation has not yet been conducted.

Rural Health Information Hub

Evidence-based
A review study of the approach in a peer-reviewed publication. Approach tried in more than one location or setting. Overall results were positive for the approach and may have varied by setting or location.

Emerging
Anecdotal account of a program, without documentation of a formal evaluation. Typically includes a single location or setting. Program result may be positive (success story), negative (lesson learned), or mixed.

Social Programs that Work

Top tier
Programs shown in well-conducted RCTs, carried out in typical community settings, to produce sizable, sustained effects on important outcomes. Top Tier evidence includes a requirement for replication: the demonstration of such effects in ≥2 RCTs conducted in different implementation sites, or, alternatively, in 1 large multi-site RCT. Such evidence provides confidence that the program would produce important effects if implemented faithfully in settings and populations similar to those in the original studies.

Near top tier
Programs that meet almost all elements of the top tier standard and that need only 1 additional step to qualify. This category primarily includes programs that meet all elements of the top tier standard in a single study site but need a replication RCT to confirm the initial findings and establish that they generalize to other sites. This standard is best viewed as tentative evidence that the program would produce important effects if implemented faithfully in settings and populations similar to those in the original study.

Suggestive tier
Programs that have been evaluated in ≥1 well-conducted RCTs (or studies that closely approximate random assignment) and found to produce sizable positive effects, but whose evidence is limited by only short-term follow-up, effects that fall short of statistical significance, or other factors. Such evidence suggests that the program may be an especially strong candidate for further research but does not yet provide confidence that the program would produce important effects if implemented in new settings.

What Works for Health

Scientifically supported
Strategies with this rating are most likely to make a difference. These strategies have been tested in multiple robust studies with consistently favorable results.

Some evidence
Strategies with this rating are likely to work, but further research is needed to confirm effects. These strategies have been tested more than once and results trend favorable overall.

Expert opinion
Strategies with this rating are recommended by credible, impartial experts but have limited research documenting effects; further research, often with stronger designs, is needed to confirm effects.

Insufficient evidence
Strategies with this rating have limited research documenting effects. These strategies need further research, often with stronger designs, to confirm effects.

Mixed evidence
Strategies with this rating have been tested more than once and results are inconsistent; further research is needed to confirm effects.

Evidence of ineffectiveness
Strategies with this rating are not good investments. These strategies have been tested in multiple studies with consistently unfavorable or harmful results.

a Rating descriptions are from each evidence clearinghouse's website.