Centers for Disease Control and Prevention
MMWR


System To Generate Semisynthetic Data Sets of Outbreak Clusters for Evaluation of Outbreak-Detection Performance

Christopher A. Cassa,1,2 K. Olson,1,3 K. Mandl1,3
1Children's Hospital Boston, Boston, Massachusetts; 2Massachusetts Institute of Technology, Cambridge, Massachusetts; 3Harvard Medical School, Boston, Massachusetts

Corresponding author: Christopher Cassa, Massachusetts Institute of Technology, 77 Massachusetts Ave., Rm. E25-519, Cambridge, MA 02139. Telephone: 617-355-2930; Fax: 617-730-0921; E-mail: cassa@mit.edu.

Abstract

Introduction: The outbreak-detection performance of a syndromic surveillance system can be measured in terms of its ability to detect signal (disease outbreak) against background noise (normal variation of baseline disease within a region). However, because few persons have been infected with agents of biologic terrorism, real data on such outbreaks are virtually nonexistent, and simulation is necessary. One approach to evaluation is to present detection algorithms with semisynthetic data sets, which contain simulated signal superimposed on real background noise.
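The idea of a semisynthetic data set can be illustrated with a minimal sketch. It is not taken from the CHIP tool; the function name `inject_outbreak`, the daily-count representation, and the randomly generated stand-in for real background data are all assumptions made for illustration.

```python
import random


def inject_outbreak(background, start_day, outbreak_counts):
    """Return a copy of `background` with simulated outbreak cases added.

    `background` is a list of daily case counts (the noise);
    `outbreak_counts` is the simulated signal, one value per outbreak day.
    """
    semisynthetic = list(background)
    for offset, extra_cases in enumerate(outbreak_counts):
        day = start_day + offset
        # Ignore outbreak days that fall outside the observed period.
        if 0 <= day < len(semisynthetic):
            semisynthetic[day] += extra_cases
    return semisynthetic


# Hypothetical stand-in for real baseline counts; a real evaluation
# would load historical visit counts here instead.
rng = random.Random(0)
background = [rng.randint(20, 30) for _ in range(14)]

signal = [2, 5, 9, 5, 2]  # assumed outbreak shape: rise, peak, decline
combined = inject_outbreak(background, start_day=6, outbreak_counts=signal)
```

Because the injected signal is known exactly, any alarm raised by a detection algorithm on `combined` can be scored against the true outbreak days.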

Objectives: The Children's Hospital Informatics Program (CHIP) Cluster Generator automates the creation of spatio-temporal patient cluster data to help evaluate epidemic-detection software. The spatio-temporal data can then be used to analyze the sensitivity and specificity of spatial or temporal detection algorithms.
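As a hedged illustration of how sensitivity and specificity might be computed from such data, the sketch below scores day-level alarms against known outbreak days. The day-level scoring scheme and the function name `sensitivity_specificity` are assumptions; the abstract does not state how the authors scored detection.

```python
def sensitivity_specificity(alarms, outbreak_days):
    """Score per-day alarms against the known (injected) outbreak days.

    Both arguments are equal-length sequences of booleans.
    Returns (sensitivity, specificity); a value is NaN if undefined.
    """
    tp = fp = tn = fn = 0
    for alarm, outbreak in zip(alarms, outbreak_days):
        if alarm and outbreak:
            tp += 1
        elif alarm and not outbreak:
            fp += 1
        elif not alarm and outbreak:
            fn += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Illustrative values only: an 8-day window with a 3-day injected outbreak.
outbreak_days = [False, False, True, True, True, False, False, False]
alarms = [False, True, False, True, True, False, False, False]
sens, spec = sensitivity_specificity(alarms, outbreak_days)
```

In this example the algorithm flags two of the three outbreak days and raises one false alarm on five non-outbreak days.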

Methods: A software tool (available at http://www.chip.org/biosurv/resources.htm) was created to generate artificial outbreaks of spatially clustered cases and inject them into background noise. Each cluster is defined by a controlled feature set. Parameters (e.g., outbreak magnitude, duration, temporal progression, and location) can be varied by the user.
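A cluster defined by such a feature set might be generated as in the following sketch. This is not the CHIP Cluster Generator's code: the `ClusterParams` fields, the uniform-disc spatial model, the two progression shapes, and the example coordinates are all hypothetical choices for illustration.

```python
import math
import random
from dataclasses import dataclass

KM_PER_DEGREE_LAT = 111.32  # approximate length of one degree of latitude


@dataclass
class ClusterParams:
    center_lat: float
    center_lon: float
    radius_km: float
    magnitude: int            # total number of simulated cases
    duration_days: int
    progression: str = "flat"  # "flat" or "linear" (cases increase daily)


def generate_cluster(params, seed=None):
    """Return a list of (day, latitude, longitude) tuples for one cluster."""
    rng = random.Random(seed)
    if params.progression == "linear":
        weights = [day + 1 for day in range(params.duration_days)]
    else:
        weights = [1] * params.duration_days
    # Assign each case to a day according to the temporal progression.
    days = rng.choices(range(params.duration_days), weights=weights,
                       k=params.magnitude)
    km_per_degree_lon = KM_PER_DEGREE_LAT * math.cos(
        math.radians(params.center_lat))
    patients = []
    for day in sorted(days):
        # sqrt keeps points uniformly distributed over the disc's area.
        distance = params.radius_km * math.sqrt(rng.random())
        angle = rng.uniform(0, 2 * math.pi)
        lat = params.center_lat + distance * math.sin(angle) / KM_PER_DEGREE_LAT
        lon = params.center_lon + distance * math.cos(angle) / km_per_degree_lon
        patients.append((day, lat, lon))
    return patients


# Illustrative parameters only (a point in the Boston area).
params = ClusterParams(42.34, -71.10, radius_km=2.0, magnitude=25,
                       duration_days=7, progression="linear")
cluster = generate_cluster(params, seed=1)
```

Fixing the seed makes a generated cluster reproducible, which is useful when the same test case must be presented to several detection algorithms.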

Results: The open-source program accepts a valid set of patient test cluster parameters and creates geospatial patient test data for either a single cluster or a series of clusters, automating the creation of valid patient data sets. Output is written as files containing patient longitude and latitude coordinates. When used with geographic information system software, these clusters can be displayed on a map (Figure). In testing, all generated clusters were properly created within the parameters set at program execution. The cluster generator is in use for rigorous testing of outbreak-detection algorithms.
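The abstract does not specify the output file layout. As a sketch under the assumption of a simple comma-separated format with a header row, patient coordinates could be written as follows; the function name `write_cluster` and the column names are hypothetical.

```python
import csv
import io


def write_cluster(patients, handle):
    """Write (longitude, latitude) pairs to an open text handle as CSV."""
    writer = csv.writer(handle)
    writer.writerow(["longitude", "latitude"])
    for lon, lat in patients:
        writer.writerow([f"{lon:.6f}", f"{lat:.6f}"])


# Illustrative coordinates only.
patients = [(-71.105, 42.337), (-71.101, 42.340)]

# An in-memory buffer stands in for a file opened with open(path, "w", newline="").
buffer = io.StringIO()
write_cluster(patients, buffer)
csv_text = buffer.getvalue()
```

A file in this form can typically be imported as a point layer by geographic information system software for display on a map.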

Conclusions: Automated generation of semisynthetic data sets facilitates evaluation of public health surveillance systems for early detection of outbreaks.


Figure. [Image not included in this text version.]

Use of trade names and commercial sources is for identification only and does not imply endorsement by the U.S. Department of Health and Human Services.


References to non-CDC sites on the Internet are provided as a service to MMWR readers and do not constitute or imply endorsement of these organizations or their programs by CDC or the U.S. Department of Health and Human Services. CDC is not responsible for the content of pages found at these sites. URL addresses listed in MMWR were current as of the date of publication.

Disclaimer   All MMWR HTML versions of articles are electronic conversions from ASCII text into HTML. This conversion may have resulted in character translation or format errors in the HTML version. Users should not rely on this HTML document, but are referred to the electronic PDF version and/or the original MMWR paper copy for the official text, figures, and tables. An original paper copy of this issue can be obtained from the Superintendent of Documents, U.S. Government Printing Office (GPO), Washington, DC 20402-9371; telephone: (202) 512-1800. Contact GPO for current prices.

Questions or messages regarding errors in formatting should be addressed to mmwrq@cdc.gov.

Page converted: 9/14/2004


Morbidity and Mortality Weekly Report
Centers for Disease Control and Prevention
1600 Clifton Rd, MailStop E-90, Atlanta, GA 30333, U.S.A


This page last reviewed 9/14/2004