Public health surveillance data: legal, policy, ethical, regulatory, and practical issues.
In the United States, data systems are created by the ongoing, systematic collection of health, demographic, and other information through federally funded national surveys, vital statistics, public and private administrative and claims data, regulatory data, and medical records data. Certain data systems are designed to support public health surveillance and use well-defined protocols and standard analytic methods for assessing specific health outcomes, exposures, or other endpoints. Other data systems were designed for a different purpose but can be used by public health programs for surveillance. Several public health surveillance programs rely substantially on data systems maintained by others. Vital statistics are one example of data collected for another reason but used for surveillance purposes. CDC's National Center for Health Statistics (NCHS) purchases, aggregates, and disseminates vital statistics (birth and death rates) that are collected at the state level. These data are used to understand disease burden, monitor trends, and guide public health action. Administrative data also can be used for surveillance purposes; for example, Medicare and Social Security Disability data have been linked to survey data to monitor changes in health and health-care use over time. Some data can be released easily to others with few or no restrictions. These include public use data sets and some regulatory or administrative data; although these files were restricted at some point, they were altered to protect respondent confidentiality and privacy. Public use data sets can be shared with everyone because they do not contain personally identifiable information (PII) and any information that would allow identification of a person has been removed. Other data cannot be released under almost any circumstances because they are highly sensitive and considered classified for security purposes.
Data that allow identification of persons, whether collected by surveillance programs or by other programs, can be shared only if regulation or legislation allows. PII usually is needed to link these data to other data sets or to identify persons with a specific health condition or disease. In all but the most unusual circumstances, identifiable data are maintained at the level of data collection, where public health surveillance interventions occur, usually the local or state level. Collaborative efforts to meet the needs of public health surveillance programs and other initiatives, programs, or objectives (e.g., information on payment, increased use of medical records, or evaluation of effectiveness of treatment) can maximize the utility of data collected. Information from disparate sources or programs often sheds light on patterns that individual program data cannot. Furthermore, appropriate use of data sets collected for multiple purposes can, in some instances, be more cost-effective than the collection of new data targeted at a specific condition or health event. The ability of data stewards to share data with surveillance or other programs depends on several factors: 1) the rules and regulations governing how and why the data are collected and released, 2) the availability of resources to put the data into a form that can be shared, and 3) the willingness to use those resources. One way to distribute previously restricted data is to make them unrestricted (e.g., by perturbing the data or releasing pretabulated, aggregated estimates that preserve confidentiality). This report proposes a vision for improving access to and sharing of data useful for public health surveillance, identifies challenges and opportunities, and suggests approaches to attain the vision.
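As an illustration of releasing pretabulated, aggregated estimates that preserve confidentiality, the following minimal sketch counts records by category and withholds any cell below a minimum size. It is not drawn from this report: the function name `tabulate_with_suppression`, the field names, the sample records, and the threshold of 5 are all hypothetical, and actual disclosure rules are set by each data steward.

```python
from collections import Counter


def tabulate_with_suppression(records, keys, min_cell_size=5):
    """Count records for each combination of `keys`.

    Cells with fewer than `min_cell_size` records are returned as None
    (suppressed) instead of their true count. The default threshold of 5
    is an illustrative assumption, not a rule taken from the report.
    """
    counts = Counter(tuple(record[k] for k in keys) for record in records)
    return {
        cell: (n if n >= min_cell_size else None)
        for cell, n in counts.items()
    }


# Hypothetical person-level records with direct identifiers already removed.
records = (
    [{"county": "A", "age_group": "65+"}] * 12
    + [{"county": "A", "age_group": "18-64"}] * 3
    + [{"county": "B", "age_group": "65+"}] * 7
)

table = tabulate_with_suppression(records, keys=("county", "age_group"))
for cell, n in sorted(table.items()):
    print(cell, "suppressed" if n is None else n)
```

Suppressing small cells alone may not be sufficient: if row or column totals are also published, a suppressed count can sometimes be recovered by subtraction, so additional (complementary) suppression or perturbation may be needed.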
This topic was identified by CDC leadership as one of six major concerns that must be addressed by the public health community to advance public health surveillance in the 21st century. The six topics were discussed by CDC workgroups that were convened as part of the 2009 Surveillance Consultation to advance public health surveillance to meet continuing and new challenges (1). This report is based on workgroup discussions and is intended to continue the conversation with the public health community toward a shared vision for public health surveillance in the 21st century.
Amy B Bernstein, ScD, Office of Analysis and Epidemiology, National Center for Health Statistics, CDC, 3311 Toledo Road, Room 6214, Hyattsville, MD 20782