Using Data Matching to Facilitate Collaboration and Data Integration

PCSI Success Stories

Strengthening Collaboration and Service Integration in Philadelphia

At the Philadelphia Department of Public Health (PDPH), HIV/AIDS, STD, tuberculosis, and viral hepatitis data are stored in individual surveillance systems that are not electronically linked. As a result, data were not shared between programs, and co-infection status and other information that could help in treating and caring for a patient and in tracking disease trends were not available. Some of the disease-specific systems were also not equipped to store data on associated infections.

In May 2008, this began to change when a PDPH Program Collaboration and Service Integration (PCSI) workgroup was established in response to a CDC green paper that described PCSI as a mechanism for organizing and blending interrelated health issues, activities, and prevention strategies to facilitate a comprehensive delivery of services. Data integration was an area identified early in the discussions, and in 2010, when PDPH received funding through CDC's PCSI Cooperative Agreement, changes began to take place.

Registry Matching Success

A full-time epidemiologist was hired to work across multiple program areas to routinely match data across HIV/AIDS, STDs, tuberculosis, and viral hepatitis, and to analyze the matched data for various projects. The STD, hepatitis, HIV/AIDS, and tuberculosis registries were matched for 2000 through 2010. This data match led to improved data quality and data collection across data systems and established baseline co-infection rates in Philadelphia.
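The source does not describe how PDPH matched its registries, so the following is only a minimal sketch of one common approach: deterministic matching on a normalized name and date of birth, followed by a simple co-infection rate. The field names (`last_name`, `first_name`, `dob`), the function names, and the sample records are all hypothetical and fictional.

```python
from datetime import date


def match_key(record):
    """Build a normalized key from name and date of birth (assumed fields)."""
    return (
        record["last_name"].strip().lower(),
        record["first_name"].strip().lower(),
        record["dob"],
    )


def match_registries(registry_a, registry_b):
    """Return (a, b) pairs of records that share the same match key."""
    index = {}
    for rec in registry_b:
        index.setdefault(match_key(rec), []).append(rec)
    pairs = []
    for rec in registry_a:
        for other in index.get(match_key(rec), []):
            pairs.append((rec, other))
    return pairs


def coinfection_rate(registry_a, registry_b):
    """Share of distinct persons in registry_a who also appear in registry_b."""
    keys_a = {match_key(r) for r in registry_a}
    keys_b = {match_key(r) for r in registry_b}
    if not keys_a:
        return 0.0
    return len(keys_a & keys_b) / len(keys_a)


# Fictional records, for illustration only.
hiv_registry = [
    {"last_name": "Doe", "first_name": "Alex", "dob": date(1980, 1, 2)},
    {"last_name": "Roe", "first_name": "Sam", "dob": date(1975, 5, 6)},
]
std_registry = [
    {"last_name": " doe", "first_name": "ALEX ", "dob": date(1980, 1, 2)},
    {"last_name": "Poe", "first_name": "Kim", "dob": date(1990, 3, 4)},
]

pairs = match_registries(hiv_registry, std_registry)
rate = coinfection_rate(hiv_registry, std_registry)
print(len(pairs), rate)
```

Exact-key matching of this kind misses records with misspelled names or incorrect dates, so real registry matches often add probabilistic or manual review steps; that choice is outside what the source describes.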

Benefits from Data Matching

This data integration and matching capability led to numerous improvements, including:

  • The correction of discrepant risk factor information between various systems through the discovery of different means of data collection. Previously, some individuals were characterized as men who have sex with men (MSM) in one system and men who have sex with women (MSW) in another. The STD database collects risk-factor information through direct patient interviews whereas the HIV database collects risk-factor information through medical record abstraction. Finding these issues led to important data collection discussions for both programs.
  • Improvement of data quality by first identifying issues and then correcting them. Before 2005, HIV was reported anonymously to PDPH. Once name-based reporting was implemented, the HIV program tried to match individuals with their records, which resulted in the identification of inconsistencies related to gonorrhea as a predictor of HIV. The program discovered that some individuals were recorded in the HIV system as HIV positive but were also recorded in the STD system as having subsequent negative tests. Data for these individuals were given to the HIV program to review and correct.
  • Transfer of demographic and risk-factor information for an individual from one system to another. While race is missing for nearly a third of individuals in the STD database, it is almost 100% complete in the HIV database. As a result of data matching, more complete information is now available for co-infected persons in both databases.
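The checks described in the list above can be sketched as follows. This is an illustrative example under stated assumptions, not PDPH's procedure: records are plain dictionaries, matched pairs are already available, and the field names (`risk`, `race`) and both helper functions are hypothetical. Conflicting values are only flagged for program review, while a missing value is copied from the matched record that has it.

```python
def find_discrepancies(matched_pairs, field):
    """Return pairs where both records have a value for `field` but disagree."""
    return [
        (a, b)
        for a, b in matched_pairs
        if a.get(field) and b.get(field) and a[field] != b[field]
    ]


def fill_missing(matched_pairs, field):
    """Copy `field` into whichever record of a pair lacks it.

    Returns the number of records that were filled in.
    """
    filled = 0
    for a, b in matched_pairs:
        if a.get(field) and not b.get(field):
            b[field] = a[field]
            filled += 1
        elif b.get(field) and not a.get(field):
            a[field] = b[field]
            filled += 1
    return filled


# Fictional matched pairs: (HIV record, STD record).
hiv_1 = {"id": "H1", "risk": "MSM", "race": "White"}
std_1 = {"id": "S1", "risk": "MSW", "race": "White"}
hiv_2 = {"id": "H2", "risk": "MSM", "race": "Black"}
std_2 = {"id": "S2", "risk": "MSM", "race": None}
matched = [(hiv_1, std_1), (hiv_2, std_2)]

flagged = find_discrepancies(matched, "risk")  # conflicting risk factors
filled = fill_missing(matched, "race")         # race copied where missing
print(len(flagged), filled, std_2["race"])
```

Flagging rather than overwriting conflicting values mirrors the source, where discrepant records were handed to the program for review instead of being changed automatically.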

Successes and Future Directions

PCSI data registry matching clearly indicated the importance of data sharing to improve surveillance and individual patient data. Disease status and risk-factor information are key to ensuring that persons are provided with the best care possible.

The PDPH will continue these successful registry matches and expand them beyond HIV/AIDS, STDs, tuberculosis and viral hepatitis to include birth, cancer, and death registry matches.

For more information, please contact:
Caroline Johnson
PCSI Champion
Division of Disease Control
Philadelphia Department of Public Health
215-685-6740


Page last reviewed: April 28, 2014