Accelerating Data into Action
When people’s lives are on the line, connected and integrated data helps us put the pieces together faster and take action to protect health.

With this priority, we’re tapping into more data sources, promoting health equity, and increasing capacities for scalable outbreak response, forecasting, and predictive analytics.
- Health Equity: We continue to bridge the gap between the data we have now and the data we need to fully understand and address the drivers of health disparities.
- Data Linkage: We’re combining traditional surveillance data with non-traditional data, such as geospatial, social vulnerability, and administrative data, to uncover disease impacts.
- Interoperability: We continue working toward shared data standards, such as FHIR, that connect previously disconnected data systems and create hubs for rapid, bidirectional data exchange.
- Privacy and Security: We use new Privacy Preserving Record Linkage (PPRL) technology to keep personal information protected and to safely link and share health data.
- Open Data: We provide more data directly to the public and to researchers for faster insights on COVID-19, health equity, and other priorities.
- Common Operating Picture: We build updated platforms that bring trusted, real-time data together in one place for easier analysis during an outbreak or other public health emergency.
- Scalable Emergency Response: We increase the use of systems that can be rapidly scaled up when needed, so that the same system can be used for 300 or 3 million cases.
- Forecasting and Outbreak Analytics: We launched a new National Center for Forecasting and Outbreak Analytics that will allow us to predict, inform, and innovate to fight any disease.
CDC is expanding its surveillance platform for multiple respiratory illnesses in ways that will better prepare us for the next outbreak or pandemic, including outbreaks of diseases like flu, measles, mumps, and Legionnaires’ disease. For example, the new Legionella System for Outbreak Response, Coordination, and Surveillance (SOURCE) will automatically process data that traditionally had to be entered by hand, including reports on notifiable diseases, cases, laboratory results, outbreaks, and real-time investigations. Flu SOURCE, Measles SOURCE, and Mumps SOURCE are also in development.
Explore a few of the many projects and activities that are accelerating public health data into action.
HHS Protect offers a “common operating picture” where we can share data in near-real time.
Effective public health means equitable public health. Data can help us get there.
Public datasets, like those on data.cdc.gov, promote open government, transparency, research, and efficiency.
Data modernization is critical to growing the nation’s data, modeling, and analytics capabilities.
CDC is working closely with partners to help public health data “speak the same language.”
Collaboration drives cutting-edge solutions to get better, faster, more complete, and more accurate data to state and local public health and other partners.
Linked data files enable researchers to examine factors that influence disability, chronic disease, health care utilization, morbidity, and mortality.
Advanced tools like machine learning, artificial intelligence, and natural language processing are helping to solve complex public health problems.
Real-time patient data is needed to understand what’s happening in emergencies like COVID-19, mpox, and the opioid overdose crisis. Understanding the true picture requires collecting and linking a large amount of data. At the same time, patients need their information to be safe and protected at all times. That’s where Privacy Preserving Record Linkage (PPRL) comes in. During the COVID-19 response, CDC’s Immunization Data Lake demonstrated one of the first applications of PPRL at scale, which substantially improved data quality. As of late 2021, PPRL had been used to securely track more than 86 million vaccine doses. Through this solution, CDC has been able to expand vaccine effectiveness data and push that data out to the public faster through the COVID Data Tracker.
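To give a rough sense of the general idea behind privacy-preserving linkage, here is a minimal sketch in Python. It is not CDC's implementation, and this page does not describe which PPRL method CDC uses. The sketch assumes a simple keyed-hash tokenization approach: each data holder turns identifying fields into a token with a shared secret key, and records are then matched on tokens alone. The key, field names, function names, and sample records are all hypothetical and made up for illustration.

```python
import hashlib
import hmac

# Hypothetical shared secret. In practice a key like this would be managed
# by a trusted party and never published alongside the data.
SECRET_KEY = b"example-shared-secret"


def normalize(value: str) -> str:
    """Lowercase and remove whitespace so formatting differences do not block a match."""
    return "".join(value.lower().split())


def make_token(first: str, last: str, dob: str, key: bytes = SECRET_KEY) -> str:
    """Return a keyed hash (HMAC-SHA256) of the normalized identifiers."""
    message = "|".join(normalize(v) for v in (first, last, dob)).encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def tokenize(records):
    """Replace identifying fields with a token, keeping only the non-identifying data."""
    return [
        {"token": make_token(r["first"], r["last"], r["dob"]), **r["data"]}
        for r in records
    ]


def link(left, right):
    """Join two tokenized datasets on their tokens."""
    index = {row["token"]: row for row in right}
    return [{**row, **index[row["token"]]} for row in left if row["token"] in index]


# Made-up records held by two different (hypothetical) data sources.
clinic = [
    {"first": "Ana", "last": "Diaz", "dob": "1980-02-14", "data": {"dose_date": "2021-03-01"}},
    {"first": "Sam", "last": "Lee", "dob": "1975-07-30", "data": {"dose_date": "2021-03-05"}},
]
lab = [
    {"first": " ana", "last": "DIAZ ", "dob": "1980-02-14", "data": {"test_result": "negative"}},
]

# Each source tokenizes its own records; only tokenized rows are compared.
linked = link(tokenize(clinic), tokenize(lab))
print(linked)
```

In this sketch, the two records for the same made-up person are matched even though the names are formatted differently, and the linked output carries no name or date of birth. Production PPRL systems typically go further, for example by tolerating typos in identifiers, which exact-match tokens like these cannot do.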
By December 2021, CDC’s new Immunization Data Lake (IZDL) had processed and stored more than 530 million unique vaccine administration records.
As of October 2022, CDC had published 1.9 million SARS-CoV-2 sequences with information available on CDC’s COVID Data Tracker.
More than 2.3 million users visited data.cdc.gov in the last year, meaning more scientists, researchers, and decision-makers are finding the data they need.
The Environmental Health Tracking Network has more than 2.7 million interactive maps, tables, and charts covering 525 environmental health measures available for sharing and embedding.