CDC’s COVID-19 Data Improvement
Learn the eight core goals public health surveillance of COVID-19 works to advance.
Data drives decisions in public health, and especially at CDC. Good data across our nation’s public health system is critical. However, years of under-investment in our data have led to many places in America that remain underserved by public health. To respond, CDC is navigating the challenges of COVID-19 surveillance while at the same time improving the nation’s posture for the next public health emergency.
Responding to COVID-19 requires many data sources to reveal the true picture of what is happening and drive public health action. No one data source gives CDC all the information that scientists and researchers need. CDC has been relying on many data sources, new and old, including data on cases, deaths, laboratory tests, emergency department visits, hospitalizations, hospital capacities, healthcare data, variants, vaccine administrations, surveys, cohort studies, serology studies, mobility data, and many more. Some of these data are reported to CDC by states, while others CDC collects itself through field-based studies.
The central challenge of public health is to take these vast data—delivered at different times, through different channels and intermediaries, and of different quality and completeness—and turn them into useful, actionable information to improve the nation’s response.
Public health surveillance of COVID-19 works to advance eight core goals:
- Monitor trends and intensity of SARS-CoV-2 transmission, identify outbreaks, and provide data to initiate case and contact investigations
- Understand disease severity and the spectrum of illness
- Monitor and track vaccine distribution, uptake, and effectiveness
- Describe risk factors for severe disease and transmission
- Monitor for variants
- Assess impact on health care systems
- Estimate disease burden, and forecast trends, impact, and clinical and public health needs
- Monitor impact of disease and interventions on health equity
CDC is Innovating
Over the course of the pandemic, CDC has been improving the timeliness, completeness, and quality of critical data for the response. Significant developments include accelerating case reports, creating shared data and analytic capabilities, and launching a comprehensive modernization initiative to transform data for COVID-19 and beyond.
Accelerating Electronic Case Reporting
Since the outset of the pandemic, CDC has massively expanded Electronic Case Reporting capabilities. Electronic Case Reporting is one way data comes from healthcare providers and public health agencies to CDC.
CDC in Action
At the beginning of 2020, only a handful of healthcare facilities and states were even capable of using Electronic Case Reporting. As of August 2021, all 50 states, Washington, D.C., Puerto Rico, and 12 large local jurisdictions are capable of receiving electronic case reports.
Before Electronic Case Reporting, reporting was slow and often relied on paper-based systems and fax machines. This hampered CDC’s ability to make quick decisions. Now, Electronic Case Reporting automatically generates and sends relevant information from electronic health records to public health agencies. This has resulted in earlier disease detection and intervention as well as richer, more useful data to drive decisions. Before COVID-19, just 187 healthcare facilities were using Electronic Case Reporting. As of August 2021, more than 9,400 healthcare facilities in all 50 states can send COVID-19 electronic case reports. CDC has provided resources to state and local partners to help modernize their systems and optimize the use of Electronic Case Reporting and other automated electronic data.
Sharing Data Analytics and Visualization Capabilities
To advance public health science, CDC created a cloud-based suite of technology, tools, and resources that collects, organizes, and connects data across the agency. It offers the agency a streamlined way to process data, store it, and visualize it. It provides the means for breaking down data silos in favor of a centralized data ecosystem and allows CDC scientists to catalogue, analyze, and publish findings faster than previously possible. To date, CDC has saved more than $8 million in infrastructure investments that would otherwise have gone toward building smaller, separate data silos.
CDC in Action
When the pandemic struck, there was no national system that could track both positive and negative test results. CDC and partners expanded Electronic Laboratory Reporting at breakneck speeds to deliver more than 1 million records per day directly from jurisdictions to CDC.
A fully modernized public health data infrastructure and workforce is required to bring these diverse data sources into an integrated, coherent whole.
CDC in Action
Data modernization investments have facilitated the analysis of more than 800,000 unique SARS-CoV-2 genomes using new computational capabilities.
In 2020, CDC launched the Data Modernization Initiative (DMI), a major multi-year effort to modernize core data and surveillance infrastructure across the federal and state public health landscape. Modern data creates value at the state and local level and in communities by making the information needed for decisions higher quality, more complete, more accessible, and more representative of all people. CDC has laid out priorities and plans to strengthen and unify critical public health infrastructure, accelerate data into action, grow a state-of-the-art workforce, support and extend partnerships, and manage change and governance to support strategic innovation. The goal is a high-speed, integrated public health infrastructure that can prevent problems before they happen and reduce the harm from the problems that do happen.
CDC in Action
We have answered the public’s needs for information by applying machine learning, natural language processing, and artificial intelligence to problems such as identifying cases of multisystem inflammatory syndrome in children (MIS-C) and by creating the “Clara” COVID chatbot.
Problems CDC is Solving Now
There are several examples of how modernization and innovation during the pandemic have allowed CDC to use data to solve problems in ways we could not before. Many of these innovations will serve CDC well into the future as we face both known and unforeseen public health challenges. Because of improvements made, CDC can now:
- Share real-time data from many sources: The CDC COVID Data Tracker was developed in April 2020 to integrate data from multiple core surveillance systems with the goal of creating a “one-stop shop” for COVID data that is viewable and available to everyone. The COVID Data Tracker averages about 3-4 million views a week.
- Respond more flexibly in any crisis: Moving public health into the cloud not only makes data more accessible but allows us to scale up rapidly for emergencies without changing systems.
- Open up more data to the public: Using new privacy technologies has allowed CDC to release COVID-19 public use data sets that increase transparency and let the world help us figure out the big problems facing public health.
- Apply new tools for public health: CDC is standing up a new center to advance the use of forecasting and outbreak analytics in public health decision-making, with the goal of supporting more efficient and effective outbreak responses.
What is Next for the Data
CDC is taking action to address immediate COVID-19 surveillance needs by:
- Developing a regular synthesis of all the complex data to make it easier for people to know what is happening and understand the data being generated.
- Monitoring vaccine effectiveness and “breakthrough” COVID-19 infections and disseminating this information to the public and policymakers. This will require:
  - Advancing work on Privacy-Preserving Record Linkage (PPRL) to allow use of patient-level, real world healthcare data.
  - Helping jurisdictions link case data to immunization information system data to better track breakthrough infections and monitor trends in cases and severe disease over time.
  - Launching a new webpage on the COVID Data Tracker to monitor vaccine effectiveness and vaccine impact by age group, underlying medical conditions, time since vaccination, severity, and product.
  - Working with NIH to expand the capacity of our vaccine effectiveness platforms to detect changes in effectiveness more rapidly.
- Enhancing pediatric hospitalization data by working with interagency partners to implement data collection guidance for the Unified Hospital Data Surveillance System, which is a primary resource for hospitalization data.
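For readers curious how the record linkage described above could work in principle, here is a minimal sketch. It is an illustration only, not CDC's method: it assumes a simple keyed-hash form of Privacy-Preserving Record Linkage with exact matching, whereas production approaches often use fuzzier techniques. The secret key, field names, sample records, and the 14-day threshold are all hypothetical choices made for this example.

```python
import hashlib
import hmac
from datetime import date, timedelta

# Hypothetical shared secret. In practice each data holder would receive a key
# through a secure channel; this value is for illustration only.
SECRET_KEY = b"example-key-not-for-real-use"


def make_token(first_name: str, last_name: str, birth_date: date) -> str:
    """Turn identifying fields into a keyed hash so that records can be
    matched without exchanging the identifiers themselves."""
    normalized = "|".join([
        first_name.strip().lower(),
        last_name.strip().lower(),
        birth_date.isoformat(),
    ])
    return hmac.new(SECRET_KEY, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


def link_breakthrough_cases(cases, vaccinations, min_days=14):
    """Return tokens of cases whose specimen date falls at least `min_days`
    after the matching person's vaccination series was completed.
    The 14-day default is an assumption made for this sketch."""
    completed = {v["token"]: v["series_complete"] for v in vaccinations}
    linked = []
    for case in cases:
        done = completed.get(case["token"])
        if done is not None and case["specimen_date"] - done >= timedelta(days=min_days):
            linked.append(case["token"])
    return linked


# Made-up records: a case file and an immunization file that share no
# identifiers in the clear, only tokens.
cases = [
    {"token": make_token("Ana", "Lopez", date(1980, 5, 1)), "specimen_date": date(2021, 8, 10)},
    {"token": make_token("Sam", "Lee", date(1975, 2, 3)), "specimen_date": date(2021, 8, 12)},
]
vaccinations = [
    {"token": make_token(" ana ", "LOPEZ", date(1980, 5, 1)), "series_complete": date(2021, 4, 1)},
]

breakthrough = link_breakthrough_cases(cases, vaccinations)
```

Because both files are tokenized with the same key and the same normalization, the first case matches its vaccination record even though the names were entered with different spacing and capitalization. The second case has no vaccination record and is not linked.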
Focus on Health Equity
Because health equity is a priority, we will keep working to provide greater access to data, incorporate new and non-traditional data sources, and increase our focus on behavioral health and race and ethnicity data. CDC will continue to use data to measure and address structural inequities and prioritize the socioeconomic variables that will help guide resources where they’re needed most. CDC’s COVID-19 Response Health Equity Strategy prioritizes data-driven approaches to expand the evidence base for examining health and social inequities.
On a broader scale, we must continue to evolve policies and processes as a nation that better support the public health mission. We need a common approach to collect the kinds of data that are helpful to all, such as COVID hospitalizations, breakthrough hospitalizations, or race and ethnicity stratification of cases. We must bridge the current workforce crisis and reverse its decades-long erosion in ways that can sustain our skills for the long term, both at CDC and with our state and local partners. And we must bring public health into the healthcare ecosystem through better interoperability, while reducing the burden on providers of data in healthcare and at state and local health departments.