Excess Deaths Associated with COVID-19
Provisional Death Counts for COVID-19
NOTICE: THIS WEBSITE WAS ARCHIVED ON SEPTEMBER 27, 2023.
Datasets linked on this page are available on data.cdc.gov. Please note that these datasets will no longer be updated after September 27, 2023. Provisional data is available on CDC WONDER (wonder.cdc.gov). Data are updated weekly, and users can query data by a variety of demographic, geographic, and temporal options. Please direct questions and inquiries to firstname.lastname@example.org with the subject line “NVSS Mortality Surveillance Data”.
Estimates of excess deaths can provide information about the burden of mortality potentially related to the COVID-19 pandemic, including deaths that are directly or indirectly attributed to COVID-19. Excess deaths are typically defined as the difference between the observed numbers of deaths in specific time periods and expected numbers of deaths in the same time periods. This visualization provides weekly estimates of excess deaths by the jurisdiction in which the death occurred. Weekly counts of deaths are compared with historical trends to determine whether the number of deaths is significantly higher than expected.
Counts of deaths from all causes of death, including COVID-19, are presented. As some deaths due to COVID-19 may be assigned to other causes of deaths (for example, if COVID-19 was not diagnosed or not mentioned on the death certificate), tracking all-cause mortality can provide information about whether an excess number of deaths is observed, even when COVID-19 mortality may be undercounted. Additionally, deaths from all causes excluding COVID-19 were also estimated. Comparing these two sets of estimates — excess deaths with and without COVID-19 — can provide insight about how many excess deaths are identified as due to COVID-19, and how many excess deaths are reported as due to other causes of death. These deaths could represent misclassified COVID-19 deaths, or potentially could be indirectly related to the COVID-19 pandemic (e.g., deaths from other causes occurring in the context of health care shortages or overburdened health care systems).
Estimates of excess deaths can be calculated in a variety of ways, and will vary depending on the methodology and assumptions about how many deaths are expected to occur. Estimates of excess deaths presented in this webpage were calculated using Farrington surveillance algorithms (1). A range of values for the number of excess deaths was calculated as the difference between the observed count and one of two thresholds (either the average expected count or the upper bound of the 95% prediction interval), by week and jurisdiction.
Provisional death counts are weighted to account for incomplete data. However, data for the most recent week(s) are still likely to be incomplete. Weights are based on completeness of provisional data in prior years, but the timeliness of data may have changed in 2020 relative to prior years, so the resulting weighted estimates may be too high in some jurisdictions and too low in others. As more information about the accuracy of the weighted estimates is obtained, further refinements to the weights may be made, which will impact the estimates. Any changes to the methods or weighting algorithm will be noted in the Technical Notes when they occur. More detail about the methods, weighting, data, and limitations can be found in the Technical Notes.
This visualization includes several different estimates:
- Number of excess deaths: A range of estimates for the number of excess deaths was calculated as the difference between the observed count and one of two thresholds (either the average expected count or the upper bound threshold), by week and jurisdiction. Negative values, where the observed count fell below the threshold, were set to zero.
- Percent excess: The percent excess was defined as the number of excess deaths divided by the threshold.
- Total number of excess deaths: The total number of excess deaths in each jurisdiction was calculated by summing the excess deaths in each week, from February 1, 2020 to present. Similarly, the total number of excess deaths for the US overall was computed as a sum of jurisdiction-specific numbers of excess deaths (with negative values set to zero), and not directly estimated using the Farrington surveillance algorithms.
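The three estimates above can be sketched in a few lines of Python. This is a minimal illustration, not the code used to produce the dashboards: the `WeekEstimate` class, the function names, and the weekly numbers are hypothetical, and the expected counts and upper bounds are assumed to have already been produced by the Farrington algorithms.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class WeekEstimate:
    observed: float     # weighted (predicted) weekly count of deaths
    expected: float     # average expected count
    upper_bound: float  # upper bound of the 95% prediction interval


def excess(observed: float, threshold: float) -> float:
    """Excess deaths relative to one threshold; negative values are set to zero."""
    return max(0.0, observed - threshold)


def percent_excess(observed: float, threshold: float) -> float:
    """Excess deaths divided by the threshold, expressed as a percentage."""
    return 100.0 * excess(observed, threshold) / threshold


def excess_range(week: WeekEstimate) -> Tuple[float, float]:
    """Return (lower end, higher end): vs. the upper bound, vs. the average expected count."""
    return (
        excess(week.observed, week.upper_bound),
        excess(week.observed, week.expected),
    )


# Hypothetical weekly values for one jurisdiction.
weeks = [
    WeekEstimate(observed=1200, expected=1000, upper_bound=1100),
    WeekEstimate(observed=950, expected=1000, upper_bound=1100),   # below both thresholds
    WeekEstimate(observed=1050, expected=1000, upper_bound=1100),  # above the average only
]

# Totals are sums of the weekly (already zero-floored) excess values.
total_low = sum(excess_range(w)[0] for w in weeks)
total_high = sum(excess_range(w)[1] for w in weeks)
print(total_low, total_high)
```

Because negative weekly values are floored at zero before summing, a week with fewer deaths than the threshold never offsets an excess in another week.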
Select a dashboard from the menu, then click on “Update Dashboard” to navigate through the different graphics.
- The first dashboard shows the weekly predicted counts of deaths from all causes, and the threshold for the expected number of deaths. Select a jurisdiction from the drop-down menu to show data for that jurisdiction.
- The second dashboard shows the weekly predicted counts of deaths from all causes and the weekly count of deaths from all causes excluding COVID-19. Select a jurisdiction from the drop-down menu to show data for that jurisdiction.
- The third dashboard shows the weekly counts of deaths from all causes. Predicted counts (weighted) are shown, along with reported (unweighted) counts, to illustrate the impact of underreporting. Select a jurisdiction from the drop-down menu to show data for that jurisdiction.
- The fourth dashboard shows the total number of excess deaths since early February 2020. Jurisdictions with one or more excess deaths are shown. Use the radio button to select all-cause mortality, or all-cause excluding COVID-19. Use the drop-down menu to select certain jurisdictions.
- The fifth dashboard shows the percent by which the observed counts exceed the threshold (i.e., percent excess) by week and jurisdiction. Use the radio button to select all-cause mortality, or all-cause excluding COVID-19. Use the drop-down menu to select certain jurisdictions.
- The sixth dashboard shows weekly counts of death by age group. Use the drop-down menu to select certain jurisdictions.
- The seventh dashboard shows weekly counts of death by race and Hispanic origin. Use the drop-down menus to select certain jurisdictions and mortality outcomes (e.g., all-cause mortality, all-cause excluding COVID-19, and COVID-19 deaths).
- The eighth dashboard shows the change in the weekly number of deaths in 2020 relative to 2015-2019, by race and Hispanic origin. Use the drop-down menu to select certain jurisdictions.
- The ninth dashboard shows weekly counts of death due to select cause of death groups (Respiratory diseases, Circulatory diseases, Malignant neoplasms, and Alzheimer disease and dementia). Use the drop-down menu to select a jurisdiction.
- The tenth dashboard shows weekly counts of death for more detailed causes of death within two of the larger groups: Respiratory diseases and Circulatory diseases. Use the drop-down menus to select causes of death and certain jurisdictions.
- The eleventh dashboard shows the change in the weekly number of deaths in 2020 relative to 2015-2019, by cause of death. Use the drop-down menu to select certain jurisdictions.
- The twelfth dashboard shows the total number of deaths above the average count since early February 2020, by cause of death. Use the drop-down menu to select certain jurisdictions.
- The thirteenth dashboard shows the total number of deaths above the average count since early February 2020, by jurisdiction and cause of death. Use the drop-down menu to select certain jurisdictions.
Download datasets in CSV format by clicking the link for the desired dataset under “CSV Format”. Additional file formats are available for download for each dataset at Data.CDC.Gov.
On March 15, 2023, the methodology for estimating excess deaths was updated. Under the previous approach, approximately 160 weeks of data during the pandemic were excluded from the algorithm (so that expected values were not inflated due to substantially elevated mortality during the pandemic), which resulted in unstable estimates of expected weekly numbers of deaths in some cases. To account for this limitation and provide more stable estimated expected numbers for recent time periods, the Farrington surveillance algorithms (1) were first applied to data through 2020 and used to predict the expected weekly number of deaths through 2020. To estimate the expected number of deaths for 2021, weekly counts of death above the 95% prediction interval in 2020 were replaced with imputed values, assuming that deaths (on average) in 2020 reflected the expected numbers and variability predicted by the Farrington algorithm. These imputed values were randomly drawn from a normal distribution with a mean equal to the expected number of deaths, and a standard deviation based on the distance between the expected value and the one-sided 95% prediction interval (divided by 1.65). The Farrington algorithm was then applied to predict the expected number of deaths in 2021, based on the imputed values from 2020 and observed data from 2019 and earlier. The same imputation process was carried out for weeks during 2021 that fell above the 95% prediction interval, and the expected number of deaths in 2022 was then predicted based on imputed values from 2020 through 2021. The same process was repeated for 2023, predicting the expected number of deaths based on imputed data for 2020 through 2022. This process predicts the expected number of deaths for each subsequent year, assuming that trends in deaths would have continued had the pandemic not occurred, rather than excluding the entire pandemic period (160 weeks or more) from the analysis.
As a result, trends in the expected number of deaths were smoother and more stable, with fewer aberrations in recent years. However, regardless of the methodology used, the estimated expected numbers of deaths and corresponding estimates of excess deaths are subject to a greater degree of uncertainty now that the pandemic has gone on for more than three years, because it is increasingly difficult to predict what trends in mortality would have looked like had the pandemic not occurred.
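The imputation step of the updated method can be sketched as follows. This is an illustration under stated assumptions rather than the production code: the function name, the weekly values, and the fixed seed are hypothetical, the expected counts and upper bounds are assumed to come from a Farrington fit that is not reproduced here, and any rounding of the imputed values is not specified in the text.

```python
import random


def impute_week(observed, expected, upper_bound, rng):
    """Replace a count above the 95% upper bound with a draw from a normal distribution.

    Mean = expected count; standard deviation = (upper_bound - expected) / 1.65.
    Counts at or below the upper bound are kept unchanged.
    """
    if observed <= upper_bound:
        return observed
    sd = (upper_bound - expected) / 1.65
    return rng.gauss(expected, sd)


rng = random.Random(0)  # fixed seed (an assumption) so the sketch is reproducible

# Hypothetical weekly values for one jurisdiction in 2020.
observed_2020 = [1000, 1500, 1040]
expected_2020 = [1000, 1010, 1020]
upper_2020 = [1080, 1090, 1100]

imputed_2020 = [
    impute_week(o, e, u, rng)
    for o, e, u in zip(observed_2020, expected_2020, upper_2020)
]
print(imputed_2020)
```

Only the second week exceeds its upper bound, so only that value is replaced. The imputed 2020 series, together with observed data from earlier years, would then serve as the baseline for predicting 2021.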
More information is available in the Frequently Asked Questions section.
NOTE: Visualization is optimized for a viewing screen of 950 pixels or wider (i.e., PC and tablets in landscape orientation).
Numbers of deaths reported on this page are the total numbers of deaths received and coded as of the date of analysis and do not represent all deaths that occurred in that period. Data are incomplete because of the lag in time between when the death occurred and when the death certificate is completed, submitted to NCHS, and processed for reporting purposes. This delay can range from 1 week to 8 weeks or more, depending on the jurisdiction and cause of death. See https://www.cdc.gov/nchs/nvss/vsrr/COVID19/index.htm for more information. Data for New York exclude New York City. Data on all deaths excluding COVID-19 exclude deaths with U07.1 as an underlying or multiple cause of death. Death counts were derived from the National Vital Statistics System database that provides the timeliest access to the vital statistics mortality data and may differ slightly from other sources due to differences in completeness, COVID-19 definitions used, data processing, and imputation of missing dates. Weighted estimates may be too high or too low in certain jurisdictions where the timeliness of provisional data has changed in recent weeks relative to prior years. Data for jurisdictions where counts are between 1 and 9 are suppressed.
Counts of deaths in the most recent weeks were compared with historical trends (from 2013 to present) to determine whether the number of deaths in recent weeks was significantly higher than expected, using Farrington surveillance algorithms (1). The ‘surveillance’ package in R (2) was used to implement the Farrington algorithms, which use overdispersed Poisson generalized linear models with spline terms to model trends in counts, accounting for seasonality. For each jurisdiction, a model is used to generate a set of expected counts, and an upper bound threshold based on a one-sided 95% prediction interval of these expected counts is used to determine whether a significant increase in deaths has occurred. Estimates of excess deaths are provided based on the observed number of deaths relative to two different thresholds. The lower end of the excess death estimate range is generated by comparing the observed counts to the upper bound threshold, and a higher end of the excess death estimate range is generated by comparing the observed count to the average expected number of deaths. Reported counts were weighted to account for potential underreporting in the most recent weeks.
This method is useful in detecting when jurisdictions may have higher than expected numbers of deaths, but cannot be used to determine whether a given jurisdiction has fewer deaths than expected given that the data are provisional. Provisional counts of deaths are known to be incomplete, and the degree of completeness varies considerably by jurisdiction and time. Incomplete data in recent weeks can contribute to observed counts below the threshold. Thus, the estimates of excess deaths – the numbers of deaths falling above the threshold – may be underestimated. While reported counts are weighted to account for potential underreporting in the most recent weeks, the true magnitude of underreporting is unknown. Therefore, weighted counts of deaths may over- or underestimate the true number of deaths in a given jurisdiction.
Excess deaths are calculated by comparing the observed numbers of deaths to 1) the average expected number of deaths, and 2) the upper bound of the 95% prediction interval of the expected number of deaths, by week and jurisdiction. The data files also include the upper bound of the 95% prediction interval of the expected number of deaths. Negative values, where the observed count fell below the thresholds, were set to zero. The percent excess was defined as the number of excess deaths divided by the threshold. The total number of excess deaths in each jurisdiction was calculated by summing the excess deaths in each week, from February 1, 2020 to present. Similarly, the total number of excess deaths in the US was calculated by summing the total numbers of excess deaths across the jurisdictions.
Estimates of excess deaths for the US overall were computed as a sum of jurisdiction-specific numbers of excess deaths (with negative values set to zero), and not directly estimated using the Farrington surveillance algorithms. Summation (rather than estimation) was chosen to account for the possibility that some jurisdictions may have substantially incomplete data while other jurisdictions may report more deaths than expected; these negative and positive values would cancel each other out when estimating excess deaths for the US directly using the Farrington surveillance algorithms. Until data are finalized (typically 12 months after the close of the data year), it is not possible to determine whether observed decreases in mortality using provisional data are due to true declines or to incomplete reporting. Thus, when computing excess deaths directly for the US, negative values due to incomplete reporting in some jurisdictions will offset excess deaths observed in other jurisdictions. For example, the total number of excess deaths in the US computed directly for the US using the Farrington algorithms was approximately 25% lower than the number calculated by summing across the jurisdictions with excess deaths. This difference is likely due to several jurisdictions reporting lower than expected numbers of deaths – which could be a function of underreporting, true declines in mortality in certain areas, or a combination of these factors. In addition, potential discrepancies between the number of excess deaths in the US when estimated directly compared with the sum of jurisdiction-specific estimates could be related to different estimated thresholds for the expected number of deaths in the US and across the jurisdictions.
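The offsetting effect described above can be shown with simple arithmetic. The jurisdiction labels and numbers below are hypothetical, and the "netted" figure is only a simplified stand-in for direct national estimation, which would involve fitting the Farrington algorithms to national counts and is not reproduced here.

```python
# Hypothetical observed-minus-expected differences for three jurisdictions in one week.
differences = {"A": 300.0, "B": -120.0, "C": 50.0}

# Approach used for the US total: set negative values to zero, then sum.
summed = sum(max(0.0, d) for d in differences.values())

# Simplified stand-in for direct estimation: differences offset each other first.
netted = max(0.0, sum(differences.values()))

print(summed, netted)
```

In this toy case, the shortfall in jurisdiction B (which could reflect incomplete reporting) lowers the netted figure, while the summed figure is unaffected by it.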
Different definitions of excess deaths result in different estimates. For example, defining excess deaths as the difference between the observed counts and the expected (not the upper bound estimate) results in larger estimates of excess deaths. The upper bound more readily identifies areas experiencing statistically significantly higher than normal mortality. Using the expected count, by contrast, would indicate which areas are experiencing higher than average mortality. Expected counts are provided so that users can evaluate excess deaths relative to different thresholds.
Finally, the estimates of excess deaths reported here may not be due to COVID-19, either directly or indirectly. The pandemic may have changed mortality patterns for other causes of death. Upward trends in other causes of death (e.g., suicide, drug overdose, heart disease) may contribute to excess deaths in some jurisdictions. Future analyses of cause-specific excess mortality may provide additional information about these patterns.
As more information about the accuracy of the weighted estimates is obtained, further refinements may be made and changes to the weighting methods will impact the estimates. Any changes to the methods or weighting algorithm will be noted in the Technical Notes when they occur.
Methods to address reporting lags (i.e., underreporting) were updated as of September 9, 2020. Generally, these updates resulted in estimates of the total number of excess deaths that were approximately 5% smaller than the previous method, as weights in some jurisdictions with improved timeliness were reduced. While these adjustments likely reduce potential overestimation for those jurisdictions with improved timeliness, estimates for the most recent weeks for the US overall are likely underestimated to a larger extent than in previous releases. Some jurisdictions have little to no provisional data available in the most recent week(s) (CT, NC, WV); together, these jurisdictions represent approximately 5% of US deaths. In previous releases, some of the underestimation or lack of provisional data from certain jurisdictions was offset by the overestimation in other jurisdictions with improved timeliness when considering trends for the US overall. Because the updated weighting methods mitigate the impact of the previous overestimation for some jurisdictions with improved timeliness but provide no additional adjustments for underestimation or a lack of recent provisional data in other jurisdictions, the excess death estimates for the US overall are expected to be underestimated to a larger degree than in previous releases.
To account for potential underreporting in the most recent weeks, counts were weighted by the inverse of completeness. Completeness was estimated as follows. Using provisional data from 2018-2019, weekly provisional counts were compared to final data (with final data for 2019 approximated by the data available as of April 9, 2020), at various lag times (e.g., 1 week following the death, 2 weeks, 3 weeks, up to 26 weeks) by reporting jurisdiction. Completeness by week, lag, and jurisdiction was modeled using zero-inflated binomial hierarchical Bayesian models with state-level and temporal random effects. Temporal random effects were included for both the time trend in the provisional counts, and the lag or reporting delay. These random effects were specified using a type-I random walk distribution, where counts in a given time period depend on the value for the prior time period, plus an error term. These models were implemented using R-INLA (3). Posterior predicted median values of completeness by jurisdiction and lag time were obtained from the models, and the weekly estimates for 2019 were averaged to provide the most recent possible estimates of completeness by jurisdiction, at given lag times. The inverse of these completeness values was applied as weights to adjust for incomplete reporting of provisional mortality data. For example, if provisional mortality data in 2019 for a given jurisdiction were 50% complete within 1 week of death and 75% complete within 2 weeks of death, then the weights for that jurisdiction would be 2 for data presented with a 1-week lag and approximately 1.33 for data presented with a 2-week lag. Of note, these estimates of completeness differ from the estimates provided elsewhere.
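The inverse-completeness weighting in the example above can be sketched as follows. The completeness values are taken from that example; the function name and the reported counts are hypothetical, and the Bayesian models that estimate completeness are not reproduced here.

```python
def weight(completeness: float) -> float:
    """Weight applied to a provisional count: the inverse of estimated completeness."""
    if not 0 < completeness <= 1:
        raise ValueError("completeness must be in (0, 1]")
    return 1.0 / completeness


# Estimated completeness by lag in weeks (values from the example in the text).
completeness_by_lag = {1: 0.50, 2: 0.75}
weights = {lag: weight(c) for lag, c in completeness_by_lag.items()}

# Hypothetical reported (unweighted) counts at each lag.
reported = {1: 400, 2: 600}
predicted = {lag: reported[lag] * weights[lag] for lag in reported}

print(weights)
print(predicted)
```

A count reported at 50% completeness is doubled, while a count reported at 75% completeness is scaled up by about a third.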
Weights in the first few weeks following the date of death were highly inflated and variable for some jurisdictions with relatively small numbers of deaths and where completeness of provisional data is typically very low (0–2%) in the first few weeks following the date of death. These jurisdictions include: Alaska, Connecticut, Louisiana, North Carolina, Ohio, Puerto Rico, Rhode Island, and West Virginia. To avoid highly inflated estimates in these jurisdictions, weights were trimmed at the 90th percentile for weeks reported with shorter lag times (e.g., 1–6 weeks). Additionally, as of September 9, weights for several jurisdictions were adjusted downward based on preliminary analyses of the timeliness of provisional data for deaths occurring in April through May of 2020. These analyses have suggested that timeliness has improved at shorter lags in Alaska, Mississippi, New York (excluding New York City), Ohio, Pennsylvania, South Carolina, Texas, Vermont, Virginia, West Virginia, and Puerto Rico. Weights for these jurisdictions were adjusted downward accordingly to improve the accuracy of the predicted counts.
Unweighted estimates are shown in one of the dashboards so that readers can examine the impact of weighting on estimates of excess deaths. For some jurisdictions, improvements in timeliness in 2020 relative to prior years will lead to weighted estimates that are too large. For other jurisdictions, the weighting may be insufficient to address reporting lags, particularly for data reported with shorter lag times (e.g., within 4–6 weeks). As an additional step to guard against underreporting, the weighted counts of deaths by week and jurisdiction were compared with control counts of deaths based on available demographic information from the death certificate. Demographic data are typically available prior to the cause of death data, which can take 1 week to 8 weeks or more, depending on the jurisdiction and cause of death. For weeks and jurisdictions where the weighted count of deaths was less than the control count based on the demographic data, the weighted values were replaced with the control count. For example, if the weighted count for a given jurisdiction and week was 400, while the control count for that same jurisdiction and week was 800, this indicates that the weights are not fully accounting for incomplete data. In this case, the value of 800 would be used, as it represents a more complete estimate of the total number of deaths occurring in that jurisdiction and week.
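The control-count safeguard described above amounts to taking the larger of the two values. A minimal sketch, with a hypothetical function name:

```python
def adjusted_count(weighted: float, control: float) -> float:
    """Use the control count whenever the weighted count falls below it."""
    return max(weighted, control)


# Example from the text: weighted count 400, control count 800.
print(adjusted_count(400, 800))
# When the weighted count already exceeds the control count, it is kept.
print(adjusted_count(900, 800))
```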
Data for jurisdictions where counts are between 1 and 9 are suppressed. Additionally, data for weeks where the counts are less than 50% of the expected number are also suppressed, as these provisional counts are highly incomplete and potentially misleading. This change resulted in showing estimates with a lag of 1 week for most jurisdictions and the US. For some jurisdictions (Connecticut, North Carolina, Puerto Rico), lags may be greater. Declines in the observed numbers of deaths in recent weeks should not be interpreted to mean that the numbers of deaths are decreasing, as these declines are expected when relying on provisional data that are generally less complete in recent weeks. While the weighting method is intended to mitigate the impact of underreporting, it may not be sufficient to eliminate the problem of underreporting entirely. Therefore, it is not yet possible to determine whether decreases in the number of deaths are due to underreporting or to true declines until more complete data are obtained.
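The two suppression rules stated above can be expressed as a small filter. This is a sketch only: the function name is hypothetical, and representing a suppressed value as `None` is an assumption about presentation, not something the data files are known to do.

```python
from typing import Optional


def displayed_count(count: int, expected: float) -> Optional[int]:
    """Return the count, or None when either suppression rule applies."""
    if 1 <= count <= 9:
        return None  # small counts are suppressed
    if count < 0.5 * expected:
        return None  # less than 50% of the expected number: highly incomplete
    return count


print(displayed_count(5, 6))       # suppressed: count between 1 and 9
print(displayed_count(400, 1000))  # suppressed: below 50% of expected
print(displayed_count(600, 1000))  # shown
```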
Weekly counts of deaths from all causes were examined, including deaths due to COVID-19. As many deaths due to COVID-19 may be assigned to other causes of deaths (for example, if COVID-19 was not mentioned on the death certificate as a suspected cause of death), tracking all-cause mortality can provide information about whether an excess number of deaths is observed, even when COVID-19 mortality may be undercounted. These estimates can also provide information about deaths that may be indirectly related to COVID-19. For example, deaths due to other causes may increase as a result of health care shortages due to COVID-19. Additionally, deaths from all causes excluding COVID-19 were also estimated. These counts excluded deaths with U07.1 as an underlying or multiple cause of death.
Comparing these two sets of estimates — excess deaths with and without COVID-19 — can provide insight about how many excess deaths are identified as due to COVID-19, and how many excess deaths are due to other causes of death. These deaths could represent misclassified COVID-19 deaths, or potentially could be indirectly related to COVID-19. Additionally, death certificates are often initially submitted without a cause of death, and then updated when cause of death information becomes available. It may be the case that some excess deaths that are not attributed directly to COVID-19 will be updated in coming weeks with cause-of-death information that includes COVID-19. These analyses will be updated periodically, and the numbers presented will change as more data are received.
Cause of Death
As of June 3, 2020, weekly counts of deaths due to select causes of death are presented. These causes were selected based on analyses of comorbid conditions reported on death certificates where COVID-19 was listed as a cause of death (see https://www.cdc.gov/nchs/nvss/vsrr/covid_weekly/index.htm#Comorbidities). Some causes with insufficient numbers of deaths by week and jurisdiction were combined with other categories, and one cause was added to the Alzheimer disease and dementia category (ICD–10 code G31). These estimates are based on the underlying cause of death, and include: Respiratory diseases, Circulatory diseases, Malignant neoplasms, and Alzheimer disease and dementia. ICD–10 codes were used to classify deaths according to the following causes:
- Respiratory diseases
  - Influenza and pneumonia (J09–J18)
  - Chronic lower respiratory diseases (J40–J47)
  - Other diseases of the respiratory system (J00–J06, J20–J39, J60–J70, J80–J86, J90–J96, J97–J99, R09.2, U04)
- Circulatory diseases
  - Hypertensive diseases (I10–I15)
  - Ischemic heart disease (I20–I25)
  - Heart failure (I50)
  - Cerebrovascular diseases (I60–I69)
  - Other diseases of the circulatory system (I00–I09, I26–I49, I51, I52, I70–I99)
- Malignant neoplasms (C00–C97)
- Alzheimer disease and dementia (G30, G31, F01, F03)
- Other select causes of death
  - Diabetes (E10–E14)
  - Renal failure (N17–N19)
  - Sepsis (A40–A41)
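As an illustration of how the code ranges listed above could be applied to an underlying cause of death, the following sketch maps an ICD-10 code to its cause group. The `RANGES` table transcribes the list; the `classify` function and its parsing rule (one letter followed by a two-digit category) are assumptions made for this example and are not the classification code used by NCHS.

```python
# (letter, first category, last category) ranges transcribed from the list above.
RANGES = {
    "Influenza and pneumonia": [("J", 9, 18)],
    "Chronic lower respiratory diseases": [("J", 40, 47)],
    "Other diseases of the respiratory system": [
        ("J", 0, 6), ("J", 20, 39), ("J", 60, 70), ("J", 80, 86),
        ("J", 90, 96), ("J", 97, 99), ("U", 4, 4),
    ],
    "Hypertensive diseases": [("I", 10, 15)],
    "Ischemic heart disease": [("I", 20, 25)],
    "Heart failure": [("I", 50, 50)],
    "Cerebrovascular diseases": [("I", 60, 69)],
    "Other diseases of the circulatory system": [
        ("I", 0, 9), ("I", 26, 49), ("I", 51, 52), ("I", 70, 99),
    ],
    "Malignant neoplasms": [("C", 0, 97)],
    "Alzheimer disease and dementia": [("G", 30, 31), ("F", 1, 1), ("F", 3, 3)],
    "Diabetes": [("E", 10, 14)],
    "Renal failure": [("N", 17, 19)],
    "Sepsis": [("A", 40, 41)],
}


def classify(code):
    """Return the cause group for an ICD-10 code, or None if it is not in a listed group."""
    code = code.strip().upper()
    # R09.2 is the only listed code defined at the four-character level.
    if code.startswith("R09.2"):
        return "Other diseases of the respiratory system"
    letter, category = code[0], int(code[1:3])
    for group, ranges in RANGES.items():
        for range_letter, low, high in ranges:
            if letter == range_letter and low <= category <= high:
                return group
    return None


print(classify("J44.1"))
print(classify("I50"))
print(classify("U07.1"))
```

Codes outside the listed ranges, including U07.1 (COVID-19), return `None` in this sketch.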
Estimated numbers of deaths due to these other causes of death could represent misclassified COVID-19 deaths, or potentially could be indirectly related to COVID-19 (e.g., deaths from other causes occurring in the context of health care shortages or overburdened health care systems). Deaths with an underlying cause of death of COVID-19 are not included in these estimates of deaths due to other causes, but deaths where COVID-19 appeared on the death certificate as a multiple cause of death may be included in the cause-specific estimates. For example, in some cases, COVID-19 may have contributed to the death, but the underlying cause of death was another cause, such as terminal cancer. For the majority of deaths where COVID-19 is reported on the death certificate (approximately 95%), COVID-19 is selected as the underlying cause of death.
Deaths due to all other natural causes were excluded (ICD-10 codes: A00–A39, A42–B99, D00–E07, E15–E68, E70–E90, F00, F02, F04–G26, G31–H95, K00–K93, L00–M99, N00–N16, N20–N98, O00–O99, P00–P96, Q00–Q99). External causes of death (i.e., injuries) were excluded, as the reporting lag is substantially longer for external causes of death (4). Additionally, causes of death where the underlying cause was unknown or ill-specified (i.e., R-codes) were excluded (except for R09.2, which is included under the Respiratory diseases category). Counts of deaths with unknown cause are typically substantially higher in provisional data, as many records are initially submitted without a specific cause of death and are then updated when more information becomes available (4). For deaths due to external causes of death or unknown cause, provisional data are highly unreliable and inaccurate in recent weeks, and it can take six to nine months to ensure sufficiently accurate estimates. Counts by cause provided here will not sum to the total number of deaths, given that some causes are excluded.
Estimates by cause of death and age at death are weighted, using the methods described above. The total count of deaths above average levels is shown for select causes of death. These totals are calculated by summing the number of deaths above average levels (based on weekly counts from 2015–2019) since 2/1/2020. Negative values were set to zero and therefore excluded from these sums. Because not all causes of death are shown and due to differences in how the average expected numbers of deaths are estimated, the total numbers of deaths across all the selected causes will not match the numbers of excess deaths from all causes excluding COVID-19.
Estimates by race and Hispanic origin are weighted using the methods described above. Weekly counts are shown for deaths due to all causes, all causes excluding COVID-19, and COVID-19. Because estimates are weighted to account for incomplete reporting in recent weeks, counts of death due to COVID-19 will not match other data sources. For data years 2018 – 2020, race and Hispanic-origin categories are based on the 1997 Office of Management and Budget (OMB) standards, allowing for the presentation of data by single race and Hispanic origin. These race and Hispanic-origin groups—non-Hispanic single-race white, non-Hispanic single-race black or African American, non-Hispanic single-race American Indian or Alaska Native (AIAN), and non-Hispanic single-race Asian—differ from the bridged-race categories used in previous data years when not all jurisdictions reported race and Hispanic origin using the 1997 OMB standards. Numbers may therefore differ from previous reports and other sources of data on mortality by race and Hispanic origin.
These estimates are based on provisional data, which are incomplete. The weighting method applied may not fully account for reporting lags if there are longer delays at present than in past years. For example, in Pennsylvania, reporting lags are currently much longer than they have been in past years, and death counts for 2020 are therefore underestimated. Conversely, the weighting method may over-adjust for underreporting, given improvements in data timeliness in certain jurisdictions. Unweighted estimates are provided, so that users can see the impact of weighting the provisional counts. However, these unweighted provisional counts are incomplete, and the extent to which they may underestimate the true count of deaths is unknown. Some jurisdictions exhibit recent increases in deaths when using weighted estimates, but not the unweighted. The estimates presented may be an early indication of excess mortality related to COVID-19, but should be interpreted with caution, until confirmed by other data sources such as state or local health departments. It is possible that recent improvements in the timeliness of data could also contribute to the pattern where a jurisdiction exhibits recent increases with the weighted data, but not the unweighted. Conversely, recent increases may be missed in jurisdictions with historically low levels of completeness (e.g., Connecticut, North Carolina) either due to the lack of provisional data or insufficient weighting to address incomplete data.
The completeness of provisional data varies by cause of death and by age group. However, the weights applied do not account for this variability. It is unknown whether completeness varies by race and Hispanic origin. Therefore, the predicted numbers of deaths may be too low for some age groups, race/ethnicity groups, and causes of death. For example, provisional data on deaths among younger age groups are typically less complete than data on deaths among older age groups, so predicted counts may be too low among the younger age groups. Since the weights were based on the completeness of all-cause mortality data in past years, the weighted estimates for specific causes of death are likely too low, as reporting lags are typically larger for specific causes of death than for all-cause mortality. To minimize the degree of underreporting, cause-specific estimates are presented with a two-week lag.
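The effect described above can be illustrated with a minimal sketch. It assumes, for illustration only, that each weight is simply the inverse of an estimated completeness fraction; the function name, counts, and completeness values below are hypothetical and are not taken from the actual data or estimation code.

```python
def weight_counts(observed, completeness):
    """Scale provisional weekly counts by an estimated completeness fraction.

    Assumption for this sketch: weighted count = observed count / completeness.
    """
    if len(observed) != len(completeness):
        raise ValueError("observed and completeness must have the same length")
    weighted = []
    for count, fraction in zip(observed, completeness):
        if not 0 < fraction <= 1:
            raise ValueError("completeness must be in the interval (0, 1]")
        weighted.append(count / fraction)
    return weighted


# Hypothetical provisional counts for the three most recent weeks (oldest first).
observed = [950, 900, 750]

# Hypothetical completeness of all-cause data at the same lags in past years.
all_cause_completeness = [0.95, 0.90, 0.75]

# Hypothetical (lower) completeness for one specific cause of death.
cause_specific_completeness = [0.95, 0.80, 0.60]

# Weights based on all-cause completeness, as described in the text.
weighted = weight_counts(observed, all_cause_completeness)

# What the counts would be if the cause-specific completeness were known.
fully_adjusted = weight_counts(observed, cause_specific_completeness)

# Remaining underestimate when all-cause weights are applied to that cause.
shortfall = [full - w for full, w in zip(fully_adjusted, weighted)]

for week, (w, s) in enumerate(zip(weighted, shortfall), start=1):
    print(f"week {week}: weighted={w:.0f}, remaining shortfall={s:.0f}")
```

In this hypothetical case the shortfall grows toward the most recent week, which is consistent with the reason given above for presenting cause-specific estimates with a lag.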
Why did the excess death estimates change?
– Excess death estimates rely on predictions of the numbers of deaths that would have happened had normal patterns of mortality continued throughout the pandemic. These predictions are much more accurate for the near term than for time periods further away from the baseline pre-pandemic data. As the pandemic has continued for more than three years, we are making longer-term predictions of expected mortality patterns and also excluding more than 160 weeks of recent data from the models (so that expected values are not inflated by the substantially elevated mortality during the pandemic). As a result, the estimates of the expected numbers of deaths have become more variable and unstable over time. To mitigate some of this variability, we developed an approach to impute death counts for weeks during the pandemic, so that it is no longer necessary to exclude 160+ weeks from the models. This imputation approach reduced some of the variability in the trends in the expected numbers of deaths.
Are there any changes in interpretation with the new method?
– Because both the old method of calculating excess deaths and the new method aim to estimate the expected number of deaths assuming the COVID-19 pandemic did not occur, there are no changes in how these statistics are interpreted. Instead of excluding weeks during the pandemic, we are imputing death counts under the assumption that mortality patterns followed their average trajectories. In both cases, the expected numbers of deaths represent the average number of deaths we would expect to see had the pandemic not occurred.
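The general idea of imputing pandemic-period counts can be sketched as follows. This is a simplified illustration, not the model used for these estimates: it assumes a plain seasonal average with normally distributed variation, a four-week "season" to keep the example short, and made-up counts. All function and variable names are hypothetical.

```python
import random
from statistics import mean, stdev


def seasonal_profile(series, period):
    """Mean and standard deviation of counts at each position in the seasonal cycle."""
    return [
        (mean(series[pos::period]), stdev(series[pos::period]))
        for pos in range(period)
    ]


def impute_weeks(profile, n_weeks, rng):
    """Draw counts around the average trajectory, starting at cycle position 0."""
    imputed = []
    for i in range(n_weeks):
        mu, sd = profile[i % len(profile)]
        # Normal draw as a stand-in for natural week-to-week variability.
        imputed.append(max(0.0, rng.gauss(mu, sd)))
    return imputed


period = 4  # toy cycle length; real weekly data would use about 52

# Hypothetical pre-pandemic counts: three full cycles.
pre_pandemic = [100, 110, 120, 105,
                102, 108, 122, 107,
                98, 112, 118, 103]

# Hypothetical observed counts during one pandemic-period cycle.
observed_pandemic = [130, 150, 160, 140]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
baseline = seasonal_profile(pre_pandemic, period)

# Instead of excluding the pandemic weeks, replace them with imputed values
# so the model input is one contiguous series.
imputed = impute_weeks(baseline, len(observed_pandemic), rng)
model_input = pre_pandemic + imputed

# Refit on the contiguous series and read off the expected counts.
expected = [mu for mu, _ in seasonal_profile(model_input, period)]

# Excess deaths: observed minus expected, week by week.
excess = [obs - exp for obs, exp in zip(observed_pandemic, expected)]

for obs, exp, exc in zip(observed_pandemic, expected, excess):
    print(f"observed={obs}, expected={exp:.1f}, excess={exc:.1f}")
```

Because the imputed values are drawn around the pre-pandemic average, the expected counts stay close to what the pre-pandemic data alone would give, which is why the interpretation of the excess is unchanged.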
What are the limitations of this approach?
– While the imputations do account for the natural variability in weekly death counts, because they are based on the average expected numbers and variation, they may not fully account for the variability we typically see related to severe influenza seasons or other types of events that lead to much higher than expected mortality.
– This imputation approach mitigates one challenge with the older approach – having to exclude 160+ weeks of data. However, this newer approach does not provide a solution for the challenge of making accurate long-term predictions. Regardless of the statistical method used, all long-term predictions of expected mortality trends based on pre-pandemic data are subject to an unknown and growing degree of bias and uncertainty as the pandemic goes on.
References
- Noufaily A, Enki DG, Farrington P, Garthwaite P, Andrews N, Charlett A. An Improved Algorithm for Outbreak Detection in Multiple Surveillance Systems. Statistics in Medicine 2013;32(7):1206-1222.
- Salmon M, Schumacher D, Höhle M. Monitoring Count Time Series in R: Aberration Detection in Public Health Surveillance. Journal of Statistical Software 2016;70(10):1-35.
- Rue H, Martino S, Chopin N. Approximate Bayesian inference for latent Gaussian models using integrated nested Laplace approximations (with discussion). Journal of the Royal Statistical Society Series B 2009;71(2):319-392.
- Spencer MR, Ahmad F. Timeliness of death certificate data for mortality surveillance and provisional estimates. National Center for Health Statistics. 2016. http://www.cdc.gov/nchs/data/vsrr/report001.pdf.