COVID-19 Genomic Epidemiology Toolkit

Get Email Updates

More modules and materials will be added to this toolkit, so please check back for updates or subscribe to the mailing list.

Welcome and Overview

The Office of Advanced Molecular Detection released this toolkit to address topics related to the application of genomics to epidemiologic investigations and public health response to SARS-CoV-2. The COVID-19 Genomic Epidemiology Toolkit is meant to further the use of genomics in responding to COVID-19 at the state and local level.

CDC’s Dr. Greg Armstrong introduces the COVID-19 Genomic Epidemiology Toolkit and describes the role of genome sequencing in public health.

Presenter: Gregory L. Armstrong, MD
Director, Advanced Molecular Detection Program, CDC

ToolkitModule_0 [PDF – 15 slides]

Part 1: Introduction
Module 1.1 - What is genomic epidemiology?

This module provides an introduction to genomic epidemiology, with specific reference to SARS-CoV-2 sequencing for epidemiologic investigations.

Presenter: Nancy Chow, PhD
Bioinformatics and Informatics Lead, CDC

ToolkitModule_1.1 [PDF – 23 slides]

Further Reading

  1. Pathogen Genomics in Public Health. Armstrong et al. 2019 NEJM.
  2. Towards a genomics-informed, real-time, global pathogen surveillance system. Gardy and Loman. 2017 Nat Rev Genet.


  1. Scientists have a powerful new tool for controlling the coronavirus: Its own genetic code. Washington Post, 2020.
Module 1.2 - The SARS-CoV-2 genome

This module describes the basics of microbial genomes, with specific reference to the SARS-CoV-2 genome.

Presenter: Shatavia S. Morrison, PhD
Bioinformatics Unit Lead, CDC

ToolkitModule_1.2 [PDF – 15 slides]

Further Reading

  1. How Coronavirus Mutates and Spreads. New York Times, 2020.
  2. SARS-CoV-2 Sequencing Data: The Devil is in the Genomic Details. Hemarajata 2020.
  3. Geographical and temporal distribution of SARS-CoV-2 clades in the WHO European Region, January to June 2020. Alm et al. 2020 Euro Surveill.


  1. SARS-CoV-2 Sequencing
Module 1.3 - How to read a phylogenetic tree

This module describes the anatomy of phylogenetic trees and how to interpret them in the context of transmission.

Presenter: Michael Weigand, PhD
Bioinformatician, CDC

ToolkitModule_1.3 [PDF – 28 slides]

Further Reading

  1. How to read a phylogenetic tree. ARTIC Network.
  2. Genomic surveillance reveals multiple introductions of SARS-CoV-2 into Northern California. Deng et al. 2020 Science.
  3. Rapid SARS-CoV-2 whole-genome sequencing and analysis for informed public health decision-making in the Netherlands. Munnink et al. 2020 Nature Medicine.
  4. Cryptic transmission revealed by genomic epidemiology. Trevor Bedford 2020.


  1. How to interpret phylogenetic trees.
  2. Genomic epidemiology playbook — a primer on uses and interpretation. Sidney Bell.
Module 1.4 - Emerging variants of SARS-CoV-2

This module introduces basic concepts relevant to the emergence of new SARS-CoV-2 variants and the role of sequencing in their detection and definition.

Presenter: Michael Weigand, PhD
Bioinformatician, CDC

ToolkitModule_1.4 [PDF – 984 KB]

Further Reading

  1. The coronavirus is evolving before our eyes. The Atlantic, 2021.
  2. Coronavirus variants and mutations. New York Times, 2021.
  3. Emergence and spread of a SARS-CoV-2 variant through Europe in the summer of 2020. Hodcroft et al. 2020 MedRxiv.
  4. Tracking changes in SARS-CoV-2 spike: evidence that D614G increases infectivity of the COVID-19 virus. Korber et al. 2020 Cell.
  5. Preliminary genomic characterisation of an emergent SARS-CoV-2 lineage in the UK defined by a novel set of spike mutations. Rambaut et al. 2020 Virological.
  6. Genomic epidemiology identifies emergence and rapid transmission of SARS-CoV-2 B.1.1.7 in the United States. Washington et al. 2021 MedRxiv.
  7. Emergence of SARS-CoV-2 B.1.1.7 lineage — United States, December 29, 2020–January 12, 2021. Galloway et al. 2021 MMWR.


  1. About variants of the virus that causes COVID-19.
  2. Genomic surveillance for SARS-CoV-2 variants.
  3. Why S-gene sequencing is key for SARS-CoV-2 surveillance. ThermoFisher.
  4. PANGO lineage global reports.
  5. covariants.org
  6. SARS-CoV-2 mutation situation reports. Scripps Research.
  7. Pangolin COVID-19 lineage assigner.
  8. Nextclade clade assignment, mutation calling.
Part 2: Case Studies
Module 2.1 - SARS-CoV-2 sequencing in Arizona

This module provides insight into how SARS-CoV-2 sequencing is used to describe the genomic epidemiology of a state and as an investigative tool in COVID-19 outbreak settings.

Presenter: Hayley Yaglom, MS, MPH
Genomic Epidemiologist, Translational Genomics Research Institute

Arizona COVID-19 Presentation
[Full Version] [Short Version]

Further Reading

  1. An Early Pandemic Analysis of SARS-CoV-2 Population Structure and Dynamics in Arizona. Ladner et al. 2020 American Society for Microbiology.


  1. AZ-Strain: Genomic Epidemiology of SARS-CoV-2 in Arizona
Module 2.2 - Healthcare cluster transmission

This module provides insight into two separate outbreaks in long-term care settings and how sequencing helped clarify the pattern of transmission in each.

Presenter: Nicholas Lehnertz, MD, MPH, MHS
Physician and Epidemiologist, Minnesota Department of Health

ToolkitModule_2.2 [PDF – 18 slides]

Further Reading

  1. Serial testing for SARS-CoV-2 and virus whole-genome sequencing. Taylor et al. 2020 MMWR.

Further Reading for Case Studies

  1. Presymptomatic SARS-CoV-2 infections and transmission in a skilled nursing facility. Arons et al. 2020 NEJM.
  2. COVID-19 outbreak associated with a 10-day motorcycle rally in a neighboring state. Firestone et al. 2020 MMWR.
  3. Phylogenetic analysis of SARS-CoV-2 in the Boston area highlights the role of recurrent importation and superspreading events. Lemieux et al. 2020 MedRxiv.
  4. The emergence of SARS-CoV-2 in Europe and North America. Worobey et al. 2020 Science.
  5. Interregional SARS-CoV-2 spread from a single introduction outbreak in a meat-packing plant in northeast Iowa. Richmond et al. 2020 MedRxiv.
  6. SARS-CoV-2 sequencing reveals rapid transmission from college student clusters resulting in morbidity and deaths in vulnerable populations. Richmond et al. 2020 MedRxiv.
Part 3: Implementation
Module 3.1 - Getting started with Nextstrain

This module gives an introduction to Nextstrain, a powerful tool for interactive tree visualization.

Presenter: Michael Weigand, PhD
Bioinformatician, CDC


ToolkitModule_3.1 [PDF – 20 slides]

Further Reading

  1. Nextstrain: Real-time tracking of pathogen evolution. Hadfield et al. 2018. Bioinformatics.


  1. Nextstrain documentation: docs.nextstrain.org
  2. A Getting Started Guide to the Genomic Epidemiology of SARS-CoV-2
  3. Interacting with auspice, the visualization web application
  4. SPHERES state builds
Want to know more?

More modules and materials will be added to this page, so please check back for updates, or scroll to the top of this page and subscribe to our mailing list to get update notifications.

Page last reviewed: February 24, 2021