Genomic Surveillance for SARS-CoV-2 Variants: Circulation of Omicron Lineages — United States, January 2022–May 2023

CDC has used national genomic surveillance since December 2020 to monitor SARS-CoV-2 variants that have emerged throughout the COVID-19 pandemic, including the Omicron variant. This report summarizes U.S. trends in variant proportions from national genomic surveillance during January 2022–May 2023. During this period, the Omicron variant remained predominant, with various descendant lineages reaching national predominance (>50% prevalence). During the first half of 2022, BA.1.1 reached predominance by the week ending January 8, 2022, followed by BA.2 (March 26), BA.2.12.1 (May 14), and BA.5 (July 2); the predominance of each variant coincided with surges in COVID-19 cases. The latter half of 2022 was characterized by the circulation of sublineages of BA.2, BA.4, and BA.5 (e.g., BQ.1 and BQ.1.1), some of which independently acquired similar spike protein substitutions associated with immune evasion. By the end of January 2023, XBB.1.5 became predominant. As of May 13, 2023, the most common circulating lineages were XBB.1.5 (61.5%), XBB.1.9.1 (10.0%), and XBB.1.16 (9.4%); XBB.1.16 and XBB.1.16.1 (2.4%), containing the K478R substitution, and XBB.2.3 (3.2%), containing the P521S substitution, had the fastest doubling times at that point. Analytic methods for estimating variant proportions have been updated as the availability of sequencing specimens has declined. The continued evolution of Omicron lineages highlights the importance of genomic surveillance to monitor emerging variants and help guide vaccine development and use of therapeutics.

Weekly SARS-CoV-2 consensus sequences** from the NS3 program, commercial laboratories, and data repositories were quality-filtered,†† deduplicated, and assigned Pango lineages (3). During January 2022–May 2023, the median interval from specimen collection to data availability was 16 days. Weekly variant proportions were estimated at the national and U.S. Department of Health and Human Services (HHS) regional levels§§ by specimen collection date for the 11 weeks before the most recent 3 weeks; lineages were included if they constituted ≥1% (unweighted) of sequences nationally and contained spike protein substitutions of potential therapeutic relevance. To estimate variant proportions for the most recent 3 weeks, nowcasts were generated using multinomial regression fit on the previous 21 weeks of data.¶¶ All methods included weighting to account for the complex survey design and adjust for potential sampling biases.*** Nowcasts were conducted for any lineages with ≥0.5% prevalence beginning October 11, 2022,††† to improve accuracy by accounting for differential growth rates of grouped sublineages. Weekly numbers of COVID-19 cases attributable to variants were estimated by multiplying counts of positive nucleic acid amplification tests from COVID-19 electronic laboratory reporting (CELR) with variant proportions. Doubling times for proportions of specific lineages were estimated from the coefficients of the multinomial nowcasting model.§§§ Methodologic changes following the public health emergency expiration (4) were summarized. Biweekly estimates using the updated model were compared with weekly estimates from the previous model to assess consistency. Data were current as of June 1, 2023. This activity was reviewed by CDC and conducted consistent with applicable federal law and CDC policy.¶¶¶

During January 2, 2022–May 13, 2023, a total of 1,697,197 SARS-CoV-2 surveillance sequences from 56 U.S. jurisdictions**** were generated by or reported to CDC from NS3 (1%), commercial laboratories (60%), and repositories (38%); the percentage of sequences from repositories represented an increase from 10% during June 2021–January 2022 (1). The weekly number of sequenced specimens decreased from approximately 65,000 collected in January 2022 to approximately 4,400 in April 2023, as the number of COVID-19 cases declined (Supplementary Figure 1, https://stacks.cdc.gov/view/cdc/129515).

§ Sequences from public sequence data repositories are limited to those meeting baseline surveillance criteria, which ensures that they appropriately capture geographic, demographic, and clinical diversity. https://www.aphl.org/programs/preparedness/Crisis-Management/Documents/Technical-Assistance-for-Categorizing-Baseline-Surveillance-Update-Oct2021.pdf
¶ https://covid.cdc.gov/covid-data-tracker/#variant-proportions; https://data.cdc.gov/Laboratory-Surveillance/SARS-CoV-2-Variant-Proportions/jr58-6ysp
** A consensus sequence is produced by aligning SARS-CoV-2 nucleotide sequences generated through sequencing a sample and then determining the most common nucleotide at each position. A consensus sequence is an interoperable genomic surveillance unit that can be combined from laboratory sources.
†† Quality filters included limiting sequences to include only human-derived sources and U.S.-specific sequences and excluding those with invalid state names and laboratory sources.
§§ https://www.hhs.gov/about/agencies/iea/regional-offices/index.html
¶¶ Before August 13, 2022, nowcasts were used to produce estimates for only the most recent 2 weeks.
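As a rough illustration of how doubling times can follow from multinomial model coefficients, the sketch below assumes a multinomial logit model in which each lineage's log-odds relative to a reference lineage change linearly with time. Under that assumption, the instantaneous growth rate of a lineage's proportion is its slope minus the proportion-weighted mean slope, and the doubling time is ln(2) divided by that rate. The function names, lineage labels, and numeric values are hypothetical; they are not taken from the CDC code or from estimates in this report.

```python
import math


def instantaneous_growth_rates(proportions, slopes):
    """Growth rate of each lineage's proportion under a multinomial logit model.

    If p_i(t) is the softmax of a_i + b_i * t, then
    d/dt log p_i(t) = b_i - sum_j p_j(t) * b_j.
    """
    mean_slope = sum(proportions[k] * slopes[k] for k in proportions)
    return {k: slopes[k] - mean_slope for k in proportions}


def doubling_time(rate):
    """Time for a proportion to double at a constant rate (in the rate's time unit)."""
    if rate <= 0:
        return math.inf  # a flat or shrinking lineage never doubles
    return math.log(2) / rate


# Hypothetical inputs for illustration only (not estimates from the report).
proportions = {"A": 0.90, "B": 0.10}   # current proportions, summing to 1
slopes = {"A": 0.0, "B": 0.35}         # per-week slopes; A is the reference

rates = instantaneous_growth_rates(proportions, slopes)
doubling = {k: doubling_time(r) for k, r in rates.items()}
```

With these made-up inputs, lineage B's proportion grows at 0.315 per week, which corresponds to a doubling time of about 2.2 weeks; lineage A is declining, so its doubling time is reported as infinite.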
Omicron remained predominant during January 2, 2022–May 13, 2023, with various descendent lineages emerging and becoming predominant nationwide. The BA.1.1 lineage reached predominance by the week ending January 8, 2022, followed by BA.2 (March 26), BA.2.12.1 (May 14), and BA.5 (July 2); the predominance of each variant coincided with surges in COVID-19 cases. The latter half of 2022 was characterized by the circulation of sublineages of BA.2, BA.4, and BA.5 (Figure 1). Several of these lineages independently acquired spike receptor binding domain (RBD) substitutions, including R346T, K444T, N460K, and F486S/P (Table). None attained predominance individually; however, BQ.1 (which includes K444T and N460K) and BQ.1.1 (which also includes R346T) reached a combined peak prevalence of 59.

*** Variant proportion estimation methods account for the complex survey design, with weights based on the weekly estimated number of infections represented by each SARS-CoV-2 sequence; weights are trimmed to the 99th percentile. Each submitting laboratory source was considered a primary sampling unit, and the state and week of sequence sample collection were considered strata. The updated code, weight derivations, and nowcast model equations for the variant proportion estimation methods are available online. https://github.com/CDCgov/SARS-CoV-2_Genomic_Surveillance
††† Beginning October 11, 2022, growth rate and nowcast estimates were conducted for any lineages accounting for ≥0.5% of sequences nationwide (unweighted) in the last week before nowcast estimates. Lineages with a prevalence <1% or without spike protein substitutions of potential therapeutic or clinical relevance were aggregated with their parental lineage in the final estimates.
§§§ Doubling times for proportions of specific lineages are based on instantaneous growth rates from the multinomial nowcasting model. Doubling times were assessed either 1) when a lineage reached 1% prevalence, for comparisons of doubling times across all lineages, or 2) during the most recent week of data availability, to assess growth trajectories for currently circulating lineages.
Abbreviations: CELR = COVID-19 electronic laboratory reporting; NS3 = National SARS-CoV-2 Strain Surveillance Program.
* Sequences are reported to CDC through NS3, contract laboratories, public health laboratories, and other U.S. institutions. Variant proportion estimation methods use a complex survey design and statistical weights to account for the probability that a specimen is sequenced. https://covid.cdc.gov/covid-data-tracker/#variant-proportions
† Lineages shown are those that reached a prevalence of ≥1%, carried spike protein substitutions of potential therapeutic relevance, and were separated out on the COVID Data Tracker website.
§ Estimated numbers of COVID-19 cases attributable to variants were calculated by multiplying weekly numbers of reported positive nucleic acid amplification tests from CELR with estimated variant proportions.

Beginning May 13, 2023, after the expiration of the public health emergency declaration (4) and in response to declining numbers of cases and sequenced specimens, methodologic changes were made regarding the analysis of SARS-CoV-2 genomic surveillance data. The reporting cadence and unit of analysis changed from weekly to biweekly, with variant proportions estimated for 2-week periods and nowcast predictions conducted for the most recent 4 weeks,§§§§ and state-specific estimates were discontinued. For calculating survey weights, the level and source for information on positive test results changed to regional-level data from the National Respiratory and Enteric Virus Surveillance System (NREVSS)¶¶¶¶ (6). The previous and updated analytic methods using CELR- and NREVSS-derived survey weights, respectively, produced similar variant proportion estimates for all lineages. An example comparison of national and regional proportions of XBB.1.5 demonstrates the consistency between methodologies (Supplementary Figure 3, https://stacks.cdc.gov/view/cdc/129517).
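The weighting and case-attribution steps described in the methods and figure notes can be sketched as follows. This is a minimal illustration, not the CDC implementation: it ignores the strata and primary sampling units of the complex survey design, uses a nearest-rank percentile as an assumed trimming rule, and uses hypothetical function names, lineage labels, weights, and test counts.

```python
import math
from collections import defaultdict


def trim_weights(weights, percentile=99.0):
    """Cap weights at the given percentile (nearest-rank rule, an assumption)."""
    ordered = sorted(weights)
    rank = max(1, math.ceil(percentile / 100.0 * len(ordered)))
    cap = ordered[rank - 1]
    return [min(w, cap) for w in weights]


def weighted_proportions(lineages, weights):
    """Share of the total survey weight carried by each lineage."""
    totals = defaultdict(float)
    for lineage, weight in zip(lineages, weights):
        totals[lineage] += weight
    grand_total = sum(totals.values())
    return {k: v / grand_total for k, v in totals.items()}


def attributed_cases(positive_tests, proportions):
    """Multiply a positive-test count by each lineage's estimated proportion."""
    return {k: positive_tests * p for k, p in proportions.items()}


# Hypothetical sequences and weights for illustration only.
lineages = ["lineage_a", "lineage_a", "lineage_a", "lineage_b"]
weights = [1.0, 1.0, 2.0, 4.0]

# A 75th-percentile cap is used here only so that trimming is visible
# with four records; the report describes trimming at the 99th percentile.
trimmed = trim_weights(weights, percentile=75.0)
props = weighted_proportions(lineages, trimmed)
cases = attributed_cases(3000, props)
```

In this toy example the largest weight is capped from 4.0 to 2.0, so lineage_b accounts for one third of the weighted total rather than one half, and one third of the 3,000 hypothetical positive tests are attributed to it.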
Multiple Omicron lineages independently acquired similar substitutions (e.g., R346T, K444T, N460K, and F486S/P) in the spike RBD, suggesting that these sites are under selective pressure in the population and drive enhanced viral circulation (7). Accordingly, these substitutions have been observed to be associated with escape from neutralizing antibodies, including previously authorized monoclonal antibody therapies (7,8), and the S486P substitution observed in some XBB-descendent lineages also has been observed to increase infectivity via enhanced angiotensin-converting enzyme 2 receptor binding affinity (9). XBB lineages with additional substitutions compared with XBB.1.5, namely XBB.1.16, XBB.1.16.1, and XBB.2.3, had the fastest doubling times as of May 13, 2023.

§§§§ Beginning May 11, 2023, weighted variant proportions were estimated for the six 2-week periods (12 weeks total) before the two most recent 2-week periods for select lineages accounting for ≥1% (unweighted) of sequences nationwide. Nowcast predictions were used to produce estimates for the two most recent 2-week periods. Nowcasts were also conducted for lineages accounting for ≥0.5% (unweighted) of sequences nationwide during the first 2-week nowcast period to improve accuracy by accounting for differential growth rates of grouped sublineages.
¶¶¶¶ Test positivity data (weekly numbers of positive specimens and total tests administered) from CELR were no longer available after the expiration of the public health emergency declaration (https://healthdata.gov/dataset/COVID-19-Diagnostic-Laboratory-Testing-PCR-Testing/j8mb-icvb).
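To make the idea of a nowcast projection concrete, the sketch below shows how proportions for later periods could be predicted once a multinomial logit model has been fit. It assumes the fitted model supplies an intercept and a time slope for each lineage relative to a reference lineage; the coefficients, lineage labels, and function name are hypothetical, and the fitting step itself (including survey weighting) is not shown.

```python
import math


def project_proportions(intercepts, slopes, t):
    """Predicted lineage proportions at time t: softmax over a_i + b_i * t."""
    scores = {k: intercepts[k] + slopes[k] * t for k in intercepts}
    peak = max(scores.values())  # subtract the maximum for numerical stability
    exps = {k: math.exp(s - peak) for k, s in scores.items()}
    total = sum(exps.values())
    return {k: e / total for k, e in exps.items()}


# Hypothetical fitted coefficients (time measured in 2-week periods).
intercepts = {"ref": 0.0, "new": math.log(0.25)}
slopes = {"ref": 0.0, "new": 0.5}

# t = 0 is the last observed period; t = 1 and t = 2 are the nowcast periods.
nowcast = {t: project_proportions(intercepts, slopes, t) for t in (0, 1, 2)}
```

Here the hypothetical "new" lineage starts at 20% in the last observed period and is projected to increase in each of the two following periods, while the proportions in every period sum to 1 by construction.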

Discussion
The findings in this report are subject to at least four limitations. First, early SARS-CoV-2 variant proportion estimates might have low precision because of relatively limited data availability and biases in the timing of specimen collection or sequence submission. These effects can be exacerbated by sequencing and reporting lag time (e.g., holidays) or laboratory issues, such as lineage-specific sequencing failures. Second, continued decreases in the number of sequencing specimens available over time affect precision; for this reason, state-specific estimates were discontinued in May 2023. Third, current analyses might differ from previous analyses because of fluctuations in sequencing data sources, changes in Pango lineage definitions, and methodologic updates. Finally, estimates of COVID-19 cases attributed to more recent lineages are affected by case underascertainment because of increasing at-home test use and other changes in test-seeking behaviors.

CDC has maintained national SARS-CoV-2 genomic surveillance since December 2020 to monitor variant proportions and aid in making timely decisions on prevention strategies, including vaccines and therapeutics. Analytic methods have been updated to maintain robust and representative estimates as the availability of sequencing specimens has declined; it is reassuring that the previous and updated weighting methodologies produced consistent estimates. Continued monitoring of SARS-CoV-2 variants in the U.S. population is key for guiding public health action, including FDA authorizations for COVID-19 therapeutics and strain selection for vaccines.

***** https://www.fda.gov/drugs/emergency-preparedness-drugs/emergency-use-authorizations-drugs-and-non-vaccine-biological-products; https://www.covid19treatmentguidelines.nih.gov/tables/variants-and-susceptibility-to-mabs/
††††† https://www.who.int/news/item/18-05-2023-statement-on-the-antigen-composition-of-covid-19-vaccines

Summary
What is already known about this topic?
CDC has used genomic surveillance to monitor trends in circulating U.S. SARS-CoV-2 variants since December 2020, including the emergence of the Omicron variant at the end of 2021.
What is added by this report?
Weekly estimates of variant proportions during January 2, 2022–May 13, 2023, identified the emergence and subsequent predominance of multiple Omicron lineages in the United States, including BA.2, BA.2.12.1, BA.5, and XBB.1.5. Repeated independent substitutions in the spike protein suggested convergent evolution related to immune evasion. Analytic methods for variant proportion estimation have been updated as numbers of cases and sequenced specimens have declined.
What are implications for public health practice?
Ongoing genomic surveillance can identify emerging SARS-CoV-2 variants and guide vaccine and therapeutic development and use.