Reported Tuberculosis in the United States, 2020

Appendix C

Genotyping Background Information and Glossary

Tuberculosis (TB) genotyping methods are laboratory-based analyses of the genetic material of the bacteria that cause TB disease, Mycobacterium tuberculosis complex. The total genetic content is referred to as the genome. Specific sections of the genome contain distinct genetic patterns that help distinguish different strains of Mycobacterium tuberculosis. TB genotyping examines the location, number, and presence of different types of spacer or repetitive DNA patterns. The areas of the genome examined in TB genotyping are different from those related to drug resistance.

Applications of Genotyping

Patients with TB disease who are related by recent transmission should generally have matching genotype results. Conversely, patients with matching TB genotyping results are probably related by transmission in some way, although the connection might not be recent or direct. 

Genotyping results, when combined with epidemiologic data, can help identify persons with TB disease involved in the same chain of transmission. This information adds value to conventional TB control activities in different ways. These applications are summarized as follows: 

Patient-Level Applications of Genotyping

Complete Contact Investigations

Connections identified between ≥2 patients with TB (i.e., epidemiologic linkages) that might or might not be otherwise identified through routine contact investigations should be confirmed or refuted using available genotyping results.

Cluster Investigations

Connections suggested by genotyping results that were not established through routine contact investigations should be identified.

Other patient-level applications include detecting or investigating potential false-positive culture results and distinguishing relapse TB disease from new TB infection (i.e., reinfection) among TB patients with recurrent TB disease.

Population-Level Applications of Genotyping
  • Potential outbreaks should be detected using geospatial or other analyses to identify genotype clusters.
  • When cases believed to be part of the same cluster have nonmatching genotype results, the outbreak should be refuted.
  • The scope of potential outbreaks should be defined by identifying all presumed cases in an area with a matching genotype.
  • Known outbreaks should be monitored prospectively by watching for new cases with the same outbreak-related genotype.

History of TB Genotyping Surveillance in the United States

In 1996, the Centers for Disease Control and Prevention (CDC) started the National Tuberculosis Genotyping Surveillance Network (NTGSN), a 5-year initiative that established the utility of genotyping in TB control efforts.1 In 2004, based on the knowledge gained from NTGSN and associated studies,2 CDC established the National Tuberculosis Genotyping Service (NTGS) and funded a national genotyping laboratory to genotype ≥1 Mycobacterium tuberculosis isolate from each culture-positive TB case reported in the United States.3 All U.S. TB control programs can use NTGS at no cost to patients, health care providers, or health departments. NTGS participation is voluntary, with each program determining how genotyping data will be used for its TB control activities. Since 2004, approximately 155,000 Mycobacterium tuberculosis isolates have been successfully genotyped through NTGS and its partnerships among CDC programs, national genotyping laboratories, and state and local jurisdictions. 

In 2010, CDC launched the Tuberculosis Genotyping Information Management System (TB GIMS),4 a secure Internet-based database available to reporting areas in all 50 U.S. states, the District of Columbia, Puerto Rico, the U.S. Virgin Islands, and the U.S.-affiliated Pacific Islands. TB GIMS makes genotyping data easily available to users and facilitates linking of genotyping data to patient surveillance records from the National TB Surveillance System. Key features include database queries of genotypes and clusters, data quality checks, aggregate reports, maps, and outbreak detection tools. As of July 2021, TB GIMS has 539 users among local, state, tribal, federal, and territorial partners. 

In 2018, CDC established the National Tuberculosis Molecular Surveillance Center to perform whole-genome sequencing on ≥1 isolate from every culture-positive TB case in the United States. 

Genotyping-Based Cluster Detection

CDC identifies genotype-matched clusters, which can represent TB outbreaks, using geospatial analysis to identify unexpected groupings of TB cases that are proximal in time. TB control programs can use this cluster detection information to help allocate and prioritize resources for investigation and intervention of specific cases that might be caused by recent transmission. 

CDC’s primary outbreak detection method identifies higher-than-expected geospatial concentrations of a TB genotype in a specific county, compared with the national distribution of that genotype. The method calculates a log-likelihood ratio (LLR) statistic. Clusters with higher LLRs represent greater geospatial concentrations than clusters with lower LLRs, and higher LLRs might indicate recent transmission of TB. Within TB GIMS, LLRs are classified into alert levels based on established cut points: no alert (LLR 0–<5), medium alert (LLR 5–<10), or high alert (LLR ≥10). The alert level and changes in alert level (e.g., from no alert to medium or high alert) can help TB programs identify outbreaks by prioritizing TB genotype clusters for further investigation and possible intervention.
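The cut points above can be expressed as a small lookup. The following Python sketch is illustrative only: the function name is hypothetical, and treating a negative LLR as invalid input is an assumption based on the stated range starting at 0. It does not reproduce how TB GIMS computes the LLR itself.

```python
def alert_level(llr: float) -> str:
    """Classify an LLR using the cut points stated in the text.

    Assumption: LLR values are non-negative, because the lowest
    category is described as starting at 0.
    """
    if llr < 0:
        raise ValueError("LLR is assumed to be non-negative")
    if llr < 5:
        return "no alert"      # LLR 0 to <5
    if llr < 10:
        return "medium alert"  # LLR 5 to <10
    return "high alert"        # LLR >= 10


if __name__ == "__main__":
    for value in (0.0, 4.9, 5.0, 9.9, 10.0):
        print(value, alert_level(value))
```

Each lower bound is inclusive and each upper bound exclusive, so an LLR of exactly 5 is a medium alert and an LLR of exactly 10 is a high alert.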

Genotyping Terminology

In NTGS, a genotype is defined as a unique combination of spacer oligonucleotide typing (spoligotype) and 24-locus mycobacterial interspersed repetitive unit–variable number tandem repeat typing (MIRU-VNTR) results. Each unique combination of results is assigned a GENType designated as G followed by 5 digits, which are assigned sequentially to every genotype identified in the United States (e.g., G00162). This nomenclature is designed for convenience and ease of communication, but the specific numbers assigned have no additional importance outside NTGS. Genotyping data from NTGS should not be used for clinical decision making.

National TB Genotyping Surveillance Coverage in the United States

National TB genotyping surveillance coverage refers to the percentage of culture-positive TB cases with a genotyped Mycobacterium tuberculosis isolate. High levels of coverage in the United States can provide a better understanding of the molecular epidemiology of TB transmission within a specific geographic area and nationally. Additionally, because outbreak detection algorithms are based on identifying unexpected geospatial concentrations of cases whose isolates have the same genotype, high coverage levels help decrease the likelihood of false-negative alerts. The National Tuberculosis Indicators Project genotyping surveillance coverage target is 100% for 2020.5 
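As a worked illustration of the coverage definition, the following Python sketch computes the percentage from two counts. The function and parameter names are hypothetical, and the counts in the example are invented for illustration, not surveillance data.

```python
def genotyping_coverage(culture_positive_cases: int, genotyped_cases: int) -> float:
    """Return the percentage of culture-positive cases with a genotyped isolate."""
    if culture_positive_cases <= 0:
        raise ValueError("need at least one culture-positive case")
    if not 0 <= genotyped_cases <= culture_positive_cases:
        raise ValueError("genotyped cases must be between 0 and the case count")
    return 100.0 * genotyped_cases / culture_positive_cases


if __name__ == "__main__":
    # Hypothetical jurisdiction: 200 culture-positive cases, 190 genotyped.
    print(genotyping_coverage(200, 190))  # 95.0
```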

Glossary

alert level: the alert level is determined by the log-likelihood ratio (LLR) statistic for a given cluster. This is calculated by the Tuberculosis Genotyping Information Management System (TB GIMS) and is updated whenever a new case is added to a genotype cluster. E-mail notifications are sent to TB GIMS users whenever an alert level changes from a no alert LLR (0–<5) to medium LLR (5–<10) or high LLR (≥10), or from a medium LLR to a high LLR alert level. 
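The notification rule described in this entry amounts to "notify on any increase in alert level." A minimal Python sketch, assuming the three level names are stored as plain strings (the names and the function are hypothetical, not part of TB GIMS):

```python
# Rank the three alert levels so that increases can be detected.
ALERT_RANK = {"no alert": 0, "medium alert": 1, "high alert": 2}


def triggers_notification(previous: str, current: str) -> bool:
    """Return True when the alert level increases.

    Covers the transitions named in the text: no alert to medium or high,
    and medium to high. Decreases and unchanged levels return False.
    """
    return ALERT_RANK[current] > ALERT_RANK[previous]


if __name__ == "__main__":
    print(triggers_notification("no alert", "high alert"))    # True
    print(triggers_notification("high alert", "medium alert"))  # False
```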

cluster investigation: a cluster investigation seeks to identify epidemiologic links among TB patients whose isolates have matching genotypes. It might include reviewing information from public health and medical records or interviewing case managers and outreach workers. It can also involve re-interviewing TB patients. 

epidemiologic (epi) link: an epidemiologic link indicates a relationship that two TB patients share that might explain where, when, and how Mycobacterium tuberculosis was transmitted between them. Patients who name each other as contacts have an epidemiologic link. However, an epidemiologic link can also be a location where the two patients spent time together or an activity that brought them together.

genotype: a genotype is the strain discrimination produced by conventional genotyping techniques used for Mycobacterium tuberculosis, including spacer oligonucleotide typing (spoligotyping) and 24-locus mycobacterial interspersed repetitive unit–variable number tandem repeat typing analysis (MIRU-VNTR). These designations were developed to facilitate communication of genotyping information among TB programs. 

genotyping surveillance coverage: the percentage of culture-positive TB cases with a genotype result.

genotyping cluster: a cluster consisting of ≥2 cases in a jurisdiction during a specified period with Mycobacterium tuberculosis isolates that share matching genotypes. In the United States, all cases with matching GENType are considered to be in a genotype cluster. The jurisdiction and period used to define clusters vary depending on the specific application. Within TB GIMS, a single county and a 3-year period are typically used to define a cluster. 
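The cluster definition can be sketched as a grouping step: cases are grouped by county and GENType within a chosen period, and groups with ≥2 cases are kept. The Python example below is a simplified illustration. The `Case` record, its field names, and the use of a fixed start and end date (rather than whatever windowing TB GIMS applies) are assumptions.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Case:
    """Hypothetical minimal case record used only for this illustration."""
    case_id: str
    county: str
    gentype: str
    report_date: date


def find_clusters(cases, period_start, period_end, min_size=2):
    """Group cases by (county, GENType) within a period; keep groups of >= min_size."""
    groups = defaultdict(list)
    for case in cases:
        if period_start <= case.report_date <= period_end:
            groups[(case.county, case.gentype)].append(case.case_id)
    return {key: ids for key, ids in groups.items() if len(ids) >= min_size}


if __name__ == "__main__":
    example = [
        Case("A", "County X", "G00162", date(2019, 5, 1)),
        Case("B", "County X", "G00162", date(2020, 2, 1)),
        Case("C", "County Y", "G00162", date(2020, 3, 1)),  # different county
        Case("D", "County X", "G00162", date(2015, 1, 1)),  # outside the period
    ]
    print(find_clusters(example, date(2018, 1, 1), date(2020, 12, 31)))
```

In this invented example, only cases A and B form a cluster, because case C is in another county and case D falls outside the 3-year period.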

GENType: a U.S. molecular surveillance designation for each unique combination of spoligotyping and 24-locus MIRU-VNTR analysis results. GENType is designated as a G followed by 5 digits, which are assigned sequentially to every genotype identified in the United States (e.g., G00017).
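The designation format (G followed by 5 digits) lends itself to a simple format check. The Python sketch below is illustrative: the function names are hypothetical, and it checks only the label's shape, not whether a given GENType has actually been assigned.

```python
import re

# "G" followed by exactly five ASCII digits, e.g., G00017.
GENTYPE_PATTERN = re.compile(r"G[0-9]{5}")


def is_gentype(label: str) -> bool:
    """Return True if the label has the GENType shape described in the text."""
    return GENTYPE_PATTERN.fullmatch(label) is not None


def format_gentype(sequence_number: int) -> str:
    """Zero-pad a sequence number to five digits and prefix it with G."""
    if not 0 <= sequence_number <= 99999:
        raise ValueError("sequence number must fit in five digits")
    return f"G{sequence_number:05d}"


if __name__ == "__main__":
    print(is_gentype("G00017"))   # True
    print(is_gentype("PCR01974")) # False
    print(format_gentype(162))    # G00162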

geospatial concentration: a measure of how concentrated a genotype is in time and space. It indicates that recent transmission might have occurred, because patients with isolates having the same genotype and who reside closer to each other are more likely to have come in contact with each other. TB GIMS uses the log-likelihood ratio (LLR) to generate a statistical measure of geospatial concentration of a given TB genotype for purposes of cluster detection and alerting. 

linking: the process of connecting genotyping results in TB GIMS with a corresponding TB case reported to NTSS. This process is essential for ensuring that demographic, risk factor, and geographic data can be viewed in TB GIMS for genotype clusters. 

log-likelihood ratio (LLR): a measure of the geographic concentration of a specific genotype in a county, compared with the national distribution of that same genotype, throughout a 3-year period. A higher LLR indicates that the genotype has geospatial clustering within the county, which might indicate recent transmission of Mycobacterium tuberculosis. 

mycobacterial interspersed repetitive unit–variable number tandem repeat (MIRU-VNTR): a polymerase chain reaction (PCR)-based assay used for genotyping. The National Tuberculosis Genotyping Service (NTGS) performs 24-locus MIRU-VNTR analysis on every isolate submitted for genotyping. Before 2009, 12-locus MIRU-VNTR was performed. MIRU-VNTR distinguishes Mycobacterium tuberculosis strains by the difference in the number of copies of tandem repeats at specific regions, or loci, of the Mycobacterium tuberculosis genome.

Mycobacterium bovis: a member of the Mycobacterium tuberculosis complex that is commonly associated with cattle. In the United States, human cases of M. bovis TB typically have a foodborne origin (e.g., consumption of unpasteurized milk or dairy products). M. bovis is intrinsically resistant to pyrazinamide. Detection of TB caused by M. bovis can be performed through genotyping; however, this information should not be relied on for clinical decision making. 

Mycobacterium tuberculosis complex: a group of closely related mycobacterial species that can cause latent TB infection (LTBI) and TB disease (i.e., Mycobacterium tuberculosis, Mycobacterium bovis, Mycobacterium bovis bacillus Calmette-Guérin, Mycobacterium africanum, Mycobacterium canettii, Mycobacterium caprae, Mycobacterium microti, Mycobacterium pinnipedii, and Mycobacterium mungi). Among humans, most TB cases are caused by M. tuberculosis.

National Tuberculosis Genotyping Service (NTGS): NTGS provides TB genotyping services to local and state TB control programs. Since NTGS’s launch in 2004, genotyping services have been provided at no cost to patients, health care providers, and health departments. 

National Tuberculosis Molecular Surveillance Center: The National TB Molecular Surveillance Center was established in 2018 to perform whole-genome sequencing on ≥1 isolate from newly diagnosed culture-positive TB cases in the United States. 

National Tuberculosis Surveillance System (NTSS): NTSS is administered by the Centers for Disease Control and Prevention (CDC) and collects surveillance data through an electronic reporting registry of TB cases. Data collected include demographic, clinical, and social risk factor variables that are reported to CDC by state and local health departments. 

PCRType: a U.S. molecular surveillance designation for each unique combination of spoligotyping and 12-locus MIRU-VNTR. PCRType is designated as PCR followed by 5 digits, which are assigned sequentially to every genotype identified in the United States (e.g., PCR01974). 

polymerase chain reaction (PCR): a laboratory method that can rapidly amplify limited quantities of specific DNA, thereby enabling certain types of laboratory testing. The National Tuberculosis Genotyping Service has routinely used two PCR-based techniques, spoligotyping and MIRU-VNTR. 

reinfection: disease caused by a second infection, often with a strain (genotype) of Mycobacterium tuberculosis different from the strain that caused the initial infection. (See also relapse.) 

relapse: a clinical worsening of TB disease after treatment and a presumed period of improvement, caused by the same strain (genotype) of Mycobacterium tuberculosis that caused the initial episode. Genotyping the initial and the subsequent Mycobacterium tuberculosis isolates might help distinguish relapse from reinfection. (See also reinfection.)

Report of a Verified Case of Tuberculosis (RVCT): national case surveillance data regarding patients with TB disease, recorded on the standardized RVCT form and subsequently reported to the Centers for Disease Control and Prevention’s National Tuberculosis Surveillance System. 

spacer oligonucleotide typing (spoligotyping): a polymerase chain reaction (PCR)-based assay used for genotyping. The National TB Genotyping Service performs spoligotyping on every isolate submitted for genotyping. Spoligotyping is based on spacer sequences located in the direct repeat region in the chromosomes of the Mycobacterium tuberculosis complex. The spoligotype uses an octal code to report results as a 15-digit number. 
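The 15-digit octal code can be illustrated with the commonly described conversion convention, which this glossary does not spell out and which is therefore stated here as an assumption: the presence or absence of 43 spacers is written as a 43-character binary string, the first 42 characters are read as 14 groups of three and each group is converted to one octal digit, and the 43rd character is appended unchanged. The function name is hypothetical.

```python
def spoligotype_to_octal(spacers: str) -> str:
    """Convert a 43-character binary spacer pattern to a 15-digit octal code.

    Assumed convention: '1' means the spacer is present and '0' means absent.
    Spacers 1-42 form 14 triplets (one octal digit each), and spacer 43
    is appended as a final 0 or 1.
    """
    if len(spacers) != 43 or set(spacers) - {"0", "1"}:
        raise ValueError("expected a 43-character string of 0s and 1s")
    triplets = [spacers[i:i + 3] for i in range(0, 42, 3)]
    octal_digits = [str(int(triplet, 2)) for triplet in triplets]
    return "".join(octal_digits) + spacers[42]


if __name__ == "__main__":
    # All 43 spacers present.
    print(spoligotype_to_octal("1" * 43))             # 777777777777771
    # Spacers 1-34 absent and 35-43 present.
    print(spoligotype_to_octal("0" * 34 + "1" * 9))   # 000000000003771
```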


1. Cowan LS, Crawford JT. Genotype analysis of Mycobacterium tuberculosis isolates from a sentinel surveillance population. Emerg Infect Dis. 2002;8(11):1294-1302.

2. Haddad MB, Diem MA, Cowan LS, et al. Tuberculosis genotyping in six low-incidence states, 2000–2003. Am J Prev Med. 2007;32(3):239-243.

3. Ghosh S, Moonan PK, Cowan L, Grant J, Kammerer S, Navin TR. Tuberculosis Genotyping Information Management System: enhancing tuberculosis surveillance in the United States. Infect Genet Evol. 2012;12(4):782-788.

4. Centers for Disease Control and Prevention. Tuberculosis Genotyping Information Management System. Atlanta, GA: U.S. Department of Health and Human Services, CDC; [undated].

5. Centers for Disease Control and Prevention. Monitoring tuberculosis programs—National Tuberculosis Indicator Project, United States, 2002–2008. MMWR Morb Mortal Wkly Rep. 2010;59(10):295-298.