Reported Tuberculosis in the United States, 2018

Appendix D

Genotyping Background Information and Glossary

Tuberculosis (TB) genotyping is a laboratory-based analysis of the genetic material of the bacteria that cause TB disease, Mycobacterium tuberculosis complex. The total genetic content is referred to as the genome. Specific sections of the genome contain distinct genetic patterns that help distinguish different strains of M. tuberculosis. TB genotyping examines the location, number, and presence of different types of spacer or repetitive DNA patterns. The areas of the genome examined in TB genotyping are different from those related to drug resistance.

Applications of Genotyping

Persons with TB disease who are related by transmission should have matching genotype results. Conversely, persons with matching TB genotyping results are probably related by transmission in some way, although the connection might not be recent or direct.

Genotyping results, when combined with epidemiologic data, can help identify persons with TB disease involved in the same chain of transmission. This information adds value to conventional TB control activities in different ways. These applications are summarized as follows:

Patient-Level Applications of Genotyping

Complete Contact Investigations

Suspected connections (epidemiologic linkages) between ≥2 patients with TB should be confirmed or refuted, whether or not those connections were identified through routine contact investigations.

Cluster Investigations

Connections that routine contact investigations missed should be identified by examining patients whose isolates have matching genotypes.

Other patient-level applications include detecting, refuting, or confirming potential false-positive culture results and distinguishing relapse TB disease from new TB infection among TB patients with recurrent TB disease.

Population-Level Applications of Genotyping

  • Potential outbreaks should be detected by using geospatial or other analyses of genotype clusters.
  • When cases believed to be part of the same outbreak have nonmatching genotype results, the outbreak should be refuted.
  • The scope of potential outbreaks should be defined by identifying all cases in an area with a matching genotype.
  • Known outbreaks should be monitored temporally by watching for new cases with the same outbreak-related genotype.

History of TB Genotyping Surveillance in the United States

In 1996, the Centers for Disease Control and Prevention (CDC) started the National Tuberculosis Genotyping Surveillance Network (NTGSN), a 5-year initiative that established the utility of genotyping in TB control efforts.¹ In 2004, based on the knowledge gained from NTGSN and associated studies,² CDC established the National TB Genotyping Service (NTGS) and funded a national genotyping laboratory, located in Michigan, to genotype ≥1 M. tuberculosis isolate from each culture-positive TB case reported in the United States.³ All U.S. TB control programs can use NTGS at no cost to the patients, health care providers, or health departments. NTGS participation is voluntary, with each program determining how genotyping data will be used for its TB control activities. Since 2004, approximately 140,000 M. tuberculosis isolates have been successfully genotyped through NTGS and its partnerships among CDC programs, national genotyping laboratories, and 58 states and jurisdictions.

In 2010, CDC launched the TB Genotyping Information Management System (TB GIMS), a secure Internet-based database available to all 50 U.S. states, the District of Columbia, Puerto Rico, the U.S. Virgin Islands, and the U.S.-affiliated Pacific Islands. TB GIMS makes genotyping data easily available to users and facilitates linking of genotyping data to patient surveillance records from the National TB Surveillance System. Key features include database queries regarding genotypes and clusters, data quality checks, aggregate reports, maps, and outbreak detection tools. TB GIMS has 576 users among local, state, tribal, federal, and territorial partners.

In 2018, CDC established the National TB Molecular Surveillance Center to perform whole-genome sequencing on ≥1 isolate from every culture-positive TB case in the United States.

Genotyping-Based Cluster Detection

CDC identifies genotype-matched clusters, which can represent TB outbreaks, by using geospatial analysis to identify unexpected groupings of TB cases. TB control programs can use cluster detection information to help allocate and prioritize resources for investigation and intervention on specific cases that might be caused by recent transmission.

CDC’s primary outbreak detection method identifies higher-than-expected geospatial concentrations of a TB genotype in a specific county, compared with the national distribution of that genotype. This method calculates a log-likelihood ratio (LLR) statistic. Clusters with higher LLRs are more likely to represent greater geospatial concentrations than clusters with lower LLRs, and higher LLRs might indicate recent transmission of TB. LLRs are then classified into alert levels within TB GIMS on the basis of established cut points: no alert (LLR 0–<5), medium alert (LLR 5–<10), or high alert (LLR ≥10). The alert level and changes in alert level (e.g., from no alert to medium or high) can help TB programs identify outbreaks and prioritize TB genotype clusters for further investigation or intervention.
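The cut points described above can be expressed as a short classification routine. The following Python sketch is illustrative only: the function names `alert_level` and `is_escalation` are hypothetical and are not part of TB GIMS, and treating a negative LLR as invalid input is an assumption.

```python
def alert_level(llr: float) -> str:
    """Classify a cluster's LLR by using the cut points stated in the text."""
    if llr < 0:
        # Assumption: the text gives no category for LLRs below 0.
        raise ValueError("LLR must be >= 0")
    if llr < 5:
        return "no alert"
    if llr < 10:
        return "medium alert"
    return "high alert"


def is_escalation(previous_llr: float, current_llr: float) -> bool:
    """Return True when a cluster moves from no alert to medium or high alert."""
    return (alert_level(previous_llr) == "no alert"
            and alert_level(current_llr) != "no alert")


print(alert_level(4.9))          # no alert
print(alert_level(5.0))          # medium alert
print(alert_level(12.3))         # high alert
print(is_escalation(3.2, 7.5))   # True
```

Because the cut points are half-open intervals, an LLR of exactly 5 falls in the medium category and an LLR of exactly 10 falls in the high category.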

Genotyping Terminology

In NTGS, a genotype is defined as a unique combination of spacer oligonucleotide typing results (spoligotype) and 24-locus mycobacterial interspersed repetitive unit–variable number tandem repeat typing (MIRU–VNTR) results. Each unique combination of results is assigned a GENType designated as G followed by 5 digits, which are assigned sequentially to every genotype identified in the United States (e.g., G00162). This nomenclature is designed for convenience and ease of communication, but the specific numbers assigned have no additional importance outside NTGS. Genotyping data from NTGS should not be used for clinical decision making.
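As an illustration of sequential designation, the following Python sketch assigns a new G number to each previously unseen combination of spoligotype and 24-locus MIRU–VNTR result. The class name, the starting number (G00001), and the example result strings are assumptions made for this sketch; actual GENType designations are assigned by NTGS and cannot be derived this way.

```python
class GenTypeRegistry:
    """Hypothetical registry that hands out sequential G designations."""

    def __init__(self) -> None:
        # Maps (spoligotype, MIRU-VNTR) pairs to their designation.
        self._designations: dict[tuple[str, str], str] = {}

    def designate(self, spoligotype: str, miru_vntr: str) -> str:
        key = (spoligotype, miru_vntr)
        if key not in self._designations:
            # Assumption: numbering starts at 1 and is zero-padded to 5 digits.
            next_number = len(self._designations) + 1
            self._designations[key] = f"G{next_number:05d}"
        return self._designations[key]


registry = GenTypeRegistry()
# Placeholder result strings, not real isolates.
first = registry.designate("777777777760771", "223325173533445644423328")
second = registry.designate("777777777760771", "223325173533445644423329")
repeat = registry.designate("777777777760771", "223325173533445644423328")
print(first, second, repeat)  # G00001 G00002 G00001
```

The sketch shows why the number itself carries no biological meaning: it reflects only the order in which a combination was first encountered.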

National TB Genotyping Surveillance Coverage in the United States

National TB genotyping surveillance coverage refers to the proportion of culture-positive TB cases with a genotyped M. tuberculosis isolate. High levels of coverage in the United States can provide a better understanding of the epidemiology of TB transmission within a specific geographic area, as well as nationally. Additionally, because outbreak detection algorithms are based on identifying unexpected geospatial concentrations of cases whose isolates have the same genotype, high coverage levels help decrease the likelihood of false-negative alerts. The National TB Indicator Project genotyping surveillance coverage objective is 94% for 2019.⁴
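Coverage as defined above is a simple proportion. The following Python sketch computes it and compares it with the 94% objective; the function names and the example counts are hypothetical and do not represent reported data.

```python
def genotyping_coverage(culture_positive_cases: int, genotyped_cases: int) -> float:
    """Return the proportion of culture-positive cases with a genotyped isolate."""
    if culture_positive_cases <= 0:
        raise ValueError("need at least one culture-positive case")
    if not 0 <= genotyped_cases <= culture_positive_cases:
        raise ValueError("genotyped cases must be between 0 and the case count")
    return genotyped_cases / culture_positive_cases


def meets_objective(culture_positive_cases: int, genotyped_cases: int,
                    objective_percent: int = 94) -> bool:
    """Compare coverage with the objective by using integer arithmetic."""
    genotyping_coverage(culture_positive_cases, genotyped_cases)  # validates input
    return genotyped_cases * 100 >= objective_percent * culture_positive_cases


# Hypothetical jurisdiction: 250 culture-positive cases, 236 genotyped.
print(f"{genotyping_coverage(250, 236):.1%}")  # 94.4%
print(meets_objective(250, 236))               # True
```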


Glossary

alert level: the classification (no alert, medium alert, or high alert) assigned to a genotype cluster on the basis of its log-likelihood ratio (LLR) statistic. TB GIMS calculates the alert level and updates it whenever a new case is added to a genotype cluster. E-mail notifications are sent to TB GIMS users whenever an alert level changes from no alert (LLR 0–<5) to medium alert (LLR 5–<10) or high alert (LLR ≥10).

cluster investigation: an investigation to identify epidemiologic links among TB patients whose isolates have matching genotypes. It might include reviewing information from public health and medical records and interviewing case managers and outreach workers. It can also involve re-interviewing TB patients.

epidemiologic (epi) link: a relationship shared by 2 TB patients that might explain where, when, and how Mycobacterium tuberculosis was transmitted between them. Patients who name each other as contacts have an epidemiologic link. However, an epidemiologic link can also be a location where the 2 persons spent time together or an activity that brought them together.

genotype: the strain discrimination produced by ≥1 of the 3 conventional genotyping techniques used for Mycobacterium tuberculosis: spoligotyping, 24-locus mycobacterial interspersed repetitive unit–variable number tandem repeat typing analysis, and IS6110-based restriction fragment length polymorphism. These designations were developed to facilitate communication of genotyping information among TB programs.

genotype surveillance coverage: the proportion of culture-positive TB cases with a genotype result.

genotyping cluster: a group of ≥2 cases in a jurisdiction during a specified period with M. tuberculosis isolates that share matching genotypes. In the United States, all cases with matching GENType or PCRType (polymerase chain reaction type) are considered to be in a genotype cluster. The jurisdiction and period used vary on the basis of the specific application of the term cluster. Within TB GIMS, a single county and a 3-year period are typically used to define a cluster.
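The cluster definition can be sketched as a grouping step: cases within the chosen period are grouped by county and GENType, and groups with ≥2 cases are kept. In the following Python sketch, the record fields (`case_id`, `county`, `gentype`, `count_date`), the function name, and the example cases are all hypothetical; this is not the TB GIMS implementation.

```python
from collections import defaultdict
from datetime import date


def find_genotype_clusters(cases, period_start, period_end, min_cases=2):
    """Group cases by (county, GENType) within a period; keep groups of >= min_cases."""
    groups = defaultdict(list)
    for case in cases:
        if period_start <= case["count_date"] <= period_end:
            groups[(case["county"], case["gentype"])].append(case["case_id"])
    return {key: ids for key, ids in groups.items() if len(ids) >= min_cases}


# Hypothetical cases for illustration only.
cases = [
    {"case_id": "A", "county": "County X", "gentype": "G00162", "count_date": date(2016, 5, 1)},
    {"case_id": "B", "county": "County X", "gentype": "G00162", "count_date": date(2017, 9, 12)},
    {"case_id": "C", "county": "County Y", "gentype": "G00162", "count_date": date(2017, 1, 3)},
    {"case_id": "D", "county": "County X", "gentype": "G00017", "count_date": date(2018, 2, 20)},
    {"case_id": "E", "county": "County X", "gentype": "G00162", "count_date": date(2012, 7, 7)},
]

# A 3-year period, matching the typical TB GIMS definition.
clusters = find_genotype_clusters(cases, date(2016, 1, 1), date(2018, 12, 31))
print(clusters)  # {('County X', 'G00162'): ['A', 'B']}
```

Case C shares the genotype but is in a different county, and case E falls outside the period, so neither joins the cluster under this definition. Widening the jurisdiction or the period would change the result, which is why the definition varies by application.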

GENType: a U.S. molecular surveillance designation for each unique combination of spoligotype and 24-locus mycobacterial interspersed repetitive unit–variable number tandem repeat typing analysis results. GENType is designated as a G followed by 5 digits, which are assigned sequentially to every genotype identified in the United States (e.g., G00017).

geospatial concentration: a measure of how concentrated a genotype is in time and space. It indicates that recent transmission might have occurred, because patients with infections with the same genotype who are proximal in location are more likely to have come in contact with each other. TB GIMS uses the log-likelihood ratio to generate a numeric measure of geospatial concentration of a given TB genotype.

linking: the process of connecting genotyping results in TB GIMS with a reported TB case from the National TB Surveillance System. This step is essential for ensuring that demographic, risk factor, and geographic data can be viewed in TB GIMS for genotype clusters.

log-likelihood ratio (LLR): a measure of the geographic concentration of a specific genotype in a county, compared with the national distribution of that same genotype, throughout a 3-year period. A higher LLR indicates that the genotype clustering within the county has a greater geospatial concentration than the national average, which might indicate recent transmission of Mycobacterium tuberculosis.
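To make the idea of comparing a county with the national distribution concrete, the following Python sketch computes a generic Bernoulli log-likelihood ratio of the kind used in spatial scan statistics. This is an assumption chosen for illustration: this appendix does not give the formula used by TB GIMS, and the function name, inputs, and example counts are hypothetical.

```python
import math


def _xlogy(x: float, y: float) -> float:
    """Return x * ln(y), treating 0 * ln(0) as 0."""
    return 0.0 if x == 0 else x * math.log(y)


def bernoulli_llr(county_genotype: int, county_total: int,
                  national_genotype: int, national_total: int) -> float:
    """LLR for a genotype being more concentrated inside a county than outside it.

    county_genotype / county_total: cases with the genotype / all cases in the county.
    national_genotype / national_total: the same counts for the whole country.
    """
    c, n, C, N = county_genotype, county_total, national_genotype, national_total
    if n == 0 or n == N:
        return 0.0  # no inside/outside comparison is possible
    p_in = c / n
    p_out = (C - c) / (N - n)
    if p_in <= p_out:
        return 0.0  # only higher-than-expected concentrations are of interest
    p_all = C / N
    alternative = (_xlogy(c, p_in) + _xlogy(n - c, 1 - p_in)
                   + _xlogy(C - c, p_out) + _xlogy((N - n) - (C - c), 1 - p_out))
    null = _xlogy(C, p_all) + _xlogy(N - C, 1 - p_all)
    return alternative - null


# Hypothetical counts: 10 national cases of a genotype among 10,000 cases.
print(bernoulli_llr(2, 100, 10, 10_000))   # modest concentration in the county
print(bernoulli_llr(10, 100, 10, 10_000))  # all cases of the genotype in one county
```

Under this formulation, the statistic is 0 when the county's share of the genotype matches the rest of the country and grows as the genotype becomes more concentrated in the county.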

multidrug-resistant (MDR): Mycobacterium tuberculosis strains that are resistant to at least isoniazid and rifampin.

mycobacterial interspersed repetitive unit–variable number tandem repeat (MIRU-VNTR): a polymerase chain reaction (PCR)-based genotyping assay. The National TB Genotyping Service performs 24-locus MIRU-VNTR analysis on every isolate submitted for genotyping. Before 2009, 12-locus MIRU-VNTR was performed.

Mycobacterium bovis: a member of the Mycobacterium tuberculosis complex that is commonly associated with cattle. In the United States, human cases of M. bovis TB typically have a foodborne origin (e.g., consumption of unpasteurized dairy products). M. bovis is intrinsically resistant to pyrazinamide. Detection of TB caused by M. bovis can be performed through genotyping; however, this information should not be relied on for clinical decision making.

Mycobacterium tuberculosis complex: a group of closely related mycobacterial species that can cause latent TB infection (LTBI) and TB disease (i.e., M. tuberculosis, Mycobacterium bovis, Mycobacterium bovis bacillus Calmette-Guérin, Mycobacterium africanum, Mycobacterium canettii, Mycobacterium microti, Mycobacterium pinnipedii, and Mycobacterium mungi). Among humans, the majority of TB cases are caused by M. tuberculosis.

National Tuberculosis Genotyping Service (NTGS): a service that provides TB genotyping to local and state TB control programs. Since NTGS launched in 2004, genotyping services have been provided at no cost to patients, health care providers, and health departments.

National Tuberculosis Surveillance System (NTSS): a system administered by the Centers for Disease Control and Prevention (CDC) that collects surveillance data through an electronic reporting registry. Data collected include demographic, clinical, and social risk factor variables that are reported to CDC by state and local health departments.

PCRType (polymerase chain reaction type): a designation for each unique combination of spoligotype and 12-locus mycobacterial interspersed repetitive unit–variable number tandem repeat result. PCRType is designated as PCR followed by 5 digits, which are assigned sequentially to every genotype identified in the United States (e.g., PCR01974).

polymerase chain reaction (PCR): a laboratory method that can rapidly amplify limited quantities of DNA, thereby enabling certain types of laboratory testing. The National Tuberculosis Genotyping Service has routinely used 2 PCR-based techniques, spoligotyping and mycobacterial interspersed repetitive unit–variable number tandem repeat analysis.

reinfection: disease caused by a second infection, often with a strain different from the strain that caused the initial infection. (See also relapse.)

relapse: a worsening of signs and symptoms of disease after treatment and a presumed period of improvement. It is caused by the same strain (genotype) of Mycobacterium tuberculosis that caused the initial episode. Genotyping the initial and the subsequent M. tuberculosis isolates can be used to distinguish relapse from reinfection. (See also reinfection.)

Report of a Verified Case of Tuberculosis (RVCT): national surveillance data regarding patients with TB disease, recorded on the RVCT form and subsequently reported to the Centers for Disease Control and Prevention’s National Tuberculosis Surveillance System.

restriction fragment length polymorphism (RFLP): a genotyping technique based on measuring the number and length of specific DNA fragments that are cut using specific restriction enzymes. IS6110-based RFLP analysis was the first widely used method for genotyping Mycobacterium tuberculosis isolates.

spoligotyping: spacer oligonucleotide typing, a genotyping technique based on spacer sequences located in the direct repeat region in the chromosomes of the Mycobacterium tuberculosis complex. Spoligotype results are reported as a 15-digit octal code.
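The following Python sketch shows how a 15-digit octal code can be derived from a spacer pattern. It assumes the widely used convention, not stated in this appendix, in which the presence or absence of 43 spacers is recorded as a binary string, the first 42 positions are read as 14 groups of 3 (each converted to an octal digit from 0 to 7), and the 43rd position is appended as a final 0 or 1. The function name and example patterns are hypothetical.

```python
def spoligotype_octal(spacers: str) -> str:
    """Convert a 43-character presence/absence string ('1'/'0') to a 15-digit code."""
    if len(spacers) != 43 or set(spacers) - {"0", "1"}:
        raise ValueError("expected 43 characters, each '0' or '1'")
    # First 42 spacers: 14 triplets, each read as a 3-bit binary number.
    digits = [str(int(spacers[i:i + 3], 2)) for i in range(0, 42, 3)]
    # 43rd spacer: appended unchanged as the final digit.
    digits.append(spacers[42])
    return "".join(digits)


# All 43 spacers present.
print(spoligotype_octal("1" * 43))  # 777777777777771

# Illustrative pattern: spacers 20-21 and 33-36 absent, all others present.
pattern = "1" * 19 + "00" + "1" * 11 + "0000" + "1" * 7
print(spoligotype_octal(pattern))   # 777777477760771
```

The octal form is a compact shorthand: 15 digits replace 43 binary values without loss of information.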

  1. Cowan LS, Crawford JT. Genotype analysis of Mycobacterium tuberculosis isolates from a sentinel surveillance population. Emerg Infect Dis. 2002;8(11):1294-1302.
  2. Haddad MB, Diem MA, Cowan LS, et al. Tuberculosis genotyping in six low-incidence states, 2000–2003. Am J Prev Med. 2007;32(3):239-243.
  3. Ghosh S, Moonan PK, Cowan L, Grant J, Kammerer S, Navin TR. Tuberculosis Genotyping Information Management System: enhancing tuberculosis surveillance in the United States. Infect Genet Evol. 2012;12(4):782-788.
  4. Centers for Disease Control and Prevention. Monitoring tuberculosis programs—National Tuberculosis Indicator Project, United States, 2002–2008. MMWR Morb Mortal Wkly Rep. 2010;59(10):295-298.