Guide to the Application of Genotyping to Tuberculosis Prevention and Control

Combining Genotyping and Epidemiologic Data to Improve Our Understanding of Tuberculosis Transmission


Matching Versus Nonmatching Genotypes

The first objective in interpreting genotyping results is to decide if an isolate has a genotype pattern that matches any other isolate in the genotyping results database. Isolates that show a genotyping pattern that matches at least one other isolate in the database are referred to as belonging to the same genotyping cluster. An isolate with at least one other genotype match is also referred to as being clustered. If an isolate has a genotyping pattern that does not match any other isolate in the database, that isolate is referred to as having a nonmatching or unique genotype.

In general, the determination of whether an isolate has a matching or a nonmatching genotype is straightforward, since the genotyping laboratory report will provide a PCR cluster designation for all isolates that have a matching spoligotype and MIRU type. If the genotyping laboratory report lists no PCR cluster designation, the isolate has a nonmatching or unique genotype. TB program staff can determine for themselves if there are any matching PCR genotypes by performing a simple Excel SORT command on the spoligotype and MIRU type results, since that command will group all matching isolates together.
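The same grouping can be done outside a spreadsheet. The following is a minimal sketch, not a prescribed procedure: the record layout, the field names (isolate_id, spoligotype, miru), and the sample values are hypothetical and do not reflect an actual laboratory report format.

```python
from collections import defaultdict

def group_by_pcr_genotype(isolates):
    """Group isolate IDs by their (spoligotype, MIRU type) pair."""
    groups = defaultdict(list)
    for iso in isolates:
        groups[(iso["spoligotype"], iso["miru"])].append(iso["isolate_id"])
    return dict(groups)

def split_matching(isolates):
    """Return (matching, nonmatching) based on the two PCR results only."""
    groups = group_by_pcr_genotype(isolates)
    # A genotype shared by two or more isolates is a PCR match.
    matching = {key: ids for key, ids in groups.items() if len(ids) > 1}
    # A genotype seen exactly once is nonmatching (unique) in this database.
    nonmatching = [ids[0] for ids in groups.values() if len(ids) == 1]
    return matching, nonmatching

# Hypothetical records; the genotype labels are placeholders, not real codes.
isolates = [
    {"isolate_id": "A1", "spoligotype": "spol-A", "miru": "miru-1"},
    {"isolate_id": "A2", "spoligotype": "spol-A", "miru": "miru-2"},
    {"isolate_id": "A3", "spoligotype": "spol-A", "miru": "miru-1"},
]

matching, nonmatching = split_matching(isolates)
print(matching)
print(nonmatching)
```

Note that A2 shares a spoligotype with the other two isolates but not a MIRU type, so it is treated as nonmatching; both PCR results must agree for a match.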

Two factors, however, complicate this picture. First, while all isolates will have genotyping results from the two PCR tests, only a subset of isolates will have IS6110-based RFLP results. When interpreting genotyping results from isolates that belong to a PCR cluster, it is important to remember that a subsequent RFLP analysis may reveal that some or all of the isolates have different RFLP patterns and do not, therefore, belong to the same genotyping cluster. More generally, when one speaks of isolates belonging to the same genotyping cluster, it is important to clarify whether the isolates belong to the same PCR cluster (and RFLP has not been performed) or to the same PCR/RFLP cluster (Table 4.1).

Table 4.1. Genotyping cluster designations based on results of the three genotyping methods (spoligotyping, MIRU analysis, and IS6110-based RFLP). Only isolates that match by the two PCR methods should be analyzed by IS6110-based RFLP.

PCR-based test results | IS6110-based RFLP not performed | IS6110-based RFLP performed: patterns match | IS6110-based RFLP performed: patterns do not match
Both spoligotype and MIRU analysis show matching genotypes | PCR cluster | PCR/RFLP cluster | Nonmatching (or unique) genotypes
Either spoligotype or MIRU analysis shows a nonmatching genotype | Nonmatching (or unique) genotypes* | Nonmatching (or unique) genotypes* | Nonmatching (or unique) genotypes*

*RFLP not indicated in this situation
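The logic of Table 4.1 can also be written as a small decision function. This is an illustrative sketch only; the function name, parameter names, and returned labels are hypothetical choices made for this example.

```python
from typing import Optional

def cluster_designation(spoligotype_match: bool,
                        miru_match: bool,
                        rflp_match: Optional[bool] = None) -> str:
    """Return the Table 4.1 designation for a pair of isolates.

    rflp_match is None when RFLP has not been performed, True when the
    RFLP patterns match, and False when they do not match.
    """
    # If either PCR method shows a nonmatching genotype, the isolates are
    # nonmatching regardless of RFLP (RFLP is not indicated in this case).
    if not (spoligotype_match and miru_match):
        return "nonmatching (or unique) genotypes"
    if rflp_match is None:
        return "PCR cluster"
    if rflp_match:
        return "PCR/RFLP cluster"
    return "nonmatching (or unique) genotypes"

print(cluster_designation(True, True))         # RFLP not performed
print(cluster_designation(True, True, True))   # RFLP patterns match
print(cluster_designation(True, True, False))  # RFLP patterns do not match
print(cluster_designation(True, False))        # one PCR method does not match
```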

The second factor that complicates the definition of nonmatching genotypes is the possibility that other isolates, either in another TB program’s database or genotyped in the future, may reveal matching genotypes. For example, consider a source patient who lived and worked in Kansas City, Missouri, and transmitted TB to one secondary patient at their place of work. If the secondary patient lived in Kansas City, Kansas, a search of the Kansas TB program’s genotyping database would not reveal a genotype match, nor would a search of the Missouri TB program’s genotyping database. If the two programs routinely compared their data, however, the match would be identified. Similarly, if a source patient transmits TB to a secondary patient, and that secondary patient is not diagnosed at the same time, the initial review of the genotyping data will show that the source patient’s isolate has a nonmatching genotype. When the secondary case is diagnosed and the isolate genotyped, the source case’s status will change from nonmatching to matching.

In summary, it is important to bear in mind that the classification of an isolate as matching or nonmatching is provisional and can change as new data become available.
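The Kansas City example can be made concrete with a short sketch. The records, isolate identifiers, and genotype labels below are invented for illustration, and the two lists merely stand in for two programs' databases.

```python
from collections import Counter

def nonmatching_ids(records):
    """Return the IDs whose (spoligotype, MIRU) pair occurs only once."""
    counts = Counter((r["spoligotype"], r["miru"]) for r in records)
    return {r["isolate_id"] for r in records
            if counts[(r["spoligotype"], r["miru"])] == 1}

# Hypothetical databases for two neighboring TB programs.
kansas = [
    {"isolate_id": "KS-1", "spoligotype": "spol-A", "miru": "miru-1"},
]
missouri = [
    {"isolate_id": "MO-1", "spoligotype": "spol-A", "miru": "miru-1"},
    {"isolate_id": "MO-2", "spoligotype": "spol-B", "miru": "miru-2"},
]

# Searched separately, every isolate appears to have a unique genotype.
print(nonmatching_ids(kansas))
print(nonmatching_ids(missouri))

# Once the data are pooled, KS-1 and MO-1 match; only MO-2 stays unique.
print(nonmatching_ids(kansas + missouri))
```

The same function applied again after new isolates are genotyped would show the second kind of change described above, in which a previously unique isolate becomes clustered over time.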

Infectious Period

The infectious period is a key part of determining whether epidemiologic links exist between TB patients because it describes when a TB patient was most likely capable of transmitting TB to others. We provide an operational definition of the term here, stated separately for sputum smear-positive and sputum smear-negative cases.

  • Sputum smear-positive cases: the infectious period extends from 3 months before the first positive smear or symptom onset (whichever is earlier) until 2 weeks after the start of appropriate TB treatment, until the patient is placed into isolation, or until the date of the first negative smear that is followed by consistently negative smears.
  • Sputum smear-negative cases: the infectious period extends from 1 month before symptom onset, the start of appropriate TB treatment, or placement into isolation (whichever was earliest) until 2 weeks after the start of appropriate treatment or until isolation began.
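These definitions can be sketched as a date calculation. The function and parameter names below are hypothetical. Two assumptions are made that the definitions above do not state: when more than one end point is available, the earliest one is used, and "months" are counted as calendar months. Programs should substitute their own rules where these differ.

```python
import calendar
from datetime import date, timedelta

def months_before(d, months):
    """Subtract calendar months, clamping the day to the month's last day."""
    index = d.year * 12 + (d.month - 1) - months
    year, month0 = divmod(index, 12)
    last_day = calendar.monthrange(year, month0 + 1)[1]
    return date(year, month0 + 1, min(d.day, last_day))

def infectious_period(smear_positive, symptom_onset=None,
                      first_positive_smear=None, treatment_start=None,
                      isolation_start=None, first_sustained_negative_smear=None):
    """Return (start, end); end is None if no end point is known yet."""
    if smear_positive:
        # 3 months before the first positive smear or symptom onset.
        anchors = [first_positive_smear, symptom_onset]
        lookback = 3
    else:
        # 1 month before symptom onset, treatment start, or isolation.
        anchors = [symptom_onset, treatment_start, isolation_start]
        lookback = 1
    anchors = [d for d in anchors if d is not None]
    if not anchors:
        raise ValueError("at least one anchor date is required")
    start = months_before(min(anchors), lookback)

    ends = []
    if treatment_start is not None:
        ends.append(treatment_start + timedelta(weeks=2))
    if isolation_start is not None:
        ends.append(isolation_start)
    if smear_positive and first_sustained_negative_smear is not None:
        ends.append(first_sustained_negative_smear)
    # ASSUMPTION: the earliest available end point closes the period.
    end = min(ends) if ends else None
    return start, end

# Hypothetical smear-positive case.
print(infectious_period(True,
                        symptom_onset=date(2024, 3, 10),
                        first_positive_smear=date(2024, 4, 1),
                        treatment_start=date(2024, 4, 5)))
```

For this hypothetical case the period runs from December 10, 2023 (3 months before symptom onset, the earlier of the two starting dates) to April 19, 2024 (2 weeks after treatment began).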

Epidemiologic Links

Information on epidemiologic links between two patients with TB comes from data collected during the initial case interviews, the contact investigations, and a subsequent cluster investigation, if one is undertaken.

Key data collected during the case interviews that help define epidemiologic links include the following: a) the locations where patients lived, worked, and spent time (in order to determine if the patients in a genotyping cluster were also clustered in space); b) the times that each patient was present at each of the locations (in order to determine if the patients were clustered in time); c) the infectious period; and d) social and behavioral traits that the patients might share that could increase the chance of TB transmission (e.g., drug use, homelessness, incarceration). Key data collected during contact or cluster investigations include the following: a) whether either patient named the other as a contact; and b) whether the patients lived, worked, or spent time at the same place (this information may come from the initial case interview or from the contact investigation). During cluster investigations, field staff members seek the same information. Because genotyping results are already available and show that the patients belong to the same genotyping cluster, however, cluster investigations are more focused and search for possible links that might have occurred farther in the past.
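Deciding whether two patients were clustered in both space and time amounts to checking whether date ranges overlap. The sketch below assumes that presence at a single location and the infectious period are each recorded as a (start, end) pair of dates; the function names and sample dates are hypothetical.

```python
from datetime import date

def overlap(a_start, a_end, b_start, b_end):
    """Return the (start, end) overlap of two closed date ranges, or None."""
    start = max(a_start, b_start)
    end = min(a_end, b_end)
    return (start, end) if start <= end else None

def shared_exposure(presence_a, presence_b, infectious_period):
    """Return when both patients were at the same location during the
    infectious period of one of them, or None if there was no such time."""
    both_present = overlap(*presence_a, *presence_b)
    if both_present is None:
        return None
    return overlap(*both_present, *infectious_period)

# Hypothetical dates for two patients at the same location.
patient_a_present = (date(2024, 1, 1), date(2024, 3, 31))
patient_b_present = (date(2024, 3, 1), date(2024, 6, 30))
patient_a_infectious = (date(2024, 3, 15), date(2024, 5, 1))

print(shared_exposure(patient_a_present, patient_b_present, patient_a_infectious))
```

In practice the dates gathered in interviews are often approximate, so a result from a check like this is a prompt for follow-up rather than a conclusion.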

What constitutes a known, as opposed to a possible, epidemiologic link cannot be defined as precisely as a genotyping match. The text box, Summing Up: Defining Epidemiologic Links, lists general guidance about definitions that have proven helpful to some TB programs. As we learn more about how to interpret genotyping data, these definitions may need to be revised. As with genotyping data, epidemiologic links are provisional at any point in time. A contact investigation might fail to identify an epidemiologic link that is discovered only during a subsequent cluster investigation. Similarly, a link may become apparent only when additional cases are added to a cluster and new information emerges about how all the cases are related. Table 4.2 lists commonly identified relationships and locations that were found to represent known epidemiologic links in the NTGSN study.

Table 4.2: Commonly identified relationships and settings that represent known epidemiologic links between TB patients.*

Relationship | Frequency
Household member | 47%
Common source | 27%
Friend or contact outside the home | 23%
Co-worker | 3%
Total | 100%

Setting | Frequency
Emergency shelter | 18%
Group quarters | 11%
Prison or jail | 7%
Nursing home | 3%
Hospital | 1%
School/day care | 1%
Nontraditional setting | 59%
Total | 100%

*This analysis of unpublished NTGSN data includes 1,485 epidemiologic links between TB patients who had matching genotypes and for whom a contact or cluster investigation identified a likely location and relationship of transmission.
A common source was defined as two TB patients who were in the same place at the same time but did not fit into any of the other categories.
Common nontraditional settings included bars/social clubs, churches/temples, drug/crack houses, and other locations not typically asked about in routine contact investigations.

Summing Up: Defining epidemiologic links

Based on the information collected during case interviews, contact investigations, cluster investigations, and record reviews, TB patients in a genotyping cluster can be characterized by the strength of the epidemiologic links between them.

Known epidemiologic link

Two patients are said to have a known epidemiologic link if either of the following two conditions applies:

  • One of the patients named the other as a contact during one of the patients’ infectious periods

  • The two patients were at the same place at the same time during one of the patients’ infectious periods

Possible epidemiologic link

Two patients are said to have a possible epidemiologic link if any one of the following conditions applies:

  • The two patients spent time at the same place around the same time, but the timing of when they were there or the timing of the infectious period was not definite enough to meet the criteria for a known epidemiologic link

  • The two patients lived in the same neighborhood around the same time

  • The two patients worked in or were in the same geographic area around the same time and shared social or behavioral traits that increased the chances of transmission

No identified epidemiologic link

Two patients should be classified as having no identified epidemiologic link if they do not meet the criteria listed above.
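The three categories above can be sketched as a simple classification rule. The class and field names are hypothetical; each field records whether investigators found the corresponding condition for a pair of patients, and the judgment behind each finding remains with the TB program.

```python
from dataclasses import dataclass

@dataclass
class PairFindings:
    """Investigation findings for one pair of patients (hypothetical fields)."""
    named_as_contact_in_infectious_period: bool = False
    same_place_same_time_in_infectious_period: bool = False
    same_place_timing_uncertain: bool = False
    same_neighborhood_around_same_time: bool = False
    same_area_and_shared_risk_traits: bool = False

def classify_link(f: PairFindings) -> str:
    """Return 'known', 'possible', or 'none identified' for the pair."""
    # Either known-link condition is sufficient on its own.
    if (f.named_as_contact_in_infectious_period
            or f.same_place_same_time_in_infectious_period):
        return "known"
    # Any one of the possible-link conditions is sufficient.
    if (f.same_place_timing_uncertain
            or f.same_neighborhood_around_same_time
            or f.same_area_and_shared_risk_traits):
        return "possible"
    return "none identified"

print(classify_link(PairFindings(named_as_contact_in_infectious_period=True)))
print(classify_link(PairFindings(same_neighborhood_around_same_time=True)))
print(classify_link(PairFindings()))
```

Because the known-link check comes first, a pair that meets both a known and a possible condition is classified as known. Since links are provisional, the classification should be rerun whenever new findings are recorded.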