Syphilis serologic testing algorithms

Key Question

What considerations (i.e., diagnostic and cost-effectiveness implications) should be taken into account when screening for syphilis using either the traditional or the reverse algorithm?

Literature Search Terms

((Treponema pallidum OR neurosyphilis OR syphilis) AND (sero-diagnos* OR serodiagnos* OR (serolog* AND (test* OR exam* OR assay* OR screen* OR lab* OR diagnos* OR nontreponemal OR treponemal OR algorithm* OR antibody titer)) OR serofast)) NOT (exp animals/ not exp humans/). Solely international studies were excluded from the literature search.


Four reviewers rated the evidence as high, medium, or low based on each study’s strengths and weaknesses. Case reports or small case studies were reviewed.

High quality publications: Studies using clinically characterized specimens, stratified by stage, larger sample size, prospective or a well-done cross sectional or retrospective study. Studies with large sample sizes, clinically characterized but not stratified by stage, or characterized but unclear exactly how it was done.

Medium quality publications: Studies with small sample sizes, moderate methodological issues, single lab test as gold standard, or descriptive.

Low quality publications: Studies with major methodological issues or small sample sizes.

I: Case reports or small case studies.

NR: Studies that were not relevant to the key question were assigned NR and not further rated.

Aktas, G., et al. (2005). “Evaluation of the serodia Treponema pallidum particle agglutination, the Murex Syphilis ICE and the Enzywell TP tests for serodiagnosis of syphilis.” International Journal of STD & AIDS 16(4): 294-298. Evaluation of Serodia TPPA, Murex syphilis ICE, and Enzywell TP tests for screening and confirmation

Compared to RPR and T. pallidum haemagglutination assay (TPHA)

1876 patient sera from Istanbul Medical Facility from 1997-1998 screened by RPR followed by TPHA confirmation

124 reactive RPR and/or TPHA samples included in the study

Gold standard definition- Laboratory (TPHA or RPR)

24 RPR (+), 16 TPHA (+), and 84 both (+)

95.9% overall agreement between the 4 treponemal tests

TPHA agreement with TPPA 96.7%, ICE 100%, and TP 99.1%.

Syphilis diagnosis using the traditional screening algorithm TPHA (84), TPPA (83), ICE (85), TP (85).

Negative RPR but positive treponemal test- TPHA (16), TPPA (15), ICE (16), TP (16).

23 (18.5%) RPR biological false positives.

Large sample size

Screened all 1876 samples with a non-treponemal and treponemal assay

6.6% prevalence rate in study cohort

One sample could not be tested by Enzywell TP due to insufficient quantity

High agreement between the four treponemal assays

Validity of gold standard- good

Moderate relevance

More false positives seen using the RPR as a screen compared with the 4 treponemal assays

Aktas, G., et al. (2007). “Evaluation of the fluorescent treponemal antibody absorption test for detection of antibodies (immunoglobulins G and M) to Treponema pallidum in serologic diagnosis of syphilis.” International Journal of STD & AIDS 18(4): 255-260 Compared the FTA-ABS assay to TPHA, TPPA, ICE ELISA, Diesse Enzywell TP ELISA

1876 patient samples from Istanbul, Turkey between 1996-1998

All sera tested by RPR and/or TPHA

42 sera tested by western blot

Gold standard definition- Laboratory (TPHA or RPR)

Out of 122 samples, 83 (68%) RPR+, 94 (77%) FTA-ABS+, 82 (67%) TPHA+, 81 (66%) TPPA+, 83 (68%) ICE+, 83 (68%) TP ELISA+

FTA-ABS agreement with the TPHA, TPPA, ICE, and TP ELISA was 97.5%, 95.9%, 98.3% and 98.3%, respectively.

WB agreement with TPHA, Serodia TPPA, Murex syphilis ICE, Enzywell TP and FTA-ABS tests in 42 sera were 92.8%, 97.6%, 100%, 95.2% and 92.8%, respectively.

Large sample size

Screened all 1876 samples with a non-treponemal and treponemal assay

Syphilis stage not determined

WB performed on a subset of samples

Validity of gold standard- good

Moderate relevance

Near identical performance between the treponemal assays

The FTA-ABS assay may be slightly more prone to giving equivocal and false-negative results

Angue, Y., et al. (2005). “Syphilis serology testing: a comparative study of Abbot Determine, Rapid Plasma Reagin (RPR) card test and Venereal Disease Research Laboratory (VDRL) methods.” Papua New Guinea Medical Journal 48(3-4): 168-173. Evaluation of Abbott Determine (treponemal assay) and Abbott Syfacard R (RPR)

Compared to VDRL

2100 serum samples from women in Papua New Guinea (1 hospital and 1 clinic)

March-September 2002

Used traditional algorithm with VDRL followed by TPHA

Gold standard definition- Laboratory (VDRL)

Abbott Determine had 92.0% sensitivity, 94.6% specificity, 42.6% PPV, and 99.6% NPV

RPR had 56.3% sensitivity, 96.5% specificity, 41.2% PPV, and 98.1% NPV

Abbott Determine identified 108 additional positives and RPR 70 additional.

Large sample size

All samples were tested by the VDRL, RPR, and Determine

4-5% prevalence at hospital

Used a non-treponemal and treponemal test for screening

Only compared each test to VDRL and not against each other

Many of the syphilis screens were not confirmed by a second assay

TPHAs were performed only on 40% VDRL, 51% Determine, and 65% RPR positive samples.

Validity of gold standard- VDRL test not routinely used in syphilis screening

Moderate relevance

Many discrepancies between the VDRL and the RPR for syphilis screening; however, many of these results were not confirmed with TPHA
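
The low PPV reported for the Determine assay, despite its high sensitivity and specificity, is what a low-prevalence cohort would be expected to produce. The following is a minimal Python sketch, assuming the standard Bayes relationship between accuracy and predictive values and using the reported sensitivity (92.0%), specificity (94.6%), and the 4-5% hospital prevalence as inputs. The function name is illustrative and does not come from the study, and "prevalence" here means the fraction reactive by the VDRL comparator.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) given test accuracy and prevalence, all as fractions."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Reported Determine accuracy, evaluated at the two ends of the
# 4-5% prevalence range given for the hospital site.
for prev in (0.04, 0.05):
    ppv, npv = predictive_values(0.920, 0.946, prev)
    print(f"prevalence {prev:.0%}: PPV {ppv:.1%}, NPV {npv:.1%}")
```

Under these assumptions the computed PPV at 4% and 5% prevalence brackets the reported 42.6%, which suggests the low PPV reflects prevalence rather than poor assay accuracy.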

Augenbraun, M., et al. (2010). “Hepatitis C virus infection and biological false-positive syphilis tests.” Sexually Transmitted Infections 86(2): 97-98. Cross-sectional

Identifying biological false positives (BFP) in HCV patients with an RPR screen

3666 US women tested

354 RPR screen positive

Used traditional algorithm

Gold standard definition- Laboratory RPR and treponemal assay (TPHA, FTA-ABS, or TPPA)

180 HCV + and RPR + with 4% BFP

174 HCV – and RPR + with 1% BFP

After correcting for intravenous drug use and HIV status, the BFP rate in HCV-positive women remained the same

Large sample size

Multiple different treponemal assays used for confirmatory testing (fluorescent treponemal antigen absorption test, microhaemagglutination, or TPPA)

Corrected for some of the possible confounders

Only women included

Validity of gold standard- traditional algorithm

Moderate relevance

HCV infection can cause biological false positives when using the RPR test as a syphilis screen. Roughly 4X increase in BFPs.

Berry GJ, Loeffelholz MJ Sex Transm Dis. 2016 Dec;43(12):737-740. Use of Treponemal Screening Assay Strength of Signal to Avoid Unnecessary Confirmatory Testing. 665 Bioplex reactive specimens. Cross sectional.

3.8% screen reactive rate. Mixed patient population.

Assessed likelihood of TPPA confirmation by antibody index

Gold standard definition- N/A.

Majority (65%) of screen-positive specimens had very high antibody index. Of these, >99% were TPPA reactive; consistent among high and low risk populations.


Comparison of different patient populations.

Single center study.

Validity of gold standard- N/A

Relevant. Strong positives (antibody index at least 8) were >99% likely to confirm by TPPA, so TPPA could be skipped if the AI is 8 or higher.

Compare to Yen-Lieberman, who showed >6 likely to confirm.
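
The signal-strength rule described above can be sketched as a simple reflex decision. This is a hedged illustration only: the function name and the example antibody index values are hypothetical, and the default cutoff of 8 is the value noted for this study (the note above cites a lower cutoff of >6 from Yen-Lieberman). Any laboratory would need to validate its own cutoff.

```python
def needs_tppa_confirmation(antibody_index, skip_threshold=8.0):
    """Return True if a screen-reactive specimen should still reflex to TPPA.

    Specimens with an antibody index at or above the threshold are treated
    as strong positives that are very likely to confirm, so TPPA is skipped.
    """
    return antibody_index < skip_threshold


# Hypothetical antibody index values for screen-reactive specimens.
example_indices = [1.4, 5.9, 8.0, 12.3]
for ai in example_indices:
    action = "reflex to TPPA" if needs_tppa_confirmation(ai) else "skip TPPA"
    print(f"AI {ai}: {action}")
```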

Binnicker, M. J., et al. (2011). “Treponema-specific tests for serodiagnosis of syphilis: comparative evaluation of seven assays.” Journal of Clinical Microbiology 49(4): 1313-1317. Evaluation of 7 treponemal assays: Bioplex 2200 Syphilis IgG, Fluorescent treponemal antibody (FTA), TP-PA, Trep-Sure enzyme immunoassay (EIA), Trep-Check EIA, Trep-ID EIA, and Treponema ViraBlot IgG

Compared to FTA assay

Each sample also tested by RPR and ViraBlot IgM WB assays

303 serum samples submitted to Mayo Clinic

Gold standard definition- Laboratory (FTA-ABS or consensus of the 4 out of 7 treponemal tests)

Compared to FTA- percent agreement was 98.0% Bioplex 2200 Syphilis IgG, 97.0% TP-PA, 95.4% Trep-Sure enzyme immunoassay (EIA), 97.7% Trep-Check EIA, 98.4% Trep-ID EIA, and 97.0% Treponema ViraBlot IgG

Consensus 4 out of 7 positive (panel)- percent agreement was 99.0% Bioplex 2200 Syphilis IgG, 98.0% TP-PA, 95.7% Trep-Sure enzyme immunoassay (EIA), 98.7% Trep-Check EIA, 99.3% Trep-ID EIA, and 98.0% Treponema ViraBlot IgG

Fastest TAT and throughput- Bioplex at 1.75 h for 100 samples and 514 samples for 9 hour shift

Slowest TAT and lowest throughput- Trep ID with 5.7 h for 100 samples and 158 samples in 9 h shift

Among the 97 FTA +, and/or 94 panel +, 66 were RPR + (62.9% and 64.9%, respectively)

61 RPR + were also panel +

Small sample size

Multiple treponemal assays evaluated against each other

RPR and ViraBlot IgM results not included in tables

Clinical data from specimens not included

Validity of gold standard- FTA-ABS not an ideal gold standard but also compared to consensus of test panel

Highly relevant

Increased percent-positive rate using treponemal assays

Comparable performance between the 7 different treponemal assays

Binnicker MJ, Jespersen DJ, Rollins LO. Direct comparison of the traditional and reverse syphilis screening algorithms in a population with a low prevalence of syphilis. J Clin Microbiol 2012; 50:148–150.


1000 prospective specimens.

Low prevalence (1.5% reactive by Bioplex)

Each sample tested by the reverse algorithm and also screened by RPR

Gold standard definition- clinical (to resolve discordants in algorithms)

6/1000 (0.6%) Bioplex false positive

2/1000 (0.2%) RPR false negative: possible latent syphilis (and treated)


Not reported if 2 RPR false negatives were due to prozone phenomenon

Validity of gold standard- good



Very relevant.

Head-to-head of reverse and traditional algorithms.

Reverse has false positives (need to reflex). Traditional has rare false negatives.
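
The difference between the two algorithms can be sketched as two small decision functions. This is a simplified illustration, not a clinical tool. It assumes the commonly described sequence (traditional: RPR screen, then treponemal confirmation; reverse: treponemal screen, then RPR, then a second treponemal test such as TPPA for discordants). The interpretive strings are illustrative wording, not text from the study.

```python
def traditional_algorithm(rpr_reactive, treponemal_reactive):
    """Nontreponemal (RPR) screen; reactive specimens get a treponemal test."""
    if not rpr_reactive:
        return "screen negative"
    if treponemal_reactive:
        return "consistent with syphilis (past or present)"
    return "probable biological false-positive RPR"


def reverse_algorithm(eia_reactive, rpr_reactive, tppa_reactive):
    """Treponemal (EIA/CIA) screen; reactive specimens reflex to RPR,
    and EIA+/RPR- discordants reflex to a second treponemal test."""
    if not eia_reactive:
        return "screen negative"
    if rpr_reactive:
        return "consistent with syphilis (past or present)"
    if tppa_reactive:
        return "possible past, treated, latent, or early syphilis"
    return "probable false-positive treponemal screen"


# A specimen that is treponemal-reactive but RPR-nonreactive is called
# negative by the traditional algorithm and flagged by the reverse one.
print(traditional_algorithm(rpr_reactive=False, treponemal_reactive=True))
print(reverse_algorithm(eia_reactive=True, rpr_reactive=False, tppa_reactive=True))
```

The two printed results correspond to the pattern summarized above: rare false negatives with the traditional algorithm, and screen-reactive results that need reflex testing with the reverse algorithm.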


Bosshard, P. P. (2013). “Usefulness of IgM-specific enzyme immunoassays for serodiagnosis of syphilis: comparative evaluation of three different assays.” Journal of Infection 67(1): 35-42 Retrospective analysis

Evaluation of 3 IgM treponemal assays: Anti-Treponema-pallidum-ELISA IgM by Euroimmun, Pathozyme Syphilis M Capture by Omega Diagnostics, recomWell Treponema IgM by Mikrogen

Each sample first tested by VDRL, TPPA, FTA-ABS, and RPR and Pathozyme Syphilis M Capture

156 serum samples from patients at Zurich University Hospital from Jan 2008 to Dec 2010

Also tested analytical specificity with 151 serum samples containing interfering substances

Gold standard definition- Clinical and laboratory data (each stage was characterized by symptoms and test result)

148/156 were TPPA+ with 8/156 TPPA inconclusive

VDRL+ in 61% 1°, 97% 2°, 96% latent, and 6/6 3°

TPPA and VDRL titers lower in 1° compared to all other stages

Pathozyme+ in 89.8% 1°, 90.9% 2°, 84.0% latent

Euroimmun+ in 81.4% 1°, 92.4% 2°, 68.0% latent

RecomWell+ in 50.8% 1°, 80.3% 2°, 48.0% latent

Analytical specificity- 91.4% (Euroimmun), 96.0% (Pathozyme), and 100% (recomWell)

Euroimmun had difficulties with Herpes IgM and EBV IgM

Small sample size

Stage of syphilis was determined from patient charts and serological results (symptoms seen for 1° and 2° cases)

VDRL- 1° cases were confirmed with convalescent sera

Validity of gold standard- Excellent. Clinical and laboratory data.

Highly relevant

TPPA was the most sensitive for detecting all stages of syphilis

Low sensitivity (68.0%) for VDRL in detecting 1° cases

IgM ELISA for suspected early infection

Herpes virus IgM may cause interference for some syphilis IgM tests

Busse, C., et al. (2013). “Evaluation of a new recombinant antigen-based Virotech Treponema pallidum screen ELISA for diagnosis of syphilis.” Clinical Laboratory 59(5-6): 523-529. Evaluation of the Sekisui Virotech Treponema pallidum Screen ELISA for screening

Compared to the Phoenix Trep-Sure ELISA (FDA approved)

Each sample first tested by TPPA and FTA-ABS

421 serum samples- 149 pregnant women, 164 suspected syphilis infections, 29 with borreliosis, 5 HIV, 74 commercial syphilis sera panels

Gold standard definition- Laboratory (TPPA and FTA-ABS)

Phoenix Trep-Sure sensitivity 100% and specificity 93.9%

Virotech ELISA sensitivity 100% and specificity 98.3%

Virotech and Trep-Sure had nearly identical analytical specificities

Small sample size

A nontreponemal assay was not used

Validity of gold standard- Did not evaluate discrepant results between TPPA and FTA-ABS

Low relevance

Near identical performance between two treponemal ELISAs

Cantor, A. G., et al. (2016). “Updated Evidence Report and Systematic Review for the US Preventive Services Task Force.” Journal of the American Medical Association 315(21):2328-37. Systematic review

Examined studies from US-relevant populations on the effectiveness of routine screening, and accuracy of screening tests and strategies

Gold standard definition- N/A

Australian group found a higher detection rate of asymptomatic early syphilis when screening HIV-positive MSM every 3 months vs annually (8.1% vs. 3.1%)

London group (n=2389) found a higher rate of detection in HIV individuals with 3 month vs 6 month screening (7.3 vs 2.8 cases per 1000)

Rate of syphilis detection also increased with 3 month routine testing in high-risk men (10 or more partners)

No increase in syphilis detection with 3 month routine testing in lower-risk men

EIA vs VDRL screening in a high prevalence area (9.4%) showed slightly lower sensitivity (98.0% vs 98.6%) but higher specificity (98.6% vs 91.1%)

Patient populations other than HIV and MSM not studied for routine screening

Validity of gold standard- N/A

Moderate relevance

Syphilis screening every 3 months in HIV or MSM increased detection rates

Sensitivity >85% and specificity >91% for treponemal and nontreponemal tests but require supplemental testing

Rates of false-positives higher when RPR used for initial screening

Patient population and stage of disease affect diagnostic accuracy

Castro, A., et al. (2013). “A comparison of the analytical level of agreement of nine treponemal assays for syphilis and possible implications for screening algorithms.” BMJ Open 3 (9) (no pagination) (e003347). Retrospective analysis

Evaluation of 8 treponemal assays- FTA-ABS (Zeus Scientific), LIAISON, SD BIOLINE POCT, INNO-LIA immunoblot, BioELISA (BioKit), CAPTIA IgG, Trep-ID, and Trep-Sure

Compared to TPPA

Each sample first tested at the CDC by RPR and TPPA

290 serum samples randomly selected from serum bank of Georgia Public Health Laboratory- 109 reactive and 181 nonreactive by TPPA

All samples were serially diluted to determine the analytical sensitivity of each test

Gold standard definition- TPPA

95/109 (85%) of the TPPA reactive samples were RPR-

8 samples were TPPA- but reactive by at least 2 other treponemal tests

Concordance between TPPA + and – for each test: FTA-ABS 94.5% and 100%, INNO-LIA 99.1% and 99.4%, LIAISON 100% and 99.4%, Trep-Sure 100% and 98.9%, BioELISA 100% and 98.9%, BIOLINE 100% and 98.9%, CAPTIA IgG 100% and 97.2%, Trep-ID 100% and 100%

Analytical sensitivity in fold dilutions- FTA-ABS (4), CAPTIA IgG (8), INNO-LIA (16), TPPA (16), BIOLINE (64), Trep-ID (64), LIAISON (128), BioELISA (128), Trep-Sure (512)

Small sample size

TPPA was run again to ensure test accuracy

Sensitivity and specificity for each test not calculated

Titered out each sample to determine analytical sensitivity

Validity of gold standard- moderate. Single treponemal assay

Highly relevant

Near identical performance between all 9 treponemal tests

The analytical sensitivity between treponemal tests can vary considerably

Confirmatory treponemal test should be at least as sensitive as the screening test
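
The recommendation that a confirmatory test be at least as sensitive as the screening test can be checked against the fold-dilution endpoints listed above. This is a minimal sketch, assuming that a higher endpoint dilution means greater analytical sensitivity. The dictionary and function names are illustrative, and the sketch ignores specificity, cost, and workflow, which would also matter in choosing a confirmatory assay.

```python
# Endpoint fold dilutions as listed in the study summary above.
ENDPOINT_FOLD_DILUTION = {
    "FTA-ABS": 4,
    "CAPTIA IgG": 8,
    "INNO-LIA": 16,
    "TPPA": 16,
    "BIOLINE": 64,
    "Trep-ID": 64,
    "LIAISON": 128,
    "BioELISA": 128,
    "Trep-Sure": 512,
}


def candidate_confirmatory_tests(screening_test):
    """List assays whose analytical sensitivity is at least that of the screen."""
    floor = ENDPOINT_FOLD_DILUTION[screening_test]
    return sorted(
        name
        for name, fold in ENDPOINT_FOLD_DILUTION.items()
        if name != screening_test and fold >= floor
    )


print(candidate_confirmatory_tests("Trep-ID"))
print(candidate_confirmatory_tests("Trep-Sure"))
```

By this criterion alone, a screen with a very high endpoint dilution leaves few or no confirmatory assays of equal analytical sensitivity, which is one plausible source of screen-positive, confirmation-negative discordants.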

Castro, R. R., et al. (2001). “Evaluation of the passive particle agglutination test in the serodiagnosis and follow-up of syphilis.” American Journal of Clinical Pathology 116(4): 581-585. Cross-sectional and longitudinal study

Evaluation of the MHA-TP, FTA-ABS, and TPPA in diagnosis and follow-up of syphilis

449 patients with suspected early syphilis in the Lisbon, Portugal region

54 patients returned for follow-up of 1, 2, 3, 6, and 12 months post treatment for syphilis

Gold standard definition- Laboratory (MHA, FTA-ABS, and RPR) and clinical

MHA-TP+ 72.2%, TPPA+ 73.7%, and FTA-ABS+ 74.8%

Sensitivity and specificity of TPPA compared to MHA-TP 100% and 94.4%

Sensitivity and specificity of TPPA compared to FTA-ABS 98.5% and 100%

28 primary syphilis cases- 100% and 89.2% sensitivity of TPPA compared to FTA-ABS and MHA-TP, respectively

54 follow-up patients with 2-fold decrease in titers after 6 months: RPR 56%, MHA-TP 26%, TPPA 70%

Small sample size overall and even smaller for follow-up patients

Unknown how syphilis stage was determined

Validity of gold standard- Valid screening method using two different gold standards. Unknown which test used for screening and how syphilis stage was determined

Low relevance

Near identical performance between the 3 treponemal tests

Higher percentage of patients had a 2-fold decrease in titers using the TPPA test compared to the RPR

Caswell, R. J., et al. (2016). “The Significance of Isolated Reactive Treponemal Enzyme Immunoassay in the Diagnosis of Early Syphilis.” Sexually Transmitted Diseases 43(6): 365-368. Analysis of BioELISA+ and TPPA- serum samples

121,724 samples tested by BioELISA and TPPA

316 serum samples were BioELISA+ and TPPA- from 121,724 patients screened from Aug 2010 to Nov 2014 at the Birmingham Whittall Street Clinic

163 of the 316 patients returned for repeat testing

Gold standard definition- Laboratory (BioELISA screen followed by TPPA and RPR)

Among the 1561 with reactive sera, 20% were BioELISA+ and TPPA-

From the 163 repeat testing, 106 (65%) remained BioELISA+ and TPPA-, 50 (30.7%) turned BioELISA- and TPPA-, 7 (4.3%) were BioELISA+ and TPPA+

Of the 7 BioELISA+ and TPPA+, 1 had signs of early syphilis and 5 had been previously treated for syphilis

Small sample size

Low prevalence (2%)

A nontreponemal test was not performed in initial testing

Validity of gold standard- BioELISA not FDA approved

Low relevance

Near identical performance between the 2 treponemal tests

20% additional testing when using the BioELISA as a screen

Situations where only one treponemal test is positive can be considered a false-positive

2 additional early syphilis cases were found upon repeat testing of isolated BioELISA+ result

Centers for Disease Control and Prevention (2008). “Syphilis testing algorithms using treponemal tests for initial screening–four laboratories, New York City, 2005-2006.” MMWR – Morbidity & Mortality Weekly Report 57(32): 872-875. 116,822 samples screened by EIA from 4 New York City laboratories from Oct 2005 to Dec 2006

3,664 (3%) were treponemal+ and nontreponemal-

Lab 1 and 2- retested discrepants with TPPA or FTA-ABS

Lab 3- EIA+ and RPR+ reflexed to another treponemal assay

Lab 4- No further testing after EIA and RPR

Gold standard definition- Laboratory (EIA treponemal screening followed by RPR)

6,587 (6%) had reactive EIA test

2,884 of the 6,587 (44%) were RPR+

2,512 tested with an additional treponemal test- 2,079 (83%) were reactive

One lab that used TPPA for confirmation found 78 of 80 (98%) reactive

Large sample size

Low prevalence (3-6%)

No clinical history of the patients so not able to assess current or past syphilis infection

Different algorithms by each laboratory

Validity of gold standard- Reverse algorithm. 2 labs ran additional TPPA on EIA+ and RPR- specimens

Low relevance

Lack of standardization for syphilis testing between laboratories

RPR confirmed less than half of the screen positive cases but it is unknown if these represent true infections without a clinical history

Centers for Disease Control and Prevention (2011). “Discordant results from reverse sequence syphilis screening–five laboratories, United States, 2006-2010.” MMWR – Morbidity & Mortality Weekly Report 60(5): 133-137. 140,176 samples screened by EIA/CIA from 5 laboratories between 2006 and 2010 using the reverse algorithm

Laboratories- Southern California X2 (Trep-Sure), Northern California (LIAISON), NYC (Trep-Check), and Chicago (Trep-Sure)

Gold standard definition- Laboratory (EIA/CIA automated screening and RPR and either TPPA or FTA-ABS for confirmation)

4,834 (3.4%) EIA/CIA+

2,743 out of 4,834 (56.7%) were RPR-, of which 866 (31.6%) were TPPA or FTA-ABS nonreactive, indicating false-positive EIA/CIA

The percentage of discordant RPR results was higher in the lower prevalence areas (60.6% vs. 50.6%)

The percentage of discordant TPPA or FTA-ABS results was also higher in the low prevalence areas (40.8% vs. 14.1%)

Large sample size

2 high and 3 low prevalence areas 14.5% and 2.3%, respectively

Different treponemal screening and confirmation tests

Validity of gold standard- Reverse algorithm. TPPA and FTA-ABS confirmation results lumped together

Moderate relevance

High discordance between the EIA/CIA screen and RPR (56.7% of EIA/CIA+ samples were RPR-)

High amount of discordance of confirmatory treponemal assay (31.6%)

Treponemal confirmatory test discordance could be attributed to the different analytical sensitivities of the confirmatory tests. Castro A, et al. (2013) and Marangoni A, et al. (2000)

Chen B, et al. (2017). The traditional algorithm approach underestimates the prevalence of serodiagnosis of syphilis in HIV-infected individuals. PLoS Negl Trop Dis 2017 Jul 20;11(7):e0005758. doi: 10.1371/journal.pntd.0005758 Cross-sectional study, 865 HIV-infected hospital patients undergoing HIV viral load testing

Compared traditional (TRUST screen) with reverse (non-FDA-cleared EIA to TRUST) and European (EIA to TPPA) algorithms

Gold standard definition- N/A. Compared algorithms without considering any as “truth”

Of 123 confirmed positive by traditional algorithm, all were positive by reverse.

Reverse algorithm detected additional 92 (treated?)

98% agreement (kappa 0.96) between EIA and TPPA

No clinical histories to determine syphilis stage, or to indicate if additional cases detected by reverse algorithm were past, treated infections

EIA is not FDA cleared.

Non-treponemal assay (TRUST) is not a routine screening assay

Validity of gold standard- N/A

Moderate relevance.

Direct comparison of traditional and reverse algorithms is very relevant (reverse is no less sensitive than traditional in HIV population), but EIA is not FDA cleared and TRUST not common screening assay

Chen, Q., et al. (2016). “Performance Evaluation of a Novel Chemiluminescence Assay Detecting Treponema Pallidum Antibody as a Syphilis Screening Method.” Clinical Laboratory 62(4): 519-526. Evaluation of the Lumipulse G TP-N

Compared to the InTec ELISA

2290 routine screening, 133 syphilis, and 175 interfering substances samples from West China Hospital of Sichuan University

Inconsistent samples tested by RecomLine Treponema IgG, IgM immunoblot

Gold standard definition- 133 samples with clinical and laboratory characterized sera and prospective analysis of 2290 samples where Lumipulse compared to the InTec ELISA

Lumipulse displayed 100% sensitivity and specificity

Lumipulse also displayed lower false-negative and false-positive rates

Large sample size

Low prevalence (2.3%)

Did not use a nontreponemal assay

Validity of gold standard- Unknown how the 133 samples were clinically and laboratory characterized. Western Blot used for discrepant results in the 2290 prospective samples

Low relevance

Similar performances between two treponemal assays- One automated and one manual

Chinkhumba, J. (2006). “Economics of blood screening: in search of an optimal blood screening strategy.” Tropical Doctor 36(1): 32-34 Prioritize the order of blood screening tests performed on blood donors

870 potential donors from Jan to June of 2003 were screened for HIV, HBV, VDRL, Hb, and ABO

St. Gabriels Mission Hospital in a rural area near Lilongwe, Malawi

Gold standard definition- N/A only tested by VDRL

The most cost-effective method for blood screening was VDRL, ABO, HBV, Hb, then HIV

The lowest cost was $29.44 and the highest $33.57

Moderate sample size

Low prevalence (1.2%)

Validity of gold standard- N/A only used VDRL

Low relevance

VDRL is a cheap test and is economical to use as the first screening test for blood donors

Choi, S. J., et al. (2013). “Comparisons of fully automated syphilis tests with conventional VDRL and FTA-ABS tests.” Clinical Biochemistry 46(9): 834-837. Evaluation of the HiSens Auto Rapid Plasma Reagin and AutoTPPA

Compared to VDRL and FTA-ABS

504 samples from the Severance Hospital of Yonsei University in Korea

250 samples also tested by TPPA

Gold standard definition- Laboratory (VDRL and FTA-ABS) and clinical (chart reviews of discordants)

Concordance rate between the VDRL and AutoRPR was 67.5% with the 164 (32.5%) discordant results being VDRL+ and AutoRPR-

133 of 164 (81.1%) were FTA-ABS+

106 of the 133 cases had chart reviews- 82 previously treated, 22 latent, 1 primary, and 1 congenital syphilis

Concordance between the FTA-ABS and AutoTPPA was 96.4%, FTA-ABS and VDRL 91.9%, and FTA-ABS and AutoRPR 71.6%

Sensitivity and specificity of the FTA-ABS were 99.6% and 99.1%, AutoTPPA were 98.5% and 98.3%

Moderate sample size

TPPA not conducted on all samples

Chart reviews and syphilis stage could not be determined for all reactive samples

Validity of gold standard- Traditional algorithm. Chart reviews conducted on 106 of 133 discordant results

High relevance

High discordance between the RPR and VDRL

VDRL more sensitive but autoRPR more specific

Authors suggest that the RPR becomes negative quicker after treatment compared to the VDRL

Chuck, A., et al. (2008) Cost effectiveness of enzyme immunoassay and immunoblot testing for the diagnosis of syphilis (Structured abstract). International Journal of STD and AIDS 19, 393-399 Cost effectiveness of EIA screening followed by Inno-Lia for confirmation vs RPR screen and TPPA for confirmation

Study conducted in 2006 in Alberta, Canada using data from 89,647 patients tested

EIAs used- Enzygnost, Architect, and Trep-Sure

Costs include- test reagents, labor, resource cost associated with false-negatives, treatment, treatment follow-up, contracting, follow-up of indeterminate cases

Gold standard definition- not defined

Sensitivity and specificity- RPR 70.6% and 99.5%, EIA 93.0% and 98.9%, TPPA 92.3% and 98.0%, FTA-ABS 87.8% and 94.0%, Inno-Lia 94.6% and 99.5%

$4,225,902 Canadian dollars for RPR then TPPA/FTA-ABS

$4,149,353 Canadian dollars for EIA then Inno-Lia

EIA then Inno-Lia identified 166 additional correct diagnoses (+ or -)

EIA then Inno-Lia will save $461 Can to produce one additional correct diagnosis

Large sample size

Low prevalence (<2%)

All samples tested by both algorithms

Unknown what standard was used to determine sensitivity and specificity of each assay (Maybe based on clinical diagnosis?)

Validity of gold standard- N/A

High relevance

EIA then Inno-Lia algorithm is cost-effective- it will generate cost savings to the health-care system and more correct diagnoses
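
The savings per additional correct diagnosis follow directly from the two program costs and the 166 additional correct diagnoses reported above. This is a small Python sketch of that incremental arithmetic. The variable and function names are illustrative, and the calculation simply reuses the study's reported totals rather than re-deriving them.

```python
# Reported program costs in Canadian dollars.
COST_RPR_THEN_TPPA = 4_225_902   # RPR screen, then TPPA/FTA-ABS
COST_EIA_THEN_INNOLIA = 4_149_353  # EIA screen, then Inno-Lia
ADDITIONAL_CORRECT_DIAGNOSES = 166


def incremental_cost_per_outcome(cost_new, cost_old, extra_outcomes):
    """Incremental cost per additional outcome; negative means a saving."""
    return (cost_new - cost_old) / extra_outcomes


ratio = incremental_cost_per_outcome(
    COST_EIA_THEN_INNOLIA, COST_RPR_THEN_TPPA, ADDITIONAL_CORRECT_DIAGNOSES
)
print(f"Total difference: {COST_EIA_THEN_INNOLIA - COST_RPR_THEN_TPPA} Can$")
print(f"Per additional correct diagnosis: {ratio:.0f} Can$")
```

The negative ratio (about 461 Canadian dollars saved per additional correct diagnosis) matches the figure reported in the study summary: the EIA-first strategy costs less overall and yields more correct diagnoses.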

Cole, M. J., et al. (2007). “Comparative evaluation of 15 serological assays for the detection of syphilis infection.” European Journal of Clinical Microbiology & Infectious Diseases 26(10): 705-713. 114 serum samples with known disease stage and treatment status were tested to assess sensitivity

249 blood donations from London, UK were tested to assess specificity

15 treponemal tests evaluated

EIAs- Trepanostika TP Recombinant, Syphilis EIA II (Newmarket), ICE Syphilis (Abbott), Enzygnost Syphilis, Enzywell Syphilis, Biokit Bioelisa Syphilis 3.0, Bio-Rad Syphilis Total, Pathozyme Syphilis, Mercia Syphilis, Captia Syphilis

TPPA, Biokit TPHA, Randox TPHA, Axis-Shield TPHA, Newmarket TPHA

Gold standard definition- Clinical and laboratory characterized sera from 3 different sources

All kits gave a specificity of 100% except for ICE (2 reactive) and Biokit TPHA (1 reactive)

Sensitivities ranged from 93.9% to 99.1% with the Axis-Shield TPHA and Newmarket TPHA being at 93.9% and Trepanostika TP Recombinant and Syphilis EIA II (Newmarket) being 99.1%

Variation in sensitivities seen in untreated primary and treated late groups- All kits had at least one false-negative in the primary untreated group and all false-negatives in the late stage were treated patients (one exception)

Only the Trepanostika TP Recombinant, Syphilis EIA II, and Serodia TPPA detected all syphilis cases except the darkfield positive case

All primary and secondary cases were positive by Mercia IgM (data not shown)

Moderate sample size

15 treponemal tests performed on all samples

Discordant results were retested in duplicate

All stages of syphilis were tested

No pregnant women included in study

No automated treponemal assays were assessed

Validity of gold standard- Excellent

High relevance

One specimen positive by darkfield microscopy only was negative by all 15 tests

Trepanostika TP Recombinant, Syphilis EIA II, and Serodia TPPA are the most suitable for detecting all stages of syphilis

All assays had similar performance with ≥84% sensitivity

Use an IgM assay if screen is negative and primary syphilis is suspected

Some lot to lot variation- suggest using specimens from primary syphilis cases as controls?

Creegan, L., et al. (2007). “An evaluation of the relative sensitivities of the venereal disease research laboratory test and the Treponema pallidum particle agglutination test among patients diagnosed with primary syphilis.” Sexually Transmitted Diseases 34(12): 1016-1018. Cross-sectional

Compared the sensitivities of VDRL and TPPA in primary syphilis samples

106 darkfield-confirmed cases from Jan 2002 to Dec 2004 in the San Francisco area

Gold standard definition- Dark field microscopy confirmed primary cases with genital or anal lesion

77/106 (72.6%) VDRL+ with 75 confirmed by TPPA

91/106 (85.8%) TPPA+

13/106 (12.3%) were VDRL- and TPPA-

VDRL confirmed by TPPA had a sensitivity of 70.8%

TPPA as a first-line test had a sensitivity of 85.9%

51 samples also tested by RPR and only 37 (73%) were RPR+ with 36 confirmed by TPPA

Moderate sample size

Used confirmed primary cases of syphilis

Low number of HIV patients- could skew test sensitivities in HIV population

Validity of gold standard- Excellent

High relevance

TPPA performed better than VDRL in primary syphilis detection (91 vs 75)

RPR and VDRL performed about the same but small RPR sample size

12% of primary cases missed by both tests

VDRL performed poorly in HIV patients
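
The sensitivities in this entry are simple proportions of the darkfield-confirmed cases. As a hedged illustration, the sketch below recomputes them from the counts listed above. The function name and dictionary labels are illustrative, and no confidence intervals are calculated.

```python
def sensitivity_percent(detected, confirmed_cases):
    """Percentage of confirmed cases that the test (or strategy) detected."""
    return 100 * detected / confirmed_cases


# Counts taken from the study summary above (106 darkfield-confirmed cases,
# of which 51 were also tested by RPR).
counts = {
    "VDRL reactive": (77, 106),
    "VDRL confirmed by TPPA": (75, 106),
    "TPPA reactive": (91, 106),
    "Negative by both VDRL and TPPA": (13, 106),
    "RPR reactive (subset)": (37, 51),
}

for label, (detected, total) in counts.items():
    print(f"{label}: {detected}/{total} = {sensitivity_percent(detected, total):.1f}%")
```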

Dai S, Chi P, Lin Y, et al. Improved reverse screening algorithm for Treponema pallidum antibody using signal-to-cutoff ratios from chemiluminescence microparticle immunoassay. Sex Transm Dis 2014; 41:29–34. Architect CLIA screen, with TRUST and TPPA on reactive specimens.

Cancer patients. 8980 specimens over 6 months. 3.6% screen reactive rate

Gold standard definition- laboratory (TPPA, TRUST)

319 Architect reactive. Of these, 272 (85%) were confirmed by TPPA.

At Architect S/CO ≥ 9.9, specificity and PPV were 100% (confirmed by TPPA)

Large sample size.

Very defined patient population (cancer)- results may not be generalizable

Validity of gold standard- good


CLIA S/CO ratio could be utilized to increase specificity


Dang, Q., et al. (2006). “Evaluation of specific antibodies for early diagnosis and management of syphilis.” International Journal of Dermatology 45(10): 1169-1171. Evaluation of the RPR, TPPA and western blot

67 patients diagnosed with primary or secondary syphilis between 1999-2003

Gold standard definition- Clinical and laboratory (WB?) data

Western blot was positive for all 67 samples

In 20 primary syphilis samples, RPR identified 12 (60%) and TPPA identified 18 (90%)

In 47 secondary syphilis samples, RPR and TPPA identified 100%

In 21 patients at 24 months post treatment, 3 (14%) RPR+, 15 (71%) TPPA+, and 15 (71%) WB+

Small sample size

Unknown how the syphilis stage was determined

Single WB band (n=5) could be false positive

Validity of gold standard- How syphilis stage was characterized was not disclosed.

Low relevance

RPR detected only 60% of primary cases of syphilis compared to 90% by TPPA

Gomez, E., et al. (2010). “Evaluation of the Bio-Rad BioPlex 2200 syphilis multiplex flow immunoassay for the detection of IgM- and IgG-class antitreponemal antibodies.” Clinical & Vaccine Immunology: CVI 17(6): 966-968.


1008 prospective sera submitted to reference lab

Tested by TrepCheck, Bioplex. Discordants tested by TPPA

Gold standard definition- laboratory (EIA; with TPPA on discordants)

Bioplex IgG detected all 77 true positive also detected by TrepChek. Bioplex also detected additional 8 samples (TPPA confirmed) missed by Trepchek.

Statistically similar performance of BIoplex and TrepChek.

No clinical histories

Validity of gold standard- acceptable, to compare trep assays

Moderate relevance. Comparison of 2 trep assays.

Medium

Goswami, N. D., et al. (2013). “The footprint of old syphilis: using a reverse screening algorithm for syphilis testing in a U.S. Geographic Information Systems-Based Community Outreach Program.” Sexually Transmitted Diseases 40(11): 839-841 Impact of reverse algorithm in high-risk community screening program.

Prospective enrollment over 2 year period in one county health system

Screening by TrepSure. 239 subjects with EIA results. TRUST on EIA+

Gold standard definition- N/A


45/239 (19%) were EIA positive

18/45 were TRUST positive (15 prior syphilis and treatment; 3 new diagnoses of untreated syphilis)

27/45 TRUST negative (18 prior syphilis and treatment; 9 no diagnosis or treatment- presumed false positive)

Small sample size

Non-random enrollment

Validity of gold standard- N/A

Moderate relevance

Reverse algorithm identifies a large proportion of previously treated infections in this high-risk population. (Lab results need interpretive comment)

Gratrix, J., et al. (2012). “Impact of reverse sequence syphilis screening on new diagnoses of late latent syphilis in Edmonton, Canada.” Sexually Transmitted Diseases 39(7): 528-530 Retrospective review of 5 yrs data

Compared data from period when reverse algorithm was used to period when traditional algorithm was used

Gold standard definition- combined laboratory and clinical

With traditional algorithm, 0.07% rate of late latent syphilis

With reverse algorithm, 0.14% rate of LLS. (P<0.001)

More resources for follow up of additional cases, but did not quantify $

Validity of gold standard- good

Very relevant

Significant increase in number of newly diagnosed late latent syphilis cases detected after switching to rev algorithm (consistent with reports of high diagnostic sensitivity)

Gratzer, B., et al. (2014). “Evaluation of diagnostic serological results in cases of suspected primary syphilis infection.” Sexually Transmitted Diseases 41(5): 285-289 Pt population: STD clinic

Testing protocol: Trepsure, RPR, FTA-ABS over 3 years.

52 pts with primary syphilis (symptoms, pos lab test). Performed retrospective medical record review of these cases

Gold standard definition- combined lab (RPR) and clinical (dx of primary syphilis)


Trepsure pos/equiv in 28/52 (53.8%)

9 of the negatives were cases positive only by FTA-ABS.


RPR pos in 40/52 (76.9%)

Considered pts positive by FTA-ABS alone to be true positive, based on clinical suspicion. Could be false positive. No follow up

Validity of gold standard- good


Very relevant

RPR sensitivity of 77% in primary syphilis is consistent with literature

Trepsure EIA sensitivity of 54% (in primary syphilis) is an anomaly. Other studies looked at all stages of disease. In fact, the authors acknowledged that in total, Trepsure detected more cases than RPR (includes latent)
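With only 52 primary cases, the two sensitivity estimates above carry wide uncertainty. The sketch below computes a 95% Wilson score interval from the reported counts (28/52 and 40/52). The intervals are not reported in the paper; they are derived here only to illustrate the effect of sample size, and the function name is illustrative.

```python
from math import sqrt


def wilson_interval(successes: int, n: int, z: float = 1.96):
    """Return the Wilson score interval (lower, upper) for a proportion."""
    p = successes / n
    denom = 1 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return center - half_width, center + half_width


# Counts taken from the Gratzer et al. entry above.
for name, positives in (("Trepsure", 28), ("RPR", 40)):
    low, high = wilson_interval(positives, 52)
    print(f"{name}: {positives}/52, 95% CI {low:.1%} to {high:.1%}")
```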

Gu, W. M., et al. (2013). “Comparing the performance of traditional non-treponemal tests on syphilis and non-syphilis serum samples.” International Journal of STD & AIDS 24(12): 919-925 Compare RPR and TRUST (Vs trep assay and history)

209 active syphilis (stratified in stages)

247 non syphilis

Gold standard definition- laboratory (multiple treponemal assays)

Overall sens: RPR 96-99%; TRUST 97-99%

Primary: 90-93% sens (both)

Secondary: RPR, 99%; TRUST, 100%

Latent: RPR, 97%; TRUST, 95-100%

Large sample size

Well characterized samples

Validity of gold standard- good, since treponemal assays considered to be more sensitive than non-trep

Moderate relevance. No head to head of traditional vs. reverse algorithms.

RPR and TRUST sensitivities are consistent with already reported literature.

Hunter, M. G., et al. (2013). “Significance of isolated reactive treponemal chemiluminescence immunoassay results.” Journal of Infectious Diseases 207(9): 1416-1423 Retrospective analysis of patients with reactive Architect syphilis results

Over 28,000 specimens and 1171 reactive

Gold standard definition- combined clinical and laboratory (to assess isolated CIA pos)

1171/28261 (4.1%) CIA reactive

11/20 (55%) patients with isolated positive CIA found to have evidence of syphilis infection

Isolated = CIA +, RPR-, TPPA – (we call these “unlikely”)

Large sample size

But small event (isolated positive CIA)

Validity of gold standard- good


Specimens positive by CIA alone can be true infection (about half of patients)
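The serologic categories discussed in this entry can be written out as a small decision function. This is a sketch of the general reverse-sequence logic (treponemal screen, then RPR, then TPPA on discordants), not the authors' own procedure; the category labels are illustrative wording.

```python
from typing import Optional


def interpret_reverse(trep_screen: bool,
                      rpr: Optional[bool] = None,
                      tppa: Optional[bool] = None) -> str:
    """Classify a reverse-algorithm result; None means the test was not done."""
    if not trep_screen:
        return "screen nonreactive"
    if rpr is None:
        return "incomplete: RPR needed"
    if rpr:
        return "screen+/RPR+: current or past syphilis"
    if tppa is None:
        return "incomplete: TPPA needed"
    if tppa:
        return "screen+/RPR-/TPPA+: possible past or early syphilis"
    return "screen+/RPR-/TPPA-: isolated reactive screen"


print(interpret_reverse(True, rpr=False, tppa=False))
```

Given the finding above (11/20 isolated-positive patients had evidence of infection), the last category should not be read as equivalent to a negative result.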

Jonckheere, S., et al. (2015). “Evaluation of different confirmatory algorithms using seven treponemal tests on Architect Syphilis TP-positive/RPR-negative sera.” European Journal of Clinical Microbiology & Infectious Diseases 34(10): 2041-2048 Compared modern treponemal tests to TPPA

Tested 178 Architect +/RPR- sera. Most from asymptomatic persons at risk for syphilis

Gold standard definition- laboratory (TPPA)

Bioplex Syphilis IgG was the only FDA-cleared test evaluated. Compared to TPPA, Bioplex was sensitive (96%) but had poor specificity (55.4%). Likely bias, because samples were prescreened as Abbott Architect +.

Validity of gold standard- fair (trep assay vs. trep assay), but bias in selection of sample set by Architect+. Adding clinical follow up on discordant specimens would improve gold standard

Relevant comparison of treponemal assays, but bias in selection of sample set.

Consistent with our data that Bioplex IgG test has high rate of false pos

Jost, H., et al. (2013). “A comparison of the analytical level of agreement of nine treponemal assays for syphilis and possible implications for screening algorithms.” Retrospective analysis of 290 samples. All compared to TPPA

Agreement and LoD of treponemal assays

Gold standard definition-

Analytical sensitivity of assays varied; however, the diagnostic sensitivity was very similar (Liaison, TrepSure, Captia IgG all at 100% sensitivity)

No patient histories or stage of syphilis

Validity of gold standard-

Moderate relevance.

Similar diagnostic sensitivity of several treponemal assays, even though analytical sensitivity varied. (Could impact sensitivity in primary syphilis)

Knaute, D. F., et al. (2012). “Serological response to treatment of syphilis according to disease stage and HIV status.” Clinical Infectious Diseases 55(12): 1615-1622 Serological response to treatment

Retrospective cohort study, 264 syphilis patients

Gold standard definition- clinical

38/90 (42%) patients with primary syphilis were neg by VDRL

4/32 (12%) with latent syphilis were neg by VDRL

No patients with secondary or tertiary syphilis were negative

IgM test (Pathozyme, not FDA cleared) had 4% neg at primary and 21% neg at latent (because only IgM)

Retrospective, selection bias

Large study, but small number of events

Validity of gold standard-good


High rate of false negative VDRL in cases with primary (42%) compared to 26% reported by Singh (RPR); and latent (12%) compared to 39% reported by Singh

Pathozyme IgM ELISA had only 4% neg at primary, compared to 42% for VDRL.

Knight CS, Crum MA, Hardy RW. Evaluation of the LIAISON chemiluminescence immunoassay for diagnosis of syphilis. Clin Vaccine Immunol 2007; 14:710–713. 2645 samples, from several populations:

51 primary or secondary syphilis

999 routine samples (unknown)

200 HIV+

200 pregnant

992 negative controls

Evaluation of LIAISON CLIA vs EIA and reverse algorithm using EIA

Gold standard definition- laboratory (another immunoassay, as well as full algorithm)

Overall CLIA sensitivity 95.8% versus reverse algorithm with EIA; CLIA specificity of 99.1%.

Of patients with known syphilis (primary or secondary), CLIA was 100% accurate (48 pos and 3 neg).

Different patient populations assessed

Insufficient data provided to break down or explain the reported 95.8% sensitivity of CLIA, while others (Marangoni) reported >99% sensitivity

Validity of gold standard- good

Moderate relevance.

Evaluated one assay, and did not compare against traditional algorithm.


Lee, K., et al. (2013). “Characterization of sera with discordant results from reverse sequence screening for syphilis.” BioMed Research International 2013: 269347 Evaluate Architect

15,713 sera screened by architect. 153 positive and tested by TPPA and RPR

Gold standard definition- laboratory (trep and nontrep)

Showed that 27/153 were RPR reactive (active) and the rest were past treated. Of 126 RPR nonreactive, 103 were positive by TPPA, so 23 were false positive by Architect (similar to Bioplex)

Strictly an assessment of Architect positives, in their particular patient population

Validity of gold standard- good

Minimal relevance. Not a head-to-head comparison of trep and non-trep screens

Demonstrated that reflex testing after a reactive trep screen is necessary
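The counts in this entry can be checked with a few lines of arithmetic. The sketch below uses only the numbers reported above (153 screen-reactive, 27 RPR reactive, 103 TPPA positive among the RPR nonreactive); the category names are illustrative and the percentages are derived here, not quoted from the paper.

```python
screen_reactive = 153   # Architect-reactive sera
rpr_reactive = 27       # RPR reactive among those
tppa_positive = 103     # TPPA positive among the RPR nonreactive

rpr_nonreactive = screen_reactive - rpr_reactive
unconfirmed = rpr_nonreactive - tppa_positive


def pct(part: int, whole: int) -> float:
    """Percentage of whole, rounded to one decimal place."""
    return round(100 * part / whole, 1)


breakdown = {
    "RPR reactive": pct(rpr_reactive, screen_reactive),
    "RPR nonreactive, TPPA positive": pct(tppa_positive, screen_reactive),
    "RPR nonreactive, TPPA negative": pct(unconfirmed, screen_reactive),
}
for label, share in breakdown.items():
    print(f"{label}: {share}% of screen-reactive sera")
```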

Liu LL et al. (2014) “Incidence and risk factors for the prozone phenomenon in serologic testing for syphilis in a large cohort.” Clinical Infectious Diseases 59(3): 384-389 Retrospective analysis, 46,856 samples over 3 yr

Gold standard definition- N/A

36/4334 (0.83%) of all RPR reactive had prozone phenomenon

Compare to Post: 0.9% of reactive patients with prozone (essentially the same).

Very large sample size. Well described patient characteristics.

Validity of gold standard- N/A


Prozone in about 1% of RPR reactive cases; impacts utility of RPR as screen. Authors recommend trep assay to screen


Li Z et al. Screening for antibodies against Treponema pallidum with chemiluminescent microparticle immunoassay: analysis of discordant serology results and clinical correlation. Ann Clin Biochem. 2016 Sep;53(Pt 5):588-92



Retrospective sample set. 267 samples Architect positive.

Gold standard definition- laboratory (TPPA, RPR, immunoblot)

185/267 (69%) confirmed by TPPA. Majority of those that did not confirm by TPPA did not confirm with immunoblot assay

54 (20.2%) of 267 Architect positive samples were RPR reactive

Architect S/CO correlated with diagnostic reliability

Patient population not described

Validity of gold standard- good


False positives by Architect. Further testing necessary

More data on strength of signal (SOS) to improve specificity of trep assay

Architect strength of signal correlates with confirmation by TPPA.

Lipinsky, D., et al. (2012). “Validation of reverse sequence screening for syphilis.” Journal of Clinical Microbiology 50(4): 1501.


Retrospective review of 12235 patients tested by Architect and RPR. TPPA performed if either reactive

Gold standard definition- (laboratory) traditional algorithm (RPR and TPPA)

157 (1.3%) reac by both, and 155 reac by TPPA

Of 334 pos only by Architect (false pos or past infection), only 197 reactive by TPPA

Reverse algorithm (Architect) did not miss any

No information on patient risk or clinical information on stage of syphilis cases

Unknown if Architect more sensitive than RPR (no follow up on discordants)

Reported throughput, and labor costs but no data to support

Validity of gold standard- good (traditional algorithm)

Relevant. Reverse algorithm (Architect) as sensitive as RPR. Claimed Architect throughput 10X greater than RPR, with reduced labor costs

Medium

Liu C, et al. (2014) “The diagnostic value and performance evaluation of five serological tests for the detection of Treponema pallidum.” Journal of Clinical Laboratory Analysis 28(3): 204-209 160 syphilis patients: 120 active, 40 latent

210 nonsyphilis controls

Performed TRUST, RPR, TPPA, Architect, and others (not FDA cleared)

Gold standard definition-clinical

RPR and TRUST sensitivity 73%, specificity 96%

Architect sensitivity 100%, specificity 91%

Well described patient populations

Did not describe test performance separately for active and latent stages

Did not provide breakdown of false negative non-trep results (were they among active or latent cases?)

Validity of gold standard- good

Relevant. Head-to-head of RPR and Architect. Architect more sensitive than RPR

Medium

Loeffelholz, M. J., et al. (2011). “Analysis of bioplex syphilis IgG quantitative results in different patient populations.” Clinical & Vaccine Immunology: CVI 18(11): 2005-2006 Retrospective analysis.

1849 incarcerated

3512 OB/GYN clinic

873 delivery

Gold standard definition- N/A

If RPR titer 1:2 or greater, more likely to confirm by TPPA

Bioplex AI of 8 or greater provided highest specificity (compared to 6, for Yen-Lieberman)

Evaluated different patient populations- high risk incarcerated and low risk OB/GYN

Validity of gold standard- N/A

Moderate relevance.

Treponemal screen strength of signal can predict if TPPA will confirm (increase specificity of screen)

Malm, K., et al. (2015). “Analytical evaluation of nine serological assays for diagnosis of syphilis.” Journal of the European Academy of Dermatology and Venereology 29(12): 2369-2376 301 sera from different clinical categories (HIV, suspected syphilis, blood donors)

Comparison of non-trep (inc. RPR, VDRL) and trep (inc. FDA cleared TrepSure, Architect, Liaison, Captia)

Gold standard definition- laboratory (TPPA)

TrepSure, Architect, Liaison all highly sensitive (>99%) compared to TPPA. Liaison more specific than Architect.


No sig difference in performance of any test between HIV pos and HIV neg patients


Did not compare trep vs. non-trep as a screen.

No clinical histories; no syphilis stage

Large sample size; different patient populations

Validity of gold standard- fair. Not comprehensive lab testing.

Moderate relevance.

Did not directly compare trep vs. non-trep as a screen.


Malm, K., et al. (2015). “Performance of Liaison XL automated immunoassay platform for blood-borne infection screening on hepatitis B, hepatitis C, HIV 1/2, HTLV 1/2 and Treponema pallidum serological markers.” Transfusion Medicine 25(2): 101-105 Compared performance of Liaison to that of Architect.

1,100 donor samples

Gold standard definition- laboratory (Architect)

100% agreement in 1100 blood donors

Of separate set of 17 prev false-reactive samples by Architect, all were non-reactive by Liaison.

Validity of gold standard- fair. Only evaluating agreement as screening test. Moderate relevance.

Two automated CIAs perform equivalently (compare to Malm’s paper above, in which Liaison was more specific than Architect)

Marangoni, A., et al. (2009). “Laboratory diagnosis of syphilis with automated immunoassays.” Journal of Clinical Laboratory Analysis 23(1): 1-6 Retrospective study

Evaluate Enzygnost (not FDA cleared) and Architect, compared to TPHA and WB

244 syphilitic sera

74 sera for analytical specificity

Gold standard definition- clinical “syphilitic” sera, with stage determined

Architect sensitivity 99.2%

Architect specificity 98.4%

Small sample size

Unknown at what syphilis stage(s) Architect was assessed

Validity of gold standard- good, but paper did not describe what stage(s) of syphilis.


CLIA evaluated against clinically defined sera (pos group and neg control group)

Marangoni, A., et al. (2005). “Evaluation of LIAISON Treponema Screen, a novel recombinant antigen-based chemiluminescence immunoassay for laboratory diagnosis of syphilis.” Clinical & Diagnostic Laboratory Immunology 12(10): 1231-1234 LIAISON compared to RPR, TPHA, WB

Retrospective study 2494 sera, 131 syphilitic sera, 96 analytical specificity

Prospective: 1800 unselected sera

Gold standard definition- combined laboratory and clinical (defined non-syphilis infections, and staged syphilis)

LIAISON sensitivity: 99.2% overall

100% for untreated primary, secondary, cardiovascular, neuro, congenital. 98.7% for treated latent

LIAISON Specificity: 99.9%

RPR sensitivity:

100% for untreated cardiovascular, neuro, congenital. 85.7% for untreated primary, 96.8% for untreated secondary

Large sample size, well characterized samples.

Validity of gold standard- good


LIAISON more sensitive than RPR for untreated primary, secondary syphilis

Overall, for untreated, LIAISON was 100% sens vs. 96.3% for RPR

Marangoni, A., et al., Evaluation of the BioPlex 2200 syphilis system as a first-line method of reverse-sequence screening for syphilis diagnosis. Clin Vaccine Immunol, 2013. 20(7): p. 1084-8 Comparison of Bioplex Syph IgG with Architect

Different, well defined patient groups

Gold standard definition- laboratory (Architect) and clinical (some specimens had history with stage and treatment)

In a group of 100 patients positive only by Architect (neg by other trep and nontrep tests), only 48 were positive by Bioplex

Sensitivity of Bioplex and Architect was same

Several well defined patient groups, with good clinical histories

Validity of gold standard- moderate to good (clinical history is strong, but Architect may not be universally accepted as gold standard)

Moderate relevance. Head to head of 2 trep assays

Bioplex had fewer false positives than Architect. Equivalent sensitivity

Specificity of automated trep screening assays varies

Mishra, S., et al. (2011). “The laboratory impact of changing syphilis screening from the rapid-plasma reagin to a treponemal enzyme immunoassay: a case-study from the Greater Toronto Area.” Sexually Transmitted Diseases 38(3): 190-196 Retrospective

Compared pre (non-trep screen) and post (EIA screen) periods

Gold standard definition- N/A. No reference; simply compared 2 data sets

After switching to EIA screen, confirmed positive rates increased by 10.3 per 100,000 (significant)

0.59% of EIA+/RPR- patients converted to RPR+ within 2 months (earlier detection )

Non-trep and trep screens were compared over different time periods (not a direct comparison)- what is this type of study called?

Used a non-FDA cleared EIA (Enzygnost)

Large data set (over 3 mil results)

Validity of gold standard- N/A

Very relevant.

Trep screening improves detection of early cases of syphilis (consistent with other studies showing higher sens for untreated primary syphilis)


Owusu-Edusei, K., Jr., et al. (2011). “Serologic testing for syphilis in the United States: a cost-effectiveness analysis of two screening algorithms.” Sexually Transmitted Diseases 38(1): 1-7 Cohort decision analysis model to estimate cost and health outcomes of trad and reverse algorithms

Cohort of 200,000. 1000 active infections (0.5% incidence). 10000 past infections (5%)

Gold standard definition- N/A. No reference established.

Reverse algorithm treated 986 cases vs 868 for traditional (118 more cases), but higher number of followups (11450 vs 3756), and overtreatment (964 vs. 38).

Treating 118 more cases might prevent 1 case of tert. Syphilis.

Cost effect ratio (per case treated): $1671 (rev) and $1621 (trad).

Weaknesses- reliability of input data; may have underestimated overtreatment costs; can’t account for events over time

Strengths- various sensitivity analyses; provided information missing from other studies

Validity of gold standard- N/A

Very relevant.

Estimated that reverse algorithm costs slightly more and results in more unnecessary treatment. Rev algorithm might prevent more cases of syphilis.
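The trade-off reported in this entry can be laid out as simple differences between the two arms. The sketch below uses only the figures listed above. The incremental cost per additional case treated is NOT reported in the entry; it is derived here under the assumption that each cost-effectiveness ratio equals total cost divided by cases treated, so it should be read as illustrative only.

```python
reverse = {"treated": 986, "followups": 11450, "overtreated": 964,
           "cost_per_case": 1671}
traditional = {"treated": 868, "followups": 3756, "overtreated": 38,
               "cost_per_case": 1621}

# Differences between arms (reverse minus traditional).
extra_treated = reverse["treated"] - traditional["treated"]
extra_followups = reverse["followups"] - traditional["followups"]
extra_overtreated = reverse["overtreated"] - traditional["overtreated"]

# Assumption: ratio = total cost / cases treated, so total = ratio * treated.
total_reverse = reverse["cost_per_case"] * reverse["treated"]
total_traditional = traditional["cost_per_case"] * traditional["treated"]
incremental_cost_per_extra_case = (
    (total_reverse - total_traditional) / extra_treated
)

print(f"Extra cases treated: {extra_treated}")
print(f"Extra follow-ups: {extra_followups}")
print(f"Extra overtreated: {extra_overtreated}")
print(f"Incremental cost per extra case (derived): "
      f"${incremental_cost_per_extra_case:,.0f}")
```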

Owusu-Edusei, K., Jr., et al. (2011). “The tale of two serologic tests to screen for syphilis–treponemal and nontreponemal: does the order matter?” Sexually Transmitted Diseases 38(5): 448-456 Cohort decision analysis model

Cohort of 10,000

Comparison of costs of different algorithms


Gold standard definition- N/A. No reference established.

Traditional cost was $1400 per outcome prevented, vs. $1500 for reverse algorithm (low prevalence)

Traditional more cost saving ($102,000) vs. reverse ($84,000) (high prevalence)

Both algorithms identified and treated same number

Weaknesses- reliability of input data;

Validity of gold standard- N/A

Very relevant.

Number of cases detected and treated was the same, but reverse algorithm was slightly more expensive (per adverse outcome prevented) in low prev setting (due to more confirmatory tests required)

Park, B. G., et al. (2016). “Comparison of Six Automated Treponema-Specific Antibody Assays.” Journal of Clinical Microbiology 54(1): 163-167 615 samples submitted for syphilis testing. Including 329 suspected current or past syphilis.

Single hospital setting

Gold standard definition- laboratory (FTA-ABS)

157 samples (155 pts) positive

Architect compared to FTA; 96.8% sens, 100% specific; 0.978 kappa

ADVIA Centaur compared to FTA: 99.4% sens, 100% specific; 0.996 kappa

Relatively small sample size

Single center

Validity of gold standard- moderate. FTA-ABS alone is imperfect

Moderate relevance.

Consistent with Park Y. study, good correlation between CLIAs and FTA-ABS.

Park, I. U., et al. (2011). “Screening for syphilis with the treponemal immunoassay: analysis of discordant serology results and implications for clinical management.” Journal of Infectious Diseases 204(9): 1297-1304 Cross sectional. 3 months. 21,623 results.

HMO population. 2% trep assay positive rate.

Determined characteristics of CLIA (LIAISON) confirmed vs. not confirmed

Gold standard definition- N/A

439 specimens positive by LIAISON.

255 unique patients CLIA pos/RPR neg.

184 (72%) TPPA pos; 71 (28%) neg

Confirmed (TPPA+) more likely to have history of syphilis (57% vs 9%)

Confirmed (TPPA+) had higher median CLIA value

Very large sample size

History, clinical characteristics available

Weakness: potential misclassification, clinically

More followup on higher risk could bias data

Validity of gold standard- N/A


Positive CLIA (LIAISON) not confirmed by TPPA likely to be false positive (in low prevalence setting)

CLIA value correlated with probability of confirmation. All pts with ODI ≥ 12 were TPPA positive

Park, Y., et al. (2011). “Evaluation of a fully automated treponemal test and comparison with conventional VDRL and FTA-ABS tests.” American Journal of Clinical Pathology 136(5): 705-710 616 sera submitted for FTA-ABS

One hospital.

Compared Architect to FTA-ABS (and VDRL also done on subset of 400)

Gold standard definition- laboratory (FTA-ABS, and also VDRL on subset)

Architect and FTA-ABS correlated well (kappa 0.981)

Of 200 specimens with reactive VDRL, only 152 were positive by Architect. Only 155 were positive by FTA-ABS, suggesting that a subset are false reactive by VDRL. Unknown if the same specimens were neg by both trep assays or different specimens

Single center study

No clinical diagnoses, no syphilis staging

VDRL specimens were likely “cherry picked” because 200 reactive and 200 non-reactive

No follow up (additional tests or chart review) of discordants

Validity of gold standard- good. But no follow up on discordants

Moderate relevance.

Architect results correlated well with FTA-ABS

No follow up on discordant specimens, so impossible to make many conclusions on Architect vs. VDRL as screening assay.

Pope, V., et al. (2000). “Comparison of the Serodia Treponema pallidum particle agglutination, Captia Syphilis-G, and SpiroTek Reagin II tests with standard test techniques for diagnosis of syphilis.” Journal of Clinical Microbiology 38(7): 2543-2545 640 stored specimens. 390 of which were uncharacterized.

Tests evaluated included Captia IgG, compared to MHA-TP

TPPA evaluated against panel of clinically characterized sera

Gold standard definition- for Captia IgG: laboratory (MHA-TP)

96% agreement among treponemal assays

97.7% agreement between Captia and MHA-TP

TPPA sensitivity about 88% for primary; 100% for secondary; 100% untreated latent; 95% treated latent. TPPA specificity about 95%.

No statistical analysis of the data

All tests not compared head-to-head. Essentially several independent studies combined into one paper

Validity of gold standard- good,

Relevant. Newer trep assay (Captia) had high agreement with traditional trep assays

Trep and non-trep not compared head-to-head

Post, J. J., et al. (2012). “Case report and evaluation of the frequency of the prozone phenomenon in syphilis serology – an infrequent but important laboratory phenomenon.” Sexual Health 9(5): 488-490.


Prospective evaluation for prozone in 3222 consecutive sera, most from HIV and STD clinics

Tested all samples at 1:1 and 1:8. Prozone = nonreact at 1:1, reactive at 1:8

Gold standard definition- N/A

Found prozone in 2/3222 (0.06%) samples, and 2/223 (0.9%) reactive patients


Large sample size

Unknown HIV infection status

Unknown syphilis stage

Validity of gold standard- N/A


Prozone phenomenon is a limitation of RPR. In this study, about 1% of all reactive patients could be false negative if tested only at 1:1.
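The prozone definition used in this entry (nonreactive at 1:1, reactive at 1:8) and the reported rates can be expressed in a short sketch. The function names are illustrative; the counts are those listed in this entry and in the Liu LL entry above.

```python
def is_prozone(reactive_neat: bool, reactive_diluted: bool) -> bool:
    """Prozone as defined in Post et al.: nonreactive at 1:1, reactive at 1:8."""
    return (not reactive_neat) and reactive_diluted


def rate_percent(events: int, total: int) -> float:
    """Events as a percentage of total."""
    return 100 * events / total


print(f"Post, all samples:      {rate_percent(2, 3222):.2f}%")
print(f"Post, reactive pts:     {rate_percent(2, 223):.1f}%")
print(f"Liu LL, reactive RPR:   {rate_percent(36, 4334):.2f}%")
```

Both studies land near 1% of reactive cases, which is the basis for the comparison drawn in the Liu LL entry.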

Rhoads, D. D., et al. (2017). “Prevalence of Traditional and Reverse-Algorithm Syphilis Screening in Laboratory Practice: A Survey of Participants in the College of American Pathologists Syphilis Serology Proficiency Testing Program.” Archives of Pathology & Laboratory Medicine 141(1): 93-97 Survey questionnaire

2360 labs that subscribe to CAP syphilis serology survey. 83% responded to questions.

Gold standard definition- N/A

Of labs offering single algorithm, about 80% use traditional, and 20% use reverse.

About 25% of labs changed algorithm in past 2 years, and another 10% anticipate changing in upcoming yr

Large sample size and good response rate

Weakness- survey did not include reasons for change in algorithm (e.g., volume)

Validity of gold standard-N/A

Moderate relevance.

Majority of labs still use traditional algorithm, but increasing number using reverse.

Saral, Y., et al. (2012). “Serologic diagnosis of syphilis: comparison of different diagnostic methods.” Acta Dermatovenerologica Croatica 20(2): 84-88 Compare performance of RPR, TPHA, Architect

4226 sera for screening purposes. 117 patients with syphilis diagnosis

Gold standard definition- laboratory (composite of multiple tests), or clinical diagnosis

Architect sensitivity vs. TPHA: 98%

Both Architect and RPR positive in 66 “active” cases.

Of 46 cases positive only by treponemal assays, unknown if any were active syphilis (i.e., false negative RPR).

4109 sera negative by all tests

Incorrectly report low sensitivity of non-trep against trep, when they detect different things.

Incorrect interpretation of data

Validity of gold standard- good


Comparable performance of RPR and Architect.

Schmidt, BL, et al. (2000). J Clin Microbiol 38(3):1279-82 Comparison of sensitivity of assays for primary syphilis

52 patients with primary syphilis diagnosed by dermatologist, all MHA-TP neg.

Gold standard definition- clinical (dx by dermatologists)

VDRL sensitivity: 23/52 (44.2%)

Captia IgM sens: 45/52 (86.5%)

Captia IgG sens: 7/31 (22.6%) (some specimens QNS for testing)

Other EIAs that detect both IgG and IgM had sensitivities ranging from 48.5-76.9%


Dated, but one FDA-cleared trep EIA still available (Captia)

For primary syphilis, this is a rather large sample size

Weakness- clinical diagnosis only. Some incorrect diagnoses?

Validity of gold standard- fair. Clinical dx alone for primary syphilis can have false positives. All negative by MHA-TP.

Somewhat relevant.

This IgM ELISA (NOT FDA cleared) is significantly more sensitive than VDRL for lab diagnosis of syphilis in the primary stage

Sharma, M., et al. (2010). “Syphilis serology in human immunodeficiency virus patients: a need to redefine the VDRL test cut-off for biological false-positives.” Journal of Medical Microbiology 59(Pt 1): 130-131 Evaluate TPPA positivity by VDRL titer

HIV population

2671 sera, 134 (5%) positive by VDRL

Gold standard definition: laboratory (TPPA).

No correlation between VDRL titer and likelihood that TPPA would be positive.

Single center study

Validity of gold standard- good in this study, because they correlated VDRL titer with TPPA reactivity

Minimal relevance.

Conclusion is that titer didn’t help predict biologic false positives versus true reactives.

Singh, A. E., et al. (2008). “Characteristics of primary and late latent syphilis cases which were initially non-reactive with the rapid plasma reagin as the screening test.” International Journal of STD & AIDS 19(7): 464-468 Cross sectional analysis. Identify characteristics of patients who were initially false non-reactive by RPR, 1980-2001.

863 cases of primary syphilis, 1303 cases of late latent

Gold standard definition- N/A

224/863 (26%) primary cases non-reactive on initial RPR screen

512/1303 (39%) late latent cases non-reactive on initial RPR screen

Large sample size

Validity of gold standard- N/A

High relevance. Characteristics of non-reactive cases are not important to key question. Didn’t also test with trep assay at same time.

High rate of initial RPR screens false neg during primary and latent syphilis

Recommend that suspect cases be screened also with trep assay due to non-reactive RPR

Sommese, L., et al. (2016). “Comparison of performance of two Treponema pallidum automated chemiluminescent immunoassays in blood donors.” Infectious Diseases 48(6): 483-487 Blood donor population (5543), over 1 year.

Comparison of Architect and Cobas Elecsys

Pos reflexed to TPHA and Inno-LIA

Gold standard definition: laboratory (Architect total Ab)

Architect detected 21 positives, Elecsys 20.

Architect less specific than Elecsys

Architect considered 100% sensitive, but against itself, not an independent gold standard

Validity of gold standard- good (for screening purposes only, but missing full algorithm)

Moderate relevance.

Head-to-head of 2 modern, automated luminescent trep assays


Tong, M. L., et al. (2014). “Analysis of 3 algorithms for syphilis serodiagnosis and implications for clinical management.” Clinical Infectious Diseases 58(8): 1116-1124. Cross-sectional analysis

One hospital in China; results from 24,124 subjects. 2749 with syphilis

High prevalence (11.4%). Otherwise, patient population not described

Gold standard: clinical dx (clinical, lab, history)

Tests: RPR, TPPA, non FDA cleared CIA (manufactured in China)

Gold standard definition: combined clinical (symptoms plus dated history of infection) and laboratory to identify disease and stage

Sensitivity relative to clinical diagnosis:

Trad: 75.8%

Rev (with RPR): 99.9%

Rev (w/o RPR) (ECDC): 99.4%

Specificity:

Trad: 99.98%

Rev (with RPR): 99.98%

ECDC: 100%

PPV:

Trad: 99.8%

Rev (with RPR): 99.8%

ECDC: 100%

NPV:

Trad: 96.98%

Rev (with RPR): 99.98%

ECDC: 99.92%

Accuracy:

Trad: 97.22%

Rev (with RPR): 99.96%

ECDC: 99.93%


Identified 22 subjects who displayed prozone phen.


Of 665 cases missed by trad. algorithm, 52 were early syphilis

Clinical diagnosis: inadequate information in medical records

High prevalence population only

Large study population, over 1.5 yr

Extensive clinical data

Validity of gold standard- good

Relevant. However, CIA trep assay not FDA cleared

Head-to-head comparison of 3 testing algorithms

Trad algorithm:

2067 pos RPR screen + (active cases) 22 pos only with dilution (prozone)

Rev algorithm: 2089 pos RPR (active cases)
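The traditional-algorithm figures in this entry are linked by the counts given above (24,124 subjects, 2749 with syphilis, 665 missed). The sketch below recomputes sensitivity, negative predictive value, and accuracy from those counts. The true-negative count is not given in the entry; it is back-calculated here from the reported 99.98% specificity, which is an assumption, so results agree with the reported values only to within rounding.

```python
def traditional_metrics(total: int, diseased: int, missed: int,
                        specificity: float) -> dict:
    """Derive 2x2-table metrics from counts plus a reported specificity."""
    true_pos = diseased - missed
    false_neg = missed
    non_diseased = total - diseased
    # Assumption: true negatives back-calculated from reported specificity.
    true_neg = round(specificity * non_diseased)
    return {
        "sensitivity": true_pos / diseased,
        "npv": true_neg / (true_neg + false_neg),
        "accuracy": (true_pos + true_neg) / total,
    }


metrics = traditional_metrics(total=24124, diseased=2749, missed=665,
                              specificity=0.9998)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```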

Tsang, R. S., et al. (2007). “Serological diagnosis of syphilis: comparison of the Trep-Chek IgG enzyme immunoassay with other screening and confirmatory tests.” FEMS Immunology & Medical Microbiology 51(1): 118-124 Samples received by reference laboratory for confirmatory testing, or other additional testing

Gold standard definition: laboratory consensus (several trep and non-trep tests)

Trep-Chek IgG sens: 85%

Specificity: 96%

Trep-Chek may have low sensitivity in reference lab setting due to larger proportion of problematic specimens

Samples originating from many different locations and patient populations

Sample set biased towards unusual findings in original testing labs (limited generalizability)

No clinical information on patients (unknown syphilis stage)

Only 604 specimens total; 34 positive (28 active, 6 past) based on consensus result

Large battery of tests performed on each specimen

Validity of gold standard- good, but no clinical info on patient population


Evaluation of one IgG ELISA against consensus result. No evaluation of other individual methods


Wang, K. D., et al. (2016). “Preferable procedure for the screening of syphilis in clinical laboratories in China.” Infectious Diseases 48(1): 26-31.


Cross-sectional analysis; 2 mo.

One hospital in China; results from 3962 samples.

Gold standard definition: laboratory (TPPA, RPR, CLIA)

Syphilis infection status determined by the tests being evaluated.

Low sensitivity (65%) of RPR is actually due to not detecting past treated syphilis. So, it’s a matter of definition. RPR not intended to detect past treated syphilis.

Increasing CLIA signal/cutoff (S/CO) ratio directly correlated with CLIA true positive rates


Poor quality- study design. Definition used for true infection status

Poor conclusions- RPR is only 65% sensitive because it “missed” past treated infections.

Weakness: Reference standard includes the methods being evaluated.

CLIA method never completely defined. Implied it might be Architect (Abbott)

Validity of gold standard- good tests, but missing important information

Low importance due to poor study design and incorrect interpretation of results (sensitivity and specificity of RPR and CLIA).

CLIA S/CO value correlated with likelihood of being a true positive. This is a convenient feature of the CLIA/EIA format.


Wellinghausen, N. and H. Dietenberger (2011). “Evaluation of two automated chemiluminescence immunoassays, the LIAISON Treponema Screen and the ARCHITECT Syphilis TP, and the Treponema pallidum particle agglutination test for laboratory diagnosis of syphilis.” Clinical Chemistry & Laboratory Medicine 49(8): 1375-1377.

Compared LIAISON and Architect to TPPA as screening tests

Prospective: 557 samples submitted for syphilis screen, including 318 pregnant

FTA-ABS on borderline samples

Discrepant samples tested by immunoblot

Retrospective: 32 patients, including 6 primary, 13 secondary, 8 latent post therapy, 4 latent, 1 neuro.

Gold standard definition- laboratory (prospective screening), but no clear definition of “true positive”; clinical (retrospective syphilis cases)

Screening: All trep tests 100% sensitive. Specificity: Liaison 100%, Architect 99.8%, TPPA 99.6%

Syphilis cases: LIAISON, Architect and TPPA all 100% sensitive

Did not define gold standard (true positive) for prospective study

Validity of gold standard- poor for prospective study- did not define laboratory criteria for true positive

Good for retrospective study- well characterized syphilis cases of different stages

Relevant. Liaison, Architect and TPPA all 100% sensitive. High spec (≥99.6%)

Quality rating: Medium

Wong, E. H., et al. (2011). “Evaluation of an IgM/IgG sensitive enzyme immunoassay and the utility of index values for the screening of syphilis infection in a high-risk population.” Sexually Transmitted Diseases 38(6): 528-532.

674 specimens; almost 50% VDRL reactive (cherry picked).

69% of patients are MSM. 17% HIV prevalence.

Comparison of TREP-SURE IgG/IgM EIA to VDRL

Gold standard definition- laboratory (VDRL)

279 of 285 VDRL-reactive samples (excluding 33 biologic false positives) were EIA positive (97.9%)

Total of 11 EIA false positives (not confirmed by other trep assays) versus 33 VDRL false positives

EIA S/CO ratio correlated with true infection status

No composite gold standard

Not population based. Biased towards VDRL reactive.

Validity of gold standard- fair. Not population based. Biased towards VDRL reactive


In this patient population, with high rate of recent infections, EIA not more sensitive than VDRL (contrary to some claims that EIA may be more sensitive during primary syphilis)

EIA S/CO ratio could be utilized to increase specificity

Woznicova, V. and Z. Valisova (2007). “Performance of CAPTIA SelectSyph-G enzyme-linked immunosorbent assay in syphilis testing of a high-risk population: analysis of discordant results.” Journal of Clinical Microbiology 45(6): 1794-1797.

1771 patients over 4 yr; high-risk STD population

Evaluation of Captia SelectSyph-G ELISA. Compared to TPHA. Discordants resolved by medical history, RPR, other trep tests

Gold standard definition- laboratory (TPHA)

1714/1771 (96.8%) concordant

Captia sens, spec, PPV, NPV = 97.7%, 94.2%, 93.5%, 97.9%, respectively.

57 discordants. After follow up of these, Captia sens, spec, PPV, NPV = 99%, 98%, 99.3%, 97.2%, respectively.


Well described patient populations. Clinical histories on patients with discordant test results. Follow-up performed on most pts with discordant test results.

Not a head-to-head comparison of trep and non-trep

Validity of gold standard-fair. Not comprehensive syphilis algorithm

Moderate relevance.

Captia ELISA compared to another trep test (TPHA)

Yen-Lieberman, B., et al. (2011). “Identification of false-positive syphilis antibody results using a semiquantitative algorithm.” Clinical & Vaccine Immunology: CVI 18(6): 1038-1040.

Low prevalence population (~3% screen positive rate)

142 samples reactive by screen (Bioplex SyphG)

Gold standard definition: laboratory [Trep-Sure (for presence of trep ab)]

Trep-Sure performed on all samples. Likelihood of confirmation dependent on RPR status and Bioplex S/CO. All samples with Bioplex S/CO of 6 or higher were confirmed by Trep-Sure. Only 50% of samples with S/CO <6 were confirmed. Analytical sensitivity of the tests was the same.

Unknown clinical histories. Patient population(s) uncharacterized

Single center study

Validity of gold standard- fair. Trep-Sure might not be recognized reference method


CLIA S/CO ratio could be utilized to increase specificity

Bioplex S/CO ≥6 is highly specific (likely to confirm by different EIA)
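
The semiquantitative approach described in this study can be sketched as a simple triage rule for screen-reactive samples. The cutoff of 6 is taken from the Bioplex finding reported above; the function name and return labels are hypothetical, and the rule is an illustration only, not a validated laboratory procedure.

```python
def triage_by_sco(sco: float, cutoff: float = 6.0) -> str:
    """Triage a screen-reactive sample by its signal/cutoff (S/CO) ratio.

    Assumes the sample is already reactive on the screening assay.
    Samples at or above the cutoff were all confirmed in this study;
    below the cutoff only about half were confirmed, so a second
    treponemal test is still needed.
    """
    if sco >= cutoff:
        return "likely true positive"
    return "needs confirmatory treponemal test"


for value in (8.2, 6.0, 2.5):  # hypothetical S/CO values
    print(value, "->", triage_by_sco(value))
```

The cutoff is kept as a parameter because S/CO scales are assay-specific and a threshold derived for one platform should not be assumed to transfer to another.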

Yoshioka, N., et al. (2007). “Evaluation of a chemiluminescent microparticle immunoassay for determination of Treponema pallidum antibodies.” Clinical Laboratory 53(9-12): 597-603.

Comparison of Architect to RPR/TPPA (traditional algorithm)

630 sera from one hospital, inpatients and outpatients

Gold standard definition- laboratory

500 non-reactive by RPR and TPPA: all non-reactive by Architect

121 reactive by both RPR and TPPA: all reactive by Architect

Unknown outcome of remaining 9 specimens

Patient populations not described

No results from 9, presumably discordant, specimens

Validity of gold standard- good


Good correlation between Architect and traditional algorithm

Young, H., et al. (2009). “The Architect Syphilis assay for antibodies to Treponema pallidum: an automated screening assay with high sensitivity in primary syphilis.” Sexually Transmitted Infections 85(1): 19-23.

129 stored sera with known syphilis stage (untreated). 79 primary, 29 secondary, 9 early latent, 12 latent

1107 routine sera submitted for syphilis serology.

Evaluation of Architect CLIA

Screen assay: Murex ICE EIA (IgG and IgM). Positives then tested by VDRL, TPPA, IgM EIA.

Gold standard definition- lab and clinical. Untreated syphilis: clinical, including likely date of exposure. Categorized patients by stage (from national database). For routine screening, gold standard was TPPA

129 untreated syphilis cases:

Architect and TPPA agreed 100%

Architect more sensitive than VDRL for all stages.

Primary: 97.5% vs. 78.5%

Secondary: 100% vs. 96.6%

Early latent: 100% vs. 88.9%

Latent, unknown duration: 100% vs. 83.3%

Total: 98.4% vs. 83.7%

Of 48 cases identified from 1107 routine specimens, Architect sens 100%, VDRL 35.4% (but many of these were treated)

Architect specificity: 99.1%


Samples well characterized by syphilis stage.

Large number of primary infections (79)

Validity of gold standard- good.

Highly relevant. Direct comparison of Architect CLIA and VDRL.

Architect more sensitive than VDRL for all syphilis stages, esp. primary
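
The pooled sensitivities follow from the stage-specific ones weighted by stage size. The sketch below reproduces the reported percentages from per-stage counts. The stage sizes (79, 29, 9, 12) are from the entry above, but the numbers of reactive samples per stage are back-calculated from the reported percentages and are an assumption, not figures quoted from the paper.

```python
# stage: (n cases, Architect reactive, VDRL reactive)
# Reactive counts are back-calculated assumptions, not source data.
stages = {
    "primary": (79, 77, 62),
    "secondary": (29, 29, 28),
    "early latent": (9, 9, 8),
    "latent, unknown duration": (12, 12, 10),
}


def pct(numerator: int, denominator: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


for stage, (n, architect, vdrl) in stages.items():
    print(f"{stage}: {pct(architect, n)}% vs. {pct(vdrl, n)}%")

total_n = sum(n for n, _, _ in stages.values())
total_architect = sum(a for _, a, _ in stages.values())
total_vdrl = sum(v for _, _, v in stages.values())
print(f"total: {pct(total_architect, total_n)}% vs. {pct(total_vdrl, total_n)}%")
```

Because primary cases make up 79 of the 129 sera, the pooled VDRL figure is pulled toward the lower primary-stage sensitivity.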

Zhang, W., et al. (2012). “The impact of analytical sensitivity on screening algorithms for syphilis.” Clinical Chemistry 58(6): 1065-1066.

Analytical sensitivity analysis of 5 assays: Bioplex IgG, LIAISON, Trep-Sure, CAPTIA G, TPPA.

10 strongly positive samples. 4 with active disease, 6 with past infection.

Gold standard definition- N/A

Analytical sensitivity of Bioplex, LIAISON, CAPTIA were similar.

Trep-Sure more sensitive (by three 2-fold dilutions), and TPPA most sensitive (by six 2-fold dilutions)

Small number of samples

Validity of gold standard- N/A. Samples for analytical sens study characterized based only on lab results (active vs. past infec)

Relevant. Analytical sensitivity of treponemal assays varies, and could impact level of agreement with reflexive testing.

Quality rating: Medium
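
To put the dilution differences on a common scale: each 2-fold dilution step halves the antibody concentration, so a difference of n steps corresponds to a 2^n-fold difference in endpoint titer. A minimal sketch, with a hypothetical function name:

```python
def fold_difference(n_twofold_dilutions: int) -> int:
    """Fold difference in endpoint titer for n serial 2-fold dilution steps."""
    return 2 ** n_twofold_dilutions


print(fold_difference(3))  # three 2-fold dilutions
print(fold_difference(6))  # six 2-fold dilutions
```

On this reading, three dilution steps correspond to an 8-fold difference and six steps to a 64-fold difference relative to the least sensitive assays.
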

Zhiyan, L., et al. (2015). “Consistency Between Treponema pallidum Particle Agglutination Assay and Architect Chemiluminescent Microparticle Immunoassay and Characterization of Inconsistent Samples.” Journal of Clinical Laboratory Analysis 29(4): 281-284.

Analyzed agreement between Architect CLIA and TPPA

Prospective enrollment of samples positive by CLIA. 149 (3.1%) positive. Samples then tested by RPR. Immunoblot performed on discordants.

2506 men, 2364 women

Gold standard definition- laboratory (TPPA)

149 samples (3.1%) positive by CLIA. 112 (75%) positive by TPPA. 18 (12% of 149 samples) were not confirmed by immunoblot. Most were from pregnant women, cancer patients, or patients with autoimmune disorders or other infections.

Clinical data available on CLIA false positives, but the authors only speculate as to the exact cause of the false positive test results

Validity of gold standard- fair. Not comprehensive lab testing. No information on patient clinical status, so can’t assess gold standard relative to disease status

Relevant. Describes potential causes for false positive CLIA results.

Quality rating: Medium
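
The percentages in this entry can be cross-checked against the reported counts. The sketch below assumes the screened population is the sum of the men and women listed (2506 + 2364) and that each person contributed one sample; that denominator is an inference from the entry, not a figure stated in it.

```python
men, women = 2506, 2364
screened = men + women  # assumed denominator for the CLIA positive rate

clia_positive = 149
tppa_positive = 112
immunoblot_unconfirmed = 18


def pct(numerator: int, denominator: int, digits: int = 1) -> float:
    """Percentage rounded to the given number of decimal places."""
    return round(100 * numerator / denominator, digits)


print("CLIA positive rate:", pct(clia_positive, screened), "%")
print("TPPA positive among CLIA positive:", pct(tppa_positive, clia_positive), "%")
print("Not confirmed by immunoblot:", pct(immunoblot_unconfirmed, clia_positive), "%")
```

Under that assumption the computed rates agree with the 3.1%, 75%, and 12% reported above.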