Purpose: Many factors affect the balance of true and false test results. The interaction of two such factors, disease prevalence and the positive threshold, causes results to differ between high- and low-prevalence settings. We used the example of testing for latent tuberculosis infection (LTBI) to demonstrate the importance of disease prevalence in decisions about positive thresholds and test strategies.
Method: We compared the numbers of true and false positive results obtained with two LTBI screening tests (QuantiFERON-TB Gold In-Tube [QFT-IT] and T-SPOT.TB) in five countries of varying prevalence. We used information from test manufacturers to ascertain each test’s positive threshold, published literature to determine sensitivity (81%, QFT-IT; 88%, T-SPOT.TB) and specificity (99%, QFT-IT; 88%, T-SPOT.TB), and World Health Organization estimates for country-specific LTBI prevalence. We assumed that sensitivity and specificity remained stable, so that prevalence was the only difference between settings.
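The comparison described in the Method reduces to simple arithmetic on sensitivity, specificity, and prevalence. A minimal Python sketch follows, assuming a hypothetical cohort of 1,000 people tested and an illustrative prevalence of 5%; the cohort size and the function name `expected_positives` are assumptions for illustration, not part of the study.

```python
def expected_positives(prevalence, sensitivity, specificity, n_tested=1000):
    """Return (true_positives, false_positives) expected among n_tested people.

    n_tested is a hypothetical cohort size chosen only for illustration.
    """
    infected = prevalence * n_tested
    uninfected = n_tested - infected
    true_pos = sensitivity * infected            # infected people who test positive
    false_pos = (1.0 - specificity) * uninfected  # uninfected people who test positive
    return true_pos, false_pos


# (sensitivity, specificity) as reported in the Method.
TESTS = {
    "QFT-IT": (0.81, 0.99),
    "T-SPOT.TB": (0.88, 0.88),
}

# Illustrative prevalence of 5%; any value between 0 and 1 can be substituted.
for name, (sens, spec) in TESTS.items():
    tp, fp = expected_positives(0.05, sens, spec)
    print(f"{name}: {tp:.1f} true positives, {fp:.1f} false positives per 1,000 tested")
```

Holding sensitivity and specificity fixed and varying only `prevalence` mirrors the stated assumption that prevalence is the only difference between settings.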
Result: In switching from QFT-IT to T-SPOT.TB, the 7-percentage-point increase in sensitivity affected the number of true positives more in high-prevalence settings, whereas the 11-percentage-point decrease in specificity affected the number of false positives more in low-prevalence settings. Tradeoffs between increasing case identification and decreasing unnecessary treatment thus differed by an order of magnitude or more as prevalence varied, with lower-prevalence settings paying a “price” of accepting more false positives for each true positive gained. For example, the number of false positives per true positive gained in the United States, with 5% LTBI prevalence, was close to 10-fold higher than in Mexico, with 29% prevalence, and 30-fold higher than in Ivory Coast, with 55% prevalence. Lower-prevalence countries may therefore determine that a 7-percentage-point increase in early case detection benefits too few people to justify the high burden of false positives, while higher-prevalence countries may decide that their larger absolute gain in early detection is worth the increased treatment of false positives, especially in settings with limited access to care.
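The tradeoff can be sketched as the number of additional false positives accepted per additional true positive gained when switching tests. The Python sketch below is a simplified calculation under stated assumptions: it uses only the sensitivity gain (0.07), the specificity loss (0.11), and the three prevalence values quoted in the Result, and the function name is hypothetical. The authors' exact computation is not given here, so the fold differences this sketch produces need not match the figures quoted in the Result.

```python
def false_pos_per_true_pos_gained(prevalence, sens_gain=0.07, spec_loss=0.11):
    """Additional false positives per additional true positive (simplified ratio).

    sens_gain and spec_loss are the changes in sensitivity and specificity
    when switching from QFT-IT to T-SPOT.TB, expressed as proportions.
    """
    extra_true_pos = sens_gain * prevalence            # per person tested
    extra_false_pos = spec_loss * (1.0 - prevalence)   # per person tested
    return extra_false_pos / extra_true_pos


# LTBI prevalence values quoted in the Result.
PREVALENCE = {"United States": 0.05, "Mexico": 0.29, "Ivory Coast": 0.55}

for country, prev in PREVALENCE.items():
    ratio = false_pos_per_true_pos_gained(prev)
    print(f"{country}: {ratio:.1f} additional false positives per true positive gained")
```

Because the ratio is proportional to (1 - prevalence) / prevalence, it falls steeply as prevalence rises, which is the pattern the Result describes.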
Conclusion: The sensitivity and specificity of tests such as QFT-IT and T-SPOT.TB differ in large part because of their positive thresholds, which manufacturers apply identically across settings yet which can produce markedly different outcomes from one setting to another. To optimize test performance and improve outcomes, the balance of sensitivity and specificity should be set locally rather than globally, by considering prevalence together with other disease- and setting-specific factors when making testing decisions.