ROC ANALYSIS: A THEOREM REFUTED

Sunday, 23 October 2005 - 11:45 AM

George R. Laking, MBChB, BMedSci, University of Manchester, Manchester, United Kingdom, Joanne Lord, PhD, National Institute for Health and Clinical Excellence, London, United Kingdom, and Alastair Fischer, PhD, National Institute for Health and Clinical Excellence, London, United Kingdom.

Purpose: In receiver operating characteristic (ROC) analysis, it has been said that "it is a theorem that if one ROC overlaps another, that test improves the cost-effectiveness of treatment under all circumstances". We sought to refute this claim.

Methods: We extend Swets' binormal model of diagnostic testing to a "tetranormal" model, presuming the existence of four clinically relevant subgroups having prevalences and economic characteristics as follows:

Group	G1	G2	G3	G4
Prevalence	9%	21%	49%	21%
Treatment A: QALYs	1.17	1.24	1.13	1.28
Treatment B: QALYs	1.36	1.38	1.07	1.28

Treatment A costs $200 and Treatment B costs $2000.
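
The per-group trade-off implied by these figures can be worked through directly. The following is a minimal sketch, not the authors' calculation: the function and variable names are illustrative assumptions, and the only inputs are the QALY values and treatment costs stated above.

```python
# Figures transcribed from the table and cost statement above.
# All names here are illustrative, not taken from the source.
GROUPS = ("G1", "G2", "G3", "G4")
QALY_A = {"G1": 1.17, "G2": 1.24, "G3": 1.13, "G4": 1.28}
QALY_B = {"G1": 1.36, "G2": 1.38, "G3": 1.07, "G4": 1.28}
COST_A, COST_B = 200.0, 2000.0


def incremental_qalys(group):
    """QALYs gained (or lost) by giving Treatment B instead of A."""
    return round(QALY_B[group] - QALY_A[group], 2)


def cost_per_qaly_gained(group):
    """Extra cost of B per QALY gained; None where B gains nothing."""
    gain = incremental_qalys(group)
    if gain <= 0:
        return None
    return (COST_B - COST_A) / gain


for g in GROUPS:
    print(g, incremental_qalys(g), cost_per_qaly_gained(g))
```

With the table's figures, the QALY difference (B minus A) is positive for G1 and G2, negative for G3, and zero for G4, while the extra cost of Treatment B is the same $1800 in every group.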

We postulate a "gold standard" test that classifies G1 and G3 as belonging to disease X, and G2 and G4 as belonging to disease Y. This reflects the outlook with Treatment A. We now evaluate two further diagnostic tests, D1 and D2, in relation to the gold standard. The tetranormal distributions of diagnostic signal for D1 and D2 are characterized as follows:

Group	D1 mean	D1 s.d.	D2 mean	D2 s.d.
G1	0.40	0.50	-0.50	0.50
G2	0.50	0.60	0.50	0.60
G3	-0.60	0.50	-0.60	0.50
G4	-0.50	0.50	0.40	0.50
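
To illustrate how ROC curves follow from these distributions, here is a minimal sketch under stated assumptions. It assumes that a signal above the threshold counts as a positive result, and that positives are read as disease Y (G2 and G4) with disease X (G1 and G3) as the negatives; the abstract does not state this orientation. The function names, threshold range and step count are likewise illustrative choices.

```python
import math

# (mean, s.d.) of the diagnostic signal per group, from the table above.
D1 = {"G1": (0.40, 0.50), "G2": (0.50, 0.60), "G3": (-0.60, 0.50), "G4": (-0.50, 0.50)}
D2 = {"G1": (-0.50, 0.50), "G2": (0.50, 0.60), "G3": (-0.60, 0.50), "G4": (0.40, 0.50)}
PREVALENCE = {"G1": 0.09, "G2": 0.21, "G3": 0.49, "G4": 0.21}
DISEASE_X = ("G1", "G3")  # gold-standard negatives (assumed orientation)
DISEASE_Y = ("G2", "G4")  # gold-standard positives (assumed orientation)


def upper_tail(threshold, mean, sd):
    """P(signal > threshold) for a normal distribution."""
    return 0.5 * math.erfc((threshold - mean) / (sd * math.sqrt(2.0)))


def positive_rate(test, groups, threshold):
    """Prevalence-weighted share of `groups` testing above the threshold."""
    weight = sum(PREVALENCE[g] for g in groups)
    hits = sum(PREVALENCE[g] * upper_tail(threshold, *test[g]) for g in groups)
    return hits / weight


def roc_points(test, lo=-4.0, hi=4.0, steps=400):
    """(FPR, TPR) pairs as the threshold falls from hi to lo."""
    points = []
    for i in range(steps + 1):
        t = hi - (hi - lo) * i / steps
        points.append((positive_rate(test, DISEASE_X, t),
                       positive_rate(test, DISEASE_Y, t)))
    return points


def area_under(points):
    """Trapezoidal area under a curve given as (x, y) pairs sorted by x."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


auc_d1 = area_under(roc_points(D1))
auc_d2 = area_under(roc_points(D2))
print(f"AUC D1 = {auc_d1:.3f}, AUC D2 = {auc_d2:.3f}")
```

Comparing areas is a coarser check than asking whether one curve lies above the other at every false-positive rate; the point lists returned by `roc_points` can be plotted to inspect that directly.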

As well as plotting ROC curves for D1 and D2, we plot a novel curve, "ROTS", that tracks the economic consequences of expansion of access to Treatment B at increasingly lenient diagnostic test thresholds. ROTS is plotted in cost-effectiveness (CE-) space.
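
One possible reading of this construction can be sketched in code. The abstract describes ROTS only verbally, so the sketch makes several assumptions: patients whose signal exceeds the threshold receive Treatment B and all others receive Treatment A, the cost of the test itself is ignored, and each point is expressed per patient relative to giving Treatment A to everyone. All names are illustrative.

```python
import math

# Inputs transcribed from the tables and cost statement in this abstract.
PREVALENCE = {"G1": 0.09, "G2": 0.21, "G3": 0.49, "G4": 0.21}
QALY_A = {"G1": 1.17, "G2": 1.24, "G3": 1.13, "G4": 1.28}
QALY_B = {"G1": 1.36, "G2": 1.38, "G3": 1.07, "G4": 1.28}
COST_A, COST_B = 200.0, 2000.0
# (mean, s.d.) of the diagnostic signal per group.
D1 = {"G1": (0.40, 0.50), "G2": (0.50, 0.60), "G3": (-0.60, 0.50), "G4": (-0.50, 0.50)}
D2 = {"G1": (-0.50, 0.50), "G2": (0.50, 0.60), "G3": (-0.60, 0.50), "G4": (0.40, 0.50)}


def share_above(threshold, mean, sd):
    """P(signal > threshold): share of a group assumed to receive B."""
    return 0.5 * math.erfc((threshold - mean) / (sd * math.sqrt(2.0)))


def rots_point(test, threshold):
    """(incremental cost, incremental QALYs) per patient versus all-A."""
    d_cost = 0.0
    d_qaly = 0.0
    for group, prev in PREVALENCE.items():
        p_b = share_above(threshold, *test[group])
        d_cost += prev * p_b * (COST_B - COST_A)
        d_qaly += prev * p_b * (QALY_B[group] - QALY_A[group])
    return d_cost, d_qaly


def rots_curve(test, lo=-4.0, hi=4.0, steps=80):
    """Points in CE-space as the threshold becomes more lenient."""
    return [rots_point(test, hi - (hi - lo) * i / steps)
            for i in range(steps + 1)]


for name, test in (("D1", D1), ("D2", D2)):
    for t in (1.0, 0.5, 0.0, -0.5):
        cost, qaly = rots_point(test, t)
        print(f"{name} threshold {t:+.1f}: +${cost:.0f}, {qaly:+.4f} QALYs")
```

Both curves start near the origin (nobody receives Treatment B) and end at the same point (everybody does), so any difference between the tests lies in the path taken between those endpoints. In this sketch, a threshold of 0 on D1 gives a lower incremental cost and a higher incremental QALY gain than the same threshold on D2.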

Results: In ROC analysis, the curve for D2 overlies that for D1. In CE-space, however, the ROTS curve for D1 dominates that for D2. Contrary to the earlier theorem, D1 is the more cost-effective test.

Conclusions: The frame of reference for ROC analysis is determined by a pre-existing diagnostic gold standard, which is imperfectly calibrated for the treatment decision at hand. Given the QALY values tabulated above, the optimal allocation in this case is Treatment B for G1 and G2, and Treatment A for G3 and G4. Tests D1 and D2 offer additional diagnostic information independently of their ability to predict gold standard status. This is revealed by the ROTS curve, which plots the actual economic consequences of changing test thresholds. The result suggests that ROTS analysis, with CE-space as the frame of reference, may be superior to ROC analysis for the evaluation of new diagnostic technology.

Presented in Oral Concurrent Session H - Methodological Advances, at the 27th Annual Meeting of the Society for Medical Decision Making (October 21-24, 2005).