QUANTIFICATION OF DIAGNOSTIC TEST DISCRIMINATION AND CALIBRATION IN UNITS OF BITS OF INFORMATION

Monday, 24 October 2005
40

William A. Benish, MD, MS, Department of Veterans Affairs, Cleveland, Shaker Heights, OH, Matthew Karafa, PhD, The Cleveland Clinic Foundation, Cleveland, OH, and Neal V. Dawson, MD, Metrohealth Medical Center, Cleveland, OH.

Purpose: To demonstrate the application of information statistics to the quantification of diagnostic test discrimination and calibration.

Methods: Information theory is applied to quantifying diagnostic information. If p is the estimated probability that a patient has a certain disease, then -log2(p) is the “surprisal” associated with confirming that the patient has that disease. The information value (I) of a diagnostic test is the expected value of the reduction in surprisal that occurs as a result of testing. Diagnostic test performance is modeled by assuming that the test results are normally distributed for both the diseased and healthy populations. Test discrimination improves as the distance between the means of the two populations increases. Miscalibration occurs when incorrect assumptions are made about the distributions of the test results. In the present example, miscalibration is modeled by assuming that the standard deviations of the normal distributions are larger or smaller than their true values. An information-based discrimination index (DIinfo) is defined as I for the case in which calibration is perfect. An information-based calibration index (CIinfo) is defined as the amount of information lost as a result of miscalibration, i.e., CIinfo = DIinfo - I. Traditional measures of test discrimination and test calibration derived from Brier scores are also applied to these simulated data; comparisons are made between the information statistics and the Brier score statistics.
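The quantities defined above can be illustrated with a minimal numerical sketch. It is not the authors' code, and the abstract does not give explicit formulas, so the following are assumptions: the healthy mean is fixed at 0 and the diseased mean at `separation`; both populations share one true standard deviation `sd_true`; the forecaster computes posterior probabilities by Bayes' rule using `sd_assumed` in place of `sd_true`; and I is taken as the prior-weighted expectation, under the true distributions, of log2(posterior/prior) for the patient's actual state. All function and parameter names are hypothetical.

```python
import math

def log_normal_pdf(x, mean, sd):
    """Natural log of the normal density at x."""
    return -0.5 * ((x - mean) / sd) ** 2 - math.log(sd * math.sqrt(2 * math.pi))

def log2_posteriors(x, prior, mean_h, mean_d, sd_assumed):
    """Return (log2 q, log2 (1 - q)), where q is the forecaster's
    P(disease | x) computed with the ASSUMED standard deviation."""
    a = math.log(prior) + log_normal_pdf(x, mean_d, sd_assumed)
    b = math.log(1 - prior) + log_normal_pdf(x, mean_h, sd_assumed)
    m = max(a, b)  # log-sum-exp keeps extreme test results from underflowing
    log_total = m + math.log(math.exp(a - m) + math.exp(b - m))
    return (a - log_total) / math.log(2), (b - log_total) / math.log(2)

def information_value(prior, separation, sd_true=1.0, sd_assumed=1.0,
                      n=4001, span=10.0):
    """Expected reduction in surprisal (bits) from testing.

    Assumed model: healthy ~ N(0, sd_true), diseased ~ N(separation, sd_true).
    The expectation is a trapezoidal integral over the true densities.
    """
    mean_h, mean_d = 0.0, separation
    lo = mean_h - span * sd_true
    hi = mean_d + span * sd_true
    step = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        x = lo + i * step
        log2_q, log2_not_q = log2_posteriors(x, prior, mean_h, mean_d, sd_assumed)
        gain_d = log2_q - math.log2(prior)          # surprisal drop if diseased
        gain_h = log2_not_q - math.log2(1 - prior)  # surprisal drop if healthy
        dens_d = math.exp(log_normal_pdf(x, mean_d, sd_true))
        dens_h = math.exp(log_normal_pdf(x, mean_h, sd_true))
        weight = 0.5 if i in (0, n - 1) else 1.0    # trapezoid end points
        total += weight * step * (prior * dens_d * gain_d
                                  + (1 - prior) * dens_h * gain_h)
    return total

# Illustrative values only; they are not taken from the abstract.
prior, separation = 0.5, 1.0
di_info = information_value(prior, separation)                 # perfect calibration
i_miscal = information_value(prior, separation, sd_assumed=0.5)
ci_info = di_info - i_miscal                                   # information lost
print(f"DIinfo = {di_info:.4f} bits, I = {i_miscal:.4f} bits, CIinfo = {ci_info:.4f} bits")
```

Under these assumptions, DIinfo equals the mutual information between disease state and test result, so it lies between 0 and the entropy of the base rate, and CIinfo is nonnegative. A forecaster who assumes a standard deviation that is much too small can drive I below zero.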

Results: Examination of graphs of the various test performance measures plotted as a function of the separation between the means of the distributions reveals the following: a) a general similarity between the information statistics and the Brier score-related statistics; b) qualitative differences between these statistics, in that situations exist in which the two classes of statistics reach different conclusions about which test is better calibrated; c) diagnostic information is quantified as a negative number when the performance of a test is worse than that of a forecaster who predicts the base rate.

Conclusions: Diagnostic test discrimination, test calibration, and overall test performance can be quantified in units of bits of information. In contrast to conventional methods of quantifying test performance, information measures are meaningful (nonarbitrary) ratio scale statistics.

Poster Session III
The 27th Annual Meeting of the Society for Medical Decision Making (October 21-24, 2005)
