PS1-56 QUANTIFYING RANK ORDER DISCRIMINATION IN UNITS OF INFORMATION

Sunday, October 18, 2015
Grand Ballroom EH (Hyatt Regency St. Louis at the Arch)
Poster Board # PS1-56

William Benish, MS, MD, Case Western Reserve University, Shaker Heights, OH, Jarrod E. Dalton, PhD, Cleveland Clinic, Cleveland, OH and Neal V. Dawson, MD, Case Western Reserve University at MetroHealth Medical Center, Cleveland, OH
Purpose: Methods exist to quantify, in units of information, the discrimination performance of a diagnostic (or prognostic) decision maker (or model) when the decision maker assigns observations to nominal categories, but not when the decision maker assigns probabilities to the possible disease states or other events. The purpose of this report is to introduce an information measure of rank order discrimination that is applicable to probabilistic diagnosis.

Method: “Surprisal” is a primitive information theory concept that quantifies the unlikelihood of an event. The surprisal associated with an event that has probability p is equal to -log p. Bayes’ theorem allows the baseline probability of a disease state or other event to be updated as clinical data accrue. The information provided by the accrued clinical data can be quantified by calculating the reduction in surprisal, e.g., pretest surprisal minus posttest surprisal. To calculate an information measure of rank order discrimination, DInfo, begin with a data set containing: 1) the probabilities assigned by the decision maker to each of the possible disease states on each trial, and 2) the disease state observed on each trial. For a given trial, let s represent the disease state that was observed, p_s represent the baseline probability of s, p* represent the probability assigned by the decision maker to s on that trial, and p** represent the proportion of trials, taken across all trials on which the probability assigned by the decision maker to s was equal to or greater than p*, on which s was observed. The information provided by the decision maker’s rank order discrimination on that trial is -log(p_s) + log(p**). Define DInfo as the average of these values across all trials.
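The calculation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `d_info`, the dict-per-trial input format, and the use of base-2 logarithms (bits) are hypothetical choices, and the baseline probability of each disease state is assumed to be its observed proportion in the data set, which the abstract does not specify.

```python
import math
from collections import Counter


def d_info(probs, observed, base=2.0):
    """Average rank order discrimination information across trials.

    probs    -- list of dicts, one per trial, mapping each disease state
                to the probability the decision maker assigned to it
    observed -- list of the disease states actually observed, one per trial
    base     -- logarithm base (assumption: 2, so the result is in bits)
    """
    n = len(observed)
    counts = Counter(observed)
    total = 0.0
    for row, s in zip(probs, observed):
        # Assumption: baseline probability = observed proportion of s.
        p_s = counts[s] / n
        p_star = row[s]
        # Outcomes of all trials on which the probability assigned to s
        # was at least p_star (this always includes the current trial).
        outcomes = [obs for r, obs in zip(probs, observed) if r[s] >= p_star]
        p_2star = sum(1 for obs in outcomes if obs == s) / len(outcomes)
        # Per-trial information: -log(p_s) + log(p**).
        total += -math.log(p_s, base) + math.log(p_2star, base)
    return total / n


# Hypothetical two-state data set with perfectly ranked probabilities.
example_probs = [
    {"D": 0.9, "N": 0.1},
    {"D": 0.8, "N": 0.2},
    {"D": 0.3, "N": 0.7},
    {"D": 0.2, "N": 0.8},
]
example_observed = ["D", "D", "N", "N"]
example_value = d_info(example_probs, example_observed)
```

Because every trial counts toward its own p**, the logarithm is never taken of zero. In the hypothetical example the probabilities rank the two states perfectly, so `example_value` should come out at about 1 bit, the entropy of a 50/50 split.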

Result: In the case of perfect discrimination, DInfo equals the uncertainty about the disease state (calculated as the entropy of the observed disease states). When the decision maker’s probability estimates do not vary across trials, DInfo equals zero. DInfo and the area under the receiver operating characteristic curve can reach different conclusions about which of two sets of probability assignments demonstrates the better discrimination.
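The two boundary cases follow directly from the definition of DInfo. The short derivation below is a sketch that assumes the baseline probability of each disease state s equals its observed proportion n_s/N; the symbols N (number of trials), s_i (state observed on trial i), and n_s (number of trials on which s was observed) are introduced here for illustration only.

```latex
% Definition: average per-trial information over N trials.
\[
D_{\mathrm{Info}} = \frac{1}{N}\sum_{i=1}^{N}\left[-\log p_{s_i} + \log p^{**}_{i}\right]
\]

% Perfect discrimination: every trial with an assigned probability at least
% as high as p* for state s_i also has outcome s_i, so p**_i = 1.
\[
D_{\mathrm{Info}} = -\frac{1}{N}\sum_{i=1}^{N}\log p_{s_i}
 = -\sum_{s}\frac{n_s}{N}\log p_s
 = -\sum_{s} p_s \log p_s = H(S)
\]

% Constant probability estimates: every trial meets the threshold,
% so p**_i is the overall proportion of s_i, i.e. p**_i = p_{s_i}.
\[
D_{\mathrm{Info}} = \frac{1}{N}\sum_{i=1}^{N}\left[-\log p_{s_i} + \log p_{s_i}\right] = 0
\]
```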

Conclusion:   The proposed information measure of rank order discrimination complements the established information measures of total diagnostic information, categorical discrimination, and calibration.  It is applicable to the case in which there are multiple possible disease states.