PS4-32 GRAPHICALLY SUMMARIZING RISK PREDICTION MODELS

Wednesday, October 21, 2015
Grand Ballroom EH (Hyatt Regency St. Louis at the Arch)
Poster Board # PS4-32

Vanya Van Belle, PhD, Ir, ESAT-STADIUS / iMinds-KU Leuven Medical Information Technologies Department, KU Leuven, Leuven, Belgium and Ben Van Calster, PhD, KU Leuven Department of Development and Regeneration, Leuven, Belgium

Purpose:

   To summarize risk prediction models (RPMs) in an understandable and appealing way.

Methods:

   Risk prediction models such as logistic regression and Cox proportional hazards regression obtain a risk estimate in three steps: (i) multiply the value of each predictor by its coefficient, (ii) add these contributions to obtain the linear predictor, and (iii) apply a transformation to the linear predictor to obtain the risk.  An RPM can therefore be summarized by representing these three steps.  We propose the use of color bars to represent step (i).  As in a nomogram, each predictor is represented by one bar, but here the color encodes the value of that predictor's contribution to the linear predictor.  As a result, the impact of the different predictors is visible at a glance.  The translation of the linear predictor into the estimated risk is also represented by a colored bar.
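
   The three steps can be sketched in a few lines of Python.  This is a minimal illustration, not the authors' implementation: the coefficients, the patient values, the baseline survival of 0.9, and all function names (contributions, linear_predictor, logistic_risk, cox_survival, color_position) are assumptions chosen for the example and are not the fitted values behind Figure 1.

```python
import math

# Hypothetical coefficients for illustration only; these are NOT the
# fitted values of the model shown in Figure 1.
COEFS = {"age": 0.01, "nodes": 0.05, "grade": 0.40}


def contributions(coefs, values):
    """Step (i): multiply each predictor value by its coefficient."""
    return {name: coefs[name] * values[name] for name in coefs}


def linear_predictor(contribs):
    """Step (ii): add the contributions to obtain the linear predictor."""
    return sum(contribs.values())


def logistic_risk(lp, intercept=0.0):
    """Step (iii), logistic regression: inverse logit of intercept + lp."""
    return 1.0 / (1.0 + math.exp(-(intercept + lp)))


def cox_survival(lp, baseline_survival):
    """Step (iii), Cox model: S(t | x) = S0(t) ** exp(lp)."""
    return baseline_survival ** math.exp(lp)


def color_position(contribution, lowest, highest):
    """Map a contribution to [0, 1] on a color scale shared by all bars."""
    return (contribution - lowest) / (highest - lowest)


patient = {"age": 50, "nodes": 4, "grade": 2}        # hypothetical patient
contribs = contributions(COEFS, patient)             # approx. 0.5, 0.2, 0.8
lp = linear_predictor(contribs)                      # approx. 1.5
survival = cox_survival(lp, baseline_survival=0.9)   # 0.9 is an assumed S0(t)
```

   Using one color scale for all bars, as color_position does, is what makes contributions comparable across predictors: a predictor whose bar barely changes color contributes little to the linear predictor.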

Results:

   The proposed method is applied to the German Breast Cancer Study Group data.  Figure 1 illustrates a Cox model using age, the number of positive lymph nodes, and the tumor grade.  The effect of age is limited, since the color of the age bar hardly changes over its range.  A larger effect is seen for the number of positive lymph nodes.  However, values larger than 20 occur in only 11 of 686 cases (about 1.6%), so the effect might be overestimated when looking at the color bars alone.  It is therefore important to also consider the percentiles: the fifth and ninety-fifth percentiles of the data have been added to the plot (vertical dashed lines).  The grade of the tumor also contributes to the risk estimate.
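
   The percentile markers and the share of extreme values can be computed as in the sketch below.  The node counts are made-up illustration data, not the study data, and the helper names percentile and share_above are assumptions; the interpolation rule is one common convention, which may differ from the one used for Figure 1.

```python
def percentile(data, q):
    """Return the q-th percentile (0-100) using linear interpolation."""
    xs = sorted(data)
    pos = (len(xs) - 1) * q / 100.0
    lower = int(pos)                      # pos is non-negative, so int() floors
    upper = min(lower + 1, len(xs) - 1)
    return xs[lower] + (xs[upper] - xs[lower]) * (pos - lower)


def share_above(data, threshold):
    """Fraction of observations strictly larger than the threshold."""
    return sum(1 for v in data if v > threshold) / len(data)


# Hypothetical node counts, for illustration only (not the study data).
nodes = [1, 1, 1, 2, 2, 3, 3, 4, 5, 6, 7, 9, 12, 16, 25]

p5, p95 = percentile(nodes, 5), percentile(nodes, 95)   # dashed-line positions
rare = share_above(nodes, 20)                           # 1 of 15 values exceeds 20
```

   Drawing p5 and p95 on a predictor's bar shows which part of the color range is actually populated by data, so a strong color change beyond p95 is read as applying to few patients.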

   The triangles illustrate the risk prediction for one specific patient: age 40, 5 positive nodes, and a tumor grade of 2.  Age and the number of nodes have a very low contribution for this patient (≈0.2), whereas tumor grade has a higher contribution (≈0.8).  The score (i.e., the translated linear predictor) of ≈1.2 corresponds to a predicted 5-year survival of 0.46.

Conclusions:

   The proposed method represents an RPM in an appealing and understandable way.  Adding percentiles allows the effects to be interpreted in relation to the distribution of the data, which guards against overestimating them.  Adding a patient's predictor values and the corresponding risk estimate visualizes the risk prediction process for that specific patient.