Purpose: Plots to assess the calibration of a prediction model, when applied to a validation dataset, are critical for judging the adequacy of the model or comparing rival models. Traditionally, these plots apply a smoothing function to a plot of the “actual” outcome on the vertical axis against the “predicted” outcome on the horizontal axis, so that the reader can compare the smoothed line to the straight 45-degree line that denotes perfect calibration. While such plots are helpful, two deficiencies remain. First, the plot does not naturally indicate where the bulk of the predictions lie. Second, and related to the first, the prevalence of predictions in a region of miscalibration cannot be inferred. The purpose of the present study was to introduce a plot that repairs both deficiencies of the traditional calibration plot.
Method: After several unsuccessful iterations involving the manipulation of axes, the addition of shading, etc., a plot was constructed that appeared to resolve both deficiencies while remaining readily interpretable. The vertical axis displays prediction error (actual value minus predicted value). The horizontal axis displays the predicted value, spaced in proportion to the frequency of predicted values; in other words, the x-axis spacing is such that a histogram of predicted values would be uniform. This approach makes it easy to see where the bulk of the predictions lie and, more importantly, quickly shows how many predictions fall in a miscalibrated region of the prediction model.
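The axis construction described above can be sketched in code. The following is a minimal illustrative sketch, not the authors' implementation: the function name is hypothetical, a continuous outcome is assumed, and rank-based spacing is used to make the x-axis uniform in the frequency of predictions.

```python
import numpy as np

def miscalibration_curve_coords(actual, predicted):
    """Coordinates for a miscalibration curve (illustrative sketch).

    y is the prediction error (actual - predicted). x is the rank of each
    predicted value scaled to [0, 1], so points are spaced in proportion
    to the frequency of predictions: a histogram of x is uniform.
    Also returns the sorted predicted values, which can label the x ticks.
    """
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    order = np.argsort(predicted)              # sort by predicted value
    error = (actual - predicted)[order]        # vertical axis
    n = len(predicted)
    x = np.arange(n) / (n - 1)                 # uniform rank spacing on [0, 1]
    return x, error, predicted[order]

# Example: three predictions with their observed outcomes.
x, err, labels = miscalibration_curve_coords([1, 2, 3], [3, 1, 2])
# x is evenly spaced; err aligns with the sorted predictions.
```

In a plotting library, `x` and `err` would be drawn as the curve, with a horizontal reference line at zero error denoting perfect calibration, and selected values of `labels` used as x-tick labels so the axis still reads in units of the predicted value.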
Result: Figure 1 presents a traditional calibration curve for a prediction model applied to a validation dataset. Note that this figure suggests quite poor calibration of the prediction tool. Figure 2 is the novel “miscalibration curve.” Note that this curve suggests a substantially different interpretation, indicating excellent calibration of the model for the vast majority of its predictions.
Conclusion: The miscalibration curve provides improved insight into the performance of a prediction model relative to the traditional calibration curve.