3I-5
TWO IS BETTER THAN ONE: AN ILLUSTRATION OF PAIRING SURVIVAL CURVES AND RECEIVER OPERATING CHARACTERISTIC CURVES TO ENHANCE DISEASE SIMULATION MODEL VALIDATION
Purpose:
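Disease modelers often conduct external validation using cross-sectional population-level outcomes (e.g., mortality rates), but a unique validation opportunity presents itself when modelers have access to individual-level longitudinal data. We illustrate how pairing survival curves with receiver operating characteristic curves can use such data to enhance model validation.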
Methods:
We developed a cardiovascular disease (CVD) micro-simulation model that simulates lifetime CVD incidence and mortality. The model requires individual-level data: it draws individuals randomly with replacement from a representative individual-level dataset and simulates the remainder of each individual's life. For this exercise, we used individual-level CVD risk factor data (age, sex, cholesterol, etc.) from the 1999-2000 National Health and Nutrition Examination Survey (NHANES) population, which has follow-up all-cause mortality and CVD mortality data for each individual through 2011. We validated our simulation model against mortality outcomes using two distinct approaches.
Survival curves: We simulated 1,000,000 individuals through the model and tracked their yearly survival. We compared the model's average annual population-level all-cause and CVD mortality rates against those observed in the NHANES population. Non-parametric bootstrapping was used to calculate 95% confidence intervals for observed mortality rates.
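The bootstrap step above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes a simple percentile bootstrap on unweighted binary outcomes (the abstract does not say whether survey weights were used), and the function name `bootstrap_ci` and the toy data are hypothetical.

```python
import random

def bootstrap_ci(outcomes, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for a mortality proportion.

    outcomes: list of 0/1 flags (1 = died within the horizon).
    Assumption: unweighted resampling of individuals with replacement.
    """
    rng = random.Random(seed)
    n = len(outcomes)
    rates = []
    for _ in range(n_boot):
        # Resample the cohort with replacement and recompute the rate
        resample = rng.choices(outcomes, k=n)
        rates.append(sum(resample) / n)
    rates.sort()
    lower = rates[int((alpha / 2) * n_boot)]
    upper = rates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Toy cohort (made-up, not NHANES data): 43 deaths among 1,000 people
toy_outcomes = [1] * 43 + [0] * 957
observed_rate = sum(toy_outcomes) / len(toy_outcomes)
ci_lower, ci_upper = bootstrap_ci(toy_outcomes)
print(observed_rate, ci_lower, ci_upper)
```

A simulated mortality rate can then be checked against the interval, for example by testing whether it lies between `ci_lower` and `ci_upper`.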
ROC curves: We used the same NHANES population and simulated each individual through the model 1,000 times, calculating the percentage of iterations in which each individual died (from any cause or from CVD) within five and ten years. Individuals were ranked by these values to characterize model-based risk. We then compared these individual-level model-based risk rankings to observed individual-level mortality outcomes in the NHANES data, treating the model as a diagnostic test for mortality risk (where observed outcomes were the reference standard). Receiver operating characteristic (ROC) curves were constructed to calculate area under the curve (AUC) values.
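The individual-level procedure above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: `toy_simulate_death` is a hypothetical stand-in for the CVD micro-simulation, the cohort and outcomes are made up, and the AUC is computed with the rank-based (Mann-Whitney) formulation, which equals the area under the empirical ROC curve.

```python
import random

def model_risk(simulate_death, person, n_iter=1000, seed=0):
    """Fraction of simulated runs in which the person dies within the horizon."""
    rng = random.Random(seed)
    deaths = sum(1 for _ in range(n_iter) if simulate_death(person, rng))
    return deaths / n_iter

def auc(risks, died):
    """Probability that a randomly chosen decedent has a higher model-based
    risk than a randomly chosen survivor (ties count as one half)."""
    pos = [r for r, d in zip(risks, died) if d]
    neg = [r for r, d in zip(risks, died) if not d]
    if not pos or not neg:
        raise ValueError("need at least one death and one survivor")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def toy_simulate_death(person, rng):
    # Hypothetical stand-in for one run of the simulation model:
    # the person dies with a fixed per-person probability.
    return rng.random() < person["p_death"]

# Made-up cohort and observed outcomes (1 = died), for illustration only
cohort = [{"p_death": 0.02}, {"p_death": 0.05},
          {"p_death": 0.20}, {"p_death": 0.40}]
observed_died = [0, 0, 1, 1]

risks = [model_risk(toy_simulate_death, person) for person in cohort]
print(auc(risks, observed_died))
```

The pairwise comparison is quadratic in cohort size, which is acceptable for a few thousand individuals; a sort-based rank computation would scale better for larger cohorts.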
Results:
Using survival curves, five-year all-cause mortality for the simulation model compared to NHANES observed outcomes (n=2,689) was 4.6% versus 4.3% (95% CI: 3.7-4.9%); five-year CVD mortality was 1.2% versus 1.1% (0.8-1.4%). At ten years, corresponding values were 10.9% versus 11.2% (10.3-12.2%) and 2.6% versus 2.2% (1.8-2.7%). AUCs for all-cause and CVD mortality at five years were 0.80 (0.77-0.83) and 0.82 (0.75-0.88), respectively, and at ten years, 0.83 (0.81-0.85) and 0.85 (0.81-0.88), respectively (Figure).
Conclusion:
Relying solely on population-level survival curves could lead to individual-level mismatch of risk and outcomes, whereas AUC performance alone does not take absolute risk into account. Our CVD model validation exercise demonstrates that using both methods in tandem can provide a well-rounded summary of model performance.