5N-6
PREDICTIONS IN FINITE MIXTURE MODELS FOR MAPPING TO THE EQ-5D-3L: MEAN-BASED VERSUS CLASSIFICATION-BASED PREDICTIONS
Methods: We used data from the 2003 Medical Expenditure Panel Survey to estimate two types of finite mixture models: a mixture of two normal distributions and a custom mixture of a degenerate distribution with mass at 1 and two censored normals (Tobit). We used the SF-12v2 mental and physical summary scores as predictors of both the component means and the mixture probabilities. We compared two types of predictions from mixture models: an average of the predicted component means weighted by the estimated mixture probabilities (mean-based predictions) and the predicted mean of the component with the highest estimated mixture probability (classification-based predictions). We evaluated predictions using mean absolute error (MAE) and root mean square error (RMSE). We also estimated models in a subsample of individuals with heart disease and stroke because these individuals have a wider range of EQ-5D-3L scores. In addition, we compared mixture model predictions to those of OLS regression.
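The two prediction rules and the two error measures can be illustrated with a minimal sketch. It assumes that, for each individual, a fitted mixture model has already produced the predicted mean of each component and the estimated mixture probabilities; the function names (`mean_based`, `classification_based`, `mae`, `rmse`) and the numbers in the example are hypothetical and are not taken from the study.

```python
import numpy as np

def mean_based(component_means, probs):
    """Average of the component means weighted by the mixture probabilities."""
    return np.sum(probs * component_means, axis=1)

def classification_based(component_means, probs):
    """Mean of the component with the highest estimated mixture probability."""
    top = np.argmax(probs, axis=1)
    return component_means[np.arange(component_means.shape[0]), top]

def mae(y, y_hat):
    """Mean absolute error."""
    return np.mean(np.abs(y - y_hat))

def rmse(y, y_hat):
    """Root mean square error."""
    return np.sqrt(np.mean((y - y_hat) ** 2))

# Illustrative (made-up) inputs for two individuals and three components.
# The first column plays the role of a degenerate component with mass at 1.
component_means = np.array([[1.0, 0.8, 0.3],
                            [1.0, 0.7, 0.2]])
probs = np.array([[0.6, 0.3, 0.1],
                  [0.1, 0.2, 0.7]])
y_observed = np.array([1.0, 0.2])  # hypothetical observed utilities

pred_mean = mean_based(component_means, probs)
pred_class = classification_based(component_means, probs)

print("mean-based:", pred_mean, "MAE:", mae(y_observed, pred_mean))
print("classification-based:", pred_class, "MAE:", mae(y_observed, pred_class))
```

In this toy input the weighted average pulls both predictions toward the middle of the scale, whereas the classification rule returns a value at one of the component means. The same rule assigns the whole prediction to the wrong component whenever the most probable component is not the true one.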
Results: Classification-based predictions from finite mixture models outperformed mean-based predictions. Mean-based predictions, which average the predicted component means weighted by the estimated mixture probabilities, were in fact similar to predictions from simple OLS models. Classification-based predictions performed better in samples with a wider range of EQ-5D-3L scores.
Conclusions: Predicting health utility from generic or disease-specific non-preference instruments presents a challenge because of the unusual distribution of health utility, which is bounded on both the left and the right, has multiple modes, and has a non-negligible proportion of observations clustered at a single value. Finite mixture models have emerged as promising alternatives to traditional models because they can account for these characteristics of preference-based data. While several studies have explored which forms of mixtures better capture the distributional characteristics of health utility data, the literature offers little guidance on which prediction method is more appropriate. In this study, we showed that only classification-based predictions retain the advantages of mixture models. Mean-based (weighted-average) predictions do not preserve the characteristics of health utility data and are similar to OLS predictions. However, classification-based predictions may result in large errors due to misclassification.