PREDICTING THE EQ-5D PREFERENCE INDEX FROM THE SF-12 HEALTH SURVEY IN A NATIONAL US SAMPLE: A FINITE MIXTURE APPROACH

Monday, October 20, 2014
Poster Board # PS2-15

Candidate for the Lee B. Lusted Student Prize Competition

Marcelo Coca Perraillon, MA, Ya-Chen Tina Shih, Ph.D. and Ronald Thisted, Ph.D., University of Chicago, Chicago, IL
Purpose:  To develop a cross-sectional finite mixture model for predicting the EQ-5D preference index from the mental and physical components of the SF-12 instrument in a sample representative of the US population.   

Methods:  We implemented a finite mixture model assuming that the observed EQ-5D preference index is a combination of three distributions: a degenerate distribution with mass at values indicating perfect health and two censored (Tobit) normal distributions.  A mixture model of this type accounts for the observed characteristics of the EQ-5D distribution, which is bounded, has multiple modes, and has a large proportion of observations clustered at a value of one.  We used data from the 2000 Medical Expenditure Panel Survey and randomly divided observations into estimation and validation datasets. We evaluated predictions in the validation sample using mean absolute error (MAE) and root mean square error (RMSE). We compared two types of predictions from mixture models: predictions weighted by the estimated mixture probabilities and predictions based on classification by the highest estimated probability. We compared these predictions to those of two commonly used models: ordinary least squares (OLS) regression and two-part models. To facilitate the use of mixture models with Tobit components, we developed a Stata command, which we made publicly available.
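
As an illustration of the two prediction types and the two error metrics named in the Methods, the following minimal Python sketch may help. It is not the Stata command mentioned above and does not estimate the mixture. It assumes that each respondent already has estimated component probabilities and a predicted mean for each of the three components, with the first component fixed at 1.0 for perfect health. All function names, variable names, and numeric values are hypothetical and chosen only for illustration.

```python
import numpy as np

def weighted_prediction(probs, comp_means):
    """Prediction weighted by the estimated mixture probabilities."""
    return np.sum(probs * comp_means, axis=1)

def classified_prediction(probs, comp_means):
    """Prediction from the component with the highest estimated probability."""
    top = np.argmax(probs, axis=1)
    return comp_means[np.arange(len(top)), top]

def mae(observed, predicted):
    """Mean absolute error."""
    return float(np.mean(np.abs(observed - predicted)))

def rmse(observed, predicted):
    """Root mean square error."""
    return float(np.sqrt(np.mean((observed - predicted) ** 2)))

# Hypothetical values for three respondents (not taken from the study).
# Columns: [perfect-health component, Tobit component 1, Tobit component 2]
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.6, 0.3],
                  [0.2, 0.3, 0.5]])
comp_means = np.array([[1.0, 0.85, 0.60],
                       [1.0, 0.80, 0.55],
                       [1.0, 0.75, 0.50]])
observed = np.array([1.0, 0.80, 0.40])

pred_weighted = weighted_prediction(probs, comp_means)
pred_classified = classified_prediction(probs, comp_means)

print("weighted   MAE, RMSE:", mae(observed, pred_weighted), rmse(observed, pred_weighted))
print("classified MAE, RMSE:", mae(observed, pred_classified), rmse(observed, pred_classified))
```

In practice the component means for the Tobit components would need to account for censoring, and the probabilities would come from the fitted model; both are taken as given here.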

Results:  Predictions from finite mixture models based on classification outperformed predictions from two-part models and OLS regression. The improvement was substantial in a sample with a smaller proportion of respondents in good health.

Conclusions:  Finite mixtures offer a flexible modeling approach that accounts for the idiosyncratic characteristics of the distribution of preferences. The use of mixture models allows analysts to obtain more accurate estimates of preferences when only summary scores from the SF-12 and a limited number of demographic characteristics are available. Mixture models are particularly useful when the target sample does not have a large proportion of individuals in good health.