Purpose: Utility scores elicited with different health utility instruments may differ. The objective of this study was to compare utility scores derived from three commonly used instruments.
Method: EQ-5D, HUI3 and SF-36 questionnaire data were available for 1,051 individuals from the Ontario HIV Treatment Network (OHTN) Cohort Study, which collects data at 11 active sites across Ontario. Participants completed all three instruments during a face-to-face interview, and utility scores were derived from each instrument. Agreement between the instruments was investigated pairwise. The mean difference (bias) and the 95% confidence limits of the bias (the limits of agreement) were evaluated. Bland-Altman plots were used to investigate systematic bias between utility score measures.
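The bias and limits of agreement described in the Method can be sketched as follows. This is a minimal illustration that assumes the conventional Bland-Altman definition (mean difference plus or minus 1.96 sample standard deviations of the paired differences); the abstract does not state the exact formula used. The function name `bland_altman` and the example scores are hypothetical and are not study data.

```python
from statistics import mean, stdev


def bland_altman(scores_a, scores_b, z=1.96):
    """Return (bias, lower_limit, upper_limit) for paired utility scores.

    Assumes the conventional definition: bias is the mean of the paired
    differences (a - b) and the limits of agreement are bias +/- z * SD.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("paired scores must have equal length")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - z * sd, bias + z * sd


# Hypothetical utility scores for five participants (illustration only).
eq5d = [0.80, 0.90, 1.00, 0.70, 0.60]
hui3 = [0.70, 0.85, 0.90, 0.50, 0.55]

bias, lower, upper = bland_altman(eq5d, hui3)
print(f"bias={bias:.2f}, limits of agreement=({lower:.2f}, {upper:.2f})")
```

The width of the interval is simply `upper - lower`, which is the quantity the Result paragraph reports for each pair of instruments.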
Result: The mean bias for the cohort was –0.01 (HUI3 versus SF-36), 0.12 (EQ-5D versus HUI3) and 0.11 (EQ-5D versus SF-36). However, the upper and lower limits of agreement for HUI3 versus SF-36 were 0.41 and –0.44 respectively, a 95% confidence interval width of 0.85. The narrowest interval, with a width of 0.47, was for the EQ-5D versus the SF-36. In the Bland-Altman plots, proportional error was observed when the SF-36 was compared with the EQ-5D and the HUI3: greater bias was observed as the average utility score decreased. For average utility scores less than 0.42 and 0.45, SF-36 utility scores were always greater than the EQ-5D and HUI3 utility scores respectively. The proportional error was likely caused by the higher floor effect of the SF-36-derived utility scores. In contrast, the Bland-Altman plot of the EQ-5D compared with the HUI3 was notable for clustering at or near the ceiling of the EQ-5D-derived utility scores.
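Proportional error of the kind described above shows up in a Bland-Altman plot as a trend in the differences across the range of average scores. One way to quantify such a trend, offered here only as a sketch, is the least-squares slope of the paired differences against the paired averages. The abstract reports inspecting plots and does not say that a regression was fitted, so this technique is an assumption; the function name `proportional_bias_slope` and the scores are hypothetical.

```python
from statistics import mean


def proportional_bias_slope(scores_a, scores_b):
    """Least-squares slope of (a - b) regressed on the pair average.

    A slope near zero suggests constant bias; a clearly non-zero slope
    suggests that the bias changes with the level of the score.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("paired scores must have equal length")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    avgs = [(a + b) / 2 for a, b in zip(scores_a, scores_b)]
    d_bar, m_bar = mean(diffs), mean(avgs)
    sxx = sum((m - m_bar) ** 2 for m in avgs)
    if sxx == 0:
        raise ValueError("pair averages are all identical; slope undefined")
    sxy = sum((m - m_bar) * (d - d_bar) for m, d in zip(avgs, diffs))
    return sxy / sxx


# Hypothetical scores (illustration only): the first instrument has a
# higher floor, so it exceeds the second mainly at low utility levels.
sf36_like = [0.55, 0.60, 0.70, 0.80, 0.90]
hui3_like = [0.10, 0.30, 0.60, 0.80, 0.95]

slope = proportional_bias_slope(sf36_like, hui3_like)
print(f"slope of difference on average: {slope:.2f}")
```

With these made-up values the slope is negative, meaning the difference grows as the average score falls, which is the pattern the Result paragraph attributes to the SF-36.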
Conclusion: The EQ-5D, HUI3 and SF-36 are three commonly used instruments. However, each instrument uses a unique questionnaire framework and captures different health domains, and the algorithms used to derive utility scores from each instrument employ different designs. At the individual participant level, large differences between utility scores derived from the EQ-5D, HUI3 and SF-36 were observed. Although the three instruments are used to measure a common outcome, the values they produce differ distinctly.
See more of: The 32nd Annual Meeting of the Society for Medical Decision Making