PS 4-41
RELIABILITY AND CONSISTENCY OF THREE VALUE FRAMEWORKS FOR ONCOLOGY THERAPEUTICS
Method: Six raters (3 MDs, 1 DNP, 2 PhDs) all rated 2 oncology products for each of 3 cancers (6 product-cancer combinations in all) using 3 frameworks:
- ASCO Value Framework
- ESMO Magnitude of Clinical Benefit Scale
- Institute for Clinical and Economic Review (ICER) Value Assessment Framework.
Prevalent, costly cancers and their associated products were selected to represent a range of indications (curative and palliative), malignancies (solid and hematologic), and mechanisms (cytotoxic, biologic, immunologic).
Raters received the published clinical data required to complete the evaluations and detailed instructions for each framework, but no formal training. Intraclass correlation coefficients (ICC) were estimated to measure the reliability of each tool. Drugs indicated for advanced disease (5 of the 6) were rank ordered by mean and individual scores, and Kendall’s W coefficient was calculated to measure agreement among tools. In sensitivity analyses (SA) for ICC, raters were excluded one at a time.
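The ICC estimate and the leave-one-rater-out sensitivity analysis described above can be sketched as follows. The abstract does not state which ICC form was used, so this sketch assumes the two-way random-effects, absolute-agreement, single-rater form, ICC(2,1). The function names and the score matrix are hypothetical and for illustration only; they are not the study data.

```python
def icc2_1(scores):
    """ICC(2,1) for a matrix with one row per rated product and one column per rater.

    Assumption: two-way random effects, absolute agreement, single rater.
    """
    n = len(scores)          # number of rated products
    k = len(scores[0])       # number of raters
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)                 # between-product mean square
    msc = ss_cols / (k - 1)                 # between-rater mean square
    mse = ss_err / ((n - 1) * (k - 1))      # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def leave_one_rater_out(scores):
    """Recompute ICC(2,1) with each rater (column) excluded in turn."""
    k = len(scores[0])
    return [
        icc2_1([[x for j, x in enumerate(row) if j != drop] for row in scores])
        for drop in range(k)
    ]


# Hypothetical scores: 6 products (rows) x 6 raters (columns). NOT study data.
hypothetical_scores = [
    [40, 45, 38, 50, 42, 44],
    [20, 25, 22, 30, 18, 24],
    [60, 55, 58, 65, 62, 57],
    [35, 30, 40, 38, 33, 36],
    [15, 20, 12, 25, 18, 16],
    [50, 48, 55, 52, 47, 53],
]

icc_all = icc2_1(hypothetical_scores)
icc_range = leave_one_rater_out(hypothetical_scores)
print(f"ICC = {icc_all:.2f} (SA range {min(icc_range):.2f}-{max(icc_range):.2f})")
```

The minimum and maximum of the leave-one-out values correspond to the "SA range" format used to report the results.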
Result: Each of the 6 raters scored all 6 products with each of the 3 frameworks (108 ratings total). ICC results (SA range) were: ASCO 0.66 (0.61-0.70); ESMO 0.73 (0.67-0.78); ICER 0.72 (0.65-0.95). Rankings for the 5 advanced disease regimens (A-E) varied by framework:
Rank | ASCO | ESMO | ICER
1 | A | B | A
2 | B | D | C
3 | C | A | D
4 | D | C | B
5 | E | E | E
Kendall’s W across all 3 frameworks was 0.69 (range 0.59-0.85 among individual raters). Pairwise, Kendall’s W was 0.75 for ASCO-ESMO, 0.85 ASCO-ICER, and 0.70 ESMO-ICER.
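For readers unfamiliar with Kendall's W, a minimal sketch of the calculation follows, using the standard formula without a tie correction. It is applied to the framework-level rankings of regimens A-E reported above. The helper names are hypothetical, and how the original analysis handled ties is an assumption not confirmed by the abstract.

```python
def order_to_ranks(order, items="ABCDE"):
    """Convert a best-to-worst ordering (e.g. "BDACE") into ranks listed in item order."""
    return [order.index(item) + 1 for item in items]


def kendalls_w(rankings):
    """Kendall's coefficient of concordance for m complete rankings of n items (no ties)."""
    m = len(rankings)        # number of rankings being compared
    n = len(rankings[0])     # number of ranked items
    rank_sums = [sum(r[i] for r in rankings) for i in range(n)]
    mean_sum = m * (n + 1) / 2
    s = sum((t - mean_sum) ** 2 for t in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))


# Rank orderings of the 5 advanced-disease regimens, as tabulated in the Result section
asco = order_to_ranks("ABCDE")
esmo = order_to_ranks("BDACE")
icer = order_to_ranks("ACDBE")

print(round(kendalls_w([asco, esmo]), 2))  # ASCO-ESMO
print(round(kendalls_w([asco, icer]), 2))  # ASCO-ICER
```

On the tabulated rankings this gives 0.75 for ASCO-ESMO and 0.85 for ASCO-ICER, in line with the reported pairwise values. The same function can be applied to the other pairings; results may differ from the reported figures if the original analysis handled ties or rater-level rankings differently.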
Conclusion: Knowledgeable but untrained raters, provided with key data, produced moderately reliable results using 3 recently published frameworks for assessing the value of cancer treatments. The frameworks had fair to good consistency, indicating convergent validity, although they led to significantly different conclusions about the relative value of treatments for advanced disease. These results suggest it may be premature to use these frameworks in treatment decision-making without further evidence of their reliability or a better understanding of how to interpret differing results between frameworks.