ORAL ABSTRACTS: ASSESSING PATIENT PREFERENCES AND PRIORITIES
Laura D. Scherer, PhD
University of Missouri
Department of Psychological Sciences
Method: We surveyed 149 healthcare providers (47% female; 74% White) recruited from two large urban academic hospital emergency departments in the Mid-Atlantic region between May 19th and December 26th, 2015, and 224 patients (61% female; 56% African-American) from one of the emergency departments between April and July 2015. All subjects answered 46 Likert-scale questions measuring providers’ or patients’ mental representations, 2 yes/no questions measuring patients’ expectations for antibiotics, and 2 free-response questions measuring patients’ knowledge of antibiotic prescribing. Analysis was conducted using exploratory factor analysis (EFA).
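The per-factor variance shares that EFA yields can be computed from the factor loadings as the sum of squared loadings divided by the total standardized item variance. A minimal sketch on synthetic Likert-style data (all values and the 4-factor structure are illustrative assumptions; the study's actual responses and factor solution are not reproduced):

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)

# Synthetic stand-in for 46 Likert items answered by ~200 respondents:
# a few latent "gist" factors plus item-specific noise (illustrative only).
n_resp, n_items, n_factors = 200, 46, 4
latent = rng.normal(size=(n_resp, n_factors))
true_loadings = rng.normal(size=(n_factors, n_items))
items = latent @ true_loadings + rng.normal(scale=2.0, size=(n_resp, n_items))

# Standardize items, as is conventional before factor analysis.
items = (items - items.mean(axis=0)) / items.std(axis=0)

fa = FactorAnalysis(n_components=n_factors, random_state=0).fit(items)

# Share of total item variance captured by each factor:
# sum of squared loadings per factor over the total (standardized) variance.
prop_var = (fa.components_ ** 2).sum(axis=1) / n_items
```

Quantities like the 13% and 16% shares reported for the “Why Not Take a Risk” gist are of this form: one factor's squared loadings relative to the total variance.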
Results: The “Why Not Take a Risk” gist captured significant unique variance across both provider (13% of variance) and patient (16% of variance) samples. Separate factors captured other relevant gists such as the possibility of harm from side effects (10% of variance for providers; 9% of variance for patients) and that antibiotics might not be safe (6% of variance for providers; 11% of variance for patients). Patients (6% of variance), but not providers, endorsed a gist indicating that antibiotics work against viruses.
Conclusion: Both patients and providers use “Why Not Take a Risk?” – a widespread strategy associated with categorical risk perceptions rather than verbatim analysis. Although individually rational, reliance on this gist can lead to socially suboptimal outcomes, including antibiotic resistance. These perceptions are associated with physicians’ expectations for antibiotics, which can affect their prescribing. Additionally, patients’ expectations have been shown to drive physician behavior. Thus, “Why Not Take a Risk?” may be a strong driver of overprescribing, suggesting opportunities for public health communication interventions and physician education.
Method: A randomized controlled trial (ClinicalTrials.gov Identifier: NCT02637609) was conducted among a nationally representative, and racially/ethnically diverse, sample of patients with type 2 diabetes in the United States. Respondents evaluated 11 concepts that were purposively selected through a robust mixed-methods environmental scan to represent both known barriers and facilitators of patients’ diabetes management and were randomized to either Likert or BWS. Standardized mean scores were calculated for each object based on the Likert and BWS methods and the results were compared graphically and tested statistically via Spearman’s Rho. We also compared measures of respondents’ understanding of the tasks and associated burdens.
Result: Randomization to Likert (n=549) and BWS (n=554) resulted in balanced respondent characteristics. While the results were highly correlated across the methods (Spearman’s Rho=0.973), respondents in the Likert arm did not value any factor as a barrier, whereas BWS identified concepts that respondents valued both positively and negatively. BWS also had tighter standard errors across all objects. Despite the statistical problems associated with their use, respondents considered Likert items easier to understand, easier to answer, and more reflective of their preferences (P<0.01).
Conclusion: Despite the high correlation between the results, BWS appears to have numerous advantages. Respondents tended to use only the positive responses on the Likert items, even for concepts that have been documented as barriers in the literature. Respondents nonetheless found Likert items less burdensome, likely reflecting both the novelty of BWS and its greater task demands. That said, while Likert items impose less burden on each respondent, many more respondents are needed, and even then the results may be biased. Moving forward, researchers, policy makers, and clinicians should consider BWS as an alternative to simple Likert items.
To investigate how two approaches to modeling taste heterogeneity affect inferences about the relative-importance weights that elderly US adults placed on treatment benefits and risks for delaying the onset of Alzheimer’s disease (AD).
1004 US individuals aged 60 to 85 completed a web-enabled discrete-choice survey instrument in which they were asked to suppose that, without medication, they would develop AD in the future. Survey tasks presented the option of no medication or a hypothetical AD treatment, using either a 12- or 16-year timeframe with progression from normal memory to cognitive impairment to AD to death. AD treatments were defined by reductions in the number of years with cognitive impairment or AD, but with daily nausea and increased risks of disabling stroke and of death in the first year of treatment. Choice tasks were generated in SAS to satisfy predetermined statistical design properties. Choice data were analyzed using a random parameters logit (RPL) model in Stata and a scale-adjusted latent-class analysis (LCA) model in Latent GOLD. Model parameters were rescaled to a common metric to facilitate comparison between RPL and LCA.
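An RPL model obtains choice probabilities by integrating standard logit probabilities over a distribution of individual taste coefficients, usually by simulation. A minimal numpy sketch of that averaging step, under hypothetical normally distributed coefficients (the attribute coding and all coefficient means/SDs are assumptions, not the fitted Stata estimates):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical AD treatment vs. "no medication" (illustrative coding:
# years of impairment avoided, daily nausea indicator, stroke risk in %).
x_treat = np.array([3.0, 1.0, 5.0])
x_none = np.array([0.0, 0.0, 0.0])

# Assumed population distribution of tastes: normal random coefficients.
beta_mean = np.array([0.6, -0.8, -0.25])
beta_sd = np.array([0.3, 0.4, 0.10])

# Simulate R coefficient draws and average the logit choice probabilities.
R = 10_000
betas = rng.normal(beta_mean, beta_sd, size=(R, 3))
v_treat = betas @ x_treat
v_none = betas @ x_none                    # zeros here; kept for clarity
p_treat = np.exp(v_treat) / (np.exp(v_treat) + np.exp(v_none))
prob_treatment = p_treat.mean()            # simulated RPL choice probability
```

The LCA alternative replaces the continuous coefficient distribution with a small number of discrete classes, each with its own coefficient vector.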
LCA revealed three distinct classes of respondents based on survey timeframe, respondent age, current health status, and whether the respondent was a current AD caregiver. Class one (42% of study sample) generally preferred medication, traded between all benefits and risks, was relatively younger, and was not a current AD caregiver. Class two (30% of sample) preferred no medication, was more concerned about treatment risks than benefit, had no self-reported illnesses, and was a current AD caregiver. Class three (28% of sample) strongly preferred medication, was more concerned about treatment benefit than risks, was relatively older, and was a current AD caregiver. The relative importance of treatment benefit and risks from RPL was similar to that of the largest class (class one).
Most respondents (70%) were willing to accept treatment risks to reduce time with cognitive impairment or AD. However, the LCA identified 30% of respondents who were more risk averse, with a strong preference for no AD treatment. While RPL results may be informative for the “average” respondent and for general policymaking, LCA results may hold the key to identifying preference heterogeneity and guiding treatment decision making in a clinical setting.
Stated preference (SP) methods elicit individuals’ preferences and are widely used in health policy and clinical decision-making. A major concern is that individuals’ responses to hypothetical choices may not reflect their real preferences, which calls into question the external validity of the stated preferences. Recent research has found that the external validity of SP surveys is lower for publicly versus privately funded goods, and for goods with a moral component. Moreover, in a health context, the few SP studies that have assessed external validity have reported aggregate differences between predictions from SP data and revealed preferences. The aim of this paper is to contrast the predictions from an SP survey with the revealed preferences observed in a large, linked observational dataset.
This paper uses a case study of blood donation to illustrate an approach for improving the accuracy of predictions from SP models. A large online SP survey (5000 invitees) was administered to provide information about donors’ willingness to donate blood at different frequencies, according to alternative future policy options. Donors invited to complete the SP survey were selected from a large longitudinal dataset, the PULSE database, of all 1.2 million blood donors in England, which records the frequency with which donors actually donate blood under current policies. For those policies, we contrasted the actual donation frequency with the same donors’ predicted donation frequency estimated with a multinomial logit model. This ‘within-sample approach’ to estimating the discrepancy between stated and revealed preferences minimizes bias due to unobserved confounding.
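The within-sample comparison amounts to contrasting each donor's observed donation frequency with the expected frequency implied by the model's predicted category probabilities, then summarizing the discrepancy by subgroup. A sketch with simulated stand-in data (the PULSE records, the fitted model, and the subgroup definitions below are all placeholders):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)

# Simulated stand-in for the linked data: observed annual donations per
# donor and the SP model's predicted probabilities over frequency
# categories 0-3 donations/year (illustrative values, not PULSE data).
n = 1000
donors = pd.DataFrame({
    "sex": rng.choice(["F", "M"], size=n),
    "age_band": rng.choice(["17-30", "31-60", "61-70"], size=n),
    "observed": rng.poisson(1.2, size=n),
})
categories = np.array([0, 1, 2, 3])
pred_probs = rng.dirichlet(np.ones(4), size=n)     # stand-in MNL output
donors["predicted"] = pred_probs @ categories      # expected frequency

# Subgroup-level discrepancy: percentage by which the SP predictions
# exceed the observed donation frequency.
by_group = donors.groupby(["sex", "age_band"])[["observed", "predicted"]].mean()
by_group["overestimate_pct"] = (
    100 * (by_group["predicted"] / by_group["observed"] - 1)
)
```

Subgroup figures of this kind correspond to the reported 31%/40% overestimates for women and men, and the larger gap among younger women.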
Compared to the observed frequencies of blood donation, the SP model overestimated the frequency by 31% for women and 40% for men, with wider differences in the estimated discrepancies according to other characteristics. For example, the predicted and observed frequencies differ by 45% for younger women (aged 17-30), but only by 16% for older women (aged 61-70).
This approach can extend the external validity of SP models by harnessing large datasets to calibrate predictions at the subgroup rather than the aggregate level. Hence, this method can help improve the predictive value of responses to SP surveys, and their usefulness for decision-making.
The hypothetical nature of stated preference data raises an important question about its validity in characterizing respondents’ actual behaviour. The objective of this study was to compare the forecasted choices of respondents using stated preferences in a discrete choice experiment (DCE) to their observed actual choices at an individual level.
A DCE was performed in patients before they were offered treatment for latent tuberculosis infection. A mixed logit model was estimated using hierarchical Bayes. The individual-specific preference coefficients were used to calculate each patient’s expected probability of choosing treatment. The forecasted choice based on this probability was compared with the patient’s actual decision. We examined the comparability of different distributions for the random parameters. We also explored the predictive power of the DCE using different thresholds to convert probabilities into predicted choices and a Receiver Operating Characteristic (ROC) curve.
Our results identified significant heterogeneity in preferences for all attributes among respondents. The model with a log-normal distribution for attributes representing treatment side effects improved model fit and predictive power compared with the other model specifications. The best model correctly predicted actual decisions for 83% of participants. The predictive performance of the DCE results was also confirmed using a threshold that maximizes Youden’s index and the ROC curve. The area under the ROC curve was 0.8237. We also showed that individual-specific coefficients reflected respondents’ actual choices more closely than the aggregate-level estimates: while the probability of the chosen alternative over the sample was 69% based on the aggregate coefficients, the average of the predicted individual probabilities of the chosen alternative was 82%.
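The threshold-selection step can be sketched as follows: compute the ROC curve, take the threshold maximizing Youden's J = sensitivity + specificity − 1, and convert predicted acceptance probabilities into predicted choices. The data below are simulated placeholders, not the study's estimates:

```python
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

rng = np.random.default_rng(3)

# Illustrative data: actual accept/decline decisions and the model's
# predicted acceptance probabilities for each patient.
n = 300
actual = rng.integers(0, 2, size=n)
predicted_prob = np.clip(
    0.2 + 0.6 * actual + rng.normal(0.0, 0.2, size=n), 0.01, 0.99
)

fpr, tpr, thresholds = roc_curve(actual, predicted_prob)
auc = roc_auc_score(actual, predicted_prob)

# Youden's J = sensitivity + specificity - 1 = TPR - FPR; the maximizing
# threshold converts probabilities into predicted accept/decline choices.
youden_j = tpr - fpr
best_threshold = thresholds[np.argmax(youden_j)]
forecast = (predicted_prob >= best_threshold).astype(int)
accuracy = (forecast == actual).mean()
```

The reported 83% correct predictions and AUC of 0.8237 are accuracy and area-under-curve statistics of exactly this form.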
In summary, our findings showed that DCE as a method of obtaining stated preferences can yield similar results to revealed preferences at the individual level in this setting. However future investigations are required to establish the predictive power of DCEs in different settings.
Methods: We used data from the 2003 Medical Expenditure Panel Survey to estimate two types of finite mixture models: a mixture of two normal distributions and a custom mixture of a degenerate distribution with mass at 1 and two censored normals (Tobit). We used the SF-12v2 mental and physical summary scores as predictors of both the mean components and the mixture probabilities. We evaluated predictions using mean absolute error (MAE) and root mean square error (RMSE). We compared two types of predictions from mixture models: predictions weighted by the estimated mixture probabilities (mean-based predictions) and predictions based on classification by the highest estimated mixture probability (classification-based predictions). We also estimated models in a subsample of individuals with heart disease and stroke because these individuals have a wider range of EQ-5D-3L scores. In addition, we compared mixture model predictions to those of OLS regression.
Results: Predictions from finite mixture models based on classification outperformed mean-based predictions. Mean-based predictions, which are based on an average of the predicted mean components weighted by the estimated mixture probabilities, are in fact similar to predictions from simple OLS models. Classification-based predictions perform better in samples with a wider range of EQ-5D-3L scores.
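The two prediction rules differ only in how the estimated component means and mixture probabilities are combined. A minimal sketch with made-up posterior probabilities for a two-component mixture:

```python
import numpy as np

# Made-up two-component mixture output for five respondents:
# component means (e.g. an "impaired" mode and a "full health" mode)
# and estimated posterior mixture probabilities (illustrative values).
component_means = np.array([0.45, 0.95])
posterior = np.array([
    [0.90, 0.10],
    [0.70, 0.30],
    [0.50, 0.50],
    [0.20, 0.80],
    [0.05, 0.95],
])

# Mean-based prediction: probability-weighted average of component means.
mean_based = posterior @ component_means

# Classification-based prediction: the mean of the most probable component.
classification_based = component_means[posterior.argmax(axis=1)]
```

Mean-based predictions are pulled toward the middle of the distribution, much like OLS fitted values, whereas classification-based predictions return the component means themselves and so preserve the multimodal shape of the utility data.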
Conclusions: Predicting health utility from generic or disease-specific non-preference instruments presents a challenge because of the unusual distribution of health utility, which is bounded on the left and the right, has multiple modes, and has a non-negligible proportion of observations clustered at a single value. Finite mixture models have emerged as promising alternatives to traditional models because mixture models can account for these characteristics of preference-based data. While several studies have explored what forms of mixtures could better capture the distributional characteristics of health utility data, the literature offers little guidance on which type of prediction method is more appropriate. In this study, we showed that only classification-based predictions retain the advantages of mixture models. Weighted-average predictions do not preserve the characteristics of health utility data and are similar to OLS predictions. However, classification-based predictions may result in large errors due to misclassification.