THE PSYCHOMETRIC BEAUTY OF EXPERT MODELS: A PSYCHOMETRIC EVALUATION OF LINEAR VS. NON-LINEAR MEDICAL EXPERT MODELS IN LENS-MODEL TASKS

Monday, October 25, 2010
Vide Lobby (Sheraton Centre Toronto Hotel)
Esther Kaufmann, PhD, University of Teacher Education Central Switzerland, Zug, Zug, Switzerland

Background: Since Meehl’s (1954) outstanding review of clinical vs. actuarial prediction (i.e. expert models), it has been generally accepted, and not only in medicine, that models help to prevent diagnostic errors. More recently, discussion has turned to which type of model is the most valid. Until now, however, no evaluation has compared linear and non-linear models across medical judgment tasks. The lens-model framework is uniquely suited to this comparison: linear and non-linear models can be built from lens-model components (Tucker, 1964) and their success compared within the same tasks. To what degree do expert models actually improve physicians’ judgment achievement, and which type of model is the most valid?
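
For orientation, the lens-model equation usually attributed to Tucker (1964) splits judgment achievement into a linear and a non-linear part. The notation below is the conventional one and is assumed here, since the abstract does not spell it out; how the evaluated expert models map onto these components is likewise not stated in the abstract.

```latex
\[
  r_a \;=\; \underbrace{G \, R_e \, R_s}_{\text{linear component}}
  \;+\; \underbrace{C \,\sqrt{1 - R_e^{2}}\,\sqrt{1 - R_s^{2}}}_{\text{non-linear component}}
\]
% r_a : judgment achievement (correlation between judgment and criterion)
% G   : correlation between the predictions of the linear models of judge and task
% R_e : multiple correlation of the cues with the criterion (task predictability)
% R_s : multiple correlation of the cues with the judgments (judge consistency)
% C   : correlation between the residuals of the two linear models
```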

Methods: To evaluate the success of linear and non-linear models in medical lens-model tasks, we applied a psychometric meta-analysis approach following Hunter and Schmidt (2004). Only this approach makes it possible to correct the database for up to 11 possible artefacts. To our knowledge, this is the first psychometric application that evaluates the success of linear and non-linear expert models and compares them within medical lens-model tasks. To check for an ecological fallacy (Robinson, 1950), we first focus on single medical experts.
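
To make the idea of artefact correction concrete, the following is a minimal sketch of one such correction, the correction for measurement unreliability, combined with a weighted mean across studies. It is not the authors' analysis: the function names, the choice of artefact, the weighting scheme (sample size times squared attenuation factor, as commonly described for individually corrected psychometric meta-analysis) and all input values are illustrative assumptions.

```python
from math import sqrt


def correct_for_attenuation(r, rel_x, rel_y):
    """Disattenuate an observed correlation r for unreliability.

    rel_x and rel_y are the (assumed known) reliabilities of the two
    measures; a value of 1.0 means no measurement error.
    """
    if not (0 < rel_x <= 1 and 0 < rel_y <= 1):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r / sqrt(rel_x * rel_y)


def weighted_mean_corrected_r(studies):
    """Weighted mean of individually corrected correlations.

    studies: iterable of (r, n, rel_x, rel_y) tuples (hypothetical data).
    Each corrected correlation is weighted by n * a**2, where
    a = sqrt(rel_x * rel_y) is the attenuation factor, so that heavily
    corrected (less precise) values count for less.
    """
    numerator = 0.0
    denominator = 0.0
    for r, n, rel_x, rel_y in studies:
        a = sqrt(rel_x * rel_y)
        weight = n * a ** 2
        numerator += weight * correct_for_attenuation(r, rel_x, rel_y)
        denominator += weight
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical values, chosen only to show the direction of the effect.
    example = [(0.30, 100, 0.81, 1.0), (0.50, 100, 1.0, 1.0)]
    print(round(weighted_mean_corrected_r(example), 3))
```

Because the attenuation factor is at most 1, a corrected correlation is never smaller in magnitude than the observed one, which is the direction of change the psychometric correction is expected to produce.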

Results: Our scatter plots reveal that, of the 95 single-judgment achievements evaluated against models, a linear model was more successful in 47 and a non-linear model in seven. In the remaining 41 analysed judgment achievements, however, the model was unsuccessful. This clearly shows the success of linear over non-linear models and raises the question of whether the underlying tasks are responsible for the differences. Across the 10 medical tasks, the linear expert model improved physicians’ judgment achievement in eight; in the other two, the evaluated expert models were not successful. Finally, to give an impression of the power of linear expert models, we used psychometrically corrected components: with such correction, the overall success increases from .01 to .35 for linear expert models and from -.27 to -.17 for non-linear expert models.

Conclusion: Artefact-corrected linear expert models should be applied to support physicians’ daily judgments at work and to reduce errors. The same artefact-correcting approach is also needed in the evaluation of expert models, in order to find the most valid model for every medical task and reduce diagnostic errors.