PS4-54
A DATA-GENERATED MARGIN-OF-ERROR TO DECIDE WHEN TWO MEASUREMENTS AGREE
Method: The MoE, estimated from an analysis of variance, was applied to 125 pairs of Sharp score ratings of x-ray images from a study comparing rheumatoid arthritis therapies. The Sharp score counts erosions and joint-space narrowing in the joints of the hands and feet. Using a square-root transform, we extended the MoE to deal with outliers. We computed kappa statistics (K), defining agreement as a paired difference <= MoE.
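As one possible reading of this procedure, the sketch below (Python, not the authors' implementation) estimates an ANOVA-based MoE for paired readings and a square-root-transformed "extended" MoE. The exact formula, the 1.96 multiplier, and the back-transform are assumptions, since the abstract does not specify them.

```python
import numpy as np


def anova_moe(reader1, reader2):
    """MoE from the within-image (error) variance of a one-way ANOVA with
    image as the grouping factor and the two readers as replicates."""
    d = np.asarray(reader1, dtype=float) - np.asarray(reader2, dtype=float)
    # With two readings per image, the within-image mean square of the
    # one-way ANOVA reduces to mean(d**2) / 2.
    within_ms = np.mean(d ** 2) / 2.0
    # A difference of two readings has variance 2 * within_ms, so take
    # 1.96 * sqrt(2 * within_ms) as the margin of error (assumption).
    return 1.96 * np.sqrt(2.0 * within_ms)


def extended_moe(reader1, reader2):
    """'Extended' MoE: the same calculation on square-root transformed
    scores, mapped back to the raw scale at the mean transformed score.
    The back-transform is an assumption; the abstract only states that a
    square-root transform was used to damp outliers."""
    r1 = np.sqrt(np.asarray(reader1, dtype=float))
    r2 = np.sqrt(np.asarray(reader2, dtype=float))
    moe_sqrt = anova_moe(r1, r2)
    mean_sqrt = np.mean(np.concatenate([r1, r2]))
    # Delta-method style mapping: d(raw) ~ 2 * sqrt(raw) * d(sqrt(raw)).
    return 2.0 * mean_sqrt * moe_sqrt
```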
Result:
Only 10% of the 125 paired ratings agreed exactly (K = 0.02). Defining agreement as Sharp score differences <= 3 points, 57% of the pairs agreed (K = 0.38). The MoE was 7.8. With agreement defined as differences <= 7 points, 74% agreed (K = 0.63). Using the MoE to select discordant pairs, we found systematic differences between x-ray readers and refined consensus training accordingly. The table below shows that the extended MoE damped outliers and gave more reasonable results than the unextended MoE.
X-ray reader differences, D, for wrist, hand, erosions, narrowing, and total Sharp score.
Grouping  | Mean D | SD(D) | Extended MoE | MoE
Wrist     | 0.6    | 2.6   | 1.4          | 3.5
Hand      | 1.0    | 3.6   | 3.2          | 5.2
Erosions  | 0.8    | 3.5   | 2.9          | 4.6
Narrowing | 1.0    | 4.5   | 4.0          | 6.2
Total     | 1.8    | 5.2   | 7.3          | 7.8
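For illustration, a minimal sketch of a tolerance-band kappa consistent with the definitions above (agreement = paired difference <= a threshold such as 0, 3 points, or the MoE), plus a helper for flagging discordant pairs for consensus review. The chance-agreement construction (crossing every reading of one reader with every reading of the other) is an assumption, not taken from the abstract.

```python
import numpy as np


def tolerance_kappa(reader1, reader2, tol):
    """Chance-corrected agreement when 'agreement' means |difference| <= tol."""
    r1 = np.asarray(reader1, dtype=float)
    r2 = np.asarray(reader2, dtype=float)
    # Observed agreement: share of actual pairs within the tolerance.
    observed = np.mean(np.abs(r1 - r2) <= tol)
    # Chance agreement (assumption): probability that independent draws from
    # the two readers' marginal distributions fall within the tolerance.
    expected = np.mean(np.abs(r1[:, None] - r2[None, :]) <= tol)
    return (observed - expected) / (1.0 - expected)


def discordant_pairs(reader1, reader2, moe):
    """Indices of pairs whose difference exceeds the MoE (candidates for
    reader review and consensus training)."""
    d = np.abs(np.asarray(reader1, dtype=float) - np.asarray(reader2, dtype=float))
    return np.flatnonzero(d > moe)


# Hypothetical usage with paired score arrays scores_a and scores_b:
# k_exact = tolerance_kappa(scores_a, scores_b, tol=0)
# k_moe   = tolerance_kappa(scores_a, scores_b, tol=anova_moe(scores_a, scores_b))
```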
Conclusion: The data-generated MoE identifies a threshold for agreement, identifies discordant pairs, and rescales the data to produce a more easily interpreted kappa statistic. The MoE applies to nearly any unstable or imprecise paired measurement.