I-6 DISCRIMINATION AND CALIBRATION OF ARTIFICIAL NEURAL NETWORKS IN MAMMOGRAPHIC DIAGNOSIS

Tuesday, October 20, 2009: 5:15 PM
Grand Ballroom, Salon 6 (Renaissance Hollywood Hotel)
Turgay Ayer, MS1, Oguzhan Alagoz, PhD2, Jagpreet Chhatwal, PhD3, Jude W. Shavlik1, Charles E. Kahn, Jr., MD, MS4 and Elizabeth S. Burnside, MD, MPH, MS2, (1)University of Wisconsin, Madison, WI, (2)University of Wisconsin-Madison, Madison, WI, (3)Merck Research Laboratories, North Wales, PA, (4)Medical College of Wisconsin, Milwaukee, WI

Purpose: In this study, we develop an artificial neural network (ANN) to estimate the risk of breast cancer based on mammographic findings and demographic risk factors, and assess how well our ANN can (1) discriminate between benign and malignant mammographic findings, and (2) generate well-calibrated probabilities that estimate the risk of breast cancer for individual findings.

Method: Our dataset consisted of 62,219 consecutive, prospectively collected mammography findings matched with the Wisconsin State Cancer Reporting System. We built a three-layer ANN with a deliberately large number of hidden nodes (1,000) because large networks have been shown to perform better when presented with unseen cases. We trained and tested our ANN using ten-fold cross-validation and held out a validation set to prevent overfitting. We compared the performance of our ANN with that of the interpreting radiologists, using area under the receiver operating characteristic curve (AUC), sensitivity, and specificity to evaluate discriminative performance. We assessed the accuracy of risk prediction (i.e., calibration) of our ANN using the Hosmer–Lemeshow (H-L) goodness-of-fit test.
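The two evaluation measures named above can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `auc_mann_whitney` and `hosmer_lemeshow` are hypothetical, and the use of ten equal-sized risk groups is an assumption (it is the common choice and is consistent with the reported df = 8, since the H-L test uses groups minus 2 degrees of freedom). The pairwise AUC loop is quadratic and is meant only for small illustrative inputs.

```python
def auc_mann_whitney(scores, labels):
    """AUC as the fraction of (malignant, benign) pairs in which the
    malignant finding receives the higher score; ties count as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def hosmer_lemeshow(probs, labels, groups=10):
    """Hosmer-Lemeshow statistic over equal-sized groups of sorted risk.

    Returns (statistic, degrees_of_freedom). The choice of `groups`
    is an assumption of this sketch, not taken from the abstract.
    """
    pairs = sorted(zip(probs, labels))
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        expected = sum(p for p, _ in chunk)   # predicted number of cancers
        observed = sum(y for _, y in chunk)   # actual number of cancers
        mean_p = expected / size
        denom = size * mean_p * (1.0 - mean_p)
        if denom > 0:
            stat += (observed - expected) ** 2 / denom
    return stat, groups - 2


# Toy data, invented purely for illustration.
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_labels = [0, 0, 1, 1]
print(auc_mann_whitney(toy_scores, toy_labels))
```

A small H-L statistic means the observed and predicted numbers of malignancies agree closely within each risk group, which is what "well calibrated" refers to here.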

Result: Our ANN achieved an AUC of 0.965 ± 0.001, which was significantly higher (P < .001) than that of the radiologists (AUC = 0.939 ± 0.011). Our ANN was also well calibrated, as shown by a small H-L statistic (12.46) with a nonsignificant P-value (P = 0.13, df = 8), which indicates no significant lack of fit.
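As a quick arithmetic check, the reported P-value can be recovered from the H-L statistic and its degrees of freedom using the chi-square distribution. The sketch below uses the closed-form upper-tail probability that holds when the degrees of freedom are even; the helper name `chi2_sf_even_df` is hypothetical and the code is only an illustration, not part of the study.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with
    even degrees of freedom, using the closed form
    exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    term = 1.0    # (x/2)^0 / 0!
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i   # builds (x/2)^i / i! incrementally
        total += term
    return math.exp(-half) * total


p_value = chi2_sf_even_df(12.46, 8)
print(round(p_value, 2))  # 0.13
```

Because the H-L test's null hypothesis is that the model fits, a P-value above the usual 0.05 threshold is the favorable outcome.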

Conclusion: Our ANN can effectively discriminate malignant abnormalities from benign ones and produce well-calibrated risk estimates for individual abnormalities. Our findings suggest that ANNs may have the potential to help radiologists improve mammography interpretation.

Candidate for the Lee B. Lusted Student Prize Competition