Purpose: In this study, we develop an artificial neural network (ANN) to estimate the risk of breast cancer based on mammographic findings and demographic risk factors, and assess how well our ANN can (1) discriminate between benign and malignant mammographic findings, and (2) generate well-calibrated probabilities that estimate the risk of breast cancer for individual findings.
Method: Our dataset consisted of 62,219 prospectively collected consecutive mammography findings matched with the Wisconsin State Cancer Reporting System. We built a three-layer ANN with a deliberately large number of hidden nodes (1000) because large networks have been shown to generalize better to unseen cases. We trained and tested our ANN using ten-fold cross validation and held out a validation set to prevent overfitting. We compared the performance of our ANN to that of the interpreting radiologists. We used the area under the receiver operating characteristic curve (AUC), sensitivity, and specificity to evaluate the discriminative performance of our ANN and of the interpreting radiologists. We assessed the accuracy of risk prediction (i.e., calibration) of our ANN using the Hosmer–Lemeshow (H-L) goodness-of-fit test.
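The two evaluation measures named above can be illustrated with a minimal sketch. It is not the study's code: the function names `auc` and `hosmer_lemeshow` are hypothetical, the AUC is computed by the rank (Mann-Whitney) formulation, and the use of ten equal-sized risk groups is an assumption chosen to be consistent with the reported df = 8.

```python
def auc(scores, labels):
    """AUC as the probability that a malignant case (label 1) receives a
    higher score than a benign case (label 0); ties count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def hosmer_lemeshow(probs, labels, groups=10):
    """H-L statistic over `groups` bins of cases sorted by predicted risk.
    Returns (statistic, degrees of freedom), with df = groups - 2."""
    # Sort by predicted probability only (assumed grouping scheme).
    pairs = sorted(zip(probs, labels), key=lambda t: t[0])
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        expected = sum(p for p, _ in chunk)   # expected malignancies in bin
        observed = sum(y for _, y in chunk)   # observed malignancies in bin
        # Skip degenerate bins where the variance term would be zero.
        if 0 < expected < size:
            stat += (observed - expected) ** 2 / (expected * (1 - expected / size))
    return stat, groups - 2
```

A small H-L statistic relative to its degrees of freedom indicates that observed and predicted malignancy counts agree within each risk group.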
Result: Our ANN achieved an AUC of 0.965 ± 0.001, which was significantly higher (P < .001) than that of the radiologists (AUC = 0.939 ± 0.011). Our ANN was also well calibrated, as shown by a small H-L statistic (12.46) and a nonsignificant P-value (P = 0.13, df = 8), indicating no significant lack of fit.
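As a consistency check on the reported calibration result, the P-value can be recovered from the H-L statistic and its degrees of freedom. The sketch below is an illustration only; the helper `chi2_sf_even_df` is a hypothetical name, and it uses the closed-form chi-square upper tail that holds for even degrees of freedom rather than a statistics library.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable.
    Uses the closed form exp(-x/2) * sum_{i<df/2} (x/2)^i / i!,
    which is valid only when df is a positive even integer."""
    if df <= 0 or df % 2:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i      # builds (x/2)^i / i! incrementally
        total += term
    return math.exp(-half) * total


# Reported values: H-L statistic 12.46 with df = 8.
p_value = chi2_sf_even_df(12.46, 8)
print(round(p_value, 2))  # 0.13
```

Because the H-L test's null hypothesis is that the model fits, a P-value above the conventional 0.05 threshold means the test found no evidence of miscalibration.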
Conclusion: Our ANN can effectively discriminate malignant abnormalities from benign ones and produce well-calibrated risk estimates for individual abnormalities. Our findings suggest that ANNs may have the potential to help radiologists improve mammography interpretation.
Candidate for the Lee B. Lusted Student Prize Competition