
A Deep Learning Breast Cancer Risk Model for Precise Supplemental Screening

Leslie R. Lamb et al.

May 4, 2026

"Key Points

Question  Can a deep learning (DL) model applied to screening mammograms more accurately identify patients at risk for future breast cancer and false-negative screening results than breast density?

Findings  In a multisite cohort study of 123 091 consecutive screening mammograms in 67 019 patients, the DL model showed greater accuracy than breast density in estimating future breast cancer. False-negative rates increased across DL risk groups and were highest in high-risk patients.

Meaning  Findings of this study suggest that DL risk models could offer a more precise and equitable alternative to breast density as a policy criterion for determining access to supplemental breast imaging.

Abstract

Importance  As of September 2024, federal legislation mandates that patients be informed of their breast density, a modest breast cancer risk factor and known cancer-masking agent. This binary metric (dense vs nondense) classifies 40% to 50% of women as dense and is assessed subjectively, with interreader variability, limiting its utility for guiding supplemental imaging.

Objective  To compare the performance of a deep learning (DL) breast cancer risk model vs radiologist-assessed breast density in estimating future breast cancer and false-negative (FN) screening results.

Design, Setting, and Participants  This retrospective cohort study included consecutive bilateral screening mammograms from women 30 years or older performed from January 1, 2009, to December 31, 2018, across 5 sites of a large academic health system, with follow-up through December 31, 2023, to allow ascertainment of 5-year breast cancer outcomes.

Exposures  A DL risk model applied to standard screening mammograms and radiologist-assessed breast density categorized using the American College of Radiology Breast Imaging Reporting and Data System (BI-RADS) Atlas.

Main Outcomes and Measures  Primary outcomes were breast cancer diagnoses within 5 years of mammography and FN screening results, defined as BI-RADS 1 or 2 examinations followed by a cancer diagnosis within 1 year. DL risk scores were stratified as low (<1.7%), intermediate (1.7%-3.0%), or high (>3.0%). Cancer and FN rates were compared across DL risk groups and breast density categories. Discriminatory performance was assessed using the area under the receiver operating characteristic curve (AUROC) and compared using the DeLong test.
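The risk-group thresholds and the FN definition above can be read as two simple rules. The following is a minimal sketch, not the study's code: the function names are hypothetical, the intermediate band is assumed to include both endpoints (1.7% and 3.0%), and "within 1 year" is assumed to mean 365 days or fewer after the examination.

```python
from typing import Optional


def risk_group(score_pct: float) -> str:
    """Map a DL risk score (in percent) to a risk group.

    Thresholds follow the abstract: low (<1.7%), intermediate (1.7%-3.0%),
    high (>3.0%). Treating both endpoints as intermediate is an assumption.
    """
    if score_pct < 1.7:
        return "low"
    if score_pct <= 3.0:
        return "intermediate"
    return "high"


def is_false_negative(birads: int, days_to_cancer: Optional[int]) -> bool:
    """Flag a BI-RADS 1 or 2 examination followed by cancer within 1 year.

    days_to_cancer is None when no cancer was diagnosed during follow-up.
    The 365-day window is an assumed reading of "within 1 year".
    """
    if birads not in (1, 2):
        return False
    if days_to_cancer is None:
        return False
    return 0 <= days_to_cancer <= 365
```

Under these assumptions, a score of exactly 1.7% falls in the intermediate group, and a BI-RADS 2 examination followed by a diagnosis 400 days later is not counted as an FN result.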

Results  Among 123 091 mammograms in 67 019 women (median [IQR] age, 58.0 [50.0-67.0] years), 50 974 (41.4%) were classified as dense. The DL model demonstrated significantly higher discriminatory accuracy than breast density in predicting future cancer (AUROC, 0.71 [95% CI, 0.70-0.72] vs 0.53 [95% CI, 0.52-0.54]; P < .001). FN rates increased across DL risk groups (2.1 per 1000 examinations in high-risk vs 1.0 and 0.6 in intermediate and low-risk groups, respectively). Women with dense breasts had higher FN rates than those with nondense breasts (1.7 vs 0.6 per 1000 examinations; P < .001). Adding breast density to the DL model did not improve performance.
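For readers less familiar with the two summary measures reported here, the sketch below shows how an AUROC and a rate per 1000 examinations can be computed. It is illustrative only: the function names are hypothetical, the AUROC uses the pairwise (Mann-Whitney) formulation with ties counted as one-half, and it does not implement the DeLong test or the confidence intervals used in the study.

```python
from typing import Sequence


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUROC as the probability that a randomly chosen case with cancer
    (label 1) scores higher than a randomly chosen case without (label 0).
    Ties count as one-half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def rate_per_1000(events: int, examinations: int) -> float:
    """Event rate expressed per 1000 examinations."""
    if examinations <= 0:
        raise ValueError("examinations must be positive")
    return 1000.0 * events / examinations
```

A binary predictor such as dense vs nondense produces many tied pairs, each contributing one-half, which pulls its AUROC toward 0.5 unless the two categories differ substantially in cancer rates.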

Conclusions and Relevance  In this cohort study of screening mammography, a DL risk model outperformed breast density in estimating risk of future breast cancer and stratified FN screening results across risk groups. These findings support transitioning from density-based policy triggers toward more precise image-derived risk models to guide access to supplemental imaging."
