New Paper: Can AI Predict Breast Cancer Risk 10 Years in Advance?
A variety of thresholds and time frames have been suggested for estimating risk of developing breast cancer as a guide to interventions including supplemental screening and risk-reducing medications:
- A lifetime risk threshold of ≥20% was recommended by the American Cancer Society in 2007 to guide decisions for supplemental MRI.
- A five-year Gail risk of ≥1.7% has been used to guide use of tamoxifen or other risk-reducing medications and was recently proposed by NCCN as a threshold to “consider” MRI; however, see DBI’s prior discussion of why that suggestion is misguided, as nearly all women over age 58 would qualify.
- Tabár previously showed that it takes 9-10 years to see any reduction in breast cancer deaths due to screening, and a similar horizon (10-year life expectancy) is also suggested for making decisions about continuing to screen for a variety of cancers (including breast). As such, a ten-year risk estimate could be the “Goldilocks” time frame for decisions about supplemental screening.
In a recent paper, Eriksson et al described the development and validation of an AI-based breast cancer risk model that uses only mammograms and a 10-year time frame. The study included a total of 17,913 individuals, ranging in age from 31 to 94 years, across three primary datasets used to develop and validate the AI risk model, with a median follow-up of 10 years.1
Key Findings
- Discriminatory Performance: The model outperformed Tyrer-Cuzick-v8 and BCSC-v3, as well as the 5-year image-based Mirai tool.2 The AI model achieved a 10-year area under the curve (AUC) of 0.70 to 0.73 across the validation cohorts.
- High-Risk Identification: The AI model identified a higher percentage of future cancers than the other models.3 The top 10% of individuals ranked as highest risk by the AI tool (i.e. 8.5% 10-year absolute risk) accounted for 30-33% of future cancers.
- Diversity and Consistency: Performance remained consistent across various age groups, tumor characteristics (ER status), and between white and Black participants, an important finding given longstanding concerns about bias in AI tools.4
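For readers curious how a figure such as "the top 10% accounted for 30-33% of future cancers" is calculated, the following is a minimal Python sketch. It is not the authors' code: the function name `capture_rate`, the toy risk scores, and the toy outcomes are all hypothetical and chosen only to illustrate the arithmetic.

```python
def capture_rate(scores, outcomes, top_fraction=0.10):
    """Share of all future cancers found among the highest-scoring individuals.

    scores   : predicted risk per person (any scale; only the ranking matters)
    outcomes : 1 if the person later developed cancer, else 0
    """
    if len(scores) != len(outcomes):
        raise ValueError("scores and outcomes must have the same length")
    total_cases = sum(outcomes)
    if total_cases == 0:
        raise ValueError("outcomes contains no cancer cases")

    # Rank individuals from highest to lowest predicted risk.
    ranked = sorted(zip(scores, outcomes), key=lambda pair: pair[0], reverse=True)

    # Size of the high-risk group, rounded to the nearest whole person (at least 1).
    n_top = max(1, round(len(ranked) * top_fraction))

    cases_in_top = sum(outcome for _, outcome in ranked[:n_top])
    return cases_in_top / total_cases


# Hypothetical toy data: 20 people, 3 of whom later develop cancer.
toy_scores = [0.02, 0.03, 0.01, 0.12, 0.04, 0.02, 0.05, 0.03, 0.09, 0.01,
              0.02, 0.06, 0.03, 0.02, 0.04, 0.01, 0.03, 0.02, 0.05, 0.02]
toy_outcomes = [0, 0, 0, 1, 0, 0, 1, 0, 0, 0,
                0, 0, 0, 0, 0, 0, 0, 1, 0, 0]

rate = capture_rate(toy_scores, toy_outcomes, top_fraction=0.10)
# Ratio to the 10% of cancers that a randomly chosen 10% would be expected to contain.
enrichment = rate / 0.10
print(f"Captured in top 10%: {rate:.0%}, enrichment vs. random: {enrichment:.1f}x")
```

As a point of comparison, a randomly chosen 10% of women would be expected to include about 10% of future cancers, so a capture rate of 30-33% corresponds to roughly a 3-fold concentration of cases in the high-risk group.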
In this data set, it is not clear how many of the women who developed cancer had their cancer detected on screening mammography vs. due to symptoms, or at early vs. later stage. As such, it is uncertain at this time how best to use this information to guide supplemental screening recommendations.
1 Model development: 7,603 women from the Swedish KARMA screening cohort were used to train the AI model. Validation cohorts: 8,696 women underwent 10-year validation, including 2,959 women in the KARMA internal validation cohort and 5,737 women in the Olmsted County (Minnesota, USA) external validation cohort. Additional external validation: the EMBED hospital-based cohort (Atlanta, USA) included 1,614 women (269 breast cancer cases and 1,345 matched controls) with shorter-term follow-up.
2 The AI model achieved 10-year AUCs of 0.70–0.73 across validation cohorts. Invasive breast cancer AUC was 0.72 in both KARMA and Olmsted; overall breast cancer AUC was 0.73 in KARMA and 0.70 in Olmsted. Performance remained consistent across age, ER status, and race with time-dependent 10-year AUCs of ≥0.7 across cohorts. In KARMA, invasive cancer AUC was 0.72 (95% CI: 0.68–0.76) versus 0.64 (0.60–0.69) for Tyrer-Cuzick-v8 and 0.66 (0.61–0.70) for BCSC-v3 (both P<0.01). Mirai AUC was 0.66 (0.63–0.70) in KARMA and 0.64 (0.62–0.67) in Olmsted; AI performance was significantly higher across KARMA, Olmsted, and EMBED cohorts (P<0.01 in KARMA/Olmsted; P<0.05 in EMBED).
3 In the top 10% highest-risk group, the AI model identified 33% of future breast cancers in KARMA and 30% in Olmsted versus 23% for Tyrer-Cuzick-v8, 24%/23% for Mirai (KARMA/Olmsted), and 20% for BCSC-v3.
4 Age-stratified 10-year AUCs were 0.67 (40–49), 0.71 (50–69), and 0.75 (70–74) in KARMA, and 0.68 (35–49), 0.72 (50–69), and 0.66 (70–94) in Olmsted. ER-positive AUC was 0.73 in both cohorts; ER-negative AUC was 0.70 in both cohorts. For in situ cancers, AUC was 0.76 in KARMA and 0.66 in Olmsted. Race-stratified AUCs in Olmsted were 0.70 for white and 0.67 for non-white participants; in the diverse EMBED cohort (42–47% Black participants), AUC was 0.7 for both white and non-white participants.

