Validation of Rosner-Colditz breast cancer incidence model using an independent data set, the California Teachers Study

B. A. Rosner, G. A. Colditz, S. E. Hankinson, J. Sullivan-Halley, J. V. Lacey, L. Bernstein

Research output: Contribution to journal › Article › peer-review

39 Scopus citations


We sought to validate an established breast cancer incidence model in an independent prospective data set.

After aligning time periods for follow-up, we restricted populations to comparable age ranges (47-74 years) and followed them for incident invasive breast cancer (follow-up 1994-2008, Nurses' Health Study [NHS]; and 1995-2009, California Teachers Study [CTS]). We identified 2,026 cases during 540,617 person years of follow-up in NHS, and 1,400 cases during 288,111 person years in CTS. We fit the Rosner-Colditz log-incidence model and the Gail model using baseline data. We imputed future use of hormones based on type and prior duration of use and other covariates. We assessed performance using area under the curve (AUC) and calibration methods.

Participants in the CTS had fewer children, were leaner, consumed more alcohol, and were more frequent users of postmenopausal hormones. The incidence rate ratio showed significantly higher breast cancer incidence in the CTS (IRR = 1.32, 95 % CI 1.24-1.42). Parameters for the log-incidence model were comparable across the two cohorts. Overall, the NHS model performed equally well when applied in the CTS. In the NHS the AUC was 0.60 (s.e. 0.006); applying the NHS betas to the CTS, the AUC in the independent (validation) data set was 0.586 (s.e. 0.009). The Gail model gave an AUC of 0.547 (s.e. 0.008), which was significantly lower, by 4 % (p < 0.0001). For women aged 47-69, the AUC values for the log-incidence model were 0.608 in NHS and 0.609 in CTS; for the Gail model they were 0.569 and 0.572, respectively. In both cohorts, performance of both models dropped off in older women (70-87) and later in follow-up (6-12 years). Calibration showed good estimation against SEER, with a non-significant 4 % underestimate of overall breast cancer incidence when applying the model in the CTS population (p = 0.098).

The Rosner-Colditz model performs consistently well when applied in an independent data set. Performance is stronger for predicting incidence among women aged 47-69 and over a 5-year time interval. AUC values exceed those for the Gail model by 3-5 % when both are applied to the independent validation data set. Models may be further improved with the addition of breast density or other markers of risk beyond the current model.
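As a rough check on the case counts and person years quoted above, the sketch below computes a crude (unadjusted) incidence rate ratio for CTS versus NHS with a Wald confidence interval on the log scale. The function name `crude_irr` and the choice of a Wald interval are assumptions made for illustration; the abstract does not say how the reported IRR of 1.32 was estimated, and any adjustment in the published analysis (for example for age) would explain a difference from the crude figure.

```python
import math

def crude_irr(cases_a, py_a, cases_b, py_b, z=1.96):
    """Crude incidence rate ratio (group A vs. group B) with a Wald CI.

    Hypothetical helper: treats case counts as Poisson and builds the
    interval on the log scale. No covariate adjustment is applied.
    """
    irr = (cases_a / py_a) / (cases_b / py_b)
    se_log = math.sqrt(1 / cases_a + 1 / cases_b)  # SE of log(IRR)
    lo = math.exp(math.log(irr) - z * se_log)
    hi = math.exp(math.log(irr) + z * se_log)
    return irr, lo, hi

# Counts taken from the abstract: CTS (1,400 cases / 288,111 person years)
# versus NHS (2,026 cases / 540,617 person years).
irr, lo, hi = crude_irr(1400, 288_111, 2026, 540_617)
print(f"crude IRR = {irr:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

The crude ratio comes out near 1.30 (about 1.21-1.39), close to but slightly below the reported 1.32 (1.24-1.42).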

Original language: English
Pages (from-to): 187-202
Number of pages: 16
Journal: Breast Cancer Research and Treatment
Issue number: 1
State: Published - Nov 2013


  • Breast cancer
  • Calibration
  • Methods
  • Prediction models
  • Validation

