These are the frequently asked questions about the development of the on-line Age Gap Decision Tool.

Where is the data from?

The data is from cancer registration records extracted from the National Cancer Data Repository (NCDR). Our model is based on records of patients aged 70 and over who were diagnosed with a primary operable breast cancer in the West Midlands and the Northern and Yorkshire regions between 2002 and 2010. The full dataset consists of approximately 24,000 patient records.

What data is included?


Patients were included in the model if they were diagnosed with a first instance of primary operable breast cancer and were recorded as having ER+ disease and/or received hormone therapy.


Patients were eligible for the chemotherapy model if they were treated surgically for non-metastatic breast cancer. Since very few women aged over 80 received chemotherapy, the estimated risk of breast cancer mortality is based on data for patients aged 70-79 only. The tool may still be used for patients aged 80 and over, but results should be interpreted with caution.

Exclusion criteria

  • Patients with non-invasive disease (e.g. DCIS) or multifocal tumours were excluded from all analyses.
  • Patients known to have metastatic disease were also excluded.

What is the age of the data?

The data covers patients diagnosed between 2002 and 2010. An earlier version of the model included survival follow-up until March 2013. The current version includes follow-up until January 2017. The tool will be revised again as and when further appropriate data becomes available.

How was the data modelled?

For the surgery model, the effect of treatment on breast cancer mortality is estimated from the registry data. As the data is not randomised, this comparison has been done with careful consideration of the effects of potential confounding variables.

For the chemotherapy model, the treatment effect is based on findings from a large meta-analysis of clinical trials comparing adjuvant chemotherapy with surgery without chemotherapy. This is the best available estimate of the effectiveness of chemotherapy, and is consistent with the outcomes observed in our data. To calculate the risk of mortality for an individual patient, the treatment effect is combined with an estimate of personalised risk based on the outcomes observed in the registry data.
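
As an illustration of how a trial-based treatment effect can be combined with a personalised baseline risk, the sketch below applies a hazard ratio to a baseline risk under a proportional hazards assumption. This is a minimal sketch and not necessarily the calculation used in the tool: the function name, the proportional hazards assumption and the input values are all hypothetical and chosen only for illustration.

```python
def risk_with_treatment(baseline_risk: float, hazard_ratio: float) -> float:
    """Risk of death by a fixed time point after applying a hazard ratio.

    ASSUMPTION: proportional hazards, so that
    S_treated(t) = S_baseline(t) ** hazard_ratio.
    The tool's actual method may differ.
    """
    if not 0.0 <= baseline_risk <= 1.0:
        raise ValueError("baseline_risk must be between 0 and 1")
    if hazard_ratio <= 0.0:
        raise ValueError("hazard_ratio must be positive")
    baseline_survival = 1.0 - baseline_risk
    return 1.0 - baseline_survival ** hazard_ratio


# Hypothetical inputs for illustration only (not values from the tool).
baseline = 0.20  # personalised risk of breast cancer death without chemotherapy
hr = 0.80        # treatment effect expressed as a hazard ratio

treated = risk_with_treatment(baseline, hr)
absolute_benefit = baseline - treated

print(f"Risk without chemotherapy: {baseline:.1%}")
print(f"Risk with chemotherapy:    {treated:.1%}")
print(f"Absolute risk reduction:   {absolute_benefit:.1%}")
```

Under this assumption the same relative treatment effect produces a larger absolute benefit for a patient with a higher baseline risk, which is why the personalised risk estimate matters.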

What are the limitations of the data?

The data used to produce the personalised outputs was collected from clinical practice and not from a prospective study. This means that our outputs reflect what has happened to patients in a "real-life" clinical setting, as opposed to the somewhat artificial setting of a clinical trial. However, this means that there is less control over the data, and inevitably there are issues such as missing data and some items which would have enhanced the outputs are not available. We have addressed these issues to the best of our ability using appropriate statistical techniques.

Where does the Activities of Daily Living (ADL) data come from?

The effect of different ADL scores on outcomes is based on an analysis of US survey data published in 2012, adjusted for age, sex and comorbidity. This data has been used because this information is not collected by UK cancer registries. We hope that as a result of the ongoing Bridging the Age Gap study, we will be able to replace these estimates with UK-specific values in the future.

Why are these the only comorbidities included?

The comorbidities included in the algorithm are those that constitute the widely used and well validated Charlson Comorbidity Index. The effects of comorbidity on survival are estimated using this index. We hope that as a result of the ongoing Bridging the Age Gap study, we will be able to enhance the algorithm with more tailored estimates of the effects of comorbidities.
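
To show how an index of this kind works, the sketch below sums condition weights into a single score. It is an illustrative sketch only: the weights shown are a subset of those from the original 1987 Charlson index, while the condition labels and the function name are hypothetical. The exact version of the index and the way the score feeds into the tool's survival estimates are not described here and may differ.

```python
# Subset of conditions with weights from the original (1987) Charlson index.
# The labels are hypothetical identifiers chosen for this sketch.
CHARLSON_WEIGHTS = {
    "myocardial_infarction": 1,
    "congestive_heart_failure": 1,
    "dementia": 1,
    "chronic_pulmonary_disease": 1,
    "diabetes_uncomplicated": 1,
    "diabetes_with_end_organ_damage": 2,
    "hemiplegia": 2,
    "moderate_or_severe_renal_disease": 2,
    "moderate_or_severe_liver_disease": 3,
    "aids": 6,
}


def charlson_score(conditions):
    """Sum the weights of the recorded conditions (each counted once)."""
    present = set(conditions)
    unknown = present.difference(CHARLSON_WEIGHTS)
    if unknown:
        raise ValueError(f"Unrecognised conditions: {sorted(unknown)}")
    # If both diabetes categories are recorded, count only the more severe one.
    if "diabetes_with_end_organ_damage" in present:
        present.discard("diabetes_uncomplicated")
    return sum(CHARLSON_WEIGHTS[c] for c in present)


# Hypothetical patient with two recorded comorbidities.
score = charlson_score(["chronic_pulmonary_disease", "hemiplegia"])
print(f"Charlson score: {score}")
```

A higher score indicates a greater burden of comorbidity, and a patient with no recorded conditions scores zero.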

Why isn't socioeconomic status included?

Socioeconomic status is included in the underlying model calculations via the income domain of the Indices of Multiple Deprivation. However, according to the NHS Constitution, treatment should not be determined on the basis of deprivation alone. Some of the effects of socioeconomic status on outcomes are captured in the differences in comorbidity and frailty that exist between socioeconomic groups.

Why isn't ethnicity included in the calculation?

As with socioeconomic status, treatment decisions should not be made on the basis of ethnicity unless this is relevant to predicted outcomes. In our dataset only a small percentage of breast cancer patients were from Black and Minority Ethnic populations, so it is not currently possible to assess how ethnicity interacts with treatment choice and survival.

Why isn't performance status included?

Eastern Cooperative Oncology Group (ECOG) performance status is not routinely collected by cancer registries, so it is not possible to include it in our models.

How reliable are the predictions?

Based on statistical validation in women over 70 from the West Midlands and Northern and Yorkshire registry regions, we are confident in the reliability of the predictions, which are based on the best available evidence at this time. There is more uncertainty about predictions for the oldest women (i.e. aged 90 or over), because limited data is available for this group. As more data becomes available, we will be able to carry out additional analyses to assess the accuracy of these predictions in other populations and, if necessary, amend the models.