Breast Cancer

Context

“Subjectivity (Ed: i.e., the prior) is sometimes seen as a deficiency of Bayesian inference. Others regard it as a powerful advantage; it permits us to express our personal experience mathematically and combine it with data in a principled and transparent way. Bayes’s rule informs our reasoning in cases where ordinary intuition fails us or where emotion might lead us astray. We will demonstrate this power in a situation familiar to all of us.

Suppose you take a medical test to see if you have a disease, and it comes back positive. How likely is it that you have the disease? For specificity, let’s say the disease is breast cancer, and the test is a mammogram.”

Pearl, Judea. The Book of Why: The New Science of Cause and Effect (pp. 104-105). Basic Books. Kindle Edition.

Representing the Problem Domain as a Bayesian Network

We implement this example as a causal Bayesian network, which means the arc from Breast Cancer to Mammogram represents a causal relationship: the presence or absence of the disease influences the test result.
The resulting Bayesian network in XBL format is available here:

BoW_BreastCancer.xbl


You can also experiment with this model via our WebSimulator: https://simulator.bayesialab.com/#!simulator/186824514911

Should I worry about a positive test result?

“Suppose a forty-year-old woman gets a mammogram to check for breast cancer, and it comes back positive. The hypothesis, D (for “disease”), is that she has cancer. The evidence, T (for “test”), is the result of the mammogram. How strongly should she believe the hypothesis? Should she have surgery?” (Pearl, p. 105)

Calculating the Cancer Risk with BayesiaLab

We use the probabilities described by Pearl to set the parameters of the causal Bayesian network:
- For a typical forty-year-old woman, the probability of getting breast cancer in the next year is about one in seven hundred, 0.14%. We use that as our prior;
- The sensitivity (true-positive rate) of a mammogram is 73%;
- The specificity (true-negative rate) of a mammogram is 88%, which corresponds to a false-positive rate of 12%.
Notice the Input component Breast Cancer—Your Prior Estimate in the WebSimulator. This allows you to set your own initial belief that a patient has breast cancer.
Upon setting Mammogram=Positive as Hard Evidence, the probability of Breast Cancer=True increases from 0.14% to 0.86%.
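The 0.86% figure can be checked by hand with Bayes's rule. The following is a worked sketch in Pearl's D and T notation. It assumes the prior is entered as exactly 1/700; rounding it to 0.14% first would give roughly 0.85% instead.

```latex
\begin{aligned}
P(D \mid T)
  &= \frac{P(T \mid D)\,P(D)}{P(T \mid D)\,P(D) + P(T \mid \neg D)\,P(\neg D)} \\[4pt]
  &= \frac{0.73 \times \tfrac{1}{700}}{0.73 \times \tfrac{1}{700} + 0.12 \times \tfrac{699}{700}}
   = \frac{0.73}{0.73 + 83.88}
   \approx 0.0086
\end{aligned}
```

Here 0.12 = 1 − 0.88 is the false-positive rate implied by the specificity. The denominator is dominated by the false-positive term (83.88 versus 0.73), which is why the posterior stays below 1%.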

Counterintuitive Results

“The conclusion is startling. I think that most forty-year-old women who have a positive mammogram would be astounded to learn that they still have less than a 1 percent chance of having breast cancer. Figure 3.3 might make the reason easier to understand: the tiny number of true positives (i.e., women with breast cancer) is overwhelmed by the number of false positives.” (Pearl, p. 106)

Should I worry now?

“However, the story would be very different if our patient had a gene that put her at high risk for breast cancer—say, a one-in-twenty chance within the next year.

For a woman in this situation, the chances that the test provides lifesaving information are much higher. That is why the task force continued recommending annual mammograms for high-risk women.

This example shows that P(disease | test) is not the same for everyone; it is context-dependent (Ed: it depends on the prior). If you know that you are at high risk for a disease to begin with, Bayes’s rule allows you to factor that information in. Or if you know that you are immune, you need not even bother with the test!” (Pearl, pp. 107-108)

Recalculating the Risk

To answer this question with BayesiaLab, you can either modify the model by setting the prior of Breast Cancer to 5% (Pearl's one-in-twenty chance) via the Node Editor, or set Probabilistic Evidence via the Monitor.
In the WebSimulator, you would set the Input Breast Cancer—Your Prior Estimate (initial belief) to 5%.
Upon setting Mammogram=Positive, the probability of Breast Cancer=True increases from 5% to 24.25%.
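Outside BayesiaLab, the same two posteriors can be reproduced with a few lines of Python. This is a minimal sketch of Bayes's rule for a binary test, not the BayesiaLab inference engine; the function name `posterior_given_positive` is our own, and the defaults are the sensitivity and specificity quoted from Pearl.

```python
def posterior_given_positive(prior, sensitivity=0.73, specificity=0.88):
    """Return P(disease | positive test) using Bayes's rule.

    prior        -- P(disease) before the test
    sensitivity  -- P(positive | disease), the true-positive rate
    specificity  -- P(negative | no disease), the true-negative rate
    """
    false_positive_rate = 1.0 - specificity
    true_positives = sensitivity * prior
    false_positives = false_positive_rate * (1.0 - prior)
    return true_positives / (true_positives + false_positives)


# Typical forty-year-old woman: prior of one in seven hundred.
typical = posterior_given_positive(1 / 700)

# High-risk patient: prior of one in twenty (5%).
high_risk = posterior_given_positive(1 / 20)

print(f"Typical prior:   {typical:.2%}")    # 0.86%
print(f"High-risk prior: {high_risk:.2%}")  # 24.25%
```

Only the prior changes between the two calls; the test characteristics are identical, yet the posterior differs by a factor of about 28.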

Visualizing the Impact of the Prior

To illustrate the impact of the prior (or prevalence), we added a parent node to Breast Cancer that defines this prior. This is what we call a “hyperparameter.”
The updated network, including the hyperparameter, is available here:

BoW_BreastCancer_Prevalence.xbl

You can now set Mammogram=Positive as Hard Evidence.
With this evidence set, you can use Target Mean Analysis to explore a range of values for the prior, from 0% to 100%: Menu > Analysis > Visual > Target > Target's Posterior > Curves > Total Effects.
You will obtain a plot in which the x-axis represents the prior of Breast Cancer=True, i.e., the hyperparameter.
The y-axis represents the updated probability of Breast Cancer=True given a positive mammogram result.
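The shape of that curve can be approximated without BayesiaLab by sweeping the prior over a grid and applying Bayes's rule at each point. The sketch below is an illustration under stated assumptions, not a reproduction of the Target Mean Analysis: the helper names `posterior_given_positive` and `posterior_curve` and the 1% grid step are our own choices, while the sensitivity (73%) and specificity (88%) are the values given earlier.

```python
def posterior_given_positive(prior, sensitivity=0.73, specificity=0.88):
    """Return P(disease | positive test) using Bayes's rule."""
    true_positives = sensitivity * prior
    false_positives = (1.0 - specificity) * (1.0 - prior)
    return true_positives / (true_positives + false_positives)


def posterior_curve(steps=100):
    """Return (prior, posterior) pairs for priors from 0% to 100%."""
    priors = [i / steps for i in range(steps + 1)]
    return [(p, posterior_given_positive(p)) for p in priors]


curve = posterior_curve()

# Print every tenth point: x is the prior, y is the updated probability.
for prior, posterior in curve[::10]:
    print(f"prior={prior:5.0%}  posterior={posterior:7.2%}")
```

The curve starts at 0, ends at 1, and rises steeply for small priors: a positive mammogram shifts belief the most, in relative terms, when the disease is rare.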