Imagine that a person gets the results of a colon cancer screening test.

There are two possible scenarios - either the result is positive, indicating that they have colon cancer, or the result is negative, indicating they don’t have colon cancer.

At this point, the person may ask themselves, how worried should I be that it was a positive test result? Or, how reassured should I be that it was a negative test result?

Each test has a positive predictive value, or PPV, which is the probability that people with a positive test result truly have the outcome, and a negative predictive value, or NPV, which is the probability that people with a negative test result truly don’t have the outcome.

Let’s take an example to show how it’s possible to measure a test’s predictive value.

Let’s say that we recruit 1000 people - 100 people with colon cancer and 900 people without colon cancer - and then we give them all the same screening test.

That way we can see how many people with positive results actually have colon cancer and how many people with negative results actually don’t have colon cancer.

We can organize the results using a 2 by 2 table, where the true disease status of the individual is on the top of the box, and the results of the screening test are on the side, and each of the cells is labeled a, b, c, or d.

A true positive would be a person who gets a positive test result and has colon cancer.

A true negative would be a person who gets a negative test result and doesn’t have colon cancer.

A false positive would be a person who gets a positive test result even though they don’t have colon cancer.

And a false negative would be a person who gets a negative test result even though they have colon cancer.

To calculate the positive predictive value, we divide the number of true positives by the total number of people who tested positive - so cell a divided by the sum of cell a and b.

A test with a perfect positive predictive value would have zero false positives in cell b, so every person who tests positive would truly have colon cancer.

To calculate negative predictive value, we divide the number of true negatives by the total number of people who tested negative - so cell d divided by the sum of cell c and d.

A test with a perfect negative predictive value would have zero false negatives in cell c, so every person who tests negative would truly not have colon cancer.

But no test is 100% perfect, so let’s say that cell a contains 90 true positives, cell b contains 50 false positives, cell c contains 30 false negatives, and cell d contains 850 true negatives.

In this situation, the positive predictive value would be 64%, because there are 90 people who are true positives - in cell a - and 140 people who tested positive - cell a plus cell b.

In other words, 64% of people who test positive will actually have colon cancer, while the other 36% of people who test positive will not have colon cancer.

The negative predictive value would be 97%, because there are 850 people - in cell d - who are true negatives and 880 people who tested negative - cell c plus cell d.

In other words, 97% of people who test negative will actually not have colon cancer, while the other 3% of people who test negative will have colon cancer.
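The arithmetic above can be checked with a few lines of Python. This is only a minimal sketch: the function name `predictive_values` is made up for illustration, and the arguments a, b, c, and d are the four cells of the 2 by 2 table as used in the example.

```python
def predictive_values(a, b, c, d):
    """Return (PPV, NPV) for a 2 by 2 table.

    a = true positives, b = false positives,
    c = false negatives, d = true negatives.
    """
    ppv = a / (a + b)  # true positives / everyone who tested positive
    npv = d / (c + d)  # true negatives / everyone who tested negative
    return ppv, npv


# Cell counts from the example above.
ppv, npv = predictive_values(a=90, b=50, c=30, d=850)
print(f"PPV: {ppv:.0%}")  # PPV: 64%
print(f"NPV: {npv:.0%}")  # NPV: 97%
```

Both values are computed along the rows of the table: the denominator is always the number of people who got a particular test result.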

Predictive values are commonly confused with sensitivity and specificity.

A test with high sensitivity will correctly identify most people who have the condition, and a test with high specificity will correctly identify most people who don’t have the outcome.
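To see why the two pairs of measures get confused, it can help to compute sensitivity and specificity from the same 2 by 2 table. The sketch below assumes the standard definitions, which the text above describes only in words: sensitivity is cell a divided by the sum of cells a and c, and specificity is cell d divided by the sum of cells b and d. The function name `sensitivity_specificity` is hypothetical.

```python
def sensitivity_specificity(a, b, c, d):
    """Return (sensitivity, specificity) for a 2 by 2 table.

    a = true positives, b = false positives,
    c = false negatives, d = true negatives.
    """
    sensitivity = a / (a + c)  # true positives / everyone who has the outcome
    specificity = d / (b + d)  # true negatives / everyone who doesn't have it
    return sensitivity, specificity


# Same cell counts as the earlier example.
sens, spec = sensitivity_specificity(a=90, b=50, c=30, d=850)
print(f"Sensitivity: {sens:.0%}")  # Sensitivity: 75%
print(f"Specificity: {spec:.0%}")  # Specificity: 94%
```

The numerators are the same ones used for the predictive values, but the denominators run down the columns of the table (true disease status) rather than across the rows (test result).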

The main difference is that sensitivity and specificity, which describe a test’s validity, are fixed characteristics of the test.

On the other hand, predictive values are affected by the prevalence of the outcome.

To understand this better, let’s follow another example.

So, imagine we now want to test for colon cancer in a group of 10,000 teenagers.

Colon cancer is rare for teenagers, so let’s say the prevalence is only about 1%. In reality, it would be far lower, but this makes the numbers easier to follow.

And let’s also say that the test we use to check for colon cancer has a sensitivity of 99%, meaning it correctly identifies 99% of people who have colon cancer, and a specificity of 95%, so it correctly identifies 95% of people who don’t have colon cancer.
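As a rough sketch, these three numbers are enough to fill in the expected 2 by 2 table for the 10,000 teenagers. The helper name `expected_table` is made up for illustration, and it assumes that counts are rounded to whole people.

```python
def expected_table(n, prevalence, sensitivity, specificity):
    """Expected cell counts (a, b, c, d) for a screened group of n people."""
    with_outcome = round(n * prevalence)
    without_outcome = n - with_outcome
    a = round(with_outcome * sensitivity)     # true positives
    c = with_outcome - a                      # false negatives
    d = round(without_outcome * specificity)  # true negatives
    b = without_outcome - d                   # false positives
    return a, b, c, d


a, b, c, d = expected_table(n=10_000, prevalence=0.01,
                            sensitivity=0.99, specificity=0.95)
print(a, b, c, d)                 # 99 495 1 9405
print(f"PPV: {a / (a + b):.1%}")  # PPV: 16.7%
print(f"NPV: {d / (c + d):.2%}")  # NPV: 99.99%
```

Even with high sensitivity and specificity, the false positives among the 9,900 teenagers without colon cancer far outnumber the true positives among the 100 who have it, which is what pulls the positive predictive value down when prevalence is low.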