# Information bias

## Biostatistics and epidemiology

#### Epidemiology

#### Content Reviewers:

Rishi Desai, MD, MPH

#### Contributors:

Pauline Rowsome, Tanner Marshall, MS, Evan Debevec-McKenney

Information bias, or measurement bias, is a type of bias or error that can occur when researchers are unable to collect accurate data.

Typically, information can be misclassified in two ways: differential, when the accuracy of the information differs between groups (for example, information collected from one group is accurate but information collected from the other group is inaccurate), and non-differential, when information collected from both groups is inaccurate to a similar degree.

For example, let’s say you want to figure out if flossing teeth prevents cavities.

So you follow 100 people who floss, and 100 people who don’t floss, over the course of ten years, and find out that 30% of the flossers and 60% of the non-flossers ended up getting cavities.

Now, we can divide these proportions, 60% divided by 30%, and conclude that people who don’t floss have 2 times the risk of getting cavities compared to people that do floss.
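The division above can be sketched in a few lines of Python. This is only an illustration of the arithmetic; the function name `relative_risk` is an assumption made for this example, not something from the study.

```python
def relative_risk(risk_group_a: float, risk_group_b: float) -> float:
    """Return the risk in group A divided by the risk in group B."""
    if risk_group_b <= 0:
        raise ValueError("risk in the comparison group must be positive")
    return risk_group_a / risk_group_b


# From the example: 60 of 100 non-flossers and 30 of 100 flossers got cavities.
risk_non_flossers = 60 / 100
risk_flossers = 30 / 100

rr = relative_risk(risk_non_flossers, risk_flossers)
print(f"Relative risk (non-flossers vs. flossers): {rr:.2f}")  # prints 2.00
```

A relative risk of 1 would mean both groups have the same risk, so values further from 1 indicate a stronger association.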

Now, in this study we assume that every person in the flossing group is going to floss their teeth every single day for ten years, and every person in the non-flossing group is not going to floss their teeth for ten years.

But sometimes people in one of the study groups don’t stick with their exposure for the entire study period.

For example, maybe some people in the flossing group stopped flossing halfway through the study, because they ran out of floss and just never bought more.

In this case, they would still be counted by the researchers as part of the flossing group even though they technically switched over to the non-flossing group.

On the other hand, everyone in the non-flossing group stayed in the non-flossing group, meaning that they all really didn’t floss for the entire ten years.

This would cause differential misclassification, because information collected from the non-flossing group would be accurate, but information collected from the flossing group would be inaccurate.

So how does differential misclassification affect the results of a study?

Let’s assume that flossing actually does decrease the risk of cavities, so the people from the flossing group who stopped flossing halfway through the study actually had a higher risk of cavities than people who kept flossing for the whole study, which would cause an overall increase in the risk of cavities for the flossing group.

So, going back to our study, we might find that 40% of people in the flossing group got cavities, and we’d be unaware that the true risk is 30%.

Now when we compare the proportions of people who got cavities, dividing 60% in the non-flossing group by 40% in the flossing group, we end up with 1.5 times the risk of getting cavities for people who don’t floss compared to those that do floss.

In this case, the risk in the flossing group was overestimated (40% instead of 30%), so the effect of not flossing was underestimated: a relative risk of 1.5 instead of the true value of 2.
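One way to see how a 30% risk could show up as 40% is to treat the flossing group as a mix of people who kept flossing and people who stopped. The sketch below is illustrative only: the fraction who stopped (half) and their risk (50%) are hypothetical values chosen so the numbers match the example, and `observed_risk` is a name made up for this illustration.

```python
def observed_risk(true_risk: float, fraction_switched: float,
                  risk_if_switched: float) -> float:
    """Average risk in a group where a fraction of members changed exposure."""
    return (1 - fraction_switched) * true_risk + fraction_switched * risk_if_switched


# Hypothetical assumption: half of the flossers stop flossing, and those
# who stop end up with a 50% risk of cavities instead of 30%.
flossers_observed = observed_risk(true_risk=0.30, fraction_switched=0.5,
                                  risk_if_switched=0.50)

# Nobody in the non-flossing group switched, so their risk is measured accurately.
non_flossers_observed = 0.60

true_rr = 0.60 / 0.30
observed_rr = non_flossers_observed / flossers_observed

print(f"Observed risk in flossing group: {flossers_observed:.0%}")  # 40%
print(f"True relative risk: {true_rr:.2f}")                         # 2.00
print(f"Observed relative risk: {observed_rr:.2f}")                 # 1.50
```

Only one group is measured inaccurately here, which is what makes the misclassification differential.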

Although differential misclassification can sometimes underestimate the effect of the exposure on the outcome, it can also overestimate it, so its effect on the results is unpredictable.

On the flip side, non-differential misclassification occurs when people in both study groups don’t stick with the exposure, so the information collected from both groups is inaccurate.

For example, just like before, some people in the flossing group stopped flossing halfway through the study because they ran out of floss, which increased their risk of getting cavities from 30% to 40%.

But this time, some people in the non-flossing group also started flossing halfway through the study, maybe because they started dating someone who is picky about oral hygiene, which decreased their risk of cavities from 60% to 50%.

So when we compare the proportion of people in the non-flossing group who got cavities, 50%, to the proportion of people who got cavities in the flossing group, 40%, we find that the risk of cavities for people who don’t floss is 1.25 times the risk of cavities for people who do floss, which is a big underestimation of the true relative risk of 2.
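The three scenarios from the flossing example can be put side by side in a short Python sketch. It uses only the percentages given in the text; the helper name `relative_risk` and the labels are assumptions made for this illustration.

```python
def relative_risk(risk_non_flossers: float, risk_flossers: float) -> float:
    """Risk of cavities in non-flossers divided by the risk in flossers."""
    return risk_non_flossers / risk_flossers


# Risks taken from the worked example in the text.
true_rr = relative_risk(0.60, 0.30)              # no misclassification
differential_rr = relative_risk(0.60, 0.40)      # only flossers misclassified
non_differential_rr = relative_risk(0.50, 0.40)  # both groups misclassified

for label, rr in [
    ("True", true_rr),
    ("Differential", differential_rr),
    ("Non-differential", non_differential_rr),
]:
    print(f"{label} relative risk: {rr:.2f}")
```

In this example, both kinds of misclassification pull the estimate toward 1, the value that would mean flossing makes no difference, and the non-differential case pulls it furthest.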

Non-differential misclassification tends to underestimate the effect of the exposure on the outcome, and as a result researchers are often more concerned about differential misclassification compared to non-differential misclassification.

Now, there are three common types of information bias, and they can all lead to differential or non-differential misclassification depending on the situation: surveillance bias, recall bias, and surrogate interview bias.

The first is surveillance bias, which occurs when one group is monitored much more closely than another group. This can happen if researchers believe that one group is more at risk for the outcome than another group.

For example, researchers might do a more thorough dental exam for those that don’t floss, because they want to make sure they spot any new cavities that might’ve developed since their last exam.