Biostatistics and epidemiology
A researcher is investigating the association between the use of social media in teenagers and bipolar disorder. He gathers continuous data, which is divided into 2 groups: those who spend more than 2 hours per day on social media and those who spend less than 2 hours per day. The data is demonstrated below:
                      | More than 2 hours of social media daily | Less than 2 hours of social media daily
Bipolar disorder      | 300                                     | 200
No bipolar disorder   | 100                                     | 400
Which of the following is the odds ratio for developing bipolar disorder among teenagers who spend more than 2 hours per day on social media, when compared to those who spend less than 2 hours per day?
Content Reviewers: Rishi Desai, MD, MPH
Contributors: Tanner Marshall, MS, Evan Debevec-McKenney
In statistics, the words probability and odds are often confused with each other, because they help measure the same thing - the chance that an outcome will occur - and in both cases, we need to know the same two things - the number of times an outcome actually happened, or didn’t happen, and the total number of times an outcome could have happened.
The probability is the number of times an outcome happened divided by the number of times the outcome could have happened, and is often represented by a capital P.
For example, let’s say we want to figure out the probability of having a heart attack for people with hypertension - or high blood pressure.
To do this, we could carry out a cohort study, in which we start with two exposure groups and follow them over time to see if they develop a certain outcome.
We could recruit 100 people with hypertension - the exposed group - and 100 people without hypertension - the non-exposed group - keep track of how many of them have heart attacks in the next year, and organize our results in a 2 by 2 table.
The two outcomes - heart attack or no heart attack - are labeled on the top, the two exposure groups - hypertension or no hypertension - are labeled on the side, and the cells inside the box are labeled a, b, c, and d.
Now, let’s say there are 9 people that have heart attacks in the group with hypertension - cell a - and 3 people that have heart attacks in the group without hypertension - cell c.
That means that there are 100 minus 9, or 91, people that didn’t have heart attacks in the group with hypertension - cell b - and 100 minus 3, or 97, people that didn’t have heart attacks in the group without hypertension - cell d.
To calculate the relative risk, we need the probability of having a heart attack in the group with hypertension - so cell a, 9, divided by cell a plus cell b or 100, which gives us 0.09.
We also need the probability of having a heart attack in the group that doesn’t have hypertension - so cell c, 3, divided by cell c plus cell d, 100, which gives us 0.03.
The relative risk is 0.09 divided by 0.03, so 3, which means that people who have hypertension have 3 times the risk of having a heart attack in one year compared to people without hypertension.
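The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name relative_risk is a hypothetical helper (not from any library), and its arguments follow the cell labels a, b, c, and d of the 2 by 2 table.

```python
from fractions import Fraction

def relative_risk(a, b, c, d):
    """Relative risk from a 2 by 2 table (hypothetical helper).

    a: exposed, outcome        b: exposed, no outcome
    c: non-exposed, outcome    d: non-exposed, no outcome
    """
    risk_exposed = Fraction(a, a + b)    # probability of the outcome in the exposed group
    risk_unexposed = Fraction(c, c + d)  # probability of the outcome in the non-exposed group
    return risk_exposed / risk_unexposed

# Hypertension example: 9 of 100 exposed and 3 of 100 non-exposed people have heart attacks
rr = relative_risk(9, 91, 3, 97)
print(float(rr))  # 3.0
```

Exact fractions are used here only to avoid floating-point rounding in a small worked example; plain division would work as well.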
The probability of having a heart attack in one year for people with hypertension is 9 divided by 100, and the result can be written as a decimal - 0.09 - or a percentage - 9%.
On the flip side, we could also find the probability of not having a heart attack for people with hypertension.
A simple way to do this is to subtract the probability of having a heart attack from 1, or 100%.
So, the probability of not having a heart attack for people with hypertension is 1 minus 0.09, which equals 0.91, or 91%.
Now, the relative risk can only be calculated when we have information about incidence, or the probability of a new outcome occurring during a certain time period.
That’s because the word risk implies that hypertension is responsible for causing, or at least partially causing, heart attacks to happen.
And, if hypertension causes heart attacks to happen, then a person would have to have hypertension before they have a heart attack, which can only be measured by incidence - which measures new events.
But sometimes information on incidence can’t be collected, so relative risk can’t be calculated.
For example, case-control studies start with a case group that already has the outcome and a control group that doesn’t have the outcome, and then look back at past exposures.
In this situation, odds and odds ratios come in very handy.
Odds compare the probability of an outcome occurring with the probability of an outcome not occurring, so it’s P compared with 1 minus P.
So in our hypertension example, we’d use the probability of having a heart attack - 0.09 - compared with the probability of not having a heart attack, 1 minus 0.09, or 0.91.
Dividing both sides by 0.91, we get 0.0989 compared to 1, and rounding off we can say that it’s 0.1 compared to 1, and multiplying by 10, we get an odds of 1 to 10.
In other words, in one year, for every 1 person with hypertension who has a heart attack, there are 10 people with hypertension who do not have a heart attack.
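As a rough sketch, the conversion from probability to odds in the hypertension example could be written as follows. The function name odds is an illustrative choice, not a library function.

```python
from fractions import Fraction

def odds(p):
    """Odds of an outcome with probability p: p compared with 1 - p."""
    return p / (1 - p)

# Probability of a heart attack in the hypertension group: 9 out of 100
p_heart_attack = Fraction(9, 100)
o = odds(p_heart_attack)

print(o)                   # 9/91
print(round(float(o), 4))  # 0.0989
```

The exact odds are 9 to 91, or about 1 to 10.1, which is where the rounded figure of 1 to 10 comes from.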
In fact, the odds will always be higher than the probability of an event, but the two end up being very similar when an outcome is rare - specifically when it’s found in less than 10% of the population. As the outcome becomes more common, the gap widens: a probability of 0.5 corresponds to odds of 1 to 1.
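To see how odds and probability drift apart as an outcome becomes more common, here is a small sketch that prints both side by side. The probabilities in the loop are arbitrary illustrative values, apart from 0.09, which comes from the hypertension example.

```python
def odds(p):
    """Odds of an outcome with probability p (0 < p < 1)."""
    return p / (1 - p)

# Compare probability and odds for rare and common outcomes
for p in (0.01, 0.05, 0.09, 0.25, 0.50):
    print(f"probability {p:.2f} -> odds {odds(p):.3f}")
```

For the rare outcomes the two numbers nearly match (0.09 versus roughly 0.099), while at a probability of 0.50 the odds reach 1.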
The odds ratio (OR) is a statistical measure used to compare the odds of an event occurring in one group to the odds of the same event occurring in another group. It is often used in retrospective studies, such as case-control studies, to compare the likelihood of an outcome (such as a disease or condition) occurring in one group of people compared to another group.
The odds ratio is calculated by dividing the odds of the event occurring in one group by the odds of the event occurring in the other group. For example, if the odds of a disease occurring in group A are 1 to 10, and the odds of the disease occurring in group B are 1 to 20, the odds ratio would be (1/10) / (1/20) = 2. This indicates that the odds of the disease occurring in group A are twice as high as the odds of the disease occurring in group B.
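A minimal sketch of this calculation, assuming the counts are arranged in a 2 by 2 table with cells a, b, c, and d as in the hypertension example. The function name odds_ratio is a hypothetical helper, not a library call.

```python
from fractions import Fraction

def odds_ratio(a, b, c, d):
    """Odds ratio from a 2 by 2 table (hypothetical helper).

    a: exposed, outcome        b: exposed, no outcome
    c: non-exposed, outcome    d: non-exposed, no outcome
    """
    odds_exposed = Fraction(a, b)    # outcome compared with no outcome, exposed group
    odds_unexposed = Fraction(c, d)  # outcome compared with no outcome, non-exposed group
    return odds_exposed / odds_unexposed

# Hypertension example: a = 9, b = 91, c = 3, d = 97
or_hypertension = odds_ratio(9, 91, 3, 97)
print(or_hypertension)                   # 291/91
print(round(float(or_hypertension), 2))  # 3.2

# Group A versus group B, with odds of 1/10 and 1/20
print(Fraction(1, 10) / Fraction(1, 20))  # 2
```

In the hypertension example the odds ratio (about 3.2) lands close to the relative risk of 3, which fits the earlier point that odds and probability are similar when an outcome is rare.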