Sample size

A rheumatologist is evaluating the long-term risk of venous thromboembolism in patients with newly diagnosed rheumatoid arthritis. The study sample consists of 20 participants. The hazard ratio for venous thromboembolism was found to be 1.7 with a 95% confidence interval of 1.2-3.9. When the sample size of the study is increased, which of the following parameters will change?


Content Reviewers:

Rishi Desai, MD, MPH

Typically, the goal of a study is to explore the relationship between an exposure and an outcome in a target population.

For example, let’s say we want to find out if Medication A can lower blood pressure better than Medication B, which is the current treatment, in people with hypertension (high blood pressure) who live in Perth, Australia.

But the population of Perth is around 2 million people, and almost a third of the population has hypertension, so that makes our target population nearly 670 thousand people.

It would take way too much money and time to include them all in the study, so instead we select just a sample of them, which becomes our sample population.

And the sample population should be selected by randomization, so that we have a high chance of including people of ages, races, and socioeconomic statuses that reflect the target population.
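As a rough illustration of random selection, here is a minimal sketch of drawing a simple random sample. The numeric IDs, the `draw_sample` helper, and the seed are hypothetical stand-ins for illustration; a real study would sample from an actual registry of eligible people.

```python
import random


def draw_sample(population_ids, sample_size, seed=None):
    """Return a simple random sample, drawn without replacement."""
    rng = random.Random(seed)  # seeded only so the sketch is repeatable
    return rng.sample(population_ids, sample_size)


# Hypothetical: one numeric ID per person in the ~670,000-person target population.
target_population = range(670_000)

sample = draw_sample(target_population, 20, seed=1)
print(len(sample), "people selected, all distinct:", len(set(sample)) == 20)
```

Because every person has the same chance of being picked, no age group, race, or socioeconomic group is favored by the selection itself, although a small sample can still be unrepresentative by chance.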

But how many people do we choose?

Choosing too many costs more time and money, and choosing too few means that they may not adequately represent the target population.

For example, let’s say we choose 20 people as our sample population: 10 are given Medication A, 10 are given Medication B, and we check their blood pressures after five years.

In the Medication A group, 5 people have a lower blood pressure, and in the Medication B group, 2 people have a lower blood pressure.

This gives us a relative risk of 2.5 (5 out of 10 divided by 2 out of 10), meaning that people in the sample population were 2.5 times as likely to have lower blood pressure on Medication A as on Medication B.
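The arithmetic behind that figure can be sketched in a few lines. The function name `relative_risk` is just an illustrative choice; the counts are the ones from the example.

```python
def relative_risk(events_a, total_a, events_b, total_b):
    """Ratio of the outcome proportion in group A to that in group B."""
    rate_a = events_a / total_a  # 5 / 10 for Medication A
    rate_b = events_b / total_b  # 2 / 10 for Medication B
    return rate_a / rate_b


rr = relative_risk(events_a=5, total_a=10, events_b=2, total_b=10)
print(f"Relative risk: {rr:.1f}")
```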

But that doesn’t necessarily mean that Medication A will be 2.5 times more effective among all of the hypertensive people in Perth.

For example, maybe the sample contained only women, and Medication A happens to work really well just in women, in which case we may be overestimating the effect.

Or what if it works really well in women, but even better in men? In that case, we’re underestimating the effect.
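To see why small samples can mislead even when they are drawn at random, here is a small simulation sketch. It assumes a made-up true response rate of 50% and compares how much the observed rate bounces around in samples of 10 versus samples of 1,000. The function name, the seed, and the number of repetitions are arbitrary choices for illustration.

```python
import random
import statistics


def simulated_response_rates(true_rate, sample_size, repetitions, seed=0):
    """Observed response rate in each of many simulated samples."""
    rng = random.Random(seed)
    rates = []
    for _ in range(repetitions):
        # Each person responds with probability true_rate.
        responders = sum(rng.random() < true_rate for _ in range(sample_size))
        rates.append(responders / sample_size)
    return rates


small = simulated_response_rates(true_rate=0.5, sample_size=10, repetitions=500)
large = simulated_response_rates(true_rate=0.5, sample_size=1000, repetitions=500)

spread_small = statistics.pstdev(small)
spread_large = statistics.pstdev(large)
print(f"Spread of observed rates, n=10:   {spread_small:.3f}")
print(f"Spread of observed rates, n=1000: {spread_large:.3f}")
```

The observed rates from the larger samples cluster much more tightly around the true rate, which is the intuition behind wanting a sample that is large enough.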

To figure out the perfect sample size, we need to know five things.

First, we need to know the current response rate, or the proportion of people who respond to the current treatment.

For example, if 50 out of 100 people have lower blood pressure after five years of using Medication B, then the current response rate is 50% over 5 years.

Second, we need to know the estimated difference in response rates between Medication B and Medication A, based on previous research.

For example, if Medication A had a response rate of 70% over 5 years in Berlin, Germany, then we can assume its response rate in Perth might be similar.

So, if we consider that Medication B’s response rate is 50%, we’d say that the estimated response rate of Medication A is 70%, and the estimated difference in response rates is 20 percentage points.
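The first two inputs can be written out as a short calculation. This is only a sketch using the numbers from the example; the variable names are illustrative.

```python
# Current response rate: 50 of 100 people respond to Medication B over 5 years.
current_rate = 50 / 100

# Estimated response rate of Medication A, borrowed from the Berlin figure.
estimated_new_rate = 0.70

# Estimated difference in response rates, in absolute terms.
difference = estimated_new_rate - current_rate
print(f"Current: {current_rate:.0%}, estimated new: {estimated_new_rate:.0%}, "
      f"difference: {difference:.0%}")
```

Note that the difference is absolute (70% minus 50%), not relative: in relative terms, 70% is 40% higher than 50%.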

Third, we need to know if we want a one-sided study or a two-sided study.

A one-sided study could ask a question like, does Medication A have a cure rate that’s higher than Medication B’s cure rate?

Alternatively, it could ask, does Medication A have a cure rate that’s lower than Medication B’s cure rate?

Basically, a one-sided study can look for a new cure rate that’s either higher or lower than the current cure rate, but not both.

In contrast, a two-sided study might ask a question like, does Medication A have a different cure rate than Medication B?
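One practical consequence of this choice can be sketched with the standard normal distribution. The significance level of 0.05 below is an assumption for illustration and is not stated in the text above; `critical_z` is a hypothetical helper name. A one-sided test puts the whole 5% in one tail, while a two-sided test splits it across both tails, so the two-sided test needs a more extreme result to reach significance.

```python
from statistics import NormalDist


def critical_z(alpha, two_sided):
    """Critical z value for a given significance level (alpha)."""
    # A two-sided test splits alpha between the two tails.
    tail = alpha / 2 if two_sided else alpha
    return NormalDist().inv_cdf(1 - tail)


alpha = 0.05  # assumed significance level, chosen only for illustration
z_one = critical_z(alpha, two_sided=False)
z_two = critical_z(alpha, two_sided=True)
print(f"One-sided critical z: {z_one:.3f}")
print(f"Two-sided critical z: {z_two:.3f}")
```

All else being equal, the higher threshold of a two-sided study generally means it needs more participants to detect the same difference.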

In statistics and quantitative research methodology, a data sample is a set of data collected and/or selected from a statistical population by a defined procedure. The elements of a sample are known as sample points, sampling units or observations.