Sample size

A rheumatologist is evaluating the long-term risk of venous thromboembolism in patients with newly diagnosed rheumatoid arthritis. The study sample consists of 20 participants. The hazard ratio for venous thromboembolism was found to be 1.7 with a 95% confidence interval of 1.2-3.9. When the sample size of the study is increased, which of the following parameters will change?


Typically, the goal of a study is to explore the relationship between an exposure and an outcome in a target population.

For example, let’s say we want to find out whether Medication A can lower blood pressure better than Medication B, which is the current treatment, in people with hypertension (high blood pressure) who live in Perth, Australia.

But the population of Perth is around 2 million people, and almost a third of the population has hypertension, so that makes our target population nearly 670 thousand people.

It would take way too much money and time to include them all in the study, so instead, we have to select just a sample of them - which becomes our sample population.

And the sample population should be selected at random, so that we have a high chance of including people whose ages, races, and socioeconomic statuses reflect the target population.

But how many people do we choose?

Choosing too many costs more time and money, and choosing too few means that they may not adequately represent the target population.

For example, let’s say we choose 20 people as our sample population: 10 are given Medication A, 10 are given Medication B, and we check their blood pressures after five years.

In the Medication A group, 5 people have a lower blood pressure, and in the Medication B group, 2 people have a lower blood pressure.

This gives us an overall relative risk of 2.5 (5 out of 10 divided by 2 out of 10), meaning that Medication A is 2.5 times as effective as Medication B in the sample population.
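The arithmetic behind that figure can be sketched in a few lines of Python. The function name `relative_risk` is just an illustrative choice; the counts are the ones from the example above.

```python
def relative_risk(responders_a, total_a, responders_b, total_b):
    """Ratio of the response proportion in group A to that in group B."""
    rate_a = responders_a / total_a  # proportion with lower blood pressure on A
    rate_b = responders_b / total_b  # proportion with lower blood pressure on B
    return rate_a / rate_b


# 5 of 10 people improved on Medication A, 2 of 10 on Medication B.
rr = relative_risk(5, 10, 2, 10)
print(f"Relative risk: {rr:.1f}")
```

With these counts the response proportions are 0.5 and 0.2, so the ratio comes out at 2.5.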

But that doesn’t necessarily mean that Medication A will be 2.5 times more effective among all of the hypertensive people in Perth.

For example, maybe the sample contained only women, and Medication A happens to work really well just in women - in which case we may be overestimating the effect.

Or maybe it works really well in women, but even better in men - in which case we’re underestimating the effect.
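Even when a sample is drawn fairly, a small one can mislead through chance alone. The simulation below is a rough sketch of that idea, not part of the original example: it assumes, purely for illustration, that the true response rates are 50% for Medication A and 20% for Medication B, then repeats the study many times at two different sizes. The function name, the number of repeats, and the seed are all arbitrary choices.

```python
import random


def simulate_relative_risks(n_per_group, p_a, p_b, n_studies, seed=0):
    """Repeat a two-group study many times and collect the estimated relative risks."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    estimates = []
    for _ in range(n_studies):
        responders_a = sum(rng.random() < p_a for _ in range(n_per_group))
        responders_b = sum(rng.random() < p_b for _ in range(n_per_group))
        if responders_b == 0:
            continue  # relative risk is undefined when nobody responds to B
        estimates.append((responders_a / n_per_group) / (responders_b / n_per_group))
    return estimates


# Assumed true response rates (hypothetical): 0.5 for A, 0.2 for B.
small = simulate_relative_risks(10, 0.5, 0.2, n_studies=200)
large = simulate_relative_risks(1000, 0.5, 0.2, n_studies=200)

print(f"10 per group:   {min(small):.2f} to {max(small):.2f}")
print(f"1000 per group: {min(large):.2f} to {max(large):.2f}")
```

The estimates from the 10-per-group studies spread over a much wider range than those from the 1000-per-group studies, which is the practical reason a sample of 20 may not reflect the target population.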

To figure out the perfect sample size, we need to know five things.

First, we need to know the current response rate, or the proportion of people who respond to the current treatment.

For example, if 50 out of 100 people have lower blood pressure after five years of using Medication B, then the current response rate is 50% over five years.

Second, we need to know the estimated difference in response rates between Medication B and Medication A, based on previous research.
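To show how these first two inputs feed into a sample size, here is a minimal sketch using a common normal-approximation formula for comparing two proportions. Several things here are assumptions and do not come from the text above: the formula itself, the 5% significance level, the 80% power, the two-sided test, and the 10 percentage point difference for Medication A. The function name `sample_size_per_group` is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(p_current, p_new, alpha=0.05, power=0.80):
    """Approximate participants needed per group to compare two proportions.

    Normal-approximation formula for a two-sided test. The alpha and power
    defaults are assumed, conventional values, not values from the text.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = NormalDist().inv_cdf(power)          # value matching the desired power
    variance = p_current * (1 - p_current) + p_new * (1 - p_new)
    difference = p_new - p_current
    return ceil((z_alpha + z_power) ** 2 * variance / difference ** 2)


# Current response rate of 50% (Medication B) and an assumed 60% for Medication A.
n = sample_size_per_group(0.50, 0.60)
print(f"Participants needed per group: {n}")
```

Under these assumptions the sketch gives 385 people per group. A larger expected difference between the two medications lowers the number, which is why the estimated difference matters so much.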


In statistics and quantitative research methodology, a data sample is a set of data collected and/or selected from a statistical population by a defined procedure. The elements of a sample are known as sample points, sampling units or observations.

Copyright © 2023 Elsevier, its licensors, and contributors. All rights are reserved, including those for text and data mining, AI training, and similar technologies.

