Content Reviewers: Rishi Desai, MD, MPH
Contributors: Elizabeth Nixon-Shapiro, Sarah Clifford, BMBS, BSc (Hons), Marisa Pedron, Kaitlyn Harper
The Student’s t-test, or simply the t-test, is a type of parametric statistical test used to determine whether there’s a significant difference between the means, or averages, of two groups.
And significance is normally defined by a p-value of less than 0.05 or 5%.
Now when doing any parametric test, there are three key assumptions that we have to make about the population.
First, the sample population must have been recruited randomly.
Choosing participants randomly helps ensure that the people included in the study will have similar characteristics to the target population.
This is important because it means the results of the t-test can be applied to the target population - meaning it has good external validity!
The second assumption is that each individual in the sample was recruited independently from other individuals in the sample.
In other words, no individuals influenced whether or not any other individual was included in the study.
For example, if two friends decided to get their blood pressures measured on the same day, and they were both included in the study, these two individuals would not be independent of each other and the second assumption would not be met.
Like random sampling, independent recruitment of individuals is important because it ensures that the sample population approximates the target population.
The third assumption is that the sample size is large enough to approximate the target population, which usually means having more than 20 people.
If it’s impossible to get a large sample size, then the sample population must follow a normal bell-shaped distribution for the characteristic being studied because that’s what we would expect to see in the target population.
Okay, now let’s say you want to figure out if a certain medication lowers systolic blood pressure.
So, you measure 25 people’s systolic blood pressures and find that the mean systolic blood pressure for the whole group is 138 mmHg.
Then, you give them the medication and after six weeks, you find that the mean systolic blood pressure for the group is only 130 mmHg.
Now, to figure out if a decrease in systolic blood pressure from 138 to 130 is significant, we could perform a t-test.
Specifically, since the two means were measured in the same population before and after the treatment, we would use a paired t-test.
This is different from an unpaired t-test, or 2-sample t-test, which is used to compare two separate groups of individuals.
For example, an unpaired t-test could compare the systolic blood pressure measurements of a group of 25 people who used medication for six weeks to a different group of 25 people who did not use the medication for six weeks.
Typically, a paired t-test starts with two hypotheses.
The first hypothesis is the null hypothesis, and it basically says that the mean of the differences between the two groups is equal to zero.
In other words, the null hypothesis is that taking the medication results in no difference in systolic blood pressure.
The second hypothesis is the alternative hypothesis, and since a t-test can be either one-sided or two-sided, there are two versions of the alternative hypothesis.
The alternative hypothesis for a one-sided t-test would either state that the mean of the differences is a positive number or that the mean of the differences is a negative number.
The alternative hypothesis for a two-sided t-test would state that the mean of the differences is not equal to zero, but it wouldn’t specify if it was positive or negative.
Typically, researchers choose to use two-sided t-tests, since they usually don’t know how the medication will affect people who take it.
So, the two-sided alternative hypothesis for our study would state that the mean of the differences in systolic blood pressure before and after taking the medication is not equal to zero.
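Written in symbols, the hypotheses for our study might look like the following. The symbol \mu_d, standing for the true mean of the before-and-after differences, is notation assumed here for illustration.

```latex
H_0: \mu_d = 0
\qquad
H_A \text{ (two-sided)}: \mu_d \neq 0
\qquad
H_A \text{ (one-sided)}: \mu_d < 0 \ \text{ or } \ \mu_d > 0
```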
To test these hypotheses, we need to calculate a t-score, which is a ratio of the mean of differences between the two groups to the standard error of the mean of differences between the two groups.
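As a rough sketch, that ratio can be written like this, where d-bar is the mean of the differences. The symbols s_d (the sample standard deviation of the differences) and n (the number of pairs) are assumed notation, and the standard error is written in its usual form of s_d divided by the square root of n.

```latex
t = \frac{\bar{d}}{SE_{\bar{d}}} = \frac{\bar{d}}{s_d / \sqrt{n}}
```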
Let’s start with the first part - the mean of differences between the two groups.
In our case, that’s the difference in systolic blood pressure in individuals before and after the treatment and it’s represented by the symbol d-bar.
For example, let’s calculate the mean of the differences for the first three people in the study.
Let’s say that their systolic blood pressures before the medication were 135, 142, and 137 mmHg, and after the medication they were 127, 145, and 128 mmHg.
For each individual, we can find the difference by subtracting the value taken before the medication from the value taken after the medication.
So, for person 1, it’s 127 minus 135, or -8; for person 2, it’s 145 minus 142, or +3; and for person 3, it’s 128 minus 137, or -9.
Now, to find the mean of the differences, we add up the individual differences and divide by the number of people in the group.
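To make that arithmetic concrete, here’s a minimal Python sketch using the three people from the example. The function name mean_of_differences is made up for illustration, and it assumes the readings are stored as two lists in the same person order.

```python
from statistics import mean

before = [135, 142, 137]  # systolic BP before the medication (mmHg)
after = [127, 145, 128]   # systolic BP after the medication (mmHg)

def mean_of_differences(before, after):
    """Return each person's difference (after minus before) and the mean of those differences."""
    if len(before) != len(after):
        raise ValueError("paired data need one before and one after value per person")
    differences = [a - b for b, a in zip(before, after)]
    return differences, mean(differences)

differences, d_bar = mean_of_differences(before, after)
print(differences)      # [-8, 3, -9]
print(round(d_bar, 2))  # -4.67
```

So, for these three people, the sum of the differences is -14, and dividing by 3 gives a mean of about -4.67 mmHg.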