Content Reviewers: Rishi Desai, MD, MPH
Contributors: Sam Gillespie, BSc
Analysis of variance, or simply, ANOVA, is a type of parametric statistical test used to determine if there’s a significant difference between the means or averages of three or more groups.
And significance is normally defined by a p-value of less than 0.05 or 5%.
Now when doing any parametric test, there are three key assumptions that we have to make about the population.
First, the sample population must have been recruited randomly. Choosing people randomly helps ensure that the people included in the study will have similar characteristics to the target population.
This is important because it means the results of the test can be applied to the target population - in other words, the study has good external validity!
The second assumption is that each individual in the sample was recruited independently from other individuals in the sample. In other words, no individuals influenced whether or not any other individual was included in the study.
For example, if two friends decided to get their blood pressures measured on the same day, and they were both included in the study, these two individuals would not be independent of each other and the second assumption would not be met.
Like random sampling, independent recruitment of individuals is important because it ensures that the sample population approximates the target population.
The third assumption is that the sample size is large enough to approximate the target population, which usually means having more than 20 people.
If it’s impossible to get a large sample size, then the sample population must follow a normal bell-shaped distribution for the characteristic being studied because that’s what we would expect to see in the target population.
Okay, now let’s say there’s a certain blood pressure medication, called Medication A, and you want to figure out if it helps lower systolic blood pressure after taking it for three months and after taking it for six months. So, you find 10 people and give each of them Medication A. Then, you measure each of their systolic blood pressures at time 1 - which is the time you initially gave them the medication - and then measure it again at time 2, let’s say 3 months later, and time 3, let’s say 6 months after they started taking the medication. You find out that the mean systolic blood pressure measurement at time 1 is 138; at time 2 it’s 132, and at time 3 it’s 130. Now, the next step is to figure out if 138, 132, and 130 are significantly different from one another, and you do that by performing an ANOVA test.
Specifically, we would use a repeated measures ANOVA test, because we’re looking at the same group of people at multiple time periods.
In a repeated measures ANOVA test, time is the independent variable, and in this example, systolic blood pressure is the dependent variable.
It might be tempting to think that medication type is the independent variable in this study, but this isn’t the case, since everyone in the study is taking the same medication type.
The repeated measures ANOVA test is different from an independent one-way ANOVA test, which looks at multiple groups of people at one time point.
For example, let’s say there are three medications called Medication A, B, and C. A one-way ANOVA test would compare the systolic blood pressure measurements for people who have been taking one of the three types of medications for 6 months.
In this example, medication type is the independent variable instead of time, because you measure the blood pressure of every person in the study at the same time.
Typically, a repeated measures ANOVA test starts with two hypotheses.
The first hypothesis is the null hypothesis, and it says that the means of each group are equal.
In other words, the null hypothesis is that the mean systolic blood pressure is the same for people at time 1, 2, and 3.
The second hypothesis is the alternative hypothesis, and it says that at least one group’s mean is significantly different from the others.
So, the alternative hypothesis in our example is that the mean systolic blood pressure is not the same for people at time 1, 2, and 3.
One important thing to know is that ANOVA doesn’t tell you which group’s mean is different from the others or whether that mean is higher or lower; it simply tells you that the groups’ means are not all equal.
Now, there are six steps to test these hypotheses.
The first step is to calculate the mean of each individual group and the overall mean, or grand mean, which is the mean of the blood pressure measurements across all the groups.
Since the means for each group are 138, 132, and 130, and each group has the same number of people, we can calculate the overall mean by adding up the group means - so 138 plus 132 plus 130, which is 400. Then, we divide that by the number of groups, which is 3. So, the overall mean is 400 divided by 3, or approximately 133.
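The overall mean from step one can be checked with a short Python sketch. The variable names here are just illustrative, and averaging the group means like this only works because every group has the same number of people.

```python
# Group means from the example: systolic blood pressure at time 1, 2, and 3.
group_means = [138, 132, 130]

# With equal group sizes, the grand mean is the average of the group means.
grand_mean = sum(group_means) / len(group_means)

print(round(grand_mean, 2))  # 133.33
print(round(grand_mean))     # 133, the rounded value used in the rest of the example
```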
The second step is to find the between-group variation, which is also called the sum of squares-between, or the SSB.
The sum of squares-between is a measure of how far each group’s mean is from the overall mean.
To find the sum of squares-between, we start by subtracting each group’s mean from the overall mean and squaring the result, which is called the squared difference. Then, we multiply the squared difference by the number of people in that group.
For a repeated measures ANOVA test, the number of people in each group stays the same unless people drop out partway through the study.
In this example, there are 10 people in each group. So, for Time 1, we subtract the mean blood pressure of the Time 1 group, which is 138, from the overall mean, which is 133, and that equals negative 5. The squared difference is negative 5 squared, or 25, and 25 times 10 is 250. For the Time 2 group, the mean is 132, so 133 minus 132 is 1, and 1 squared is still 1, and 1 times 10 equals 10. For the Time 3 group, the mean is 130, so 133 minus 130 is 3. 3 squared is 9, and 9 times 10 is 90. Now that we have the values for each group, we add them together to get the sum of squares-between. So, 10 plus 250 plus 90 is 350.
A larger sum of squares-between tells us that the group means and the overall mean are spread out or different from one another, and a smaller sum of squares-between tells us that the group means are fairly similar to the overall mean.
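Here is a minimal Python sketch of the sum of squares-between calculation described above. The function name is made up for illustration. Note that the walkthrough uses the overall mean rounded to 133, which gives 350; with the unrounded overall mean of 400 divided by 3, the result comes out slightly lower.

```python
# Values taken from the example.
group_means = [138, 132, 130]   # time 1, time 2, time 3
n_per_group = 10                # same 10 people measured at each time point

def sum_of_squares_between(means, grand_mean, n):
    """Add up n * (grand mean - group mean)^2 over all groups."""
    return sum(n * (grand_mean - m) ** 2 for m in means)

# Using the rounded overall mean, as in the walkthrough.
ssb_rounded = sum_of_squares_between(group_means, 133, n_per_group)
print(ssb_rounded)  # 350

# Using the unrounded overall mean for comparison.
exact_grand_mean = sum(group_means) / len(group_means)
ssb_exact = sum_of_squares_between(group_means, exact_grand_mean, n_per_group)
print(round(ssb_exact, 2))  # 346.67
```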
The third step in the ANOVA calculation is to find the within-group variation, which is also called the sum of squares-within, or SSW.
The sum of squares-within is a measure of how far each individual blood pressure measurement is from its own group mean.
To find the sum of squares-within, you start by finding the squared differences for each person in one group, and to do this, you take each individual blood pressure measurement and subtract that group’s mean, then square it.
For example, let’s just take the first 3 systolic blood pressure measurements in the Time 1 group, which are 129, 142, and 143. Since the group mean is 138, you subtract 138 from each individual measurement, so 129 minus 138 is negative 9, 142 minus 138 is 4, and 143 minus 138 is 5. Then you square each number and add the squares together to get the squared difference for those three measurements - so when you add up negative 9-squared, or 81, and 4-squared, or 16, and 5-squared, or 25, you get 122.
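The arithmetic for those three measurements can be written as a quick Python sketch, using only the numbers from the example.

```python
# First three systolic blood pressure measurements in the Time 1 group.
measurements = [129, 142, 143]
group_mean = 138  # mean of the Time 1 group from the example

# Subtract the group mean from each measurement, then square the result.
squared_diffs = [(x - group_mean) ** 2 for x in measurements]

print(squared_diffs)       # [81, 16, 25]
print(sum(squared_diffs))  # 122
```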
Because it’s added up over everyone in the group, the squared difference tends to be larger for groups with more people. Now, let’s say the squared difference of the whole Time 1 group is 330, and the squared differences for the Time 2 and Time 3 groups are 310 and 265. As a general rule, if all of the groups have equal sample sizes - like if each group has 10 people - then groups with higher squared differences, like the Time 1 group, have more variation than groups that have lower squared differences, like the Time 3 group.
In other words, the individual blood pressure measurements for individuals in the Time 1 group are more spread out than the blood pressure measurements for individuals in the Time 3 group.
Now, to get the sum of squares-within, we add up all the squared differences for each group. So 330 plus 310 plus 265 equals 905.
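As a rough sketch, the sum of squares-within can be computed in Python in two ways: by adding up the per-group totals given in the example, or, if the raw measurements are available, with a small helper. The helper name sum_of_squares_within is made up for illustration.

```python
# Per-group squared differences from the example.
group_squared_differences = {"Time 1": 330, "Time 2": 310, "Time 3": 265}

ssw = sum(group_squared_differences.values())
print(ssw)  # 905

def sum_of_squares_within(groups):
    """Illustrative helper: groups is a list of lists of raw measurements."""
    total = 0.0
    for values in groups:
        mean = sum(values) / len(values)
        # Squared distance of every measurement from its own group mean.
        total += sum((v - mean) ** 2 for v in values)
    return total
```

Because all three groups have 10 people, the group with the largest total, Time 1, is also the group whose measurements are the most spread out.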
To do step 4, we have to know a little more about the sum of squares-within. The sum of squares-within is made up of two parts: subject-level variation, which is also called the sum of squares of subjects or SSs, and random error, which is also called the sum of squared error or SSE.
The values for the sum of squares of subjects and sum of squared-error add up to the value of the sum of squares-within.
The sum of squares of subjects is basically the variation caused by differences in people’s individual characteristics, like sex, age, or genetic differences.
For example, let’s say we’re measuring blood pressure in two groups of people. People who are older tend to have higher blood pressure, so if there are lots of older people in one group and lots of younger people in the other group, then the individual blood pressure measurements in the first group will be higher than the measurements in the second group, and the sum of squares of subjects will be high.
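To see how the sum of squares-within splits into the sum of squares of subjects and the sum of squared error, here is a small Python sketch. It assumes the standard textbook formulas for a balanced repeated measures design, and the readings are hypothetical: they were invented for illustration, with three people chosen so that the time-point means match the example’s 138, 132, and 130.

```python
import math

# Hypothetical readings, invented purely for illustration (not study data):
# one row per person, one column per time point (time 1, time 2, time 3).
readings = [
    [140, 134, 131],
    [130, 126, 125],
    [144, 136, 134],
]

n_subjects = len(readings)
n_times = len(readings[0])

grand_mean = sum(sum(row) for row in readings) / (n_subjects * n_times)
time_means = [sum(row[j] for row in readings) / n_subjects for j in range(n_times)]
subject_means = [sum(row) / n_times for row in readings]

# Sum of squares-within: each reading compared with the mean of its time point.
ssw = sum(
    (readings[i][j] - time_means[j]) ** 2
    for i in range(n_subjects)
    for j in range(n_times)
)

# Sum of squares of subjects: each person's own mean compared with the
# grand mean, counted once per time point.
ss_subjects = n_times * sum((m - grand_mean) ** 2 for m in subject_means)

# Sum of squared error: what is left of each reading after removing both
# the time-point effect and the person effect.
sse = sum(
    (readings[i][j] - subject_means[i] - time_means[j] + grand_mean) ** 2
    for i in range(n_subjects)
    for j in range(n_times)
)

# The two pieces should add back up to the sum of squares-within.
assert math.isclose(ssw, ss_subjects + sse)
print(round(ssw, 2), round(ss_subjects, 2), round(sse, 2))
```

In this made-up data, most of the within-group variation comes from differences between people rather than from random error, which is the situation the sum of squares of subjects is meant to capture.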