# Mean, median, and mode

#### Content Reviewers:

Rishi Desai, MD, MPH, Yifan Xiao, MD

#### Contributors:

Sam Gillespie, Anju Paul

In statistics, it’s often helpful to know the central point of a set of data, because it gives a pretty good idea about the whole data set. It’s like a one-number summary of the data.

What’s the number of words on a page in a book? About 250. Of course it depends on the book, but that’s the one number summary.

The mean, median and mode are the most commonly used ways to measure this central point.

Let’s start with the mean, which is also called the average.

You can calculate the mean by adding up each value in a data set and then dividing by the total number of data points.

Let’s look at an example. Let’s say 7 students took a test on biostatistics and out of 100 possible points, one student got 17, another got 19, two got 20, two more got 21 and the last student got 22.

The mean score would be the total number of points they all got, added together, divided by the number of students, which is 7.

So that’s: $\frac{17+19+20+20+21+21+22}{7} = \frac{140}{7} = 20$.

To show this as a formula, we can say that the mean, written as $\bar{X}$ (X with a bar over it), is the total sum of the individual data points $X_1, X_2, \dots, X_n$, divided by $n$, which is the number of data points:

$$\bar{X} = \frac{X_1 + X_2 + \dots + X_n}{n}$$
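The formula can also be written as a few lines of Python. This is only a minimal sketch: the function name `mean` and the list name `scores` are illustrative choices, not something from the original lesson.

```python
def mean(values):
    """Return the arithmetic mean: the sum of the values divided by their count."""
    if not values:
        # The mean is undefined for an empty data set (n = 0).
        raise ValueError("mean requires at least one value")
    return sum(values) / len(values)


# The seven test scores from the example above.
scores = [17, 19, 20, 20, 21, 21, 22]

print(mean(scores))  # 20.0
```

Here `sum(values)` plays the role of $X_1 + X_2 + \dots + X_n$ and `len(values)` plays the role of $n$.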

A mean test score of 20 quickly tells us that, overall, these students didn’t do well on this test.

But the problem with the mean is that it can be influenced by an extreme value, called an outlier.

Let’s say that another student comes along and gets a perfect 100 out of 100 on the test.

That means that the average is now: 17+19+20+20+21+21+22+100 = 240 divided by 8, which is 30.

This one-number summary isn’t a very good summary, because 7 out of 8 students scored below 23, and the only reason the average is so high is this one high-scoring student.
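To see how far a single score can pull the mean, here is a small sketch of the same arithmetic in Python. The variable names are made up for illustration.

```python
# The original seven scores, then the same list with the perfect score added.
scores = [17, 19, 20, 20, 21, 21, 22]
scores_with_outlier = scores + [100]

mean_before = sum(scores) / len(scores)
mean_after = sum(scores_with_outlier) / len(scores_with_outlier)

# Count how many students scored below the new mean.
below_mean = sum(1 for s in scores_with_outlier if s < mean_after)

print(mean_before)  # 20.0
print(mean_after)   # 30.0
print(below_mean)   # 7
```

One extra value moves the mean from 20 to 30, and 7 of the 8 students end up below it.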

In this case, 100 is an outlier, and a data set that is pulled to one side by an outlier like this is called skewed data.

To calculate the central point when there may be outliers, you can use the median.
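As a rough illustration of why the median helps here, the sketch below uses `median` from Python's standard `statistics` module on the same scores, with and without the outlier. The list names are illustrative only.

```python
from statistics import median

scores = [17, 19, 20, 20, 21, 21, 22]
scores_with_outlier = scores + [100]

# With 7 values, the median is the middle (4th) value of the sorted list.
print(median(scores))               # 20

# With 8 values, it is the average of the two middle values, 20 and 21.
print(median(scores_with_outlier))  # 20.5
```

The perfect score moves the median only from 20 to 20.5, while it moves the mean from 20 to 30.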