In statistics, it’s often helpful to know the central point of a set of data, because it gives a pretty good idea about the whole data set. It’s like a one-number summary of the data.

What’s the number of words on a page in a book? About 250. Of course it depends on the book, but that’s the one-number summary.

The mean, median and mode are the most commonly used ways to measure this central point.

Let’s start with the mean, which is also called the average.

You can calculate the mean by adding up each value in a data set and then dividing by the total number of data points.

Let’s look at an example. Let’s say 7 students took a test on biostatistics and out of 100 possible points, one student got 17, another got 19, two got 20, two more got 61 and the last student got 62.

The mean score would be the sum of all the points the students got, divided by the number of students, which is 7.

So that’s: 17+19+20+20+61+61+62 = 260, divided by 7, which is about 37.14.

To show this as a formula, we can say that the mean, written as X with a bar over it, is the sum of the individual data points X1, X2, ..., Xn, divided by n, which is the number of data points: $\bar{X} = \frac{X_1 + X_2 + \cdots + X_n}{n}$.
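
As a rough sketch, here’s how that formula could look in Python, using the seven test scores from the example. The names `scores`, `mean` and `x_bar` are just illustrative choices and aren’t taken from any particular library.

```python
# Test scores of the seven students from the example
scores = [17, 19, 20, 20, 61, 61, 62]


def mean(values):
    """Add up every value, then divide by how many values there are."""
    return sum(values) / len(values)


total = sum(scores)    # X1 + X2 + ... + Xn
n = len(scores)        # number of data points
x_bar = mean(scores)   # total divided by n

print(f"sum = {total}, n = {n}, mean = {x_bar:.2f}")
# prints: sum = 260, n = 7, mean = 37.14
```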

A mean test score of 37.14 quickly tells us that, overall, these students didn’t do well on this test.

But the problem with the mean is that it can be influenced by an extreme value, called an outlier.

Let’s say that another student comes along and gets a perfect 100 out of 100 on the test.

That means that the average is now: 17+19+20+20+61+61+62+100 = 360 divided by 8, which is 45.

This one-number summary isn’t a very good summary, because the only reason it’s so high is this one high-scoring student.
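
To see how much a single score can move the mean, here’s a small Python sketch that compares the mean with and without the 100. It uses `mean` from Python’s standard `statistics` module; the variable names are only for illustration.

```python
from statistics import mean

# The original seven scores, then the same scores plus the perfect 100
scores = [17, 19, 20, 20, 61, 61, 62]
with_outlier = scores + [100]

mean_before = mean(scores)        # 260 divided by 7
mean_after = mean(with_outlier)   # 360 divided by 8

print(f"mean without the 100: {mean_before:.2f}")
print(f"mean with the 100:    {mean_after:.2f}")
print(f"shift from one score: {mean_after - mean_before:.2f}")
```

One extra student out of eight pulls the mean up by almost 8 points, even though none of the other scores changed.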

In this case, 100 is an outlier, and a data set that’s pulled to one side by an outlier like this is often described as skewed.

To calculate the central point when there may be outliers, you can use the median.