Normal distribution and z-scores
Transcript
Let’s say you ask 1000 men for their weight, and then you plot their answers on a histogram, which is a plot that shows the distribution of any measurement or data.
Let’s say that the average weight is 170 pounds, or about 77 kilograms, and that it turns out that more men weighed close to that amount than any other, whereas fewer men weighed a little more or a little less than the average, and even fewer men weighed much more or much less than the average.
If we draw a curve over the top of our histogram, we get the normal distribution curve, which is also called the bell curve, because it’s shaped like a bell.
The bell curve is symmetrical, with half the data on the left of the average and half the data on the right side of the average.
The area under the bell curve is equal to 1, or 100%, with the highest percentage of data in the middle section and the lowest percentage of data in the outer tails of the curve.
Typically, for population data, the center point of a bell curve is labeled with the Greek letter mu, and mu refers to the mean, median, and mode, because when data are normally distributed, the mean, median, and mode are all equal to each other.
The standard deviation is a measure of how spread out the data are from the average, and for population data it’s represented by the Greek letter sigma.
For example, let’s say the standard deviation of weight for our sample of men is 29 pounds, or 13 kilograms.
In a normal distribution, about 68 percent of the data are within one standard deviation of the mean.
That means that 68 percent of men will weigh somewhere between 170 minus 29, or 141 pounds, and 170 plus 29, or 199 pounds.
Also, 95 percent of the data are found within two standard deviations - so, since 29 times 2 is 58, then 95 percent of men will weigh somewhere between 170 minus 58, or 112 pounds, and 170 plus 58, or 228 pounds.
Finally, 99.7 percent of the data are found within three standard deviations, and since 29 times 3 is 87, 99.7 percent of men will weigh between 170 minus 87, or 83 pounds, and 170 plus 87, or 257 pounds.
This is called the empirical rule, or the 68-95-99.7 rule.
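To check the arithmetic above, here is a minimal Python sketch. It assumes the weights follow an exactly normal distribution with the mean of 170 pounds and standard deviation of 29 pounds used in this example, and it relies on `NormalDist` from Python's standard `statistics` module. The helper name `interval_and_share` is made up for this illustration.

```python
from statistics import NormalDist

# Values from the example: mean 170 lb, standard deviation 29 lb.
MEAN_LB = 170
SD_LB = 29

# Assumption: weights are modeled as exactly normal.
weights = NormalDist(mu=MEAN_LB, sigma=SD_LB)

def interval_and_share(k):
    """Return (low, high, share) for the range within k standard deviations of the mean."""
    low = MEAN_LB - k * SD_LB
    high = MEAN_LB + k * SD_LB
    # Area under the curve between low and high.
    share = weights.cdf(high) - weights.cdf(low)
    return low, high, share

for k in (1, 2, 3):
    low, high, share = interval_and_share(k)
    print(f"within {k} SD: {low} to {high} lb, about {share:.1%} of men")
```

The computed shares come out slightly above 68 and 95 percent for one and two standard deviations, because the figures in the rule are rounded.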
Now, the shape of the bell curve depends on the size of the standard deviation.
A small standard deviation, like if it was only 5 pounds, tells you that most of the data are clustered around the average - and this makes the bell curve very tall and skinny.
On the other hand, a large standard deviation, like if it was 50 pounds, tells you that the data are spread far above and far below the average - and this makes the bell curve look very wide and flat.
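One way to see the tall-and-skinny versus wide-and-flat difference in numbers is to compare the height of the curve at the mean, and the share of data close to the mean, for different standard deviations. This is only a sketch: it assumes an exactly normal distribution centered at 170 pounds, the 10-pound margin is an arbitrary choice for illustration, and the helper names `peak_height` and `share_within` are made up here.

```python
from statistics import NormalDist

MEAN_LB = 170  # mean from the example

def peak_height(sd):
    """Height of the normal curve at its mean for a given standard deviation."""
    return NormalDist(mu=MEAN_LB, sigma=sd).pdf(MEAN_LB)

def share_within(sd, margin):
    """Fraction of the data within +/- margin pounds of the mean."""
    dist = NormalDist(mu=MEAN_LB, sigma=sd)
    return dist.cdf(MEAN_LB + margin) - dist.cdf(MEAN_LB - margin)

# 5 and 50 lb are the small and large examples; 29 lb is the original value.
for sd in (5, 29, 50):
    print(
        f"SD {sd} lb: peak height {peak_height(sd):.4f}, "
        f"within 10 lb of the mean: {share_within(sd, 10):.1%}"
    )
```

The smaller the standard deviation, the higher the peak and the larger the share of data sitting near the mean, while the total area under each curve stays equal to 1.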
Summary
The normal distribution is a continuous probability distribution that is symmetric about the mean, with a bell-shaped curve. About 68%, 95%, and 99.7% of the data lie within one, two, and three standard deviations of the mean, respectively. The normal distribution describes the occurrence of many natural phenomena.
A Z-score indicates the number of standard deviations between a certain value and the mean. A Z-score of 0 indicates that the data point is exactly at the mean. A Z-score of 1 indicates that the data point is one standard deviation above the mean, and a Z-score of -1 indicates that the data point is one standard deviation below the mean. Z-scores can be used to determine how unusual a data point is within a dataset that follows a normal distribution.
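In symbols, a Z-score is usually computed as z = (x - mu) / sigma, where x is the value, mu is the mean, and sigma is the standard deviation. Here is a minimal sketch of that calculation, assuming the mean of 170 pounds and standard deviation of 29 pounds from the weight example. The sample weights in the loop and the helper names `z_score` and `share_below` are hypothetical, chosen only for illustration.

```python
from statistics import NormalDist

MEAN_LB = 170  # mean from the weight example
SD_LB = 29     # standard deviation from the weight example

def z_score(value, mean=MEAN_LB, sd=SD_LB):
    """Number of standard deviations between value and the mean."""
    return (value - mean) / sd

def share_below(z):
    """Fraction of a normal distribution that falls below a given Z-score."""
    # NormalDist() with no arguments is the standard normal (mean 0, SD 1).
    return NormalDist().cdf(z)

# Hypothetical weights in pounds.
for weight in (141, 170, 199, 228):
    z = z_score(weight)
    print(f"{weight} lb: z = {z:+.1f}, about {share_below(z):.1%} of men weigh less")
```

A Z-score far from 0 in either direction marks an unusual value: under these assumptions, only a small percentage of men would weigh more than someone with a Z-score of 2.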