Stats Bites: Distributions

[et_pb_section fb_built=”1″ admin_label=”Section” _builder_version=”4.0.11″ min_height=”207px” custom_padding=”4px||0px|||” locked=”off”][et_pb_row column_structure=”2_3,1_3″ _builder_version=”4.0.11″ width=”100%” custom_padding=”1px||17px|||”][et_pb_column type=”2_3″ _builder_version=”4.0.11″][et_pb_text admin_label=”Bio” _builder_version=”4.0.11″ header_font_size=”34px”]

In this bite you will learn what distributions mean in data, how to recognize whether your data is normally distributed and, finally, how to compare scores on a standard normal distribution.

 

[/et_pb_text][/et_pb_column][et_pb_column type=”1_3″ _builder_version=”4.0.11″][et_pb_image src=”https://learningspaces.dundee.ac.uk/ctil/files/2016/03/icon_graph.png” admin_label=”Icon” _builder_version=”4.0.11″ width=”65%” module_alignment=”center” hover_enabled=”0″ align=”center”][/et_pb_image][/et_pb_column][/et_pb_row][/et_pb_section][et_pb_section fb_built=”1″ admin_label=”S1″ _builder_version=”4.0.11″ custom_margin=”-3px|||||” custom_padding=”8px||10px|||” locked=”off”][et_pb_row _builder_version=”4.0.11″ width=”100%” custom_margin=”-17px|auto||auto||”][et_pb_column type=”4_4″ _builder_version=”4.0.11″][et_pb_text admin_label=”The normal distribution” _builder_version=”4.0.11″]

Normal distribution

This is probably the most important type of distribution in statistics. Many naturally occurring measurements follow it, at least approximately. Normally distributed data takes the form of a bell-shaped curve, with the peak (the mean) in the middle and data values falling symmetrically on either side of the mean, tapering off equally at each end. Many variables are treated as approximately normally distributed, for example IQ, height, exam scores and blood pressure.

[/et_pb_text][/et_pb_column][/et_pb_row][et_pb_row column_structure=”1_2,1_2″ _builder_version=”4.0.11″ width=”100%”][et_pb_column type=”1_2″ _builder_version=”4.0.11″][et_pb_image src=”https://learningspaces.dundee.ac.uk/ctil/files/2016/03/cropped-image-distribution.jpeg” admin_label=”Distribution diagram” _builder_version=”4.0.11″ hover_enabled=”0″][/et_pb_image][/et_pb_column][et_pb_column type=”1_2″ _builder_version=”4.0.11″][et_pb_text admin_label=”Explanation of diagram” _builder_version=”4.0.11″]

Take IQ scores, for example. The majority of the population will have average intelligence scores, so most people are clustered around the mean. Exceptionally low or exceptionally high scores are far less common and therefore form the tail ends of the distribution. The shape of a normal distribution is described by two simple parameters: the mean and the standard deviation. The mean defines the location of the peak, whilst the standard deviation determines how wide the curve is because, as you may remember, it tells us how spread out scores are from the mean. You can visualize exactly how this works by manipulating this interactive graph.
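To see the effect of the two parameters in numbers, here is a minimal Python sketch using the standard library's NormalDist class. The mean of 100 and the standard deviations of 10 and 20 are illustrative values chosen for this example, not figures from this bite.

```python
from statistics import NormalDist

# Two hypothetical curves with the same mean but different spreads.
# The values 100, 10 and 20 are assumptions made for illustration only.
narrow = NormalDist(mu=100, sigma=10)
wide = NormalDist(mu=100, sigma=20)

# Height of each curve at the mean and at 30 points either side of it.
for x in (70, 100, 130):
    print(x, round(narrow.pdf(x), 4), round(wide.pdf(x), 4))
```

Both curves peak at the mean and are equally high at 70 and 130. The curve with the smaller standard deviation is taller at the peak and lower in the tails, which is what a narrower bell looks like.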

[/et_pb_text][/et_pb_column][/et_pb_row][et_pb_row column_structure=”1_2,1_2″ _builder_version=”4.0.11″ width=”100%” custom_margin=”56px|auto||auto||”][et_pb_column type=”1_2″ _builder_version=”4.0.11″][et_pb_text admin_label=”Other types of distribution” _builder_version=”4.0.11″]

What about other types of distribution?

The normal distribution is the most frequently mentioned type of distribution, and with good reason. A rough, over-simplified explanation of its importance is that it is easy for mathematicians and statisticians to work with. Many statistical tests assume a normal distribution and are likely to produce unreliable results when data is not approximately normally distributed. But not all data is normally distributed. Data can take many shapes, such as right-skewed, left-skewed, uniform and bimodal. Check out the video below to find out about these other types of distribution and to get a recap of the normal distribution.
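One rough way to spot skew in your own data is to compare the mean with the median and to compute a skewness value. The Python sketch below is only an illustration: the helper name skewness and both small data sets are made up for this example, and it is not a formal test of normality.

```python
from statistics import mean, median, pstdev

def skewness(values):
    """Moment skewness: the average cubed z-score (about 0 for symmetric data)."""
    m = mean(values)
    s = pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

# Hypothetical data sets, invented for illustration.
symmetric = [1, 2, 3, 4, 5, 6, 7]
right_skewed = [1, 1, 2, 2, 3, 4, 12]

for name, data in (("symmetric", symmetric), ("right skewed", right_skewed)):
    print(name, "mean:", round(mean(data), 2), "median:", median(data),
          "skewness:", round(skewness(data), 2))
```

For symmetric data the mean and median agree and the skewness is close to 0. A long right tail pulls the mean above the median and gives a positive skewness. A long left tail does the opposite.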

[/et_pb_text][/et_pb_column][et_pb_column type=”1_2″ _builder_version=”4.0.11″][et_pb_video src=”http://www.youtube.com/watch?v=bPFNxD3Yg6U” admin_label=”Shape of data video” _builder_version=”4.0.11″][/et_pb_video][/et_pb_column][/et_pb_row][/et_pb_section][et_pb_section fb_built=”1″ admin_label=”S2″ _builder_version=”4.0.11″ custom_padding=”14px||9px|||” locked=”off”][et_pb_row _builder_version=”4.0.11″ width=”100%”][et_pb_column type=”4_4″ _builder_version=”4.0.11″][et_pb_text admin_label=”Standard normal distribution” _builder_version=”4.0.11″]

Standard normal distribution and z-scores

The standard normal distribution is a special case of the normal distribution. Let’s set the scene a little first. Imagine you are an evil lecturer who gives your class a statistics test on a Monday morning and a biology test the same afternoon. One of your students, Alex, wants to know which test he did better on. You know that Alex scored 70 on the statistics test and 85 on the biology test. You also know the mean and standard deviation for both tests. However, comparing the raw scores is meaningless unless they are on the same scale. The two tests may have different totals and different spreads of marks, so comparing a score of 70 with a score of 85 is pointless unless we know what each is relative to. To put the scores on the same scale we must standardize the raw scores by converting them into z-scores. The formula is simple: z-score = (raw score – mean score) ÷ standard deviation.

[/et_pb_text][/et_pb_column][/et_pb_row][et_pb_row column_structure=”1_2,1_2″ _builder_version=”4.0.11″ width=”100%” custom_padding=”||66px|||”][et_pb_column type=”1_2″ _builder_version=”4.0.11″][et_pb_text admin_label=”Standard normal distribution example” _builder_version=”4.0.11″]

So let’s say that the mean score for the statistics test was 50 with a standard deviation of 10 marks, and the mean score on the biology test was 75 with a standard deviation of 15. Our calculations would look as follows:

Z_statistics = (70 – 50) ÷ 10 = 2
Z_biology = (85 – 75) ÷ 15 ≈ 0.67
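
The two calculations above can be reproduced with a short Python sketch. The function name z_score is simply a label chosen for this example. The numbers are Alex's scores and the test means and standard deviations given above.

```python
def z_score(raw, mean, sd):
    """Return how many standard deviations `raw` lies above (+) or below (-) `mean`."""
    # Subtract the mean first, then divide by the standard deviation.
    return (raw - mean) / sd

z_statistics = z_score(70, 50, 10)
z_biology = z_score(85, 75, 15)

print(z_statistics)          # 2.0
print(round(z_biology, 2))   # 0.67
```

Note the brackets around the subtraction. Without them, the division would be carried out first and give the wrong answer.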

OK, so what does this mean? Although Alex’s raw score for biology was higher than his score for statistics, the corresponding z-score for statistics is greater. What this tells us is that, relative to the average score, Alex actually did better on the statistics test: 2 standard deviations better than average, to be exact, compared with about 0.67 standard deviations for biology. You can see how the scores compare when we look at them on the standard normal distribution.

The standard normal distribution is simply the normal distribution with a common base: a mean of 0 and a standard deviation of 1. Using the standard normal distribution lets us compare values from different scales in a way that is meaningful. Check out the video below to learn more about the standard normal distribution, z-scores and percentiles.
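
As a rough illustration of how z-scores relate to percentiles, the Python sketch below looks up the share of the standard normal distribution that lies below each of Alex's z-scores. It assumes that the marks on both tests are approximately normally distributed, which this bite does not state, so treat the percentages as indicative only.

```python
from statistics import NormalDist

# The standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

# Alex's z-scores from the worked example: (70 - 50) / 10 and (85 - 75) / 15.
z_scores = {"statistics": (70 - 50) / 10, "biology": (85 - 75) / 15}

for test, z in z_scores.items():
    share_below = standard_normal.cdf(z)  # proportion of scores below z
    print(f"{test}: z = {z:.2f}, roughly {share_below:.0%} of scores fall below")
```

Under that assumption, a z-score of 2 sits at roughly the 98th percentile and a z-score of 0.67 at roughly the 75th.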

[/et_pb_text][/et_pb_column][et_pb_column type=”1_2″ _builder_version=”4.0.11″][et_pb_image src=”https://learningspaces.dundee.ac.uk/ctil/files/2016/03/SD.png” admin_label=”Standard distribution diagram” _builder_version=”4.0.11″ hover_enabled=”0″][/et_pb_image][/et_pb_column][/et_pb_row][et_pb_row _builder_version=”4.0.11″ width=”100%”][et_pb_column type=”4_4″ _builder_version=”4.0.11″][et_pb_video src=”http://www.youtube.com/watch?v=uAxyI_XfqXk” admin_label=”Z-scores and percentiles video” _builder_version=”4.0.11″][/et_pb_video][/et_pb_column][/et_pb_row][/et_pb_section][et_pb_section fb_built=”1″ admin_label=”Outro” _builder_version=”4.0.11″ custom_padding=”7px|||||” locked=”off”][et_pb_row _builder_version=”4.0.11″ width=”100%”][et_pb_column type=”4_4″ _builder_version=”4.0.11″][et_pb_text admin_label=”If you need more help” _builder_version=”4.0.11″]

Still need a little help?

Haven’t found what you were looking for? Try browsing some of the other online resources on the home page. You can also get one-to-one support by contacting your tutor or attending one of the Math Base drop-ins held regularly throughout the semester.[/et_pb_text][/et_pb_column][/et_pb_row][/et_pb_section]