Have you ever wondered where to start for being a successful data analyst? Here is an article that shows everything about data analysis.
What Is The Z Score or The Standard Score?
The standard score or the z score is a way to calculate how far the actual score deviates from the mean in standard deviation units. It also allows us to compare two samples from different populations. The z score can be negative or positive. The negative z score means the value is below the arithmetic value and the positive z score means our value is above the arithmetic value.
The z score is calculated by subtracting the mean of the data set from the data point we want to calculate and divide by the standard deviation of the data set.
What Does Z Scored Use For In Our Everyday Life?
Not everyone is a statistician or a mathematician but everyone deals with data sets almost every day. Whether it is your screen time, final exam, or the class or how much you spent on clothing last year. It means we also use statistical measurements like mean or the deviation.
But where to use the z score?
We can use it while comparing two different data sets with different distributions. For example:
We have two chefs mastered in the art of making soufflé. Let’s say our first chef’s name is Sasha and the other is David. David participated in a cooking contest and gets a score of 472 out of 500. Sasha also participates in this kind of contest and gets the score 82 out of 100. Now we have to decide who was more successful. But how? David’s score is higher than Sasha’s in value but we can’t compare them like that. In these situations, we turn the raw scored into z scores. This way we can calculate who did better at the contest.
In the contest where David competes, the mean was 392 and the standard deviation was 100. To calculate the standard score we subtract 392, the mean, from David’s score 472 and divide it to standard deviation. The right equation for that is: “472 -392 /100″ which is +0.08. David’s score is 0, 08 standard deviation above the average.
In Sasha’s contest, the average was 52 and the standard deviation was 30. Now we apply the same formula we used for David’s contest. We subtract the mean from the score and divide it into the standard deviation. The equation for that is: ” 82 -52 /30 “. Sasha’s z score is +1.0. It means Sasha is 1.0 standard deviation above the average.
Now that we can compare David and Sasha’s score easily. David was 0. 8 standard deviation above and Sasha is 1.0 standard deviation above the mean. It means Sasha is a better soufflé baker then David; this is what statistics tell us. But maybe we shouldn’t decide without trying both of them.
“Another area of using the standard score is measuring the growth of children around the world. It is used to calculate weight and height for age or height for weight. Children who fall below the -2 or -3 standard deviation are diagnosed with malnutrition.”1
Tip: Always add minus or plus to z scores and try to convert them to two-digit numbers after 0,x.And you should always add the 0 before the comma if the value is smaller than 1. For example, 0, 8798 above average must be converted to +0, 88.
Two Compare the Frequency of A Score
By looking at the z scale we just made I can determine how many of the contestants get the score of Sasha or David. Looking at the z scale tells me people who took David’s score is more than people who took Sasha’s score. Because most of the data is gathered around 1 standard deviation below or above the mean.
It Allows Us To Guess The Score
By looking at the z scale we can determine if it is likely for Sasha or David to take their scores. We can use this in hypothetical measurements.
“But how did converting raw data to z scores helped us compare them? It is because z score distribution always has a mean of 0 and a standard deviation of 1.00. Even if the original distribution is not normal the mean is 0 and the standard deviation is 1.00. This helps us standardize and compare two different sets of data even if they measured differently like in our example.
Most data in a z score distribution is between below and above 1.00 standard deviations from the mean. There are fewer data are between 2 standard deviations above and below the mean and it is very rare to find data greater than plus and minus 3 standard deviations.”2
Why Do We Need The Z Score?
As we can tell the examples from above we need z score to compare two different data sets by standardizing the data from any type of measurement.
Converting your data to the z score is basically re-scales and standardize your data. This is why it is also called a standard score.
We can imagine a z score as a cookie-cutter. You can cut cookie dough or styrofoam but both will still look like the gingerbread man.
How To Calculate Z Score Mathematically?
Z score or standard score is calculated by data point minus the man divided by the standard deviation. But what are these terms mean?
When we are collecting individual information about something we call the whole group data, data set, or population. In this article all data is numeric but data can be any type of information in real life. One single piece of data can be called data point or individual and I will refer to it in both these ways so don’t get confused.
The mean is “average” in statistics. It is calculated by adding up all the data and divide it by the number of data. I will give an example and I strongly advise you to follow up and calculate with me during these. For example let’s say we curious about how many hours we spend in front of the screen every day so we wrote down our daily screen time for a week 2, 2, 5, 3, 7, 1, and 1. We add up all these numbers and the result is 21. Now we have 7 individual data points so we divide 21 to 7 and find the result 3. We spend 3 hours on average in front of the screen every week.
To have more information on the average and how to calculate it in Excel you can check our How to use Excel AVERAGE function article.
Another thing we need to find to determine the z score is the standard deviation. Standard deviation is a statistical term to determine the gap between the average and the actual data. A small standard deviation means the data sen is consistent and normal. Data points are close to each other. A high standard deviation means the data set is inconsistent, unexpected, and sometimes unreliable. It means individuals in the data set are not close but spread through the number line.
To calculate standard deviation we subtract all the numbers from the mean and square them to prevent getting negative values. We add up these squared numbers and divide them to one less of the number of individuals in our data set.
For our screen time example we can explain it like that:
The mean is 3 so we will subtract 3 from each data point. We spend 2 hours in front of the screen on Monday so it means “2 -3 = -1”. Now we square the “- 1” result we found and reach the number 1.
We will apply the same formula for each individual in our data set. Then we will add up these numbers and get the result of √30. We will divide it into one less of the number of data points which is 6. This will bring us the conclusion of 0,9. This is our standard deviation value.
To have more information on the standard deviation and how to calculate it in excel you can visit our How To Calculate Standard Deviation in Excel article.
Now that we have the mean and the standard deviation we can calculate the z score.
Carrying on with the screen time example we will calculate the z score of Monday. To do that we should subtract the mean from our data point and divide it to standard deviation value. For us, it means “2 -3 = -1″. Now we divide -1 to 0,9 which is our standard deviation value, ” -1 /0,9 = -1,1 “. The result is -1,1. It means our screen time on Monday is more than 1 and less than 2 standard deviations away from the mean.
How To Calculate The Z Score In Excel?
As we explained in the previous paragraphs z score is calculated by data point minus mean divided by the standard deviation. To calculate z score in excel first we need to write our data in column one under the other.
We can calculate the z score in excel by using specific formulas. The good thing about using excel formulas is that we can adapt the formulas for large data sets that. Which allows us to calculate quicker and easier. Let’s learn how to do that step by step.
Step 1: Determine Your Data Set
Let’s say we are calculating final exams of English language and literature students and let’s say their scores are: 72, 90, 87, 65, 50, 78, 82, 63, 55, and 80. These ten data points are written from column A and row 2 to column A row 11. I chose to write “Test Scores” in A1, “Z score” in B1, and “Z score” in C1 in our sheet to sort things easily. I added two “Z score” columns because I will show you how to calculate it both manually and automatically.
Tip: While you write formulas the cell you refer to will highlight so won’t need to double-check everything you write.
Step 2: Calculate The Mean
Since we will use the mean to determine the z score for our data set we need to calculate the mean. Mean means mathematical average in a group of data. Mathematically it means adding up all the numbers and divide them to 10 but in excel we use a simple formula for calculating quickly. Let’s write “Mean” to a cell other than columns A, B, or C. It can be any row and column other than these. I choose F2 for this example. Click to the blank cell next to it, which is G2 for us, and type the formula to formula space above our sheet for calculating the mean:
Press enter and we have the mean. It is 72,2
Step 3: Calculate The Standard Deviation
After that, we have calculated the standard deviation. Standard deviation is the measurement that shows us how much and individual data is away from the mean. Mathematically we calculate it by subtracting all the numbers from the mean and square them so all the results will be positive instead of negative. It will be added up all these squared numbers to each other and divide them to one less of the number of individuals in our data set (which is 10 in this example). Excel has a formula for that too of course.
Let’s write “Standard Deviation” to a cell. I picked F3 for our practice. We are going to blank cell next to it, which is G3, and type our formula to the formula space above our sheet.
For those who use the latest versions of excel the formula should be:
But for the prior versions, we use the STDEV formula instead of the STDEV.S formula. Which is:
Press and enter and the standard deviation is approximately 13,55
Tip: If you’re not sure if you’re using the latest versions or not don’t worry, excel tells you to change to the formula you use to an up to date formula.
STEP 4: Calculating The Z Score Manually
Not that we have calculated the mean and the formula we can start calculating the z score.
We can start to determine our fist individual’s, 72 for us, z score. We will subtract the mean from the individual and divide it into the standard deviation.
A quick reminder that our mean was in the G2 cell and the standard deviation value was in the G3 cell.
So what we are going to do is clicking the blank B2 cell and type
=(A2- G2)/ G3
And this will give us the result, -0,01476. The z score for our first individual is -0,01476.
Now we can calculate other individuals’ z scores by entering the same formula except by putting the cell for which one we are calculating instead of A2. But there is an easier way to calculate all of them.
We click the cell with the formula and we lock the standard deviation value cell and the mean value cell. We use the dollar symbol ($) to do that. We put a dollar sign next to the column letter and row number for G2 and G3. We did not lock the A2 column because it has to change for each individual we will calculate. Our formula should look like this:
Now that we locked our cells we can click the bottom right corner of the B2 cell and drag it down to B11. Now we have z scores for our whole data set.
Step 5: STANDARDIZE Formula for The Z Score
This was the manual way to calculate the z score. You can also use the STANDARDIZE formula for determining it.
Let’s say we use the same data set. 72, 90, 87, 65, 50, 78, 82, 63, 55, and 80 from cell A2 to cell A11. Our mean is in the G2 cell and the standard deviation value is in the G3 cell.
We created two “Z score” columns in the beginning. So Whichever column you filled in, you need to fill in the empty column next to it. In our practice, it is the column with the letter C.
We click blank C2 cell and type the STANDARDIZE formula to the formula space above our sheet:
=STANDARDIZE (A2, G2, G3)
We press enter and the value is the same with our other z score column. It is -0,01476.
We can do it for the whole data set in the same way before. We lock the mean value cell and the standard deviation value cell by adding the dollar sign ($) next to the column letter and row number. Again we will not lock the data point’s cell because we want it to change for every cell. Now our formula should look like this:
Now that we locked our cells we can click the bottom right corner of the C2 cell and drag it down to C11. Now we have z scores for our whole data set.