With z Scores We Can Compare Apples and Oranges

(Essay found in Nesselroade & Grimm, 2019; pg. 138)

Is he taller than he is heavy? At first glance the question seems nonsensical, like comparing apples with oranges. It appears unanswerable because height and weight are different variables measured in different units. How can we say that 6 ft 2 in. is more or less than 145 lb? In statistics, however, we can compare the relative positions of scores in different distributions by using standardized scores. The z score transformation converts original scores from different scales to a common unit. That common unit is the z score: the number of standard deviations a raw score lies above or below the mean of its distribution.

Now, if we were told that a man’s height transforms to a z of +1.3 and his weight to a z of –0.42, could we answer the question, “Is he taller than he is heavy?” Yes, provided we first qualify the statement by saying that we are comparing each value relative to the distribution from which it came. His height is transformed into a z score using the mean and standard deviation of a distribution of heights; his weight is transformed using the mean and standard deviation of a distribution of weights. To say that he is taller than he is heavy, then, is to say that his height places him higher in the distribution of heights (1.3 standard deviations above the mean) than his weight places him in the distribution of weights (0.42 standard deviations below the mean).
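As a minimal sketch of the transformation described above, the Python snippet below computes z = (raw score - mean) / standard deviation for the man's height and weight. The essay gives only his raw scores and the resulting z values. The means and standard deviations used here are hypothetical, chosen solely so that the arithmetic reproduces the essay's z values of +1.3 and –0.42; they are not real population figures.

```python
def z_score(raw, mean, sd):
    """Return how many standard deviations `raw` lies from `mean`."""
    return (raw - mean) / sd

# Raw scores from the essay: 6 ft 2 in. is 74 inches; weight is 145 lb.
height_in = 6 * 12 + 2
weight_lb = 145

# Hypothetical distribution parameters (assumptions, not from the essay),
# picked so the results match the essay's z values.
HEIGHT_MEAN, HEIGHT_SD = 70.1, 3.0    # inches
WEIGHT_MEAN, WEIGHT_SD = 157.6, 30.0  # pounds

z_height = z_score(height_in, HEIGHT_MEAN, HEIGHT_SD)
z_weight = z_score(weight_lb, WEIGHT_MEAN, WEIGHT_SD)

print(f"height z = {z_height:+.2f}")
print(f"weight z = {z_weight:+.2f}")
print("taller than he is heavy:", z_height > z_weight)
```

Once both raw scores are on the common z scale, the comparison reduces to a single inequality, even though inches and pounds cannot be compared directly.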

We can also use z scores to compare performances on two different tests. Suppose one roommate is gloating a bit because they scored an 88 on a history exam while the other roommate scored only an 82 on a psychology exam. However, we suspect that the history exam was much easier than the psychology one. If we knew the means and standard deviations of both exams (and if we could assume that both sets of scores were normally distributed), we could see which roommate performed better in relation to the rest of their respective class.
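To make the roommate comparison concrete, here is a small Python sketch. The essay does not report the exam means or standard deviations, so the values below are invented for illustration only. The percentile step relies on the normality assumption mentioned above and uses `NormalDist` from Python's standard `statistics` module.

```python
from statistics import NormalDist

# Raw exam scores from the essay.
history_score, psych_score = 88, 82

# Hypothetical class statistics (assumptions, not from the essay):
# an easier history exam with a high mean, a harder psychology exam.
HISTORY_MEAN, HISTORY_SD = 85.0, 6.0
PSYCH_MEAN, PSYCH_SD = 70.0, 8.0

# Standardize each score against its own class distribution.
z_history = (history_score - HISTORY_MEAN) / HISTORY_SD
z_psych = (psych_score - PSYCH_MEAN) / PSYCH_SD

# If scores are normally distributed, the standard normal CDF gives the
# proportion of classmates scoring at or below each roommate.
standard_normal = NormalDist()  # mean 0, standard deviation 1
pct_history = standard_normal.cdf(z_history)
pct_psych = standard_normal.cdf(z_psych)

print(f"history:    z = {z_history:+.2f}, percentile = {pct_history:.0%}")
print(f"psychology: z = {z_psych:+.2f}, percentile = {pct_psych:.0%}")
```

Under these made-up numbers, the 82 in psychology (z = +1.5) is the stronger performance relative to its class than the 88 in history (z = +0.5), despite being the lower raw score. Different class statistics could reverse that conclusion.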

This way of comparing scores from different scales of measurement is very useful in the social and behavioral sciences as well as in education. We can ask, for instance, whether a person is more depressed than anxious, more paranoid than manic, or better at math than at reading. Although the scales are designed to tap different traits and abilities, and each scale has its own mean and standard deviation, standardizing the raw scores lets an examiner make cross-scale comparisons easily.

Find this and other similar side-bar discussions in the Nesselroade & Grimm textbook.