William Gosset

(Essay found in Nesselroade & Grimm, 2019; pp. 267–268)

William Gosset (1876–1937) developed the t distribution as well as the independent- and dependent-samples t tests. After receiving a degree in chemistry and mathematics from Oxford, Gosset was hired by the Guinness brewery in Dublin in 1899. Around the turn of the twentieth century, many companies, especially in the agricultural industry, attempted to apply a scientific approach to product development. A typical research question would have been, “Which fertilizer will produce the largest corn yield?” or “What is the best temperature to brew ale so as to maximize its shelf life?” Until Gosset’s work, statisticians dealt with very large numbers of observations, in the hundreds and thousands. Conventional wisdom held that one should take a very large sample, compute the mean and standard deviation, and refer to the z table to make probability statements. The problem that confronted Gosset was how to make inferences about the difference between population means when sample sizes were small. For example, suppose ten plots of barley are treated with one fertilizer and ten plots are treated with another fertilizer. Before Gosset, there was no way to determine from such small samples whether the difference in yield was due to sampling fluctuation (chance) or to the effect of the brand of fertilizer.
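The fertilizer comparison can be sketched with the independent-samples t statistic in its pooled-variance form, which is how the test is usually taught today. This is only an illustration: the yield figures are invented, the function name `pooled_t` is hypothetical, and the critical value 2.101 is the standard two-tailed 5 percent table entry for 18 degrees of freedom.

```python
import math
import statistics

def pooled_t(a, b):
    """Independent-samples t statistic, assuming equal population variances.

    Returns the statistic and its degrees of freedom.
    """
    na, nb = len(a), len(b)
    df = na + nb - 2
    # statistics.variance uses the n - 1 denominator (sample variance).
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / df
    std_error = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / std_error, df

# Hypothetical barley yields for ten plots per fertilizer (invented numbers).
fert_a = [52.1, 49.8, 55.3, 50.6, 53.9, 48.7, 54.2, 51.5, 50.1, 52.8]
fert_b = [49.5, 47.2, 51.8, 48.9, 50.3, 46.6, 52.0, 49.1, 47.8, 50.4]

t_stat, df = pooled_t(fert_a, fert_b)
T_CRIT_DF18 = 2.101  # two-tailed 5% critical value for df = 18 (t table)

verdict = "beyond" if abs(t_stat) > T_CRIT_DF18 else "within"
print(f"t = {t_stat:.3f} on {df} df, {verdict} the 5% critical value")
```

A statistic beyond the critical value suggests the difference in mean yield is larger than sampling fluctuation alone would typically produce.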

To test the mean of one sample against a specified population value or test the difference between two sample means, the t table (instead of the z table) is used to find critical values and make probability statements when σ is unknown. In his seminal 1908 article, “The Probable Error of a Mean,” Gosset addressed the problem of small samples: “As we decrease the number of experiments, the value of the standard deviation found from the sample of experiments becomes itself subject to an increasing error, until judgments reached in this way become altogether misleading” (Student, 1908; p. 2). He realized that the standard normal curve, on which the z table is based, leads to inaccurate judgments about the area under the curve of a sampling distribution when sample sizes are small and σ is unknown. Gosset stated the purpose of his paper as follows: “The aim of the present paper is to determine the point at which we may use the tables of the probability integral in judging of the significance of the mean of a series of experiments, and to furnish alternative tables for use when the number of experiments is too few” (p. 2). (The “tables of the probability integral” are the z table, and the “alternative tables” are the newly developed t table.) Gosset’s use of the term “significance” was prophetic, since at that time the concept of significance testing had not yet been developed. The conventional use of the 5 percent level of significance emerged over the next 25 years.
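The small-sample problem can be seen in a short simulation. This is a sketch under assumed settings, not a reconstruction of Gosset's own procedure: samples are drawn from a normal population whose true mean is zero, σ is estimated from each sample, and the resulting statistic is compared against both the z cutoff and the t cutoff. The function name `false_alarm_rate`, the seed, and the trial counts are illustrative choices. The value 2.776 is the standard two-tailed 5 percent t-table entry for 4 degrees of freedom.

```python
import math
import random
import statistics

def false_alarm_rate(n, critical, trials=20000, seed=1908):
    """Share of samples of size n, drawn from N(0, 1), whose statistic
    mean / (s / sqrt(n)) exceeds the critical value in absolute size.

    The null hypothesis (mean = 0) is true, so every exceedance is a
    false alarm.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # sigma is unknown, so the sample standard deviation stands in.
        std_error = statistics.stdev(sample) / math.sqrt(n)
        if abs(statistics.mean(sample) / std_error) > critical:
            hits += 1
    return hits / trials

Z_CRIT = statistics.NormalDist().inv_cdf(0.975)  # about 1.96
T_CRIT_DF4 = 2.776  # two-tailed 5% critical value for df = 4 (t table)

rate_small_z = false_alarm_rate(5, Z_CRIT)
rate_small_t = false_alarm_rate(5, T_CRIT_DF4)
rate_large_z = false_alarm_rate(100, Z_CRIT, trials=5000)

print(f"n = 5,   z cutoff: {rate_small_z:.3f}")
print(f"n = 5,   t cutoff: {rate_small_t:.3f}")
print(f"n = 100, z cutoff: {rate_large_z:.3f}")
```

With five observations, the z cutoff should reject a true null hypothesis noticeably more often than the nominal 5 percent, while the t cutoff should stay close to it. With a hundred observations the z cutoff is nearly adequate, which is why large-sample practice worked before Gosset.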

Gosset’s classic 1908 article is one of the most important publications in the history of inferential statistics. “With one stroke, he: (1) discovered a new statistical distribution; (2) invented a statistical test that became the prototype for a whole series of tests, including analysis of variance; and (3) extended statistical analysis to small samples.…” (Tankard, 1984, p. 99). Although the t test is one of the cornerstones of modern statistics, Gosset’s work was not greeted with enthusiasm. Fisher, the originator of the analysis of variance, described the reaction of colleagues as “weighty apathy” (Fisher, 1939, p. 5), and Cochran stated that “the t distribution did not spread like wildfire” (Cochran, 1976, p. 13). Even Gosset underestimated the impact his discoveries would have; he wrote to Fisher, “I am sending you a copy of Student’s Tables as you are the only man that’s ever likely to use them!” (Gosset, 1970; Letter 11).

An interesting aspect of Gosset’s work is that he published under the pseudonym “Student.” Not wanting competitors to learn of its scientific work, Guinness forbade its scientists from publishing. As a result, Gosset secretly published all his articles under that name, which is why the t test is also known as “Student’s t test.”

Gosset remained with Guinness until his death, assuming the position of head brewer a few months before he died in 1937.

Find this and other spotlights on important statisticians in the Nesselroade & Grimm textbook.

Cochran, W. G. (1976). “Early Development of Techniques in Comparative Experimentation.” In On the History of Statistics and Probability, edited by D. B. Owen. New York: Marcel Dekker.

Fisher, R. A. (1939). “Student.” Annals of Eugenics 9, pp. 1–9.

Gosset, W. S. (1970). Letters from W. S. Gosset to R. A. Fisher, 1915–1936. Issued for private circulation. Dublin: Arthur Guinness.

Student (1908). “The Probable Error of a Mean.” Biometrika 6, pp. 1–25.

Tankard, J. W. (1984). The Statistical Pioneers. Cambridge, MA: Schenkman.

Paul Nesselroade
