(Essay found in Nesselroade & Grimm, 2019, p. 324)
This is another box in the series exploring the various reasons for the current reproducibility crisis in the social, behavioral, and medical sciences. Fellow researchers sometimes wonder if the use of one-tailed tests in the literature occurs because it is the only way to reject the null hypothesis. The following study may be a case in point. Buttery and White (1978) were interested in the relationship between affective states (feelings) and biorhythms. According to biorhythm theory, people experience a 28-day emotional cycle. At the peak of the cycle, people are expected to be cheerful and optimistic. At the bottom of the cycle, people are prone to be irritable and negative.
Twenty participants were asked to provide ratings of 11 emotionally related concepts. Ratings were obtained from participants at both the high and low points of their emotional biorhythm. Since each participant supplied two scores, a dependent-samples t test was conducted. The number of paired scores is 20; therefore, df = 20 − 1 = 19. With alpha set at .05, the critical value for a two-tailed test when df = 19 is ±2.09. The critical value for a one-tailed test, with the same df, is 1.73. The authors’ obtained t (t_obt) was 1.76, which they reported as evidence to reject the null hypothesis given the use of a one-tailed test. The same value would not have reached significance under a two-tailed test, since 1.76 is less than 2.09.
This raises at least a couple of issues. First, how can readers know that the decision to run a one-tailed test was made ahead of time, for this study or any other one that uses a one-tailed test? Second, even if the authors decided on a one-tailed t test before the data were collected, this decision fails to allow for the possibility that the direction of the effect could have been contrary to their prediction. Perhaps people at the bottom of their cycle are more cheerful than when they are at the top of their biorhythm cycle. In an area that does not have strong theoretical or empirical reasons for expecting a directional finding, the use of a one-tailed t test is highly questionable. In this situation, the chance of a Type I error may be greater than 5 percent: if the direction of the test is chosen only after seeing which way the data fall, a difference in either direction can end up in the rejection region, and the effective Type I error rate approaches 10 percent. The libertarian view taken by many researchers toward the use of one-tailed tests may be another reason for the current reproducibility crisis in the social, behavioral, and medical sciences.
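For readers who want to check the numbers above, here is a minimal sketch in Python using only the standard library. It is not the authors' analysis: the helper names (t_pdf, t_cdf, t_critical) are made up for illustration, alpha = .05 is assumed, and the t distribution is approximated by numerical integration rather than taken from a statistics package. It recomputes the two critical values for df = 19, the p values for the reported t of 1.76, and the Type I error rate that results if the direction of a one-tailed test is picked after looking at the data.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x), via Simpson's rule on [0, |x|] and symmetry about zero."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def t_critical(upper_tail, df):
    """Value c with P(T > c) = upper_tail, found by bisection."""
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if 1 - t_cdf(mid, df) > upper_tail:
            lo = mid  # tail still too large, so c lies further out
        else:
            hi = mid
    return (lo + hi) / 2

df = 19        # 20 paired scores, so df = 20 - 1
t_obt = 1.76   # obtained t as reported in the essay
alpha = 0.05   # assumed significance level

crit_one = t_critical(alpha, df)        # one-tailed critical value (about 1.73)
crit_two = t_critical(alpha / 2, df)    # two-tailed critical value (about 2.09)
p_one = 1 - t_cdf(t_obt, df)            # one-tailed p for the predicted direction
p_two = 2 * p_one                       # two-tailed p
# If the direction is chosen after seeing the data, either tail can reject:
post_hoc_alpha = 2 * (1 - t_cdf(crit_one, df))

print(f"one-tailed critical t: {crit_one:.2f}")
print(f"two-tailed critical t: {crit_two:.2f}")
print(f"one-tailed p: {p_one:.3f}, two-tailed p: {p_two:.3f}")
print(f"Type I rate with post hoc direction: {post_hoc_alpha:.2f}")
```

The one-tailed p falls just under .05 while the two-tailed p does not, which is why the choice of test decides the outcome here. The last line shows the cost of choosing the tail after the fact: the rejection region then covers both tails beyond 1.73, doubling the nominal 5 percent rate.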
Find this and other essays regarding “Is the Scientific Method Broken?” in the Nesselroade & Grimm textbook.
Buttery, T. J., & White, W. F. (1978). Student teachers’ affective behavior and selected biorhythm patterns. Perceptual and Motor Skills, 46, 1033–1034.