Is the Scientific Method Broken? The Value of Replication

(Essay found in Nesselroade & Grimm, 2019; pp. 200–201)

This is another box in the series looking at the reproducibility crisis in the social, behavioral, and medical sciences. When researchers conclude that the null hypothesis cannot be rejected (also known as “failing to reject the null hypothesis”), the study’s findings are deemed “non-significant.” This term expresses the idea that any differences between the sample means of the various conditions in a study are not large enough to warrant rejecting the null hypothesis of no difference. (How large the differences between sample means must be before the null hypothesis can be rejected is a topic that will be carefully explored in future chapters.) Unfortunately, most journals in the social sciences are not interested in publishing research that has not found evidence of differences between conditions, the so-called non-significant findings. Each journal wants to include only articles that seem original and important and that are likely to be read and referenced by others.

This policy, however, can create a problem. Imagine several researchers working independently of each other, each of them looking at a similar research question. Furthermore, imagine their research hypothesis is, in the end, not a very good one. That is, perhaps the null hypothesis is actually true; the independent variable has no effect on the dependent variable. However, given the sampling error that naturally occurs whenever samples are drawn from a population, suppose one of the researchers obtains an extreme sample mean that prompts them to reject the null hypothesis. The researcher is in error; their sample mean was simply very unusual. However, they do not realize this and believe they are correct in rejecting the null hypothesis and claiming to have found an effect. If all of the researchers who were looking at the same topic published their findings, readers might become suspicious of the one study showing an effect among the many others that do not. However, readers will not be exposed to these other (non-significant) findings, because the journals will not publish them. The only study that will be published is the one that found evidence to reject the null hypothesis. Once we realize this, it is easy to see how the current publication practices in the social, behavioral, and medical sciences create the possibility that the findings of a number of published studies may not be reliable.
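The scenario described above can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions, not an analysis from the textbook: the number of research teams (1,000), the sample size per condition (20), the .05 significance threshold, and the use of a two-tailed z-test with a known population standard deviation are all choices made here for simplicity. In every simulated study the null hypothesis is true, and only studies with a significant result count as "published."

```python
import random
from statistics import NormalDist, mean

ALPHA = 0.05          # conventional significance threshold (assumed here)
N_PER_GROUP = 20      # hypothetical sample size per condition
N_LABS = 1000         # hypothetical number of independent research teams


def run_study(rng, n=N_PER_GROUP):
    """Simulate one two-condition study in which the null hypothesis is true.

    Both groups come from the same normal population (mean 0, SD 1), so any
    difference between the sample means is sampling error. Returns the
    two-tailed p-value of a z-test that treats the SD as known (SD = 1).
    """
    control = [rng.gauss(0, 1) for _ in range(n)]
    treatment = [rng.gauss(0, 1) for _ in range(n)]
    diff = mean(treatment) - mean(control)
    standard_error = (2 / n) ** 0.5  # sqrt(1/n + 1/n) when SD = 1
    z = diff / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


def simulate(n_labs=N_LABS, seed=0):
    """Run n_labs independent studies and apply a 'significant only' filter."""
    rng = random.Random(seed)
    p_values = [run_study(rng) for _ in range(n_labs)]
    # Journals in this toy model accept only significant results.
    published = [p for p in p_values if p < ALPHA]
    return p_values, published


if __name__ == "__main__":
    p_values, published = simulate()
    print(f"Studies run: {len(p_values)}")
    print(f"Studies published (p < {ALPHA}): {len(published)}")
    print(f"Share of all studies rejecting a true null: "
          f"{len(published) / len(p_values):.1%}")
```

Because the null hypothesis is true in every simulated study, roughly 5% of the studies should reach significance by chance, and every one of those "published" studies is a false alarm. A reader who sees only the published studies has no way of knowing how many non-significant attempts sit unpublished behind them.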

A potential remedy for this is the replication of published studies. Unfortunately, replication is not particularly valued in the profession and so is rarely performed. For example, according to Makel, Plucker, and Hegarty (2012), only about 1.6% of all publications in the top 100 psychology journals since 1900 even used the term “replication,” and the authors estimated that only about 1% were actual replication attempts. Simply stated, replications are rarely published. However, in recent years a growing number of researchers have been serving the scientific community by carefully and painstakingly replicating published research findings. A good example of this growing trend is the Center for Open Science (https://cos.io/). Hopefully, the value placed on replication efforts will continue to rise in the social, behavioral, and medical sciences in the wake of this reproducibility crisis.

Find this and other essays regarding “Is the Scientific Method Broken?” in the Nesselroade & Grimm textbook.

Makel, M. C., Plucker, J. A., & Hegarty, B. (2012). Replications in psychology research: How often do they really occur? Perspectives on Psychological Science, 7(6), 537-542.