In October 2011, Joseph Simmons, a psychologist at the Wharton School of the University of Pennsylvania, published a clearly preposterous result in the respectable journal Psychological Science. Together with Uri Simonsohn, also at Wharton, and Leif Nelson of the University of California, Berkeley, Simmons showed that people who listened to the Beatles song “When I’m Sixty-Four” grew younger, by nearly 18 months. But if the result was laughable, the point of the paper was serious: to show how standard scientific methods could generate scientific support for just about anything.
In the years before, the trio had slowly lost faith in the stream of neat findings in psychology. “Leif, Uri, and I all had the experience of reading papers and simply not believing them,” Simmons says. It seemed unlikely that it could all be down to fraud. “After much discussion, our best guess was that so many published findings were false because researchers were conducting many analyses on the same data set and just reporting those that were statistically significant,” they recently wrote. They termed the behavior, which also produced their nonsense result, “p-hacking,” a reference to the p-value, the probability of obtaining a result at least as extreme purely by chance; by convention, a result with a p-value below 0.05 counts as statistically significant.
The paper came at a critical time for psychologists. Earlier that year, another paper using standard methods had shown that extrasensory perception was a real phenomenon—a result the authors meant seriously, to the dismay of other psychologists. “If you use the techniques that everyone is using in their normal research … and it supports the existence of bullshit, then there is good reason to think that that method is wrong more generally and you shouldn’t be using it,” says Chris Chambers, a neuroscientist at Cardiff University. In another blow a few months later, noted social psychologist Diederik Stapel of Tilburg University in the Netherlands admitted to faking data for dozens of papers.
The crescendo of problems has led some psychologists to adopt a radical solution: describing the research they plan to do, and how, before they gather a single piece of data. Preregistration, in its simplest form, is a one-page document answering basic questions such as: What question will be studied? What is the hypothesis? What data will be collected, and how will they be analyzed? In its most rigorous form, a “registered report,” researchers write an entire paper, minus the results and discussion, and submit it for peer review at a journal, which decides whether to accept it in principle. After the work is completed, reviewers simply check whether the researchers stuck to their own recipe; if so, the paper is published, regardless of what the data show.
Preregistration had already become the norm in clinical trials as a way to prevent publication bias, the tendency for many negative results to remain unpublished. Now, it is spreading through psychology and into other fields, not just to ensure those results see the light of day, but also because of a different advantage: By committing researchers to a fixed plan, it takes away some of the degrees of freedom that can skew their work.
In their paper on the Beatles tune, for instance, Simmons and his colleagues had tested for significance after roughly every 10 participants, they had used other songs, and they had asked participants many questions, creating a large data set with which they could play around. As it turned out, using the age of the participants’ fathers to control for variation in baseline age resulted in their statistically significant but absurd finding. That would have been impossible with a preregistered study.
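The mechanism is easy to demonstrate. The sketch below (mine, not the authors' actual procedure) simulates one of the degrees of freedom they exploited: checking for significance after every batch of 10 participants and stopping as soon as p drops below 0.05. Both groups are drawn from the same distribution, so every “significant” result is a false positive by construction.

```python
# Simulation of "optional stopping": peeking at the p-value after every
# batch of participants and stopping as soon as p < 0.05. Under the null
# (no real effect), this inflates the false-positive rate well above 5%.
import random
from statistics import NormalDist, mean, stdev

def welch_p(a, b):
    # Welch t statistic, with a normal approximation to its null
    # distribution (rough, but adequate for illustration).
    va, vb = stdev(a) ** 2 / len(a), stdev(b) ** 2 / len(b)
    t = (mean(a) - mean(b)) / (va + vb) ** 0.5
    return 2 * (1 - NormalDist().cdf(abs(t)))  # two-sided p-value

def one_experiment(max_batches=10, batch=10, alpha=0.05):
    # Both groups come from the SAME distribution: the null is true,
    # so any "significant" difference is a false positive.
    a, b = [], []
    for _ in range(max_batches):
        a += [random.gauss(0, 1) for _ in range(batch)]
        b += [random.gauss(0, 1) for _ in range(batch)]
        if welch_p(a, b) < alpha:
            return True  # stop early and report a significant result
    return False

random.seed(1)
runs = 2000
false_positives = sum(one_experiment() for _ in range(runs)) / runs
print(f"False-positive rate with peeking: {false_positives:.1%}")  # well above 5%
```

A researcher who preregistered a single test at a fixed sample size would keep the error rate near the nominal 5%; peeking after every batch multiplies the chances of stumbling into significance.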
Some skeptics warn about unforeseen consequences. In a 2016 blog post, Susan Goldin-Meadow, then-president of the Association for Psychological Science in Washington, D.C., wrote about her “fear that preregistration will stifle discovery.” “How can we make new discoveries if our studies need to be catalogued before they are run?” she asked. Science is not just about testing pre-established hypotheses, but also about discovering them, Goldin-Meadow wrote; she also cautioned that the push to preregister might “devalue or marginalize the studies for which the preregistration procedures don’t fit.”
Preregistration does not preclude generating new hypotheses, counters Brian Nosek, director of the University of Virginia’s Center for Open Science (COS) in Charlottesville; it just makes it transparent when researchers are doing so. Too often, a result is presented as if it confirms a hypothesis when researchers are actually doing what has become known as HARKing, he says, “hypothesizing after results are known.” (For example, a researcher who rolls a die three times and gets a six each time could report that she wanted to test her hypothesis that a die will always show a six, and that her study confirmed it.)
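The arithmetic behind the die example shows why HARKing misleads: any specific outcome is improbable in advance, so a “hypothesis” stated after the fact can always look impressively confirmed.

```python
# Probability of rolling three sixes in a row with a fair die:
# tiny a priori, which is exactly what makes the post-hoc
# "prediction" seem so well supported.
p_three_sixes = (1 / 6) ** 3
print(f"P(three sixes in a row) = {p_three_sixes:.4f}")  # about 0.0046
```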
A bureaucratic burden
Preregistration can pay off for individual researchers as well as the broader field, advocates say. For one thing, it forces researchers to think harder about a study before it begins. “I have at least twice had the experience of realizing that my study didn’t make any sense as I was writing up the analysis plan for it,” Simmons says. Registered reports have the additional advantage that journals agree in advance to accept studies based on their methodological strength and the importance of the questions they address, Chambers adds. “It takes away all this pressure that you might get nonsignificant results or that your results might in some way be seen as undesirable by the editors or reviewers.”
Several databases today host preregistrations. The Open Science Framework, run by COS, is the largest one; it has received 18,000 preregistrations since its launch in 2012, and the number is roughly doubling every year. The neuroscience journal Cortex, where Chambers is an editor, became the first journal to offer registered reports in 2013; it has accepted 64 so far, and has published results for 12. More than 120 other journals now offer registered reports, in fields as diverse as cancer research, political science, and ecology.
Still, the model is not attractive to everyone. Many journals are afraid of having to publish negative results, Chambers says. And some researchers may not want to commit to publishing whatever they find, regardless of whether it supports a hypothesis. He notes that a bit of publication bias can have advantages for scientists, some of whom essentially create a “brand” by publishing studies that support a certain theory. “A lot of scientists are more like lawyers than detectives. They have a theory and they are trying to use the evidence to support it.”
There are other drawbacks. “It can feel like a very bureaucratic burden to have to preregister every experiment,” Nosek admits. The practice could also lead to a conservative shift in science, if reviewers recommend against accepting registered reports when their hypothesis runs counter to conventional wisdom. That could make it harder to do high-risk research. “This is a real concern,” Nosek says.
It’s not easy to tell how real preregistration’s potential benefits and drawbacks are. Anne Scheel of the Eindhoven University of Technology in the Netherlands, for instance, recently set out to answer a seemingly simple question: Do registered reports lead to more negative results being published? “I’m quite shocked how hard it is,” says Scheel, because it’s not clear what a good control group would be. Early adopters of preregistration may be very different from other researchers, for instance, or people might choose to preregister only research more likely to have no effect. For now, she has the less ambitious goal of establishing the percentage of hypotheses in registered reports that are ultimately confirmed.
Tom Hardwicke, a metaresearcher at Stanford University in Palo Alto, California, has many questions about registered reports themselves—for instance, how detailed the initial protocols are, how often researchers deviate from them, and whether they’re transparent about that. But when he tried to answer those questions recently, “the raw ingredients were not available,” he says: Many journals neither publish the protocols nor deposit them in an independent registry such as the COS database. “The community needs access to registered protocols to conduct this kind of research and ensure that registered reports do what they say on the tin,” Hardwicke says.
Chambers agrees but thinks things are getting better. Already, 76% of journals that accept preregistrations agree to make protocols publicly available. At others, the reluctance “is part technical teething problems, part cultural barriers resulting from introducing preregistration for the first time to fields where it is unfamiliar,” he says.
For preregistration to be a success, the protocols need to be short, simple to write, and easy to read, Simmons says. That’s why in 2015 he, Nelson, and Simonsohn launched a website, aspredicted.org, that gives researchers a simple template for generating a preregistration. Like their 2011 paper, Simmons says, the website has been much more successful than he would ever have predicted.