If the crowdfunding effort is anything to go by, there is huge sympathy for the data detectives Leif Nelson, Joe Simmons and Uri Simonsohn. The three men — professors of marketing, applied statistics and behavioural science, respectively — have carved out a reputation as defenders of sound scientific research methods. Now they face a lawsuit in the US claiming $25mn for defamation, and the campaign to fund their defence raised over $180,000 in the first 24 hours. The list of donors reads like a Who’s Who of behavioural science, including a $4,900 donation from Nobel laureate Richard Thaler.
In June, Nelson, Simmons and Simonsohn published four posts on their blog, Data Colada, in their own words “detailing evidence of fraud in four academic papers co-authored by Harvard Business School Professor Francesca Gino”. The blog digs deep into the version history of researchers’ Excel spreadsheets, looking for what its authors say is evidence of data being manually altered at unexpected points. Gino, who is on administrative leave, has sued Harvard and the trio, claiming that their actions have damaged her reputation.
Professor Gino, a behavioural scientist, is entitled to defend her good name, although the flood of donations to the Data Colada defence fund reflects a widespread feeling that the blog is performing an important service. “The field benefits from Data Colada,” wrote one donor. Another declared, “Correcting the scientific literature deserves gratitude, not punishment.”
There is a broader lesson to be drawn about the scientific process. Scientific institutions favour research that delivers quantity over quality, novelty over robustness and the production of original claims rather than the scrutiny of familiar ones. The result, say researchers Paul Smaldino and Richard McElreath, has been “the natural selection of bad science”, a place where good work suffers and bad work thrives.
For example, it is often easier to “discover” something publishable if your research methods are substandard. That might mean an outrageous fraud; more often that might take the form of a minor-seeming infraction such as testing lots of different hypotheses and only reporting the most interesting results. This makes nonsense out of the statistical methods we use to sift out flukes.
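To see why, here is a minimal simulation sketch in Python. It is an illustration only, not drawn from any of the studies discussed. It assumes every hypothesis being tested is false, so any "significant" result is a fluke, and it uses a simple z-test with the conventional 5 per cent threshold. The function names, sample size and number of simulated studies are hypothetical choices made for the example.

```python
import random
from statistics import NormalDist, fmean


def p_value_for_null_effect(rng, n=30):
    """Two-sided z-test p-value for a sample whose true effect is zero."""
    sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
    z = fmean(sample) * n ** 0.5  # standard deviation is known to be 1
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def share_of_studies_with_a_fluke(hypotheses, studies=2000, alpha=0.05, seed=1):
    """Fraction of simulated studies in which at least one test has p < alpha."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(studies):
        # The researcher tests several hypotheses and keeps any that "works".
        if any(p_value_for_null_effect(rng) < alpha for _ in range(hypotheses)):
            hits += 1
    return hits / studies


if __name__ == "__main__":
    for k in (1, 5, 20):
        simulated = share_of_studies_with_a_fluke(k)
        theory = 1 - 0.95 ** k  # chance of at least one fluke in k independent tests
        print(f"{k:>2} hypotheses: simulated {simulated:.2f}, theory {theory:.2f}")
```

With independent tests, the chance of at least one fluke is 1 − 0.95^k, which is about 64 per cent for 20 hypotheses. A researcher who tests 20 ideas and reports only the best one will "find" something most of the time, even when there is nothing to find.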
We are rightly more outraged by fraudsters than by researchers who cut corners, but if the aim is to advance knowledge, motive doesn’t matter. “Any sufficiently crappy research is indistinguishable from fraud,” says the statistician Andrew Gelman.
In an ideal world, data sets would be properly documented and shared for anyone to analyse. Statistical queries would be logged so that scientists could see exactly what analytical steps other scientists had taken. Experiments would be pre-registered, so that they didn’t disappear into file drawers when the results were disappointing. All this would make science more rigorous and collaborative, with less emphasis on eye-catching and more emphasis on building something that endures.
Dame Ottoline Leyser, the head of UK Research and Innovation, has pointed out that if everyone breaks new ground and nobody builds, all you have is lots of holes in the ground. The problem, says Stuart Ritchie, the author of Science Fictions, is that “all these things are just a hassle”. Not only is it tedious to jump through a lot of methodological hoops rather than running fun new experiments, it is also bad for one’s career. If high standards are voluntary, the fast-and-loose researchers will be able to pump out catchy findings while the rigorous scientists will keep torpedoing their own results.
Meanwhile, even for those not being sued for $25mn, the rewards for carefully scrutinising existing research are scant. Journals are keener to publish new findings than to publish “replications”, studies that check whether older experimental results actually stand up. As for the work performed by the Data Colada bloggers, there seems to be no place for this in the formal structures of the scientific establishment.
Another data sleuth, Elisabeth Bik, who spots manipulated images in scientific papers, won the John Maddox Prize from the charity Sense About Science for her work. But she has no professorial chair. She is funded by consultancy gigs and supporters on Patreon. If we fund such detective work by having an occasional whip-round, no wonder there is so much bad research and so little scrutiny.
The saying goes that science is self-correcting. That cliché obscures two uncomfortable facts. The first is that the truth emerges not through some automatic process, but because somebody did the hard work and took the reputational risk to find the errors. We shouldn’t assume that will just happen. We should find space and funding for it in our scientific institutions.
The second fact is that there is no need for correction if the science is right the first time. That means strengthening the basic standards of science — for example, by supporting replication efforts, by requiring the pre-registration of scientific experiments, and by building tools to support the sharing and tracking of data and methods.
There are glimmers of hope that scientists, scientific journals and grant-making bodies are all taking more interest in such work. The potential reward here is enormous. With the right digital tools, publication rules and scientific norms we can make rigorous research easier to do, easier to share and easier to check — while making life difficult both for the large number of too-casual researchers and for the small number of cheats.
Prevention is better than cure. It is never too late to spot mistakes and to correct the scientific record. But science will gain more — and for vastly less heartache — if journals, universities and funding bodies support better, more robust research practices right at the start.
Written for and first published in the Financial Times on 1 September 2023.