The expression, "there are three types of lies: lies, damn lies, and statistics," is attributed to Benjamin Disraeli, the U.K.'s Prime Minster from 1874 to 1880. A century and a half later, improper manipulation of statistics is pervasive. It can do great damage when it corrupts the so-called "scientific literature" – the body of knowledge published in articles by researchers based on their experiments or studies, which is the foundation of science. When those flawed, published articles contain potentially important findings, they are widely reported in news outlets and social media.
A system has been developed to try to ensure that research published in "peer-reviewed" journals – the gold standard – is valid. Researchers submit their article to a journal, whose editors send the manuscript to unremunerated reviewers, or "peers," in the research community; the reviewers anonymously offer an opinion on whether the article is of sufficiently high quality to be published.
There is a problem with this seemingly logical process, however: far too often, it fails. Many articles that pass through it are methodologically flawed or contain fraudulently manipulated data or obviously implausible claims, and they should never have been accepted. Sometimes the editors and reviewers are themselves part of the deception.
An egregious example came to light last fall, when a prominent publisher, Hindawi, an Egyptian subsidiary of the larger multinational firm John Wiley & Sons, announced that, because of a major cheating scandal involving some of its editors and peer reviewers, it was retracting more than 500 papers en masse.
Hindawi publishes 200 open-access, author-fee journals, 16 of which were involved. In September 2022, according to Retraction Watch, a publication that follows the retraction of scientific papers:
Hindawi's research integrity team found several signs of manipulated peer reviews for the affected papers, including reviews that contained duplicated text, a few individuals who did a lot of reviews, reviewers who turned in their reviews extremely quickly, and misuse of databases that publishers use to vet potential reviewers.
Richard Bennett, vice president of researcher and publishing services for Hindawi, told us that the publisher suspects "coordinated peer review rings" consisting of reviewers and editors working together to advance manuscripts through to publication. Some of the manuscripts appeared to come from paper mills, he said.
The problem is not unique to Hindawi. Retraction Watch continued:
Other publishers have announced large batches of retractions recently. Earlier this month, the Institute of Physics' IOP Publishing announced that it planned to retract nearly 500 articles likely from paper mills, and PLOS in August announced it would retract over 100 papers from its flagship journal over manipulated peer review.
A 2021 article described the tribulations of a small cadre of scientific fraud hunters, or "data sleuths," who reveal cheating in published papers. It is unclear how extensive such misconduct is, but there is certainly a lot that falls under the rubric of "research misbehaviors" or Questionable Research Practices (QRPs). An important subset of these is outright cheating with statistics.
One kind of QRP employs a statistical trick called Multiple Testing and Multiple Modeling, or MTMM. The Multiple Testing component involves asking many questions of a large, complicated data set. For example, a standard nutrition study asks a large group of people – a cohort – to record in Food Frequency Questionnaires, or FFQs, how much of certain foods they eat. The investigators then follow the cohort over time, periodically asking whether the participants have experienced various health problems.
An FFQ might cover anywhere from 60 to several hundred foods, and the tracked health outcomes might number from a dozen to fifty or more. With careful planning and powerful computers, many thousands of correlations are possible, and a data dredge across all the predictors and outcomes is likely to turn up many statistical "correlations" that can seem persuasive once the researcher constructs a narrative around them but are due purely to chance, as the simulation below illustrates.
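To see how easily chance alone produces such "findings," here is a minimal simulation in Python. All of the data are synthetic, and the counts of subjects, foods, and outcomes are illustrative assumptions, not drawn from any real study: it generates 100 random "food" variables and 50 random "outcome" variables with no true relationships, tests all 5,000 pairings, and counts how many clear the conventional p < 0.05 bar.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_subjects, n_foods, n_outcomes = 1000, 100, 50

# Purely random data: no "food" is truly related to any "outcome."
foods = rng.normal(size=(n_subjects, n_foods))
outcomes = rng.normal(size=(n_subjects, n_outcomes))

false_positives = 0
for i in range(n_foods):
    for j in range(n_outcomes):
        _, p = stats.pearsonr(foods[:, i], outcomes[:, j])
        if p < 0.05:  # the conventional "statistical significance" bar
            false_positives += 1

total = n_foods * n_outcomes
print(f"{false_positives} 'significant' correlations out of {total:,} tests")
# At a 0.05 threshold, roughly 5% of null tests come up "significant"
# by chance: about 250 spurious "findings" out of 5,000 here.
```

The arithmetic scales mercilessly: a study with 300 foods and 50 outcomes runs 15,000 tests and can expect on the order of 750 chance "correlations" before any modeling choices are even made.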
What is the "Modeling" aspect of MTMM? The data can be sliced and diced by age groups, gender, geography, etc. and is only limited by the researcher's imagination and computing power. That provides innumerable possibilities for spurious correlations. For example, the 511 papers retracted by Hindawi were published in 2020, the same year that 7,740 papers that included the term "FFQ," providing plenty of opportunities for Questionable Research Practices.
Another technique of statistical sleight-of-hand used to get a desired – but not necessarily accurate – result is called "p-hacking": trying one statistical or data manipulation after another until you obtain a p-value small enough (conventionally, below 0.05) to qualify as "statistical significance," even though the finding is the product of chance, not a reflection of reality. Like MTMM, p-hacking poses numerous questions of the data but fails to correct for their number; it is a way of fudging the analysis, as the sketch below illustrates. It is not uncommon: evolutionary biologist Dr. Megan Head and her colleagues found evidence of p-hacking in almost every scientific field.
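Here is a stylized sketch of how p-hacking plays out, once more with synthetic data in which no real relationship exists; each step is a hypothetical but commonplace analytic choice. Every attempt below is defensible in isolation – the sin lies in running them serially and reporting only the one that "works."

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
x = rng.normal(size=200)
y = rng.normal(size=200)  # truly unrelated to x

keep = np.abs(y - y.mean()) < 2 * y.std()  # drop "outliers"
high = x > np.median(x)                    # post-hoc subgroup

# A menu of analyses a determined researcher might try, one after another.
attempts = [
    ("raw Pearson correlation",   stats.pearsonr(x, y)[1]),
    ("Spearman rank correlation", stats.spearmanr(x, y)[1]),
    ("outliers removed",          stats.pearsonr(x[keep], y[keep])[1]),
    ("high-x subgroup only",      stats.pearsonr(x[high], y[high])[1]),
    ("log-transformed x",         stats.pearsonr(np.log(np.abs(x) + 1), y)[1]),
]

for name, p in attempts:
    print(f"{name}: p = {p:.3f}")
    if p < 0.05:
        print(f"Stop here and report '{name}' as the finding.")
        break

# With five tries on pure noise, the chance that at least one dips below
# 0.05 is on the order of 1 - 0.95**5, about 23% (the tries overlap, so
# this is only a rough figure). Real p-hackers rarely stop at five.
```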
Thus, given widespread p-hacking and the recent retractions of hundreds of supposedly peer-reviewed papers, it is evident that peer review and editorial oversight do not ensure that articles in scientific publications represent reality rather than statistical chicanery. And the problem is growing: in 2020, 7.3% of respondents to an American Physical Society survey said they had witnessed data falsification, up from 3.9% in 2003, and 12.5% of the 2020 respondents had felt pressured to break ethics rules, compared with 7.7% in 2003.
This is a significant problem for the scientific community because if published articles are unreliable, we do not really know what we think we know.
The cause of all this cheating is simply greed – the research community's desire to tap into huge reservoirs of research funds, the pressure on scientists to publish or perish, and journal publishers' drive to maximize profits. Many journal publishers thrive on fees from authors, which creates an incentive to accept even low-quality or frankly fraudulent research articles. At the same time, investigators are eager to pad their CVs with large numbers of publications, whatever their quality.
Better oversight is needed. Government agencies, university officials charged with ensuring research integrity, and scientific professional societies must acknowledge the insidious fraud in the publication of scientific studies and take corrective action.
Henry I. Miller ([email protected]), a physician and molecular biologist, is the Glenn Swogger Distinguished Fellow at the American Council on Science and Health. He was the founding director of the FDA's Office of Biotechnology. S. Stanley Young is the assistant director for bioinformatics at the National Institute of Statistical Sciences.