On the Science News page of Google News today, I came across a shocking story at the mainstream site Inside Higher Ed. The article is entitled "The Growing Problem of Scientific Research Fraud." It starts out by saying, "When a group of researchers at Northwestern University uncovered evidence of widespread—and growing—research fraud in scientific publishing, editors at some academic journals weren’t exactly rushing to publish the findings." We then hear a little about what sounds like an attempt to censor the bad news.
But the researchers' paper did eventually get published. We read this:
"Last week Amaral and his colleagues published their findings in the Proceedings of the National Academy of Sciences of the United States of America. They estimate that they were able to detect anywhere between 1 and 10 percent of fraudulent papers circulating in the literature and that the actual rate of fraud may be 10 to 100 times more."
The article quotes a researcher as saying, "If this trend goes unchecked, science will be ruined and misinformation is going to dominate the literature." Figure 5 in the paper includes this graph:
The "paper mill products" line shows fraudulent papers. The "PubPeer commented" line shows papers suspected of fraud and mentioned on PubPeer, a site on which scientists can anonymously discuss suspicions of fraud. The "retracted" line shows papers retracted because of low quality or problems discovered in them. The great majority of junk science papers are never retracted. Notice the trend lines: a larger and larger fraction of scientific papers are fraudulent or junk.
The graph above is from the newly published paper "The entities enabling scientific fraud at scale are large, resilient, and growing rapidly." The link for the paper's pdf file is here.
In previous posts I discussed the issue of fraud in biology research. The posts were these:
- Study Suggests Massive Faking in Neuroscience Papers
- They're Calling It a Huge Memory Research Fraud, But Is It Only the Tip of the Iceberg?
- Fraud and Misconduct Are Not Very Rare in Biology
Why would such wrongdoing occur? If you are a scientist living in a "publish or perish" culture, it may be expected that you will author a certain number of papers each year. There is an effect called publication bias, in which scientific journals prefer to publish papers reporting positive results. If you are a scientist doing experiments that have recently produced only null results, you may resort to paying some paper mill to get some result that will have a higher chance of getting published. The paper mill companies are typically in foreign countries, and have discreet names such as Suichow Editorial Services.
A researcher named Bernhard A. Sabel has developed what he thinks is a pretty simple way to spot paper mill papers in biology and medicine: look for papers whose author email addresses are private emails or hospital emails rather than college or university emails such as joesmith@harvard.edu. Sabel's technique is entirely different from the one used in the Northwestern paper discussed earlier in this post.
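To make the idea concrete, here is a minimal Python sketch of an email-based red flag of the kind described above. It is only an illustration, not Sabel's actual method: the function names, the short lists of domains, and the crude "hosp" substring test are all my own assumptions, and a real screening tool would need far more complete lists.

```python
# Illustrative sketch only. The domain lists below are hypothetical
# examples, not lists taken from Sabel's paper.
ACADEMIC_SUFFIXES = (".edu", ".ac.uk", ".edu.cn", ".ac.jp")
PRIVATE_DOMAINS = {"gmail.com", "yahoo.com", "hotmail.com", "163.com", "qq.com"}


def classify_email(address: str) -> str:
    """Return 'private', 'academic', 'hospital', or 'other' for an address."""
    # Everything after the last '@' is treated as the domain.
    domain = address.rsplit("@", 1)[-1].strip().lower()
    if domain in PRIVATE_DOMAINS:
        return "private"
    if domain.endswith(ACADEMIC_SUFFIXES):
        return "academic"
    # Crude assumption: hospital domains contain the substring 'hosp'.
    if "hosp" in domain:
        return "hospital"
    return "other"


def has_red_flag(address: str) -> bool:
    """Flag private or hospital addresses, per the heuristic described above."""
    return classify_email(address) in {"private", "hospital"}


if __name__ == "__main__":
    for addr in ["joesmith@harvard.edu", "someone@gmail.com",
                 "dr.li@cityhospital.example"]:
        print(addr, classify_email(addr), has_red_flag(addr))
```

A flag like this can only mark a paper as worth a closer look. Plenty of honest researchers use private or hospital email addresses, so a flag by itself proves nothing.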
The latest version of a paper by Sabel describes the paper mill industry:
"The major source of fake publications are 1,000+ 'academic support' agencies – so-called 'paper mills' – located mainly in China, India, Russia, UK, and USA (Abalkina, 2021; Else, 2021; Pérez-Neri et al., 2022). Paper mills advertise writing and editing services via the internet and charge hefty fees to produce and publish fake articles in journals listed in the Science Citation Index (SCI) (Christopher, 2021; Else, 2022). Their services include manuscript production based on fabricated data, figures, tables, and text semi-automatically generated using artificial intelligence (AI). Manuscripts are subsequently edited by an army of scientifically trained professionals and ghostwriters."
Sabel mentions a case of a paper mill that emailed a scientific journal offering a sum of $1000 if the journal published one of the papers the paper mill (calling itself an editorial services firm) helped to produce.
A paper by Sabel states this:
"More than 1,000 paper mills openly advertise their services on Baidu and Google to 'help prepare' academic term papers, dissertations, and articles intended for SCI publications. Most paper mills are located in China, India, UK, and USA, and some are multinational. They use sophisticated, state-of-the-art AI-supported text generation, data and statistical manipulation and fabrication technologies, image and text pirating, and gift or purchased authorships. Paper mills fully prepare – and some guarantee – publication in an SCI journal and charge hefty fees ($1,000-$25,000; in Russia: $5,000) (Chawla, 2022) depending on the specific services ordered (topic, impact factor of target journal, with/without faking data by fake 'experimentation')"
Sabel estimates that paper mills are a major business, earning a revenue of about a billion dollars per year. He estimates that close to 150,000 papers are questionable papers with red flags indicating possible paper mill authorship.
Publicly available AI programs such as ChatGPT are making this kind of fraud easier. Such programs can do a million and one things. Ask such a program to generate information on some topic, and you might get largely fictional or largely inaccurate output (sometimes called "AI slop") that can be pasted into a scientific paper.
The discussion above is largely about what we might call hard fraud. Hard fraud may be defined as something involving data that is fake or made up. But we should not limit a discussion of scientific fraud to a mere discussion of hard fraud. There is also what we can call soft fraud. Soft fraud in scientific research may be defined as the use of extremely misleading analysis techniques and misleading data gathering techniques and misleading data presentation techniques to give the impression that something was discovered, when no such thing occurred. Soft fraud is extremely abundant in scientific research. To read about some of the things going on when soft fraud occurs in scientific literature, read my posts here:
- The Building Blocks of Bad Science Literature
- 50 Types of Questionable Research Practices
To understand the financial factors that drive such hard and soft fraud, you need to "follow the money" by considering factors like those diagrammed below. Read here for an explanation of the diagram.