The need for speed —

The preprint problem: Unvetted science is fueling COVID-19 misinformation

Peer review moves to Twitter, muddling public health information.


A significant difference between COVID-19 and past pandemics, even the 2009 outbreak of H1N1, has been the speed with which information on the disease has spread. Partly, that's down to social media, as platforms like Twitter have been embraced by scientists and doctors. But another major factor has been the rise of the preprint: an academic research paper that's posted to a publicly accessible server before it has gone through the traditional process of peer review. When unvetted science that makes bold claims goes straight to the public, that can cause problems, as illustrated by a recent preprint on coronavirus mutations covered by John Timmer earlier today.

How things used to work

The traditional way that scientists share their findings has been through peer-reviewed journals. A scientist (or, more typically, a team of scientists) conducts research, writes up the results, and sends them to a journal that covers that particular field. (Or, if they want to make a bigger splash, a multidisciplinary publication like Nature or Science.) When the journal receives the paper, it sends copies out to be reviewed by (usually) three scientists who work in the same field: peers of the authors. Those reviewers cast a highly critical eye upon the paper, looking for flaws in the methodology and analysis or other potential problems. Sometimes, they don't find any, and the paper sails through and shows up in print soon afterward.

More commonly, one or more reviewers will find something they deem objectionable or insufficient. Each reviewer sends the journal editor their thoughts, often with questions or suggestions to pass on to the authors. Those will typically need to be addressed before the editor will accept the research for publication. Sometimes these are helpful, although not always—many a scientist can tell tales of the mean and vindictive "reviewer 3" who keeps their graduate students and postdocs up at night.

The process isn't perfect, and it's rarely quick: weeks or months, and multiple rounds of revision, often pass between a paper's initial submission to a journal and its eventual publication.

What’s a preprint?

That traditional model of peer-reviewed publishing isn't exactly new; in fact, it's hundreds of years old. But in the early 1990s, researchers in the physics community decided to improve it. Since you can't really run a massive particle accelerator or experimental nuclear reactor with a couple of grad students and a single principal investigator, many physicists were already accustomed to working together on big collaborative projects and exchanging drafts of papers with the whole team. And so, embracing the still relatively new Internet, they created servers like arXiv: repositories for sharing research papers that had yet to go through peer review and formal publication.

Instead of just sharing with their team, these researchers started sharing with the entire research community. Other scientists could read the research and discuss it with each other and the authors before the findings showed up in print, which led to the term "preprint."

Preprint servers, and the transparency of a public (if unofficial) peer-review process, were often a stick with which physicists and computer scientists beat their biomedical colleagues at meetings. The implication was that biology was hopelessly behind the times. Over time, a new generation of disaffected biologists became more vocal in their dissatisfaction with the sclerotic nature of the traditional publication process. And so, led by genomics researchers (who, like their cousins in high-energy physics, were used to working on big multidisciplinary collaborative projects), biomedical scientists began to create preprint servers of their own, like bioRxiv and medRxiv.

The attraction is obvious. Scientists can disseminate their findings much more rapidly, and young researchers can show funding bodies and hiring committees that they are being productive and contributing to the field while waiting for drafts to clear peer review. Publishing one's research on a preprint server also means that it's widely accessible, as opposed to locked behind a subscription paywall.

In just a few short years, the practice of using preprint servers in biomedical research has soared in popularity. A study conducted in 2019 showed a rapid increase over bioRxiv's first five years, following its launch in late 2013, in both the number of papers uploaded and the rate at which new papers were deposited. There was an equally impressive increase in the number of downloads from bioRxiv over that time. Encouragingly, the study also found that two-thirds of the preprints posted to bioRxiv within its first four years were published in peer-reviewed journals. (The proportion is much lower when more recent submissions are included, clearly showing the lengthy time lag involved in getting a paper into print.)

So what’s the problem?

Not every biomedical scientist is convinced about the rise of preprints in biomedical research (and in the interests of disclosure, I'm one of the unconvinced). For some, the opposition has been due to policies at some journals that forbid any promotion of a research paper before it appears on their printed pages (or online at the journal's website, at any rate). That complaint has largely gone by the wayside as most journals have moved with the tide and altered their policies so that posting to a non-profit preprint server no longer counts as prior publication.

Other scientists have voiced concerns that posting their work to a preprint server could allow rivals to race them to publication or beat them to the market. And others—and this is my particular fear—worry about unpolished or even inaccurate work ending up in the hands of readers ill-equipped to evaluate it.

To put it more bluntly, if a paper posted to arXiv regarding a particular flavor of subatomic particle turns out to be erroneous or flawed, no one's going to die. But if a flawed research paper about a more contagious mutation of a virus in the middle of a global pandemic is reported on uncritically, then there really is the potential for harm.

Indeed, this is not an abstract fear. We are in the middle of a global pandemic, and a recent study in The Lancet found that much of the discussion (and even policymaking) about COVID-19's transmissibility (measured by the basic reproduction number, R0) during January 2020 was driven by preprints rather than peer-reviewed literature.

Still, proponents of preprints will rightly point out that peer review isn't always perfect and that plenty of papers that go through the traditional process are later shown to be problematic. As our article earlier today shows, scientists are aware of the differences, and Twitter has become a new outlet for conducting a very transparent form of peer review. The preprint genie is out of the bottle, and there are certainly good arguments to be made in favor of rapid dissemination of findings during a public health emergency like the one in which we find ourselves. But if you see a finding reported in the media from a preprint, perhaps treat it with one more grain of salt than usual.
