bioRxiv
New Results

Human gene function publications that describe wrongly identified nucleotide sequence reagents are unacceptably frequent within the genetics literature

Yasunori Park, Rachael A West, Pranujan Pathmendra, Bertrand Favier, Thomas Stoeger, Amanda Capes-Davis, Guillaume Cabanac, Cyril Labbé, Jennifer A Byrne
doi: https://doi.org/10.1101/2021.07.29.453321
Yasunori Park
1 Faculty of Medicine and Health, The University of Sydney, NSW, Australia
Rachael A West
1 Faculty of Medicine and Health, The University of Sydney, NSW, Australia
2 Children’s Cancer Research Unit, Kids Research, The Children’s Hospital at Westmead, Westmead, NSW, Australia
Pranujan Pathmendra
1 Faculty of Medicine and Health, The University of Sydney, NSW, Australia
Bertrand Favier
3 Univ. Grenoble Alpes, TIMC, Grenoble, France
Thomas Stoeger
4 Successful Clinical Response in Pneumonia Therapy (SCRIPT) Systems Biology Center, Northwestern University, Evanston, United States
5 Department of Chemical and Biological Engineering, Northwestern University, Evanston, United States
6 Center for Genetic Medicine, Northwestern University School of Medicine, Chicago, United States
Amanda Capes-Davis
1 Faculty of Medicine and Health, The University of Sydney, NSW, Australia
7 CellBank Australia, Children’s Medical Research Institute, Westmead, New South Wales, Australia
Guillaume Cabanac
8 Computer Science Department, IRIT UMR 5505 CNRS, University of Toulouse, 118 route de Narbonne, 31062 Toulouse Cedex 9, France
Cyril Labbé
9 Univ. Grenoble Alpes, CNRS, Grenoble INP, LIG, Grenoble, France
Jennifer A Byrne
1 Faculty of Medicine and Health, The University of Sydney, NSW, Australia
10 NSW Health Statewide Biobank, NSW Health Pathology, Camperdown, NSW, Australia
  • For correspondence: jennifer.byrne@health.nsw.gov.au

Abstract

Nucleotide sequence reagents underpin a range of molecular genetics techniques that have been applied across hundreds of thousands of research publications. We have previously reported wrongly identified nucleotide sequence reagents in human gene function publications and described a semi-automated screening tool, Seek & Blastn, to fact-check the targeting or non-targeting status of nucleotide sequence reagents. We applied Seek & Blastn to screen 11,799 publications across 5 literature corpora, which included all original publications in Gene from 2007-2018 and all original open-access publications in Oncology Reports from 2014-2018. After manually checking the Seek & Blastn screening outputs for over 3,400 human research papers, we identified 712 papers across 78 journals that described at least one wrongly identified nucleotide sequence. Verifying the claimed identities of over 13,700 nucleotide sequences highlighted 1,535 wrongly identified sequences, most of which were claimed targeting reagents for the analysis of 365 human protein-coding genes and 120 non-coding RNAs. The 712 problematic papers have received over 17,000 citations, which include citations by human clinical trials. Given our estimate that approximately one quarter of problematic papers are likely to misinform or distract the future development of therapies against human disease, urgent measures are required to address the problem of unreliable gene function papers within the literature.
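
The counts reported in the abstract imply some rough rates, which can be checked with a few lines of Python. This is a minimal sketch and not part of the authors' analysis: the abstract gives the two denominators only as "over 3,400" and "over 13,700" and the citation count as "over 17,000", so the rounded values are used here as stand-ins and the results are approximate. The helper name `share` is hypothetical.

```python
def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage, rounded to one decimal place."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return round(100 * part / whole, 1)

# Counts as stated in the abstract. Values reported as "over N" are
# ASSUMED to equal N here, so the outputs are approximations.
papers_checked = 3_400      # "over 3,400 human research papers"
problem_papers = 712
sequences_checked = 13_700  # "over 13,700 nucleotide sequences"
wrong_sequences = 1_535
citations = 17_000          # "over 17,000 citations"

paper_rate = share(problem_papers, papers_checked)
sequence_rate = share(wrong_sequences, sequences_checked)
citations_per_paper = round(citations / problem_papers, 1)

print(f"Problematic papers among those checked: about {paper_rate}%")
print(f"Wrongly identified sequences among those verified: about {sequence_rate}%")
print(f"Mean citations per problematic paper: about {citations_per_paper}")
```

Because the true denominators are somewhat larger than the rounded values, the two percentages are best read as slight overestimates, and the mean citation count as a slight underestimate.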

Author summary

This is the first study to have screened the gene function literature for nucleotide sequence errors at the scale that we describe. The unacceptably high rates of human gene function papers with incorrect nucleotide sequences that we have discovered represent a major challenge to the research fields that aim to translate genomics investments to patients, and that commonly depend on reliable descriptions of gene function. Indeed, wrongly identified nucleotide sequence reagents represent a double concern, as both the incorrect reagents themselves and their associated results can mislead future research, in terms of both the research directions that are chosen and the experiments that are undertaken. We hope that our research will inspire researchers and journals to seek out other problematic human gene function papers, as we are concerned that our results represent the tip of a much larger problem within the literature. We also hope that our research will encourage more rigorous reporting and peer review of gene function results, and we propose a series of responses for the research and publishing communities.

Competing Interest Statement

The authors have declared no competing interest.

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
Posted July 31, 2021.
bioRxiv 2021.07.29.453321; doi: https://doi.org/10.1101/2021.07.29.453321

Subject Area

  • Genetics