bioRxiv
New Results

Bioscience-scale automated detection of figure element reuse

Daniel E. Acuna, Paul S. Brookes, Konrad P. Kording
doi: https://doi.org/10.1101/269415
Daniel E. Acuna
1 School of Information Studies, Syracuse University, Syracuse, NY
  • For correspondence: deacuna@syr.edu
Paul S. Brookes
2 Department of Anesthesiology, University of Rochester Medical Center, Box 604, 601 Elmwood Avenue, Rochester, NY 14642, USA
Konrad P. Kording
3 Departments of Bioengineering and Neuroscience, University of Pennsylvania

Abstract

Scientists sometimes reuse figure elements appropriately, e.g. when comparing methods, and sometimes inappropriately, e.g. when presenting an old experiment as a new control. To understand such reuse, it is important to detect it automatically. Here we present an analysis of figure element reuse in a large dataset comprising 760 thousand open access articles and 2 million figures. Our algorithm detects reuse of figure regions while remaining robust to rotation, cropping, resizing, and contrast changes, and it estimates which of the reuses have biological meaning. A three-person panel then analyzes how problematic these biological reuses are, using contextual information such as captions and full texts. Based on the panel reviews, we estimate that 9% of the biological reuses would be unanimously perceived as at least suspicious. We further estimate that 0.6% of all articles would be unanimously perceived as fraudulent, with 43% of inappropriate reuses occurring across articles, 28% within an article, and 29% within a figure. Our tool rapidly detects image reuse at scale and promises to be useful to a broad range of people who campaign for scientific integrity. We suggest that a great deal of scientific fraud will, sooner or later, be detectable by automatic methods.

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
Posted February 23, 2018.

Subject Area

  • Scientific Communication and Education