New Results

Cont-ID: Detection of samples cross-contamination in viral metagenomic data

Johan Rollin, Wei Rong, Sébastien Massart
doi: https://doi.org/10.1101/2023.01.23.525161
Johan Rollin
1 University of Liège, Gembloux Agro-Bio Tech, Plant Pathology Laboratory, 5030, Gembloux, Belgium
2 DNAVision, 6041, Gosselies, Belgium
  • For correspondence: johan.rollin@doct.uliege.be
Wei Rong
1 University of Liège, Gembloux Agro-Bio Tech, Plant Pathology Laboratory, 5030, Gembloux, Belgium
Sébastien Massart
1 University of Liège, Gembloux Agro-Bio Tech, Plant Pathology Laboratory, 5030, Gembloux, Belgium

Abstract

Background High-throughput sequencing (HTS) technologies, complemented by bioinformatic analysis of the generated data, are becoming an important detection technique for virus diagnostics. They have the potential to replace or complement current PCR-based methods thanks to their improved inclusivity and analytical sensitivity, as well as their overall good repeatability and reproducibility. Cross-contamination, the exchange of genetic material between samples, is a well-known phenomenon in molecular diagnostics. Managing it was a key difficulty during the development of PCR-based detection, and it is now adequately monitored in routine diagnostics. HTS technologies face similar difficulties because of their very high analytical sensitivity. Because a single viral read can be detected among millions of sequencing reads, a detection threshold must be set, and that threshold is influenced by cross-contamination. Cross-contamination monitoring should therefore be a priority when detecting viruses by HTS technologies.

Results We present Cont-ID, a bioinformatic tool designed to check for cross-contamination by analysing the relative abundance of virus sequencing reads identified in metagenomic sequencing datasets and their duplication between samples. It can be applied when the samples in a sequencing batch have been processed in parallel in the laboratory together with at least one external alien control. Using 273 real datasets, including 68 virus species from different hosts (fruit tree, plant, human) and several library preparation protocols (ribodepleted total RNA, small RNA and double-stranded RNA), we demonstrated that Cont-ID classifies viral species detections as (true) infection or (cross-)contamination with high accuracy (91%). This classification raises confidence in the detection and facilitates downstream interpretation and confirmation of the results by prioritising the virus detections that should be confirmed.

Conclusions Cross-contamination between samples when detecting viruses using HTS can be monitored and highlighted by Cont-ID (provided an alien control is present). Cont-ID is based on a flexible methodology relying on the output of bioinformatics analyses of the sequencing reads and considering the contamination pattern specific to each batch of samples. The Cont-ID method is adaptable so that each laboratory can optimise it before its validation and routine use.
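
The general idea of comparing relative abundances within a batch against an alien control can be illustrated with a minimal sketch. This is not the Cont-ID implementation: the abstract does not give the decision rules, so the `Detection` class, the `batch_threshold` and `classify` functions, the `min_ratio` parameter, and all sample names and read counts below are hypothetical. The sketch assumes that the alien control carries a virus that cannot be present in the other samples, so any reads of that virus found elsewhere in the batch indicate the level of cross-contamination.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """One virus detected in one sample of a sequencing batch (hypothetical record)."""
    sample: str
    virus: str
    viral_reads: int
    total_reads: int

    @property
    def rpm(self) -> float:
        # Relative abundance: viral reads per million sequenced reads.
        return self.viral_reads * 1_000_000 / self.total_reads


def batch_threshold(detections, alien_viruses, control_sample):
    """Highest abundance at which an alien-control virus appears outside the control."""
    leaked = [d.rpm for d in detections
              if d.virus in alien_viruses and d.sample != control_sample]
    return max(leaked, default=0.0)


def classify(detections, alien_viruses, control_sample, min_ratio=100.0):
    """Label each non-control detection as 'infection' or 'contamination'.

    Assumed rule (illustrative only): a detection is called contamination when
    its abundance does not exceed the batch threshold AND another sample in the
    batch carries the same virus at least `min_ratio` times more abundantly.
    """
    threshold = batch_threshold(detections, alien_viruses, control_sample)
    top_rpm = {}
    for d in detections:
        top_rpm[d.virus] = max(top_rpm.get(d.virus, 0.0), d.rpm)

    calls = {}
    for d in detections:
        if d.sample == control_sample or d.virus in alien_viruses:
            continue  # control reads only calibrate the threshold
        low = d.rpm <= threshold
        dominated = top_rpm[d.virus] >= min_ratio * d.rpm
        calls[(d.sample, d.virus)] = "contamination" if low and dominated else "infection"
    return calls


# Hypothetical batch: one alien control and two test samples, 10 M reads each.
batch = [
    Detection("control", "alienV", 500_000, 10_000_000),
    Detection("S1", "alienV", 20, 10_000_000),       # leak from the control
    Detection("S1", "virusA", 300_000, 10_000_000),
    Detection("S2", "virusA", 15, 10_000_000),
    Detection("S2", "virusB", 8_000, 10_000_000),
]
calls = classify(batch, alien_viruses={"alienV"}, control_sample="control")
for key, label in sorted(calls.items()):
    print(key, label)
```

In this toy batch the threshold is set by the 20 alien-virus reads found in S1, so the 15 reads of virusA in S2 fall below it and are flagged as likely contamination from S1, while the abundant detections are kept as infections. Because the threshold is derived from each batch's own control, the same rule adapts to batches with different contamination levels.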

Competing Interest Statement

The authors have declared no competing interest.

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-ND 4.0 International license.
Posted January 23, 2023.
Subject Area

  • Bioinformatics