bioRxiv
New Results

coil: an R package for cytochrome C oxidase I (COI) DNA barcode data cleaning, translation, and error evaluation

Cameron M. Nugent, Tyler A. Elliott, Sujeevan Ratnasingham, Sarah J. Adamowicz
doi: https://doi.org/10.1101/2019.12.12.865014
Cameron M. Nugent
1Department of Integrative Biology, University of Guelph. Guelph, Ontario, Canada
2Centre for Biodiversity Genomics, Biodiversity Institute of Ontario, University of Guelph. Guelph, Ontario, Canada
  • For correspondence: nugentc@uoguelph.ca
Tyler A. Elliott
2Centre for Biodiversity Genomics, Biodiversity Institute of Ontario, University of Guelph. Guelph, Ontario, Canada
Sujeevan Ratnasingham
2Centre for Biodiversity Genomics, Biodiversity Institute of Ontario, University of Guelph. Guelph, Ontario, Canada
Sarah J. Adamowicz
1Department of Integrative Biology, University of Guelph. Guelph, Ontario, Canada

Abstract

Biological conclusions based on DNA barcoding and metabarcoding analyses can be strongly influenced by the methods utilized for data generation and curation, leading to varying levels of success in the separation of biological variation from experimental error. The five-prime region of cytochrome c oxidase subunit I (COI-5P) is the most common barcode gene for animals, with conserved structure and function that allows for biologically informed error identification. Here, we present coil (https://CRAN.R-project.org/package=coil), an R package for the pre-processing and error assessment of COI-5P animal barcode and metabarcode sequence data. The package contains functions for placement of barcodes into a common reading frame, accurate translation of sequences to amino acids, and highlighting insertion and deletion errors. The analysis of 10,000 barcode sequences of varying quality demonstrated how coil can place barcode sequences in reading frame and distinguish sequences containing indel errors from error-free sequences with greater than 97.5% accuracy. Package limitations were tested through the analysis of COI-5P sequences from the plant and fungal kingdoms as well as the analysis of potential contaminants: nuclear mitochondrial pseudogenes and Wolbachia COI-5P sequences. Results demonstrated that coil is a strong technical error identification method but is not reliable for detecting all biological contaminants.
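The three tasks named in the abstract (reading-frame placement, translation, and indel flagging) can be illustrated with a small Python sketch. This is not coil's code or method, which the abstract does not describe; it is a deliberately simplified stop-codon heuristic. The function names `translate` and `check_barcode` are hypothetical, and the use of the invertebrate mitochondrial genetic code (NCBI translation table 5) is an assumption that would need to change for other animal groups.

```python
from itertools import product

# NCBI translation table 5 (invertebrate mitochondrial); codons in TCAG order.
# Assumption: the input sequence comes from an invertebrate animal.
_AA = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSSSSVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product("TCAG", repeat=3), _AA)}


def translate(seq, frame=0):
    """Translate a DNA string starting at offset `frame` (0, 1, or 2).

    Codons containing anything other than A, C, G, T become 'X'.
    A trailing partial codon is ignored.
    """
    seq = seq.upper()
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def check_barcode(seq):
    """Return (frame, protein, indel_suspected) for one sequence.

    Hypothetical heuristic: choose the frame with the fewest stop codons.
    If even that frame contains a stop, flag a possible insertion or
    deletion, since an intact coding region should translate cleanly.
    """
    translations = [translate(seq, f) for f in range(3)]
    stop_counts = [t.count("*") for t in translations]
    frame = min(range(3), key=stop_counts.__getitem__)
    return frame, translations[frame], stop_counts[frame] > 0


if __name__ == "__main__":
    print(check_barcode("CTAATTAGC"))            # clean, already in frame
    print(check_barcode("GCTAATTAGC"))           # clean, shifted by one base
    print(check_barcode("CTAATTAGCGCTAATTAGC"))  # one inserted base mid-sequence
```

A heuristic like this misses indels that happen not to introduce a stop codon in the chosen frame, and it uses no knowledge of the conserved COI-5P structure that the abstract says enables biologically informed error identification.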

Footnotes

  • https://cran.r-project.org/package=coil

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.
Posted February 07, 2020.
Subject Area

  • Bioinformatics