New Results

Index switching causes “spreading-of-signal” among multiplexed samples in Illumina HiSeq 4000 DNA sequencing

Rahul Sinha, Geoff Stanley, Gunsagar S. Gulati, Camille Ezran, Kyle J. Travaglini, Eric Wei, Charles K.F. Chan, Ahmad N. Nabhan, Tianying Su, Rachel M. Morganti, Stephanie D. Conley, Hassan Chaib, Kristy Red-Horse, Michael T. Longaker, Michael P. Snyder, Mark A. Krasnow, Irving L. Weissman
doi: https://doi.org/10.1101/125724
Rahul Sinha
1Institute for Stem Cell Biology and Regenerative Medicine, Stanford University School of Medicine, 265 Campus Drive, Stanford, CA 94305, USA
  • For correspondence: sinhar@stanford.edu
Geoff Stanley
2Department of Bioengineering, Stanford University School of Engineering, 443 Via Ortega, Stanford, CA 94305, USA
Gunsagar S. Gulati
1Institute for Stem Cell Biology and Regenerative Medicine, Stanford University School of Medicine, 265 Campus Drive, Stanford, CA 94305, USA
Camille Ezran
3Department of Biochemistry and Howard Hughes Medical Institute, Beckman Center B400, 279 Campus Drive, Stanford, CA 94305-5307, USA
Kyle J. Travaglini
3Department of Biochemistry and Howard Hughes Medical Institute, Beckman Center B400, 279 Campus Drive, Stanford, CA 94305-5307, USA
Eric Wei
4Stanford Center for Genomics and Personalized Medicine, 3165 Porter Drive, Palo Alto, CA 94304, USA
Charles K.F. Chan
5Department of Surgery, Hagey Laboratory for Pediatric Regenerative Medicine, 257 Campus Drive, Stanford, CA 94305, USA
Ahmad N. Nabhan
3Department of Biochemistry and Howard Hughes Medical Institute, Beckman Center B400, 279 Campus Drive, Stanford, CA 94305-5307, USA
Tianying Su
6Department of Biology, Gilbert Building, Rm 109, 371 Serra Mall, Stanford, CA 94305, USA
Rachel M. Morganti
1Institute for Stem Cell Biology and Regenerative Medicine, Stanford University School of Medicine, 265 Campus Drive, Stanford, CA 94305, USA
Stephanie D. Conley
1Institute for Stem Cell Biology and Regenerative Medicine, Stanford University School of Medicine, 265 Campus Drive, Stanford, CA 94305, USA
Hassan Chaib
4Stanford Center for Genomics and Personalized Medicine, 3165 Porter Drive, Palo Alto, CA 94304, USA
Kristy Red-Horse
6Department of Biology, Gilbert Building, Rm 109, 371 Serra Mall, Stanford, CA 94305, USA
Michael T. Longaker
5Department of Surgery, Hagey Laboratory for Pediatric Regenerative Medicine, 257 Campus Drive, Stanford, CA 94305, USA
Michael P. Snyder
4Stanford Center for Genomics and Personalized Medicine, 3165 Porter Drive, Palo Alto, CA 94304, USA
Mark A. Krasnow
3Department of Biochemistry and Howard Hughes Medical Institute, Beckman Center B400, 279 Campus Drive, Stanford, CA 94305-5307, USA
Irving L. Weissman
1Institute for Stem Cell Biology and Regenerative Medicine, Stanford University School of Medicine, 265 Campus Drive, Stanford, CA 94305, USA
7Department of Pathology, Stanford University School of Medicine, 300 Pasteur Drive, Stanford, CA 94305, USA
8Ludwig Center for Cancer Stem Cell Research and Medicine, Stanford University School of Medicine, 265 Campus Drive, Stanford, CA 94305, USA

Abstract

Illumina-based next-generation sequencing (NGS) has accelerated biomedical discovery through its ability to generate thousands of gigabases of sequencing output per run at a fraction of the time and cost of conventional technologies. The process typically involves four basic steps: library preparation, cluster generation, sequencing, and data analysis. In 2015, a new cluster generation chemistry called exclusion amplification (ExAmp) was introduced in the newer Illumina machines (HiSeq 3000/4000/X Ten). It was a fundamental shift from the earlier method of random cluster generation by bridge amplification on a non-patterned flow cell. The ExAmp chemistry, in conjunction with patterned flow cells containing nanowells at fixed locations, increases cluster density on the flow cell, thereby reducing the cost per run. It also increases sequence read quality, especially for longer read lengths (up to 150 base pairs). This advance has been widely adopted for genome sequencing because greater sequencing depth can be achieved at lower cost without compromising the quality of longer reads. We show, however, that this promising chemistry is problematic when multiplexing samples. We discovered that up to 5-10% of sequencing reads (or signals) are incorrectly assigned from a given sample to other samples in a multiplexed pool. We provide evidence that this “spreading-of-signals” arises from low levels of free index primers present in the pool. These index primers can prime pooled library fragments at random via complementary 3’ ends and are then extended by DNA polymerase, creating a new library molecule with a new index before it binds to the patterned flow cell to generate a cluster for sequencing. The resulting read from that cluster is therefore assigned to a different sample, which spreads signals among the multiplexed samples.
We show that low levels of free index primers persist after the most common library purification procedure recommended by Illumina, and that the amount of signal spreading among samples is proportional to the level of free index primer present in the library pool. This artifact causes homogenization and misclassification of cells in single-cell RNA-seq experiments. Therefore, all data generated in this way must now be carefully re-examined to ensure that “spreading-of-signals” has not compromised data analysis and conclusions. Re-sequencing samples using an older technology that uses conventional bridge amplification for cluster generation, or using improved library cleanup strategies to remove free index primers, can minimize or eliminate this signal-spreading artifact.
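
To make the scale of the effect concrete, the following is a minimal sketch of how misassignment at the rates quoted above could redistribute reads for one gene across a multiplexed pool. It is not the authors' analysis: the function name `spread_signal`, the `switch_rate` parameter, and the assumption that switched reads are shared evenly among the other samples are all illustrative choices made here.

```python
def spread_signal(true_counts, switch_rate):
    """Expected observed read counts per sample after index switching.

    Assumption (illustrative only): a fraction `switch_rate` of each
    sample's reads is reassigned, split evenly among the other samples.
    """
    n = len(true_counts)
    if n < 2:
        # Nothing to switch to in a pool of one sample.
        return [float(c) for c in true_counts]
    total = sum(true_counts)
    observed = []
    for c in true_counts:
        kept = c * (1 - switch_rate)
        # Reads arriving from all other samples, shared evenly.
        received = (total - c) * switch_rate / (n - 1)
        observed.append(kept + received)
    return observed


# A gene truly expressed in only one of four pooled samples (hypothetical counts).
true_counts = [1000, 0, 0, 0]
observed = spread_signal(true_counts, switch_rate=0.05)
for sample, (t, o) in enumerate(zip(true_counts, observed)):
    print(f"sample {sample}: true={t} observed={o:.1f}")
```

Under these assumptions, a gene that is absent from three samples still shows a nonzero signal in each of them, which is the kind of homogenization across multiplexed samples that the abstract describes. The total read count is unchanged; only the assignment to samples shifts.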

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. All rights reserved. No reuse allowed without permission.
Posted April 09, 2017.

Subject Area

  • Molecular Biology