Prospective identification of parasitic sequences in phage display screens

Nucleic Acids Res. 2014 Feb;42(3):1784-98. doi: 10.1093/nar/gkt1104. Epub 2013 Nov 11.

Abstract

Phage display has empowered the development of proteins with new functions and of ligands for clinically relevant targets. In this report, we use next-generation sequencing to analyze phage-displayed libraries and uncover a strong bias induced by amplification preferences of phage in bacteria. This bias favors fast-growing sequences that collectively constitute <0.01% of the available diversity. Specifically, a library of 10^9 random 7-mer peptides (Ph.D.-7) includes a few thousand sequences that grow quickly (the 'parasites'), which are the sequences that are typically identified in phage display screens published to date. A similar collapse was observed in other libraries. Using Illumina and Ion Torrent sequencing and multiple biological replicates of amplification of the Ph.D.-7 library, we identified a focused population of 770 'parasites'. In all, 197 sequences from this population have been identified in literature reports that used the Ph.D.-7 library. Many of these enriched sequences have confirmed function (e.g. target binding capacity). The bias in the literature, thus, can be viewed as a selection with two different selection pressures: (i) target-binding selection, and (ii) amplification-induced selection. Enrichment of parasitic sequences could be minimized if amplification bias is removed. Here, we demonstrate that emulsion amplification in libraries of ~10^6 diverse clones prevents the biased selection of parasitic clones.
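The idea of flagging sequences that are reproducibly enriched by amplification alone can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names (`frequencies`, `candidate_parasites`), the fold-enrichment threshold, the pseudo-frequency, and the toy reads are all assumptions chosen for illustration. The sketch compares the relative frequency of each peptide before and after amplification and keeps only the peptides that exceed the threshold in every replicate.

```python
from collections import Counter


def frequencies(reads):
    """Convert a list of peptide reads into relative frequencies."""
    counts = Counter(reads)
    total = sum(counts.values())
    return {seq: n / total for seq, n in counts.items()}


def candidate_parasites(naive_reads, amplified_replicates,
                        min_fold=2.0, pseudo=1e-9):
    """Return peptides enriched at least min_fold in every replicate.

    min_fold and pseudo are illustrative assumptions, not values
    taken from the paper.
    """
    naive = frequencies(naive_reads)
    shared = None
    for reads in amplified_replicates:
        amplified = frequencies(reads)
        # Peptides whose frequency rose by min_fold or more in this replicate.
        enriched = {
            seq for seq, freq in amplified.items()
            if freq / (naive.get(seq, 0.0) + pseudo) >= min_fold
        }
        # Keep only peptides enriched in all replicates seen so far.
        shared = enriched if shared is None else shared & enriched
    return shared or set()


# Toy data (hypothetical 7-mers, not sequences from the study).
naive = ["AAAAAAA", "CCCCCCC", "DDDDDDD", "EEEEEEE"]
rep1 = ["AAAAAAA"] * 8 + ["CCCCCCC", "DDDDDDD"]
rep2 = ["AAAAAAA"] * 6 + ["EEEEEEE"] * 3 + ["CCCCCCC"]

parasites = candidate_parasites(naive, [rep1, rep2], min_fold=2.0)
print(parasites)
```

Requiring enrichment in every replicate mirrors the abstract's use of multiple biological replicates to arrive at a focused population. For scale, the reported numbers imply that 770 parasites out of 10^9 possible clones is about 0.00008% of the library, and that 197 of the 770 (about 26%) have appeared in literature reports.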

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Cell Surface Display Techniques*
  • Data Interpretation, Statistical
  • High-Throughput Nucleotide Sequencing*
  • Peptide Library*
  • Sequence Analysis, DNA

Substances

  • Peptide Library