Reproducible Discovery of Cell-Binding Peptides “Lost” in Bulk Amplification via Emulsion Amplification in Phage Display Panning

Many pharmaceutically relevant cell-surface receptors are functional only in the context of intact cells. Phage display, while a powerful method for the discovery of ligands for purified proteins, often fails to identify a diverse set of ligands to receptors on a cell membrane mosaic. To understand this deficiency, we examined growth bias in naïve phage display libraries and observed that it fundamentally changes selection outcomes: the presence of growth-biased (parasite) phage clones in a phage library is detrimental to selection, and cell-based panning of such biased libraries is poised to yield ligands from within a small parasite population. Importantly, amplification of phage libraries in water-oil emulsions suppressed the amplification of parasites and steered the selection of biased phage libraries away from the parasite population. Attenuating the growth bias through emulsion amplification reproducibly discovers ligands for cell-surface receptors that cannot be identified in screens that use conventional ‘bulk’ amplification.

Find Ligands in Droplets. Canonical phage display selection of ligands for breast cancer cells, which uses bulk amplification (BA) of phage library, reproducibly identified peptide ligands from a ~0.0001% sub-population of the library, which harbors fast-growing phage. Replacing BA by emulsion-amplification (EmA) altered the selection landscape and yielded cell-binding ligands not accessible to conventional phage-display select

Genetically-encoded (GE) libraries displayed on phage, [1] yeast, [2] or RNA [3] are powerful technologies for the discovery of ligands for virtually any molecular target, including many therapeutically-relevant targets. [4] They also permit selection of ligands that bind multi-target entities such as cells, organs, and the antibody repertoire (reviewed in [5] [6] ). Functional ligands emanating from multi-target screens can give rise to therapeutic candidates, [4,7] targeting probes, [5,8] and instructive materials that control stem cell differentiation [9] and self-renewal. [10] All in vitro GE-selection strategies start from a diverse library of >10^9 ligands and increase the abundance of target-binding 'hits' in the sequence pool via rounds of panning (retention of binding ligands and removal of non-binding ligands) and re-amplification of the recovered ligands. These steps exert two orthogonal selection pressures: (i) panning selects for ligands that bind to the target; (ii) re-amplification selects for library clones that exhibit higher rates of amplification than the population average. Characterizing the amplification bias in the selection of ligands from GE-libraries is critical for improving the reproducibility and efficiency of ligand discovery from these libraries.
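The interplay of the two selection pressures can be illustrated with a toy calculation. The sketch below is not a model taken from this work: it assumes that each round multiplies the frequency of a clone by a hypothetical binding factor (panning) and a hypothetical growth factor (re-amplification), and it represents emulsion amplification simply as equal growth factors for all clones. The clone names and all numerical values are illustrative assumptions.

```python
def run_round(freqs, binding, growth):
    """One round: weight each clone by binding (panning) and by growth
    (re-amplification), then renormalize so frequencies sum to 1."""
    weighted = {c: freqs[c] * binding[c] * growth[c] for c in freqs}
    total = sum(weighted.values())
    return {c: w / total for c, w in weighted.items()}


def simulate(freqs, binding, growth, rounds=3):
    """Apply several identical rounds of panning and re-amplification."""
    for _ in range(rounds):
        freqs = run_round(freqs, binding, growth)
    return freqs


# Hypothetical three-clone library (all values are assumptions).
start = {"binder": 1e-6, "parasite": 1e-6, "background": 1 - 2e-6}
# 'Weak' selection: the true binder is retained only 10x better than the rest.
binding = {"binder": 10.0, "parasite": 1.0, "background": 1.0}
# Bulk amplification: the parasite out-grows other clones 100-fold per round.
growth_bulk = {"binder": 1.0, "parasite": 100.0, "background": 1.0}
# Emulsion amplification, idealized here as no growth advantage at all.
growth_emulsion = {"binder": 1.0, "parasite": 1.0, "background": 1.0}

bulk = simulate(start, binding, growth_bulk)
emulsion = simulate(start, binding, growth_emulsion)

for name, result in (("bulk", bulk), ("emulsion", emulsion)):
    print(name, {c: f"{f:.2e}" for c, f in result.items()})
```

Under these assumed factors the parasite overtakes the binder after three bulk rounds, whereas the binder outranks the parasite when the growth factors are equal. The result depends entirely on the chosen numbers and is meant only to show how a weak binding pressure can be overridden by a growth advantage.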
Amplification biases have been characterized in oligonucleotide libraries, [11] mRNA-displayed libraries, [12] and in phage-displayed libraries of peptides. [13] In phage display specifically, bias originates from mutations in the ribosome-binding sequence [14] or in the (+)-origin; [15] clones that harbor these mutations (also known as "parasites") have higher propagation rates, and these parasites often recur in selection procedures. [13a] They can be identified prospectively by sequencing naïve and amplified libraries. [13a] Unwanted enrichment of parasites in phage libraries, [16] genomic DNA libraries, [17] and in SELEX (ref [18] and references within) can be minimized by employing emulsion amplification (EmA) of these libraries inside monodisperse aqueous droplets. Although the benefits of EmA in nucleotide libraries are documented, the effect of EmA on the outcome of selection from other GE-libraries is not known. In this manuscript, we characterize the role of amplification bias in the selection procedure against intact cells. Because a cell contains several thousand distinct cell-surface receptors, the selection pressure in such screens is 'weak' and these selections are steered towards the parasite sub-population.
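As a minimal sketch of how a comparison of naïve and amplified libraries might be tabulated, the following Python snippet computes a per-sequence [Amp]/[Naive] ratio of normalized read frequencies and flags sequences above a cutoff. The function names, the pseudocount, the cutoff, and the read counts are hypothetical, and no significance testing across replicate sequencing runs is included.

```python
def frequencies(counts):
    """Convert raw read counts to fractions of the total read number."""
    total = sum(counts.values())
    return {seq: n / total for seq, n in counts.items()}


def amplification_ratio(naive_counts, amplified_counts, pseudocount=1):
    """Per-sequence ratio of normalized frequencies, [Amp]/[Naive].

    A pseudocount (an assumption of this sketch) is added to every count
    so that sequences absent from one data set do not cause a division
    by zero.
    """
    seqs = set(naive_counts) | set(amplified_counts)
    naive = frequencies({s: naive_counts.get(s, 0) + pseudocount for s in seqs})
    amp = frequencies({s: amplified_counts.get(s, 0) + pseudocount for s in seqs})
    return {s: amp[s] / naive[s] for s in seqs}


def candidate_parasites(naive_counts, amplified_counts, min_ratio=10.0):
    """Sequences whose relative abundance rose at least min_ratio-fold."""
    ratios = amplification_ratio(naive_counts, amplified_counts)
    return sorted(s for s, r in ratios.items() if r >= min_ratio)


# Hypothetical read counts: 100 clones at equal depth in the naive library;
# one clone expands strongly after amplification without selection.
naive = {f"clone_{i:03d}": 10 for i in range(100)}
amplified = dict(naive)
amplified["clone_000"] = 5000

print(candidate_parasites(naive, amplified))
```

Because the ratio is computed on normalized frequencies, the expansion of one clone lowers the apparent ratio of all others; a real analysis would also need replicate measurements to separate growth bias from sampling noise.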
Reproducibility of such discovery is coupled to the amplification bias: clones that have amplification bias ("parasites") are discovered reproducibly. Decreasing the amplification bias by employing EmA steers the selection landscape away from the parasite population. Reproducibility of discovery after EmA is then driven by the binding preferences of the clones and is not related to their amplification preferences.
As a model multi-receptor target, we employed the breast cancer cell line MDA-MB-231 ("MB-231", Figure S1). A phage-displayed library of 10^9 7-mer peptides in the M13KE vector (Ph.D.-7) is a convenient library for such selection because the molecular mechanisms of bias in this library are well characterized and the library has been used in over 1000 publications to date (source: BioPanning Database [19] ). We categorized both libraries into the 'visible population' (V), defined as the 10^6-10^7 sequences identified by Illumina sequencing, and the 'invisible population' (I), corresponding to the set of all possible library members excluding the visible population (Figure 1D). Within the V-population, we mapped the 'parasite population' (P) by re-amplification of the library in the absence of selection. [13a] In the Ph.D.-7 library (Figure S2), the P-population consisted of ~10^3 sequences that increased significantly (p<0.05) in such amplification, and the magnitude of bias, defined as the population average of [Amp]/[Naive], was a factor of 10 (Figure 1B-D and Figure S2). Three independent replicates of selection from the strongly biased phage library against MB-231 reproducibly yielded binding peptide clones from the P-population (Figure 1F). In rounds 1 through 3 of a representative screen, 39 out of the top 50 unique sequences originated from the P-population and 11 from the V-population (Figures 1F and S3). The probability that 39/50 unique sequences originate from a population that holds 10^3/10^9 = 0.0001% of the unique sequences is negligible (p ~ 10^(-39×6)). To show that the steering of the selection to the P-population was caused by a growth bias, we reduced the growth bias by using EmA in place of the conventional "bulk amplification" (BA) used in the selection.
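The order of magnitude of this probability can be checked with a short calculation. The sketch below assumes a simple null model that is not stated in the text: each of the top 50 unique sequences is drawn independently, and lands in the P-population with probability 10^3/10^9. The function name is hypothetical, and the sum is carried out in log space to avoid numerical underflow.

```python
import math


def log10_binomial_tail(n, k_min, p):
    """log10 of P(X >= k_min) for X ~ Binomial(n, p), summed in log space."""
    log_terms = [
        math.log(math.comb(n, k)) + k * math.log(p) + (n - k) * math.log1p(-p)
        for k in range(k_min, n + 1)
    ]
    largest = max(log_terms)
    # log-sum-exp: factor out the largest term before exponentiating
    log_sum = largest + math.log(sum(math.exp(t - largest) for t in log_terms))
    return log_sum / math.log(10)


parasite_fraction = 1e3 / 1e9  # ~10^3 parasite sequences in a 10^9 library
log10_p = log10_binomial_tail(n=50, k_min=39, p=parasite_fraction)
print(f"log10 P(at least 39 of 50 from P-population) = {log10_p:.1f}")
```

Under this null model the tail probability is roughly 10^-223. The estimate quoted in the text keeps only the (10^-6)^39 factor; the binomial coefficient adds about ten orders of magnitude, and either value is negligible.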
All other parameters in the selection: type of cell, type and amount of library, selection procedure (i.e.,

To estimate the affinity of the cell-binding hits enriched in panning, we synthesized the peptide sequences on Teflon-patterned paper arrays and performed short-term adhesion of MB-231-GFP cells using a previously validated assay. [20] Confocal fluorescence microscopy (Figures 2H, S5-S6) confirmed the presence of cells; a fluorescence scanner and a calibration curve (Figure S8) were used to estimate the number of cells adhering to each peptide (Figure 2B, S7, S9-S13). Comparing the number of cells adhered to peptides of unknown affinity with the number adhered to peptides of known affinity (RGD, 10 µM) [21] allowed preliminary estimation of binding strength (Figure S7D).

Figure S14, S15, and S16). peptides total, 209 unique, yielded 56 unique cell-binding hits that supported adhesion of significantly more cells than the negative control (GGRDS peptide, p < 0.05, Figure 2C-D). The fraction of validated cell-binding peptides was ~30% in BA-screens and up to 60% in EmA-screens (Figure 2C-D). The strength of the peptide-cell interaction, extrapolated from cell numbers, was significantly higher for the population of peptides discovered in EmA selection than for that discovered in BA selection.
To assess whether the identified peptides target different receptors on the MB-231 cell surface, we tested whether MB-231-binding peptides support adhesion of closely related MCF-7 breast carcinoma cells and distantly related HT29 colon carcinoma cells (Figure 2B, S14-S17). Of 36 tested peptides, two bound only to MB-231 cells and four bound to MB-231 and MCF-7 but not HT29. These observations suggest that the population of peptides identified in the screen indeed targets distinct receptors; this hypothesis can be strengthened in the future using techniques such as affinity pull-down and proteomic analysis.
An average cell contains several thousand molecularly distinct receptors. Our results highlight that GE-screens in such a multi-receptor milieu are sensitive to intrinsic secondary selection pressures in the GE-library. Figure 2E-F summarizes the steering of the selection of cell-binding ligands in the 7-mer peptide landscape. The confirmed cell-binding ligands selected in BA-screens from lots #1 and #2 of the Ph.D.-7 library are steered to non-overlapping P1 and P2 populations, respectively (Figure 2E-G).
The steering towards distinct populations that comprise less than 0.0001% of the available diversity makes it impossible to discover the same sequence from two different preparations of the library. This irreproducibility makes it impossible to compare the outcomes of selections: if similar libraries are produced by two different research groups, selections from the two libraries are likely to diverge and yield two different sets of binding ligands.
The observed steering and biases are linked to a well-characterized bias in M13KE originating from two factors: (i) proximity of the cloned lacZα gene to the origin of replication and regulatory regions depresses replication and makes it possible for phage to acquire rare beneficial mutations that increase the growth rate; [14b] (ii) amplification via continuous infection and secretion of mature phage converts even minor growth advantages (e.g., 20% increase) to major differences in composition (100-fold increase in one amplification). [22]

Conclusion: We characterize discovery trajectories in phage-displayed libraries built on the M13KE genome with well-characterized bias. Simple replacement of BA with EmA yields three major outcomes: (i) EmA-selection steers the selections away from the pre-defined parasite populations; (ii) Steering away from the "parasite" population discovers ligands from the regions of the library not