Exploiting Models of Molecular Evolution to Efficiently Direct Protein Engineering

Journal of Molecular Evolution

Abstract

Directed evolution and protein engineering approaches that aim to generate novel or enhanced biomolecular function often use the evolutionary sequence diversity of protein homologs to guide library design rationally. To fully capture this sequence diversity, however, libraries containing millions of variants are often necessary. Screening libraries of this size is frequently impractical because of the inaccuracies of high-throughput assays, cost, and time constraints. The ability to cull sequence diversity effectively while retaining the functional diversity within a library therefore holds considerable value, particularly when high-throughput assays are not suited to selecting or screening for certain biomolecular properties. Here, we summarize our recent efforts to develop an evolution-guided approach to directed evolution and protein engineering, Reconstructing Evolutionary Adaptive Paths (REAP), which exploits phylogenetic and sequence analyses to identify amino acid substitutions that are likely to alter or enhance the function of a protein. To demonstrate the utility of this technique, we highlight our previous work with DNA polymerases, in which a small REAP-designed library was used to identify a DNA polymerase capable of accepting non-standard nucleosides. We anticipate that the REAP approach will facilitate the engineering of biopolymers with expanded functions and will thus have a significant impact on the developing field of ‘evolutionary synthetic biology’.
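The scale problem described in the abstract can be illustrated with simple combinatorics. The sketch below is not part of the REAP method itself; it only shows how the size of a full combinatorial library grows with the number of variable sites, and how restricting the sites and candidate residues shrinks it. The function name `library_size` and all numbers are hypothetical, chosen purely for illustration.

```python
from math import prod


def library_size(choices_per_site):
    """Count the distinct variants in a full combinatorial library.

    choices_per_site: iterable giving, for each variable site, the number
    of amino acids allowed at that site (hypothetical inputs).
    """
    return prod(choices_per_site)


# Hypothetical case: 10 variable sites, 4 residues observed among homologs at each.
full_library = library_size([4] * 10)

# Hypothetical focused design: 5 prioritized sites, 2 candidate residues at each.
focused_library = library_size([2] * 5)

print(full_library)     # 1048576
print(focused_library)  # 32
```

Under these assumed numbers, a full library already exceeds one million variants, whereas the focused design contains only a few dozen, which is the kind of reduction that makes low-throughput assays feasible.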

Acknowledgments

This work was supported by a National Institutes of Health grant to EAG. MFC was supported by an NIH NRSA award and in part by the Emory University Fellowship in Research and Science Teaching (FIRST) program’s NIH/NIGMS IRACDA grant number K12 GM000680-11. This work was also supported by the National Aeronautics and Space Administration’s Exobiology and Astrobiology Programs.

Author information

Corresponding author

Correspondence to Eric A. Gaucher.

About this article

Cite this article

Cole, M.F., Gaucher, E.A. Exploiting Models of Molecular Evolution to Efficiently Direct Protein Engineering. J Mol Evol 72, 193–203 (2011). https://doi.org/10.1007/s00239-010-9415-2

  • DOI: https://doi.org/10.1007/s00239-010-9415-2
