Genomic features of bacterial adaptation to plants

This article has been updated

Abstract

Plants intimately associate with diverse bacteria. Plant-associated bacteria have ostensibly evolved genes that enable them to adapt to plant environments. However, the identities of such genes are mostly unknown, and their functions are poorly characterized. We sequenced 484 genomes of bacterial isolates from roots of Brassicaceae, poplar, and maize. We then compared 3,837 bacterial genomes to identify thousands of plant-associated gene clusters. Genomes of plant-associated bacteria encode more carbohydrate metabolism functions and fewer mobile elements than related non-plant-associated genomes do. We experimentally validated candidates from two sets of plant-associated genes: one involved in plant colonization, and the other serving in microbe–microbe competition between plant-associated bacteria. We also identified 64 plant-associated protein domains that potentially mimic plant domains; some are shared with plant-associated fungi and oomycetes. This work expands the genome-based understanding of plant–microbe interactions and provides potential leads for efficient and sustainable agriculture through microbiome engineering.

Fig. 1: The genome dataset used in analysis, and differences in gene category abundances.
Fig. 2: Validation of predicted plant-associated genes by multiple approaches.
Fig. 3: Proteins and protein domains that were reproducibly enriched as plant-associated or root-associated in multiple taxa.
Fig. 4: A protein family shared by plant-associated bacteria, fungi, and oomycetes that resemble plant proteins.
Fig. 5: Co-occurring plant-associated and soil-associated flagellum-like gene clusters are sporadically distributed across Burkholderiales.
Fig. 6: Rapidly diversifying, high-copy-number Jekyll and Hyde plant-associated genes.
Fig. 7: Hyde1 proteins of Acidovorax citrulli AAC00-1 are toxic to E. coli and various plant-associated bacterial strains.

Change history

  • 05 April 2018

    In the version of this article initially published, owing to technical errors during production, Supplementary Tables 2–26 were linked to the incorrect legends, and the replacement files posted were corrupted. The errors have been corrected in the HTML version of the paper.

Acknowledgements

The work conducted by the US Department of Energy Joint Genome Institute, a DOE Office of Science User Facility, is supported by the Office of Science of the US Department of Energy under contract no. DE-AC02-05CH11231. J.L.D. and S.G.T. were supported by NSF INSPIRE grant IOS-1343020, and J.L.D. was also supported by DOE–USDA Feedstock Award DE-SC001043 and by the Office of Science (BER), US Department of Energy, grant no. DE-SC0014395. S.H.P. was supported by NIH Training Grant T32 GM067553-06 and was a Howard Hughes Medical Institute (HHMI) International Student Research Fellow. D.S.L. was supported by NIH Training Grant T32 GM07092-34. J.L.D. is an Investigator of the HHMI, supported by the HHMI and the Gordon and Betty Moore Foundation (GBMF3030). M.E.F. was supported by NIH Dr. Ruth L. Kirschstein NRSA Fellowship F32-GM112345. D.A.P. and T.-Y.L. were supported by the Genomic Science Program, US Department of Energy, Office of Science, Biological and Environmental Research as part of the Oak Ridge National Laboratory Plant Microbe Interfaces Scientific Focus Area (http://pmi.ornl.gov) and Plant Feedstock Genomics Award DE-SC001043. Oak Ridge National Laboratory is managed by UT-Battelle, LLC, for the US Department of Energy under contract DE-AC05-00OR22725. J.A.V. was supported by a SystemsX.ch grant (Micro2X) and a European Research Council (ERC) advanced grant (PhyMo). We thank I. Bertani, C. Bez, R. Bowers, D. Burstein, A. Chun Chen, D. Chiniquy, B. Cole, O. Cohen, A. Copeland, J. Eisen, E. Eloe-Fadrosh, M. Hadjithomas, O. Finkel, H. Schnitzel Meule Fux, N. Ivanova, J. Knelman, R. Malmstrom, R. Perez-Torres, D. Salomon, R. Sorek, T. Mucyn, R. Seshadri, T.K. Reddy, L. Ryan, and H. Sberro Livnat for general help, text editing, and ideas for this work. We thank R. Walcott (University of Georgia, Athens, GA, USA) for providing the Acidovorax citrulli VasD mutant strain.

Author information

Contributions

A.L. performed most data analysis and wrote the paper. I.S.G. performed phylogenetic inference, performed phylogenetically aware analyses, analyzed the data, provided the supporting website, and contributed to manuscript writing. M. Mittelviefhaus and J.A.V. designed and performed experiments related to Hyde1 gene function and contributed to manuscript writing. S.C. isolated single bacterial cells and prepared metadata for data analysis. F.M. analyzed data. S.H.P. analyzed data and contributed to manuscript writing. J.M. produced a mutant strain for Hyde1. K.W. tested Hyde1 toxicity in E. coli. G.D. and V.V. produced deletion mutants and designed and performed rice root colonization experiments. K.S. helped in data analysis. B.R.A. prepared metadata for data analysis. D.S.L., T.-Y.L., S.L., Z.J., M. McDonald, A.P.K., M.E.F., and S.L.D. isolated bacteria from different plants or managed this process. T.G.d.R. managed the sequencing project. S.R.G., D.A.P., and R.E.L. managed bacterial isolation efforts and contributed to manuscript writing. B.Z. managed Hyde1 deletion and toxicity testing. S.G.T. contributed to manuscript writing. T.W. managed single-cell isolation efforts and contributed to manuscript writing. J.L.D. directed the overall project and contributed to manuscript writing.

Corresponding authors

Correspondence to Susannah G. Tringe, Tanja Woyke or Jeffery L. Dangl.

Ethics declarations

Competing interests

J.L.D. is a cofounder of and shareholder in, and S.H.P. collaborates with, AgBiome LLC, a corporation that aims to use plant-associated microbes to improve plant productivity.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–29 and Supplementary Note 1.

Life Sciences Reporting Summary

Supplementary Table 1

All genomes used. Lists of all genomes used from nine taxa (pre-filtration). Cells filled with yellow are Brassicaceae root isolates from the USA, cells filled with green are single cells isolated from Arabidopsis thaliana, cells filled with pink are poplar isolates, cells filled with blue are recently published leaf and root Arabidopsis and soil isolates from Europe, and cells filled with purple are maize root isolates. The “Filtered out?” column is ‘N’ if the genome was retained for analysis after the QA process. “Representative genome taxid” is the taxon id of another genome (a different row in the same tab) representing at least two redundant genomes. Completeness and contamination values were calculated with CheckM. The full genome sequence, gene annotation, and metadata of each genome used can be found on the IMG website, https://img.jgi.doe.gov/. For example, the metadata of taxon id 2558860101 can be found at https://img.jgi.doe.gov/cgi-bin/mer/main.cgi?section=TaxonDetail&page=taxonDetail&taxon_oid=2558860101.

Supplementary Table 2

Statistics of genomes in the taxa used

Supplementary Table 3

Sequencing and assembly information of new genomes

Supplementary Table 4

Abundance of the nine taxa in 16S marker gene surveys. The relative abundances of the taxa composing a specific taxon were taken from the different publications and summed to yield the relative abundance of that taxon. In cases with biological replicates (e.g., Lundberg et al., Nature 2012), we used the median value.

Supplementary Table 5

Genome size comparison. Genome size comparison between the different isolation sites done by t-test and PhyloGLM. Each cell denotes the group with the largest genomes, if the difference is significant (P < 0.05). N.S. - not significant. PhyloGLM test takes into account the phylogenetic structure of the taxon.

Supplementary Table 6

COG-to-COG category mapping

Supplementary Table 7

Acinetobacter PA/NPA/RA/soil genes/domains. Phylogenetic diversity is the median pairwise distance between the genomes hosting the genes in the cluster. Values for each test are "Y", "N", or "Untested" (clusters were untested when there was insufficient phylogenetic signal, when they were too small, or when they were found in all genomes). A cluster was considered significant in pfam/COG/TIGRFAM/KO + hypergbin/hypergcn at q-value < 0.05 (Benjamini-Hochberg FDR corrected). A cluster was considered significant in OrthoFinder + hypergbin/hypergcn at Bonferroni-corrected P < 0.1. A cluster was considered a significant PA/RA cluster in phyloglmbin/phyloglmcn at q-value < 0.05 (Benjamini-Hochberg FDR corrected) with an estimate > 0 (or an estimate < 0 for significant NPA/soil clusters). A cluster was considered a significant PA/RA cluster in Scoary at P < 0.05 for three tests (Fisher exact test with Benjamini-Hochberg FDR correction, worst pairing scenario test, and empirical test) with a Fisher exact test odds ratio > 1 (odds ratio < 1 for NPA/soil).

Supplementary Table 8

Actinobacteria1 PA/NPA/RA/soil genes/domains. Phylogenetic diversity is the median pairwise distance between the genomes hosting the genes in the cluster. Values for each test are "Y", "N", or "Untested" (clusters were untested when there was insufficient phylogenetic signal, when they were too small, or when they were found in all genomes). A cluster was considered significant in pfam/COG/TIGRFAM/KO + hypergbin/hypergcn at q-value < 0.05 (Benjamini-Hochberg FDR corrected). A cluster was considered significant in OrthoFinder + hypergbin/hypergcn at Bonferroni-corrected P < 0.1. A cluster was considered a significant PA/RA cluster in phyloglmbin/phyloglmcn at q-value < 0.05 (Benjamini-Hochberg FDR corrected) with an estimate > 0 (or an estimate < 0 for significant NPA/soil clusters). A cluster was considered a significant PA/RA cluster in Scoary at P < 0.05 for three tests (Fisher exact test with Benjamini-Hochberg FDR correction, worst pairing scenario test, and empirical test) with a Fisher exact test odds ratio > 1 (odds ratio < 1 for NPA/soil).

Supplementary Table 9

Actinobacteria2 PA/NPA/RA/soil genes/domains. Phylogenetic diversity is the median pairwise distance between the genomes hosting the genes in the cluster. Values for each test are "Y", "N", or "Untested" (clusters were untested when there was insufficient phylogenetic signal, or when they were too small or were found in all genomes). Clusters were considered significant in pfam/COG/TIGRFAM/KO + hypergbin/hypergcn at q-value < 0.05 (Benjamini-Hochberg FDR corrected). Clusters were considered significant in OrthoFinder + hypergbin/hypergcn at Bonferroni-corrected P < 0.1. Clusters were considered significant PA/RA in phyloglmbin/phyloglmcn at q-value < 0.05 (Benjamini-Hochberg FDR corrected) and an estimate > 0 (or estimate < 0 for significant NPA/soil). Clusters were considered significant PA/RA in Scoary at P < 0.05 for three tests: Fisher exact test (Benjamini-Hochberg FDR corrected), worst pairing scenario test, and empirical test, and an odds ratio of the Fisher exact test > 1 (odds ratio < 1 for NPA/soil).

Supplementary Table 10

Alphaproteobacteria PA/NPA/RA/soil genes/domains. Phylogenetic diversity is the median pairwise distance between the genomes hosting the genes in the cluster. Values for each test are "Y", "N", or "Untested" (clusters were untested when there was insufficient phylogenetic signal, or when they were too small or were found in all genomes). Clusters were considered significant in pfam/COG/TIGRFAM/KO + hypergbin/hypergcn at q-value < 0.05 (Benjamini-Hochberg FDR corrected). Clusters were considered significant in OrthoFinder + hypergbin/hypergcn at Bonferroni-corrected P < 0.1. Clusters were considered significant PA/RA in phyloglmbin/phyloglmcn at q-value < 0.05 (Benjamini-Hochberg FDR corrected) and an estimate > 0 (or estimate < 0 for significant NPA/soil). Clusters were considered significant PA/RA in Scoary at P < 0.05 for three tests: Fisher exact test (Benjamini-Hochberg FDR corrected), worst pairing scenario test, and empirical test, and an odds ratio of the Fisher exact test > 1 (odds ratio < 1 for NPA/soil).

Supplementary Table 11

Bacillales PA/NPA/RA/soil genes/domains. Phylogenetic diversity is the median pairwise distance between the genomes hosting the genes in the cluster. Values for each test are "Y", "N", or "Untested" (clusters were untested when there was insufficient phylogenetic signal, or when they were too small or were found in all genomes). Clusters were considered significant in pfam/COG/TIGRFAM/KO + hypergbin/hypergcn at q-value < 0.05 (Benjamini-Hochberg FDR corrected). Clusters were considered significant in OrthoFinder + hypergbin/hypergcn at Bonferroni-corrected P < 0.1. Clusters were considered significant PA/RA in phyloglmbin/phyloglmcn at q-value < 0.05 (Benjamini-Hochberg FDR corrected) and an estimate > 0 (or estimate < 0 for significant NPA/soil). Clusters were considered significant PA/RA in Scoary at P < 0.05 for three tests: Fisher exact test (Benjamini-Hochberg FDR corrected), worst pairing scenario test, and empirical test, and an odds ratio of the Fisher exact test > 1 (odds ratio < 1 for NPA/soil).

Supplementary Table 12

Bacteroidetes PA/NPA/RA/soil genes/domains. Phylogenetic diversity is the median pairwise distance between the genomes hosting the genes in the cluster. Values for each test are "Y", "N", or "Untested" (clusters were untested when there was insufficient phylogenetic signal, or when they were too small or were found in all genomes). Clusters were considered significant in pfam/COG/TIGRFAM/KO + hypergbin/hypergcn at q-value < 0.05 (Benjamini-Hochberg FDR corrected). Clusters were considered significant in OrthoFinder + hypergbin/hypergcn at Bonferroni-corrected P < 0.1. Clusters were considered significant PA/RA in phyloglmbin/phyloglmcn at q-value < 0.05 (Benjamini-Hochberg FDR corrected) and an estimate > 0 (or estimate < 0 for significant NPA/soil). Clusters were considered significant PA/RA in Scoary at P < 0.05 for three tests: Fisher exact test (Benjamini-Hochberg FDR corrected), worst pairing scenario test, and empirical test, and an odds ratio of the Fisher exact test > 1 (odds ratio < 1 for NPA/soil).

Supplementary Table 13

Burkholderiales PA/NPA/RA/soil genes/domains. Phylogenetic diversity is the median pairwise distance between the genomes hosting the genes in the cluster. Values for each test are "Y", "N", or "Untested" (clusters were untested when there was insufficient phylogenetic signal, or when they were too small or were found in all genomes). Clusters were considered significant in pfam/COG/TIGRFAM/KO + hypergbin/hypergcn at q-value < 0.05 (Benjamini-Hochberg FDR corrected). Clusters were considered significant in OrthoFinder + hypergbin/hypergcn at Bonferroni-corrected P < 0.1. Clusters were considered significant PA/RA in phyloglmbin/phyloglmcn at q-value < 0.05 (Benjamini-Hochberg FDR corrected) and an estimate > 0 (or estimate < 0 for significant NPA/soil). Clusters were considered significant PA/RA in Scoary at P < 0.05 for three tests: Fisher exact test (Benjamini-Hochberg FDR corrected), worst pairing scenario test, and empirical test, and an odds ratio of the Fisher exact test > 1 (odds ratio < 1 for NPA/soil).

Supplementary Table 14

Pseudomonas PA/NPA/RA/soil genes/domains. Phylogenetic diversity is the median pairwise distance between the genomes hosting the genes in the cluster. Values for each test are "Y", "N", or "Untested" (clusters were untested when there was insufficient phylogenetic signal, or when they were too small or were found in all genomes). Clusters were considered significant in pfam/COG/TIGRFAM/KO + hypergbin/hypergcn at q-value < 0.05 (Benjamini-Hochberg FDR corrected). Clusters were considered significant in OrthoFinder + hypergbin/hypergcn at Bonferroni-corrected P < 0.1. Clusters were considered significant PA/RA in phyloglmbin/phyloglmcn at q-value < 0.05 (Benjamini-Hochberg FDR corrected) and an estimate > 0 (or estimate < 0 for significant NPA/soil). Clusters were considered significant PA/RA in Scoary at P < 0.05 for three tests: Fisher exact test (Benjamini-Hochberg FDR corrected), worst pairing scenario test, and empirical test, and an odds ratio of the Fisher exact test > 1 (odds ratio < 1 for NPA/soil).

Supplementary Table 15

Xanthomonadaceae PA/NPA/RA/soil genes/domains. Phylogenetic diversity is the median pairwise distance between the genomes hosting the genes in the cluster. Values for each test are "Y", "N", or "Untested" (clusters were untested when there was insufficient phylogenetic signal, or when they were too small or were found in all genomes). Clusters were considered significant in pfam/COG/TIGRFAM/KO + hypergbin/hypergcn at q-value < 0.05 (Benjamini-Hochberg FDR corrected). Clusters were considered significant in OrthoFinder + hypergbin/hypergcn at Bonferroni-corrected P < 0.1. Clusters were considered significant PA/RA in phyloglmbin/phyloglmcn at q-value < 0.05 (Benjamini-Hochberg FDR corrected) and an estimate > 0 (or estimate < 0 for significant NPA/soil). Clusters were considered significant PA/RA in Scoary at P < 0.05 for three tests: Fisher exact test (Benjamini-Hochberg FDR corrected), worst pairing scenario test, and empirical test, and an odds ratio of the Fisher exact test > 1 (odds ratio < 1 for NPA/soil).
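The multiple-testing thresholds repeated in the taxon captions above can be sketched as follows. This is a generic, pure-Python illustration of Benjamini-Hochberg q-values, Bonferroni correction, and the sign-of-estimate rule; the function names and the "PA/RA", "NPA/soil", and "N" labels returned by `call_cluster` are assumptions for illustration and do not reproduce the authors' pipeline.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values (q-values) in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so q-values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


def bonferroni(pvalues):
    """Return Bonferroni-corrected P values, capped at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


def call_cluster(qvalue, estimate, alpha=0.05):
    """Apply the q-value threshold and the sign of the model estimate."""
    if qvalue < alpha and estimate > 0:
        return "PA/RA"
    if qvalue < alpha and estimate < 0:
        return "NPA/soil"
    return "N"


# Toy P values and estimates, invented for illustration only.
toy_p = [0.01, 0.04, 0.03, 0.5]
toy_estimates = [1.2, 0.5, -0.3, 2.0]
toy_q = benjamini_hochberg(toy_p)
toy_calls = [call_cluster(q, e) for q, e in zip(toy_q, toy_estimates)]
```

For the OrthoFinder clusters, the captions use the Bonferroni-corrected P value with a threshold of 0.1 instead; in this sketch that corresponds to comparing the output of `bonferroni` against 0.1.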

Supplementary Table 16

Validation of PA/NPA/RA/soil genes through metagenomes. a. Samples used (n = 38), b. Summary of results based on a two-sided t-test.

Supplementary Table 17

Validation of PA genes in Paraburkholderia kururiensis M130. a. Mutant used and statistical test results, b. Raw data: cfu/g root, c. Primers used.

Supplementary Table 18

The number of operons predicted by different approaches.

Supplementary Table 19

Reproducible PA domains. a. Protein domains that are significantly PA in at least three taxa by at least two tests. NA – test results are not available (untested), NS – non-significant result. b. Fractions for LacI proteins within genomes, c. Fraction of pfam00248 domain within genomes.

Supplementary Table 20

DNA motifs predicted to be bound by LacI transcription factors. Predicted promoter sequences are intergenic sequences, at least 25 bp long, located upstream of carbohydrate metabolism and transport genes that are found directly adjacent to LacI genes. The most abundant k-mers of different lengths were detected using wordcount (EMBOSS package). The most abundant motifs found in multiple taxa were compared against their distribution in random intergenic sequences using the Fisher exact test.
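The motif analysis described in this caption (count k-mers in predicted promoters, then compare a motif's occurrence against random intergenic sequences) can be sketched in Python. The k-mer counter is a stand-in for EMBOSS wordcount, not a wrapper around it. The one-sided alternative in the Fisher exact test, the presence/absence contingency table, and all function names are assumptions; the caption does not specify these details.

```python
from collections import Counter
from math import comb


def count_kmers(sequences, k):
    """Count every overlapping k-mer across a list of DNA sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def motif_table(motif, promoters, background):
    """Build a 2x2 table: sequences with/without the (uppercase) motif."""
    a = sum(motif in s.upper() for s in promoters)
    c = sum(motif in s.upper() for s in background)
    return a, len(promoters) - a, c, len(background) - c


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher exact P value for the table [[a, b], [c, d]].

    Sums hypergeometric probabilities for tables at least as enriched
    in the top-left cell as the observed one (margins held fixed).
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    tail = sum(comb(col1, x) * comb(n - col1, row1 - x)
               for x in range(a, upper + 1))
    return tail / comb(n, row1)


# Toy sequences, invented for illustration only.
top_kmer = count_kmers(["ACGTAC"], 2).most_common(1)[0]
table = motif_table("AC", ["ACGT", "GGGG"], ["TTTT", "ACAC", "CCCC"])
p_value = fisher_exact_greater(3, 0, 0, 3)
```

In practice, a library routine such as `scipy.stats.fisher_exact` would normally replace the hand-written test; the explicit version is shown only to keep the sketch dependency-free.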

Supplementary Table 21

PREPARADOs. Pfam domains that are both significant PA/RA domains (reproducibly found as such in multiple taxa or by multiple approaches) and more abundant in plants than in bacteria according to Pfam (PREPARADOs). Pfams labeled in yellow are carbohydrate-related and are part of proteins found in eukaryotes and bacteria with full-length sequence similarity, having an N-terminal signal peptide and lacking a transmembrane domain. Cells marked in green are domains that are predicted to be secreted by Sec or T3SS (more than 50% of the bacterial proteins having the domain are predicted to be secreted by these secretion systems).

Supplementary Table 22

Full-length proteins conserved between PA bacterial genes and eukaryotic genes. LAST alignment results of PREPARADO-containing proteins from bacteria (query) against plant, fungal, oomycete, and protist proteins from RefSeq (target). Only alignments with over 40% identity that stretch across at least 90% of both the query and target lengths are shown.
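The full-length filter stated in this caption (over 40% identity, at least 90% coverage of both query and target) can be expressed as a small Python predicate. The `Alignment` record and its field names are hypothetical and do not correspond to LAST's actual output columns; only the two thresholds come from the caption.

```python
from dataclasses import dataclass


@dataclass
class Alignment:
    """Hypothetical summary of one query/target alignment."""
    query_len: int          # full length of the bacterial query protein
    target_len: int         # full length of the eukaryotic target protein
    query_aligned: int      # residues of the query covered by the alignment
    target_aligned: int     # residues of the target covered by the alignment
    percent_identity: float


def passes_full_length_filter(aln, min_identity=40.0, min_coverage=0.90):
    """Keep alignments over 40% identity spanning >= 90% of both proteins."""
    return (
        aln.percent_identity > min_identity
        and aln.query_aligned / aln.query_len >= min_coverage
        and aln.target_aligned / aln.target_len >= min_coverage
    )


# Toy alignments, invented for illustration only.
full_length = Alignment(300, 310, 290, 285, 45.0)
partial_hit = Alignment(300, 600, 290, 290, 80.0)
kept = [a for a in (full_length, partial_hit) if passes_full_length_filter(a)]
```

Requiring coverage on both sides is what separates full-length conservation from a shared domain embedded in an otherwise unrelated, longer protein, as in the second toy alignment.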

Supplementary Table 23

Jekyll and Hyde. Gene homologs of Jekyll and Hyde proteins based on protein homologs in IMG. To find all homologs and paralogs of Jekyll and Hyde genes (a-d), we used IMG BLAST search with an E-value threshold of 1e-5 against all IMG isolates, some of which were not included in the original comparative analysis; hence, their genes are not part of any cluster. Since Hyde1 proteins are rapidly evolving, they are scattered across multiple OrthoFinder orthogroups. Metadata in a-d were retrieved from the IMG website. a. Jekyll protein homologs of Acidovorax gene Ga0102403_10160, b. Hyde1 protein homologs of Acidovorax protein Aave_1071, c. Hyde1-like protein homologs of Pseudomonas protein A243_06583, d. Hyde2 homologs of Ga0078621_123530, e. Hyde1-like-Hyde2 loci in representative Proteobacteria, one per genus, and their location adjacent to T6SS genes and within genomes that encode T6SS. Hyde2 was found based on a BLAST search against the nr database with Acav_4635 as the query.

Supplementary Table 24

Divergence of Jekyll gene operon. An analysis of the Jekyll gene cluster that is presented in Figure 6b. Control genes are shown in Figure S26c. The table summarizes a comparison between multiple sequence alignments of the Jekyll locus (Figure S24b) and the control genes (Figure S24c).

Supplementary Table 25

Toxicity of Hyde proteins and recovery of prey cells confronted with Hyde-encoding Acidovorax and different mutants. Includes primers used to make Acidovorax deletion strains, strains used as prey and their antibiotic resistance, raw results for cell toxicity and competition assays.

Supplementary Table 26

Significant orthogroups (OrthoFinder clusters) supported by three statistical approaches: either hypergbin, phyloglmbin, and Scoary, or hypergcn, phyloglmcn, and Scoary.
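The selection rule in this caption amounts to a simple logical test over per-approach calls. The sketch below assumes each orthogroup carries a mapping from test name to "Y", "N", or "Untested", mirroring the values used in the taxon tables; the dictionary layout and function name are illustrative assumptions.

```python
def supported_by_three(calls):
    """True if the binary trio or the copy-number trio are all "Y".

    `calls` is a hypothetical dict mapping test name to "Y", "N",
    or "Untested"; missing tests are treated as not significant.
    """
    def significant(name):
        return calls.get(name) == "Y"

    binary = all(significant(t) for t in ("hypergbin", "phyloglmbin", "Scoary"))
    copy_number = all(significant(t) for t in ("hypergcn", "phyloglmcn", "Scoary"))
    return binary or copy_number


# Toy calls, invented for illustration only.
example_calls = {
    "hypergbin": "Y", "phyloglmbin": "Untested",
    "hypergcn": "Y", "phyloglmcn": "Y", "Scoary": "Y",
}
example_supported = supported_by_three(example_calls)
```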

About this article

Cite this article

Levy, A., Salas Gonzalez, I., Mittelviefhaus, M. et al. Genomic features of bacterial adaptation to plants. Nat Genet 50, 138–150 (2018). https://doi.org/10.1038/s41588-017-0012-9

  • DOI: https://doi.org/10.1038/s41588-017-0012-9
