The extracellular leucine-rich repeat superfamily; a comparative survey and analysis of evolutionary relationships and expression patterns

BMC Genomics. 2007 Sep 14;8:320. doi: 10.1186/1471-2164-8-320.

Abstract

Background: Leucine-rich repeats (LRRs) are highly versatile and evolvable protein-ligand interaction motifs found in a large number of proteins with diverse functions, including innate immunity and nervous system development. Here we catalogue all of the extracellular LRR (eLRR) proteins in worms, flies, mice and humans. We use convergent evidence from several transmembrane-prediction and motif-detection programs, including a customised algorithm, LRRscan, to identify eLRR proteins, and a hierarchical clustering method based on TribeMCL to establish their evolutionary relationships.
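As an illustration of the motif-detection step, the sketch below scans a protein sequence for the widely cited LRR consensus LxxLxLxxNxL. It is not LRRscan, whose scoring scheme is not described in this abstract; the residue classes, the function names and the `min_repeats` threshold are assumptions made for the example.

```python
import re

# Simplified LRR consensus LxxLxLxxNxL (x = any residue).
# Assumed residue classes: "L" positions accept L/I/V/F,
# the "N" position accepts N/T/S/C.
HYDROPHOBIC = "LIVF"
ASN_LIKE = "NTSC"

# A lookahead is used so that overlapping candidate repeats are all reported.
LRR_PATTERN = re.compile(
    rf"(?=([{HYDROPHOBIC}]..[{HYDROPHOBIC}].[{HYDROPHOBIC}]"
    rf"..[{ASN_LIKE}].[{HYDROPHOBIC}]))"
)


def find_lrr_motifs(sequence):
    """Return (start_index, matched_segment) for every consensus hit."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in LRR_PATTERN.finditer(seq)]


def looks_like_lrr_protein(sequence, min_repeats=2):
    """Crude filter: flag sequences with at least min_repeats consensus hits."""
    return len(find_lrr_motifs(sequence)) >= min_repeats


if __name__ == "__main__":
    # Synthetic toy sequence with two consensus segments, not a real protein.
    toy = "MK" + "LGGLGLGGNGL" + "P" * 13 + "LGGLGLGGNGL"
    print(find_lrr_motifs(toy))
    print(looks_like_lrr_protein(toy))
```

A regular expression of this kind is far less sensitive than a profile-based detector, so in practice it would serve only as one line of evidence alongside transmembrane and signal-peptide predictions.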

Results: This approach yields a total of 369 proteins (29 in worm, 66 in fly, 135 in mouse and 139 in human), many of them of unknown function. We group eLRR proteins into several classes: those with only LRRs, those that cluster with Toll-like receptors (Tlrs), those with immunoglobulin or fibronectin type 3 (FN3) domains and those with some other domain. These groups show differential patterns of expansion and diversification across species. Our analyses reveal several clusters of novel genes, including two Elfn genes, encoding transmembrane proteins with eLRRs and an FN3 domain, and six genes encoding transmembrane proteins with eLRRs only (the Elron cluster). Many of these are expressed in discrete patterns in the developing mouse brain, notably in the thalamus and cortex. We have also identified a number of novel fly eLRR proteins with discrete expression in the embryonic nervous system.
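The per-species counts reported above can be checked against the stated total with a short tally. The dictionary name and the derived percentages are illustrative only; the four counts and the total of 369 are taken from the abstract.

```python
# eLRR protein counts per species, as reported in the abstract.
elrr_counts = {"worm": 29, "fly": 66, "mouse": 135, "human": 139}

total = sum(elrr_counts.values())

# Share of the catalogue contributed by each species, in percent.
shares = {species: 100 * n / total for species, n in elrr_counts.items()}

if __name__ == "__main__":
    print(f"total eLRR proteins: {total}")
    for species, pct in shares.items():
        print(f"{species}: {elrr_counts[species]} ({pct:.1f}%)")
```

The tally confirms that the four counts sum to 369, and that mouse and human together account for roughly three quarters of the catalogue.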

Conclusion: This study provides the necessary foundation for a systematic analysis of the functions of this class of genes, which are likely to feature prominently in innate immunity, inflammation and neural development, especially the specification of neuronal connectivity.

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Motifs*
  • Animals
  • Brain / metabolism
  • Caenorhabditis elegans Proteins / chemistry
  • Caenorhabditis elegans Proteins / genetics
  • Caenorhabditis elegans Proteins / metabolism
  • Cluster Analysis
  • Computational Biology / methods
  • Computer Simulation
  • Databases, Protein
  • Drosophila Proteins / chemistry
  • Drosophila Proteins / genetics
  • Drosophila Proteins / metabolism
  • Evolution, Molecular*
  • Gene Expression Regulation, Developmental / genetics*
  • Humans
  • Leucine / analysis*
  • Ligands
  • Mice
  • Multigene Family
  • Protein Structure, Tertiary
  • Proteins / chemistry*
  • Proteins / classification
  • Proteins / genetics*
  • Proteins / metabolism
  • Proteome / genetics
  • RNA / genetics
  • RNA / metabolism
  • Repetitive Sequences, Amino Acid*

Substances

  • Caenorhabditis elegans Proteins
  • Drosophila Proteins
  • Ligands
  • Proteins
  • Proteome
  • RNA
  • Leucine