Single amino acid repeats in signal peptides

FEBS J. 2010 Aug;277(15):3147-57. doi: 10.1111/j.1742-4658.2010.07720.x. Epub 2010 Jun 17.

Abstract

There has been increasing interest in single amino acid repeats ever since it was shown that these are the cause of a variety of diseases. Although a systematic study of single amino acid repeats is challenging, they have subsequently been implicated in a number of functional roles. In general surveys, leucine runs were among the most frequent. In the present study, we report a detailed investigation of repeats in signal peptides of secreted and type I membrane proteins in comparison with their mature parts. We focus on eukaryotic species because single amino acid repeats are generally rather rare in archaea and bacteria. Our analysis of over 100 species shows that repeats of leucine (but not of other hydrophobic amino acids) are over-represented in signal peptides. This trend is most pronounced in higher eukaryotes, particularly in mammals. In the human proteome, although less than one-fifth of all proteins have a signal peptide, approximately two-thirds of all leucine repeats are located in these transient regions. Signal peptides are cleaved early from the growing polypeptide chain and then degraded rapidly. This may explain why leucine repeats, which can be toxic, are tolerated at such high frequencies. The substantial fraction of proteins affected by the strong enrichment of repeats in these transient segments highlights the bias that they can introduce for systematic analyses of protein sequences. In contrast to a general lack of conservation of single amino acid repeats, leucine repeats were found to be more conserved than the remaining signal peptide regions, indicating that they may have an as yet unknown functional role.
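
The comparison described above, repeats in the signal peptide versus the mature part of a protein, can be illustrated with a minimal sketch. It is not the authors' pipeline. The function names, the minimum run length of 4, the toy sequence, and the cleavage position are all assumptions made for illustration; the abstract does not state how a repeat was defined or how signal peptides were delimited.

```python
import re
from typing import Dict, List, Tuple

Run = Tuple[int, int]  # (0-based start position, run length)


def find_repeats(seq: str, residue: str = "L", min_len: int = 4) -> List[Run]:
    """Return every maximal run of `residue` that is at least `min_len` long.

    min_len=4 is an assumed threshold, not the definition used in the paper.
    """
    pattern = re.compile(f"{re.escape(residue)}{{{min_len},}}")
    return [(m.start(), m.end() - m.start()) for m in pattern.finditer(seq)]


def split_by_region(
    seq: str, cleavage_site: int, residue: str = "L", min_len: int = 4
) -> Dict[str, List[Run]]:
    """Assign each run to the signal peptide or the mature part.

    `cleavage_site` is the 0-based index of the first mature residue.
    A run is assigned by its start position (a simplifying assumption).
    """
    runs = find_repeats(seq, residue, min_len)
    return {
        "signal": [r for r in runs if r[0] < cleavage_site],
        "mature": [r for r in runs if r[0] >= cleavage_site],
    }


# Toy sequence (hypothetical): a 12-residue "signal peptide" followed by
# an 11-residue "mature part".
toy_seq = "MKLLLLLLAVAG" + "QDTLLKLLLLE"
regions = split_by_region(toy_seq, cleavage_site=12)
print(regions)  # {'signal': [(2, 6)], 'mature': [(18, 4)]}
```

In a real analysis the cleavage position would come from annotation or from a signal peptide predictor, and counts would be normalized by region length before signal peptides and mature parts are compared.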

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acids
  • Animals
  • Computational Biology / methods
  • Eukaryota
  • Humans
  • Leucine
  • Mammals
  • Protein Sorting Signals / genetics*
  • Repetitive Sequences, Amino Acid*

Substances

  • Amino Acids
  • Protein Sorting Signals
  • Leucine