Deciphering apicoplast targeting signals--feature extraction from nuclear-encoded precursors of Plasmodium falciparum apicoplast proteins

Gene. 2001 Dec 12;280(1-2):19-26. doi: 10.1016/s0378-1119(01)00776-4.

Abstract

The malaria-causing protozoan Plasmodium falciparum contains a vestigial, non-photosynthetic plastid, the apicoplast. Numerous proteins encoded by nuclear genes are targeted to the apicoplast by means of N-terminal extensions. With the sequence of an entire malaria parasite genome nearing completion, it is important to have software tools in place for predicting the subcellular location of all proteins. Apicoplast targeting signals are bipartite, containing a signal peptide and a transit peptide. Nuclear-encoded apicoplast protein precursors were analyzed for characteristic features by statistical methods, principal component analysis, self-organizing maps, and supervised neural networks. The transit peptide carries a net positive charge and is rich in asparagine, lysine, and isoleucine residues. A novel prediction system (PATS, predict apicoplast-targeted sequences) was developed based on various sequence features, yielding a Matthews correlation coefficient of 0.91 (97% correct predictions) in a 40-fold cross-validation study. The system predicted that 22% of the 205 potential proteins on P. falciparum chromosome 2, and 21% of the 243 proteins on chromosome 3, are apicoplast proteins. Combining the PATS results with a signal peptide prediction yields 15% potential nuclear-encoded apicoplast proteins on chromosomes 2 and 3. The prediction tool will advance P. falciparum genome analysis, and it might help to identify apicoplast proteins as drug targets for the development of novel anti-malaria agents.
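
As an illustration only, the two quantities named in the abstract, the sequence features of a transit peptide (net charge and asparagine/lysine/isoleucine content) and the Matthews correlation coefficient used to score the predictor, can be sketched in Python. This is not the PATS implementation. The function names, the charge convention (K and R count +1, D and E count -1, histidine ignored), and the example peptide are all assumptions made for this sketch.

```python
from math import sqrt

# Assumed charge convention (not from the abstract):
# K and R count +1, D and E count -1, histidine is ignored.
POSITIVE = set("KR")
NEGATIVE = set("DE")


def net_charge(seq: str) -> int:
    """Net charge of a peptide under the simple convention above."""
    seq = seq.upper()
    return sum(r in POSITIVE for r in seq) - sum(r in NEGATIVE for r in seq)


def residue_fraction(seq: str, residues: str = "NKI") -> float:
    """Fraction of positions occupied by any of the given residues."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(r in residues for r in seq) / len(seq)


def matthews_cc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when the denominator is zero (a common convention).
    """
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical peptide, invented for illustration; not a real transit peptide.
example = "NKKIINNKDIKN"
print(net_charge(example), residue_fraction(example))
```

A real predictor would feed many such features into a trained classifier; the sketch only shows how individual feature values and the evaluation score could be computed.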

MeSH terms

  • Algorithms
  • Amino Acids / genetics
  • Animals
  • Biological Transport
  • Cell Nucleus / genetics
  • Databases, Genetic
  • Neural Networks, Computer
  • Organelles / metabolism*
  • Plasmodium falciparum / genetics*
  • Plasmodium falciparum / metabolism
  • Protein Precursors / genetics*
  • Protein Precursors / metabolism
  • Protozoan Proteins / genetics*
  • Protozoan Proteins / metabolism

Substances

  • Amino Acids
  • Protein Precursors
  • Protozoan Proteins