ProtoBee: Hierarchical classification and annotation of the honey bee proteome

  Noam Kaplan and Michal Linial¹

  Department of Biological Chemistry, Life Science Institute, The Hebrew University, Jerusalem 91904, Israel

Abstract

The recently sequenced genome of the honey bee (Apis mellifera) has yielded 10,157 predicted protein sequences, calling for a computational effort to extract biological insights from them. We applied an unsupervised hierarchical protein-clustering method, previously used in the ProtoNet system, to nearly 200,000 proteins: the predicted honey bee proteins, the SWISS-PROT protein database, and the complete proteomes of the mouse (Mus musculus) and the fruit fly (Drosophila melanogaster). We named the resulting hierarchy ProtoBee. In ProtoBee, the proteins are organized into 18,936 separate tree hierarchies, each representing a protein functional family. Using the complete mouse and Drosophila proteomes as references, we highlight functional groups with putative gene-loss events, putative novel proteins of unique functionality, and bee-specific paralogs. We have studied some of the ProtoBee findings and suggest their biological relevance. Examples include novel opsin genes and intriguing nuclear matches of mitochondrial genes. The organization of bee sequences into functional clusters suggests a natural way to infer functional annotation automatically. Following this notion, we were able to assign functional annotation to about 70% of the sequences. ProtoBee is available at http://www.protobee.cs.huji.ac.il.

Footnotes

  • ¹ Corresponding author. E-mail michall{at}cc.huji.ac.il; fax 972-2-6586448.

  • Article published online before print. Article and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.4916306.

    • Received November 11, 2005.
    • Accepted June 1, 2006.
  • Freely available online through the Genome Research Open Access option.
