PT - JOURNAL ARTICLE
AU - Ajay Ramakrishnan Varadarajan
AU - Rohini Mopuri
AU - J. Todd Streelman
AU - Patrick T. McGrath
TI - Genome-wide protein phylogenies for four African cichlid species
AID - 10.1101/110510
DP - 2017 Jan 01
TA - bioRxiv
PG - 110510
4099 - http://biorxiv.org/content/early/2017/02/21/110510.short
4100 - http://biorxiv.org/content/early/2017/02/21/110510.full
AB - Background The thousands of species of closely related cichlid fishes in the great lakes of East Africa are a powerful model for understanding speciation and the genetic basis of trait variation. Recently, the genomes of five species of African cichlids representing five distinct lineages were sequenced and used to predict protein products at a genome-wide level. Here we characterize the evolutionary relationship of each cichlid protein to previously sequenced animal species. Results We used the Treefam database, a set of preexisting protein phylogenies built using 109 previously sequenced genomes, to identify Treefam families for each protein annotated from four cichlid species: Metriaclima zebra, Astatotilapia burtoni, Pundamilia nyererei and Neolamprologus brichardi. For each of these Treefam families, we built new protein phylogenies containing each of the cichlid protein hits. Using these new phylogenies, we identified the evolutionary relationship of each cichlid protein to its nearest human and zebrafish protein. These data are available either through download or through a webserver we have implemented. Conclusion These phylogenies will be useful for any cichlid researchers trying to predict biological and protein function for a given cichlid gene, understand the evolutionary history of a given cichlid gene, identify recently duplicated cichlid genes, or perform genome-wide analyses in cichlids that rely on databases generated from other species.