OrthoFinder: solving fundamental biases in whole genome comparisons dramatically improves orthogroup inference accuracy

Genome Biol. 2015 Aug 6;16(1):157. doi: 10.1186/s13059-015-0721-2.

Abstract

Identifying homology relationships between sequences is fundamental to biological research. Here we provide a novel orthogroup inference algorithm called OrthoFinder that solves a previously undetected gene length bias in orthogroup inference, resulting in significant improvements in accuracy. Using real benchmark datasets, we demonstrate that OrthoFinder is more accurate than other orthogroup inference methods by between 8% and 33%. Furthermore, we demonstrate the utility of OrthoFinder by providing a complete classification of transcription factor gene families in plants, revealing 6.9 million previously unobserved relationships.

Publication types

  • Evaluation Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Genes, Plant
  • Genomics / methods*
  • Multigene Family*
  • Phylogeny
  • Proteins / genetics
  • Software*
  • Transcription Factors / genetics

Substances

  • Proteins
  • Transcription Factors