BaCoCa – a heuristic software tool for the parallel assessment of sequence biases in hundreds of gene and taxon partitions

Mol Phylogenet Evol. 2014 Jan:70:94-8. doi: 10.1016/j.ympev.2013.09.011. Epub 2013 Sep 25.

Abstract

BaCoCa (BAse COmposition CAlculator) is user-friendly software that combines multiple statistical approaches, such as RCFV and C value calculations, to identify biases in aligned sequence data that may mislead phylogenetic reconstructions. Owing to its speed and flexibility, the program can analyze hundreds of pre-defined gene partitions and taxon subsets in a single run. BaCoCa is command-line driven and can be easily integrated into automated pipelines for phylogenomic studies. Because the output is tab-delimited, the results can be used directly for further analyses in programs like Excel or statistical packages like R. A built-in option generates heat maps with hierarchical clustering of certain results using R. As input, BaCoCa accepts FASTA and relaxed PHYLIP files, which are commonly used in phylogenomic pipelines. BaCoCa is implemented in Perl and runs on Windows, Mac and Linux operating systems. The executable source code, example test files and detailed documentation of BaCoCa are freely available at http://software.zfmk.de.
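To make the RCFV statistic named above concrete, here is a minimal Python sketch. It is not BaCoCa's Perl code. It assumes the commonly used definition of relative composition frequency variability: the absolute deviation of each taxon's character frequency from the mean frequency across taxa, summed over all characters and taxa and divided by the number of taxa. The function name `rcfv`, the dictionary-based alignment and the handling of gaps and ambiguity codes (simply ignored here) are illustrative assumptions. BaCoCa's own treatment may differ.

```python
from collections import Counter


def rcfv(alignment, alphabet="ACGT"):
    """Return (total RCFV, per-taxon RCFV) for an aligned data set.

    `alignment` maps taxon name -> aligned sequence (hypothetical input shape).
    Characters outside `alphabet` (gaps, ambiguity codes) are ignored; this is
    an assumption of the sketch, not necessarily what BaCoCa does.
    """
    n_taxa = len(alignment)
    if n_taxa == 0:
        raise ValueError("alignment is empty")

    # Relative frequency of each character within each taxon.
    freqs = {}
    for taxon, seq in alignment.items():
        counts = Counter(ch for ch in seq.upper() if ch in alphabet)
        total = sum(counts.values())
        if total == 0:
            raise ValueError(f"taxon {taxon!r} has no countable characters")
        freqs[taxon] = {ch: counts[ch] / total for ch in alphabet}

    # Mean frequency of each character across all taxa.
    mean = {ch: sum(f[ch] for f in freqs.values()) / n_taxa for ch in alphabet}

    # Taxon-specific contribution: summed absolute deviations divided by n.
    per_taxon = {
        taxon: sum(abs(f[ch] - mean[ch]) for ch in alphabet) / n_taxa
        for taxon, f in freqs.items()
    }
    return sum(per_taxon.values()), per_taxon


if __name__ == "__main__":
    toy = {"taxon_A": "ACGTACGT", "taxon_B": "AAGTAAGT", "taxon_C": "ACGT-CGT"}
    total, by_taxon = rcfv(toy)
    # Tab-delimited output, in the spirit of the format described above.
    print("taxon\tRCFV")
    for name, value in by_taxon.items():
        print(f"{name}\t{value:.4f}")
    print(f"total\t{total:.4f}")
```

Under this definition, a data set in which all taxa share the same composition scores 0, and larger values indicate stronger compositional heterogeneity. Applying the same function to each gene partition or taxon subset in turn gives a per-partition table of the kind the abstract describes.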

Keywords: GC content; Invariant positions; Phylogenomics; RCFV; Saturation; Software.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Cluster Analysis
  • Genomics
  • Phylogeny*
  • Sequence Alignment
  • Sequence Analysis, DNA / methods*
  • Software*