RT Journal Article
SR Electronic
T1 PhyKIT: a UNIX shell toolkit for processing and analyzing phylogenomic data
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 2020.10.27.358143
DO 10.1101/2020.10.27.358143
A1 Jacob L. Steenwyk
A1 Thomas J. Buida III
A1 Abigail L. Labella
A1 Yuanning Li
A1 Xing-Xing Shen
A1 Antonis Rokas
YR 2020
UL http://biorxiv.org/content/early/2020/10/28/2020.10.27.358143.abstract
AB Diverse disciplines in biology process and analyze multiple sequence alignments (MSAs) and phylogenetic trees to evaluate their information content, infer evolutionary events and processes, and predict gene function. However, automated processing of MSAs and trees remains a challenge due to the lack of a unified toolkit. To fill this gap, we introduce PhyKIT, a toolkit for the UNIX shell environment with 30 functions that process MSAs and trees, including but not limited to estimation of mutation rate, evaluation of sequence composition biases, calculation of the degree of violation of a molecular clock, and collapsing bipartitions (internal branches) with low support. To demonstrate the utility of PhyKIT, we detail three use cases: (1) summarizing information content in MSAs and phylogenetic trees for diagnosing potential biases in sequence or tree data; (2) evaluating gene-gene covariation of evolutionary rates to identify functional relationships, including novel ones, among genes; and (3) identifying lack-of-resolution events, or polytomies, in phylogenetic trees, which are suggestive of rapid radiation events or lack of data. We anticipate PhyKIT will be useful for processing, examining, and deriving biological meaning from increasingly large phylogenomic datasets. PhyKIT is freely available on GitHub (https://github.com/JLSteenwyk/PhyKIT), and documentation, including user tutorials, is available online (https://jlsteenwyk.com/PhyKIT). Competing Interest Statement: The authors have declared no competing interest.