Abstract
Motivation In silico methods for assessing pharmaceutical activity and toxicity are increasingly important in quantitative structure-activity relationship (QSAR) studies, and many chemical fingerprints have been developed for this purpose. Among them, KEGG Chemical Function and Substructure (KCF-S) has been shown to perform well in some pharmaceutical and metabolic studies. However, the software that generates KCF-S fingerprints has limited usability: the input file must be in Molfile or SDF format, and the output is available only as a text file.
Results We developed a new Python package, KCF-Convoy, that seamlessly generates KCF-format files and KCF-S fingerprints from Molfile, SDF, SMILES, and InChI input. The resulting KCF-S fingerprints were used with several supervised machine-learning methods to distinguish herbicides from other pesticides and to find substructures characteristic of taxonomic groups.
Availability KCF-Convoy is implemented as a Python package and is freely available at https://github.com/KCF-Convoy. It can be installed with the package management system “pip” and is also available as a Docker environment.
Contact maskot{at}chemsys.t.u-tokyo.ac.jp