Abstract
In most cases, applying machine learning techniques to biological sequence data requires a vector representation of the sequences. Extracting numerical features from sequence data can be time-consuming, especially for users who lack programming skills. To address this, we present WeSeqMiner, a Weka package that provides several filters for extracting numerical features from sequence data for use in the Weka machine learning workbench. Using a motivating example, we show that WeSeqMiner integrates well with the Weka API, allowing these transformations to be incorporated into Weka workflows for predictive model generation. WeSeqMiner can be installed by pointing the Weka package manager to the URL github.com/djhogan/WeSeqMiner/raw/master/WeSeqMiner.zip. The Javadoc for the WeSeqMiner classes can be accessed at djhogan.github.io/seqminer.
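To make the idea of a vector representation concrete, the following is a minimal sketch of one common sequence feature, k-mer counts over the DNA alphabet. It uses plain Java rather than the Weka or WeSeqMiner API; the class name KmerCounter and the method kmerCounts are hypothetical names chosen for illustration and are not part of the package.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of WeSeqMiner or Weka.
final class KmerCounter {
    // Assumed alphabet: the four DNA bases.
    private static final String ALPHABET = "ACGT";

    // Returns a vector of length 4^k. Entry i holds the number of occurrences
    // of the k-mer whose base-4 encoding (A=0, C=1, G=2, T=3) equals i.
    static double[] kmerCounts(String sequence, int k) {
        if (k < 1) {
            throw new IllegalArgumentException("k must be >= 1");
        }
        double[] counts = new double[(int) Math.pow(ALPHABET.length(), k)];
        String seq = sequence.toUpperCase();
        for (int start = 0; start + k <= seq.length(); start++) {
            int index = 0;
            boolean valid = true;
            for (int j = 0; j < k; j++) {
                int code = ALPHABET.indexOf(seq.charAt(start + j));
                if (code < 0) {
                    // Skip windows containing symbols outside the alphabet (e.g. N).
                    valid = false;
                    break;
                }
                index = index * ALPHABET.length() + code;
            }
            if (valid) {
                counts[index]++;
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Prints the 16-dimensional 2-mer count vector for a short example sequence.
        System.out.println(Arrays.toString(kmerCounts("ACGTAC", 2)));
    }
}
```

Each sequence maps to a fixed-length numeric vector regardless of its length, which is the form most learning algorithms expect. In a Weka workflow, a transformation of this kind would typically be wrapped as a filter so that it can be applied to a dataset before training.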