RT Journal Article
SR Electronic
T1 TAXyl: An in-silico method for predicting the thermal activity for xylanases from GH10 and GH11 families
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 826040
DO 10.1101/826040
A1 Mehdi Foroozandeh Shahraki
A1 Kiana Farhadyar
A1 Kaveh Kavousi
A1 Mohammad Hadi Azarabad
A1 Amin Boroomand
A1 Shohreh Ariaeenejad
A1 Ghasem Hosseini Salekdeh
YR 2019
UL http://biorxiv.org/content/early/2019/11/10/826040.abstract
AB Xylanases are a class of enzymes with numerous industrial applications. They degrade xylan, a xylose-based polysaccharide present in lignocellulosic biomass. The optimum temperature of an enzyme indicates its thermal activity and is an essential factor when choosing an appropriate biocatalyst for a particular purpose. In-silico prediction of this attribute is therefore a significant cost- and time-effective step in the effort to identify and characterize novel enzymes. The objective of this study was to develop an accurate computational method to predict the thermal activity status of xylanases from glycoside hydrolase families 10 and 11, the most prevalent known xylanase families. Here we present TAXyl (Thermal Activity Prediction for Xylanase), a new sequence-based machine learning method trained on a selected combination of physicochemical protein features. This ensemble of four supervised learning algorithms uses soft voting to discriminate mesophilic, thermophilic, and hyper-thermophilic xylanases based on their optimum temperature. TAXyl's performance was evaluated through multiple iterations of six-fold cross-validation, in which it exhibited a mean accuracy of ~0.94, an F1-score of ~0.91, and an MCC of ~0.9. Additionally, the model was tested on previously unseen data and showed similar performance.
To the best of our knowledge, TAXyl is the most accurate and practical prediction tool currently available for this class of enzymes. TAXyl is freely accessible as a web service at http://arimees.com/ and provides users with several features to facilitate the characterization of GH10 and GH11 xylanases.
Abbreviations:
AAC: Amino acid composition
2AAC: Dipeptide composition
3AAC: Tripeptide composition
ANN: Artificial Neural Networks
APAAC: Amphiphilic pseudo amino acid composition
CTD: Composition, Transition, Distribution
CTF: Conjoint triad features
GH: Glycoside Hydrolase
MCC: Matthews Correlation Coefficient
MLP: Multi-Layer Perceptron
PAAC: Pseudo amino acid composition
QSO: Quasi-sequence Order
RBF: Radial Basis Function
RFE: Recursive Feature Elimination
SOCN: Sequence order coupling number
SVM: Support-Vector Machine
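
The soft-voting step mentioned in the abstract can be illustrated with a minimal sketch. It assumes each of the four classifiers outputs one probability per class and that the ensemble averages these with equal weights. The record does not state which algorithms or weights TAXyl uses, so the function name `soft_vote` and all probabilities below are hypothetical and purely illustrative.

```python
from typing import Dict, List

# The three thermal-activity classes named in the abstract.
CLASSES = ["mesophilic", "thermophilic", "hyper-thermophilic"]


def soft_vote(probabilities: List[Dict[str, float]]) -> str:
    """Average per-class probabilities over all models; return the top class.

    Equal model weights are an assumption of this sketch.
    """
    if not probabilities:
        raise ValueError("at least one model output is required")
    mean = {
        label: sum(p[label] for p in probabilities) / len(probabilities)
        for label in CLASSES
    }
    return max(CLASSES, key=lambda label: mean[label])


# Hypothetical outputs of four classifiers for a single sequence
# (made-up numbers, not results from the study).
example = [
    {"mesophilic": 0.50, "thermophilic": 0.40, "hyper-thermophilic": 0.10},
    {"mesophilic": 0.50, "thermophilic": 0.45, "hyper-thermophilic": 0.05},
    {"mesophilic": 0.10, "thermophilic": 0.80, "hyper-thermophilic": 0.10},
    {"mesophilic": 0.55, "thermophilic": 0.35, "hyper-thermophilic": 0.10},
]
prediction = soft_vote(example)
print(prediction)
```

In this made-up example three of the four models individually favor the mesophilic class, yet the averaged probabilities favor the thermophilic class because one model is highly confident. This is the practical difference between soft voting and a simple majority vote.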