PT  - JOURNAL ARTICLE
AU  - Akshara Pande
AU  - Sumeet Patiyal
AU  - Anjali Lathwal
AU  - Chakit Arora
AU  - Dilraj Kaur
AU  - Anjali Dhall
AU  - Gaurav Mishra
AU  - Harpreet Kaur
AU  - Neelam Sharma
AU  - Shipra Jain
AU  - Salman Sadullah Usmani
AU  - Piyush Agrawal
AU  - Rajesh Kumar
AU  - Vinod Kumar
AU  - Gajendra P.S. Raghava
TI  - Computing wide range of protein/peptide features from their sequence and structure
AID  - 10.1101/599126
DP  - 2019 Jan 01
TA  - bioRxiv
PG  - 599126
4099  - http://biorxiv.org/content/early/2019/04/04/599126.short
4100  - http://biorxiv.org/content/early/2019/04/04/599126.full
AB  - Motivation: Over the last three decades, a wide range of protein descriptors/features has been developed to annotate proteins with high precision. Many of these features have been integrated into software packages (e.g., PROFEAT, PyBioMed, iFeature, protr, Rcpi, propy) for predicting the function of a protein. However, these features are not suitable for predicting function at the residue level, such as prediction of ligand-binding residues, DNA-interacting residues, or post-translational modification sites. Results: To assist the scientific community, we have developed a software package that computes more than 50,000 features important for predicting the function of a protein and its residues. It has five major modules, which compute composition-based features, binary profiles, evolutionary information, structure-based features, and patterns. The composition-based module allows users to compute: i) simple compositions such as amino acid, dipeptide, and tripeptide composition; ii) property-based compositions; iii) repeats and distribution of amino acids; iv) Shannon entropy to measure low-complexity regions; v) miscellaneous compositions such as pseudo amino acid composition, autocorrelation, conjoint triad, and quasi-sequence order. The binary profile of an amino acid sequence provides complete information, including the order and type of residues, and is specifically suitable for predicting the function of a protein at the residue level. Pfeature allows one to compute evolutionary information-based features in the form of a PSSM profile generated using PSI-BLAST. The structure-based module computes structure-based features, which are specifically suitable for annotating chemically modified peptides/proteins. Pfeature also allows generating overlapping patterns and features from a whole protein or its parts (e.g., N-terminal, C-terminal). 
In summary, Pfeature comprises almost all features used to date for predicting the function of a protein/peptide, including its residues. Availability: It is available as a web server named Pfeature (https://webs.iiitd.edu.in/raghava/pfeature/), as well as a Python library and standalone package (https://github.com/raghavagps/Pfeature) suitable for Windows, Ubuntu, Fedora, macOS, and CentOS-based operating systems.
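
To make three of the sequence-based descriptors named in the abstract concrete (amino acid composition, Shannon entropy, and the binary profile), here is a minimal sketch in plain Python. It is not Pfeature's own code or API: the function names, the alphabetical ordering of the 20 standard residues, the percent scaling, and the base-2 logarithm are all assumptions made for illustration, and the package itself may define these features differently.

```python
import math
from collections import Counter

# Assumption: the 20 standard residues in alphabetical one-letter order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def _normalize(seq):
    """Upper-case the sequence and reject empty input."""
    seq = seq.strip().upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return seq


def amino_acid_composition(seq):
    """Percentage of each standard residue in the sequence (hypothetical helper)."""
    seq = _normalize(seq)
    counts = Counter(seq)
    return {aa: 100.0 * counts[aa] / len(seq) for aa in AMINO_ACIDS}


def shannon_entropy(seq):
    """Shannon entropy in bits over the residues observed in the sequence."""
    seq = _normalize(seq)
    n = len(seq)
    return -sum((c / n) * math.log2(c / n) for c in Counter(seq).values())


def binary_profile(seq):
    """One 20-element one-hot vector per residue, preserving residue order."""
    seq = _normalize(seq)
    return [[1 if aa == res else 0 for aa in AMINO_ACIDS] for res in seq]


if __name__ == "__main__":
    peptide = "AACD"  # toy sequence, not real data
    print(amino_acid_composition(peptide)["A"])
    print(shannon_entropy(peptide))
    print(binary_profile(peptide)[0])
```

Composition and entropy collapse the sequence into a fixed-length summary, whereas the binary profile keeps one vector per position. That per-position layout is why the abstract singles it out for residue-level prediction.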