PT - JOURNAL ARTICLE
AU - Huang, Beibei
AU - Zhang, Eric
AU - Chaudhari, Rajan
AU - Gimperlein, Heiko
TI - Sequence-based Optimized Chaos Game Representation and Deep Learning for Peptide/Protein Classification
AID - 10.1101/2022.09.10.507145
DP - 2022 Jan 01
TA - bioRxiv
PG - 2022.09.10.507145
4099 - http://biorxiv.org/content/early/2022/10/29/2022.09.10.507145.short
4100 - http://biorxiv.org/content/early/2022/10/29/2022.09.10.507145.full
AB - As an effective graphical representation method for 1D sequences (e.g., text), Chaos Game Representation (CGR) has been frequently combined with deep learning (DL) for biological analysis. In this study, we developed a unique approach to encode peptide/protein sequences into CGR images for classification. To this end, we designed a novel energy function and enhanced the encoder quality by constructing a Supervised Autoencoder (SAE) neural network. CGR was used to represent the amino acid sequences, and this representation was optimized based on the latent variables of the SAE. To assess the effectiveness of our new representation scheme, we further employed a convolutional neural network (CNN) to build models to study hemolytic/non-hemolytic peptides and the susceptibility/resistance of HIV protease mutants to approved drugs. Comparisons were also conducted with other published methods, and our approach demonstrated superior performance. Supplementary information available online. Competing Interest Statement: The authors have declared no competing interest.
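
For readers unfamiliar with the term, the basic idea of encoding an amino acid sequence as a CGR image can be illustrated with a minimal sketch. This is a plain, unoptimized chaos game and not the optimized encoding described in the abstract: it does not use the energy function or the SAE. The 20-vertex polygon layout, the alphabetical vertex order, the contraction `ratio` of 0.5, the grid size, and the names `cgr_points` and `cgr_image` are all illustrative assumptions.

```python
import math

# Assumption: the 20 standard residues, in alphabetical one-letter order,
# are placed evenly on the unit circle. The paper's layout may differ.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
VERTEX = {
    aa: (math.cos(2 * math.pi * k / len(AMINO_ACIDS)),
         math.sin(2 * math.pi * k / len(AMINO_ACIDS)))
    for k, aa in enumerate(AMINO_ACIDS)
}


def cgr_points(sequence, ratio=0.5):
    """Chaos game walk: start at the origin and, for each residue,
    move a fraction `ratio` of the way toward that residue's vertex."""
    x, y = 0.0, 0.0
    points = []
    for residue in sequence.upper():
        if residue not in VERTEX:
            continue  # skip gaps and non-standard symbols
        vx, vy = VERTEX[residue]
        x += ratio * (vx - x)
        y += ratio * (vy - y)
        points.append((x, y))
    return points


def cgr_image(sequence, size=32, ratio=0.5):
    """Rasterize the walk into a size x size grid of visit counts."""
    grid = [[0] * size for _ in range(size)]
    for x, y in cgr_points(sequence, ratio):
        # Map [-1, 1] to pixel indices; clamp the upper edge into the grid.
        col = min(int((x + 1) / 2 * size), size - 1)
        row = min(int((1 - y) / 2 * size), size - 1)
        grid[row][col] += 1
    return grid
```

A grid returned by `cgr_image` could be fed to an image classifier such as a CNN. With 20 vertices and a ratio of 0.5, regions belonging to different residues overlap, which is one reason a vertex layout or ratio might be tuned in practice.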