Abstract
Amyloids are fibrillar protein aggregates with simple repeated structural motifs in their cores, usually β-strands but sometimes α-helices. Identifying the amyloid-prone regions within protein sequences is important both for understanding the mechanisms of amyloid-associated diseases and for understanding functional amyloids. Based on the crystal structures of seven cross-β amyloidogenic peptides with different topologies and one recently solved cross-α fiber structure, we have developed a computational approach, the AWSEM-Amylometer, for identifying amyloidogenic segments in protein sequences using the Associative memory, Water mediated, Structure and Energy Model (AWSEM). The AWSEM-Amylometer compares favorably with other predictors at identifying aggregation-prone sequences in multiple datasets. The method also predicts well the specific topologies (the relative arrangement of β-strands in the core) of the amyloid fibrils. An important advantage of the AWSEM-Amylometer over other existing methods is its direct connection with AWSEM, an efficient, optimized protein folding simulation model. This connection allows one to combine an efficient and accurate search of protein sequences for amyloidogenic segments with the detailed study of the thermodynamic and kinetic roles that these segments play in folding and aggregation in the context of the entire protein sequence. We present new simulation results that highlight the free energy landscapes of peptides that can take on multiple fibril topologies. We also demonstrate how the Amylometer methodology can be straightforwardly extended to the study of functional amyloids that have the recently discovered cross-α fibril architecture.