Abstract
The function of a protein is enabled by its conformational landscape. For non-rigid proteins, a complete characterization of this landscape requires understanding the protein’s structure in all functional states, the stability of these states under target conditions, and the transition pathways between them. Several strategies have recently been developed to drive the machine learning algorithm AlphaFold2 (AF) to sample multiple conformations, but it remains more challenging to predict a priori which states are stabilized under particular conditions and how the transitions between them occur. Here, we combine AF sampling with small-angle scattering curves to obtain a weighted conformational ensemble of functional states under target environmental conditions. We apply this approach to the pentameric ion channel GLIC using small-angle neutron scattering (SANS) curves, and identify apparent closed and open states. Under resting conditions, we find that the best fit to experimental SANS data corresponds to a population of only closed states, while activating conditions allow for a subpopulation of open channels, matching both experiments and extensive simulation sampling using Markov state models. The predicted closed and open states closely resemble crystal structures determined under resting and activating conditions, respectively, and project to predicted basins in free energy landscapes calculated from the Markov state models. Further, without using any structural information, the AF sampling also correctly captures intermediate conformations and projects onto the transition pathway resolved in the extensive sampling. This combination of machine learning algorithms and low-dimensional experimental data appears to provide an efficient way not only to predict stable conformations but also to sample the transition pathways accurately, several orders of magnitude faster than simulation-based sampling.
Author summary
The dynamic behavior of proteins is key to their function, including nerve signaling, enzyme catalysis, and cellular regulation. These functions rely on precise movements and shape changes that allow proteins to interact with other molecules. Understanding protein structures and their evolution at the atomic level is thus crucial for many applications such as drug development, but remains a challenging problem. High-resolution experimental techniques can determine the structural states of many proteins, but often struggle to capture less-populated states. While computational approaches can model protein dynamics, they can be expensive and are typically limited to short time scales that may not encompass the full range of biologically relevant behavior. Recently, artificial intelligence-driven tools like AlphaFold2 (AF) have emerged to predict protein structures with high accuracy. However, they usually default to predicting a single structure, and while modified workflows allow for sampling of alternative states, it can be difficult to assess their functional relevance. Here, we introduce a method that combines AlphaFold2 with small-angle scattering data to predict multiple protein states and their frequencies under specific biological conditions. This approach offers a computationally efficient alternative for integrating experimental data with computational methods, providing a new tool for studying protein dynamics.
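To make the idea of fitting state populations to a scattering curve concrete, the following is a minimal sketch of one common way to do it, and not necessarily the procedure used in this work. It assumes that a theoretical curve is available for a closed and an open model, treats the measured intensity as a population-weighted average of the two, and scans the open-state fraction for the lowest reduced chi-square. All function names, the Guinier-type toy curves, the radii of gyration, the error level, and the synthetic "experimental" data are hypothetical placeholders; in practice the model curves would come from a scattering calculator applied to the predicted structures.

```python
import math

def mixture(curve_a, curve_b, w):
    """Population-weighted average of two model curves; w is the weight of curve_b."""
    return [(1.0 - w) * a + w * b for a, b in zip(curve_a, curve_b)]

def reduced_chi2(i_exp, sigma, i_model):
    """Reduced chi-square after fitting one overall scale factor analytically."""
    num = sum(e * m / s**2 for e, m, s in zip(i_exp, i_model, sigma))
    den = sum(m * m / s**2 for m, s in zip(i_model, sigma))
    scale = num / den  # least-squares optimal scale for this model curve
    chi2 = sum(((e - scale * m) / s) ** 2
               for e, m, s in zip(i_exp, i_model, sigma))
    return chi2 / len(i_exp), scale

def fit_open_fraction(i_exp, sigma, i_closed, i_open, n_grid=101):
    """Grid search over the open-state fraction; returns (fraction, chi2, scale)."""
    best = None
    for k in range(n_grid):
        w = k / (n_grid - 1)
        chi2, scale = reduced_chi2(i_exp, sigma, mixture(i_closed, i_open, w))
        if best is None or chi2 < best[1]:
            best = (w, chi2, scale)
    return best

# Toy data (assumption): Guinier-type curves standing in for calculated profiles.
q = [0.001 * k for k in range(1, 31)]  # momentum transfer, 1/Angstrom

def guinier(rg):
    return [math.exp(-((x * rg) ** 2) / 3.0) for x in q]

i_closed = guinier(36.0)  # hypothetical radius of gyration, closed model
i_open = guinier(40.0)    # hypothetical radius of gyration, open model

# Synthetic, noise-free "experiment": 30% open, arbitrary scale of 2.0.
i_exp = [2.0 * v for v in mixture(i_closed, i_open, 0.3)]
sigma = [0.01] * len(q)   # hypothetical uniform error bars

open_fraction, chi2, scale = fit_open_fraction(i_exp, sigma, i_closed, i_open)
print(open_fraction, chi2, scale)
```

Because the synthetic curve here is noise-free and built from the same two models, the scan recovers the open fraction that was used to construct it. With real data, the chi-square minimum would be broader, and an ensemble with more than two states would call for a constrained fit (for example, non-negative least squares with weights normalized to one) rather than a one-dimensional scan.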
Competing Interest Statement
The authors have declared no competing interest.
Footnotes
* haloi{at}kth.se, erik.lindahl{at}dbb.su.se