Abstract
A new method termed “Relative Principal Components Analysis” (RPCA) is introduced that extracts optimal relevant principal components to describe the change between two data samples representing two macroscopic states. The method is widely applicable in data-driven science. The components are calculated within a unified physical framework that introduces the objective function, namely the Kullback-Leibler divergence, appropriate for quantifying the change of the macroscopic state as it is effected by the microscopic features. To demonstrate the applicability of RPCA, we analyze the thermodynamically relevant conformational changes of the protein HIV-1 protease upon binding to different drug molecules. In this case, the RPCA method provides a sound thermodynamic foundation for the analysis of the binding process. The relevant collective (global) conformational changes can be reconstructed from the informative latent variables to exhibit both the enhanced and the restricted conformational fluctuations upon ligand association. Moreover, RPCA characterizes the locally relevant conformational changes, which can be presented on the structure of the protein.