Three Essential Resources to Improve Differential Scanning Fluorimetry (DSF) Experiments

Differential Scanning Fluorimetry (DSF) is a method that enables rapid determination of a protein’s apparent melting temperature (Tma). Owing to its high throughput, DSF has found widespread application in fields ranging from structural biology to chemical screening. Yet DSF has developed two opposing reputations: one as an indispensable laboratory tool to probe protein stability, another as a frustrating platform that often fails. Here, we aim to reconcile these disparate reputations and help users perform more successful DSF experiments with three resources: an updated, interactive theoretical framework, practical tips, and online data analysis. We anticipate that these resources, made available online at DSFworld (https://gestwickilab.shinyapps.io/dsfworld/), will broaden the utility of DSF.

The purpose of this paper is therefore to guide readers from failed to successful DSF. This begins with re-joining DSF with its theoretical underpinnings (a combination of unfolding thermodynamics and kinetics, and dye binding) in a convenient and usable manner. In the first of three resources, we start with an empirical assessment of the ability of the current theoretical framework to describe real DSF data, and find that it overlooks two widespread features which carry important practical ramifications: kinetic influence on protein unfolding, and atypical dye activation. We present a correspondingly updated theoretical framework and associated computational model for DSF, and demonstrate its increased ability to describe widely observed empirical phenomena in DSF that are at odds with the current framework. We make this model available as an interactive online tool at DSFworld (https://gestwickilab.shinyapps.io/dsfworld/). In the second resource, we build on this updated framework to identify common but largely undescribed technical pitfalls which ruin even well-designed DSF experiments, and present experimental best practices to minimize them. Finally, we provide free DSF data analysis software at www.DSFworld.com. DSFworld addresses major existing limitations in data analysis by providing customizable visualizations based on user-defined experimental variables, as well as streamlined handling of complex, multi-transition data. Together, we hope that these three resources (theory, practical tips and data analysis) will help the community perform more successful DSF experiments.

Results.
Section I: An improved theoretical framework for DSF.
In a DSF experiment, the measured fluorescence signal is a product of multiple molecular events, including protein unfolding, dye binding, and dye activation (e.g. an increase in quantum yield). The current theoretical framework (termed here "Model 1") underlying the design and interpretation of DSF experiments includes two major assumptions: (i) that protein thermal unfolding reflects thermodynamic equilibrium and (ii) that dye fluorescence provides a proxy for unfolded protein abundance.
Here, we demonstrate that this widespread theoretical framework for DSF falls short in two straightforward, yet important, ways. First, we show experimental and theoretical evidence that the kinetics of protein unfolding, not just the thermodynamics, influences the outcome of most DSF experiments. Second, we show that dye binding is not always exclusive to the unfolded state; rather, the dye sometimes binds to the native state. In other cases, the dye fails to bind the unfolded states, such that it cannot accurately reveal the unfolding process. Incorporating these considerations, we propose an improved theoretical framework for DSF, termed "Model 2".
Currently, interpretation of most DSF experiments assumes that protein unfolding is dominated by thermodynamic contributions, yet the results of extensive CD and DSC studies have suggested that kinetics plays a major role, especially for thermal unfolding 35,36. Based on this classical literature, we hypothesized that DSF results might also include a contribution from kinetics. If so, calculated Tma values measured by DSF would be expected to depend on the rate at which the sample is heated (°C/min). To test this idea, a model thermodynamic unfolder (hen egg white lysozyme) 37 and a model kinetic unfolder (malate dehydrogenase; MDH) 38 were analyzed by DSF at systematically increased heating rates (0.5, 1, 2 and 4 °C/min; Figure 1a). As expected, we found that the calculated Tma value for lysozyme (69 °C) was largely unaffected by heating rate (∆Tma of only 0.8 °C, calculated between the extreme heating rates of 0.25 and 2 °C/min; Figure 1b, c). Conversely, a strong effect was observed for MDH (∆Tma = 3.1 °C; Figure 1b, c). To understand how widespread this kinetic contribution might be, we tested six additional proteins that vary in structure and molecular mass (PPIF, PerAB, PPIE, Per2, Hsc70, CHIP/STUB1; Supplemental Table 1). Strikingly, the Tma values for all six proteins, like MDH, also varied substantially with heating rate (∆Tma = 2.2 to 5.4 °C; Figure 1b, Supplemental Table 1). This result suggests that kinetics does significantly influence the outcomes of many DSF experiments.
To incorporate unfolding kinetics into the theoretical framework of DSF, we chose the simplest of the classic Lumry-Eyring models which combine thermodynamics and kinetics of unfolding: 39

Native ⇌ Reversibly unfolded → Irreversibly unfolded
The use of this model is informed by pioneering work on the analysis of DSC experiments 35. However, we had to adapt it for use in simulating DSF data by including the contribution of dye binding. Specifically, we multiplied the abundance of each protein state F(T) and the kinetic partitioning between these states, L(T), by the extent to which they each activate the dye D(T). Importantly, the D(T) term is a product of both a dye's binding affinity and its quantum yield. Thus, in Model 2, the measured fluorescence is predicted to be a sum of the RFU contributions from dye binding to each of the three states (RFUnative, RFUreversibly unfolded, RFUirreversibly unfolded), corrected for the empirically observed temperature-dependent losses in fluorescence, yielding the final model: [...]
[...] was able to produce a theoretically sound reason for why negative ∆Tma values are sometimes observed. We also noted that these negative ∆Tma values were accompanied by a systematic decrease in the slope of the transition (Figure 2d), suggesting that analysis of curve shape, not just ∆Tma, is both theoretically justified and potentially advantageous. At the least, these results prompt re-evaluation of the current practice of discarding all negative ∆Tma hits from high throughput DSF screens.
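The qualitative behavior of such a three-state scheme can be explored with a short numerical sketch. The Python code below is not the Model 2 equation itself and is not the authors' implementation; it is a minimal illustration under stated assumptions: a van 't Hoff equilibrium between the native and reversibly unfolded states, an Arrhenius rate constant for the irreversible step, a fast pre-equilibrium between the first two states, and a constant dye activation weight for each state. All names and parameter values (dh, tm_c, ea, t_star_c, dye) are hypothetical placeholders chosen for illustration.

```python
import math

R = 8.314  # gas constant, J/(mol*K)


def simulate_dsf(heating_rate, t_star_c, dye=(0.0, 1.0, 1.0),
                 dh=400e3, tm_c=65.0, ea=300e3,
                 t_start_c=25.0, t_end_c=95.0, dt=0.01):
    """Simulate RFU vs temperature for
    native <=> reversibly unfolded -> irreversibly unfolded.

    heating_rate: deg C per minute.
    t_star_c: temperature (deg C) at which the irreversible rate is 1/min.
    dye: assumed dye activation by (native, reversibly unfolded,
         irreversibly unfolded). All parameter values are illustrative.
    """
    temps, rfu = [], []
    pool = 1.0  # fraction not yet irreversibly unfolded
    n_steps = int(round((t_end_c - t_start_c) / dt))
    for i in range(n_steps + 1):
        t_c = t_start_c + i * dt
        t_k = t_c + 273.15
        # van 't Hoff equilibrium constant (native <=> reversibly unfolded)
        k_eq = math.exp(-dh / R * (1.0 / t_k - 1.0 / (tm_c + 273.15)))
        # Arrhenius rate constant (1/min) for the irreversible step
        k_irr = math.exp(-ea / R * (1.0 / t_k - 1.0 / (t_star_c + 273.15)))
        native = pool / (1.0 + k_eq)
        reversible = pool - native
        irreversible = 1.0 - pool
        temps.append(t_c)
        rfu.append(dye[0] * native + dye[1] * reversible
                   + dye[2] * irreversible)
        # Each temperature step lasts dt / heating_rate minutes; only the
        # reversibly unfolded fraction can convert irreversibly.
        frac_unfolded = k_eq / (1.0 + k_eq)
        pool *= math.exp(-k_irr * frac_unfolded * dt / heating_rate)
    return temps, rfu


def apparent_tm(temps, rfu):
    """Temperature at the maximum central-difference slope (uniform spacing)."""
    i = max(range(1, len(temps) - 1), key=lambda j: rfu[j + 1] - rfu[j - 1])
    return temps[i]


def tm_shift(t_star_c, slow=0.5, fast=4.0):
    """Apparent Tm at the fast heating rate minus that at the slow rate."""
    return (apparent_tm(*simulate_dsf(fast, t_star_c))
            - apparent_tm(*simulate_dsf(slow, t_star_c)))


kinetic_shift = tm_shift(t_star_c=60.0)      # irreversible step fast near Tm
equilibrium_shift = tm_shift(t_star_c=95.0)  # irreversible step negligible
print(f"kinetic-like shift: {kinetic_shift:.2f} C")
print(f"equilibrium-like shift: {equilibrium_shift:.2f} C")
```

With these placeholder values, the apparent melting temperature of the kinetic-like case moves upward as the heating rate increases, while the equilibrium-like case is essentially insensitive to heating rate. Changing the dye tuple (for example, assigning a non-zero weight to the native state) gives a rough feel for how dye activation by different states reshapes the curve.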
Next, we examined why, for some proteins, DSF fails to reproduce the melting curves that are expected from CD or DSC. For example, it is relatively common to encounter DSF curves in which the fluorescence starts high at low temperature and then decreases, rather than increases, during heating. We hypothesized that this effect might be due, in part, to aberrant binding of the dye to the native, folded state. To test this idea, we generated a theoretical DSF dataset from Model 2, in which the affinity of the dye for each protein state (native, reversibly unfolded, and irreversibly unfolded) was individually enhanced. We found that, when dye was activated by the native state, the resulting Tma values were aberrantly elevated (Figure 1c), matching what is often observed in practice. At the other extreme, when dye was not activated by the unfolded states, the fluorescence was "flat" and unfolding transitions were not detectable (Figure 2e, right panel). These results also agree with our empirical observations of some proteins (Supplemental Figure 3). Extending this concept to data analysis, we found that including a temperature-dependent initial fluorescence population in the fitted sigmoid model improves the accuracy of Tma values calculated for proteins with native-state dye activation (described in more detail in Section III). Finally, extending this concept to the bench, we found that elimination of extraneous sources of dye activation is a critical step in optimizing DSF conditions (described in more detail in Section II).
An interactive, online tool for comparison of Models 1 and 2 is available at DSFworld and the associated R script is publicly available on GitHub.

Section II: Practical tips, theoretically grounded
Although the theoretical framework described in Section I appears to be a useful tool to improve DSF experimental design and interpretation, it does not account for experimental artifacts. A separate consideration of artifacts is important because common ones can produce effects on fluorescence that qualitatively resemble those introduced by using Model 1. In this section, we describe potential sources of these artifacts, alongside five best practices to avoid them. In addition to reducing the impact of these artifacts, the other goal of this section is to improve reproducibility and maximize sensitivity of DSF experiments.

i. Include no-protein controls for every condition.
Because DSF relies on the use of dye fluorescence as a proxy for protein unfolding, identifying and minimizing sources of protein-independent fluorescence is a critical step. Indeed, the fluorescence of SYPRO Orange is known to be sensitive to excipients that are common in biological buffers, such as glycerol, detergents, lipids and EDTA 34,43. Here, we focus on two especially pernicious and common sources of protein-independent fluorescence: dye binding to colloidal aggregates and dye binding to the plastic used in manufacture of some microtiter plates.
Dye binding to plastic is a common problem in DSF. This artifact manifests as significant fluorescence in the absence of any protein (Figure 3a). In extreme cases, this artifact produces a fluorescence transition that, upon first inspection, mimics the shape of a curve that might result from dye binding to a folded protein (Figure 3b). One can readily discriminate between these possibilities by testing dye fluorescence in buffer without protein. An under-appreciated aspect of this artifact is that it varies between plate lots (e.g. microtiter plates manufactured at different times). Thus, because offending plates might have the same catalog number and vendor, each new lot should be tested for protein-independent dye-activation before use (see Supplemental Figure 4 for a plate compatibility test protocol).
When a DSF experiment involves addition of a small molecule, we found that an additional control must be performed to reduce artifacts associated with dye activation by the compound. Much like plate-related artifacts, these manifest as a protein-independent increase in dye fluorescence. We suspected that dye binding to colloidal aggregates might underlie some of these cases. This hypothesis is based on work describing the formation of colloidal aggregates by small molecules as a recognized mechanism of pan-assay interference compounds (PAINS) 44-46. To test this idea, we assembled a panel of eight compounds that have been reported to form colloidal aggregates, but that otherwise vary in chemical structure (Supplementary Figure 5) 47,48.
Interestingly, seven of the eight compounds induced protein-independent fluorescence in the DSF experiment, which was sufficient to obscure the melting transition of lysozyme (Figure 4, Supplementary Figures 6 & 7). Importantly, we also found that the addition of 5X SYPRO Orange [...]
It is also recommended that each experiment include a benchmarked positive control. For example, we typically include a DSF standard composed of 10 µM hen egg white lysozyme and 10 µM ("5X") SYPRO Orange in buffer (e.g. 10 mM HEPES, 200 mM NaCl, pH 7.2; Tma ~ 69 °C). If this standard produces atypical data (e.g. high initial fluorescence, aberrant Tma value), this suggests a reagent-based artifact. Even with these precautions, it is sometimes difficult to completely eliminate all contributions of artifacts. For example, a protein might not be stable in the absence of detergent. In our experience, it is reasonable to proceed if the contribution of the artifact is less than 10% of the desired signal.
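The 10% rule of thumb can be checked directly from a no-protein control and the matching protein-containing well. The Python sketch below shows one possible way to do so. The definition used here (peak control RFU divided by the amplitude of the protein trace) is an assumption made for illustration, as the text does not specify how the artifact contribution should be quantified. The function name and example values are hypothetical.

```python
def artifact_fraction(no_protein_rfu, protein_rfu):
    """Peak protein-independent RFU as a fraction of the protein trace amplitude.

    Assumed definition for illustration only: other measures (for example,
    comparing the traces at a fixed temperature) may be equally reasonable.
    """
    artifact = max(no_protein_rfu)
    amplitude = max(protein_rfu) - min(protein_rfu)
    if amplitude <= 0:
        raise ValueError("protein trace has no measurable amplitude")
    return artifact / amplitude


# Hypothetical RFU readings from one condition, with and without protein
control_trace = [120.0, 130.0, 150.0, 140.0]
protein_trace = [500.0, 900.0, 4500.0, 3800.0]

fraction = artifact_fraction(control_trace, protein_trace)
acceptable = fraction < 0.10  # the 10% guideline from the text
print(f"artifact fraction: {fraction:.3f}, acceptable: {acceptable}")
```

Running the same check for every buffer, plate lot and compound condition makes it easier to spot conditions in which protein-independent fluorescence dominates before any Tma values are interpreted.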
ii. Optimize heating rates.
One practical implication of Model 2 is that both Tma values and curve shape are sensitive to the heating rate employed. In addition to the theoretical implications, this relationship is also practically useful: in our hands, both the reproducibility and sensitivity of DSF data can be [...]
In addition to improving signal, optimized thermocycling protocols can also make DSF results more reproducible and easier to analyze. For example, shallow or poorly resolved transitions are difficult to fit using common methods, such as first derivative. When these data features are encountered, it is sometimes helpful to switch to up-down heating mode. In up-down mode thermocycling, the reaction is re-cooled to 25 °C between heating increments (Supplemental [...]
[...] Kd or ∆Gunfold for the system of interest, then it is unlikely that DSF will "fix" the issue. Rather, any apparent change in Tma could be the result of one of the artifacts mentioned above.

Section III: Data analysis
Inaccessibility of robust, efficient data analysis remains a significant and widespread bottleneck for DSF users. Although recent reports have presented both scripts and websites for the analysis of DSF data 52-57, we found that two substantial bottlenecks remained unaddressed. The full analysis workflow, from raw data uploading to results downloading, is available at DSFworld 58-68 and as stand-alone scripts and modular web applications on GitHub.
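To make the first-derivative approach to Tma concrete, the following Python sketch estimates Tma as the temperature at which the smoothed fluorescence trace rises most steeply. It is a generic illustration and not the DSFworld code. The moving-average smoothing, the window size and the function names are assumptions, and the synthetic trace is generated rather than measured.

```python
import math


def moving_average(values, window=5):
    """Centered moving average; the window is truncated at the trace ends."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - half):i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def first_derivative_tma(temps, rfu, window=5):
    """Return the temperature where d(RFU)/dT of the smoothed trace peaks."""
    smooth = moving_average(rfu, window)
    best_t, best_slope = None, -math.inf
    for i in range(1, len(temps) - 1):
        slope = (smooth[i + 1] - smooth[i - 1]) / (temps[i + 1] - temps[i - 1])
        if slope > best_slope:
            best_t, best_slope = temps[i], slope
    return best_t


# Synthetic single-transition trace: Boltzmann sigmoid centered at 55 deg C
temps = [25.0 + 0.5 * i for i in range(141)]
rfu = [1000.0 / (1.0 + math.exp((55.0 - t) / 2.0)) for t in temps]

tma = first_derivative_tma(temps, rfu)
print(f"first-derivative Tma: {tma:.1f} C")
```

This sketch reports a single maximum, so it is only suited to clean, single-transition data. Shallow, noisy or multi-transition curves generally call for model fitting or for per-transition analysis.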

Discussion.
Here, we present three resources to help users design, optimize and troubleshoot DSF experiments: interactive theoretical modeling, practical tips and online data analysis. These efforts culminate in an online resource: DSFworld (https://gestwickilab.shinyapps.io/dsfworld/). In the first Section, we found that linking DSF experiments to established protein unfolding theory improved our ability to predict common problems and design potential solutions. For example, using a panel of seven proteins of diverse size, we observed that kinetics plays a significant role in the outcome of DSF experiments, motivating reconsideration of the thermodynamic framework that is widely used to date. Accordingly, we present an updated theoretical model, Model 2, which includes attention to both thermodynamics and kinetics of unfolding. Using simulated results, we demonstrate that changes in the activation energy of unfolding (Ea) affect curve shape and Tma, providing a possible explanation for how legitimate ligands can sometimes decrease, rather than increase, Tma. This is an important advance because such reports in the literature are often called into question as potential artifacts. Given the current interest in protein stability, a fresh, theory-driven approach to this question seems warranted.
These improved models illustrate that curve shape, not just Tma, is a useful feature of DSF experiments. For example, ligand binding might be evident by a change of curve shape, even if the ∆Tma is modestly affected. However, it is important to note here that the physical mechanisms responsible for this observation are often not clear. Going forward, it will be important to establish, using structural and computational methods, if there are specific elements of curve shape (e.g. slope, initial fluorescence, number of transitions) that are most informative. In the meantime, we suggest a broader view of DSF results than just a singular focus on Tma.
Other fields of biophysical measurement, such as SPR, have benefitted from application of user-initiated, quality control criteria. These efforts often coalesce around shared, online resources. Towards that goal, we report an online DSF data visualization and analysis tool at DSFworld. As part of this effort, we include customizable data fitting, visualization and plotting features, which include Tma calculation by first derivative or any of four sigmoidal models. Furthermore, we have made the full code for DSFworld available on GitHub, alongside both stand-alone scripts and modularized web applications for each of the individual data analysis problems resolved at DSFworld. This repository can serve as both a venue and resource for continued work on the challenges in DSF data analysis.
We hope readers can then use these resources (theory, technical tips, and data analysis) as a foundation to drive DSF forward through their own innovations, designing powerful experiments and completing them easily.

Materials and methods.
For all procedures, no unexpected or unusually high safety hazards were encountered.