Introduction

The onset and proliferation of cancer stem from a series of dynamic changes in the cellular interactions governing a complex network1,2,3. Following the previous work of Teschendorff4, we posit that examining properties of such networks at various states may aid in the understanding of certain cellular processes leading to tumorigenesis. One key property is the notion of robustness, or the ability of a system to adapt to dynamic changes and perturbations while still maintaining functionality. From this perspective, a fundamental hurdle to cancer therapy is acquired tumor robustness5. On the other hand, quantification of robustness, and in particular that of cancer networks, has remained elusive. Understanding and exploiting such network properties from a biological perspective provides an alternative framework for viewing underlying mechanisms. In turn, this may guide the search for, and uncover, new drug targets.

In this work, we demonstrate the role of curvature as a system-level characteristic of certain cancer networks and its relationship to network functionality in terms of a notion of robustness6,7, specifically at the local interaction level. Curvature, in the broad sense, is a measure by which a geometrical object deviates from being flat and is defined in varying manners given the context8. Our reference to curvature will be restricted to Ricci curvature and its contraction, scalar curvature. Similarly, “robustness” can be formally defined in terms of the rate function from the theory of large deviations by appealing to the Fluctuation Theorem6,7. Roughly speaking, robustness relates to the rate at which a given dynamical system returns to its original (normal) state following a perturbation or external disturbance. The key ingredient that intimately links curvature and robustness is the concept of entropy. Indeed, through a suitable characterization of the lower bound of Ricci curvature9, one can show that entropy and curvature are positively correlated, a fact that we express as ΔS × ΔRic ≥ 0, where ΔS and ΔRic are the changes in entropy and Ricci curvature, respectively (see Methods)10. Now, if we consider random perturbations to the network, the Fluctuation Theorem asserts that ΔS × ΔR ≥ 0, where ΔR is a relative change in robustness; hence, the relationship ΔRic × ΔR ≥ 0 holds. As we will argue in this work, this relationship will allow curvature to serve as an alternative, yet powerful, proxy for robustness (Fig. 1). This seems especially true for cancer networks.

Figure 1

This work focuses on analyzing robustness with respect to pairwise interactions.

Systems equipped with multiple signaling pathways can be framed in the context of robustness. Whereas previous work has shown dynamic entropy to be a cancer “hallmark” through a nodal characterization, we expound upon this by providing a framework for analyzing gene-to-gene interaction robustness. In doing so, we will show that the method herein presents no “loss of information” and may be aptly suited to uncover particular pathways contributing to the robustness of cancer systems.

Our work here differs from previous approaches of characterizing network robustness4,6,7,11 in several important aspects. To the best of our knowledge, it is the first to express general network functional robustness through curvature and to point out that this may provide an intrinsic cancer characteristic. With regard to entropy, our utilization of curvature holds the following key advantages: (I) Ricci curvature provides pairwise information over all possible pathways as opposed to network entropy, which is defined as a nodal measure7. This is particularly significant because we are interested in specific gene-to-gene interactions contributing to the resilience of cancer (including those “hidden” interactions not necessarily defined by the underlying topology). In short, previous work on network entropy exhibits a “loss of information” with regard to the robustness of the interactions themselves4,7. (II) Ricci curvature can be formulated as a simple linear program and is well-behaved as compared to network entropy12. (III) Scalar curvature, in a similar manner to network entropy, is defined as a nodal measure in which interactions are not explicitly described.

In the present work, we compare gene co-expression networks from cancer and adjacent-normal tissue samples using network curvature. Motivated by previous entropic studies4, we fix the underlying topology of the networks using prior data on known physical interactions between gene products allowing only the weights to evolve between normal and tumor networks. Then, by treating each network as a random walk, we attempt to exploit the underlying dynamics of specific gene-to-gene interactions.

Finally, we should note that the methods explicated in the present work are applicable not only to cancer networks, but may also assist in unifying several phenomena in molecular biology for which notions of robustness (via entropy and curvature) seem to be increasingly important5,13,14,15. For example, recent work has demonstrated that local signaling entropy may serve as a novel indicator of drug sensitivity13 while at the same time, may operate as a proxy for the height or elevation in Waddington’s differentiation landscape14. Furthermore, it has been argued that feedback loops are essential to the function of biological mechanisms and systems that arise from deliberate Darwinian-like principles5,15. In what follows, one can view Ricci curvature as a new feedback measure, i.e., the number of triangles in a network (redundant pathways) can be characterized by a lower bound of Ricci curvature5,16,17. This fascinating interplay between feedback, robustness, entropy and now Ricci curvature is at the core of this work.

The remainder of this paper is outlined as follows. We first provide results to demonstrate that Ricci curvature, more precisely Ollivier-Ricci curvature18,19, is a proxy for robustness as well as an apparent cancer characteristic. In particular, we discuss the important (and previously unresolved) ability to quantify robustness at the interaction level. We then show that several analogous nodal curvature measures, defined through varying contractions of Ricci curvature, achieve similar results to that of network entropy, which, by construction, is a nodal attribute. We conclude with a discussion of the results with a primary focus on information loss from previous entropic methods, examination of robustness for specific gene-to-gene interactions in the context of cancer biology and analytic advantages of employing Ricci curvature as opposed to entropy. From this, we then offer a possible path forward that relates the well-known Ricci flow to the effect (and design thereof) of specific drug targets that can possibly mitigate the robust nature of cancer.

Results

We focus our investigation primarily on transcription networks composed of metabolic and cancer specific genes20,21. For each data set, gene co-expression networks were generated by calculating the non-parametric (Spearman) correlation between all pairs of genes. That is, for a given gene pair, correlation was computed across all samples within a given phenotype (normal or cancerous tissue). The metabolic data set consists of approximately 1600 metabolic genes (derived from the Recon2 human metabolic reconstruction22) of six different tumor types: breast cancer (BRCAM), head and neck squamous cell carcinoma (HNSCM), kidney papillary carcinoma (KIRPM), liver cancer (LIHCM), lung adenocarcinoma (LUADM) and thyroid cancer (THCAM). We further supplemented the above study with corresponding networks that contain approximately 500 cancer-related genes derived from the Cosmic Cancer Gene Census21 (denoted by T, e.g., BRCAT). With regards to the topology, the networks analyzing metabolic genes possess a total of 33843 edges, average degree of 43 and a median degree of 34. For the networks composed of known cancer-related genes, the total number of edges, average degree and median degree are 8162, 37 and 22 respectively (see Methods).

Gene-to-Gene Robustness: Ollivier-Ricci Curvature

We employ a notion of Ricci curvature19 inspired by coarse geometry (Fig. 2). In particular, if we let (X, d) be a metric space equipped with a family of probability measures {μx : x ∈ X}, we define the Ollivier-Ricci curvature κ(x, y) along the geodesic connecting nodes x and y via

$$W_1(\mu_x, \mu_y) = \bigl(1 - \kappa(x, y)\bigr)\, d(x, y) \tag{1}$$

Figure 2

Positive Ricci curvature is reflected by the characteristic that for two very close points x and y with a tangent vector v connecting xy as well as tangent vectors w (at x) and w′ (at y), in which w′ is obtained by parallel transport of w, that the two corresponding geodesics will get closer.

This can be compared to the traditional flat geometry of a Euclidean space where such distances are unaffected during the parallel transport. Equivalently, this may be formulated by the fact that the transportation distance between two small geodesic balls is less than the distance between their centers. Ricci curvature along the direction xy quantifies this, averaged over all directions w at x.

where W1 denotes the Earth Mover’s Distance (Wasserstein 1-metric)23,24 and d is the geodesic distance on the graph. For the case of weighted graphs, we set

$$d_x = \sum_{y} w_{xy}, \qquad \mu_x(y) := \frac{w_{xy}}{d_x}$$

where dx is the sum taken over all neighbors of node x and where wxy denotes the weight of an edge connecting node x and node y (wxy = 0 if d(x, y) ≥ 2). The measure μx may be regarded as the distribution of a one-step random walk starting from x, with the weight wxy quantifying the strength of interaction between nodal components or the diffusivity across the corresponding link (edge). To motivate this definition and highlight the role of curvature as a proxy for robustness, we compute the Ollivier-Ricci curvature for two Ornstein-Uhlenbeck19 processes generated in an identical manner except with two different “mean-reversion” rates (see Methods). An Ornstein-Uhlenbeck process describes velocity of a Brownian particle (with mass) under the influence of friction and is regarded as more realistic than simple Brownian motion. In particular, this illustrative example (Fig. 3) shows that the signal with higher curvature (red) is more capable of returning towards zero (equilibrium) in the face of the same noise (perturbations), illuminating its robustness and as argued previously via the Fluctuation Theorem. One may also consider, for motivational purposes, Ollivier-Ricci curvature on several networks with differing geometries and topologies and their functionality with respect to robustness (Fig. 4, Fig. S1). Nevertheless, the positive correlation between the rate of return to equilibrium in the Ornstein-Uhlenbeck sense and Ollivier-Ricci curvature, holds for higher dimensions and provides a simple yet informative example linking curvature to robustness.

Figure 3

We generated two Ornstein-Uhlenbeck processes with the same parameter set except for α, which in turn yields different Ollivier-Ricci curvatures:

κ(x, y) = 0.6321 (red) with α = 1.0 and κ(x, y) = 0.0952 with α = 0.1. For both signals, the Ornstein-Uhlenbeck process was initialized with x(0) = 1 and σ = 1. One can see that Δκ × Δα ≥ 0 for this broad set of stochastic processes.

Figure 4

We computed the average Ollivier-Ricci curvature for three different networks shown above as well as network entropy.

To ensure a fair comparison, each of the networks is composed of 200 nodes with 400 (unweighted) edges, the only difference being the underlying structure. Although Ricci curvature is a local property, the comparison nevertheless shows that, on average, Ricci curvature is higher for networks that exhibit higher entropy.

We then compute the Ollivier-Ricci curvature on tumor and normal tissue networks for all the studied cancer types. We begin with a characterization of the distributions for all networks composed of metabolic genes (≈1.25 M possible pairs) as well as our supplemental corresponding networks for which we examine only known cancer-related genes (≈100 K possible pairs). In particular, we provide an analysis in terms of average curvature and the difference in expected value on the upper/lower 5% tails of the distribution, along with the p-value of a paired one-tailed Wilcoxon signed-rank test25 (Table 1, Table S1). This analysis is done, in part, to characterize the shift of the distribution with respect to (cancer-normal) changes in Ollivier-Ricci curvature. As such, one can see that the difference between cancer and normal tissue distributions is “positive” with a low p-value, signifying robustness. Further, one can consider the left tail of the distribution (at a given length of 0.1%, 0.5%, 1%, 3% or 5%) as the lower bound of Ollivier-Ricci curvature, as opposed to simply taking the minimum value, which is sensitive to topological errors. Then, it can be seen that this increase in lower bound points precisely to an increase in entropy9 (Table S2, Table S3). In all cases of examining the left tail (12 cases at 5 given lengths), the lower bound for a particular cancer network was larger than its normal counterpart. The trend also became more apparent as we decreased the tail length. The largest tail length of 5% was chosen as this was in line with the Wilcoxon signed-rank test. We also note that while we do not restrict our computation to node degree or path length, i.e., curvature is assigned to every gene pair, the average statistic was taken over those interactions with d(x, y) = 1. Revisiting equation (1), one can see that curvature (and changes in curvature) for interactions “far” from the underlying topology will decay due to the term d(x, y). We should note that since a graph is a 1-geodesic space, if κ(x, y) ≥ k for d(x, y) = 1, then κ(x, y) ≥ k for all x, y18. Thus, computing statistics (i.e., averages) for adjacent vertices will suffice and results are still valid (in the sense of robustness) for d(x, y) ≥ 2, i.e., non-adjacent pairs in general will contribute negligibly and can be treated as scaling such statistics.
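The distribution analysis described above can be sketched as follows. This is a minimal illustration on synthetic numbers, not the authors' pipeline: the function name `compare_curvature`, the use of an empirical quantile as the left-tail "lower bound", and the toy data are all assumptions made for the example.

```python
import numpy as np
from scipy.stats import wilcoxon


def compare_curvature(kappa_tumor, kappa_normal, tail=0.05):
    """Summarise the (tumor - normal) shift in edge curvature for paired edges."""
    delta = kappa_tumor - kappa_normal
    # Paired one-tailed test: is curvature larger in tumor than in normal tissue?
    _, p_value = wilcoxon(kappa_tumor, kappa_normal, alternative="greater")
    # Robust stand-in for the lower bound: a left-tail quantile rather than the minimum.
    lower_shift = np.quantile(kappa_tumor, tail) - np.quantile(kappa_normal, tail)
    return {"mean_delta": delta.mean(), "p_value": p_value, "lower_bound_shift": lower_shift}


# Synthetic paired curvatures for 500 hypothetical edges (illustration only).
rng = np.random.default_rng(0)
kappa_normal = rng.normal(0.0, 0.1, size=500)
kappa_tumor = kappa_normal + 0.05 + rng.normal(0.0, 0.01, size=500)

summary = compare_curvature(kappa_tumor, kappa_normal)
print(summary)
```

Using a quantile instead of the minimum mirrors the remark in the text that the raw minimum is sensitive to topological errors; the tail length is a free parameter of the sketch.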

Table 1 A distribution analysis for changes in Ollivier-Ricci average curvature between cancer and normal tissue for all metabolic case studies.

We note that the primary advantage of employing Ollivier-Ricci curvature is its ability to characterize robustness at the interaction level (as opposed to entropic measures, which are defined only at the nodal level, i.e., for genes). In particular, we first report the top and bottom ten interactions with respect to changes in Ollivier-Ricci curvature for the case of BRCAT (Table 2, Table 3). The investigation of this network is particularly compelling as we sought to find a subset of interactions that contribute to the network resilience (and/or fragility) amongst a set of known cancer-related genes. We observe that the gene RNF43 exhibits several robust and fragile pathways: RNF43-RSOP3, RNF43-RSOP2, RNF43-TP53, RNF43-NONO, RNF43-POT1. This is a surprising result given that RNF43 physically interacts with very few gene products and, in general, is regarded as a tumor suppressor in ovarian cancer26. On the other hand, RNF43 dominates the largest changes with respect to interaction robustness with several “hidden” non-adjacent pairs. We also computed the differential co-expression (see Methods) for the case of breast cancer (both BRCAM and BRCAT) and refer the readers to previous work for computational details20. In particular, we observed that the ranking of interactions by differential co-expression differs vastly from the ranking by differential Ollivier-Ricci curvature (Fig. S2), i.e., we are uncovering hidden information of the underlying system. Similar observations were gleaned from the remaining sets; however, we focus on breast cancer for the sake of brevity.

Table 2 Top 10 pairs with respect to changes in Ollivier-Ricci curvature in BRCAT.
Table 3 Bottom 10 pairs with respect to changes in Ollivier-Ricci curvature in BRCAT.

To this end, we also applied our method to analyze metabolic genes for the case of breast cancer, i.e., BRCAM (Table S4). While the data did not include various associated cancer genes (e.g., TP53, KRAS, BRAF), we were able to uncover several lesser known targets. In particular, we observed at the top of our list the gene LPO, which has been known to contribute to the initiation of breast cancer27; SOD3, which has been considered an important gene in the defense against oxidative stress and prevention of estrogen-mediated breast cancer28; GOT2, which has been noted to significantly affect cell growth29; and LRAT, whose over-expression has led to a poor prognosis in colorectal cancer30. While a complete analysis in the context of cancer biology will be a subject of future work, the above results should be placed in the context of “lost information” due to the resolution limitations of network entropy (see Discussion). In short, we now have a proxy for robustness at the local interaction level.

Gene Robustness: Scalar Curvature

Until now, we have considered Ricci curvature (in the Ollivier sense), which is defined between any two vertices on a graph. This is the main focus of the present work. However, in order to compare the curvature based approach with that of network entropy4, we now define several nodal measures based on the notion of “scalar curvature.”

In standard geometry, scalar curvature represents the amount by which the volume of a geodesic ball in a curved Riemannian manifold deviates from that of the standard ball in Euclidean space8. On a weighted graph, it may be defined in an analogous manner as:

$$S_{\mathrm{Ric}}(x) := \sum_{y} \kappa(x, y)\, \mu_x(y) \tag{4}$$

where we contract Ollivier-Ricci curvature with respect to the measure μx(y). Analyzing this contraction, we note that the measure μx(y) serves as a normalization factor that attempts to remove biasing with regard to topology (i.e., node degree). We can analogously define the unnormalized scalar curvature by contracting with respect to the hop metric, i.e.,

$$\tilde{S}_{\mathrm{Ric}}(x) := \sum_{y} \kappa(x, y)$$

where the summation runs over all y such that d(x, y) = 1.

One may also consider measures where nodal curvature at x in its adjacent neighborhood is defined as its minimum (maximum) Ollivier-Ricci curvature. Given that lower bounds of Ricci curvature are connected to entropy9, attaching this bound as a measure yields yet another characterization of nodal robustness. We should note that contracting with respect to the measure μx(y) is in the spirit of the local normalized entropy defined in previous cancer studies4. Similarly, contracting with respect to the hop metric above is very much in line with the unnormalized entropy7.
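The nodal contractions described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function name `nodal_curvatures`, the three-node path graph and the curvature values are hypothetical, chosen so that one node carries both a robust and a fragile edge.

```python
import numpy as np


def nodal_curvatures(kappa, mu, adjacency):
    """Contract edge curvature kappa[x, y] into per-node summaries."""
    mask = adjacency > 0
    normalized = (kappa * mu).sum(axis=1)                  # weighted by mu_x(y)
    unnormalized = np.where(mask, kappa, 0.0).sum(axis=1)  # sum over d(x, y) = 1
    lower = np.where(mask, kappa, np.inf).min(axis=1)      # minimum over neighbors
    upper = np.where(mask, kappa, -np.inf).max(axis=1)     # maximum over neighbors
    return normalized, unnormalized, lower, upper


# Hypothetical path graph 0 - 1 - 2 with one positive and one negative edge curvature.
adjacency = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
kappa = np.array([[0.0, 0.5, 0.0], [0.5, 0.0, -0.5], [0.0, -0.5, 0.0]])
mu = np.array([[0.0, 1.0, 0.0], [0.5, 0.0, 0.5], [0.0, 1.0, 0.0]])

normalized, unnormalized, lower, upper = nodal_curvatures(kappa, mu, adjacency)
print(normalized, unnormalized, lower, upper)
```

In this toy case the middle node averages its two edges to zero under either contraction, while the minimum and maximum still expose the opposing edges, which is the kind of averaging effect any nodal measure is subject to.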

After evaluating the above measures on all cancer networks for which we had data, we found that the results are consistent and comparable in the sense of cancer network differentiation. We present an average nodal measure for each cancer study along with the p-value of a one-tailed paired Wilcoxon signed-rank test (Table 4, Table S5, Table S6). We see that there exists a positive shift in the distribution for both entropy and curvature, with the exception of only one case where the p-value for ΔS in HNSCT was insufficient (Table S6). Given that the primary focus of this work is on the interaction level, we present the top and bottom ten genes in the BRCAT network with respect to normalized scalar curvature (Table 5, Table S7). This is done, in part, to illustrate the unavoidable “information loss” of any nodal measure chosen. For example, we observe that although some genes (e.g., RNF43, ETV1) possess the strongest robust interactions, they are listed in the bottom list with respect to scalar curvature. As we will argue in the next section, emphasis should be placed on interactions when analyzing network robustness.

Table 4 Comparison of different nodal measures for curvature and entropy on all networks composed of metabolic genes.
Table 5 Bottom 10 genes in BRCAT ranked with respect to scalar curvature.

Discussion

In this work, we have presented a framework to quantify gene-to-gene interaction robustness through the notion of Ollivier-Ricci curvature with an application to cancer networks. This was motivated through the intrinsic connection between entropy and Ricci curvature and, in turn, robustness via the Fluctuation Theorem. From this, we demonstrated that cancer tissue exhibits a higher curvature at the interaction and gene level on all the networks tested. While these two measures may provide important biological information4, it is important to first discuss the differences in the context of our findings and, in general, cancer biology. As the eventual goal is to uncover “knock-down” targets (and the effect thereof), we must also explore how one can alter network properties with respect to robustness, including changes to the network geometry and topology.

We begin by revisiting the top and bottom gene-to-gene interactions of breast cancer in the studied network composed of known cancer-related genes. At the interaction level, changes in robustness need not be restricted to simply a negative/positive change–genes will tend to interact in a wide ranging manner and may contain seemingly important interactions not explicitly defined by the underlying geometry and topology. From equation (4), we clearly see that through the contraction, we “lose” information through a (weighted) average in two distinct manners. Firstly, RNF43 possesses two of the strongest and weakest pairs–averaging these together will cancel out their relative significance. Secondly, nodal measures take an average over an adjacent neighborhood thereby ignoring those interactions that are non-adjacent. As we can see, several important interacting pathways (e.g., RNF43-POT1) should not be ignored as these gene-to-gene interactions exhibit larger changes than many interactions that are adjacent. The same arguments hold for network entropy. Further, previous work on network entropy discusses the significance and careful attention one must have with respect to the topology biasing. Hence, normalization factors are often adopted to provide insight into the nodal robustness4,7. No such normalization is required when employing Ollivier-Ricci curvature.

Further, the development of a systematic approach to altering network properties to uncover potential drug targets is key. In particular, certain targets may not be directly “druggable”, thereby requiring one to alter a set of genes/interactions that provide similar impact. That is, simply choosing a “knock-down” gene based on nodal robustness may prove to be insufficient. To this end, one may consider the corresponding Ricci flow:

$$\frac{d}{dt}\, d(x, y) = -\kappa(x, y)\, d(x, y)$$

Not much is known about this flow, but the idea would be, while keeping the same topology, to change the graph weights, or the network of links among the nodes, in such a way as to uniformize the curvature κ. In the engineering literature31, this has been offered as an approach, in the case of certain wireless networks, to removing some of the overloaded queues and thus should have important implications for cancer networks. Understanding discrete analogues of Ricci flow in this connection will be considered as a future research topic.

Next, we would like to mention some very interesting work32 that describes a metric geometry on the space of trees in connection with phylogenetics. It turns out that their space is a moduli space (universal parameter space) and has non-positive curvature. From previous results33, this allows one to do statistics on this space since between any two points there is a unique geodesic. This has had a number of intriguing applications in cancer research34. It would be very interesting to generalize this to more general network structures and instead of just looking at the geometric (curvature) property of an individual network to devise quantitative statistical methods based on the metric geometry comparing families of networks.

Finally, the work of Rabadan34 has been largely motivated by the problem of cancer cell heterogeneity. Indeed, cancer progression is believed to follow a Darwinian evolutionary pattern: fitter subtypes replace other less fit cells, which leads to disease. In combination with high-throughput genomics, one can construct trees to study this process. This is an example of a deep relationship between the concepts of Darwinian evolution and Boltzmann thermodynamics6. The idea is that macroscopic entropy increases under microscopic molecular collisions, while macroscopic evolution can be (partially) explained via the concept of the increase of entropy. This reasoning is very much in line with the overall thrust of the present paper, in which we use curvature (positively correlated with entropy) to quantify network robustness. Evolutionary changes and network adaptability are key topics to be considered in future research.

Methods

Data

All TCGA expression data were accessed using the Broad Institute Firehose on November 4, 2014.

Two distinct approaches were used to determine adjacency matrices for the networks under study. For our study of networks of cancer-related genes from Cosmic21, we used the simple interaction data provided by Pathway Commons project (v6 - accessed in February, 2015 from http://www.pathwaycommons.org/pc2/downloads). To do this, we first downloaded the binary relationships between pairs of genes in Simple Interaction Format. We then filtered the data only for interaction type “neighborhood-of” that represents any type of pathway-based interaction between a pair of genes. We next filtered out all interactions in which either of the interacting genes was not in our cancer gene set and therefore not of interest to us.

To identify adjacent edges in the metabolic gene data set, we used the Recon2 human metabolic reconstruction22 to identify pairs of genes whose enzymatic products shared a common substrate or product. To do so, we pruned the stoichiometric matrix (S) for cofactors (ATP, ADP, NADH, NAD, NADPH, NADP, etc.) and other highly-connected metabolites which might adversely affect the adjacency calculations (e.g. water, hydrogen ions, metal cofactors). We then used this pruned stoichiometric matrix S_P and the reaction-to-gene matrix (R) to generate a matrix encoding which metabolites and genes participated in common reactions (MG = S_P × R). Finally, to generate an adjacency matrix (A) indicating which genes participated in reactions sharing a common metabolite, we multiplied the transpose of MG by itself: A = MG^T × MG. The matrix A is square, with the length of each dimension equal to the number of genes in the model. The curvature analysis was also repeated after removing highly connected reactions (i.e. with greater than 4 distinct metabolite substrates/products, after pruning for highly connected metabolites) from S with qualitatively similar results.
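The matrix construction above can be sketched on a toy model. The example below is a hedged illustration only: the four-metabolite, three-reaction network is hypothetical, and taking absolute values and binarizing the products are assumptions of this sketch (the text does not say how signs in the stoichiometric matrix were handled).

```python
import numpy as np

# Hypothetical pruned stoichiometric matrix S_P: 4 metabolites x 3 reactions.
S_P = np.array([
    [-1,  0,  0],   # m0 is consumed by r0
    [ 1, -1,  0],   # m1 is produced by r0 and consumed by r1
    [ 0,  1, -1],   # m2 is produced by r1 and consumed by r2
    [ 0,  0,  1],   # m3 is produced by r2
])

# Hypothetical reaction-to-gene matrix R: 3 reactions x 3 genes.
R = np.array([
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
])

# Metabolite-gene participation; |S_P| makes substrates and products count alike (assumption).
MG = ((np.abs(S_P) @ R) > 0).astype(int)

# Gene-gene matrix: entry (g, h) counts the metabolites shared by genes g and h.
A = MG.T @ MG

# Binary adjacency without self-loops (assumption: any shared metabolite gives an edge).
adjacency = (A > 0).astype(int)
np.fill_diagonal(adjacency, 0)
print(adjacency)
```

Here genes 0 and 1 share metabolite m1 and genes 1 and 2 share m2, so the resulting graph is the path 0 - 1 - 2.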

For each data set, gene co-expression networks were generated by calculating the non-parametric (Spearman) correlation between all pairs of genes. Note that since we are working with correlation data for which values can be less than zero, the analysis was conducted with respect to the transformed correlation coefficient

$$w_{xy} = \tfrac{1}{2}\,(1 + c_{xy})$$

in order to construct the random walk over the network4. We should also note that one could compute weights through a mass action approach13,14 as opposed to the more general computation of correlation values given above. The advantage of the mass action method is that, from a statistical standpoint, it allows the analysis to be carried out in a more sample-specific manner. Moreover, given that biological networks involve both negative and positive weights representing specific activating and inhibiting interactions, a subject of future research will entail directly extending the current approach to the more general directed graph case following related work35.
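A minimal sketch of turning expression data into the random walk described above follows. It assumes the affine transform w = (1 + c)/2 commonly used in the network entropy literature the text follows, a fixed binary adjacency matrix, and random toy expression values; the function names are hypothetical.

```python
import numpy as np
from scipy.stats import rankdata


def spearman_matrix(expr):
    """Spearman correlation between the rows (genes) of a genes x samples array."""
    ranks = np.apply_along_axis(rankdata, 1, expr)
    return np.corrcoef(ranks)


def random_walk(expr, adjacency):
    """Transition probabilities mu[x, y] = w_xy / d_x on a fixed topology."""
    c = spearman_matrix(expr)
    w = 0.5 * (1.0 + c) * adjacency      # non-negative weights, zero off the topology (assumed transform)
    d = w.sum(axis=1, keepdims=True)     # weighted degree d_x
    return w / d


# Toy data: 4 hypothetical genes measured in 30 samples.
rng = np.random.default_rng(0)
expr = rng.normal(size=(4, 30))
adjacency = np.array([[0, 1, 1, 0], [1, 0, 1, 1], [1, 1, 0, 1], [0, 1, 1, 0]])

mu = random_walk(expr, adjacency)
print(mu.round(3))
```

Running this once per phenotype (normal and tumor samples) on the same adjacency matrix gives two random walks that differ only in their weights, which is the setting the comparison relies on.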

The Wasserstein Distance

We begin by recording the basic definition of the Lp-Wasserstein distance from optimal transport theory that we will need below. Roughly speaking, on a metric measure space, one gets a natural distance on “small” balls around points or the “fuzzified” points. For full details about the Monge-Kantorovich (optimal mass transport) problem and the associated Wasserstein distance, we refer the reader to several works on this topic23,24,36,37,38.

More precisely, let X be a metric measure space, equipped with distance d. Let μi, i = 1, 2, be two measures with the same total mass and finite p-th moment. A coupling between μ1 and μ2 is a measure μ on X × X such that

$$\int_{y \in X} d\mu(x, y) = d\mu_1(x), \qquad \int_{x \in X} d\mu(x, y) = d\mu_2(y).$$

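In other words, the marginals of μ are μ1 and μ2. Let Π(μ1, μ2) be the set of couplings between μ1 and μ2. We then define the Lp Wasserstein distance as

$$W_p(\mu_1, \mu_2) := \left( \inf_{\mu \in \Pi(\mu_1, \mu_2)} \int\!\!\int d(x, y)^p \, d\mu(x, y) \right)^{1/p}.$$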

In this paper, we only consider the cases p = 1, 2. For p = 1, the Wasserstein distance is sometimes called the “Kantorovich-Rubinstein distance” or the Earth Mover’s distance (EMD) and can be formulated as a linear program12. In particular, let X denote a discrete metric measure space with n points denoted {x1,…,xn}. Let μ1 and μ2 be two distributions and let d(x, y) denote the distance between x, y ∈ X (for the case of graphs, this is simply taken to be the hop distance). We assume that μ1 and μ2 have the same total mass. Then, W1(μ1, μ2) may be defined as follows:

$$W_1(\mu_1, \mu_2) = \min_{\mu} \sum_{i, j = 1}^{n} d(x_i, x_j)\, \mu(x_i, x_j)$$

where μ(x, y) is a coupling (or flow) subject to the following constraints:

$$\mu(x_i, x_j) \ge 0 \;\; \forall\, i, j, \qquad \sum_{j = 1}^{n} \mu(x_i, x_j) = \mu_1(x_i) \;\; \forall\, i, \qquad \sum_{i = 1}^{n} \mu(x_i, x_j) = \mu_2(x_j) \;\; \forall\, j.$$

The cost above finds the optimal coupling of moving a set of mass from distributions μ1 to μ2 with minimal “work”.
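The linear program above, together with the definition of Ollivier-Ricci curvature as one minus the ratio of transport distance to graph distance, can be sketched as follows. This is an illustrative implementation under stated assumptions, not the authors' code: the function names are hypothetical, SciPy's general-purpose `linprog` solver stands in for whatever solver was actually used, and the two test graphs are toy examples.

```python
import numpy as np
from scipy.optimize import linprog
from scipy.sparse.csgraph import shortest_path


def wasserstein1(mu1, mu2, dist):
    """Earth Mover's Distance between two distributions on n points, as a linear program."""
    n = len(mu1)
    a_eq = np.zeros((2 * n, n * n))          # flow f[i, j] is flattened row-major
    for i in range(n):
        a_eq[i, i * n:(i + 1) * n] = 1.0     # sum_j f[i, j] = mu1[i]
        a_eq[n + i, i::n] = 1.0              # sum_j f[j, i] = mu2[i]
    b_eq = np.concatenate([mu1, mu2])
    res = linprog(dist.reshape(-1), A_eq=a_eq, b_eq=b_eq, bounds=(0, None))
    return res.fun


def ollivier_ricci(weights, x, y):
    """kappa(x, y) = 1 - W1(mu_x, mu_y) / d(x, y), with d the hop distance."""
    dist = shortest_path((weights > 0).astype(float), unweighted=True, directed=False)
    mu = weights / weights.sum(axis=1, keepdims=True)   # one-step random walk
    return 1.0 - wasserstein1(mu[x], mu[y], dist) / dist[x, y]


# Toy graphs: the complete graph on 4 nodes and the path 0 - 1 - 2 - 3.
k4 = np.ones((4, 4)) - np.eye(4)
path = np.array([[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]], dtype=float)

print(ollivier_ricci(k4, 0, 1))    # edge of a highly redundant graph
print(ollivier_ricci(path, 1, 2))  # interior edge of a path
```

On the complete graph the neighborhoods of adjacent nodes overlap heavily, so little mass has to move and the curvature is positive; on the path the two neighborhoods are simply shifted by one hop and the curvature vanishes. The dense constraint matrix has n² columns, so this sketch is only practical for small neighborhoods.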

Curvature and Robustness

There have been a number of approaches19,39,40,41 to extending the notion of Ricci curvature to more general metric measure spaces. At this point, the exact relationship of one approach as compared to another is unclear. Roughly, the techniques fall into two categories: the first generalizing the weak k-convexity of the entropy functional on the Wasserstein space of probability measures as in9,39,42 and the second directly working with Markov chains to define the generalization19,40,41 on networks. There is also a notion of “hyperbolicity” due to Gromov43 based on the “thinness” or “fatness” of triangles compared to the Euclidean case and more generally a certain four-point criterion. Depending upon the application, each approach seems to be useful. In particular, we follow19,39, because of connections to notions of metric entropy.

We first define the precise notion of “robustness” to which the Fluctuation Theorem6,44 is applicable. One considers random fluctuations (perturbations) of a given network that result in deviations of some observable. Let Pε(t) denote the probability that the mean deviates by more than ε from the original (unperturbed) value at time t. Since Pε(t)→0, we want to measure its relative rate, that is, we set

$$R := \lim_{t \to \infty} \left( -\frac{1}{t} \log P_\epsilon(t) \right).$$

Therefore, large R means not much deviation and small R large deviations. In thermodynamics, it is well-known that entropy and rate functions from large deviations are very closely related.

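Next we describe the relationship of curvature and entropy given in Lott and Villani9. Let (X, d, m) denote a geodesic space and set

$$\mathcal{P}(X, d, m) := \Bigl\{ \mu \ge 0 : \int_X \mu \, dm = 1 \Bigr\}, \qquad \mathcal{P}^*(X, d, m) := \Bigl\{ \mu \in \mathcal{P}(X, d, m) : \lim_{\epsilon \searrow 0} \int_{\mu \ge \epsilon} \mu \log \mu \, dm < \infty \Bigr\}.$$

We define

$$H(\mu) := \lim_{\epsilon \searrow 0} \int_{\mu \ge \epsilon} \mu \log \mu \, dm, \qquad \mu \in \mathcal{P}^*(X, d, m),$$

which is the negative of the Boltzmann entropy Se(μ): = −H(μ); note that the concavity of Se is equivalent to the convexity of H. Then we say that X has Ricci curvature bounded from below by k if for every μ0, μ1 ∈ P(X) there exists a constant speed geodesic μt with respect to the Wasserstein 2-metric connecting μ0 and μ1 such that

$$S_e(\mu_t) \ge t\, S_e(\mu_0) + (1 - t)\, S_e(\mu_1) + \frac{k\, t (1 - t)}{2}\, W_2(\mu_0, \mu_1)^2, \qquad 0 \le t \le 1.$$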

This indicates that entropy and curvature are positively correlated, which we express as

$$\Delta S \times \Delta \mathrm{Ric} \ge 0.$$

We note here that changes in robustness, i.e., the ability of a system to functionally adapt to changes in the environment (denoted as ΔR), are also positively correlated with entropy via the Fluctuation Theorem6,44 and thus with network curvature:

$$\Delta R \times \Delta \mathrm{Ric} \ge 0.$$

Since the curvature is very easy to compute for a network, this may be used as an alternative way of expressing functional robustness. This being said, we adopt the recent notion of Ollivier-Ricci curvature motivated from coarse geometry18,19.

Ornstein-Uhlenbeck Process

It is very informative to consider the relationship of the Ollivier-Ricci curvature and robustness via a simple example18,19. We consider the Ornstein-Uhlenbeck process. The latter is a modification of the Wiener process (random walk), in which there is a tendency to converge to a central location.

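More precisely, consider the stochastic differential equation

$$dX_t = -\alpha X_t \, dt + \sigma \, dW_t, \qquad X_0 = x_0,$$

where W is Brownian motion (Wiener process) and we take x0 to be deterministic. We treat the 1-dimensional case for simplicity. Everything goes through in higher dimensions as well. The corresponding Fokker-Planck equation is

$$\frac{\partial p}{\partial t} = \alpha \frac{\partial (x p)}{\partial x} + \frac{\sigma^2}{2} \frac{\partial^2 p}{\partial x^2},$$

where p = p(x, t|x0, 0) is the transition probability of the underlying Markov process. One may show that p(x, t|x0, 0) is Gaussian with mean and variance given by45:

$$E[X_t] = x_0 e^{-\alpha t}, \qquad \mathrm{Var}[X_t] = \frac{\sigma^2}{2 \alpha} \bigl( 1 - e^{-2 \alpha t} \bigr).$$

We see that we get transition probabilities of mean x0 e^{−αt} and variance independent of x0. Since all the transitions p(x, t|x0, 0) have the same variance (and are Gaussian), the 1-Wasserstein distance between the transitions started at two points x0 and x1 is46

$$W_1\bigl( p(\cdot, t \,|\, x_0, 0),\, p(\cdot, t \,|\, x_1, 0) \bigr) = |x_0 - x_1| \, e^{-\alpha t}.$$

Finally,

$$\kappa(x_0, x_1) = 1 - \frac{W_1\bigl( p(\cdot, t \,|\, x_0, 0),\, p(\cdot, t \,|\, x_1, 0) \bigr)}{|x_0 - x_1|} = 1 - e^{-\alpha t}. \tag{24}$$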

Equation (24) illustrates the connection to fluctuation in a very simple, explicit manner. Larger α corresponds to larger curvature κ, and this corresponds to how quickly the system returns to equilibrium, that is, to the mean going to 0.
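The relationship between the mean-reversion rate and curvature can be checked numerically. The sketch below is an illustration under assumptions: it takes the curvature over a unit time step to be 1 − exp(−α t), and it uses an Euler-Maruyama discretization in which two paths share the same noise (a synchronous coupling); the function names and the step count are choices of the example.

```python
import numpy as np


def ou_curvature(alpha, t=1.0):
    """Closed-form curvature of an Ornstein-Uhlenbeck process over a time step t."""
    return 1.0 - np.exp(-alpha * t)


def simulate_pair(alpha, x0, x1, sigma=1.0, t=1.0, steps=1000, seed=0):
    """Euler-Maruyama paths started at x0 and x1 and driven by the same noise."""
    rng = np.random.default_rng(seed)
    dt = t / steps
    xa, xb = x0, x1
    for _ in range(steps):
        dw = rng.normal(0.0, np.sqrt(dt))
        xa += -alpha * xa * dt + sigma * dw
        xb += -alpha * xb * dt + sigma * dw
    return xa, xb


results = {}
for alpha in (1.0, 0.1):
    xa, xb = simulate_pair(alpha, x0=1.0, x1=0.0)
    # With shared noise the gap between the paths contracts deterministically.
    kappa_sim = 1.0 - abs(xa - xb) / abs(1.0 - 0.0)
    results[alpha] = (ou_curvature(alpha), kappa_sim)
    print(alpha, round(ou_curvature(alpha), 4), round(kappa_sim, 4))
```

The closed-form values reproduce the curvatures quoted for the two signals with α = 1.0 and α = 0.1, and the simulated contraction of the gap between coupled paths agrees with them up to discretization error.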

Convergence to Invariant Distribution

One can also see the relationship of robustness to the Ollivier-Ricci curvature in the following manner18, dealing with Markov chains. The basic idea is that larger Ollivier-Ricci curvature indicates greater robustness via the rate of convergence to the invariant (equilibrium) distribution. Specifically, suppose κ(x, y) ≥ k > 0. Then there exists a unique invariant probability measure ν. Moreover, for any x, writing μx^{*t} for the distribution of the random walk started at x after t steps,

$$W_1\bigl( \mu_x^{*t}, \nu \bigr) \le (1 - k)^t \, \frac{J(x)}{k}.$$

Here,

$$J(x) := W_1(\delta_x, \mu_x).$$

Note that W1(δx, μx) represents the jump of the random walk at x. On a connected graph X with diameter D (defined as the longest graph geodesic), this yields the following estimate for the mixing time:

$$\bigl\| \mu_x^{*t} - \nu \bigr\|_{TV} \le D\, (1 - k)^t.$$

This example, combined with the previous one, provides further support that Ollivier-Ricci curvature can be employed as a natural proxy for robustness with the distinct advantage of being easily computable.

Differential (Co-)Expression

We conclude with a simple computation of differential co-expression. Following previous work20, differential co-expression was computed using the (non-transformed) sample correlation coefficient cxy by first applying the Fisher z-transformation in order to stabilize variances due to population size:

$$z_{xy} = \frac{1}{2} \log \left( \frac{1 + c_{xy}}{1 - c_{xy}} \right).$$

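If we let z^T_xy and z^N_xy denote the z-transformation for cancer and normal gene pairs, respectively, one can then compute the differential co-expression as

$$\Delta z_{xy} = \frac{z^T_{xy} - z^N_{xy}}{\sqrt{\dfrac{1}{N_T - 3} + \dfrac{1}{N_N - 3}}}$$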

where NT and NN are the number of tumor and normal samples, respectively. For differential expression values, we summed the differential co-expression values over a given gene’s interactions as defined by the underlying adjacency matrix. This was done in order to provide a fair comparison to the values computed by scalar curvature. We again note that complete information regarding this method and data can be found in previous work20.
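The computation above can be sketched as follows, assuming the standard Fisher z statistic for the difference of two correlations (variance 1/(N − 3) per group). The function names, the toy correlation matrices and the sample sizes are hypothetical and serve only to illustrate the steps.

```python
import numpy as np


def fisher_z(c):
    """Fisher z-transformation of a correlation coefficient (equivalent to arctanh)."""
    return 0.5 * np.log((1.0 + c) / (1.0 - c))


def differential_coexpression(c_tumor, c_normal, n_tumor, n_normal):
    """Standardized difference of z-transformed correlations between tumor and normal."""
    se = np.sqrt(1.0 / (n_tumor - 3) + 1.0 / (n_normal - 3))
    return (fisher_z(c_tumor) - fisher_z(c_normal)) / se


# Hypothetical 3-gene correlation matrices; diagonals are set to 0 to avoid z(1) = infinity.
c_t = np.array([[0.0, 0.8, 0.1], [0.8, 0.0, 0.3], [0.1, 0.3, 0.0]])
c_n = np.array([[0.0, 0.2, 0.1], [0.2, 0.0, 0.6], [0.1, 0.6, 0.0]])
adjacency = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])

dz = differential_coexpression(c_t, c_n, n_tumor=100, n_normal=60)

# Nodal value: sum of pairwise values over each gene's adjacent interactions.
nodal = (dz * adjacency).sum(axis=1)
print(dz.round(2))
print(nodal.round(2))
```

The masking by the adjacency matrix mirrors the way the nodal curvature measures are restricted to adjacent interactions, which is what makes the two nodal rankings comparable.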

Additional Information

How to cite this article: Sandhu, R. et al. Graph Curvature for Differentiating Cancer Networks. Sci. Rep. 5, 12323; doi: 10.1038/srep12323 (2015).