A bird’s-eye view of the homeostatic, Alzheimer and Glioblastoma attractors

Available data for white matter of the brain allow us to locate the normal (homeostatic), Glioblastoma and Alzheimer’s disease attractors in gene expression space and to identify paths related to transitions such as carcinogenesis or Alzheimer’s disease onset. A predefined path for aging is also apparent, which is consistent with the hypothesis of programmatic aging. In addition, reasonable assumptions about the relative strengths of the attractors allow us to draw a schematic landscape of fitness: a Wright’s diagram. These simple diagrams reproduce known relations between aging, Glioblastoma and Alzheimer’s disease, and raise interesting questions such as the possible connection between programmatic aging and Glioblastoma in this tissue. We anticipate that similar diagrams in other tissues could be useful in understanding the biology of apparently unrelated diseases or disorders, and in the discovery of unexpected clues for their treatment.

In brief
Aging, carcinogenesis and Alzheimer’s disease onset in white matter of the brain are shown as paths or directions in gene-expression space, a simple view that allows us to analyze their mutual relations and to raise interesting questions, such as whether programmatic aging could be related to avoiding the Glioblastoma.

Highlights
Normal homeostatic, Glioblastoma and Alzheimer’s disease attractors are apparent in gene-expression space.
The relative disposition of the paths for carcinogenesis and Alzheimer’s disease onset reproduces known relations between these diseases.
The observed corridor for aging is consistent with programmatic aging.
Avoiding the fall into the huge basin of the Glioblastoma could be the subject of selection pressure.
Aged normal samples could be captured by the weak Alzheimer’s disease attractor.


INTRODUCTION
A well-known paradigm in molecular genetics holds that local maxima of fitness in gene expression space correspond to biologically viable states [1]. This picture has been applied to the description of cell fates along differentiation lines [2]. However, to the best of our knowledge, there are no plots based on real data for a given tissue representing at least a partial landscape with more than two of these maxima. In the present paper, we provide a drawing for white matter of the brain in which the normal state (N) is represented along with the glioblastoma (GBM) attractor and the seemingly modest maximum related to Alzheimer's disease (AD).
The plot shows that aging is a common risk factor for GBM and AD and, at the same time, that GBM and AD are opposite alternatives, as epidemiological [3,4,5,6] and molecular biology studies [7,8,9] suggest. The plot also indicates a path or corridor for normal aging, in accordance with the programmatic aging theory [10,11].
At the gene level, there are genes varying in the same way in aging, AD progression and cancer, whereas other genes mark the dichotomy between AD and GBM. An example of the latter is the MMP9 protein-coding gene, which plays an important role in tumor invasion [12,13], but is also known as a neuroprotector, controlling the interactions between axons and beta-amyloid fibers [14]. Deviations of the expression value of such a gene from its reference in normal tissue may indicate a potential progression either to AD (under-expression) or to GBM (over-expression). This unusual view, following from a simple plot, may help to understand the relations between AD and GBM biology and to identify useful gene markers for both processes. As an extra bonus, the plot raises very interesting questions, which are discussed below.

The N + GBM + AD diagram
Our starting point is the principal component analysis [15] diagram of gene expression data for white matter of the brain, shown in Fig. 1a). Four groups of samples are apparent in this figure. Samples labeled as N and GBM correspond, respectively, to pathologically normal and tumor specimens in The Cancer Genome Atlas data for Glioblastoma (TCGA, https://www.cancer.gov/tcga) [16]. They are taken during surgical procedures. Tumors are localized in different brain zones but, as is common for Glioblastoma, they are white matter tumors.
As mentioned, we use gene expression data, in FPKM format, from Refs. [16,20]. The data were obtained using different platforms. We took the approximately 30,000 genes that are unambiguously identified in both platforms and performed a simple Principal Component Analysis (PCA) [15], described elsewhere [19,21]. The common reference used to define log-fold differential expression values and to compute the covariance matrix for the PCA is the geometric mean in the N state. The 5 samples in the N state come from the TCGA data. There are also 169 GBM samples. On the other hand, in the Allen Institute data for white matter of the brain there are 47 control samples, which make up our old (O) group, and 28 AD samples.
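The PCA step described above can be illustrated with a minimal sketch. It is not the code used in Refs. [19,21]: the function name `log_fold_pca`, the base-2 logarithm, the pseudocount added to the FPKM values and the use of a singular value decomposition are all assumptions made here for illustration. The only ingredients taken from the text are the log-fold values and the covariance computed with respect to the geometric mean of the N samples.

```python
import numpy as np

def log_fold_pca(expr, is_reference, n_components=2, pseudocount=0.1):
    """PCA of log-fold changes relative to a reference group (illustrative sketch).

    expr          : (n_samples, n_genes) array of FPKM values
    is_reference  : boolean mask selecting the reference (N) samples
    pseudocount   : assumed regularizer to avoid log(0); not from the paper
    """
    log_expr = np.log2(np.asarray(expr, dtype=float) + pseudocount)
    # The log of the geometric mean equals the arithmetic mean of the logs.
    reference = log_expr[is_reference].mean(axis=0)
    lfc = log_expr - reference  # log-fold differential expression
    # SVD of the log-fold matrix diagonalizes lfc.T @ lfc, i.e. the covariance
    # taken about the reference state instead of about the global mean.
    _, s, vt = np.linalg.svd(lfc, full_matrices=False)
    components = vt[:n_components]      # unit vectors in gene space
    scores = lfc @ components.T         # sample coordinates (PC1, PC2, ...)
    explained = (s**2 / np.sum(s**2))[:n_components]
    return scores, components, explained
```

With this choice of reference, the center of the N cloud sits at the origin of the (PC1, PC2) plane, so the coordinates of the other groups can be read directly as displacements from the normal state.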
There are both conceptual and technical issues arising when using these two dissimilar experiments in a single PCA calculation. For example, the reference N is not precisely the normal state, but a set of pathologically normal samples taken from individuals with GBM tumors. Two of the patients are even older than 70 years. From the computational side, on the other hand, one could use batch corrections [22,23], which partially amend the biases associated with each group of samples, but may also introduce uncontrolled artifacts.
Thus, we decided to take the data as they are, and to use the simplest PCA technique, without any sophistication. We do not expect that any correction will essentially change the qualitative analysis following from the three-attractor diagram shown in Fig. 1a).
The ideal situation would be to repeat the studies within a single technological framework, including data from young normal people, which should be used to set the reference for differential gene expression calculations, data from GBM and AD patients, and data for normal individuals in different age ranges. This is particularly feasible in a mouse model [24]. We regard our Fig. 1a) diagram as a qualitative approximation to this ideal experiment.
Thus, in our approximation we get a gene expression space landscape with 3 attractors: N, GBM and AD, and a set of O samples moving towards the latter. The relative positions and main transitions between attractors are summarized in Fig. 1b). We assume that they are determined by the biology underlying the processes in the tissue. The N to AD transition is labeled as "early onset of AD" in order to stress that there is also a way to AD through aging, the "late onset of AD". A path for aging is also indicated in the figure. We shall come back to this point below.

Fitness landscape
There is an additional piece of qualitative information that can be introduced in our description. It is related to a fitness variable, which allows us to draw a kind of Wright's diagram [1]. A schematic drawing containing a contour plot of fitness is represented in Fig. 1c). The N and GBM attractors are fitness maxima, and they should be separated by a low-fitness barrier [21]. The GBM should be the highest maximum [21,25]. On the other hand, the transition from O to AD is quasi-continuous, with a relatively small number of differentially expressed genes [21]. This means that there is a very small barrier, or even a barrier-free path, connecting O and AD. We expect a low-fitness barrier preventing direct transitions from N to AD, and a small AD maximum, as this attractor is located in the low-fitness region far from N. All of these facts are represented in Fig. 1c). The scheme is constructed from a sum of Gaussians centered at the attractors, with standard deviations proportional to the actual values observed in Fig. 1a), and with heights qualitatively respecting the relative strengths of the attractors.
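The construction of the schematic landscape as a sum of Gaussians can be sketched as follows. All numerical values (centers, widths and heights) are hypothetical placeholders chosen only to respect the qualitative ordering stated in the text: GBM as the highest and widest maximum, N as an intermediate one, and AD as a weak maximum. They are not the values behind Fig. 1c).

```python
import numpy as np

# Hypothetical attractor parameters in the (PC1, PC2) plane; illustrative only.
ATTRACTORS = {
    "N":   {"center": (0.0, 0.0),     "sigma": 10.0, "height": 1.0},
    "GBM": {"center": (100.0, -80.0), "sigma": 40.0, "height": 3.0},
    "AD":  {"center": (120.0, 60.0),  "sigma": 15.0, "height": 0.3},
}

def fitness(x, y, attractors=ATTRACTORS):
    """Schematic fitness: a sum of isotropic Gaussians centered at the attractors."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    total = np.zeros(np.broadcast(x, y).shape)
    for params in attractors.values():
        cx, cy = params["center"]
        r2 = (x - cx) ** 2 + (y - cy) ** 2
        total = total + params["height"] * np.exp(-r2 / (2.0 * params["sigma"] ** 2))
    return total

# Evaluate on a grid, e.g. as input for a contour plot.
pc1, pc2 = np.meshgrid(np.linspace(-50, 200, 126), np.linspace(-200, 120, 161))
landscape = fitness(pc1, pc2)
```

With these placeholder values the fitness drops well below its value at N along the straight line from N to GBM before rising again, which is the low-fitness barrier mentioned in the text.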
Let us stress the meaning of a Wright's diagram in brain tissue. In other tissues, somatic evolution is mainly related to stem cell replications. But, in its normal state, the brain is a very slowly replicating tissue

MAIN RESULTS
On the basis of our diagrams, we may formulate the following remarks or statements, which are the main results of the paper:

1.
There is a direction in gene expression space, which roughly speaking may be identified with the PC1 axis, associated with aging and with an increase in the risk for AD and GBM.
Indeed, displacement along this direction implies partially climbing the low-fitness barriers separating N from the AD and GBM states, and thus increasing the risk for both AD and GBM.
It is worth looking at the main genes involved in this process. To this end, we look at the unit vector along the PC1 axis. Genes are ranked according to their contribution to the vector. The procedure is similar to the PageRank algorithm [29]. We used it in our previous work [19]. A list with the first 100 genes in the ranking is given in Supplementary Table I. Positive amplitudes define genes whose expression increases in the displacement along the positive direction of PC1, whereas negative amplitudes refer to silenced genes. These genes should simultaneously play a crucial role in aging, GBM and AD.
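A minimal sketch of such a ranking is given below. It assumes that a gene's "contribution" is the absolute value of its amplitude in the normalized PC vector; the function name `rank_genes` and the gene labels in the usage example are hypothetical.

```python
import numpy as np

def rank_genes(component, gene_names, top=100):
    """Rank genes by the absolute value of their amplitude in a PC unit vector.

    Returns (gene, signed amplitude) pairs: a positive sign means expression
    increasing along the positive PC direction, a negative sign means silencing.
    """
    component = np.asarray(component, dtype=float)
    unit = component / np.linalg.norm(component)   # enforce unit length
    order = np.argsort(-np.abs(unit))[:top]        # largest contributions first
    return [(gene_names[i], float(unit[i])) for i in order]

# Usage with made-up amplitudes and placeholder gene labels.
ranking = rank_genes([0.1, -0.9, 0.3], ["geneA", "geneB", "geneC"], top=2)
```

Because the vector has unit length, the squared amplitudes sum to one, so each squared amplitude can be read as the fraction of the PC direction carried by that gene.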
Of course, due to the qualitative-only value of our analysis, the genes and especially the ranking should be taken with care. Nevertheless, notice that 20 of the silenced genes are related to the Transmission across chemical synapses pathway. In Supplementary Table II
Above, we mentioned MMP9 as an example of genes playing opposite roles in GBM and AD. The UBE2C protein-coding gene is another known gene with this characteristic [38,39]. Fig. 1d) shows violin plots for the differential expression of both genes in N, AD and GBM samples. They are overexpressed in the N to GBM transition, but silenced in the early N to AD transition.
Notice also in Supplementary Table III the presence of many ribosomal protein, small nuclear RNA, micro RNA and other genes, inversely regulated in both processes.

3.
There is an aging corridor, that is, a preferential path for aging in gene expression space.
In our data, there are samples in the N region and samples corresponding to normal aged brains, located in a definite region close to the AD attractor. In other words, the process of aging seems to define a trajectory or corridor of continuously decreasing fitness, of which the O data show the last segment.
Samples in the intermediate region are, however, lacking.
Instead of including additional samples in our figure, which would introduce additional batch effects, we use recent results in a mouse model [24], which clearly show a continuous corridor for aging. We give in Supplementary Fig. 1 a replot of their data for corpus callosum, a white-matter-rich region. In the left panel, the first two principal components are plotted for the centers of the subgroups of samples. Mouse ages between 3 and 28 months are considered, the latter being roughly equivalent to 80 years on a human scale. A corridor for aging is apparent. The right panel, on the other hand, shows true distances including all the components. Thus, the projections onto the (PC1, PC2) plane are a fair representation of the actual distribution of points.
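One way to check that a two-dimensional projection is a fair representation is to compare pairwise distances in the plane with the distances computed from all components. The sketch below illustrates the idea on generic points; it is an assumed procedure, not necessarily the one used to produce the right panel of Supplementary Fig. 1, and the function name `projection_fidelity` is hypothetical.

```python
import numpy as np

def projection_fidelity(points, n_components=2):
    """Ratio of projected to full-space pairwise distances (values in (0, 1]).

    points : (n_points, n_dims) array, e.g. centers of age subgroups.
    Ratios close to 1 mean the leading plane preserves the true distances.
    Assumes all points are distinct, so that no full-space distance is zero.
    """
    points = np.asarray(points, dtype=float)
    centered = points - points.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    projected = centered @ vt[:n_components].T

    def pairwise(a):
        diff = a[:, None, :] - a[None, :, :]
        return np.sqrt((diff ** 2).sum(axis=-1))

    upper = np.triu_indices(len(points), k=1)   # each pair counted once
    return pairwise(projected)[upper] / pairwise(centered)[upper]
```

An orthogonal projection can only shorten distances, so the ratios never exceed one; how close they stay to one quantifies how much of the geometry is lost when looking only at the (PC1, PC2) plane.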
In our scheme, Fig. 1b), an aging corridor is delineated. Fig. 1c) suggests that the corridor is a direction with minimal decrease of fitness.
A preferential direction or corridor for aging is consistent with the hypothesis of programmatic aging [10,11], i.e. the idea that aging is programmed in our genes.

4.
The predetermined aging corridor could be related to the pressure of avoiding the strong GBM attractor.
A very interesting question is why a preferred direction for aging is selected. Our oversimplified scheme, Fig. 1c), offers an unexpected answer to this question: in white matter it could be related to the pressure of avoiding the strongest attractor, the GBM.
Indeed, for each small portion of the tissue, we may model aging as a kind of random motion starting in the N region. A similar model was used in Ref. [40] in order to describe somatic evolution to cancer.
We first assume that the direction of jumps is random in the plane shown in Fig. 1c). Then, there is a relatively high probability for trajectories to be captured by the huge basin of the GBM attractor, leading to the initiation of a tumor. This implies an enormous increase of fitness, the spread of the
As an indirect check, we may compare GBM and AD incidences. In a model where the direction of jumps is random, the incidence of GBM should be much higher than that of AD. However, the global incidence of glioblastoma is less than 10 in 100,000 people [42], in contrast with the 5% of people with AD in the age interval 65-74 years and the 13% among people aged 75 to 84 [43]. Motion towards the GBM center is avoided.
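The scenario with random jump directions can be illustrated with a toy simulation. This is a sketch under strong assumptions, not the model of Ref. [40]: basins are represented as disks with hypothetical centers and radii (a large GBM basin, a small AD basin), jumps have a fixed length, and a walker is considered captured as soon as it enters a disk.

```python
import numpy as np

# Hypothetical capture disks in the (PC1, PC2) plane: (center, radius).
BASINS = {
    "GBM": (np.array([100.0, -80.0]), 60.0),
    "AD":  (np.array([120.0, 60.0]), 20.0),
}

def simulate_capture(n_walkers=500, n_steps=2000, step=5.0, seed=0):
    """Isotropic random walk from the N region (origin) with absorbing basins."""
    rng = np.random.default_rng(seed)
    names = list(BASINS)
    pos = np.zeros((n_walkers, 2))
    fate = np.full(n_walkers, -1)            # -1: still free, else index in names
    for _ in range(n_steps):
        active = fate == -1
        if not active.any():
            break
        # Jump of fixed length in a uniformly random direction.
        angles = rng.uniform(0.0, 2.0 * np.pi, size=int(active.sum()))
        pos[active] += step * np.column_stack((np.cos(angles), np.sin(angles)))
        for k, name in enumerate(names):
            center, radius = BASINS[name]
            inside = active & (np.linalg.norm(pos - center, axis=1) < radius)
            fate[inside] = k                 # captured walkers stop moving
            active = active & ~inside
    counts = {name: int((fate == k).sum()) for k, name in enumerate(names)}
    counts["free"] = int((fate == -1).sum())
    return counts

counts = simulate_capture()
```

In this unbiased toy model the larger and closer GBM disk captures more walkers than the AD disk, which is the opposite of the observed incidences and illustrates why a preferred aging direction away from the GBM basin is suggested in the text.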

5.
The late onset of AD could be the result of the capture of aged brain microstates by the AD attractor.
The picture is, thus, as follows. The process of aging is initially related to a displacement along the aging corridor, with the corresponding decrease of fitness. In the last steps, the O states are captured by the weak AD attractor. This statement is supported by calculations in Ref. [21]. We already mentioned that, as a function of age, subgroups of O samples move towards the AD center.
In Supplementary Table V we show the top 10 genes in the O to AD transition. They include genes listed in Supplementary Table I, but varying in the opposite direction, that is, in the negative direction of the PC1 axis. This fact is represented in the schematic diagram given in Fig. 1b).

DISCUSSION
Our simple qualitative drawings identify directions in gene expression space associated with different biological processes: aging, carcinogenesis, AD onset. Each of these directions is characterized by a "metagene" or gene expression profile, from which the main genes contributing to the process can be extracted.
Some of our results confirm previous knowledge, but others require further corroboration, for example the idea that programmatic aging could be related to avoiding the strongest attractor, the GBM, or the late onset of AD as the capture of normal aged samples by the AD attractor. We hope they will motivate experimental research along these directions. Particularly feasible is a mouse model, of which Ref. [24] is a nice example.
Let us stress that even more refined data or computational methods would not essentially modify our qualitative schemes with only 3 attractors. Their relative positions could vary, but the formulated statements will remain. We anticipate that similar diagrams in other tissues, besides providing an integral perspective, could be useful in understanding the biology of apparently unrelated diseases or disorders, and in the discovery of unexpected clues for their treatment.
[17]. The centers of the N and GBM clouds of samples in gene expression space define, respectively, the Normal (homeostatic) and Glioblastoma Kauffman attractors [18,19]. On the other hand, the groups labeled as AD and O correspond, respectively, to Alzheimer disease and control white matter samples in the Allen Institute study on aging and dementia (http://aging.brainmap.org/) [20]. They are taken post mortem. The O group comes from normal aged patients, with ages ranging between 77 and 101 years. We studied the O to AD transition in Ref. [21]. As age increases, we observe a displacement of O samples towards the center of the AD cloud. Average positions of AD subgroups of samples, however, are fixed irrespective of age. This property is apparent in Fig. 4 of Ref. [21]. From these facts, we conclude that the center of the AD cloud of samples may define an attractor in gene expression space, whereas O samples are captured by the AD attractor in the process of aging.

[26]
Fig. 1. Gene expression diagrams and schematic fitness landscape. a) Principal component analysis of the data studied in the paper. N - Normal homeostatic state, GBM - Glioblastoma, AD - Alzheimer disease state. The normal old samples are denoted by O. b) Schematics of the transitions between attractors. c) A Wright diagram showing a hypothetical contour plot of fitness. The absolute maximum corresponds to the GBM state. The AD attractor is represented as a slight local maximum. d) Violin plot for the log-fold changes of MMP9 and UBE2C genes in N, AD and GBM states. The geometric mean of the expression in the N state is taken as reference in order to compute differential expression values.
tumor in the brain and a life expectancy for the individual of only around two years after initiation [41]. It may affect individuals of reproductive age. Thus, avoiding the GBM attractor could be the subject of selection pressure.

Fig. 1a) should be completed with data corresponding to other kinds of dementia or brain disorders. In particular, one should expect a Parkinson disease area close to the AD attractor and opposite to GBM [44]. The whole picture may reveal a still finer topology of gene expression space and a richer Wright diagram.

we list the main Reactome pathways associated with these genes [30]. There are 56 annotated genes in this set. Decreased synaptic function is a known feature of the aged brain, according to the review [31]. The second main characteristic, according to this reference, is an increased immune function, which is not particularly apparent in our set of genes. Instead, we observe genes related to Neurotoxicity of clostridium toxins [32], to a decrease of mitochondrial activity [33], micro RNAs shared between AD and GBM [34], etc.

2.
There is a direction in gene expression space, which may be roughly identified with the PC2 axis, showing that AD and GBM are mutually exclusive alternatives.
support this dichotomy. Consequently, the PC2 axis involves genes inversely deregulated in AD and GBM. In Supplementary Table III, we list the top 100 genes defined by the unit vector along the PC2 axis. Positive weights correspond to genes whose expression increases in the N to AD transition. On the other hand, negative amplitudes correspond to genes with increasing expression in the N to GBM transition. Only 18 of the genes in our set are annotated in Reactome pathways. The pathways are listed in Supplementary Table IV. They are related to control of the cell cycle, DNA replication, apoptosis, modification of the extracellular matrix, etc., i.e. to cancer hallmarks [35-37].