Multiscale networks in multiple sclerosis

Complex diseases such as multiple sclerosis (MS) span a wide range of biological scales, from genes and proteins to cells and tissues, up to the full organism. We conducted a multilayer network analysis and deep phenotyping with multi-omics data (genomics, phosphoproteomics and cytomics), brain and retinal imaging, and clinical data, obtained from a multicenter prospective cohort of 328 patients and 90 healthy controls. Multilayer networks were constructed using mutual information, and Boolean simulations identified paths within and among all layers. The path most commonly found in the Boolean simulations connects MP2K with Th17 cells, the retinal nerve fiber layer (RNFL) thickness and the age-related MS severity score (ARMSS). Combinations of several proteins (HSPB1, MP2K1, SR6, KS6B1, SRC, MK03, LCK and STAT6) and immune cells (Th17, Th1 non-classic, CD8, CD8 Treg, CD56 neg, and B memory) were part of the paths explaining the clinical phenotype. Specific paths identified were subsequently analyzed by flow cytometry at the single-cell level.

Graphical abstract

Author Summary

Complex diseases such as multiple sclerosis (MS) involve the contribution of a wide range of biological processes. We conducted a systems biology study of MS based on network analysis and deep phenotyping in a prospective cohort of patients with clinical, imaging, genetics, and omics assessments. The gene, protein and cell paths explained variation in central nervous system damage and in metrics of disease severity. Such multilayer paths explain the different phenotypes of the disease and can be developed as biomarkers of MS.

together, again using mutual information, following a hierarchy that connects each layer successively, starting with genomics and working up to the phenotypic (clinical) layer. (b-f) Topology of individual layer networks from the experimental data. In each of the networks, the degree of each node is color-coded, with higher degrees in darker colors. The edge weights are coded in grey scale in a similar manner, with a darker edge representing a higher weight, and thus a higher correlation between nodes. The genomics network was enriched with the previous

The focus of the results is on the paths between the genes, proteins, cells and the phenotype (imaging and clinical scales). Each step below shows how the paths were identified, and which sources tend to be more strongly connected with the phenotype. First, descriptive information about the data is given, then the networks of the layers are constructed, then Boolean simulations are run, and finally the top paths are selected.

Flow cytometry analysis was carried out at baseline in peripheral blood mononuclear cells (PBMCs) from the first 227 patients and 82 HC. Results from the cytometry analysis in this cohort

We built networks for each of the five layers (genetics, phosphoproteomics, cytomics,

connections. The analysis was made using the 67 subjects with complete data in all 5 layers.

We next sought to integrate all the layers in paths that reflect the network dynamic interactions, in order to obtain a functional view of the information flow across layers. To that end, we created a single network including all five layers with the same hierarchy described

networks. We counted how many times a given path appeared in the permuted networks. Focus was placed on those pathways that were present in less than 1% of the permuted pathways. Out

Path analysis

For the path analysis we use the following notation: NODE 1 > NODE 2 > NODE 3, (In the case there are multiple nodes on the same layer along similar paths they appear as NODE 1 > NODE 2 - NODE 3 > NODE 4, where NODES 2 and 3 could be two proteins for example.) and

Dynamic network analysis identifies gene-protein-cell paths associated with phenotype
Paths predicting MS phenotype from single-cell data

In order to assess some of the paths identified in the study at the single-cell level, we

were available from the baseline visit (Figure 7).

First, we found significant linear regression models for each of the three kinases

We then applied the single-cell data to our multilayer network and paths shown in Figure 4. The network was made using the significant values from the linear regressions to relate

2) SNP25 > MSGB non-HLA > STAT6 > Th17 > mRNFL > ARMSS;

The interactions of several phosphoprotein-cell paths with the phenotype were validated by flow cytometry studies, which were based on single-cell analysis. A multilayer network analysis is thus able to identify a differential activation of the immune system's multiple scales in MS patients that drives the phenotype.

It is of course possible that there were changes on the protein and genetics level, but they were not acting as mediators between the changes in the cell counts and the phenotype seen in this case. These could be considered sub-level systems that may cause the changes in the higher levels when concerning the phenotype.

The results from the multilevel network analysis with the omics data and phenotype data highlight the importance of considering MS as a multiscale disease, where the layers connect with varying strengths and information is filtered or strengthened across the layers (34, 39).

Previous studies attempted to directly link the genomic layer with the phenotypes in many

The data provided by the Sys4MS cohort was rich in the wide range of scales it covered.

were conducted using PBMCs, rather than in immune cells from the central nervous system.

Also, the protein analysis was not performed at the single-cell level but in bulk PBMCs in the overall cohort. Therefore, single-cell dynamics were not captured in the first experiment.

Once individual layer networks were constructed, the features between layers were connected together, again with mutual information. Not all layers are interconnected, however, due to a predetermined hierarchy applied to the system (see Figure 1g). Ultimately, this

Calculation of correlation for edges

The method to calculate the edge weights in our networks was adopted from the ARACNE method (62) and simplified. The networks were constructed using mutual information,

placed between all significant pairs. Weights are assigned using the normalized value of mutual information, which falls between 0 (no correlation) and 1 (perfect correlation).
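As an illustration of how normalized mutual information can serve as an edge weight, the following is a minimal sketch and not the implementation used in the study. Several choices here are assumptions that the text does not specify: the histogram-based estimator with equal-width bins, the normalization by sqrt(H(X) H(Y)), the permutation test used to decide which pairs are significant, and the function names `normalized_mutual_information` and `mi_edge_weights`.

```python
import numpy as np


def entropy(p):
    """Shannon entropy (in nats) of a probability vector, ignoring zero entries."""
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))


def normalized_mutual_information(x, y, bins=8):
    """Histogram-based mutual information scaled to [0, 1].

    Assumption: normalization by sqrt(H(X) * H(Y)); the source only states
    that the normalized value lies between 0 and 1.
    """
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1)  # marginal distribution of x
    py = pxy.sum(axis=0)  # marginal distribution of y
    hx, hy = entropy(px), entropy(py)
    if hx == 0.0 or hy == 0.0:
        return 0.0  # a constant variable carries no information
    mi = hx + hy - entropy(pxy.ravel())  # I(X;Y) = H(X) + H(Y) - H(X,Y)
    return float(np.clip(mi / np.sqrt(hx * hy), 0.0, 1.0))


def mi_edge_weights(data, bins=8, n_perm=200, alpha=0.05, seed=0):
    """Symmetric weight matrix: normalized MI on significant pairs, 0 elsewhere.

    data has shape (n_samples, n_features). Significance is assessed with a
    permutation test (an assumption made for this sketch).
    """
    rng = np.random.default_rng(seed)
    n_features = data.shape[1]
    weights = np.zeros((n_features, n_features))
    for i in range(n_features):
        for j in range(i + 1, n_features):
            observed = normalized_mutual_information(data[:, i], data[:, j], bins)
            # Null distribution: shuffle one variable to break any dependence.
            null = np.array([
                normalized_mutual_information(
                    rng.permutation(data[:, i]), data[:, j], bins
                )
                for _ in range(n_perm)
            ])
            p_value = (1 + np.sum(null >= observed)) / (1 + n_perm)
            if p_value < alpha:
                weights[i, j] = weights[j, i] = observed
    return weights
```

With this normalization, a feature compared with itself receives a weight of 1, and pairs that do not pass the permutation test are left unconnected (weight 0).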

The combined network (later used for the path analysis) was constructed using Pearson

Noise was also added to the system, where each element has a set probability of changing its state at each iteration. The effect of noise is illustrated in Figure 9. This addition of noise reflects the inherent stochasticity in biological systems and prevents the simulations from simply settling directly into a fixed state. The noise was chosen to be 5% because this allows greater differences for the cross-correlation of the signals between nodes, as shown in Figure 9. With no noise at all, many of the nodes remain either active or inactive for the majority of the