## ABSTRACT

Networks are present in many aspects of our lives, and networks in neuroscience have recently gained much attention, leading to novel representations of brain connectivity. However, there is still room for investigation of the genetic contribution to brain connectivity. The integration of neuroimaging and genetics allows a better understanding of the effects of genetic variation on brain structural and functional connections, but few studies have successfully investigated the longitudinal association of this interplay. Several Alzheimer’s disease-associated genetic variants have been identified through omic studies, and the current work uses whole-brain tractography in a longitudinal case-control study design and measures the structural connectivity changes of brain networks to study the neurodegeneration of Alzheimer’s. This is performed by examining the effect of targeted genetic risk factors on local and global brain connectivity. We investigated the degree to which changes in brain connectivity are affected by gene expression. More specifically, we used the most common brain connectivity measures such as efficiency, characteristic path length, betweenness centrality, Louvain modularity and transitivity (a variation of clustering coefficient). Furthermore, we examined the extent to which the Clinical Dementia Rating relates to longitudinal changes in brain connections and to genetic variation. Here, we show that the expression of PLAU and HFE genes increases the change in betweenness centrality related to the fusiform gyrus and clustering coefficient of cingulum bundle over time, respectively. APP and BLMH gene expression associates with local connectivity. We also show that betweenness centrality has a high contribution to dementia in distinct brain regions.
Our findings provide insights into the complex longitudinal interplay between genetics and neuroimaging characteristics and highlight the role of Alzheimer’s genetic risk factors in the estimation of regional brain connection alterations. These regional relationship patterns can be useful for early disease treatment and neurodegeneration prediction.

## Introduction

Many factors may affect the susceptibility to Alzheimer’s Disease (AD), and there are various ways to measure the disease status. However, no single factor can sufficiently predict the disease risk^{1}. Genetics is believed to be the most common risk factor in AD development^{2}. Towards studying the etiology of the disease, a number of genetic variants located in about 20 genes have been reported to affect the disease through many cell-type specific biological functions^{3}. Those findings resulted from omic studies such as Genome-Wide Association Studies (GWAS). GWAS highlighted dozens of multi-scale genetic variations associated with AD risk^{4–6}.

From the early stages of studying the disease, the well known genetic risk factors of AD were found to lie within the coding genes of proteins involved in amyloid-*β* (*Aβ*) processing. These include the well-known Apolipoprotein E (APOE) gene that increases the risk of developing AD^{7}, the Amyloid precursor protein (APP)^{8}, presenilin-1 (PSEN1) and presenilin-2 (PSEN2)^{9, 10}. More recently, the advancement in technologies and integration of genetic and neuroimaging datasets has taken Alzheimer’s research steps further, and produced detailed descriptions of molecular and brain aspects. Such studies have shown a great success in unveiling and replicating previous findings^{11, 12}. Shaw et al.^{13}, for example, showed that carriers of APOE are more likely to lose brain tissue, measured as the cortical gray matter, than noncarriers. Other studies have utilised the *connectome*^{14} to study different brain diseases through associating genetic variants to brain connectivity^{15}. A structural *connectome* is a representation of the brain as a network of distinct brain regions (nodes) and their structural connections (edges), calculated as the number of anatomical tracts. Those anatomical tracts are generally obtained by diffusion tensor imaging (DTI)^{16}, a method used for mapping and characterizing the diffusion of water molecules, in three-dimensions, as a function of the location. This representation highlighted a network based organization of the brain with separated subnetworks (*network segregation*) which are connected by few nodes (*network integration*)^{17}. Given such a “small-world” representation of the brain, it is also possible to represent each individual brain as single scalar metrics which summarize peculiar properties of segregation and integration^{18}. Alternatively, those global metrics can also be used to quantify local properties of specific nodes/areas. 
Early works demonstrated that APOE-4 carriers have an accelerated age-related loss of global brain interconnectivity in AD subjects^{19}, and that topological alterations of both structural and functional brain networks are present even in healthy subjects carrying the APOE gene^{20}. A more recent work has shown an association between APOE expression and brain segregation changes^{21}. Going beyond the APOE gene, Jahanshad et al.^{22} used a dataset from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) to carry out a GWAS of brain connectivity measures and found an associated variant in F-spondin (SPON1), previously known to be associated with dementia severity. A meta-analysis study also showed the impact of APOE, phosphatidylinositol binding clathrin assembly protein (PICALM), clusterin (CLU), and bridging integrator 1 (BIN1) gene expression on resting state functional connectivity in AD patients^{23}.

Moreover, AD is a common dementia-related illness; in the elderly, AD represents the most progressive and common form of dementia. Accordingly, incorporating and assessing dementia severity when studying AD provides more insights about the disease progression from a clinical point of view. A reliable global rating of dementia severity is the Clinical Dementia Rating (CDR)^{24}. This paper uses a dataset from ADNI (http://adni.loni.usc.edu/) and presents an integrated association study of specific AD risk genes, dementia scores and structural connectome characteristics. Here, we adopted a longitudinal case-control study design mainly to examine the association of known AD risk gene expression with local and global connectivity metrics. We also aim at testing the longitudinal effect of brain connectivity on different CDR scores, and at carrying out a multivariate analysis to study the longitudinal effect of gene expression and connectome changes on CDR. Our approach can be summarized in the simplified representation in Figure 1, where specific genes affect decreases of connectivity between baseline and follow-up, and this ultimately affects intellectual abilities and CDR scores.

## Results

### Longitudinal Connectivity Changes and CDR

Initially, we used descriptive statistics plots to visualize the data for the two populations of AD and matched control subjects. To facilitate the integrated analysis, we looked into the different sets of data individually to better understand the underlying statistical distribution of each, and chose the analysis methods accordingly. Firstly, we plotted the global and local connectivity metrics in a way that illustrates the longitudinal change. Those longitudinal changes are measured at the 1-year follow-up from baseline screening. The global connectivity metric box plots show the baseline and follow-up distributions for both AD and controls for transitivity, Louvain modularity, characteristic path length and global efficiency (Figure 2). The figure shows that the longitudinal changes in connectome metrics are statistically significant among the AD subjects, but not within the control population. In fact, across all comparisons, the only significant differences were in the AD group, for the characteristic path length (p-value 0.0057), global efficiency (p-value 0.0033) and Louvain modularity (p-value 0.0086).

Supplementary Figure S1, Supplementary Figure S2 and Supplementary Figure S3 show the distribution of the local efficiency, clustering coefficient and betweenness centrality connectivity metrics, respectively, at the baseline and follow-up (left sub-figures), as well as their absolute differences (right sub-figures), at all atlas brain regions. A list of the brain atlas region names, abbreviations and IDs is available in Supplementary Table S1. Moreover, we show, in Supplementary Figure S4, the scatter and violin plots of the six CDR scores, at the baseline and follow-up. Those are the memory, orientation, judgment and problem solving, community affairs, home and hobbies, and personal care scores, which take the categorical values illustrated in the Materials and Methods (and also in Supplementary Figure S4).

Both global and local connectivity features show non-symmetric distribution in the baseline, follow-up and absolute change between them. Therefore, we use non-parametric models and statistical tests in the following analysis.

### Gene Expression

We derived a list of 17 AD risk factor genes from BioMart, and retrieved 56 related probe sets. We performed a Mann-Whitney U test to assess whether the expression of a specific probe set differs between AD and controls. For each gene, we chose the probe set that has the lowest p-value; Table 1 reports the selected probe set for each gene. After estimating the expression of the 17 genes, as explained in the Materials and Methods, we plotted a heatmap of the related gene expression profiles, shown in Supplementary Figure S5. Here, some of the genes appear to be highly expressed in the profiles (e.g. SORL1 and PSEN1), while others show very low expression (e.g. HFE and ACE).

### Association Analysis

We studied the undirected associations of the expression of the 17 genes with the longitudinal change in global and local brain connectivity, as well as the associations with longitudinal CDR and connectivity changes. The total sample size after integrating all the datasets was 47 participants. Firstly, we performed an association analysis of gene expression with the connectivity changes locally, at each Automated Anatomical Labeling (AAL) brain region. In Table 2 we show the top results reported along with the Spearman correlation coefficient. APP (*ρ* = −0.58, p-value = 1.9e-05) and BLMH (*ρ* = 0.57, p-value = 2.8e-05) are the top and only significant genes in the list, and associate with the change in local efficiency at the right middle temporal gyrus (Temporal_Mid_R AAL region) and in clustering coefficient at the left Heschl gyrus (Heschl_L), respectively. Supplementary Figure S6 shows the scatter plots for these two associations.

In Table 2, a similar pattern is observed in the association results for the clustering coefficient and the local efficiency, e.g. both metrics are associated with BLMH at the left Heschl gyrus (Heschl_L), APP at the right middle temporal gyrus (Temporal_Mid_R) and PLAU at the right angular gyrus (Angular_R). We attribute this to the strong correlation that exists between the local efficiency and clustering coefficient at the baseline, at the follow-up and in the absolute change (see Supplementary Figure S7). On the other hand, Supplementary Table S2 reports the top results of the association between gene expression and the change in brain global connectivity. In this case there are no significant associations, so the associations observed cannot be distinguished from chance.

### Regressing Change in Local and Global Brain Connectivity on Gene Expression

We analyzed the directed association by regressing the change in local connectivity (as the dependent variable), at each AAL region, on gene expression (as the independent variable or predictor) using a quantile regression model. Table 3 reports the top results, along with the regression coefficients, p-values and t-test statistics. PLAU was the most significant gene affecting the absolute change in betweenness centrality at the left fusiform gyrus (Fusiform_L), with an increase of 487.13 for each unit increase in PLAU expression (p-value = 3e-06). It was followed by the expression of HFE, with an effect size of 0.1277 on the change in local efficiency at the right anterior cingulate and paracingulate gyri (Cingulum_Ant_R). Those observed associations are illustrated in Figure 3. Supplementary Figure S8, Supplementary Figure S10 and Supplementary Figure S9 show the Manhattan plots for the -log10 of the p-values corresponding to the quantile regression models of the change in local efficiency, clustering coefficient and betweenness centrality, respectively.

Similarly, we regressed the absolute change of global connectivity measures on gene expression values; the top results are shown in Supplementary Table S3. None of the results has a p-value below the significance threshold we set.

### Additive Genetic Effect on Brain Regions

To visualize the overall contribution of the AD risk genes used in this work to distinct brain areas, we added up the -log10 p-values for the gene expression coefficients at each of the 90 AAL regions. The p-values were obtained from the quantile regression analysis between the gene expression values and each of the three connectivity metrics, namely the absolute difference between baseline and follow-up of local efficiency, clustering coefficient and betweenness centrality. Figure 4 summarizes this as follows: 1) the brain connectome is represented without edges for each of the connectivity metrics, 2) each node represents a distinct AAL region and is annotated with the name of the region, and 3) the size of each node is the sum, over all the genes, of the -log10 of the p-values associated with the regression coefficients. The color is assigned automatically by the BrainNet Viewer. Overall, although the gene contributions to the absolute change in local efficiency have a similar pattern to that of clustering coefficient, the contribution to betweenness centrality change is relatively small.
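The node-size computation described above can be sketched in a few lines. This is a minimal illustration under assumptions: the function name `region_scores`, the dictionary layout and the p-values are hypothetical and are not taken from the study data.

```python
import math


def region_scores(pvalues_by_region):
    """Sum -log10(p) over all genes for each brain region.

    `pvalues_by_region` maps a region name to the list of regression
    p-values obtained for that region, one per gene (hypothetical layout).
    """
    return {
        region: sum(-math.log10(p) for p in pvalues)
        for region, pvalues in pvalues_by_region.items()
    }


# Made-up p-values for three genes at two AAL regions.
example_pvalues = {
    "Fusiform_L": [0.001, 0.01, 0.1],
    "Heschl_L": [0.1, 0.1, 1.0],
}
scores = region_scores(example_pvalues)
```

A region with many small p-values receives a large score and would therefore be drawn as a larger node.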

### Regressing the difference in CDR on the difference in Global and Local Connectivity

To assess the directed and undirected association of the longitudinal measures of global connectivity and CDR scores, we calculated the difference between baseline and follow-up visits for both CDR and global connectivity metrics, i.e. *CDR*_{baseline} − *CDR*_{follow−up} and *metric*_{baseline} − *metric*_{follow−up}, respectively. The Spearman and quantile regression results are both shown in Supplementary Table S4. We observe that the increase in overall brain segregation, measured through transitivity, reduces the memory score over time (*β* = −6.14e-06, p-value = 0.0034). On the other hand, there is a positive association between brain integration, measured through global efficiency, and the home and hobbies score.

Similarly, in Supplementary Table S5 we looked at the monotonic effect of local connectivity metrics on the seven CDR scores, both represented as the subtraction of the follow-up visit from the baseline visit. The increase in betweenness centrality was shown to have different effects on the CDR score over the one-year time period. For example, as the betweenness centrality decreases over time, the judgement and problem solving increases in severity by 1.06e-08 over time (p-value=1.32e-17), in the frontal lobe (Frontal_Inf_Oper_L).

### Multivariate Analysis: Ridge Regression

Additionally, we regressed the difference in CDR between visits (response variable; Y), one score at a time, on both the difference in global brain connectivity (predictor; X1), one connectivity metric at a time, and all gene expression values (predictor; X2), using the ridge regression model. Supplementary Table S6 reports the mean squared error (the score column) and shows the top hits in the multiple ridge regression. It shows that the *α* (alpha column) could not converge, using cross-validation, when the response variable was the judgment or the personal care score. However, the CDR score results show that genes and connectivity metrics have a small effect (*β*) on the response variables (the change in CDR scores over time), and the larger effects were observed when using the total CDR score (CDR_diff) as the response variable. Both the expression of genes and the connectivity metrics have negative and positive effects on CDR change. The expression of APOE, for example, has a negative effect (*β*) of 0.24 on the change in memory score, i.e. the memory rating decreases by 0.24 as the APOE expression increases by one unit, whereas the home and hobbies score increases, over time, by 0.12 for each unit increase in APOE expression.
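To make the ridge model concrete, the sketch below fits it in closed form for a fixed penalty *α*. It is an illustration under assumptions, not the software used in the study: the function names and the toy data are hypothetical, the intercept is left unpenalized, and the cross-validated choice of *α* mentioned above is not implemented.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def ridge_fit(rows, y, alpha):
    """Return (intercept, coefficients) of a ridge fit with penalty `alpha`.

    Predictors and response are centered so the intercept is not penalized;
    the coefficients solve (X'X + alpha * I) b = X'y on the centered data.
    """
    n, p = len(rows), len(rows[0])
    col_means = [sum(r[j] for r in rows) / n for j in range(p)]
    y_mean = sum(y) / n
    xc = [[r[j] - col_means[j] for j in range(p)] for r in rows]
    yc = [v - y_mean for v in y]
    gram = [
        [sum(xc[k][i] * xc[k][j] for k in range(n)) + (alpha if i == j else 0.0)
         for j in range(p)]
        for i in range(p)
    ]
    xty = [sum(xc[k][i] * yc[k] for k in range(n)) for i in range(p)]
    coefs = solve_linear_system(gram, xty)
    intercept = y_mean - sum(c * m for c, m in zip(coefs, col_means))
    return intercept, coefs


# Toy data: y = 1 + 2 * x1 - x2 (columns could stand for a connectivity
# change and one gene expression value; the numbers are made up).
toy_rows = [[1, 0], [0, 1], [2, 1], [1, 2], [3, 3]]
toy_y = [3, 0, 4, 1, 4]
ols_intercept, ols_coefs = ridge_fit(toy_rows, toy_y, alpha=0.0)
ridge_intercept, ridge_coefs = ridge_fit(toy_rows, toy_y, alpha=10.0)
```

With `alpha=0.0` the fit reduces to ordinary least squares and recovers the generating coefficients; increasing `alpha` shrinks the coefficients towards zero, which is why the reported effects (*β*) depend on the selected *α*.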

## Discussion

Our results show that Alzheimer’s risk genes can influence the amount of change observed in the structural connectome, measured as the absolute difference of longitudinal connectivity metrics. Here, we show that longitudinal regional connectivity metrics, global brain segregation and integration have effects on the CDR scores. More specifically, we observe a consistent decrease, over time, in the local efficiency (a connectivity metric that measures the efficient flow of information around a node, i.e. a brain region, in its absence^{18}) in response to the increase in APP expression, at the right middle temporal gyrus (Temporal_Mid_R; see Table 2). The same connectivity metric increases over time as the expression of HFE increases, at the right anterior cingulate and paracingulate gyri (see Table 3). Furthermore, as the disease progresses, we observe a correlation between brain segregation and cognitive decline, the latter measured as CDR memory scores. In contrast, if the brain becomes more integrated, as measured by global efficiency, this is accompanied by an increase in home and hobbies scores (see Supplementary Table S4).

Prescott et al.^{25} have investigated the differences in the structural connectome in three clinical stages of AD, using a cross-sectional study design, and targeted regional brain areas that are known to have increased amyloid plaque. Their work suggested that connectome damage might occur at an earlier preclinical stage towards developing AD. Here, we further adopted a longitudinal study design and incorporated known AD risk genes. We showed how the damage in the connectome is affected by gene expression, and that the change in connectome affects dementia, globally and locally, at distinct brain regions. Aside from our previous work^{21}, which examined the APOE associations with longitudinal global connectivity metrics in AD, this study, to the best of our knowledge, is the first of its type to include gene expression data with global and local brain connectivity. However, similar work has been done in schizophrenia structural brain connectivity, where longitudinal magnetic resonance imaging features, derived from the DTI, were associated with higher genetic risk for schizophrenia^{26}.

The results obtained here align with findings in the literature of genetics and neuroimaging. Specifically, Robson et al.^{27} studied the interaction of the C282Y allele of HFE, the common basis of hemochromatosis, and found that carriers of APOE-4, the C2 variant in TF and C282Y are at higher risk of developing AD. Moreover, the HFE gene is known for regulating iron absorption, and mutations in it result in recessive genetic disorders, such as hereditary haemochromatosis^{28}. According to Pujol et al.^{29}, the association between the harm avoidance trait and right anterior cingulate gyrus volume was statistically significant. In their study, they examined the association between the morphology of the cingulate gyrus and personality in 100 healthy participants. Personality was assessed using the Temperament and Character Inventory questionnaire. Higher levels of harm avoidance were shown to increase the risk of developing AD^{30}. We show here that HFE expression affects the local efficiency at the right anterior cingulate gyrus (see Table 3 and Figure 3). This might indicate a possible effect of HFE expression on the personality of an AD patient or of a person at risk of developing the disease.

Moreover, in this study we found that the Plasminogen activator, urokinase (PLAU) expression affects the betweenness centrality (a measure of the region’s (or node) contribution to the flow of information in a network^{18}) in the left fusiform gyrus, over time (see Table 3 and Figure 3). Although the functionality of this region is not fully understood, its relationship with cognition and semantic memory was previously reported^{31}. PLAU, on the other hand, was shown to be a risk factor in the development of late-onset AD as a result of its involvement in the conversion of plasminogen to plasmin - a contributor to the processing of APP - by the urokinase-type plasminogen activator (uPA)^{32}.

When examining the linear associations between gene expression and local connectivity (see Table 2 and Supplementary Figure S6), we found that the right middle temporal gyrus, known for its involvement in cognitive processes including comprehension of language, negatively associates with APP expression. Additionally, the left Heschl gyrus positively correlates with bleomycin hydrolase (BLMH) expression. In the human brain, the BLMH protein is found in the neocortical neurons and senile plaques^{33}, microscopic decaying nerve terminals around the amyloid occurring in the brain of AD patients. Some studies^{34, 35} have found that a variant in the BLMH gene, which leads to the Ile443→Val in the BLMH protein, increases the risk of AD; this was strongly marked in APOE-4 carriers. The BLMH protein can process the *Aβ* protein precursor and is involved in the production of *Aβ* peptide^{36}.

Even though none of the AD risk genes showed a significant effect on the longitudinal change in global connectivity (see Supplementary Table S3 and Supplementary Table S2), the genes showed significant effects on local connectivity changes at regional brain areas (see Table 3 and Table 2). The global connectivity metrics of the brain, on the other hand, have shown promising results in affecting the change observed in CDR scores, including memory, judgement and problem solving, as well as home and hobbies, as shown in Supplementary Table S4. Previous work studied the association between genome-wide variants and global connectivity of Alzheimer’s brains, represented as brain integration and segregation, and found that some genes affect the amount of change observed in global connectivity^{6}. This suggests that a generalisation of the current study at a gene-wide level might warrant further analysis.

Our work provides new possible insights, though replication on a larger sample size is required. Indeed, one limitation here was the small sample size available. We needed to narrow down our selection of participants to those who attended both baseline and follow-up visits and had CDR scores, genetic and imaging information available. Another limitation is the use of only two time points, the baseline and the first follow-up visit. This does not allow capturing the effects of connectivity changes over a longer term or studying the survival probabilities in AD. Extending to more time points would have been useful, but it would have further reduced the dataset. We foresee future work using a more complex unified multi-scale model, to facilitate studying the multivariate effect of clinical and genetic factors on the connectome, besides considering the complex interplay of genetic factors.

In this work, we conducted an association analysis of targeted gene expression with various longitudinal brain connectivity features in AD. Aiming at estimating the neurodegeneration of the connectome, we obtained local and global connectivity metrics at two visits, baseline and follow-up, after 12 months. We calculated the change between the two visits and carried out an association analysis, using quantile and ridge regression models to study the relationship between gene expression and disease progression globally and regionally at distinct areas of the brain. We tested the effect of the change in connectivity on the longitudinal CDR scores through quantile regression. Furthermore, using a ridge regression model, we controlled for the genetic effects in the previous settings to study the effect of connectivity changes on the CDR change.

The present analysis was conducted in AD using a longitudinal study design and highlighted the role of PLAU, HFE, APP and BLMH in affecting how the pattern information is propagated in particular regions of the brain, which might have a direct effect on a person’s recognition and cognitive abilities. Furthermore, the results illustrated the effect of brain structural connections on memory and cognitive process of reaching a decision or drawing conclusions. The findings presented here might have implications for better understanding and diagnosis of the cognitive deficits in AD and dementia.

## Materials and Methods

### Data Description

We used two sets of data from ADNI, which are available at adni.loni.usc.edu. To fulfil our objectives, we merged the neuroimaging, genetic and CDR datasets, keeping all participants for whom those three types of data were available at two time points. We considered follow-up imaging and CDR acquisition one year after the baseline visit. Given those constraints, we ended up with a total of 47 participants. We adopted a case-control study design; 11 of the participants are AD patients, while 36 are controls. The groups were matched by age: age was 76.5 ± 7.4 years in AD patients and 77.0 ± 5.1 years in controls.

#### Imaging Data

For the imaging, we obtained the DTI volumes at two time points, the baseline and follow-up visits, with one year in between. Along with the DTI, we used the T1-weighted images, which were acquired using a GE Signa 3T scanner (General Electric, Milwaukee, WI, USA). The T1-weighted scans were obtained with voxel size = 1.2 × 1.0 × 1.0 *mm*^{3}, TR = 6.984 ms, TE = 2.848 ms and flip angle = 11°, while the DTI was obtained with voxel size = 1.4 × 1.4 × 2.7 *mm*^{3}, scan time = 9 min, and 46 volumes (5 T2-weighted images with no diffusion sensitization, b0, and 41 diffusion-weighted images, b = 1000 *s/mm*^{2}).

#### Genetic Data Acquisition

We used the Affymetrix Human Genome U219 Array profiled expression dataset from ADNI. The RNA was obtained from blood samples and normalised before hybridization to the array plates. Partek Genomic Suite 6.6 and Affymetrix Expression Console were used to check the quality of expression and hybridization^{37}. The expression values were normalised using the Robust Multi-chip Average^{38}, after which the probe sets were mapped according to the human genome (hg19). Further quality control steps were performed by checking the gender using specific gene expression, and by predicting the Single Nucleotide Polymorphisms from the expression data^{39, 40}.

In this work, we targeted specific genes which have been reported to affect the susceptibility of AD. We used the BioMart software from Ensembl to choose those genes by specifying the phenotype as AD^{41}. We obtained a total of 17 unique gene names and retrieved a total of 56 probe sets from the genetic dataset we are using here.

#### Clinical Dementia Rating

The Clinical Dementia Rating, or CDR score, is an ordinal scale used to rate the severity of dementia symptoms. It ranges from 0 to 3 and takes five values, 0, 0.5, 1, 2 and 3, ordered by severity, which stand for none, very mild, mild, moderate and severe, respectively. The scores evaluate the cognitive state and functionality of participants. Here, we used the six main scores of CDR: memory, orientation, judgement and problem solving, community affairs, home and hobbies, and personal care. Besides these, we used a global score, calculated as the sum of the six scores. We obtained the CDR scores at two time points in accordance with the connectivity metrics time points.

#### Connectome Construction

Each DTI and T1 volume was pre-processed by performing eddy current correction and skull stripping. Given that the DTI and T1 volumes were already co-registered, the AAL atlas^{42} and the T1 reference volume were linearly registered using 12 degrees of freedom. Tractography for all subjects was generated by processing the DTI data with a deterministic Euler approach^{43}, using 2,000,000 seed points and stopping when the fractional anisotropy (FA) was smaller than 0.1 or when a sharp angle (larger than 75°) was encountered. To construct the connectome, we built a binary matrix in which an entry is set, for any pair of AAL regions, whenever more than three connections were present between the two regions. Tracts shorter than 30 mm were discarded. The FA threshold was chosen in such a way that it allows reasonable values of characteristic path length for the given atlas. Though the AAL atlas has been criticized for functional connectivity studies^{44}, it has been useful in providing insights in neuroscience and physiology, and is believed to be sufficient for our case study^{44}.
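The binarization step can be illustrated as follows. This is a minimal sketch under assumptions: `binarize_connectome` and the 4-region count matrix are hypothetical, and the tract counts are assumed to already exclude tracts shorter than 30 mm.

```python
def binarize_connectome(tract_counts, min_tracts=3):
    """Return a binary adjacency matrix from a symmetric tract-count matrix.

    An edge is set between two distinct regions when MORE than `min_tracts`
    tracts connect them; the diagonal is always zero.
    """
    n = len(tract_counts)
    return [
        [1 if i != j and tract_counts[i][j] > min_tracts else 0 for j in range(n)]
        for i in range(n)
    ]


# Made-up tract counts between four regions (the study uses the AAL atlas).
example_counts = [
    [0, 10, 3, 0],
    [10, 0, 4, 1],
    [3, 4, 0, 25],
    [0, 1, 25, 0],
]
example_adjacency = binarize_connectome(example_counts)
```

Note that a pair with exactly three tracts does not produce an edge, because the rule requires more than three connections.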

### Global and Local Connectivity Metrics

To quantify the overall efficiency and integrity of the brain, we extracted global measures of connectivity from the connectome, represented here in four values of network integration and segregation. Specifically, we used two network integration metrics: 1) the global efficiency (*E*; Equation 1), and 2) the weighted characteristic path length (*L*; Equation 2). Both are used to measure the efficiency with which information is circulated in a network. On the other hand, we used 1) Louvain modularity (*Q*; Equation 3), and 2) transitivity (*T*; Equation 4) to measure the segregation of the brain, that is, the capability of the network to form sub-communities which are loosely connected to one another while being densely connected within^{17, 18}.

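Suppose that *n* is the number of nodes in the network, *N* is the set of all nodes, the link (*i, j*) connects node *i* with node *j* and *a*_{ij} defines the connection status between node *i* and *j*, such that *a*_{ij} = 1 if the link (*i, j*) exists, and *a*_{ij} = 0 otherwise. We define the global connectivity metrics as:

$$E = \frac{1}{n} \sum_{i \in N} \frac{\sum_{j \in N, j \neq i} d_{ij}^{-1}}{n - 1} \tag{1}$$

$$L = \frac{1}{n} \sum_{i \in N} \frac{\sum_{j \in N, j \neq i} d_{ij}}{n - 1} \tag{2}$$

where $d_{ij} = \sum_{a_{uv} \in g_{i \leftrightarrow j}} a_{uv}$ is the shortest path length between node *i* and *j*, and *g*_{i↔j} is the geodesic between *i* and *j*.

$$Q = \frac{1}{l} \sum_{i,j \in N} \left[ a_{ij} - \frac{k_i k_j}{l} \right] \delta(m_i, m_j) \tag{3}$$

where *l* = Σ_{i,j∈N} *a*_{ij}, *k*_{i} = Σ_{j∈N} *a*_{ij} is the degree of node *i*, *m*_{i} and *m*_{j} are the modules containing node *i* and *j*, respectively, and *δ*(*m*_{i}, *m*_{j}) = 1 if *m*_{i} = *m*_{j} and 0 otherwise.

$$T = \frac{\sum_{i \in N} 2 t_i}{\sum_{i \in N} k_i (k_i - 1)} \tag{4}$$

where $t_i = \frac{1}{2} \sum_{j,h \in N} a_{ij} a_{ih} a_{jh}$ is the number of triangles around node *i*.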
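As a worked illustration of the global metrics, the sketch below computes them on a small binary graph using breadth-first search. It is a teaching sketch under assumptions rather than the toolbox used in the study: the function names and the 4-node example are hypothetical, and the modularity is evaluated for a given partition only (the Louvain optimization that searches for the partition is not implemented).

```python
from collections import deque


def shortest_path_lengths(adj, source):
    """Breadth-first distances (in edges) from `source` in a binary graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v, connected in enumerate(adj[u]):
            if connected and v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def global_efficiency(adj):
    """Mean inverse shortest path length; unreachable pairs contribute 0."""
    n = len(adj)
    total = 0.0
    for i in range(n):
        dist = shortest_path_lengths(adj, i)
        total += sum(1.0 / d for j, d in dist.items() if j != i)
    return total / (n * (n - 1))


def characteristic_path_length(adj):
    """Mean shortest path length; this sketch requires a connected graph."""
    n = len(adj)
    total = 0
    for i in range(n):
        dist = shortest_path_lengths(adj, i)
        if len(dist) != n:
            raise ValueError("graph is disconnected")
        total += sum(dist.values())
    return total / (n * (n - 1))


def transitivity(adj):
    """Ratio of closed triplets to all connected triplets."""
    n = len(adj)
    degrees = [sum(row) for row in adj]
    triangles = [
        sum(adj[i][j] * adj[i][h] * adj[j][h] for j in range(n) for h in range(n)) / 2
        for i in range(n)
    ]
    denominator = sum(k * (k - 1) for k in degrees)
    return 2 * sum(triangles) / denominator if denominator else 0.0


def modularity(adj, modules):
    """Modularity of a given partition (`modules[i]` is the module of node i)."""
    n = len(adj)
    degrees = [sum(row) for row in adj]
    total_links = sum(degrees)
    q = 0.0
    for i in range(n):
        for j in range(n):
            if modules[i] == modules[j]:
                q += adj[i][j] - degrees[i] * degrees[j] / total_links
    return q / total_links


# Hypothetical example: a triangle (nodes 0, 1, 2) with node 3 attached to node 2.
toy_graph = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
toy_efficiency = global_efficiency(toy_graph)
toy_path_length = characteristic_path_length(toy_graph)
toy_transitivity = transitivity(toy_graph)
toy_modularity = modularity(toy_graph, [0, 0, 0, 1])
```

On this toy graph the integration metrics move in opposite directions by construction: adding edges shortens paths, which lowers the characteristic path length and raises the global efficiency.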

Using the AAL atlas, we constructed the following local brain network metrics at each region or node. We used the local efficiency (*E*_{loc,i}; Equation 5), clustering coefficient (*C*_{i}; Equation 6) and betweenness centrality (*b*_{i}; Equation 7) at each node to quantify the local connectivity. Both local efficiency and clustering coefficient measure the presence of well-connected clusters around the node, and they are highly correlated to each other. The betweenness centrality is the number of shortest paths which pass through the node, and measures the effect of the node on the overall flow of information in the network^{18}. The local connectivity metrics used in this work, for a single node *i*, are defined as follows;
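$$E_{loc,i} = \frac{\sum_{j,h \in N, j \neq i} a_{ij} a_{ih} \left[ d_{jh}(N_i) \right]^{-1}}{k_i (k_i - 1)} \tag{5}$$

where *d*_{jh}(*N*_{i}) is the length of the shortest path between node *j* and *h* (as defined for Equation 1) that contains only neighbours of *i*, and *k*_{i} is the degree of node *i*.

$$C_i = \frac{2 t_i}{k_i (k_i - 1)} \tag{6}$$

where *t*_{i} is the number of triangles around node *i*.

$$b_i = \frac{1}{(n-1)(n-2)} \sum_{h,j \in N, h \neq j, h \neq i, j \neq i} \frac{\rho_{hj}(i)}{\rho_{hj}} \tag{7}$$

where *ρ*_{hj} is the number of shortest paths between *h* and *j*, and *ρ*_{hj}(*i*) is the number of shortest paths between *h* and *j* that pass through *i*.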
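The three local metrics can be sketched for a binary undirected graph as follows. This is an illustration under assumptions: the function names and the toy graph are hypothetical, and the betweenness is returned unnormalized (each unordered pair of nodes is counted once), so any normalization factor would have to be applied afterwards. The betweenness routine follows Brandes' accumulation scheme, which is one standard way to compute it and not necessarily the implementation used in the study.

```python
from collections import deque


def neighbours(adj, i):
    return [j for j, connected in enumerate(adj[i]) if connected]


def clustering_coefficient(adj, i):
    """Fraction of pairs of neighbours of i that are themselves connected."""
    nbrs = neighbours(adj, i)
    k = len(nbrs)
    if k < 2:
        return 0.0
    triangles = sum(adj[j][h] for j in nbrs for h in nbrs) / 2
    return 2 * triangles / (k * (k - 1))


def local_efficiency(adj, i):
    """Mean inverse path length among neighbours of i, using only those nodes."""
    nbrs = neighbours(adj, i)
    k = len(nbrs)
    if k < 2:
        return 0.0
    total = 0.0
    for source in nbrs:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in nbrs:
                if adj[u][v] and v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(1.0 / d for node, d in dist.items() if node != source)
    return total / (k * (k - 1))


def betweenness_centrality(adj):
    """Unnormalized betweenness of every node (Brandes' algorithm)."""
    n = len(adj)
    centrality = [0.0] * n
    for s in range(n):
        stack = []
        predecessors = [[] for _ in range(n)]
        n_paths = [0] * n
        n_paths[s] = 1
        dist = [-1] * n
        dist[s] = 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in neighbours(adj, v):
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    n_paths[w] += n_paths[v]
                    predecessors[w].append(v)
        dependency = [0.0] * n
        while stack:
            w = stack.pop()
            for v in predecessors[w]:
                dependency[v] += n_paths[v] / n_paths[w] * (1 + dependency[w])
            if w != s:
                centrality[w] += dependency[w]
    # Every unordered pair was visited from both end points.
    return [value / 2 for value in centrality]


# Hypothetical example: a triangle (nodes 0, 1, 2) with node 3 attached to node 2.
toy_graph = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
toy_clustering = [clustering_coefficient(toy_graph, i) for i in range(4)]
toy_local_eff = [local_efficiency(toy_graph, i) for i in range(4)]
toy_betweenness = betweenness_centrality(toy_graph)
```

In this toy graph the clustering coefficient and the local efficiency coincide at every node, which mirrors the strong correlation between the two metrics noted in the text, while only node 2 carries any betweenness because it is the sole bridge to node 3.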

### Statistical Analysis

We used different statistical methods as described below; however, for the multiple testing we relied on the Bonferroni correction^{45, 46}. Where applicable, the thresholds were obtained by dividing 0.05 by the number of tests.
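The threshold computation is a single division, sketched below. The number of tests used here (17 genes times 90 AAL regions for one local metric) is an illustrative assumption; the text does not state the exact test counts behind each threshold.

```python
def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold under the Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# Illustrative count only: one test per gene and region for a single metric.
n_genes = 17
n_regions = 90
example_threshold = bonferroni_threshold(n_genes * n_regions)
```

A p-value is then declared significant only if it falls below `example_threshold`, which is far stricter than the uncorrected 0.05 level.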

#### Quantifying the Change in CDR and Connectivity Metrics

To determine the longitudinal change in CDR and in the local and global connectivity metrics, we calculated the absolute difference between the value at the first visit (the baseline visit) and the value at the first visit after 12 months (the follow-up visit). Unless stated otherwise, this absolute difference is how we quantified longitudinal change throughout the analysis.

#### Estimation of Gene Expression from Multiple Probe Sets

Each gene in the data was represented by several probe sets, each with its own expression values. To obtain a single representative expression value per gene, we conducted a non-parametric Mann-Whitney U test on each probe set to evaluate whether its expression in AD differed from that in controls. For each gene, we selected the probe set with the lowest Mann-Whitney U p-value. In this way, we selected the most differentially expressed probe sets in our data and used them in the remaining analysis.
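The selection rule can be sketched as below. This is an illustration rather than the code used in this work: the probe set identifiers and expression values are made up, and the p-value uses the two-sided normal approximation without tie or continuity correction. In practice a library routine such as `scipy.stats.mannwhitneyu` would normally be preferred.

```python
import math

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda idx: values[idx])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        for k in range(pos, end + 1):
            ranks[order[k]] = (pos + end) / 2 + 1
        pos = end + 1
    return ranks

def mann_whitney_p(x, y):
    """Two-sided p-value from the normal approximation to the U statistic."""
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    z = (u1 - n1 * n2 / 2) / math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return math.erfc(abs(z) / math.sqrt(2))

def select_probe_set(probe_sets):
    """Return the probe set whose AD and control expression differ most (lowest p-value)."""
    return min(probe_sets, key=lambda name: mann_whitney_p(*probe_sets[name]))

# Hypothetical gene with two probe sets: (expression in AD, expression in controls).
gene_probe_sets = {
    "probe_a": ([1.0, 2.0, 3.0, 4.0], [1.5, 2.5, 3.5, 4.5]),
    "probe_b": ([10.0, 11.0, 12.0, 13.0], [1.0, 2.0, 3.0, 4.0]),
}
print(select_probe_set(gene_probe_sets))
```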

#### Spearman’s Rank Correlation Coefficient

To test the statistical significance of pair-wise undirected relationships, we used Spearman’s rank correlation coefficient (*ρ*). The Spearman coefficient is a non-parametric measure that ranks each of the two variables and assesses the monotonic relationship between them. We report the coefficient *ρ* along with the corresponding p-value to evaluate the significance of the relationship. A *ρ* of ±1 indicates a perfectly monotonic relationship, while *ρ* = 0 indicates no monotonic relationship.
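To make the rank-based definition concrete, the sketch below computes *ρ* as the Pearson correlation of the ranks. It is illustrative only: the function names and data are hypothetical, and the p-value is omitted because it requires a reference distribution that a statistics library (for example `scipy.stats.spearmanr`) would provide.

```python
import math

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda idx: values[idx])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        for k in range(pos, end + 1):
            ranks[order[k]] = (pos + end) / 2 + 1
        pos = end + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's coefficient: Pearson correlation between the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)

# A non-linear but strictly increasing relationship is still perfectly monotonic.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
print(spearman_rho(x, y))
```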

#### Quantile Regression

To model the directed relationship between two variables, we used the quantile regression model^{47}. This model is an alternative to linear regression when the assumptions of linear regression are not met, and it allows the response and predictor variables to have non-symmetric distributions. Whereas linear regression estimates the conditional mean of the dependent variable given the independent variables, quantile regression can estimate any conditional quantile; when it estimates the conditional median, it is robust to outliers. In this work, we used the 0.5 quantile (the second quartile, that is, the median) to model the directed relationship between two variables.
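The robustness of median regression can be illustrated with a small sketch for one predictor. It is not the fitting routine used in this work: it relies on the property that an optimal quantile-regression line can always be found among the lines passing through two data points, and simply tries every such line, which is only practical for small samples. The function names and data are hypothetical.

```python
from itertools import combinations

def pinball_loss(residuals, q=0.5):
    """Quantile (pinball) loss; for q = 0.5 this is half the sum of absolute residuals."""
    return sum(q * r if r >= 0 else (q - 1) * r for r in residuals)

def fit_quantile_line(x, y, q=0.5):
    """Exhaustive search over lines through pairs of points for the lowest pinball loss."""
    best = None
    for i, j in combinations(range(len(x)), 2):
        if x[i] == x[j]:
            continue  # a vertical line cannot be written as intercept + slope * x
        slope = (y[j] - y[i]) / (x[j] - x[i])
        intercept = y[i] - slope * x[i]
        residuals = [yk - (intercept + slope * xk) for xk, yk in zip(x, y)]
        loss = pinball_loss(residuals, q)
        if best is None or loss < best[0]:
            best = (loss, intercept, slope)
    return best[1], best[2]

# Four points lie on y = x; the fifth is an extreme outlier.
x = [0, 1, 2, 3, 4]
y = [0, 1, 2, 3, 100]
intercept, slope = fit_quantile_line(x, y)
print(intercept, slope)
```

The fitted median line follows the four well-behaved points and ignores the outlier, whereas a least-squares fit on the same data would be pulled strongly towards it.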

#### Ridge Regression

For estimating the relationship between more than two variables, we used ridge regression^{48}. The basic idea behind this model is that it solves the least-squares problem while penalizing the coefficients with *l*_{2}-norm regularization. More specifically, ridge regression minimizes the following objective function:

$$\beta^{Ridge} = \underset{\beta}{\operatorname{argmin}} \; \lVert y - X\beta \rVert_2^2 + \alpha \lVert \beta \rVert_2^2 \tag{8}$$

where *y* is the dependent (or response) variable, *X* is the independent variable (feature, or predictor), *β* is the ordinary least squares coefficient (or, the slope), *α* is the regularization parameter, *β*^{Ridge} is the ridge regression coefficient, *argmin* denotes the argument at which the function attains its minimum, and *L*_{2}(*v*) = ||*v*||_{2} is the L2 norm function^{49}. Moreover, we normalized the predictors to get a more robust estimation of our parameters.
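For a fixed regularization parameter, the ridge objective has a closed-form minimizer, which the sketch below evaluates with NumPy. It is an illustration under stated assumptions, not the code used in this work: "normalizing" is taken to mean z-scoring each predictor, the response is centred so that no intercept is needed, and the function name and data are hypothetical.

```python
import numpy as np

def ridge_fit(X, y, alpha=1.0):
    """Closed-form ridge coefficients on z-scored predictors and a centred response."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    X_std = (X - X.mean(axis=0)) / X.std(axis=0)   # assumed normalization: z-scores
    y_centred = y - y.mean()
    n_features = X_std.shape[1]
    # Solve (X^T X + alpha * I) beta = X^T y, the stationarity condition of the objective.
    gram = X_std.T @ X_std + alpha * np.eye(n_features)
    return np.linalg.solve(gram, X_std.T @ y_centred)

# Hypothetical data with two predictors and an exactly linear response.
X = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 6]]
y = [2 * a - b for a, b in X]
print(ridge_fit(X, y, alpha=0.0))    # no penalty: ordinary least squares
print(ridge_fit(X, y, alpha=10.0))   # penalty shrinks the coefficients towards zero
```

Setting `alpha` to zero recovers ordinary least squares, and increasing it shrinks the coefficient vector, which is the trade-off that the regularization parameter controls.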

#### Software

We used Python 3.7.1 for this work. Our code is available under the MIT License (https://choosealicense.com/licenses/mit/) at https://github.com/elssam/RGLCG.

## Author Contributions

S.S.M.E. conducted the integrated imaging-genetics analysis and wrote the paper. A.C. preprocessed all neuroimaging data and set the connectome pipeline. N.J.M. and A.C. gave constructive feedback on this work continuously and followed up the analysis and writing progress closely. E.R.C. gave feedback on the overall paper. The final version of the paper was proofread by all authors.

## Additional Information

### Supplementary information

Supplementary materials are available for this paper.

### Competing interests

The authors declare no competing interests.

### Data Availability

The data used in this work are available at the ADNI repository (http://adni.loni.usc.edu/).

## Acknowledgements

We would like to acknowledge our funders, the Organisation for Women in Science for the Developing World (OWSD), the Swedish International Development Cooperation Agency (SIDA) and the University of Cape Town for their continuous support. Data collection and sharing for this project was funded by the Alzheimer’s Disease Neuroimaging Initiative (ADNI) (National Institutes of Health Grant U01 AG024904) and DOD ADNI (Department of Defense award number W81XWH-12-2-0012). ADNI is funded by the National Institute on Aging, the National Institute of Biomedical Imaging and Bioengineering, and through generous contributions from the following: AbbVie, Alzheimer’s Association; Alzheimer’s Drug Discovery Foundation; Araclon Biotech; BioClinica, Inc.; Biogen; Bristol-Myers Squibb Company; CereSpir, Inc.; Cogstate; Eisai Inc.; Elan Pharmaceuticals, Inc.; Eli Lilly and Company; EuroImmun; F. Hoffmann-La Roche Ltd and its affiliated company Genentech, Inc.; Fujirebio; GE Healthcare; IXICO Ltd.; Janssen Alzheimer Immunotherapy Research & Development, LLC.; Johnson & Johnson Pharmaceutical Research & Development LLC.; Lumosity; Lundbeck; Merck & Co., Inc.; Meso Scale Diagnostics, LLC.; NeuroRx Research; Neurotrack Technologies; Novartis Pharmaceuticals Corporation; Pfizer Inc.; Piramal Imaging; Servier; Takeda Pharmaceutical Company; and Transition Therapeutics. The Canadian Institutes of Health Research is providing funds to support ADNI clinical sites in Canada. Private sector contributions are facilitated by the Foundation for the National Institutes of Health (www.fnih.org). The grantee organization is the Northern California Institute for Research and Education, and the study is coordinated by the Alzheimer’s Therapeutic Research Institute at the University of Southern California. ADNI data are disseminated by the Laboratory for Neuro Imaging at the University of Southern California.