Multiscale communication in cortico-cortical networks

Signaling events in brain networks unfold over multiple topological scales. Areas may exchange information over local circuits, primarily encompassing direct neighbours and areas with similar functions. Alternatively, areas may exchange information over global circuits, encompassing more distant neighbours with increasingly dissimilar functions. In the present report, we study communication in cortico-cortical networks by characterizing a region’s structural embedding over a continuous range of topological scales. We find that the centrality of a brain region varies across scales and that connection diversity determines scale preference, with less diverse unimodal regions showing preference for local communication and more diverse multimodal regions showing preferences for global communication. These preferences manifest as scale-specific structure-function relationships, with unimodal areas showing optimal coupling at local scales and multimodal regions showing optimal coupling at global scales. Altogether, the present findings reveal how functional hierarchies emerge from hidden but highly structured multiscale connection patterns.


INTRODUCTION
The brain is a network of anatomically connected neuronal populations [16]. This complex web of connections functions as a communication network, promoting signaling between brain regions [2,27]. A tendency for neuronal populations with similar functions to connect with each other gives rise to a nested hierarchy of increasingly polyfunctional neural circuits, spanning multiple topological scales [34,39,79].
Studies of network communication typically conceptualize signalling events as a global process, eschewing the possibility that communication takes places over multiple topological scales. Namely, areas may preferentially exchange information over small compact circuits encompassing direct neighbours and areas with similar functions, or over more extensive circuits encompassing more distant neighbours with dissimilar functions. An intuitive example is the worldwide air transportation network. The purpose of regional or domestic flights permitting transit between a country's regions is different from the purpose of international flights permitting transit between international hubs. The importance of an airport in this network will correspondingly depend on the type of flight we are interested in [30]. For example, Denver's, Philadelphia's and Detroit's airports are important for domestic flights within the United States, while airports in New York, Los Angeles or Chicago are important for international flights. In other words, the topological role of a node in the network depends on the scale at which it is evaluated.
By the same token, individual brain areas may exhibit characteristic interactions and communication patterns at multiple topological scales. The modular structure of the brain [33], in concert with a prominent connective core of high degree areas [65,66], creates conditions * bratislav.misic@mcgill.ca in which information can be either segregated into local clusters of highly interconnected brain regions, or globally integrated [77,78]. For instance, an area may facilitate the integration of information among its local neighbours, but lack the capacity to globally broadcast signals across the whole brain. In other words, the functional diversity of a region (i.e. who it can communicate or interact with) depends on scale. How inter-regional communication depends on topological scale is a key question in systems neuroscience [9].
Here we study how communication between brain regions unfolds over multiple scales. For a given region, we systematically define local neighborhoods of increasing size. We then track how the centrality of individual brain regions varies as the size of the probed neighborhoods increase. We find that sensory regions are preferentially central locally while association areas are preferentially central globally. We further show that variations in centrality are shaped by functional diversity. Finally, we demonstrate that structure-function coupling is scalespecific, such that functional interactions in unimodal regions are better approximated by comparing the pairwise similarity of small-scale structural neighbourhoods, while functional interactions in transmodal regions are better captured by the pairwise similarity of large-scale structural neighbourhoods.

RESULTS
The results are organized as follows. We delineate multiscale neighborhoods by parametrically tuning the range at which signals are transmitted on the white matter connectome. We subsequently investigate the propensity for brain areas to communicate with their neighbours across multiple scales using a weighted measure of regional closeness centrality. Finally, we consider how the similarity in two areas' embedding predicts their functional interactions. Data sources include (see Materials and Methods for detailed procedures): • Structural connectivity. Structural connectivity was derived from N = 67 healthy participants (source: Lausanne University Hospital). Individual connectomes were reconstructed using diffusion spectrum imaging and deterministic streamline tractography. A consistency-and length-based procedure was then used to assemble a group-representative weighted structural connectivity matrix [11,46,47].
• Functional connectivity. Functional connectivity was estimated in the same individuals using resting-state functional MRI (rs-fMRI). A groupaverage functional connectivity matrix was constructed by concatenating the regional fMRI BOLD time series from all participants and computing the zero-lag Pearson correlation coefficient between each pairs of brain region.
Analyses were performed using a network parcellation of 1000 cortical nodes [17]. They were subsequently repeated using a coarser resolution (219 nodes) and an independently collected dataset (HCP) (see Materials and Methods for more information on the Validation dataset).

Multiscale regional centrality
We first characterize local neighborhoods using unbiased random walks. Specifically, we use the transition probabilities of a random walker seeded in an individual brain region to delineate its local neighborhood. Transition probabilities were measured for 100 time scales t, logarithmically spaced between 10 −0.5 and 10 1.5 (see Methods for more details). Fig. 1a shows the effect of varying the scale of a random walk initiated at nodes located in the posterior cingulate (red), superior parietal (blue), transverse temporal (green) and insular cortex (purple), with t = 2, t = 5 and t = 10. As t is increased, the random walks are longer and the size of the probed neighborhood becomes larger, allowing us to consider communication over more expansive portions of the network.
To investigate how the role of a brain region varies across scales, we measured its weighted closeness -the inverse of its average path lengths -to other nodes in its local neighborhoods (see Methods for more details). Fig. 1b highlights the weighted closeness (C multi ) as t increases for the four brain regions depicted in panel a. Centrality trajectories for other brain regions are shown as grey lines. To allow for comparisons between scales, C multi scores are standardized relative to the distribution of scores obtained at individual scales. The relative centrality of individual brain regions varies considerably with increasing scale, such that some areas are more central, and some are less central, depending on the size of their neighbourhood. Fig. 1c illustrates the centrality of every brain region across four different topological scales. Locally, we observe clusters of highly central brain regions distributed across the whole brain; the clusters gradually evolve into larger-scale systems at more global scales.
What is the optimal scale at which individual regions communicate? Fig. 2a shows the regional closeness trajectories colored by the scale (t) at which their centrality peaks. Warmer colors indicate regions that preferentially communicate locally while cooler colors indicate regions that preferentially communicate globally. In general, we observe preference for local communication in primary sensory regions (pericalcarine cortex, transverse temporal cortex, post-central gyrus) and in limbic cortex; conversely, we observe preference for global communication in association cortex, including dorsolateral prefrontal cortex and superior parietal cortex. Fig. 2b shows the distribution of optimal values of t (top), as well as the mean multiscale closeness centrality (bottom) for seven intrinsic functional networks [76], highlighting a differentiation between limbic and unimodal (somatomotor and visual) networks versus multimodal networks (default-mode, dorsal-attention and fronto-parietal networks). The mean centralities of the seven intrinsic functional networks were also compared to their mean centrality in surrogate networks with preserved degree sequences (Fig. S1a). Limbic, somatomotor and ventral-attention networks have larger centrality than expected locally, while default-mode, dorsalattention and fronto-parietal networks have larger centrality than expected globally. Fig. 2c shows the distribution of optimal values of t (top), as well as the mean multiscale closeness centrality (bottom) for seven cytoarchitectonic classes defined by the von Economo atlas [58,69,[71][72][73], again highlighting a differentiation between sensory and association areas. The mean centrality of the seven cytoarchitectonic classes were also compared to their mean centrality measured in degree-preserving surrogate networks (Fig. S1b). Primary sensory cortex displays greater than expected local centrality scores but smaller than expected global centrality, while association cortex 2 displays smaller than expected local centrality but greater than expected global centrality. Altogether, we find that regional specialization appears to be anatomically mediated, with predisposition for local communication in unimodal regions and global communication in multimodal regions.

Multiscale connection diversity
In the previous section, we showed that regional centrality varies across topological scales. We next ask whether these variations are mediated by connection diversity. In homogeneous networks where nodes have similar topological characteristics, the local centrality of a node is expected to be similar to its global centrality. However, in heterogeneous networks, local attributes do not necessarily mirror global attributes [22]. For instance, one node may have strong connectivity with a small number of nodes, while another node may have Figure 1. Multiscale closeness centrality | (a) An unbiased random-walk process can be used to delineate local neighborhoods around single brain regions. As t, the number of iterations, is increased, the topological size of the characterized neighborhoods gets larger. Random walk process can be initiated, for example, in the posterior cingulate gyrus, (red), in the superior parietal gyrus (blue), in the transverse temporal gyrus (green) or in the insula (purple), and can delineate neighborhoods of different sizes (t = 2; t = 5, t = 10). (b) The weights of a node's transition probability vector are used to compute a multiscale measure of closeness centrality (C multi ; see Methods for more details). Grey lines represent the centrality of individual brain regions as t increases, standardized with respect to the distribution of centrality scores for individual values of t. Highlighted in red, blue, green and purple are the centrality trajectories of the four individual brain regions discussed in (a). A node's relative centrality varies largely depending on the scale of the neighborhoods. (c) The relative importance of a brain region in local communication processes can be evaluated by ranking C multi scores within small neighborhoods (t = 1; left-most). Its relative importance in global processes (global centrality) can be evaluated by ranking C multi scores within large neighborhoods (t = 10, right-most). The importance of a brain region in communication processes unfolding at intermediate scales can be similarly evaluated (e.g. t = 2 or t = 5). Darker colors indicate brain regions with large C multi ranks while lighter colors indicate brain regions with low C multi ranks. moderate but diverse connectivity with a larger number of nodes. The former is more central in a local sense, while the latter is more central in a global sense. The differential contribution of the two nodes to global communication -arising from their respective connection diversity -is reflected by their different closeness trajectories.
Before analyzing empirical brain networks, we first il-lustrate this concept using artificial networks with a predefined modular architecture. Using stochastic block models (SBM), we generated a network of n=2000 nodes with two equally-sized communities, which were further divided into two sub-communities. Ultimately, the network has 4 communities of 500 nodes (see Methods for the exact parameters used to construct the network). A two-dimensional embedding of the network, Figure 2. Optimal closeness scales | (a) We identified the optimal scale at which the closeness of a brain region peaks (topti). The closeness trajectories (left), as well as the individual brain regions (right) are colored according to the value of their topti. Sensory region are optimally central for low values of t and multimodal regions are optimally central for large values of t. (b) Violin plots representing the distributions of topti (top) as well as the mean C multi score (bottom) for seven intrisic functional networks [76]. Limbic and somatomotor networks are optimally central locally while dorsal attention and frontoparietal networks are optimally central globally. (c) Violin plots representing the distributions of topti (top) as well as the mean C multi score (bottom) for seven cytoarchitectonic classes defined by the von Economo atlas [58,69,[71][72][73]. Sensory and insular cortices are optimally central locally while association cortices are primarily central globally. Abbreviations: limbic = limbic network, sm = somatomotor network, va = ventral attention network, vis = visual network, dmn = default mode network, da = dorsal attention network, fpn = frontoparietal network, ps = primary sensory cortex, insular = insular cortex, pss = primary/secondary sensory cortex, limbic = limbic cortex, pm = primary motor cortex, ass1 = association cortex 1, ass2 = association cortex 2. generated using the ForceAtlas2 algorithm [38], is represented in Fig. 3a. To understand the position of a node in this embedding, it is important to not only analyze the strength of the interactions with its direct neighbors (i.e. degree), but to also analyze its connection diversity and the position of these neighbors in the network. The connection diversity of individual nodes can be characterized by comparing the number of within-and between-community connections, using the participation coefficient [29]. By measuring the participation coefficient at two different scales, we identify regions of the network important for communication between the four sub-communities and for communication between the two larger communities (Fig. 3b). Two nodes can have similar global centrality, as measured by their global closeness centrality, yet different local architectures. For Figure 3. Closeness trajectories in stochastic block models | (a) Two-dimensional representation of a hierarchical modular network (n=2000 nodes) generated using a stochastic-block model. Its community structure is composed of two large communities (labelled B) as well as four smaller communities (labelled A). (b) Participation coefficients of individual nodes for a 4-communities partition (left) and for a 2-communities partition (right). Nodes with large participation coefficients are located at the borders between the communities. (c) Closeness (C multi ) trajectories of the network's nodes, with the trajectories of the four numbered nodes in panel (a) highlighted in red, blue, green and purple. Pairs of nodes with the same global closeness centrality (red-purple; green-blue), have different trajectories (top), as anticipated by their respective locations in the low-dimensional embedding. Variations in a node's closeness can be quantified by measuring the local variations (slope) in their centrality (bottom). (d) Closeness trajectories of the network's nodes, colored according to their participation coefficients given a 4-communities parcellation (left) and a 2-communities parcellation (right). Nodes with a large participation coefficient show a positive slope and nodes with a small participation coefficient show a negative slope. (e) Spearman correlation between topological measures of diversity and the closeness slopes over a range of time scales. Clustering coefficient is negatively correlated with the slopes at small time scales while the two participation coefficients are strongly correlated with the slopes at larger time scales.
instance node (i) in the red community and node (iv) in the purple community have the same global closeness centrality, yet their location in the embedding suggests that they have very different connection profiles, with node iv) having more diverse connections. Their different connection diversity is captured by their different closeness centrality trajectory ( Fig. 3c; top).
To quantify variations in a node's trajectory across scales, we compute the local slopes in the closeness curves of individual nodes as t increases. (Fig. 3c; bottom). These slopes, capturing variations in a node's centrality, highlight how diverse the connectivity profile of a node is, similarly to how diversity can be quantified using the participation coefficient in modular networks with a pre-specified partition (Fig. 3d). Interestingly, slopes can be measured for any value of t, highlighting the diversity Figure 4. Multiscale functional diversity | (a) The functional diversity of a brain region can be quantified by measuring the amplitude of the local variations in a region's closeness centrality (slope) as t varies. (b) Slopes can show different topographic distribution, for select values of t. Highly diverse brain regions have a positive slope (red) while less diverse brain regions have a negative slope (blue). (c) Slope scores are correlated, as t increases, with other measures of connection diversity (node degree, clustering coefficient, participation coefficients for partitions of 16, 12, 9, 6 and 4 communities). Local measures of diversity such as degree and clustering coefficient (negative) are correlated with the centrality slopes measured at intermediate scales, with peaks at t = 1.40 (r=0.74) for node degree and t = 3.08 (r=0.66) for clustering coefficient (negative). Participation coefficients, viewed as meso-scale measures of functional diversity, are also correlated with a node's local variation in centrality. The scale at which the correlation peaks highlights the size of the communities used to compute the participation coefficient, with larger values of t measuring the role of brain regions inside larger communities. of a node's connections at the chosen scale. Slopes are correlated with local measures of diversity, such as clustering coefficient (negative correlation), when computed for small values of t, and are correlated with increasingly more global measures of diversity as t increases (Fig. 3e). In other words, by considering the centrality of a node across scales, one can quantify the multiscale diversity of a node's connections without assuming a particular partition.
The centrality trajectories of individual brain regions can be similarly measured. We computed the local slopes in the closeness centrality of individual nodes as t increases. Fig. 4a shows four different trajectories, while Fig. 4b shows how the topographic distribution of these slopes on the brain varies across scales. By considering the slope of a brain region at a single scale, we can measure its functional diversity, with highly diverse brain regions having a positive slope (red) and less diverse brain regions having a negative slope (blue). We term this property "functional diversity", rather than "connection diversity", because it encompasses the diversity of a neighbourhood of arbitrary size, as opposed to the diversity of directly connected nodes only.
To demonstrate how transitions in local closeness can highlight the functional diversity of a brain region at multiple scales, we correlated these slopes with multiple measures of connection diversity (Fig. 4c). Measures are ordered according to the scale at which they are maximally correlated with the closeness slope. Local measures, such as strength and clustering coefficient, tend to be maximally correlated at lower scales (t = 1.40 and t = 3.08, respectively). Participation coefficients -indexing the diversity of inter-modular links for a given partition -tend to be correlated with local variations in closeness at greater values of t. In particular, we compute node participation with respect to multiple community partitions, ranging from 4 to 16 communities. As the partition resolution is gradually decreased, from 16 to 4 communities, we find optimal correlations with increasing values of t. Altogether, our results demonstrate that variations in a node's centrality are mediated by its functional diversity. Interestingly, the present method also serves to highlight functional diversity without a predefined partition, making it a potentially complementary measure to more traditional diversity statistics (such as participation coefficients), without the need to explicitly define or assume a hard partition.

Multiscale structure-function coupling
We next investigate how multiscale connection patterns influence structure-function coupling. Functional connectivity between pairs of brain regions is typically computed as a correlation between the time series of their respective fMRI BOLD signals. Coherent fluctuations in neural activity are thought to arise from interactions on the underlying structural connectome [8,26,62]. A variety of pair-wise measures have been proposed to predict functional connectivity from the structural connectivity between brain regions, including structural connectivity strength, path length, search information, path transitivity [26] and communicability [8].
However, most of the proposed measures of structurefunction coupling assume a single, global relationship between the two (but see [1,8] for their use of multiscale measures). By comparing the similarity between the neighborhoods of two brain regions, we can assess how similar the dynamical processes unfolding around these two brain regions are across a variety of topological scales [57]. Nodes with overlapping neighborhoods (i.e. large neighborhood similarity) may therefore display greater functional coupling than nodes that are part of different neighborhoods [47]. We hence computed the pairwise cosine similarity between the transition probability vectors of pairs of brain regions for multiple values of t.
We first measured the Pearson's correlation between the functional connectivity strength and the neighborhood similarity of every pair of brain regions, globally. We iterated across a logarithmic range of time scale between 10 −0.5 and 10 1.5 to identify the scale for which this correlation was optimal (Fig. 5a). The correlation was optimal at t = 2.33 (r = 0.41). This correlation was greater than correlations obtained with the weighted ad-jacency matrix, (r = 0.29), the shortest paths (r = 0.31) or the communicability (r = 0.33) between pairs of nodes ( Fig. 5b). At the same time, all four measures were generally correlated with each other (Fig. 5c), with correlations ranging from r = 0.47 to r = 0.91.
Importantly, the present framework does not assume that structure-function relationships are uniform across the brain, and instead opens the possibility that functional interactions occur at different scales for different brain regions. To investigate this possibility, we estimated the optimal scale maximizing structure-function coupling for individual regions. Specifically, we computed the Spearman's correlations between the neighborhood similarity profiles of individual brain regions and the functional connectivity profiles of the same brain regions. We find that the optimal t varies considerably across brain regions (Fig. 5d). Interestingly, the pattern outlines the putative unimodal-transmodal hierarchy [43], such that functional connectivity profiles of unimodal regions are better captured by smaller neighbourhoods (small t), while profiles of transmodal regions are better captured inside more extensive neighbourhoods (Figs. 5e, f). Furthermore, the optimal scale of a brain region was not correlated with the strength of the correlation between neighborhood similarity and functional connectivity (r = −0.02, p spin = 0.78), suggesting that the larger optimal scale observed in multimodal regions is not a trivial result of their weaker structurefunction coupling [70].

Sensitivity and replication
In the present report, we delineated local topological neighborhoods using unbiased random walks. To ensure that the observed results are not dependent on our choice of this particular dynamical process, we repeated the analyses using the Personalized PageRank vector (i.e. random-walks with restarts) (Fig. S2a) and the normalized Laplacian matrix (i.e. diffusion process) (Fig. S2b). These alternative dynamical processes also permit multiscale investigations with a parameter that can be tuned to constrain the diffusion process (t for the Laplacian and α for the personalized PageRank; see Methods for more details). We found similar results using these two methods. Specifically, we once again found that (i) sensory regions are more central locally, while multimodal, especially frontal regions, are more central globally; that (ii) scale-specific variations in centrality are strongly correlated with scale-specific measures of connection diversity and that (iii) functional connectivity in multimodal brain regions is more strongly correlated with the pairwise similarity of larger-scale structural neighborhoods.
We also replicated our results in a Validation dataset (HCP), which was parcellated according to a functional parcellation of 800 nodes [56] (Fig. S3a). Finally, to assess whether the results depend on parcellation resolution, we replicated our results using the Lausanne dataset, but this time parcellated in 219 cortical brain Figure 5. Multiscale structure-function coupling | (a) Pearson correlations between edge-wise measures of communication (neighborhood similarity with respect to t, blue/dashed; communicability, green; structural weights, red; negative shortest-path, orange) computed on the structural connectome and edge-wise functional connectivity. (b) Scatter-plots of the relations between functional connectivity and (blue) neighborhood similarity at its peak (t=2.33), (green) communicability, (red) structural connectivity weights and (orange) negative shortest path. (c) Pair-wise Pearson correlations between neighborhood similarity (S), communicability (C), structural connectivity weights (A) and negative shortest path (Φ). (d) Row-standardized heatmap of the Spearman correlations between the functional connectivity of individual brain regions with the rest of the network and the neighborhood similarity of the same brain regions for increasing values of t [x-axis]. (e) Topographic distributions of the local t values at which the Spearman correlation between neighborhood similarity and functional connectivity is maximal. (f) Topographic distributions of the first (principal) gradient of functional connectivity estimated using diffusion map embedding. There is a significant relationship between the optimal value of t and this gradient (r = 0.23; p spin = 0.0002). regions (Fig. S3b). For this network, multiscale investigations were focused on smaller scales (t values logarithmically increasing from 10 −1 to 10 1 ). In both cases, we obtained similar results.

DISCUSSION
In the present report, we study how inter-regional communication occurs over multiple topological scales. By tracing the trajectory of a region's closeness in expanding neighborhoods, we identify topological attributes that mediate transitions from more localized communication to more global communication. We find that connection diversity determines scale preference, with less diverse unimodal regions showing preference for local communication and more diverse multimodal regions showing preferences for global communication. These preferences manifest as scale-specific structurefunction relationships with unimodal areas showing optimal coupling at local scales and multimodal regions showing optimal coupling at global scales.
Numerous reports have found evidence of regional differences in multiple types of centrality measures [15,31,61,67,78,80]. These studies were performed at a single scale, eschewing the possibility that communication occurs across a spectrum of local, intermediate and global scales [3]. In other words, traditional methods overlook the possibility that proximal populations with similar functions engage in a different mode of communication from more distant populations with dissimilar functions. By tracing the trajectory of a region's closeness in expanding neighborhoods, we find that regions participating in fewer integrative functions, such as primary visual, auditory and somatosensory cortices, optimally communicate at local scales. Conversely, polysensory regions in association or transmodal cortex optimally communicate at global scales. In other words, regional specialization naturally emerges from its anatomical connectivity [44,51].
By considering the centrality of a node over a spectrum of expanding local neighbourhoods, we naturally reveal the functional diversity of that node. The diversity of a brain region is typically estimated from the number of direct connections within-vs. between-modules [29,53]. Regions with diverse connection profiles, with links to many specialized communities, are theoretically well-placed to integrate information from multiple domains [5-7, 54, 78]. By considering diversity over multiple hops, we can characterize not only the diversity of a node's direct connections, but the diversity of its higher-order relationships with other regions. Moreover, our method allows for the characterization of functional diversity across a continuous range of scales, eliminating the need to partition the network into communities. This property may prove to be methodologically convenient and theoretically desirable for two reasons. First, brain networks possess prominent community structure at multiple scales, and there may not exist a single "characteristic" scale [9,10]. Second, the community structure of the brain may not be exclusively assortative [12,23,52], yet most community detection algorithms assume the presence of assortative communities [24]. By tracking functional diversity over a range of expanding neighbourhoods, we avoid having to make assumptions about the presence, nature or scale of communities.
Functional diversity is naturally intertwined with the concept of segregation and integration in the brain [60]: local connectivity among regions with similar functions promotes specialized information processing whereas global connection patterns among regions with dissimilar functions promote integrative information processing. Thus, the connection profile of a region over a spectrum of scales determines how it balances specialized and integrative functions. Our results highlight a continuous gradient of localized versus distributed processing [37], and are therefore in line with an emerging literature emphasizing large-scale gradients of cortical organization [25,32,36,43]. Our findings offer a possible explanation for how these large-scale gradients emerge from the brain's structural embedding. Namely, at individual topological scales, areas may appear to preferentially form connections with a subset of others areas, manifesting as specific communities. As we zoom across multiple scales, however, we reveal a layered organization of interdigitated connections among areas, yielding an organizational axis of scale-specific organizational characteristics, including centrality and connection diversity. It is noteworthy that, from a geometric perspective, unimodal brain regions tend to have preferentially short-distance connections and multimodal regions have preferentially long-distance connections [48,49,59]. Our results are therefore consistent with recent theories suggesting that the main role of long-distance connections is to enhance the functional diversity of a brain region [11].
This localized-distributed gradient is further highlighted by local variations in structure-function coupling. The similarity of local structural neighborhoods best predicts functional connectivity in sensory areas, suggesting that interactions among these regions unfold mainly over local, small-scale neighbourhoods. Conversely, functional connectivity in multimodal brain regions is best predicted by the topological similarity of large-scale neighborhoods, suggesting that interactions among these regions unfold mainly over more extensive, large-scale neighbourhoods. These results build upon recent reports showing that structure-function coupling is region-specific [4,20,55,62,70,74]. Our findings suggest that this phenomenon is explained by the increasing scales at which communication processes unfold. Functional connectivity in sensory regions is easier to predict from structural connectivity because it is mediated by direct communication on the structural connectome, while functional connectivity in multimodal region is harder to predict from structural connectivity because it is mediated via more indirect, polysynaptic communication pathways [8].
The present findings should be interpreted with respect to several important methodological considerations. We focused on the topological organization of networks reconstructed from diffusion-weighted imaging data using computational tractometry. This approach is prone to systematic false positives and false negatives [42,63]. In addition, the connectomes generated are undirected, naturally limiting inferences about causal influence. We did, however, ensure that our results did not trivially depend on confounding factors by replicating our results using (1) different dynamical processes to generate neighborhoods, (2) an independent validation dataset and (3) a different parcellation resolution.
Altogether, the present findings demonstrate that exclusively considering communication at the global level might obscure functionally relevant features of brain networks. By studying regional embedding across multiple topological scales, we reveal a continuous range of communication preferences. In doing so, we take a step towards conceptually linking long-standing ideas in neuroscience such as integration and segregation, functional hierarchies and connection diversity.

Network reconstruction
All analyses were performed in two independently collected and preprocessed datasets, one collected at the University Hospital Center of Lausanne University (N = 67; Discovery) [28] and one as part of the Human Connectome Project S900 release (N = 201; Validation) [68].
Briefly, the Discovery dataset protocol comprised a diffusion spectrum imaging (DSI) sequence and a resting state functional magnetic resonance imaging (fMRI) sequence. Grey matter was parcellated into 68 cortical regions according to the Desikan-Killiany atlas [21] and was then further parcellated into either 219 or 1000 equally sized parcels [17]. Structural connectivity matrices were reconstructed for individuals patients using deterministic streamline tractography and a group-consensus structural network was built such that the mean density and edge length distribution observed across individual participants was preserved [11,46,47]. The weights of the edges in the consensus networks correspond to the log-transform of the average streamline density of the parcels across participants for whom these edges existed. A group-average functional connectivity matrix was constructed by concatenating the regional fMRI BOLD time series from all participants and computing the zero-lag Pearson correlation coefficient between each pairs of brain region. To threshold this matrix, we generated 1000 bootstrapped connectivity matrices by randomly sampling 276 points from the concatenated time series and by computing the correlation between these pairs of bootstrapped time series. Using these bootstrapped samples, we then estimated confidence intervals and retained the correlations between pairs of regions that were consistently positive or negative. Inconsistent correlations were set to 0. Further details regarding the pre-processing are available in [28].
The Validation dataset protocol comprised a high angular resolution diffusion imaging (HARDI) sequence and four resting state fMRI sessions. All analyses were performed in a subset of N = 201 unrelated participants. To ensure that results were not confounded by the parcellation scheme or resolution, grey matter was parcellated into 800 cortical regions according to the Schaefer functional atlas [56]. White matter tracts were reconstructed using probabilistic streamline tractography and a consensus network preserving the mean density and edge length distribution was constructed from the structural networks of individual participants, as before. Functional networks were also reconstructed using the same procedure as described above, using all four sessions for every individuals.

Multiscale topological neighborhoods
Local neighborhoods were characterized by modelling dynamical process initiated from individual nodes in the network. By controlling the length of these processes, we controlled the topological size of the delineated neighborhoods. Our main results relied on unbiased randomwalks. We further replicated our results using random walks with restarts and a heat-diffusion process.

Unbiased random-walks
Given an adjacency matrix A where A ij corresponds to the weight of the edge connecting nodes i and j, the probability that a walker at node i transitions to node j in a single iteration is given by Aij di , where d i is the degree (strength if weighted) of node i and corresponds to the sum of the weights of the edges leaving node i. The overall transition probabilities of a network can be represented in a transition matrix P such that: where D is the diagonal matrix with the value D ii corresponding to the degree of node i. Given an initial distribution of random walkers p(0), it is possible to compute the distribution of these random walkers at time t , p(t): The vector p(t) indicates the proportion of walkers located at any other node at time t. We initiate this random walk process on a single node i by setting the initial distribution p(0) to be equal to 0 everywhere, and be equal to 1 in position i. This initial vector can be written as e i . The transition probability vector at time t, given that the random walk process was started on node i can then we written as: By initiating a random-walk process from a single node, we can measure the topological proximity between this node and the other nodes in the network. By increasing the value of t, we increase the length of the randomwalks, and consequently measure the topological proximity of nodes in larger neighborhoods.
The discrete-time random walk process defined above can be "continuized" by considering the interval τ between two moves as an exponential random variable with λ = 1 [45]. The transition probability vector, given that the random walk process was initiated on node i is then given by: where L rw = I−P is the graph random-walk normalized Laplacian.

Random-walks with restarts
A variant of this equation consists in adding to this random-walk process an additional probability that the random walker randomly teleports itself back to the seed node. Random walks with teleportation are the basis for the PageRank algorithm [14] and ensure that random walks on directed networks do not get trapped in absorbing states. The equation, given that the random walk process is initiated on a single node i, is defined as follows: where α corresponds to the damping factor. By varying the damping factor, we can decrease the probability of restart of the random walks and therefore increase their length. More specifically, the probability that a randomwalks is of length l (P rob[L = l]) is geometrically related to the value of the parameter α: Ultimately, the stationary distribution of this Markov chain, which can be computed using the power iteration method, indicate the probability that random walkers of a certain length, starting from a single source node, reach other nodes in the network.

Diffusion
An alternative method to dynamically measure the topological proximity of brain regions in a network consists in modelling a heat-diffusion process on the network. More specifically, the Laplacian matrix (L) of the network is used to compute the distribution of some material at time t, given that the process was initiated at node i: where Again, by varying the value of t, we can vary the size of the neighborhoods on which the diffusion process unfolds, and therefore measure the topological proximity of nodes in the network given dynamical processes unfolding at increasingly large scales.

Multiscale Closeness Centrality
Measures of network centrality often consider a node's relationship with all of the other nodes. For instance, the closeness centrality of a node in a network can be measured by computing the inverse of its averaged topological distance to the other nodes in the network: where n corresponds to the number of nodes in the network parcellation and φ ij corresponds to the weighted shortest path between nodes i and j. To capture the centrality of a node at different scales, we computed the closeness centrality of a node as a weighted average using the weights of the probability vector p to prioritize the node's relationships with nodes that are topologically close and ignore nodes that are topologically remote. Specifically, given a scale-dependent weight vector p(t|i), the multiscale closeness centrality of a node i for the specified scale t is defined as: The topological distance between a pair of connected nodes i and j was measured as the inverse of the connection weight between the two nodes, and the topological shortest paths between pairs of nodes were subsequently retrieved using the Dijkstra's algorithm.

Clustering coefficient
The clustering coefficient of a node corresponds to the number of triangles attached to it, normalized by the maximum number of possible triangles that could be attached to the node [75]: where k i is the degree of node i and t i is the number of triangles attached to node i. A weighted version of the clustering coefficient, which can be viewed as a measure of the average "intensity" of triangles around a node, can also be expressed as follows [50]: wherew jk is the weight of the connection between nodes j and k, scaled by the largest weight in the network .

Community detection
Communities are groups of nodes with dense connectivity among each other. A Louvain-like greedy algorithm was used to identify a community assignment or partition that maximizes the quality function Q [13,41]: where A ij is the weight of connection between nodes i and j, s i and s j are the directed strengths of i and j, m is a normalizing constant, c i is the community assignment of node i and the Kronecker δ-function δ(u, v) is defined as 1 if u = v and 0 otherwise. The resolution parameter γ scales the importance of the null model and effectively controls the size of the detected communities: larger communities are more likely to be detected when γ < 1 and smaller communities (with fewer nodes in each community) are more likely to be detected when γ > 1.
To detect stable community assignments for our consensus networks, the algorithm was initiated 100 times at each value of the resolution parameter and consensus clustering was used to identify the most representative partitions [40]. This procedure was repeated for a range of 100 resolutions between γ = 0.25 and γ = 7.5. We then quantified the similarity between pairs of consensus partitions using the z score of the Rand index [64]. For each network, we identified five values of γ at which the generated partitions showed high mutual similarity and persisted through stretches of γ values. For the Discovery network of 1000 nodes, this procedure yielded partitions of 4, 6, 9, 12 and 16 communities (corresponding to γ=0. 54

Participation coefficient
Given a partition, we quantify the diversity of a node's connections to multiple communities using the participation coefficient [29]. The participation coefficient is defined as where s i is the total strength of node i, s i (c) is the strength of i in community c and the sum is over the set of all communities C. Nodes with a low participation coefficient are mainly connected with nodes in a single community, while nodes with a high participation coefficient have a diverse connection profile, with connections to multiple communities.

Neighborhood similarity
The pairwise neighborhood similarity of nodes in the network, for a particular scale t can be represented in a matrix S(t), where S ij (t) corresponds to the cosine similarity between the transition probability vectors of nodes i and j, for the scale t: S ij (t) = 1 − p(t|i)p(t|j) p(t|i)p(t|j) .

Communicability
For a binary adjacency matrix A, communicability is defined as with walks of length n normalized by n!, ensuring that shorter, more direct walks contribute more than longer walks [22]. For a weighted adjacency matrix, this definition can be extended as where D is the diagonal degree matrix [19] Generative models A stochastic block model [35] was used to generate an artificial hierarchically modular network of 2000 nodes with a layer of 4 equally-sized communities and another of 2 equally-sized communities. The edge probability matrix P was defined as follows: A two-dimensional embedding was generated using the ForceAtlas2 algorithm [38].

Diffusion map embedding
Diffusion map embedding is a nonlinear dimensionality reduction algorithm [18]. The algorithm seeks to project a set of embeddings into a lower-dimensional Euclidean space. Briefly, the similarity matrix among a set of points (in our case, the correlation matrix representing functional connectivity) is treated as a graph, and the goal of the procedure is to identify points that are proximal to one another on the graph. In other words, two points are close together if there are many relatively short paths connecting them. A diffusion operator, representing an ergodic Markov chain on the network, is formed by taking the normalized graph Laplacian of the matrix. The new coordinate space is described by the eigenvectors of the diffusion operator. We set the diffusion rate α = 1. The procedure was implemented using the mapalign Toolbox (https://github.com/satra/mapalign). Figure S2. Sensitivity and replication -methods | We replicated the main results discussed in this report using two other methods to defined the topological neighborhoods of brain regions. We replicated our results using (a) the personalized PageRank and (b) the normalized Laplacian. Section (i) shows the optimal centrality scale of a brain region. Section (ii) shows the Pearson correlation between the slope of the closeness centrality and other measures of connection diversity. Section (iii) shows the topological scale at which the Spearman correlation between the functional connectivity vector of an individual brain region and neighborhood similarity vector of the same region is the highest (left) and the scatterplot of the relationship between the best scale of a node and its position in a functional diffusion gradient Figure S3. Sensitivity and replication -dataset and scale | We replicated the main results discussed in this report with structural and functional networks generated from (a) a different dataset and parcellation (HCP, 800 nodes) and from (b) a different, coarser parcellation (219 nodes). Section (i) shows the optimal centrality scale of a brain region. Section (ii) shows the Pearson correlation between the slope of the closeness centrality and other measures of connection diversity. Section (iii) shows the topological scale at which the Spearman correlation between the functional connectivity vector of an individual brain region and neighborhood similarity vector of the same region is the highest (left) and the scatterplot of the relationship between the best scale of a node and its position in a functional diffusion gradient