Individual-Specific Areal-Level Parcellations Improve Functional Connectivity Prediction of Behavior

Resting-state functional MRI (rs-fMRI) allows estimation of individual-specific cortical parcellations. We have previously developed a multi-session hierarchical Bayesian model (MS-HBM) for estimating high-quality individual-specific network-level parcellations. Here, we extend the model to estimate individual-specific areal-level parcellations. While network-level parcellations comprise spatially distributed networks spanning the cortex, the consensus is that areal-level parcels should be spatially localized, i.e., should not span multiple lobes. There is disagreement about whether areal-level parcels should be strictly contiguous or comprise multiple non-contiguous components, therefore we considered three areal-level MS-HBM variants spanning these range of possibilities. Individual-specific MS-HBM parcellations estimated using 10min of data generalized better than other approaches using 150min of data to out-of-sample rs-fMRI and task-fMRI from the same individuals. Resting-state functional connectivity (RSFC) derived from MS-HBM parcellations also achieved the best behavioral prediction performance. Among the three MS-HBM variants, the strictly contiguous MS-HBM (cMS-HBM) exhibited the best resting-state homogeneity and most uniform within-parcel task activation. In terms of behavioral prediction, the gradient-infused MS-HBM (gMS-HBM) was numerically the best, but differences among MS-HBM variants were not statistically significant. Overall, these results suggest that areal-level MS-HBMs can capture behaviorally meaningful individual-specific parcellation features beyond group-level parcellations. Multi-resolution trained models and parcellations are publicly available (https://github.com/ThomasYeoLab/CBIG/tree/master/stable_projects/brain_parcellation/Kong2022_ArealMSHBM).


Introduction
The human cerebral cortex comprises hundreds of cortical areas with distinct function, architectonics, connectivity, and topography (Kaas, 1987;Felleman and Van Essen, 1991;Eickhoff et al., 2018a). These areas are thought to be organized into at least six to twenty spatially distributed large-scale networks that broadly subserve distinct aspects of human cognition (Goldman-Rakic, 1988;Mesulam, 1990;Smith et al., 2009;Bressler and Menon, 2010;Uddin et al., 2019). Accurate parcellation of the cerebral cortex into areas and networks is therefore an important problem in systems neuroscience. The advent of in-vivo non-invasive brain imaging techniques, such as functional magnetic resonance imaging (fMRI), has enabled the delineation of cortical parcels that approximate these cortical areas (Sereno et al., 1995;Van Essen and Glasser, 2014;Eickhoff et al., 2018b).
Most individual-specific parcellations only account for inter-subject (between-subject) variability, but not intra-subject (within-subject) variability. However, inter-subject and intrasubject RSFC variability can be markedly different across brain regions (Mueller et al., 2013;Chen et al., 2015;. For example, the sensory-motor cortex exhibits low inter-subject variability, but high intra-subject variability (Mueller et al., 2013;. Therefore, it is important to consider both inter-subject and intra-subject variability when estimating individual-specific parcellations (Mejia et al., 2015(Mejia et al., , 2018. We 4 have previously proposed a multi-session hierarchical Bayesian model (MS-HBM) of individualspecific network-level parcellation that accounted for both inter-subject and intra-subject variability . We demonstrated that compared with several alternative approaches, individual-specific MS-HBM networks generalized better to new resting-fMRI and task-fMRI data from the same individuals .
In this study, we extend the network-level MS-HBM to estimate individual-specific areallevel parcellations. While network-level parcellations comprise spatially distributed networks spanning the cortex, the consensus is that areal-level parcels should be spatially localized (Kaas, 1987;Amunts and Zilles, 2015), i.e., an areal-level parcel should not span multiple cortical lobes. Consistent with invasive studies (Amunts and Zilles, 2015), most areal-level parcellation approaches estimate spatially contiguous parcels (Shen et al., 2013;Honnorat et al., 2015;Chong et al., 2017). However, a few studies have suggested that individualspecific areal-level parcels can be topologically disconnected Li et al., 2019b). For example, according to Glasser and colleagues, area 55b might comprise two disconnected, but spatially close, components in some individuals . Given the lack of consensus, we considered three different spatial localization priors. Across the three priors, the resulting parcels ranged from being strictly contiguous to being spatially localized with multiple non-contiguous components.
We compared MS-HBM areal-level parcellations with three other approaches Li et al., 2019b) in terms of their generalizability to out-ofsample rs-fMRI and task-fMRI from the same individuals. Furthermore, a vast body of literature has shown that RSFC derived from group-level parcellations can be used to predict human behavior (Hampson et al., 2006;Finn et al., 2015;Rosenberg et al., 2016;Li et al., 2019a). Therefore, we also investigated whether RSFC derived from individual-specific MS-HBM parcellations could improve behavioral prediction compared with two other parcellation approaches Li et al., 2019b).

Overview
We proposed the spatially-constrained MS-HBM to estimate individual-specific areallevel parcellations. The model distinguished between inter-subject and intra-subject functional connectivity variability, while incorporating spatial contiguity constraints. Three different contiguity constraints were considered: distributed MS-HBM, contiguous MS-HBM and gradient-infused MS-HBM. The resulting MS-HBM parcels ranged from being strictly contiguous (contiguous MS-HBM) to being spatially localized with multiple topologically disconnected components (distributed MS-HBM). Subsequent analyses proceeded in four stages.
First, we explored the pattern of inter-subject and intra-subject functional variability across the cortex. Second, we examined the intra-subject reproducibility and inter-subject similarity of MS-HBM parcellations on two different datasets. Third, the MS-HBM was compared with three other approaches using new rs-fMRI and task-fMRI data from the same participants. Finally, we investigated whether functional connectivity of individual-specific parcellations could improve behavioral prediction.

Multi-session rs-fMRI datasets
The Human Connectome Project (HCP) S1200 release (Van Essen et al., 2012a; comprised structural MRI, rs-fMRI and task-fMRI of 1094 young adults. All imaging data were collected on a custom-made Siemens 3T Skyra scanner using a multiband sequence.
Each participant went through two fMRI sessions on two consecutive days. Two rs-fMRI runs were collected in each session. Each fMRI run was acquired at 2 mm isotropic resolution with a TR of 0.72 s and lasted for 14 min and 33 s. The structural data consisted of one 0.7 mm isotropic scan for each participant.
The Midnight Scanning Club (MSC) multi-session dataset comprised structural MRI, rs-fMRI and task-fMRI from 10 young adults (Gordon et al., 2017b;Gratton et al., 2018). All imaging data were collected on a Siemens Trio 3T MRI scanner using a 12-channel Head Matrix Coil. Each participant was scanned for 10 sessions of resting-state fMRI data. One rs-fMRI run was collected in each session. Each fMRI run was acquired at 4mm isotropic resolution with a TR of 2.2 s and lasted for 30 min. The structural data was collected across two separate days and 6 consisted of four 0.8 mm isotropic T1-weighted images and four 0.8 mm isotropic T2-weighted images.
It is worth noting some significant acquisition differences between the two datasets, including scanner type (e.g., Skyra versus Trio), acquisition sequence (e.g., multiband versus non-multiband), and scan time (e.g. day versus midnight). These differences allowed us to test the robustness of our parcellation approach.

Preprocessing
Details of the HCP preprocessing can be found elsewhere (Van Essen et al., 2012a;Glasser et al., 2013;Smith et al., 2013; HCP S1200 manual). Of particular importance is that the rs-fMRI data has been projected to the fs_LR32k surface space (Van Essen et al., 2012b), smoothed by a Gaussian kernel with 2mm full width at half maximum (FWHM), denoised with ICA-FIX Salimi-Khorshidi et al., 2014) and aligned with MSMAll (Robinson et al., 2014). To eliminate global and head-motion related artifacts (Burgess et al., 2016;Siegel et al., 2016), additional nuisance regression and censoring was performed Li et al., 2019a). Nuisance regressors comprised the global signal and its temporal derivative. Runs with more than 50% censored frames were removed. Participants with all four runs remaining (N = 835) were considered.
In the case of the MSC dataset, we utilized the preprocessed rs-fMRI data of 9 subjects on fs_LR32k surface space. Preprocessing steps included slice time correction, motion correction, field distortion correction, motion censoring, nuisance regression and bandpass filtering (Gordon et al., 2017b). Nuisance regressors comprised whole brain, ventricular and white matter signals, as well as motion regressors derived from Volterra expansion (Friston et al., 1996). The surface data was smoothed by a Gaussian kernel with 6mm FWHM. One participant (MSC08) exhibited excessive head motion and self-reported sleep (Gordon et al., 2017b;Seitzman et al., 2019) and was thus excluded from subsequent analyses.

Functional connectivity profiles
As explained in the previous section, the preprocessed rs-fMRI data from the HCP and MSC datasets have been projected onto fs_LR32K surface space, comprising 59412 cortical vertices. A binarized connectivity profile of each cortical vertex was then computed as was done 7 in our previous study . More specifically, we considered 1483 regions of interest (ROIs) consisting of single vertices uniformly distributed across the fs_LR32K surface meshes . For each rs-fMRI run of each participant, the Pearson's correlation between the fMRI time series at each spatial location (59412 vertices) and the 1483 ROIs were computed. Outlier volumes were ignored when computing the correlations. The 59412 × 1483 RSFC (correlation) matrices were then binarized by keeping the top 10% of the correlations to obtain the final functional connectivity profile .
We note that because fMRI is spatially smooth and exhibits long-range correlations, therefore considering only 1483 ROI vertices (instead of all 59412 vertices) would reduce computational and memory demands, without losing much information. To verify significant information has not been lost, the following analysis was performed. For each HCP participant, a 59412 × 59412 RSFC matrix was computed from the first rs-fMRI run. We then correlated every pair of rows of the RSFC matrix, yielding a 59412 × 59412 RSFC similarity matrix for each HCP participant. An entry in this RSFC similarity matrix indicates the similarity of the functional connectivity profiles of two cortical locations. The procedure was repeated but using the 59412 × 1483 RSFC matrices to compute the 59412 × 59412 RSFC similarity matrices.
Finally, for each HCP participant, we correlated the RSFC similarity matrix (generated from 1483 vertices) and RSFC similarity matrix (generated from 59412 vertices). The resulting correlations were high with r = 0.9832 ± 0.0041 (mean ± std) across HCP participants, suggesting that very little information was lost by only considering 1483 vertices.

Group-level parcellation
We have previously developed a set of high-quality population-average areal-level parcellations of the cerebral cortex , which we will refer to as "Schaefer2018". Although the Schaefer2018 parcellations are available in different spatial resolutions, we will mostly focus on the 400-region parcellation in this paper ( Figure 4A) given that previous work has suggested that there might be between 300 and 400 human cortical areas (Van Essen et al., 2012b). The 400-region Schaefer2018 parcellation will be used to initialize the areal-level MS-HBM for estimating individual-specific parcellations. The Schaefer2018 parcellation will also be used as a baseline in our experiments. 8

Areal-level multi-session hierarchical Bayesian model (MS-HBM)
The areal-level MS-HBM ( Figure 1A) is the same as the network-level MS-HBM  except for one crucial detail, i.e., spatial localization prior Φ ( Figure 1A).
Nevertheless, for completeness, we will briefly explain the other components of the MS-HBM, although further details can be found elsewhere .
We denote the binarized functional connectivity profile of cortical vertex during session of subject as , . For example, the binarized functional connectivity profiles of a posterior cingulate cortex vertex ( 1,1 ) and a precuneus vertex ( 1,1 ) from the 1st session of the 1st subject are illustrated in Figure 1A (fourth row). The shaded circle indicates that , are the only observed variables. Based on the observed connectivity profiles of all vertices during all sessions of a single subject, the goal is to assign a parcel label for each vertex of subject .
Even though a vertex's connectivity profiles are likely to be different across fMRI sessions, the vertex's parcel label was assumed to be the same across sessions. For example, the individualspecific areal-level parcellation of the 1st subject using data from all available sessions is illustrated in Figure 1A (last row). -specific areallevel parcellations. , denote the RSFC profile at brain location of subject during rs-fMRI session . The shaded circle indicates that , are the only observed variables. The goal is to estimate the parcel label for subject at location given RSFC profiles from all sessions. is the group-level RSFC profile of parcel . is the subject-specific RSFC profile of parcel . A large indicates small inter-subject RSFC variability, i.e., the group-level and subject-specific RSFC profiles are very similar. , is the subject-specific RSFC profile of parcel during session . A large indicates small intra-subject RSFC variability, i.e., the subject-level and session-level RSFC profiles are very similar. captures inter-region RSFC variability. A large indicates small inter-region variability, i.e., two locations from the same parcel exhibit very similar RSFC profiles. Finally, captures inter-subject variability in the spatial distribution of parcels, smoothness prior encourages parcel labels to be spatially smooth, and the spatial localization prior Φ ensures each parcel is spatially localized. The spatial localization prior Φ is the crucial difference from the original network-level MS-HBM . (B) Illustration of three different spatial localization priors. Individual-specific parcellations of the same HCP participant were estimated using distributed MS-HBM (dMS-HBM), contiguous MS-HBM (cMS-HBM), and gradient-infused MS-HBM (gMS-HBM). Four parcels depicted in pink, red, blue and yellow are shown here. All four parcels estimated by dMS-HBM were spatially close together but contained two separate components. All four parcels estimated by cMS-HBM were spatially contiguous. Three parcels (pink, red, yellow) estimated by gMS-HBM were spatially contiguous, while the blue parcel contained two separate components.
The multiple layers of the areal-level MS-HBM explicitly differentiate inter-subject (between-subject) functional connectivity variability from intra-subject (within-subject) functional connectivity variability ( and in Figure 1A). The connectivity profiles of two vertices belong to the same parcel will not be identical. This variability is captured by ( Figure   1A). Some model parameters (e.g., group-level connectivity profiles) will be estimated from a training set comprising multi-session rs-fMRI data from multiple subjects. A new participant (possibly from another dataset) with single-session fMRI data could then be parcellated without access to the original training data.
The Markov random field (MRF) spatial prior ( Figure 1A last row) is important because the observed functional connectivity profiles of individual subjects are generally very noisy. Therefore, additional priors were imposed on the parcellation. First, the spatial smoothness prior encouraged neighboring vertices (e.g., PCC and pCun) to be assigned to the same parcels.
Second, the inter-subject spatial variability prior , denote the probability of parcel occurring at a particular spatial location . The two priors ( and , ) are also present in the network-level MS-HBM .
However, an additional spatial prior is necessary because of well-documented long-range connections spanning the cortex. Therefore, with the original MRF prior , brain locations with similar functional connectivity profiles could be grouped together regardless of spatial proximity. In the case of network-level MS-HBM, this is appropriate because largescale networks are spatially distributed, e.g., the default network spans frontal, parietal, temporal and cingulate cortex. In the case of areal-level parcellations, there is the expectation that a single parcel should not span large spatial distances . Therefore, the areal-level MS-HBM incorporates an additional prior Φ constraining parcels to be spatially localized ( Figure 1A last row).
As mentioned in the introduction, even though there is consensus that individual-specific areal-level parcels should be spatially localized, there are differing opinions about whether they should be spatially contiguous. Some studies have enforced spatially contiguous cortical parcels Chong et al., 2017) consistent with invasive studies (Amunts and Zilles, 2015). Other studies have estimated parcels that might comprise multiple spatially close components Li et al., 2019b). For example, Glasser and colleagues suggested that area 55b might be split into two disconnected components in close spatial proximity. Given the lack of consensus, we consider three possible spatial localization priors (i.e, Φ in Figure 1A): 1. Distributed MS-HBM (dMS-HBM). Previous studies have suggested that after registering cortical folding patterns, inter-individual variability in architectonic locations are different across architectonic areas (Fischl et al., 2008). One of the most spatially variable architectonic area is hOc5, which can be located in an adjacent sulcus away from the groupaverage location (Yeo et al., 2010a(Yeo et al., , 2010b. This variability corresponded to about 30mm. Therefore, similar to Glasser and colleagues (2016), Φ comprises a spatial localization prior constraining each individual-specific parcel to be within 30mm of the group-level Schaefer2018 parcel boundaries. We note that this prior only guarantees an individualspecific parcel to be spatially localized, but the parcel might comprise multiple distributed components ( Figure 1B left panel). We refer to this prior as distributed MS-HBM (dMS-HBM). MS-HBM (cMS-HBM). In addition to the 30mm prior from dMS-HBM, we include a spatial localization prior encouraging vertices comprising a parcel to not be too far from the parcel center, as was done in our previous study . If this spatial contiguity prior is sufficiently strong, then all individual-specific parcels will be spatially connected ( Figure 1B middle panel). However, an overly strong prior will result overly round parcels, which is not biologically plausible (Vogt 2009). To ameliorate this issue, the estimation procedure starts with a very small weight on this spatial contiguity prior and then progressively increases the weights to ensure spatial contiguity. Thus, we refer to this prior as contiguous MS-HBM model (cMS-HBM). We note that requiring parcels to be spatially connected within an MRF framework is non-trivial; our approach is significantly less computationally expensive than competing approaches (Nowozin and Lampert, 2010;Honnorat et al., 2015).

Gradient-infused MS-HBM (gMS-HBM).
A well-known areal-level parcellation approach is the local gradient approach, which detects local abrupt changes (i.e., gradients) in RSFC across the cortex . Our previous study  has suggested the utility of combining local gradient  and global clustering ) approaches for estimating areal-level parcellations. Therefore, we complemented the spatial contiguity prior in cMS-HBM with a prior based on local gradients in RSFC, which encouraged adjacent brain locations with gentle changes in functional connectivity to be grouped into the same parcel. In practice, we found that the gradient-infused prior, together with a very weak spatial contiguity prior, dramatically increased the number of spatially contiguous parcels ( Figure 1B right panel). Furthermore, the parcels are also less round than cMS-HBM, which is in our opinion more biologically plausible. We refer to this prior as gradient-infused MS-HBM (gMS-HBM).
A more detailed mathematical explanation of the model can be found in Supplemental Methods S1. Given a dataset of subjects with multi-session rs-fMRI data, a variational Bayes expectation-maximization (VBEM) algorithm can be used to estimate the following model parameters : group-level parcel connectivity profiles , the inter-subject functional connectivity variability , the intra-subject functional connectivity variability , the spatial smoothness prior and the inter-subject spatial variability prior . The individualspecific areal-level parcellation of a new participant could then be generated using these 13 estimated group-level priors without access to the original training data. Furthermore, although the model requires multi-session fMRI data for parameter estimation, it can be applied to a single session fMRI data from a new participant . Details of the VBEM algorithm can be found in Supplementary Methods S2.

Characterizing inter-subject and intra-subject functional connectivity variability
Previous studies have shown that sensory-motor regions exhibit lower inter-subject, but higher intra-subject functional connectivity variability than association regions (Mueller et al., 2013;. Therefore, we first evaluate whether estimates of areal-level inter-subject and intra-subject variability were consistent with previous work ( Figure   2A). The HCP dataset was divided into training (N = 40), validation (N = 40) and test (N = 755) sets. Each HCP participant underwent two fMRI sessions on two consecutive days. Within each session, there were two rs-fMRI runs. All four runs were utilized.
The parameters of three MS-HBM variants (dMS-HBM, cMS-HBM and gMS-HBM) were estimated. More specifically, the group-level parcel connectivity profiles , the intersubject resting-state functional connectivity variability , the intra-subject resting-state functional connectivity variability and inter-subject spatial variability prior were estimated using the HCP training set ( Figure 2A). We tuned the "free" parameters (associated with the spatial smoothness prior and spatial localization prior Φ) using the HCP validation set ( Figure   2A). The Schaefer2018 400-region group-level parcellation ( Figure 4A) was used to initialize the optimization procedure. The final trained MS-HBMs ( Figure 2A) were used in all subsequent analyses.

Intra-subject reproducibility and inter-subject similarity of MS-HBM parcellations
Within-subject reliability is important for clinical applications (Shehzad et al., 2009;Birn et al., 2013;Zuo and Xing, 2014;. Having verified that the spatial patterns of inter-subject and intra-subject functional connectivity variability were consistent with previous work, we further characterized the intra-subject reproducibility and inter-subject similarity of individual-specific MS-HBM parcellations ( Figure 2B). The three trained models (dMS-HBM, cMS-HBM and gMS-HBM) were applied to the HCP test set. Individual-specific MS-HBM parcellations were independently estimated using the first 2 runs (day 1) and the last 2 runs (day 2).

15
To evaluate the reproducibility of individual-specific parcellations, the Dice coefficient was computed for each parcel from the two parcellations of each participant: Dice( 1 , s 2 ) = 2 × # vertices that overlap between parcels 1 and 2 #vertices in parcel 1 + #vertices in parcel 2 , where 1 and 2 are parcel from the two parcellations of subject . The Dice coefficient is widely used for comparing parcellation or segmentation overlap (Destrieux et al., 2010;Sabuncu et al., 2010;Birn et al., 2013;Blumensath et al., 2013;Arslan et al., 2015;Honnorat et al., 2015;Salehi et al., 2018). The Dice coefficient is equal to 1 if there is perfect overlap between parcels and zero if there is no overlap between parcels. The Dice coefficients were averaged across all participants to provide insights into regional variation in intra-subject parcel similarity. Finally, the Dice coefficients were averaged across all parcels to provide an overall measure of intrasubject parcellation reproducibility.
To evaluate inter-subject parcellation similarity, for each pair of participants, the Dice coefficient was computed for each parcel. Since there were two parcellations for each participant, there were a total of four Dice coefficients for each parcel, which were then averaged. Furthermore, the Dice coefficients were averaged across all pairs of participants to provide insights into regional variation in inter-subject parcel similarity. Finally, the dice coefficients were averaged across all parcels to provide an overall measure of inter-subject parcellation similarity.
To evaluate whether the parameters of MS-HBM algorithms from one dataset could be generalized to another dataset with different acquisition protocols and preprocessing pipelines, we used the HCP model parameters to estimate individual-specific parcellations in the MSC dataset. More specifically, the MS-HBM parcellations were independently estimated using the first 5 sessions and the last 5 sessions for each MSC participant ( Figure 2B).

Geometric properties of MS-HBM parcellations
The three MS-HBM variants impose different spatial priors on areal-level parcellations.
To characterize the geometric properties of the MS-HBM parcels ( Figure 2C), the three trained models (dMS-HBM, cMS-HBM and gMS-HBM) were applied to the HCP test set using all four rs-fMRI runs. We then computed two metrics to characterize the geometry of the parcellations.
First, for each parcellation, the number of spatially disjoint components was computed for each parcel and averaged across all parcels. Second, for each parcellation, a roundness metric was computed for each parcel and averaged across all parcels. Here, the roundness of a parcel is defined as 1 − # # ℎ ; a larger value indicates that a parcel is rounder.

Comparison with alternative approaches
Here, we compared the three MS-HBM approaches (dMS-HBM, cMS-HBM, and gMS-HBM) with three alternative approaches. The first approach was to apply the Schaefer2018 400region group-level parcellation to individual subjects. The second approach is the well-known gradient-based boundary mapping algorithm that has been widely utilized to estimate individualspecific areal-level parcellation Gordon et al., 2017b). We will refer to this second approach as "Laumann2015" (https://sites.wustl.edu/petersenschlaggarlab/resources).
The third approach is the recent individual-specific areal-level parcellation algorithm of Li and colleagues (Li et al., 2019b; http://nmr.mgh.harvard.edu/bid/DownLoad.html), which we will refer to as "Li2019".
Evaluating the quality of individual-specific resting-state parcellations is difficult because of a lack of ground truth. Here, we considered two common evaluation metrics Chong et al., 2017;: resting-state connectional homogeneity and task functional inhomogeneity (i.e., uniform task activation; see below). These metrics encode the principle that if an individual-specific parcellation captured the areal-level organization of the individual's cerebral cortex, then each parcel should have homogeneous connectivity and function. Furthermore, we also compared the relative utility of the different parcellation approaches for RSFC-based behavioral prediction. Comparing out-of-sample resting-state homogeneity across different parcellation approaches applied to a single rs-fMRI session. (B) Comparing out-of-sample resting-state homogeneity across different parcellation approaches applied to different lengths of rs-fMRI data. (C) Comparing task inhomogeneity across different approaches. (D) Comparing RSFC-based behavioral prediction accuracies across different approaches. Across all analyses, MS-HBM parcellaions were estimated using the trained models from Figure 2A. We remind the reader that the trained MS-HBMs were estimated using the HCP training and validation sets (Figure 2A), which did not overlap with the HCP test set utilized in the current set of analyses. In the case of analyses (A) and (B), only a portion of rs-fMRI data was used to estimate the parcellations. The remaining rs-fMRI data was used to compute out-of-sample resting-state homogeneity. For analyses (C) and (D), all available rs-fMRI data was used to estimate the parcellations. Finally, we note that the local gradient approach (Laumann2015) does not yield a fixed number of parcels. Thus, the number of parcels is variable within an individual with different lengths of rs-fMRI data, so Laumann2015 was not considered for analysis B. Similarly, the number of parcels is different across participants, so the sizes of the RSFC matrices are different across participants. Therefore, Laumann2015 was also not utilized for analysis D.

Resting-state connectional homogeneity
Resting-state connectional homogeneity was defined as the averaged Pearson's correlations between rs-fMRI time courses of all pairs of vertices within each parcel, adjusted for parcel size and summed across parcels . Higher restingstate homogeneity means that vertices within the same parcel share more similar time courses.
Therefore, higher resting-state homogeneity indicates better parcellation quality.
For each participant from the HCP test set (N = 755), we used one run to infer the individual-specific parcellation and computed resting-state homogeneity with the remaining 3 runs. For the MSC dataset (N = 9), we used one session to infer the individual-specific parcellation and computed resting-state homogeneity with the remaining 9 sessions ( Figure 3A).
Because MSC participants have large amount of rs-fMRI data (300 min), we also parcellated each MSC participant using different length of rs-fMRI data (10 min to 150 min) and evaluated the resting-state homogeneity with the remaining 5 sessions. This allowed us to estimate how much the algorithms would improve with more data ( Figure 3B).
When comparing resting-state homogeneity between parcellations, the effect size (Cohen's ) of differences and a two-sided paired-sample t-test (dof = 754 for HCP, dof = 8 for MSC) were computed.

Task functional inhomogeneity
Task functional inhomogeneity was defined as the standard deviation of (activation) zvalues within each parcel for each task contrast, adjusted for parcel size and summed across parcels (Gordon et al., 2017b;. Lower task inhomogeneity means that activation within each parcel is more uniform. Therefore, lower task inhomogeneity indicates better parcellation quality. The HCP task-fMRI data consisted of 7 task domains: social cognition, motor, gambling, working memory, language processing, emotional processing, and relational processing (Barch et al., 2013). The MSC task-fMRI data consisted of three task domains: motor, mixed, and memory (Gordon et al., 2017b). Each task domain contained multiple task contrasts. All available task contrasts were utilized.
For each participant from the HCP test set (N = 755) and MSC dataset (N = 9), all rs-fMRI sessions were used to infer the individual-specific parcellation ( Figure 3C). The individualspecific parcellation was then used to evaluate task inhomogeneity for each task contrast and then averaged across all available contrasts within a task domain, resulting in a single task inhomogeneity measure per task domain. When comparing between parcellations, we averaged the task inhomogeneity metric across all contrasts within a task domain before the effect size (Cohen's ) of differences and a two-sided paired-sample t-test (dof = 754 for HCP, dof = 8 for MSC) were computed for each domain.

Methodological considerations
It is important to note that a parcellation with more parcels tends to have smaller parcel size, leading to higher resting-state homogeneity and lower task inhomogeneity. For example, if a parcel comprised 2 vertices, then the parcel would be highly homogeneous. In our experiments, the MS-HBM algorithms and Li2019 were initialized with the 400-region Schaefer2018 grouplevel parcellation, resulting in the same number of parcels as Schaefer2018, i.e., 400 parcels.
This allowed for a fair comparison among MS-HBMs, Li2019 and Schaefer2018.
However, parcellations estimated by Laumann2015 had a variable number of parcels across participants. Furthermore, Laumann2015 parcellations also had a significant number of vertices between parcels that were not assigned to any parcel, which has the effect of artificially increasing resting homogeneity and decreasing task inhomogeneity. Therefore, when comparing MS-HBM with Lauman2015 using resting-state homogeneity ( Figure 3A) and task 20 inhomogeneity ( Figure 3C), we performed a post-hoc processing of MS-HBM parcellations to match the number of parcels and unlabeled vertices of Laumann2015 parcellations (Supplementary Methods S3).
In addition, the Laumann2015 approach yielded different numbers of parcels within an individual with different lengths of rs-fMRI data. Therefore, Laumann2015 was also excluded from the analysis of out-of-sample resting-state homogeneity with different lengths of rs-fMRI data ( Figure 3B).

RSFC-based behavioral prediction
Most studies utilized a group-level parcellation to derive RSFC for behavioral prediction (Dosenbach et al., 2010;Finn et al., 2015;Dubois et al., 2018;Li et al., 2019a;Weis et al., 2019). Here, we investigated if RSFC derived from individual-specific parcellations can improve behavioral prediction performance. As before Li et al., 2019a;He et al., 2019), we considered 58 behavioral phenotypes measuring cognition, personality and emotion from the HCP dataset. Three participants were excluded from further analyses because they did not have all behavioral phenotypes, resulting in a final set of 752 test participants.
The different parcellation approaches were applied to each HCP test participant using all four rs-fMRI runs ( Figure 3D). The Laumann2015 approach yielded parcellations with different numbers of parcels across participants, so there was a lack of inter-subject parcel correspondence. Therefore, we were unable to perform behavioral prediction with the Laumann2015 approach, so Laumann2015 was excluded from this analysis. In practice, an 2 -regularization term (i.e., kernel ridge regression) was included to reduce overfitting (Supplementary Methods S4; Murphy, 2012). We performed 20-fold cross-validation for each behavioral phenotype. Family structure within the HCP dataset was taken into account by ensuring participants from the same family (i.e., with either the same mother ID or father ID) were kept within the same fold and not split across folds. For each test fold, an inner-loop 20fold cross-validation was repeatedly applied to the remaining 19 folds with different regularization parameters. The optimal regularization parameter from the inner-loop crossvalidation was then used to predict the behavioral phenotype in the test fold. Accuracy was measured by correlating the predicted and actual behavioral measure across all participants within the test fold (Finn et al., 2015;Li et al., 2019b). By repeating the procedure for each test fold, each behavior yielded 20 correlation accuracies, which were then averaged across the 20 folds. Because a single 20-fold cross-validation might be sensitive to the particular split of the data into folds (Varoquaux et al., 2017), the above 20-fold cross-validation was repeated 100 times. The mean accuracy and standard deviation across the 100 crossvalidations will be reported. When comparing between parcellations, a corrected resampled t-test for repeated k-fold cross-validation was performed (Bouckaert and Frank, 2004). We also repeated the analyses using coefficient of determination (COD) as a metric of prediction performance.
As certain behavioral measures are known to correlate with motion (Siegel et al., 2016), we regressed out age, sex, framewise displacement (FD), DVARS, body mass index and total brain volume from the behavioral data before kernel ridge regression. To prevent any information leak from the training data to test data, the nuisance regression coefficients were estimated from the training folds and applied to the test fold. 22
We note that the computational bottleneck for gMS-HBM is the computation of the local gradients . We implemented a faster and less memory-intensive version of the local gradient computation by subsampling the functional connectivity matrices The resulting gradient maps were highly similar to the original gradient maps (r = 0.97). The faster gradient code can be found here (https://github.com/ThomasYeoLab/CBIG/tree/master/utilities/matlab/speedup_gradients).

Overview
Three variations of the MS-HBM with different contiguity constraints ( Figure 1) were applied to two multi-session rs-fMRI datasets to ensure that the approaches were generalizable across datasets with significant acquisition and processing differences. After confirming previous literature (Mueller et al., 2013;) that inter-subject and intra-subject RSFC variabilities were different across the cortex, we then established that the MS-HBM algorithms produced individual-specific areal-level parcellations with better quality than other approaches. Finally, we investigated whether RSFC derived from MS-HBM parcellations could be used to improve behavioral prediction.

Sensory-motor cortex exhibits lower inter-subject but higher intra-subject functional connectivity variability than association cortex
The parameters of gMS-HBM, dMS-HBM and cMS-HBM were estimated using the HCP training set. Figure S1 shows the inter-subject RSFC variability ( ) and intra-subject RSFC variability ( ) overlaid on corresponding Schaefer2018 group-level parcels. The pattern of intersubject and intra-subject RSFC variability were consistent with previous work (Mueller et al., 2013;. More specifically, sensory-motor parcels exhibited lower inter-subject RSFC variability than association cortical parcels. On the other hand, association cortical parcels exhibited lower intra-subject RSFC variability than sensorymotor parcels.

Individual-specific MS-HBM parcellations exhibit high intra-subject reproducibility and low inter-subject similarity
To assess intra-subject reproducibility and inter-subject similarity, the three MS-HBM variants were tuned on the HCP training and validation sets, and then applied to the HCP test set.
Individual-specific parcellations were generated by using resting-state fMRI data from day 1 (first 2 runs) and day 2 (last 2 runs) separately for each participant. All 400 parcels were present in 99% of the participants. Inter-subject spatial similarity for different parcels. (C) Intra-subject reproducibility for different parcels. Yellow color indicates higher overlap. Red color indicates lower overlap. Individual-specific MS-HBM parcellations were generated by using day 1 (first two runs) and day 2 (last two runs) separately for each participant. Sensory-motor parcels exhibited higher intra-subject reproducibility and inter-subject similarity than association parcels. Figure 4 shows the inter-subject and intra-subject spatial similarity (Dice coefficient) of parcels from the three MS-HBM variants in the HCP test set. Intra-subject reproducibility was greater than inter-subject similarity across all parcels. Consistent with our previous work on individual-specific cortical networks , sensory-motor parcels were more spatially similar across participants than association cortical parcels. Sensory-motor parcels also exhibited greater within-subject reproducibility than association cortical parcels.
Overall, gMS-HBM, dMS-HBM and cMS-HBM achieved intra-subject reproducibility of 81.0%, 80.4% and 76.1% respectively, and inter-subject similarity of 68.2%, 68.1% and 63.9% respectively. We note that these metrics cannot be easily used to judge the quality of the parcellations. For example, gMS-HBM has higher intra-subject reproducibility and higher intersubject similarity than cMS-HBM, so we cannot simply conclude that one is better than the other.

Geometric properties of MS-HBM parcellations
In the HCP test set, the average number of spatially disconnected components per parcel was 1.95 ± 0.29 (mean ± std), 1 ± 0 and 1.06 ± 0.07 for dMS-HBM, cMS-HBM and gMS-HBM respectively. In the case of dMS-HBM, the maximum number of spatially disconnected components (across all participants and parcels) was 11 ( Figure S8). In the case of gMS-HBM, the maximum number of spatially disconnected components (across all participants and parcels) was 3 ( Figure S8). On the other hand, the average roundness of the parcellations was 0.56 ± 0.02 (mean ± std), 0.60 ± 0.01 and 0.58 ± 0.02 for dMS-HBM, cMS-HBM and gMS-HBM Cohen's d = 7.5, largest p = 1.2e-9). All reported p values were significant after correcting for multiple comparisons with false discovery rate (FDR) q < 0.05. and S10). We note that Laumann2015 parcellations had different number of parcels with different length of rs-fMRI data. Therefore, the resting-state homogeneity of Laumann2015 parcellations was not comparable across different length of rs-fMRI data, so the results were not shown. Because Schaefer2018 is a group-level parcellation, the parcellation stays the same regardless of the amount of data. Therefore, the performance of the Schaefer2018 group-level parcellation remained constant regardless of the amount of data. Surprisingly, the performance of the Li2019 individual-specific parcellation approach also remained almost constant regardless of the amount of data. One possible reason is that Li2019 constrained individual-specific parcels to overlap with group-level parcels. This might be an overly strong constraint, which could not be overcome with more rs-fMRI data. By contrast, the MS-HBM algorithms (dMS-HBM, cMS-HBM, and gMS-HBM) exhibited higher homogeneity with increased length of rs-fMRI data, suggesting that MS-HBM models were able to improve with more rs-fMRI data.

Individual-specific MS-HBM parcels exhibit lower task inhomogeneity than other approaches
Individual-specific parcellations were estimated using all rs-fMRI sessions from the HCP test set and MSC dataset. Task inhomogeneity was evaluated using task fMRI. Figures 8 and S11 show the task inhomogeneity of all approaches for all task domains in the MSC and HCP  Figure 8). Task inhomogeneity was evaluated using task fMRI. Task inhomogeneity was then defined as the standard deviation of task activation within each parcel, and then averaged across all parcels and contrasts within each behavioral domain. Lower value indicates better task inhomogeneity.
Each circle represents one MSC participant. Dash lines connect the same participants. (B) Same as (A) except that Laumann2015 yielded different number of parcels for each participant, so we matched the number of MS-HBM parcels accordingly for each participant. HCP results are shown in Figure S11.
Among the three MS-HBM variants, cMS-HBM achieved the best task inhomogeneity, while dMS-HBM achieved the worst task inhomogeneity. Compared with gMS-HBM, cMS-HBM achieved an improvement ranging from 0.03% to 0.92% across all task domains and datasets (average improvement = 0.3%, average Cohen's d = 0.6, largest p = 0.013). Compared with dMS-HBM, gMS-HBM achieved an improvement ranging from 0.06% to 1.1% across all task domains and datasets (average improvement = 0.5%, average Cohen's d = 1.1, largest p = 1.2e-3). All reported p values were significant after correcting for multiple comparisons with FDR q < 0.05.

Functional connectivity of individual-specific MS-HBM parcels improves behavioral prediction
Individual-specific parcellations were estimated using all rs-fMRI sessions from the HCP test set. The RSFC of the individual-specific parcellations was used for predicting 58 behavioral measures. We note that the number of parcels was different across participants for Laumann2015, so Laumann2015 could not be included for this analysis.
Tables S2 and S3 summarize the average prediction accuracies (Pearson's correlation) for different sets of behavioral measures, including cognitive, personality and emotion measures.
Overall, individual-specific functional connectivity strength from MS-HBM parcellations achieved better prediction performance than other approaches. In general, gMS-HBM achieved better prediction performance than dMS-HBM and cMS-HBM, but differences were not significant. Figure 9A shows the average prediction accuracies of all 58 behaviors across different parcellation approaches. Compared with Schaefer2018 and Li2019, gMS-HBM achieved improvements of 16% (p = 5.0e-4) and 18% (p = 5.4e-4) respectively. Both p values remained significant after correcting for multiple comparisons with FDR q < 0.05. Compared with cMS-HBM and dMS-HBM, gMS-HBM achieved an improvement of 5.5% and 3.4% respectively.
However, differences among MS-HBM variants were not significant. We note that some behavioral measures were predicted poorly by all approaches. This is not unexpected because we do not expect all behavioral measures to be predictable with RSFC.
Therefore, we further consider a subset of behavioral measures that could be predicted well by at least one approach. Figure 9B shows the average prediction accuracies of 36 behaviors with accuracies higher than 0.1 for at least one approach ("36 behaviors > 0.1"). Compared with Schaefer2018 and Li2019, gMS-HBM achieved improvements of 13% (p = 2.2e-4) and 13% (p

Task performance measures are more predictable than self-reported measures
To explore which behavioral measures can be consistently predicted well regardless of parcellations, we ordered the 58 behavioral measures based on averaged prediction accuracies (Pearson's correlation) across Schaefer2018, Li2019 and the three MS-HBM variants ( Figure   11B). Our previous studies (Li et al., 2019a;Liégeois et al., 2019) have suggested that "selfreported" and "task performance" measures might be differentially predicted under different conditions. Using the same classification of behavioral measures (Li et al., 2019a;Liégeois et al., 2019), we found that the average prediction accuracies of self-reported measures and task performance measures were r = 0.0890 ± 0.0048 and r = 0.1181 ± 0.0033 respectively ( Figure   11A), suggesting that on average, task performance measures were more predictable than selfreported measures (p = 0.042). Figure 11. Task performance measures were predicted better than self-reported measures across different parcellation approaches. Prediction accuracies were averaged across all parcellation approaches (three MS-HBM variants, Schaefer2018, and Li2019). (A) Prediction accuracies averaged across HCP task-performance measures (gray) and HCP self-reported measures (white). (B) Behavioral measures were ordered based on average prediction accuracies. Gray color indicates task performance measures. White color indicates self-reported measures. Boxplots utilized default Matlab parameters, i.e., box shows median and inter-quartile range (IQR). Whiskers indicate 1.5 IQR (not standard deviation). Circle indicates mean. Designation of behavioral measures into "self-reported" and "task-performance) measures followed previous studies (Li et al., 2019a;Liégeois et al., 2019).

Discussion
In this manuscript, we demonstrated the robustness of the MS-HBM areal-level parcellation approach. Compared with a group-level parcellation and two state-of-the-art individual-specific areal-level parcellation approaches, we found that MS-HBM parcels were more homogeneous during resting-state while also exhibiting more uniform task activation patterns (i.e., lower task inhomgeneity). Furthermore, RSFC derived from individual-specific MS-HBM parcellations achieved better behavioral prediction performance than other approaches. Among the three MS-HBM variants, the contiguous MS-HBM (cMS-HBM) exhibited the best resting homogeneity and task inhomogeneity, while the gradient-infused MS-HBM (gMS-HBM) exhibited the best behavioral prediction performance.

Interpretation of the MS-HBM areal-level parcellations
Previous studies have estimated around 300 to 400 classically defined cortical areas in  (Valk et al., 2020). Despite our focus on the 400-region areal-level parcellations in the current study, we do not believe that there is an optimal number of cortical parcels because of the multi-resolution organization of the cerebral cortex (Churchland and Sejnowski, 1988;van den Heuvel and Yeo, 2017). Indeed, given the heterogeneity of cortical areas (Kaas, 1987;Amunts and Zilles, 2015), cortical areas might be further subdivided into meaningful computational sub-units.
More specifically and consistent with other studies, our areal-level parcels likely captured sub-areal features such as somatotopy and visual eccentricity . Ultimately, the choice of parcellation resolution might depend on the specific application. For example, a recent study suggested that brain-behavior relationships are scaledependent (Betzel et al., 2019). Furthermore, a higher resolution parcellation might be computationally infeasible for certain analysis, such as edge-centric network analysis (Faskowitz et al., 2020). Therefore, we have provided trained MS-HBM at different spatial resolutions, ranging from 100 to 1000 parcels. It is worth noting that because our parcels do not correspond to traditional cortical areas (Kaas, 1987;Amunts and Zilles, 2015), we have been careful to avoid the term "areas". Instead we use the term "areal-level parcellation" when referring to the entire parcellation and "parcels" when referring to individual regions throughout the manuscript.
Several studies have shown that brain networks reconfigure during tasks ( Eickhoff et al., 2018a) in order to achieve invariance across brain states. We leave this for future work.

MS-HBM areal-level parcellations are more homogeneous than other approaches in out-ofsample resting and task-fMRI
Dealing with RSFC matrices at the original voxel or vertex resolution is difficult because of the high-dimensionality. Thus, areal-level brain parcellations have been widely utilized as a dimensionality reduction tool (Eickhoff et al., 2018a), e.g., averaged time course of a parcel is used to represent the entire parcel (Varoquaux and Craddock, 2013;Finn et al., 2015;Rosenberg et al., 2016). For the dimensionality reduction to be valid, vertices within each areal-level parcel should have similar time courses, i.e., high resting-state homogeneity . Across two datasets (HCP and MSC), we found that MS-HBM areal-level parcellations exhibited higher resting-state homogeneity than three other approaches, suggesting that rs-fMRI time courses are more similar within MS-HBM parcels (Figures 6 and 7).
Furthermore, if an individual-specific areal-level parcellation accurately captures the functional brain organization of a participant, one might expect task activation to be uniform within parcels, i.e., low task inhomogeneity (Gordon et al., 2017b;. We found that MS-HBM parcellations achieved better task inhomogeneity than other approaches in both HCP and MSC datasets (Figure 8 and S11). Given the strong link between task-fMRI and rs-fMRI (Smith et al., 2009;Mennes et al., 2010;Cole et al., 2014;Krienen et al., 2014;Bertolero et al., 2015;Yeo et al., 2015;Tavor et al., 2016), this is perhaps not surprising.
Intriguingly, the improvement in task inhomogeneity varied significantly across task domains (Figures 8 and S11) with the motor task exhibiting the least task inhomogeneity improvement for both HCP and MSC datasets. Given that the motor domain exhibited one of the lowest task inhomogeneity across behavioral domains, there might not be much room for improvement. Furthermore, sensory-motor parcels exhibited low inter-subject variation in terms of location and spatial topography ( Figure 4B), so different approaches might perform similarly well. Other possible reasons might include variation in task design and duration.
It is worth pointing out that even though MSC dataset only contained 9 participants, MS-HBM parcellations exhibited better resting homogeneity and task inhomogeneity in every single participant (Figures 6-8, S9-S11). This suggests that MS-HBM parameters estimated from HCP were effective in MSC despite significant acquisition and preprocessing differences.

MS-HBM works well even with only 10 min of rs-fMRI data
It is well-known that longer scan durations can improve the reliability of RSFC measures (Van Dijk et al., 2010;Xu et al., 2016;. Recent studies have suggested that at least 20-30 min of data is needed to obtain reliable measurements O'Connor et al., 2017;Gordon et al., 2017b). Consistent with previous work, we found that resting-state homogeneity of individual-specific areal-level parcellations continued to improve with more data (Figure 7 and S10). The improvements started to plateau around 40-50 min of data.
Although MS-HBM required multi-session rs-fMRI data for training, the models could be applied to a single rs-fMRI session from a new dataset. More specifically, in the MSC dataset, we showed that MS-HBM areal-level parcellations estimated with only 10 min of rs-fMRI data exhibited better resting-state homogeneity than two other approaches using 150 min of data (Gordon et al., 2017b;Li et al., 2019b).

RSFC of individual-specific MS-HBM parcellations improves behavioral prediction
A vast body of literature has shown that functional connectivity derived from group-level parcellations can be utilized for behavioral prediction (Hampson et al., 2006;  It is worth noting that the absolute improvement in prediction performance was modest on average, although some behavioral measures appeared to benefit more than others. For example, when comparing Li2019 and gMS-HBM for behavioral prediction, the prediction accuracy (Pearson's correlation) of "openness (NEO)" improved from 0.19 to 0.26, while the accuracy (Pearson's correlation) of "vocabulary (picture matching)" improved from 0.36 to 0.39.
Thus, gMS-HBM might be more helpful for predicting certain behavioral measures than others. Further analysis suggested that task performance measures were on average predicted with higher accuracy than self-reported measures ( Figure 11). This differentiation between task performance and self-reported measures was consistent with previous investigations of RSFCbehavior relationships. For example, RSFC has been shown to predict cognition better than personality and mental health (Dubois et al., 2018;Chen et al., 2020). Dynamic functional connectivity is also more strongly associated with cognition and task performance than selfreported measures (Vidaurre et al., 2017;Liégeois et al., 2019). Finally, regressing the global signal has been shown to improve the prediction of task performance measures more than selfreported measures .

Spatially-localized individual-specific areal-level parcels
Postmortem studies have generally identified cortical areas that are spatially contiguous (Kaas, 1987;Felleman and Van Essen, 1991;Amunts and Zilles, 2015). This has motivated most resting-state areal-level parcellations to estimate spatially contiguous parcels (Shen et al., 2013;Honnorat et al., 2015;Chong et al., 2017). One approach to achieve spatially contiguous parcels is to introduce a spatial connectedness term into the optimization objective so that distributed parcels would have large penalty (Honnorat et al., 2015;. Another approach is to start with initial spatially contiguous parcels and to iteratively adjust the boundaries to maintain spatial contiguity (Blumensath et al., 2012;Chong et al., 2017;Salehi et al., 2019). Yet another method is to utilize the local-gradient approach, which detects sharp transitions in RSFC profiles, followed by a post-processing procedure . However, work from Glasser and colleagues suggested that some individual-specific areal-level parcels might comprise multiple spatially close components in some individuals .
Given the lack of consensus, we explored three MS-HBM variants in this study. We found that strictly contiguous cMS-HBM parcels achieved the best out-of-sample resting-state homogeneity and task inhomogeneity (Figures 6-8, S8-S10). One possible reason is that cMS-HBM parcellation boundaries were smoother than dMS-HBM and gMS-HBM parcellations.
Since fMRI data is spatially smooth, parcellations with smoother boundaries might have an inherent homogeneity advantage, without necessarily being better at capturing true areal boundaries. Another potential artefact of smooth data is the appearance of excessively round parcels that are at odds with histological studies which show that cortical areas express diverse spatial configurations.
Based on our geometric analyses, we found gMS-HBM to be most anatomically plausible among the three parcellations, having both fewer spatially disconnected components than dMS-HBM, and intermediate levels of roundness between dMS-HBM and cMS-HBM. Furthermore,

42
RSFC derived from gMS-HBM parcels achieved the best behavioral prediction performance, albeit not reaching statistical significance (Figures 9, S11; Tables S2-S5). As elaborated in previous studies , assessment of parcellations should integrate and weigh performance across multiple metrics. For the reasons outlined above, we prefer individual-specific gMS-HBM areal-level parcellations among the three MS-HBM variants.
Overall, our findings suggest that the brain's large-scale organization might potentially comprise certain functional regions that are spatially disconnected. Neuronal migration, guided by cell-to-cell interactions and gradients of diffusible cues, plays an important role in establishing the brain's complex cytoarchitectonic organization during embryogenesis (Silva et al., 2019). Spatially disconnected parcels might reflect functionally analogous neuronal populations from the same cellular lineage that separate due to natural variation in migration patterns in early development.
That said, we are aware that one cannot establish with certainty the existence of spatially disconnected cortical areas based on resting-fMRI data alone. It is possible that disconnected components of a non-contiguous parcel are inseparable by resting-fMRI measurements, but are separable by other neural properties, such as microstructure or task activations. Given that fMRI is an indirect measurement of neuronal signals, the functional coupling among disconnected components could also be driven by non-neural mechanisms (e.g., vasculature). Nevertheless, our individual-level areal parcellation provides an explicit model that can be further validated using prospectively acquired rs-fMRI paired with other approaches, e.g., post-mortem histological analyses (Xu et al., 2018;Hayashi et al., 2020) or with spatiallytargeted intracranial recording Fox et al., 2018).

Conclusions
We proposed a multi-session hierarchical Bayesian model (MS-HBM) that accounted for both inter-subject and intra-subject functional connectivity variability when estimating individualspecific areal-level parcellations. Three MS-HBM variants with different spatial localization priors were explored. Using 10 min of rs-fMRI data, individual-specific MS-HBM areal-level parcellations generalized better to out-of-sample rs-fMRI data from the same participants than a group-level parcellation approach and two prominent individual-specific areal-level parcellation approaches using 150 min of rs-fMRI data. Furthermore, RSFC derived from MS-HBM parcellations exhibited better behavioral prediction performance than alternative parcellation approaches.

Acknowledgment
We

Supplemental Material
This supplemental material is divided into Supplemental Methods and Supplemental Results to complement the Methods and Results sections in the main text, respectively.

Supplementary Methods
This section provides additional mathematical and implementation details of the multi-session hierarchical Bayesian model (MS-HBM). Section S1 provides mathematical details about the generative model. Section S2 describes the algorithms for estimating group-level priors and deriving the individual-specific parcellations and how "free" parameters of the model are set. Section S3 describes the matching algorithms for comparing MS-HBM parcellations with another parcellation approach. Section S4 describes the kernel ridge regression model for behavioral prediction.

S1. Mathematical model
In this section, we describe our model for individual-specific areal-level parcellation of the cerebral cortex. We assume a common surface coordinate system, where the cerebral cortex is where is the parcellation label at vertex of subject , and 〈 , 〉 denote inner product. , and are the mean direction and concentration parameter of the von Mises-Fisher distribution for parcel label of subject during session . 1: , are the mean directions for networks 1 to . We can think of , as the mean connectivity profile of network label normalized to unit length. If functional connectivity profile , is similar to mean connectivity profile , (i.e., 〈 , , , 〉 is big), then vertex is more likely to be assigned to parcel . The concentration parameter controls the variability of the functional connectivity profiles within parcel . A higher results in a lower dispersion (i.e., lower variance), which means that vertices belonging to the same parcel are more likely to possess functional connectivity profiles that are close to the mean connectivity profile of the parcel. is assumed to be the same for all parcels, subjects and sessions. Thus, we will use to indicate henceforth. Finally, ( ) is a normalization constant to ensure a valid probability distribution (Banerjee et al., 2005): where −1 2 −1 (•) is the modified Bessel function of the first kind with order −1 2 − 1.
Similar to our previous work , we modeled both inter-and intra-subject variability. To model intra-subject functional connectivity variability, we assume a conjugate prior on the subject-specific and session-specific mean connectivity profiles , , which turns out to also be a von Mises-Fisher distribution: where and are the mean direction and concentration parameter of the von Mises-Fisher distribution for parcel label of subject . We can think of as the individual-specific functional connectivity profile of parcel of subject . The concentration parameter controls how much the session-specific mean direction , of subject during session can deviate from the subject-specific mean direction . A higher would imply lower intra-subject functional connectivity variability across sessions. is network-specific but is assumed to be the same for all subjects.
To model inter-subject functional connectivity variability, we assume a conjugate prior on the subject-specific mean connectivity profiles , which is again a von Mises-Fisher distribution whose mean direction corresponded to the group-level mean direction : where and are the mean direction and concentration parameter of the von Mises-Fisher distribution for parcel label . We can think of as the group-level functional connectivity profile of parcel . The concentration parameter controls how much the individual-specific connectivity profile can deviate from the group-level connectivity profile . A higher would imply lower intersubject functional connectivity variability across subjects.
Because the functional connectivity profiles of individual subjects are generally very noisy, we impose a MRF prior on the hidden parcellation labels 1: where ( , ) is a normalization term (partition function) to ensure ( 1: ) is a valid probability distribution. log ( = | ) = log , is a singleton potential encouraging certain vertices to be associated with certain labels. ( , ) is a pairwise potential (Potts model) encouraging neighboring vertices to have the same parcellation labels: The parameters and are tunable parameters greater than zero, and control the tradeoffs between the various terms in the generative model. Assuming that = 1 and = 0, then , can be interpreted as the probability of label occurring at vertex of subject .
However, many parcels will be spatially distributed due to strong long-range RSFC.
Requiring parcels in a MRF framework to be spatially connected (with minimal other assumptions) is non-trivial (Nowozin and Lampert 2010;Honnorat et al. 2015). Here, spatial localization prior is imposed by Φ: ) , (7) where Φ( ) is a singleton potential encoding the spatial localization constraint. The parameters is a tunable parameter greater than zero. We consider three different spatial localization priors which will be explained in S1.1, S1.2, and S1.3.
In addition, it is well-known that early sensory and late motor architectonic areas (areas 3 and 4) are on either side of the central sulcus. Given low inter-subject variability of these areas (Fischl et al., 2008), we do not expect this property to be violated in individuals. Therefore, we additionally constrained the Schaefer2018 group-level parcels on either side of the central sulcus not to extend across the central sulcus in the individual-specific areal-level parcellations. Removal of this constraint resulted in parcels extending across the central sulcus, while not yielding any quantitative improvements, e.g., in behavioral prediction accuracies. Consequently, we decided to keep this constraint because the resulting parcellations were biologically more plausible.

S1.1 Distributed MS-HBM (dMS-HBM).
We considered a spatial localization prior that constrained each individual-specific parcel to be within 30mm of the group-level Schaefer2018 parcel boundaries (similar to . The spatial localization prior Φ = Φ is therefore defined as a × binary mask, where Φ , = 1 if the shortest geodesic distance between vertex and group-level parcel is less than (or equal to) 30 mm, else Φ , = 0. Thus, when Φ , = 0, then it is impossible for the vertex to be assigned to parcel (Eq. (7)).

S1.2 Contiguous MS-HBM (cMS-HBM).
We considered a spatial localization prior that encouraged brain locations with similar 3dimensional spherical coordinates to be grouped into the same parcel (similar to . Here, we denote the 3 × 1 spherical coordinates of vertex as . The spatial localization prior Φ( ) is then defined as Φ( ) = Φ , ( | = , 1: , 1: ) = Φ , ( | , ), where Φ , is the spatial localization constraint from dMS-HBM (S1.1) and ( | , ) follows a von Mises-Fisher distribution: where and are the mean direction and concentration parameter of the von Mises-Fisher distribution for parcel label of subject . We can think of as the mean spatial coordinates of parcel normalized to unit norm (i.e., sphere). Therefore, if vertex is spatially close to the mean spatial location of parcel (i.e., 〈 , 〉 is big), then vertex is more likely to be assigned to parcel .
Consequently, for large enough values of , the parcels will be spatially connected. In the optimization procedure, we start with a smaller value for and then iteratively increase for parcel to ensure spatial contiguity.

S1.3.1 Gradient-infused spatial localization prior
A well-known areal-level parcellation approach is the local gradient approach, which detects local abrupt changes (i.e., gradients) in RSFC across the cortex . Our previous study  has suggested the utility of combining local gradient  and global clustering ) approaches for estimating areal-level parcellations. Therefore, in the case of gradient-infused MS-HBM (gMS-HBM), the spatial contiguity prior in cMS-HBM is complemented with a prior based on RSFC gradient maps, which encourages brain locations with gentle changes in functional connectivity to be grouped into the same parcel.
For each subject , the RSFC gradient map is an × 1 vector, where is the number of vertices. High values indicate abrupt change in RSFC; low values indicate gentle change in RSFC. We define a graph where the vertices and edges correspond to the mesh structure of the surface mesh. For a particular subject , the distance between two adjacent vertices and is defined as ( + )/2. Based on this graph, we constructed an × gradient distance matrix , where each entry represents the shortest geodesic distance between each pair of vertices. If there is an abrupt change in RSFC (high ) somewhere along all paths linking two vertices and , then the shortest geodesic distance between and will be large, even though and are spatially close. By contrast, if there is a path linking vertex and vertex , such that RSFC changes are gentle (low ) along the entire path, then the shortest geodesic distance between and will be relatively small, except if vertices and are far apart.
Ideally, we would like to incorporate into our model: two vertices with high should be encouraged to be in different parcels. This will generally encourage spatially contiguous parcels since two vertices spatially far apart will generally have higher and thus will be encouraged to be in separate parcels. Because gradient-based parcellations tend to have irregularshaped parcels , we will also be encouraging spatial contiguity without relatively round parcels like in cMS-HBM. However, incorporating into our model is not easy because is an × distance matrix, which is not easily incorporated into our mixture-type model.
Furthermore, is also quite large (e.g., more than 30K), so dealing with an × matrix would slow down our algorithm.
Therefore, we applied the diffusion embedding algorithm (Coifman et al., 2005) to the × geodesic gradient distance matrix , thus embedding the vertices into a dimensional space.
Roughly speaking, the Euclidean distance between two vertices (based on their diffusion coordinates) should be similarly to the geodesic distance. The larger would lead to a better approximation, but with more computation costs. We found that = 100 leads to a sufficiently good approximation. More specifically, the correlation between and the Euclidean distance in the dimensional space was 0.84 when averaged across 40 random HCP subjects. In this way, for a particular subject , the information of the geodesic gradient distance matrix is embedded as a 1 × 100 diffusion coordinates for each vertex , which we denote as . If there is an abrupt change in RSFC (high ) somewhere along all paths linking two vertices and , then the Euclidean distance between their diffusion coordinates and will be large. By contrast, if there is a path linking vertex and vertex , such that RSFC changes are gentle (low ) along the entire path, then the Euclidean distance between their diffusion coordinates and will be relatively small, except if vertices and are far apart. The spatial localization prior Φ( ) is thus defined as Φ( ) = Φ , ( | = , 1: , 1: ) = Φ , ( | , ), where ( | , ) follows a Gaussian distribution: where is the mean and is the identity matrix. We can think of as the mean diffusion coordinates of parcel . Therefore, if vertex is spatially close to the mean diffusion coordinates of parcel (i.e., ( − ) ( − ) is small), then vertex is more likely to be assigned to parcel .
We note that the goal of the diffusion embedding was not to compute principal gradients (Margulies et al., 2016), but simply for dimensionality reduction to reduce computational costs and memory requirements when training and applying gMS-HBM. In addition, the computation of the local gradient , the computation of the gradient distance matrix and the diffusion embedding of also required significant computation costs and memory. Thus, in the following sections (Supplementary Methods S1.3.2 and Supplementary Methods S1.3.3), we discussed how to speed up these computations and reduce memory requirements.

S1.3.2 Reducing computational costs of gradient maps
We mentioned in Section S1.3.1 that the spatial localization prior in gMS-HBM was constructed based on the RSFC gradient maps. Figure SM1 (left panel) illustrates the original approach for generating gradient map for a particular subject . Given the resting-state fMRI data of a subject, the × RSFC matrix was first computed. Each row and each column of the RSFC matrix were then correlated to obtain an × RSFC similarity matrix, where the -th column represented the spatial correlation of functional connectivity patterns between vertex and all other vertices. By computing the first order spatial gradient for each column of the RSFC similarity matrix, we obtained the gradients of the RSFC similarity matrix, where high gradient values indicated rapid changes in RSFC, and low gradient values indicated gentle changes in RSFC.
After that, the watershed algorithm 1 was applied to each column of the gradient matrix to generate one binarized gradient map for each vertex. The binarized gradient maps were then averaged to obtain a 1 A gradient map can be thought of as a geological map with values representing the height of the land. Thus, low gradient values correspond to "valleys/bowls" and high gradient values correspond to "hills/peaks". The watershed algorithm (https://en.wikipedia.org/wiki/Watershed_(image_processing)) works by placing "seeds" at the minima of the gradient map. The seeds are then grown (imagine rain flooding the geological landscape) until the boundaries of grown seeds (parcels) meet other grown seeds (parcels). The boundary vertices between parcels are set to 1, while non-boundary vertices are set to 0, thus resulting in a binarized gradient map. single gradient map, where high values indicate high probability of a rapid change in RSFC patterns occurring at that location. Because this original version involved × matrices, it was computationally very expensive to compute. Figure SM1. Illustration of the procedure to generate a RSFC gradient map. Left panel shows the original version, right panel shows our speed-up version. In the case of fs_LR32k, note that = 59412. In the case of fsaverage6, note that = 74947. Therefore, we implemented a faster and less memory-intensive version by subsampling the functional connectivity matrices. Figure SM1 right panel illustrates the speed-up version for generating the gradient map for each participant. Instead of computing the × RSFC matrix to generate the RSFC similarity matrix, we randomly selected 1 and 2 vertices from vertices to compute the subsampled × 2 and 2 × 1 RSFC matrices. We correlated each row of the × 2 RSFC matrix and each column of the 2 × 1 RSFC matrix to obtain the subsampled × 1 RSFC similarity matrix. We then computed the spatial gradients of the subsampled RSFC similarity matrix and utilized the watershed algorithm to obtain 1 binarized gradient maps. The 1 binarized gradient maps were then averaged to obtain a single gradient map. Furthermore, to reduce the memory usage, we split the × 1 RSFC similarity matrix into multiple small blocks (without doing any subsampling), where each block of the RSFC similarity matrix was generated sequentially.
In the case of fs_LR32k, on our machine, the original algorithm (working with the full 59412 × 59412 RSFC matrix) required 40GB of RAM and 4 hours using 1 CPU core to compute the gradient map for a single rs-fMRI run. On the same machine, the speed-up version ( 1 = 297 and 2 = 5941) only required 3GB of RAM and 15 minutes with 1 core. The resulting gradient maps were highly similar to the original gradient maps (r = 0.97 averaged across 40 random participants). In the case of fsaverage6, on our machine, the original algorithm (working with the full 74947 × 74947 RSFC matrix) required 60GB of RAM and 6 hours using 1 CPU core to compute the gradient map for a single rs-fMRI run. On the same machine, the speed-up version ( 1 = 375 and 2 = 7495) only required 4GB of RAM and 20 minutes with 1 core. The resulting gradient maps were highly similar to the original gradient maps (r = 0.97 averaged across 40 random participants).

S1.3.3 Reducing computational costs of computing the diffusion embedding matrix
In Section S1.3.1, we mentioned that the RSFC gradient map was utilized to generate an × geodesic gradient distance matrices. We then applied the diffusion embedding algorithm (Margulies et al., 2016) to reduce the dimensionality to 100 diffusion coordinates. The resulting × 100 diffusion embedding matrix was utilized for the spatial localization prior in Eq. (10).
However, this approach involved × geodesic gradient distance matrices, so it was also computationally expensive.
To overcome this issue, we downsampled the × 1 RSFC gradient map from S1.3.2 to a lower resolution mesh with 3 vertices. The resulting 3 × 1 downsampled gradient map was then used to generate 3 × 3 geodesic gradient distance matrices. The diffusion embedding algorithm was performed on the 3 × 3 gradient distance matrix to obtain an 3 × 100 diffusion embedding matrix.
After that, we upsampled the 3 × 100 diffusion embedding matrix back to the original space to generate an × 100 diffusion embedding matrix.
In the case of fs_LR32k, on our machine, the original diffusion embedding algorithm (working with the full 59412 × 59412 geodesic gradient distance matrices) required 6GB of RAM and 2 hours using 1 core to convert the × 1 RSFC gradient map (Section S1.3.2) to an 59412 × 100 diffusion embedding matrix. On the same machine, the sped-up diffusion embedding algorithm ( 3 = 20484) only required 3GB of RAM and 10 minutes with 1 core. The resulting diffusion embedding matrices were highly similar to the original diffusion embedding matrices computed using the original gradient algorithm and original diffusion embedding algorithm (r = 0.98 averaged across 40 random participants). In the case of fsaverage6, on our machine, the original diffusion embedding algorithm (working with the full 74947 × 74947 geodesic gradient distance matrices) required 8GB of RAM and 2.5 hours using 1 core to convert the 74947 × 1 RSFC gradient map (Section S1.3.2) to an 74947 × 100 diffusion embedding matrix. On the same machine, the sped-up diffusion embedding algorithm ( 3 = 25924) only required 3GB of RAM and 10 minutes with 1 core. The resulting diffusion embedding matrices were highly similar to the original diffusion embedding matrices computed using the original gradient algorithm and original diffusion embedding algorithm (r = 0.97 averaged across 40 random participants).

S2. Model estimation
In this section, we describe how model parameters are estimated from a training set and a validation set (Section S2.1), and how the parameters can be used to parcellate a new subject (Section S2.2). Throughout the entire section, we assume that the number of parcels = 400 without loss of generality.

S2.1 Learning model parameters
Our goal is to estimate the model parameters { 1: , 1: , 1: ,1: , 1: , , , } from a training set and a validation set of binarized and normalized functional connectivity profiles, which can then be utilized for estimating individual-specific parcellations in unseen data of new subjects (Section S2.2).
By using the constraints that 〈 , , , 〉 = 1, 〈 , 〉 = 1, 〈 , 〉 = 1, > 0, > 0, > 0, and differentiating ℒ( , ) with respect to 1: , 1: , , 1: 1: 1: , 1: 1: ,1: , 1: ,1: , and setting the derivatives to zero, we get the following update equations: where is the length of , (i.e., number of ROIs in each functional connectivity profile), is the number of subjects, is the number of sessions, and ‖•‖ corresponds to the 2 -norm. Therefore, the estimate of the functional connectivity profile , (Eq. (26) ) and the subject-specific mean direction , with weights and for each term, normalized to be unit norm. If is much greater than , then , is more likely to be dominated by subject-specific mean direction , which means that the functional connectivity profile of parcel is highly stable across sessions. Similarly, the estimate of the functional connectivity profile (Eq. (28)) of parcel of subject is the weighted sum of the average session-specific mean directions across all sessions for parcel of subject (∑ , =1 ) and the group-level mean direction , with weights and for each term, normalized to be unit norm. If is much greater than , then is more likely to be dominated by group-level mean direction , which means that the functional connectivity profile of parcel is highly stable between subjects. Finally, the estimate of the grouplevel functional connectivity profile (Eq. (30)) of parcel is the sum of the subject-specific mean directions across all subjects for parcel (∑ =1 ), normalized to be unit norm. The estimate of , (Eq. (32)) is the posterior probability of parcel being assigned to vertex , averaged across all the subjects.
Given the training set, the algorithm first utilizes the 400-region Schaefer2018 group-level parcellation  to initialize the EM algorithm. The EM algorithm iterates E-step (Eq. (24)) and M-step (Eqs. (26-32)) till convergence. We note that the update equations  in the M-step are dependent on each other. Therefore, within the M-step, the update equations (Eqs. (26-32) are iterated till convergence.
Throughout the paper (main text), the reported quality (Figures 6-11) of the individualspecific parcellations was evaluated using subjects not used to tune the parameters. For example, in the case of the HCP data ( Figures 6-11 Assuming a uniform (improper) prior on , and by introducing the parcellation labels 1: of the new subject as latent variables, the lower bound ℒ( , ) of the MAP problem (Eq. (40)) can be written as: where equality is achieved when is the posterior probability of the individual-specific parcellation of subject given the parameters . Similar to Section S2.1.1, we can maximize the lower bound (Eq. More specifically, in the variational E-step, is fixed and is estimated as follows: Once the VBEM algorithm converges, vertex of subject will be assigned to label with the highest (approximate) posterior probability.

S3. Matching algorithm for comparison with Laumann2015
We mentioned in the main text that parcellations estimated by Laumann2015 Gordon et al., 2017) had a variable number of parcels across subjects. Furthermore, Laumann2015 parcellations also had a significant number of vertices between parcels that were not assigned to any parcel, which had the effect of increasing resting homogeneity and decreasing task inhomogeneity. Therefore, we performed a post-hoc processing of MS-HBM parcellations to match the number of parcels and unlabeled vertices of Laumann2015 parcellations.
Since Laumann2015 individual-specific parcellations typically have around 500-600 parcels, we utilized 600-region Schaefer2018 group-level parcellation to initialize MS-HBM algorithms to generate 600-region MS-HBM parcellations. For each subject, we sort the 600 MS-HBM parcels based on parcel size and for parcels with the same size, we sort them based on resting-state homogeneity.
Assume Laumann2015 parcellation has M parcels, we find the 600-M MS-HBM parcels with the smallest size and the lowest resting-state homogeneity. For each vertex within these 600-M parcels, we re-assigned it to the neighboring parcel with the highest similarity , = If W was more than 2 ℎ , we first removed the labels from all the vertices within the 2-vertex thick boundary, and then remove the labels from the remaining W-2 ℎ vertices, which were the nearest neighbors of the 2-vertex thick boundary vertices with the lowest resting-state homogeneity. In this way, we can match the MS-HBM parcellations to the Laumann2015 parcellations with the same number of parcels and the same number of unlabeled vertices.

S4. Behavioral prediction model
In this section, we describe our model for behavioral prediction based on individual differences in the functional connectivity. Kernel regression (Murphy et al., 2012) was utilized to predict each behavioral phenotype in individual subjects. The derivations of our behavioral prediction model has been shown in our previous work . Here, we include the derivation again for completeness.
Suppose we have training subjects, is the behavioral measure (e.g., fluid intelligence) and is the 400  400 RSFC matrix generated by individual-specific parcellation of the -th training subject. Given { 1 , 2 , … , } and { 1 , 2 , … , }, the kernel regression model will be: where ( Differentiating Eq. (53) with respect to , we can get To predict the behavior measure (e.g., fluid intelligence) of a test subject with its RSFC matrix , we can compute ( , ), which is the Pearson correlation between RSFC matrix of subject and -th training subject. The predicted behavior measure can be calculated as where is estimated by Eq. (53). If we denote = [ ( 1 , ), ( 2 , ), … , ( , )], then Eq. (55) can be written as: In practice, −1 is a symmetric matrix whose diagonal elements are roughly the same and ~100 times larger than the off-diagonal elements. Therefore, the predicted behavior measure can be seen as the weighted average of the behaviors of the training subjects: ≈ = ∑ ( , ) =1 . If the RSFC matrix of test subject is more similar to the RSFC matrix of training subject than training subject , then weight ( , ) will be larger than ( , ), and so will be more similar to than .
To reduce overfitting, an 2 -regularization term (i.e., kernel ridge regression) is typically added to cost function (Eq. (53)), resulting in a new regularized cost function: where is a tuning parameter, which controls the importance of the regularization term.
To predict the behavior measure (e.g., fluid intelligence) of a test subject , Eq. (58)

Supplemental Results
Figure S1. Sensory-motor parcels exhibit lower inter-subject, but higher intra-subject, functional connectivity variability than association cortical parcels for in the HCP training set. (A) 400region Schaefer2018 group-level parcellation. (B) Inter-subject resting-state functional connectivity variability for different parcels. (C) Intra-subject resting-state functional connectivity variability for different parcels. Note that (B) and (C) correspond to the and parameters in Figure 1, where higher values indicate lower variability. Figure S2. 400-region individual-specific areal-level parcellations were estimated using rs-fMRI data from day 1 and day 2 separately for each participant from HCP test set. Right hemisphere parcellations of four representative participants are shown here. Left hemisphere parcellations are shown in Figure 2C.        Figure S11. MS-HBM parcellations achieved better task inhomogeneity in the HCP dataset. (A) 400-region individual-specific parcellations were estimated using all resting-state fMRI data. Task inhomogeneity was evaluated in the task data. Task inhomogeneity was then defined as the standard deviation of task activation within each parcel, and then averaged across all parcels and contrasts within each behavioral domain. Lower value indicates better task inhomogeneity.    Table S2. Average prediction accuracies (Pearson's correlation) for all 58 behavioral measures and 36 behaviors with accuracies higher than 0.1 for at least one approach ("36 behaviors > 0.1") across different parcellation approaches. Prediction was based on individual-specific functional connectivity. The mean accuracy and standard deviation were calculated across 100 20-fold cross-validations. The percentage improvement and number of cross-validations that MS-HBM algorithms outperform Schaefer2018 and Li2019 across 100 20-fold cross-validations (shown in brackets) were reported. Red font indicates statistical significance after correcting for multiple comparisons with false discovery rate q < 0.05. Table S3. Average prediction accuracies (Pearson's correlation) for 13 cognitive measures, NEO-5 personality measures and emotional measures across different parcellation approaches. Prediction was based on individual-specific functional connectivity. The mean accuracy and standard deviation were calculated across 100 20-fold cross-validations. The percentage improvement and number of cross-validations that MS-HBM algorithms outperform Schaefer2018 and Li2019 across 100 20-fold cross-validations (shown in brackets) were reported. Red font indicates statistical significance after correcting for multiple comparisons with false discovery rate q < 0.05. Table S4. Average prediction accuracies (COD) for all 58 behavioral measures and 36 behaviors with accuracies higher than 0.1 for at least one approach ("36 behaviors > 0.1") across different parcellation approaches. Prediction was based on individual-specific functional connectivity. The mean accuracy and standard deviation were calculated across 100 20-fold cross-validations. The percentage improvement and number of cross-validations that MS-HBM algorithms outperform Schaefer2018 and Li2019 across 100 20-fold cross-validations (shown in brackets) were reported. Red font indicates statistical significance after correcting for multiple comparisons with false discovery rate q < 0.05. Table S5. Average prediction accuracies (COD) for 13 cognitive measures, NEO-5 personality measures and emotional measures across different parcellation approaches. Prediction was based on individual-specific functional connectivity. The mean accuracy and standard deviation were calculated across 100 20-fold cross-validations. The percentage improvement and number of cross-validations that MS-HBM algorithms outperform Schaefer2018 and Li2019 across 100 20fold cross-validations (shown in brackets) were reported. Red font indicates statistical significance after correcting for multiple comparisons with false discovery rate q < 0.05.