Asymmetry in marginal population performance foreshadows widespread species range shifts

Aim: Range shifts are expected to occur when populations at one range margin perform better than those at the other margin, yet no global trend in population performance at range margins has been demonstrated empirically across a wide range of taxa and biomes. Here we test the prediction that, if impacts of ongoing climate change on population performance are widespread, then populations from the high-latitude margin (HLM) should perform as well as or better than central populations, whereas populations at the low-latitude margin (LLM) should perform worse.
Location: Global.
Time period: 1898–2020.
Major taxa studied: Plants and animals.
Methods: To test our prediction, we used a meta-analysis quantifying the empirical support for asymmetry in the performance of high- and low-latitude margin populations compared to central populations. Performance estimates were derived from 51 papers involving 113 margin-centre comparisons from 54 species and 705 populations. We then related these performance differences to climatic differences among populations. We also tested whether patterns are consistent across taxonomic kingdoms (plants vs. animals) and across habitats (marine vs. terrestrial).
Results: Populations at margins performed significantly worse than central populations and this trend was primarily driven by the low-latitude margin. Although the difference was of small magnitude, it was largely consistent across biological kingdoms and habitats. The differences in performance were positively related to the difference in average temperatures between populations during the period 1985–2016.
Main conclusions: The observed asymmetry in marginal population performance confirms predictions about the effects of global climate change. It indicates that changes in demographic rates in marginal populations, despite extensive short-term variation, can serve as early-warning signals of impending range shifts.

existing disequilibrium of species ranges with climate and hence the propensity of species to shift their range. Such knowledge is crucial to accurately forecast future climate-driven range shifts 6,7 and changes in ecosystem functioning, and for informing resource and conservation planning.
Changes in the performance of marginal populations should represent a much more direct and immediate indicator of species' response to climate warming than the more widely monitored distribution changes 8,9 because range limits can also be constrained by diverse nonclimatic factors such as habitat availability, dispersal limitation or biotic interactions 10-13. Even when range limits are directly determined by contemporary climate, the effects of climate on population dynamics might be difficult to detect except in meteorologically extreme years. Detailed observations of marginal population dynamics are rare, especially for populations at contracting range margins 14,15. The scant empirical evidence currently prevents wide-ranging comparisons of population dynamics at expanding and retreating range edges.
Here, we use the abundant empirical literature spawned by the so-called centre-periphery (CP) paradigm to examine differences in performance between range centres and high- and low-latitude margins for a wide range of taxa. The CP hypothesis states that the size, density and long-term growth rate of populations tend to decrease from the centre towards the periphery of the range as environmental conditions become increasingly less favourable 4,16,17 (Fig. 1). The CP paradigm has motivated hundreds of empirical studies that have compared various indicators of population performance (including measures of individual survival or fecundity, population viability and others) in central and marginal populations 13. We use a comprehensive sample of published studies to compare measures of population performance in sites located at the centre and at the high-latitude margins (HLM) or low-latitude margins (LLM) of species ranges 18. We predict that if impacts of ongoing climate change on population performance are widespread, then HLM populations should perform as well as or better than central populations, whereas LLM populations should perform worse (Fig. 1). To test this prediction, we quantify the empirical support for this hypothesized asymmetry in the performance of HLM and LLM populations compared to central populations, and test whether patterns are consistent across taxonomic kingdoms (plants vs. animals) and across habitats (marine vs. terrestrial). We also predict that if climate is an ultimate driver of population performance, then performance differences should increase with the difference in climate between central and marginal populations (Fig. 1). To test this prediction, we relate the observed differences in performance between central and peripheral populations to the actual differences in climate.
We searched the scientific literature for peer-reviewed publications published by 23 May 2018 using keywords related to CP comparisons of population performance, retaining papers that provided data for at least two populations from the range centre and two populations from one latitudinal range margin (HLM or LLM) in the species' natural environment (Supplementary Material S1). We only considered primary papers reporting demographic performance metrics that could clearly be identified as estimators of individual fecundity, survival, or lifetime fitness. We identified 42 papers that fulfilled our criteria, involving 96 CP comparisons (HLM: n = 58, LLM: n = 38) and 623 populations (Fig. S1). To compare performance in central vs. marginal populations, we conducted a multi-level meta-analysis using Hedges' d effect sizes for a standardized comparison. We modelled heterogeneity among effect sizes using margin type (HLM vs. LLM), kingdom (animals vs. plants) and habitat (marine vs. terrestrial) as moderators (Fig. S2).
bioRxiv preprint doi: https://doi.org/10.1101/529560; this version posted January 24, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.
Grand mean effect size was negative (-0.37; 95% CI: -0.71, -0.04), meaning that marginal populations on average performed worse than central populations. There was a significant amount of total heterogeneity, with 61% of it arising from among-study heterogeneity (τ² = 1.49, I² = 0.61, Q_E = 289.27, P < 0.0001). Performance declined from the range centre towards the LLM (-1.07; 95% CI: -1.67, -0.47; estimated from the model with Margin as the sole moderator) but not towards the HLM (-0.14; 95% CI: -0.64, 0.36) (Fig. 2). Thus, HLM populations performed, overall, similarly to central populations. Margin type was the most important moderator (w_H = 0.96), although the best model explained only 4% of the total heterogeneity (HLM-LLM difference: z = -2.69, P = 0.007). Residual heterogeneity (best model: Q_E = 260.63, P < 0.0001) was explained neither by habitat (w_H = 0.69; difference between marine and terrestrial habitats in the best model: z = -1.55, P = 0.121) nor by kingdom (w_H = 0.54; difference between animals and plants in the best model: z = 1.33, P = 0.184) (Fig. S3; Table S1).
The differences in performance between marginal and central populations were significantly related (P = 0.015) to the difference in their average temperature in the period 1990-2013 (Table S3; total deviance explained by an additive mixed model: 24.9%). As predicted, performance decreased as temperatures departed increasingly from those of central populations, although the decline was considerably stronger in LLM than in HLM populations (Fig. 3). Thus, HLM populations experiencing temperatures 5 °C colder than central populations have similar fitness, whereas LLM populations experiencing temperatures 5 °C warmer perform worse (Fig. 3). These differences in performance were not related to geographical distance between marginal and central populations (Fig. S4).
Overall, our results show that populations from the centre of the range tend to outperform those residing at the LLM but not those at the HLM. Such latitudinal asymmetry is predicted when the environmental conditions relevant for population performance are directionally displaced (Fig. 1) 5. Global warming has provoked a rapid large-scale poleward displacement of climatic zones since the 1970s, and the trend is predicted to further accelerate through the coming decades 19. In contrast to range shifts, changes in population performance in response to environmental or climate change are expected to occur with little or no time lag. The observed difference is therefore likely to result largely from ongoing climate change, although we cannot exclude effects of changes in factors unrelated to current climate 10,11,20. We thoroughly searched the literature for reports on range dynamics for our target species and detected evidence for 9 cases (6 species); all of them showed asymmetric population performance (see Table 1) and all are experiencing poleward range shifts. Although limited, this evidence suggests that demographic rates could act as early warning signals of impending range shifts.
The type of range margin (HLM or LLM) explained only 4% of the overall variation in the relative performance of marginal populations. This is unsurprising given the great variety of organisms, response variables, and ecological contexts considered in our analysis. In addition, most primary studies reported only short-term data that are likely to stem from meteorologically 'normal' years, whereas range shifts might primarily be catalysed by extreme years 21. Finally, performance at some specific life stages is not necessarily a reliable predictor of lifetime fitness and population growth rates 12,22. Despite these limitations, the type of range margin was the main predictor of performance in marginal populations.
Our findings suggest that latitudinal asymmetries exist worldwide, for animals as well as plants, and for terrestrial as well as marine species (Fig. S1, S2). The pervasive nature of the phenomenon is all the more striking as climatic constraints and the responses of populations differ greatly between groups of organisms. For instance, plants generally tend to have a greater capacity than animals to buffer climatic stress through phenotypic plasticity and persistent life cycle stages 22, which would allow them to slow population declines and accumulate a higher extinction debt 23,24. Moreover, climate is shifting at a different pace in marine and terrestrial environments, with median temperatures increasing more than three times faster on land than at sea 25. Water temperature and related properties drive population dynamics of marine species, whereas many LLM populations of terrestrial species are primarily constrained by water balance 26. This difference may also explain why marine ectothermic animals tend to more fully occupy the latitudinal ranges situated within their thermal tolerance limits than terrestrial ectotherms, which are commonly absent in the warmest parts of their potential range 27. Even these important differences between organisms and environments do not blur the effect of the range margin as the most consistent predictor of population performance. Given that differences in population performance can represent a powerful early indicator of impending range shifts 3,5, our results indicate that many extant species ranges are not in equilibrium with current climates even though they have not yet experienced perceptible shifts. Considering empirical fitness trends in marginal populations will substantially increase the realism of population-based approaches to species distribution modelling 28,29. Given that latitudinal range shifts are likely to be ongoing or impending for many species, such improved predictive capacity is needed if we are to forecast their implications for biodiversity and ecosystem function.

S1. General methods
Data compilation: We searched the Thomson Web of Knowledge ® and Scopus until 23 May 2018 for publications in peer-reviewed international scientific journals using key search terms in the title or the abstract. In addition, we searched Google Scholar using the same terms in the whole text of articles and restricting our selection to the first 200 references. The terms 'centre-periphery', 'central-marginal', 'abundant centre', and 'latitudinal cline' were introduced in combination with performance-related terms including 'fecundity', 'performance', 'survival', 'recruitment' and 'population growth rate'. We identified additional papers by searching the literature cited sections of these articles.
Selection criteria and data collection: Three filters were applied to the obtained collection of primary papers. First, we only considered studies reporting field data from natural populations (including control populations of transplant experiments if these were measured at their home sites and met all other criteria). Second, we only considered studies with at least two central and two peripheral populations (i.e., true replicates). Third, we only considered papers that provided sufficiently clear criteria for the definition of central and peripheral range parts relative to the global range of the target species. This filtering procedure resulted in a total of 42 retained primary papers with 96 CP comparisons of 44 species including woody plants (17%), herbs (45%), different invertebrates (27%), birds (6%), and reptiles (5%), with 31 (70.5%) being terrestrial and 13 (29.5%) marine organisms. The workflow and output of our compilation and selection process are described in detail in Supplementary Information S2.
We extracted the reported performance metrics from each primary paper and assigned them to one of the following categories: (i) 'Survival' (e.g., mortality of individuals or ramets, rates of fruit abortion or germination), (ii) 'Reproduction' (e.g., proportion of actively reproducing individuals, seed number, gonadal mass, total seed or egg mass), or (iii) 'Lifetime fitness' (different estimates of population growth rate). Moreover, we assigned each case study to one of two major categories of taxonomic status (plants vs. animals) and habitat (terrestrial vs. marine). Two major kinds of papers provided suitable information: (i) explicit CP comparisons of mean performance values from populations classified as central or marginal by the authors, and (ii) papers reporting on latitudinal clines. In the first case, we followed the criteria of the original authors for classifying populations as central or marginal. In the second case, we selected the three most central and the three most marginal populations along the gradient (rarely more if several populations were located closely together). We extracted quantitative data for our target metrics either manually from text and tables or from figures with Dagra digitizing software version 2.0.12 81. We recorded average values for each individual population (Fig. S1) and pooled them subsequently to calculate the average performance, sample size and resulting standard deviation for C, HLM and LLM, respectively.
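Effect sizes: We used Hedges' d statistic as our standardised measure of effect size. Hedges' d is the most appropriate effect size to compare raw means when both positive and negative values are present in the data 81. Hedges' d was calculated in its standard form as:
$$d = \frac{\bar{X}_{margin} - \bar{X}_{center}}{S}\,J, \qquad S = \sqrt{\frac{(n_{margin}-1)\,s_{margin}^{2} + (n_{center}-1)\,s_{center}^{2}}{N-2}}, \qquad J = 1 - \frac{3}{4(N-2)-1}$$
$$v_d = \frac{N}{n_{center}\,n_{margin}} + \frac{d^{2}}{2N}$$
where $\bar{X}$, $s$ and $n$ are the mean performance, standard deviation and number of populations of the marginal and central groups, $N = n_{center} + n_{margin}$ is the pooled sample size, $S$ is the pooled standard deviation and $J$ corrects for small-sample bias. Note that $v_d$ contains information about both the sample size and the standard deviation (within $d^{2}$) of the original studies; it hence can be used to weight the relative importance of studies within the meta-analysis (see also Fig. 2). In some papers, both HLM populations and LLM populations were compared to the same central populations, resulting in an overestimated pooled sample size because, for such primary papers, n_center is counted twice. We manually corrected N in all such cases before conducting the analysis.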
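As an illustration of the effect-size step, the following Python sketch computes Hedges' d and its sampling variance for one margin-centre comparison. It assumes the standard formulation with the small-sample correction J and a margin-minus-centre sign convention (so that negative values mean poorer performance at the margin, as in Fig. 2); the function name and the input numbers are hypothetical and are not taken from the authors' data or code.

```python
import math

def hedges_d(mean_margin, sd_margin, n_margin, mean_centre, sd_centre, n_centre):
    """Return (d, v_d) for a margin-minus-centre comparison.

    Assumes the standard Hedges' d with small-sample correction J;
    n_* are numbers of populations per group (each must be >= 2).
    """
    df = n_margin + n_centre - 2
    # Pooled standard deviation of the two groups
    pooled_sd = math.sqrt(
        ((n_margin - 1) * sd_margin ** 2 + (n_centre - 1) * sd_centre ** 2) / df
    )
    # Small-sample bias correction
    j = 1 - 3 / (4 * df - 1)
    d = (mean_margin - mean_centre) / pooled_sd * j
    n_total = n_margin + n_centre
    # Sampling variance: depends on sample sizes and on d itself
    v_d = n_total / (n_margin * n_centre) + d ** 2 / (2 * n_total)
    return d, v_d

# Hypothetical example: three marginal and three central populations
d, v_d = hedges_d(8.0, 2.0, 3, 10.0, 2.0, 3)
print(round(d, 3), round(v_d, 3))
```

With so few populations per group the correction J is far from 1 (0.8 in this example), which is why the corrected statistic is preferred over the raw standardised mean difference for small studies.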
Meta-analytical models: Our dataset had a hierarchical structure as some primary papers contained several case studies. We accounted for this potential non-independence of cases by estimating model heterogeneity from multiple sources: (i) among true effect sizes, (ii) among CP comparisons stemming from the same primary papers (by computing the variance-covariance matrix among all effect sizes) and (iii) among groups of moderators. This was done using multilevel error meta-analysis 82 with the rma.mv function of the R package metafor v. 2.0-0 83,84.
Primary paper identity was declared as a random factor and individual CP comparisons were nested as a random factor within primary papers. We estimated variance components for primary papers (σ₁²) and case studies (σ₂²) together with intra-class correlations (ρ), that is, correlations between true effect sizes from the same study (such that ρ = σ₁² / (σ₁² + σ₂²)).
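The intra-class correlation defined above is a simple ratio of variance components. A minimal Python sketch, using hypothetical variance components rather than the estimates from the fitted models, is:

```python
def intraclass_correlation(sigma2_paper, sigma2_case):
    """rho = sigma1^2 / (sigma1^2 + sigma2^2).

    sigma2_paper: between-paper variance component (sigma1^2)
    sigma2_case:  between-case variance component within papers (sigma2^2)
    """
    total = sigma2_paper + sigma2_case
    if total <= 0:
        raise ValueError("variance components must sum to a positive value")
    return sigma2_paper / total

# Hypothetical variance components, for illustration only
rho = intraclass_correlation(0.9, 0.6)
print(round(rho, 2))
```

A value of ρ close to 1 would mean that effect sizes from the same primary paper are nearly interchangeable, whereas a value close to 0 would mean that they behave almost like independent studies.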
We first calculated the grand mean effect size as the overall weighted mean across all effect sizes 85. This corresponded to a random-effects meta-analysis, where heterogeneity among true effect sizes (τ²) is used to weight individual effect sizes (weight = 1/(v + τ²)), which allows inferences for CP comparisons not included in the analysis. Then, we used multi-level (hierarchical) meta-analyses to test the effect of three moderators: Margin (HLM vs. LLM), Kingdom (animals vs. plants) and Habitat (marine vs. terrestrial). We built a set of the 17 possible models including all possible combinations of simple effects (n = 7 models) and two-way interactions among Margin, Kingdom and Habitat (n = 10 models). We ranked these 17 models plus the null model (i.e., intercept only) according to their AICc using the R package glmulti v. 1.0.7 86. For each model, we calculated ΔAICc and AICc weight (w_i). Models within ΔAICc < 2 are typically considered competing best models, given the model set and the data (Table S1). AICc weights represent the probability that a given model is selected as the best model. For each moderator, we then estimated its relative importance (w_H) by summing all w_i of the models including this moderator (w_H = Σw_i); w_H can be interpreted as the probability that a given moderator is included in the best model (Fig. S3). Finally, we estimated model parameters for all competing models with ΔAICc < 2. We report model parameter estimates for the best model and, whenever necessary, for competing models (Table S2).
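The random-effects weighting (weight = 1/(v + τ²)) can be made concrete with a short Python sketch. This is not the authors' analysis, which used rma.mv in R: as a simplifying assumption the sketch estimates τ² with the DerSimonian-Laird moment estimator and ignores the nesting of cases within papers. All effect sizes and variances are hypothetical.

```python
import math

def random_effects_mean(effects, variances):
    """Random-effects pooled mean using the DerSimonian-Laird tau^2.

    Returns (mean, standard_error, tau2, q) where q is Cochran's Q.
    """
    k = len(effects)
    w = [1 / v for v in variances]                 # fixed-effect weights
    sum_w = sum(w)
    fixed_mean = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed-effect mean
    q = sum(wi * (yi - fixed_mean) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)             # truncated at zero
    w_re = [1 / (v + tau2) for v in variances]     # weight = 1 / (v + tau^2)
    mean = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return mean, se, tau2, q

# Hypothetical Hedges' d values and their sampling variances
effects = [-1.0, 0.0, -0.5]
variances = [0.1, 0.1, 0.1]
mean, se, tau2, q = random_effects_mean(effects, variances)
ci = (mean - 1.96 * se, mean + 1.96 * se)          # approximate 95% CI
print(round(mean, 3), round(tau2, 3), [round(x, 3) for x in ci])
```

Because τ² is added to every sampling variance, precise and imprecise studies receive more similar weights than in a fixed-effect analysis, and the confidence interval widens accordingly.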
Publication bias: See the Supplementary Information 2 for further details on the meta-analysis, including several assessments of its reliability (e.g., publication bias, balanced representation of moderators, etc.) (Fig. S5, S6).
Collection of climate data: We gathered the geographical coordinates of all populations included in the meta-analysis from the primary papers (n = 623 populations; see map in Fig. S1). For each population, we calculated the average annual temperature between 1990 and 2013 (when most studies were performed) based on monthly temperature data, from CRU TS 3.22 87 for terrestrial species and HadISST 1.1 88 for marine species. We then aggregated populations to calculate average temperatures for each combination of study, species, performance variable, and region (either central, HLM, or LLM). We could then relate each comparison of performance between a margin (HLM or LLM) and the central range (i.e., Hedges' d) with the difference in average temperatures between both regions.
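The aggregation described above can be sketched in Python as follows. The records, the region codes ("C", "HLM", "LLM") and the margin-minus-centre sign convention are assumptions made for illustration; the real temperatures came from the CRU TS and HadISST products, which are not read here.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical per-population records:
# (study, species, metric, region, mean annual temperature in deg C)
records = [
    ("paper_A", "sp1", "survival", "C", 12.0),
    ("paper_A", "sp1", "survival", "C", 13.0),
    ("paper_A", "sp1", "survival", "HLM", 7.0),
    ("paper_A", "sp1", "survival", "HLM", 8.0),
    ("paper_A", "sp1", "survival", "LLM", 17.0),
    ("paper_A", "sp1", "survival", "LLM", 18.0),
]

def regional_means(rows):
    """Average temperature per (study, species, metric, region)."""
    groups = defaultdict(list)
    for study, species, metric, region, temp in rows:
        groups[(study, species, metric, region)].append(temp)
    return {key: mean(values) for key, values in groups.items()}

def temperature_differences(rows):
    """Margin-minus-centre temperature difference for each comparison."""
    means = regional_means(rows)
    diffs = {}
    for (study, species, metric, region), temp in means.items():
        if region == "C":
            continue
        centre = means.get((study, species, metric, "C"))
        if centre is not None:  # skip margins without a matching centre
            diffs[(study, species, metric, region)] = temp - centre
    return diffs

diffs = temperature_differences(records)
for key, value in sorted(diffs.items()):
    print(key, value)
```

Each resulting difference pairs with one Hedges' d value, so every margin-centre comparison contributes a single point to the climate analysis.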
Analysis of relationships between climate and population performance: We decided to compare average long-term temperatures among regions, rather than warming trends, as the former can be estimated more accurately and precisely at the scale of this study. Similarly, although precipitation might also be an important climatic variable for some terrestrial species, we decided to focus on temperature only due to the limited sample size available to fit our models. To assess the relationship between the differences in performance and the differences in climate between marginal and central populations, we used additive mixed models (function gam in the R package mgcv, version 1.8-17 89) using the temperature differences as predictor, and the study as random effect (to control for lack of independence). We weighted performance effect sizes by their variances so that their influence in model calibration was inversely related to their uncertainty (see Supplementary Information S4 for further details).

S2. Bibliographic compilation
We searched the ISI Web of Science (WOS) and Scopus until 23 May 2018 for papers containing adequate data for our study. The terms "centre AND periphery", "central AND marginal", "abundant centre" and "latitudinal cline" were introduced in combination with performance-related terms including "fecundity", "survival", "recruitment" and "population growth rate". We restricted our search to the categories Environmental Sciences-Ecology, Plant Sciences, Zoology, Entomology, Marine and Freshwater Biology, Biodiversity Conservation, Agriculture, and Forestry in WOS ("Theme" as the search field) and Agricultural and Biological Sciences and Environmental Science in Scopus ("Article title, abstract and keywords" as search fields). After discarding papers that were clearly out of scope, we retained 54 papers from WOS and 41 papers from Scopus. We performed an additional search with Google Scholar (which tends to generate a larger number of papers but lacks specific tools for search refinement) by combining the search terms with the term "ecology" and restricting our screening to the first 200 papers found. By this procedure we found 66 papers. After removing duplicates from the three sources, we arrived at a joint list of 98 papers. Then we screened their abstracts or, when necessary, the main text of the articles to select only those papers fulfilling our criteria: (1) we only considered studies reporting field data from natural populations (including control populations of transplant experiments if these were measured at their home sites and met all other criteria); (2) we only considered studies with at least two central and two peripheral populations (i.e., true replicates); and (3) we only considered papers that provided sufficiently clear criteria for the definition of central and peripheral range parts relative to the global range of the target species. Finally, we searched the text of the selected papers and arrived at a final set of 42 papers that provided data amenable to meta-analysis, either primary data or data extracted from figures (see S1). These papers were then classified into two major kinds: first, papers including explicit centre-periphery comparisons of mean performance values from populations described as central or peripheral in the text; second, papers based on latitudinal clines. In the latter case, from each region we used the three most central and the three most extreme populations along the gradient (or more when several populations were located closely together; see S1).

S3.1. Is the dataset subject to publication bias?
Publication bias occurs when a dataset lacks disproportionately many case studies with either positive or negative effect sizes, that is, when some tendency has been more likely to be published (publication bias) or retrieved (dissemination bias) than others. We used four complementary approaches to estimate whether publication bias was likely to occur in our dataset: (1) visual inspection of a funnel plot, (2) the calculation of a fail-safe number, (3) a correlation between reported effect sizes and the impact factor of source journals, and (4) a cumulative meta-analysis to test for time-lag bias.
(1) Funnel plot. Funnel plots probe whether studies with little precision (small studies) give different results from studies with greater precision (larger studies). Asymmetry in the funnel plot is often interpreted as a sign of publication bias (i.e., the decision of authors or editors to publish or not a given result) or dissemination bias (i.e., small studies tend to be published in poorly accessible or indexed journals). The funnel plot we constructed from our dataset was, by contrast, symmetrical, indicating that small and large studies, as well as studies reporting negative, positive or close to zero effect sizes, were equally likely to be published.
The utility of funnel plots in the context of multi-level meta-analysis remains a matter of debate, because sets of points may be clustered together as a result of statistical dependencies. However, there was no evidence for such clustering, and data points corresponding to HLM and LLM case studies were fairly well distributed. This observation further supports our conclusion that publication bias was unlikely in our dataset.
(2) Fail-safe number. Fail-safe numbers (FSN) estimate how many studies with effect sizes averaging zero should be added to negate the significance of the grand mean effect size (or to reduce it to a specified minimal value). Among the various available metrics, Koricheva et al. 81 recommended the use of Rosenberg's FSN 90, a weighted metric that is tested against a normal distribution. Rosenberg's FSN was 257 (P < 0.001), indicating that "publication biases (if they exist) may be safely ignored" 90.
(3) Correlation between effect sizes and the impact factors of the reporting journal. Publication bias is likely to occur if higher impact journals tend to publish papers with stronger results, whereas results reporting weaker or no empirical support for hypotheses are more likely published in lower rank (and maybe less accessible) journals (or not published at all). Following this logic, Murtaugh 91 proposed a test of publication bias that consists in regressing effect sizes against the impact factor of the journal they were taken from. We used the 2015 five-year impact factors of journals that provided case studies included in the meta-analysis and assumed the rank of journal impact factors was stable over the period covered by our data. We found no correlation between effect sizes and impact factors (r = 0.03, P = 0.700). Strong effect sizes were not more likely to be reported in top-rank journals, nor were small effect sizes more common in lower rank journals. However, it should be noted that this result was driven by some cases with small effect sizes published in the journal Nature. When these cases are excluded, the correlation is still weak, but becomes significant (r = -0.23, P = 0.030). The fact that a top-rank journal such as Nature published results with small effect sizes is, however, a clear indication against publication bias in our dataset.
(4) Cumulative meta-analysis. Temporal trends in effect sizes may affect the generality (and stability) of conclusions drawn from meta-analyses. Temporal trends may result from changes in methodology, technology or dominant paradigms. We assessed the temporal stability of the grand mean effect size (both for the complete dataset and for HLM and LLM separately) by conducting a cumulative meta-analysis. This analysis calculates the grand mean effect size of a successively accumulating subsample of the global dataset to which case studies are sequentially aggregated in their order of publication (i.e., from the oldest to the most recent publication year). We tested for the existence of a temporal trend by means of a weighted regression analysis with the year of publication as predictor variable and the grand mean effect sizes as response variable (function rma.mv in metafor).
Overall, grand mean effect sizes increased through time (i.e., became less negative) and approached zero, but were still negative in 2015 (Fig. S4). We hence cannot exclude the possibility that the topic is not fully mature yet and that future studies will report a greater number of near-zero or even positive effect sizes. This overall trend was primarily driven by the oldest cases involving LLM populations (Fig. S4), whereas no temporal trend was observed for HLM populations (Fig. S4). On the other hand, we observed a markedly stronger increase in the number of studies on HLM populations than on LLM populations through the past 10 years. Despite this imbalance, the difference between HLM and LLM remained strong and consistent, implying that our main result is likely to be insensitive to time-lag bias.
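The accumulation step of the cumulative meta-analysis in approach (4) can be sketched in Python. As a simplifying assumption the pooled mean here is a fixed-effect, inverse-variance mean, whereas the analysis described above used multi-level models fitted with rma.mv; the years, effect sizes and variances are hypothetical.

```python
def cumulative_means(cases):
    """Pooled mean after adding each case in order of publication year.

    cases: iterable of (year, effect_size, sampling_variance).
    Returns a list of (year, pooled_mean_so_far), oldest case first.
    """
    ordered = sorted(cases, key=lambda case: case[0])
    sum_w = 0.0
    sum_wy = 0.0
    trajectory = []
    for year, effect, variance in ordered:
        weight = 1 / variance          # inverse-variance weight
        sum_w += weight
        sum_wy += weight * effect
        trajectory.append((year, sum_wy / sum_w))
    return trajectory

# Hypothetical case studies, deliberately given out of order
cases = [(2010, -0.2, 0.1), (1995, -1.2, 0.1), (2003, -0.6, 0.1)]
trajectory = cumulative_means(cases)
for year, pooled in trajectory:
    print(year, round(pooled, 3))
```

In this toy example the pooled mean moves towards zero as newer cases are added, which is the kind of pattern a cumulative analysis is designed to reveal.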

S3.2. Accounting for potential non-independence of case studies drawn from the same primary paper
Some primary papers contained more than one measure that we could use for our meta-analysis (e.g., reporting different performance estimators for the same species or the same estimator for different species). Such measures could be mutually non-independent, leading to pseudo-replication in the dataset used for the meta-analysis.
We used two complementary approaches to account for potential non-independence of case studies stemming from the same primary paper. (1) We used multi-level meta-analysis where we specified two random factors with the rma.mv function of the R package metafor 83: the identity of the case study and the identity of the primary paper; the first factor was nested within the second. (2) We ran a sensitivity analysis to assess the robustness of the main result (i.e., margins differ in relative population performance) against the non-independence of case studies from the same primary paper.
Sensitivity analysis: we created a subsample of our global dataset that contained only one randomly selected case study from each primary paper. We then performed a mixed-effect meta-analysis on this subsample to test for the existence of a margin (HLM vs. LLM) effect with the rma function from package metafor 83. This procedure was repeated 1000 times, each time with a newly created test dataset (which is analogous to bootstrap procedures using random drawing with replacement). Among the 1000 models, 63.5% supported a significant difference between HLM and LLM. The means (± 95% distribution) of random samples for HLM (-0.09 ± [-0.40, 0.21]) and LLM (-0.90 ± [-1.55, -0.25]) were very close to model parameters estimated from the multi-level error meta-analysis presented in the main text (-0.14 ± [-0.64, 0.36] and -1.07 ± [-1.68, -0.46] for HLM and LLM, respectively; see Fig. S5). The sensitivity analysis thus supports our main finding, implying that the reported asymmetry in the relative performance of HLM and LLM populations is unaffected by potential lack of independence among case studies stemming from the same primary paper.
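The resampling logic of this sensitivity analysis can be sketched in Python. The sketch only illustrates the "one random case study per primary paper" draw; instead of refitting a mixed-effect model with rma, it summarises each draw by inverse-variance weighted means per margin, which is a deliberate simplification. The case studies, field names and seed are hypothetical.

```python
import random
from collections import defaultdict

def one_case_per_paper(cases, rng):
    """Keep exactly one randomly chosen case study from each primary paper."""
    by_paper = defaultdict(list)
    for case in cases:
        by_paper[case["paper"]].append(case)
    return [rng.choice(group) for group in by_paper.values()]

def weighted_mean(cases):
    """Inverse-variance weighted mean effect size of a list of cases."""
    weights = [1 / case["v"] for case in cases]
    return sum(w * case["d"] for w, case in zip(weights, cases)) / sum(weights)

def resample_margin_means(cases, n_draws=1000, seed=1):
    """Return (HLM mean, LLM mean) for each random one-case-per-paper draw."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        subsample = one_case_per_paper(cases, rng)
        hlm = [c for c in subsample if c["margin"] == "HLM"]
        llm = [c for c in subsample if c["margin"] == "LLM"]
        if hlm and llm:  # both margins must be represented in the draw
            draws.append((weighted_mean(hlm), weighted_mean(llm)))
    return draws

# Hypothetical case studies: d is Hedges' d, v its sampling variance
cases = [
    {"paper": "P1", "margin": "HLM", "d": -0.1, "v": 0.2},
    {"paper": "P1", "margin": "HLM", "d": 0.0, "v": 0.3},
    {"paper": "P2", "margin": "LLM", "d": -1.0, "v": 0.2},
    {"paper": "P2", "margin": "LLM", "d": -1.2, "v": 0.4},
    {"paper": "P3", "margin": "HLM", "d": -0.2, "v": 0.3},
    {"paper": "P3", "margin": "LLM", "d": -0.9, "v": 0.3},
]
draws = resample_margin_means(cases)
share_llm_lower = sum(llm < hlm for hlm, llm in draws) / len(draws)
print(len(draws), share_llm_lower)
```

Summarising the distribution of the per-draw estimates, as done above with the share of draws in which the LLM mean is lower, mirrors the way the 1000 refitted models are summarised in the text.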

S3.3. Does asymmetry in marginal population performance differ between taxonomic kingdoms (animals vs. plants) and between major habitats (terrestrial vs. marine)?
Our model selection procedure retained five models within two units of ΔAICc of the best model. All included margin type as a moderator and the null model (i.e., intercept only) was excluded (Table S1).
To assess the relative relevance of each of our three moderators, we calculated the sum of weights (w_H) of individual moderators as the sum of weights of all models (w_i) with this predictor and ΔAICc < 10 92. The result is shown in Fig. S2.
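The calculation of AICc weights (w_i) and of the summed weight of a moderator (w_H) can be sketched in Python. The AICc values and model names below are hypothetical and do not reproduce Table S1; the ΔAICc < 10 cut-off follows the description above.

```python
import math

def akaike_weights(aicc_by_model, max_delta=10.0):
    """AICc weights for models whose delta-AICc is below max_delta."""
    best = min(aicc_by_model.values())
    relative = {
        model: math.exp(-0.5 * (aicc - best))
        for model, aicc in aicc_by_model.items()
        if aicc - best < max_delta
    }
    total = sum(relative.values())
    return {model: value / total for model, value in relative.items()}

def moderator_importance(weights, terms_by_model, moderator):
    """w_H: summed weight of all retained models that include the moderator."""
    return sum(w for model, w in weights.items() if moderator in terms_by_model[model])

# Hypothetical AICc values for a small model set
aicc = {"null": 310.0, "margin": 302.0, "margin+habitat": 301.0, "margin+kingdom": 303.0}
terms = {
    "null": set(),
    "margin": {"margin"},
    "margin+habitat": {"margin", "habitat"},
    "margin+kingdom": {"margin", "kingdom"},
}
weights = akaike_weights(aicc)
importance = {m: moderator_importance(weights, terms, m) for m in ("margin", "habitat", "kingdom")}
print({m: round(v, 3) for m, v in importance.items()})
```

A moderator that appears in every well-supported model obtains a w_H close to 1, whereas one that appears only in weakly supported models obtains a low w_H.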
Margin type was the most important predictor (w_H,margin = 0.96), whereas Habitat (w_H,habitat = 0.69) and Kingdom (w_H,kingdom = 0.54) received only marginal support, and interactions (Margin × Habitat and Margin × Kingdom) were even less relevant. Fully in line with this result, neither Kingdom nor Habitat explained a significant amount of heterogeneity in any of the five models retained in the set of best models (Table S2).
The combined evidence supports our conclusion that the reported latitudinal asymmetry in marginal population performance is not discernibly affected by differences in effect sizes between plants and animals or between marine and terrestrial organisms (Fig. S1).

S4. Supplementary Methods: Description and results of the climate analysis
The relationship between the relative performance of marginal populations (Hedges' d) and the difference in average climate between marginal and central populations (differences in average temperature during 1990-2013) was analysed by means of generalised additive mixed models (GAMM). We used the following model: RelativePerformance ~ s(TemperatureDifference) + s(study, bs = "re"), where s represents smooth terms 89. We used random-effect smooths (bs = "re") to account for non-independence of comparisons within published studies. We also weighted relative performance effect sizes (Hedges' d) by their variances so that their influence in model calibration was inversely related to their uncertainty 88. We fitted the model in package mgcv v. 1.8-17 88 in R 3.4.1 83. The R code to reproduce these analyses is available as a research compendium 93.
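Written out, the model formula above corresponds to an additive mixed model of the following form. This is a sketch of the usual reading of such a formula, not a statement taken from the analysis code. The symbols are notation introduced here: d_ij is the effect size of comparison i in study j, ΔT_ij the corresponding temperature difference, f a smooth function, b_j a study-level random intercept and v_ij the variance of the effect size. The choice w_ij = 1 / v_ij is an assumption based on the description of weighting by inverse uncertainty.

```latex
d_{ij} = \beta_0 + f(\Delta T_{ij}) + b_j + \varepsilon_{ij}, \qquad
b_j \sim \mathcal{N}(0, \sigma_b^2), \qquad
\varepsilon_{ij} \sim \mathcal{N}\!\left(0, \sigma^2 / w_{ij}\right), \qquad
w_{ij} = 1 / v_{ij}
```

Under this reading, effect sizes with a large variance have a larger residual variance and therefore less influence on the fitted curve.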
We found a moderate but statistically significant effect of temperature on the relative performance of marginal populations (estimated degrees of freedom = 3.16, P = 0.015, Table S3). The model explained 25% of the total deviance.

Table S2. Summary of the five models retained in the set of best models (i.e., with ΔAICc < 2, Table S3.1). Margin explained a significant amount of heterogeneity in each of the five competing best models, whereas neither Kingdom nor Habitat explained a significant amount of heterogeneity in any of them. QM and associated P-values represent the test associated with each moderator, separately. Pseudo R² were calculated 5 as 1 - LLR, where LLR is the ratio between the log-likelihood of model i and the log-likelihood of the null model.

centroids of marginal (HLM or LLM) and central populations in each case. We found no evidence that distance explains differences in relative population performance, in contrast to what we found for climate (Fig. 3).

Figure 2. Observed differences in performance (Hedges' d effect sizes) between marginal (high-latitude, HLM, and low-latitude, LLM) and central populations across all species and studies. (A) Grand mean (grey) and margin-specific (blue and orange) combined effect sizes. Error bars represent 95% confidence intervals. Numbers in parentheses correspond to the number of case studies. (B) Individual effect sizes for HLM and LLM case studies ranked from the lowest to the highest value. Dot size is proportional to the weight of individual effect sizes in the meta-analysis. In both (A) and (B), positive and negative values indicate higher and lower performance in marginal than in central populations, respectively. Horizontal dashed lines represent the null hypothesis of no difference in the performance of central vs. marginal populations.

Figure 3. Relationship between the observed difference in demographic performance (Hedges' d) and the difference in average temperatures between peripheral and central populations (for the period 1990-2013). Positive values of Hedges' d indicate better performance in marginal than in central populations, and vice versa. Point size is inversely related to the variance of Hedges' d for each contrast (i.e. bigger points represent more precise effect size estimates). The curve represents the fit of a generalized additive mixed model with temperature difference and study as predictors. The shaded area represents the standard error. HLM = high-latitude margin, LLM = low-latitude margin.
where x̄, n and s² are the mean, sample size and sampling variance of each group of populations being compared. Negative values of d indicate lower performance in marginal (either HLM or LLM) populations than in central populations (consistent with the CP paradigm), whereas positive values indicate higher performance. The sampling variance of effect sizes was:
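As an illustration of the quantities defined here, the sketch below computes Hedges' d and its sampling variance in the standard textbook form, with the small-sample correction factor J. This is an assumption: the exact expressions used in the analysis are not reproduced in the text above and may differ in detail. The function name and argument names are hypothetical.

```python
import math

def hedges_d(mean_m, sd_m, n_m, mean_c, sd_c, n_c):
    """Standard Hedges' d for a marginal (m) vs. central (c) contrast.

    Assumed textbook form:
      d   = (mean_m - mean_c) / s_pooled * J
      J   = 1 - 3 / (4 * (n_m + n_c - 2) - 1)
      var = (n_m + n_c) / (n_m * n_c) + d**2 / (2 * (n_m + n_c))
    """
    df = n_m + n_c - 2
    # Pooled standard deviation of the two groups.
    s_pooled = math.sqrt(((n_m - 1) * sd_m**2 + (n_c - 1) * sd_c**2) / df)
    # Small-sample bias correction.
    j = 1.0 - 3.0 / (4.0 * df - 1.0)
    d = (mean_m - mean_c) / s_pooled * j
    var_d = (n_m + n_c) / (n_m * n_c) + d**2 / (2.0 * (n_m + n_c))
    return d, var_d

# Hypothetical example: the marginal population performs worse than the
# central one, so d is negative.
d, var_d = hedges_d(mean_m=8.0, sd_m=2.0, n_m=10, mean_c=10.0, sd_c=2.0, n_c=10)
```

With this sign convention, subtracting the central mean from the marginal mean yields negative d whenever the marginal population performs worse, which matches the interpretation given above.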

Figure S2. Asymmetry in population performance at High Latitude Margins (HLM) and Low Latitude Margins (LLM) for each Kingdom and Habitat. Symbols representing Habitats and Kingdoms are centered on the mean estimate. Vertical bars represent 95% CI estimated from the multi-level meta-analysis. Negative and positive values indicate lower and higher performance of marginal populations as compared to central populations, respectively. Numbers within parentheses indicate the number of case studies for each category.

Figure S3. Sum of weights of moderators quantifying the relative importance of individual moderators and their interactions. Values are interpreted as the probability that a given variable is retained in the best model. Among tested moderators, margin type was the most important predictor (wH,margin = 0.96), whereas Habitat (wH,habitat = 0.69) and Kingdom (wH,kingdom = 0.54) received only marginal support, and interactions (Margin × Habitat and Margin × Kingdom) were even less relevant. Fully in line with this result, neither Kingdom nor Habitat explained a significant amount of heterogeneity in any of the five models retained in the set of best models (Extended Data, Table 2).

Figure S4. Relative performance of marginal vs. central populations (Hedges' d) in relation to the geographic distance between them. The latter was calculated as the distance between the

Figure S5. Cumulative meta-analysis. Grand mean effect sizes (dots), 95% CI (bars) and sample sizes (k) are shown for each year, including all previous years. Panel (A) depicts the global dataset, panels (B) and (C) the datasets for HLM and LLM populations, respectively. Regression lines are shown only where the relationship between publication year and effect size is significant.

Figure S6. Sensitivity analysis. Thin coloured bars represent the 95% CI of effect sizes estimated for HLM and LLM from the 1000 models, each run with a random sample of one case per primary paper. Dark dots and error bars represent the corresponding mean and 95% distribution of mean effect sizes. Predictions based on the complete dataset (i.e., those reported in the main text) are shown in white for comparison. The match between the results of the main analysis and the sensitivity analysis confirms the robustness of our conclusions about asymmetry in marginal population performance.

TABLES
Table 1. List of species included in the meta-analysis for which information about ongoing range shifts has been reported in the scientific literature. All cases correspond to the northern hemisphere.