Abstract
Close analysis of an extensive data set combined with independent evidence prompts our proposal to view human lifetimes as individually finite but collectively unbounded. We formulate a model incorporating this idea whose predictions agree very well with the observed data. In the model, human lifetimes are theoretically unbounded, but the probability of an individual living to an extreme age is negligible, so lifetimes are effectively limited. Our model incorporates a mortality hazard rate plateau and a late-life mortality deceleration effect in conjunction with a newly observed advanced age mortality acceleration. This reconciles many previously observed effects. The model is temporally stable: consistent with observation, parameters do not change over time. As an application, assuming no major medical advances, we predict the emergence of many individuals living past 120, but due to accelerating mortality find it unlikely that any will subsequently survive to an age of 125.
Introduction
The question of whether or not a limit to human lifespan exists may be as old as civilization itself. Earliest speculation was hampered by a lack of reliable age data, biological knowledge and statistical methods. Foundations for the modern field of demography were laid in the 19th Century by Gompertz (1), who modelled mortality as increasing exponentially with age, and Makeham (2), who added an age-independent term to the equation. The Gompertz-Makeham Law of Mortality now enjoys near-universal acceptance as an accurate model for mortality at most adult ages. However, scientists still fiercely debate whether mortality deviates from the Gompertz-Makeham law at very old ages and whether human longevity can be increased into the future. On these issues, there is no consensus, with some foreseeing many years of extended longevity (3), while others argue that previous upward trends are largely played out and only modest improvements can be expected in the future (4).
In (5) an actuarially-motivated “smooth threshold life table” (STLT) model was fitted to a high-quality Netherlands data set: mortality data on 231,129 females and 73,788 males in 1-year, year-at-birth cohorts, from 1893 to 1908, with individuals aged up to 114 years (the recorded highest attained age). Fig. 4 in the Materials and Methods section displays the models fitted to the Netherlands data. The analysis was repeated on a smaller but up-to-date Australian data set reported in (6).
Herein the STLT model is closely analyzed to reveal fine structure details of the human mortality curve. An important feature found is an acceleration of mortality at very advanced ages. Based on this and other information from the modeling and the literature, we draw the plausible inference that individual human lifetimes are inherently finite.
Independent evidence in support of this can be found in an analysis of trends in three public longevity data sets (7), where it is argued that, nevertheless, no hard upper bound should be assumed for life-lengths overall. A continuing increase in human life expectancy since 1900 was deduced, but with a rate of improvement in survival which peaks and then declines for very old ages. It was inferred from this that human lifespan may have a natural limit but maximum lifetime is not fixed — although survival becomes negligible after a certain age, and there is no discernible improvement in extreme old-age survival over time.
Given this background, we argue that what is called for is a model in which life-lengths, though finite individually, vary randomly over the population according to a distribution which is unbounded. Herein we postulate such a model, estimate its parameters from the available data, and use it to predict numbers in the population who will live to very extreme ages. These predictions concur very well with the actual numbers observed.
Results
The STLT model
The STLT model in (5) and (6) consists of a Gompertz distribution describing lifelengths up till a data determined threshold age N, together with a generalised Pareto distribution (GPD) for ages greater than N. It incorporates a “late-life mortality deceleration effect”, where the rate of increase in mortality with advanced age becomes slower than exponential (as it is in the Gompertz law), and a “mortality hazard rate plateau” at high ages. Such effects have been observed before ( (8), (9), (10)); see the Materials and Methods section for further discussion.
But how can a late-life mortality deceleration effect, for which mortality will tend to decrease and even level off at high ages, be consistent with finite human lifetimes? The answer is that the STLT model also implies an advanced age mortality acceleration following the late life mortality deceleration. This acceleration in hazards results in finite limits to the life spans. In the following we develop this idea.
Finite but unlimited human lifetimes
The GPD formulation contains a shape parameter κ which is inversely proportional to the right extreme of the fitted mortality distribution. (See the Materials and Methods section for details.) For convenience, we take the shape parameter κ in the GPD to be the negative of that in the usual formulation (the ξ in (11), p.162, and the γ in (12) and (13)). A positive κ value signifies a finite upper bound to the lifetime distribution. The κ values estimated from the 32 year-of-birth × sex cohorts in the Netherlands data and for the 18 5-year Australian cohorts are all significantly greater than 0. They vary over a range shown in Table 1 for the Netherlands data and in Tables 3 and 4 of the Materials and Methods section for the Australian data.
We infer that lifetimes in a cohort can be modelled as bounded above at a finite age U inversely proportional to κ, but with U varying randomly over the cohorts. We extend this insight to a model in which individual lifetimes are finite but with upper bounds which vary over the population with no definite upper limit.
A mixture model for lifetimes
Write the lifetime T of a person in the population as T = N + Y, where Y is the excess lifetime above the transition age N. Thus Y has as distribution the Pareto distribution component of the STLT model, conditional on having reached age N, with shape parameter the κ in that component.
Assume that associated with each person is a positive random variable whose realized value is κ. Then the lifetime distribution in the population as a whole is given by integrating the conditional distribution over possible values of κ. The distribution of κ required for this can be estimated from the fitted STLT models. We used the κ values in Table 1 (Netherlands data) combined with those in Tables 3 and 4 in the Materials and Methods section (Australian data).
The resulting empirical distribution is shown in Fig. 5 of the Materials and Methods section. We approximated the histogram in Fig. 5 with a gamma(a, b) distribution. The resulting distribution of Y calculated by mixing over the distribution of κ using the estimated density for κ is in Eq. (5) of the Materials and Methods section. It is unbounded, consistent with our belief that there is no hard upper bound on lifetimes overall. The resulting population distribution densities are shown in Fig. 1. There are non-negligible probabilities of values of Y up till about 12, corresponding to T ≈ 98 + 12 = 110 years. Survival probabilities then decline rapidly and become negligible by age 120.
Advanced age mortality accelerations
The existence of an age xA before which the force of mortality (the hazard) in the GPD increases slower than that of the Gompertz, but after which the GPD hazard increases faster than that of the Gompertz, is verified for the STLT model in the Materials and Methods section. This phenomenon gives rise to the advanced age mortality acceleration. In the Netherlands data, the acceleration is observed subsequent to the threshold age N and prior to the limiting age U for all cohorts. The ages xA are listed for each cohort in Table 1 and illustrated in Fig. 2 for the pooled data. The ages xB at which the hazards of the Gompertz and Pareto curves cross again are also shown in Table 1 and Fig. 2. Further information on the advanced age mortality acceleration and its relation to the “three laws of biodemography” is Discussion.
Observed vs expected centenarians
The probability of living to age x > N years or more can be calculated as where N is the transition age for the cohort under consideration. Multiplying the probability P (Y > x − N) in (1) by the number in the cohort at age N gives the expected number in the cohort living to age x, or beyond, according to the model. These are compared with the actual numbers for ages x = 100, 105 in Fig. 3. (The numbers themselves, also for x = 110, are in Table 5 of the Materials and Methods section. No person of either sex in the data lived longer than 115 years, and at this age model predictions were < 0.01 persons.)
Observed and expected compare well but some inaccuracy is to be expected especially for the smaller numbers as we are attempting to predict in the extreme upper end of the life-table. We note an increasing trend with birth cohort year for females surviving to 100 years or beyond, but not for the other groups.
Expected survivors at extreme ages
How many individuals exceeding Jeanne Calment’s record age (122 years and 164 days) might we expect to see in the remainder of this century? In the mixture model, while survival to age 115 or older is possible, it is with a probability so small as to be negligible. One could speculate that, on a global scale, an increasing number of supercentenarians might result in the appearance of new longevity records. However, our findings place a robust limit on the emergence of any record-breaking individuals.
Assuming the model derived from the Netherlands and Australian data applies generally, we can estimate roughly the potential growth in the number of supercentenarians, globally, from 2021-2100. The annual growth in supercentenarians recorded in the International Database on Longevity (IDL) for England and Wales, 1968–2017, is well described by an exponential growth function with a doubling-time of 12.9 years (see the Materials and Methods section). From this function we extrapolated, optimistically, the supercentenarian count to be expected in England and Wales for 2021–2100. Multiplying by a factor of 100 (for the approximate proportion of the world population in England and Wales) gives an estimate of 1,476,796, or in broad terms a total of ≈ 1.5 million supercentenarians, male and female, to be expected world-wide over the 80-year period 2021–2100.
For an optimistic estimate of the number of survivors from this population expected to exceed a specified age of interest, we multiply 1.5×106 by the corresponding proportion surviving calculated from the (female) density values in Fig. 1. The model then predicts that 8268 of those 1.5m supercentenarians would survive to age 115, and 41 would live to age 120, but less than 1 (expectation 0.23) would live past age 125. These rough estimates at the extreme upper end of the age distribution come with large standard errors, but are consistent with the small numbers of extreme-aged individuals currently observed, and suggest there is a low probability of Jeanne Calment’s record being broken even once in the remainder of this century. Barring a revolutionary advance in medicine, the benchmark age of 125 appears to represent a firm limit to human longevity unlikely to be exceeded in the foreseeable future.
Discussion
Is there an upper limit to human lifetimes?
Much commentary on this issue related to (7), some critical, some supportive, followed soon after its publication. A study based on the IDL alone (13) was critical: it rejected the inferred finite limit to human life, and, using longevity data from Japan and Western countries, suggested instead an exponential model for lifetimes beyond 110 years. As in (5), the study fitted the GPD, but in this case, to data from 4 areas (North and South Europe, North America and Japan), and to a very limited set (a total of 566 life lengths) from extreme ages only (supercentenarians, 110+). In two instances (North Europe and Japan) the estimated κ in their GPD fits is negative (we write κ for −γ, in their notation), while for South Europe and North America it is positive. But in no case is it significantly different from 0 (Table 5, (13)) and pooled κ estimates are all positive. From being non-significantly different to 0, the authors concluded that κ should be taken as exactly zero, indicating an exponential distribution for the data, and argue from this for an unlimited human lifespan.
Later (14) and (15) updated (13) with more data and analyses but ultimately still concluded in favour of an exponential model for lifetimes and an unlimited human lifespan.
However, these analyses obtaining non-significant positive (and in some cases non-significant negative) κ values are in fact consistent with those in (5) and (6), where the estimated κ are strongly positive. Only limited data from extreme ages is used in (13), (14) and (15), whereas in (5) and (6) more extensive data sets spanning ages from 65 on was used, thus obtaining greater statistical power in a demographically-informed model.
Other comments on (7) occurred in (16), (17), (18), and (19). Most of those writers argued for the contrary conclusion – for no upper limit to human lifespan. One comment (20), using a logistic model, argued that an individual may live to 125 years within this century. Their model tacitly assumes an infinite endpoint.
By contrast, a number of papers which also include discussion or estimation of the GPD shape parameter κ come to similar conclusions as (7), (5), and (6). They include: an analysis of data from the Gerontology Research Group, which also observed no trend in supercentenarian longevity (21); an actuarial study of 46,66 Belgians > 95 years old, finding a limit to lifetimes of ≈ 120 − 130 years and no trend across cohorts (22); an analysis of the combined datasets of the IDL and the Human Mortality Database, which found a limit of ≈ 125 − 130 years (23); and several reanalyses of the IDL in response to (13): (24), (25), (26), and (27). Finally, a study of ≈ 285,000 Netherlands individuals aged 92 years or more at death consistently found positive estimates of κ, as well as a limit of ≈ 115 − 130 years for maximum human lifespan, with no trend over time (12).
Almost all papers, even those which estimated a nonnegative κ value, find that the probability of survival becomes either exactly 0 or negligibly higher than 0 at some point between 115–130 years. Indeed, the classification of lifespan as “unlimited” based on a non-zero yet extremely small chance of survival has been criticized previously (28). Our analyses confirm and strengthen this consensus and refute other analyses that predict longevity exceeding 125 years (20). Even assuming unabated exponential growth in the number of supercentenarians and applying an optimistic model of supercentenarian survival, we predict that the probability of observing an individual living past 125 by 2100 is 0.23.
But exponential growth in the number of supercentenarians past 2100 is improbable, anyway. The present exponential growth in supercentenarians is driven by two factors: growth in birth cohort sizes and increased survival to age 110. Birth cohort sizes peaked in the mid-20th century for many nations and are unlikely to contribute to higher supercentenarian numbers during the 22nd century. Meanwhile, the stagnation, or even decline, in life expectancy for developed nations indicates that the proportion of each birth cohort surviving to age 110 is unlikely to increase beyond this century. Together, these factors suggest that supercentenarian cohort sizes — and longevity records — will stop growing or even fall in the following century, if not sooner.
Alternatively, a breakthrough in medicine could result in greatly increased longevity at extreme ages. However, consideration of this is beyond the scope of our present analysis.
Late-life mortality plateau and deceleration effect
An earlier study (8) tracked every person born in Italy between 1896 and 1910 who lived to age 105 or beyond; a total of 3,836 individuals (3,373 women and 463 men). That study’s hazard rate analysis suggested a “mortality hazard rate plateau” between ages 105 and 110 and a consequent “late-life mortality deceleration effect” – that is, that the rate of increase in mortality with advanced age becomes slower than exponential (as it is in the Gompertz law) between those ages. This effect is usually understood to imply a model with a potentially infinite life span. The existence of a late-life mortality hazard rate plateau is also argued by earlier studies (10), (9), but disputed and critiqued by several commentaries (28), (29), (30), (31). As we observed in the main text, the existence of an advanced age mortality acceleration resolves any contradiction of these studies with our conclusion of finite lifetimes.
The three laws of biodemography
The compensation law of mortality ( (32), (33), (34)) states that for a given species, differences in death rates between different sub-populations (cohorts, in our case) decrease with age — because higher or lower initial death rates are accompanied by a lower or higher rate of mortality increase with age, such that mortality rates for different cohorts tend to equalise after some high age.
This tendency exists in the Netherlands data and is captured in the STLT model fit. The threshold ages N at which the convergence takes place are around 98 years for females and (with somewhat greater variability) around 96 years for males (Tables 2 and 3). These are close to the estimate of around 95 years made in (33) on quite different grounds.
The three laws of biodemography have been stated as:
the Gompertz-Makeham law of mortality;
the compensation law of mortality, and
the existence of a late-life mortality deceleration.
The Netherlands data and the STLT model fitted to it are consistent with the compensation law, so all three laws are satisfied. The model also allows for, and for that data, predicts, via an advanced age mortality acceleration, a finite limit to human life span. The STLT model is unique in possessing all of these properties.
The STLT model and the mixture model derived from it herein appear to represent the most biologically and statistically rigorous evaluation of late-life human mortality to date.
Materials and Methods
The STLT model: Gompertz & Pareto components
The model fitted in (5) and (6) to each of the cohorts in the Netherlands and Australian data consists of a Gompertz distribution with cumulative distribution function (cdf) for parameters B > 0 and C > 1, describing a lifetime of length x up till a data determined threshold age N, together with a generalised Pareto distribution for ages greater than N. This Pareto part has the form ( (11), p.162) for parameters θ > 0 and ξ. (We denote the survival function (tail function) of a cdf F by In general the parameter ξ can take either sign, but in the Netherlands data in (5) it was estimated to be significantly negative in all 32 cohorts, hence resulting in the finite upper bound for x indicated in (3). To avoid many minus signs in formulae, we write κ for |ξ| = −ξ throughout. With this notation the estimates of κ together with estimates of other significant parameters are in Tables 2 and 3 (Netherlands data) and Tables 4 and 5 (Australian data).
An innovative feature of the STLT model is that a smooth transition in hazard rates between ages less than and greater than the threshold age N is enforced. This means that the parameters in (2) and (3) are connected by the relation θ = 1/BCN.
We use the Kaplan-Meier empirical distribution estimator (KME) to display the lifetime data. Figure 4 shows the Netherlands data with the fitted STLT models (Gompertz up till the transition age N, Pareto thereafter), for females (upper panel) and males (lower panel). The left column is for all ages, 65+. The right column is a magnified view of the left column for ages greater than N. In this column the Gompertz distribution fitted to ages less than N is extrapolated to ages greater than N. Note that the extrapolated Gompertz tends to overestimate mortality slightly by comparison with the Pareto in this age range. The parameter estimates used to construct the curves are in Tables 2 and 3.
STLT model: hazards
Formulae for the hazards of the Gompertz and Pareto components can be calculated from (2) and (3) and it can be checked that the STLT constraint forces θ = 1/BCN as claimed. Further analysis shows that there is a point xA in (N, U) at which the hazards cross. For x ∈ [N, xA) the Pareto hazard increases more slowly than the Gompertz hazard; but for x ∈ (xA, U], the Pareto hazard increases faster than the Gompertz hazard (that is, with the Gompertz curve extended into the interval [N, U]). This is illustrated in Fig.2 for females and males separately.
The point xA is the point at which the rate of increase of the Pareto hazard changes from being slower than that of the Gompertz hazard, to being faster. So we see a late life mortality deceleration in [N, xA) together with an advanced age mortality acceleration in (xA, U]. These are features consistent with a finite upper bound to the lifetimes distribution. This situation occurs in all the Netherlands and Australian cohorts. There is another point, xB, at which the hazards are equal, also given in Tables 2 and 3, and illustrated in Fig. 2. Other locations of interest, also given in Tables 2 and 3, are the point of inflection of the Gompertz, xI, and the point xC where the first derivatives of the log hazards are equal for Gompertz and Pareto. Estimated parameters for the Australian cohorts are in Tables 4 and 5.
The female 1860 cohort has an outlier in due to some exceptionally low mortalities at extreme ages. Refitting the STLT model with these omitted gives (SE = 0.653) in line with the other cohorts.
The mixture model for lifetimes
To formulate this model we concentrate on the extreme right hand end of the mortality distribution, that is, for ages above the transition age N. Write the lifetime T of a person in the population as T = N + Y, where Y has the distribution of T conditional on T > N, thus, from (3), for given κ, θ > 0,
Assume now that associated with person i is a positive random variable Ki whose realized value is κ in (4); thus, Yi for person i has the distribution in (4), conditional on the given value of Ki:
Assume a density fK(κ), κ > 0, for the Ki and integrate (5) with respect to it to get the marginal cdf for a typical member of the population as
Differentiating (5) gives the conditional density of Y as
The conditional expectation calculated from (7) is E(Y |K = κ) = θ(κ + 1)−1, showing that, as κ increases, the expected lifetime decreases. Persons with higher values of κ have shorter lifetimes, on average. As κ approaches 0 lifetimes become longer and would be infinite for κ = 0.
Inspection of the Netherlands and Australian data in Fig. 5 suggests that the κ values might be approximately modeled as a lognormal or gamma distribution. We found a reasonably good fit of a gamma distribution. The limited number of observations on κ will increase as more data and analyses become available.
Suppose now in our setup that the Ki have a gamma distribution with parameters a, b > 0 and density as shown in Fig. 5. Consequently, overall, the Yi are taken to be independent and identically distributed with marginal density
Parameters θ and N are specific to the cohort under consideration.
Observed vs expected centenarians
Table 6 gives the observed and expected numbers in the cohorts living to ages 100+, 105+, and 110+, according to the mixture model.
The regression model for centenarians
Fig. 6 shows the raw data and regression predictions on linear and log scales. The regression function is where x is calendar year above 2021 and y is the number of supercentenarians dying in that year.
Data Availability
The data used in this paper are all publicly available. The Netherlands data is obtained from the online data source provided in (12) and the Human Mortality Database (https://www.mortality.org/). More details about the data augmentation and quality checking can be found in (5). The Australian data is obtained from the Human Mortality Database (https://www.mortality.org/).The England and Wales supercentenarians data is obtained from the International Database on Longevity (https://www.supercentenarians.org/).
Code Availability
The STLT model is implemented in R using the package published in GitHub and available at https://github.com/u5838836/STLT. Other codes used in this paper can be provided by contacting feihuang{at}unsw.edu.au upon request.
Competing interests
Authors declare no competing interests.