Biodiversity Conservation Can Coexist with Increasing Human Footprint

Fundamental to the success of sustainable development is a foundation of intact ecosystems. While the United Nations Sustainable Development Goal 15, “Life on Land”, seeks to protect biodiversity in terrestrial ecosystems, accelerating human-driven changes across the Earth system are undermining efforts to preserve biodiversity. Understanding this tension has never been more urgent and requires tools that reveal pathways for development that also support biodiversity. Here we introduce a near-present, global-scale machine learning-based human footprint index which is capable of routine update. By comparing global changes in the machine learning human footprint index between 2000 and 2019 to national-scale biodiversity metrics for Goal 15, we find that some countries are experiencing increases in their human footprint while biodiversity metrics are improving as well. We further examine development and policy dynamics to uncover enabling mechanisms for balancing increased human pressure with biodiversity gains. This has immediate policy relevance, since the majority of countries globally are not on track to achieve Goal 15 by the declared deadline of 2030. Moving forward, the machine learning human footprint index can be used for ongoing monitoring and evaluation support toward the twin goals of fostering a thriving society and global Earth system.


Introduction
The human footprint index (HFI) represents one of the most important tools for interpreting human pressure on the landscape (1) . A dimensionless metric which captures the extent of human influence on the terrestrial surface, the HFI is distinct from many land-use metrics in that it captures the total influence of human existence on a given location in a single index, rather than categorizing individual land use types (2) . The applications to which the HFI is put to use are manifold (3)(4)(5)(6) , with enormous demand for information about human pressure on the land surface to support policies related to land use change, biodiversity conservation, and climate action (6)(7)(8) . A key challenge for expanded use of the HFI for operational efforts, however, is that new updates typically lag the present day by seven or more years (9,10) . This time lag means that large-scale or substantial changes to the land surface caused by human activities can occur well before we detect them, hampering our ability to monitor and respond to their effects 2 on biodiversity. This temporal mismatch pertains directly to the achievement of the United Nations Sustainable Development Goals (SDG) since the deadline for their achievement is 2030 (11)(12)(13) , with some specific targets within SDG15, "Life on Land", set for even sooner (14) .
While international (14) and voluntary national-scale assessments of various biodiversity metrics (15) are available for monitoring progress toward SDG15, comparing those SDG15 assessments with an operational interpretation of human influence on the terrestrial land surface -one that can be easily and regularly updated -would provide deeper insights on the social-ecological outcomes and processes of landscape change.
Machine learning, specifically convolutional neural networks (CNN), are well suited to the task of identifying patterns in imagery (16) , including patterns of human activity across the land surface (17)(18)(19) . We present the first near-present, global, machine learning-based HFI (ml-HFI) for the years 2000 and 2019 (Fig 1). We train a CNN to ingest space-borne imagery of the earth's surface and predict the most up-to-date version of the HFI for the year 2000 (Fig 1a; hereafter the Williams-HFI) (10) . The conventional approach for creating the HFI requires a harmonization of eight different sub indices representing different aspects of human pressures on the terrestrial surface of the Earth, including built infrastructure, population density, and a variety of land use and land cover data. This resulting HFI is a dimensionless index ranging from a low to high human footprint. The Hansen Global Forest Change imagery version 1.7 (GFCv1.7) (20) includes global, cloud-free, growing season Landsat imagery for the years 2000 and 2019 (20) , and so we exploit the overlap between the year-2000 Williams-HFI and the year-2000 GFCv1.7 imagery for training. We then use the trained CNN to predict the 2019 ml-HFI based on 2019 GFCv1.7 imagery.

Results
Human pressure increasing globally Global, regional, and local patterns of human footprint from the Williams-HFI (Fig 1a) are identified in the year 2000 ml-HFI with high fidelity (Fig 1b). High accuracy is realized over all levels of human footprint by training on only 3% of the terrestrial locations ( Supp Fig 3), with a 3 mean absolute error on unseen testing data of 0.07 (Supp Fig 6). With that said, there are regions of the planet that are systematically more challenging for the CNN to reproduce (Supp Fig 8), including rocky areas of remote deserts and high latitude mountain ranges. Training the model on a larger portion of locations is expected to overcome this limitation in future iterations. Once trained, the CNN is then tasked with predicting the most current version of the ml-HFI from 2019 GFCv1.7 imagery (Fig 1c). This 2019 map represents the most current quantification of the human footprint index, previously estimated for 2013 (10) .  (21,22) to reveal the features that the neural network deems most relevant for its calculation of the ml-HFI (right panels). Inspection of the LRP maps indicates that the CNN is correctly identifying human-made features and activities (e.g. transportation networks or land-cover conversion) providing confidence in, and interpretability of, the CNN's predictions. This step is especially important if the ml-HFI is to be used for policy decisions as national and international laws have begun to require explainable AI systems for decision-making (23,24) .

Biodiversity gains despite development
Previous research underscores the transformative effects that human activities have on natural ecosystems (25,26) . Many of these studies indicate that heightened human pressure on the landscape diminishes the capacity for those lands to support biodiversity. We interrogate those assumptions using the ml-HFI for the years 2000 and 2019 with a contemporary assessment of country-level progress toward achievement of SDG15 (14) . We find that 76 countries are experiencing substantial increases in human pressure together with stagnating or decreasing progress toward SDG15. This supports the common belief that an increasing human footprint is not compatible with the maintenance or improvement of terrestrial biodiversity. Contrary to this common belief, however, we also find that 43 countries (Fig 2, green outlines) are experiencing substantial increases in human pressure over a large fraction of their land surface (increases of 0.25 or larger over 1% or more of their area) while at the same time "moderately improving" or actively "on track to achieve" SDG15. These substantial changes identified by the ml-HFI have been further confirmed manually for more than 2,000 locations across the 43 countries using the time lapse feature in Google Earth Pro.
Our core finding, that the ml-HFI is increasing while biodiversity gains are also increasing, remains nascent in the conservation and development discourse (4,(27)(28)(29) . To better understand this coexistence, we summarize the types of changes detected with the ml-HFI along with specific SDG15 indicator progress in Fig 3. The ml-HFI reveals a wide spectrum of human pressure on the landscape, including: deforestation for extensive mining (Suriname, Guyana), spreading road networks (Estonia, Slovenia), urbanization (Uganda, The Gambia), and expanding agriculture (Benin, Morocco). Moreover, there is no uniform pathway for making progress across the SDG15 indicators for forest-(30) , freshwater-(31, 32) , and threatened "red list" species conservation (33) . For example, Slovenia and Estonia are on track to achieve SDG15 in all three indicators. But for most countries, it is a very mixed pathway across the three indicators. For example, Uganda may be making progress on forest and freshwater conservation, while decreasing toward its "red list" species target. Evidently, the types of increasing pressures revealed by the ml-HFI do not lead to identical patterns of SDG15 progress. This suggests additional country-level policy information may deepen our understanding, and we focus on three cases: Guyana, The Gambia, and Morocco.

Cases of development processes
A core function of the ml-HFI is to monitor remote regions experiencing rapid change, as clearly 5 identified in Guyana -including considerable expansion of road networks, deforestation and extensive mining in the forested interior (Fig 3, Supp. Fig. 15). Notwithstanding its accelerating economic development, Guyana has the second-highest per capita forest cover in the world, second only to its neighbor, Suriname (15) . Given the enormous potential for forest preservation, Guyana was one of the first countries to implement REDD+ (Reduced Emissions from Deforestation and Forest Degradation), partnering with Norway which provides financial resources to incentivize the reduction in deforestation rates, and the establishment of protected areas in collaboration with indigenous communities (15) . Yet, the growing network of roads and mining across the Guiana Shield revealed by the ml-HFI shows how quickly development processes can encroach into wild and remote areas.
Urbanization of agricultural areas is a key feature of development (34) , and as evidenced by the ml-HFI results for The Gambia, can take place explosively over the course of less than 20 years (Fig 3, Supp. Fig. 16). Elsewhere in The Gambia, the ml-HFI reveals changes driven by deforestation for roads and agriculture. The Gambia's Voluntary National Review for SDG15 highlights the role that climate change has on exacerbating challenges to natural resource management, as well as the tension between preserving biodiversity with food security for its population -which remains 50% rural (35) . Nonetheless, The Gambia includes multiple protected forests, including biodiverse areas near the mouth of the The Gambia River. The ml-HFI results for 2019 show the lowest levels of human pressure in precisely these riparian locations, emphasizing the need for continued protection of these forests.
Inspection of the ml-HFI shows that Morocco is experiencing development pressure in the form of expanding peri-urban farmland, as well as road and energy infrastructure (Fig 3, Supp. Fig.   17). As a middle income country, with both semi-arid forests and desert ecosystems, Morocco has focused on boundary demarcation of forested areas, enforcement of forest protection from illegal activities, and expanding coverage of forest management plans (36) . As with Guyana and The Gambia, development is still taking place and, as clarified in Morocco's Voluntary National Review for the SDGs, there are ongoing plans for water and land development as well as forest conservation in Morocco (36) . The ml-HFI results for 2019 support the assessment that considerable potential exists for conservation, especially in the remote mountains and arid ecosystems of the east and southern reaches of Morocco.

Prospects for development monitoring
Combining the SDG15 progress with our 2019 present-day human footprint index shows that progress toward biodiversity goals can be made despite increases in human pressure. These increases occur in countries ranging in wealth, economic development, or reliance on specific natural resources. For many countries that are aiming for resource intensive economic development, monitoring and evaluation will be critical for tracking changes in near-real time.
Countries that are still actively developing may find the ml-HFI useful to identify strategies to provide effective guardrails for staying on track for protecting biodiversity (e.g. Dominican Republic, Republic of Congo), especially when increased human activity encroaches on pockets of biodiversity. For wealthier or more developed countries with longstanding regulatory structures for protected areas, the ml-HFI can help determine conservation policy efficacy and ensure that additional increases in HFI concentrate in already developed areas (e.g. Sweden, Switzerland).
Although we detail compelling evidence of progress toward biodiversity in the face of increasing human pressure for some countries, the more common pattern is that of increasing human pressure at the expense of gains in biodiversity conservation. We do not single out any specific countries that are not on track to achieve SDG15 (14) , but the changes in their ml-HFI results reveal similar pressures to the cases detailed above -urban growth, agricultural expansion, and resource extraction (mining and timber harvesting). While considerable efforts are being made to protect intact ecosystems from encroaching or intensifying human activities, growth in human influence is likely outpacing field-based monitoring protocols to assess environmental impacts and highlight that many current policies are insufficient to regulate development (30) . Capable of capturing near present changes to the land surface, the ml-HFI could be an important tool for assisting countries that are flagging behind their SDG15 targets to close the gap by 2030. 7 That some countries can simultaneously be progressing toward biodiversity goals while experiencing dramatic increases in human pressure confounds expectations. Moving forward, science and policy must work to understand why and how this is possible. This will permit the development of better policies that will foster a thriving society and environment in the Anthropocene.

Human Footprint Index
We quantify human pressure on the landscape using the human footprint index (HFI), which is a global, dimensionless index of human pressure on the global land surface (9,37) . We employ the most up-to-date version of the HFI (10) , which we denote as "Williams-HFI" throughout this text. This version of the HFI is comprised of eight different sub indices representing different aspects of human pressures to the terrestrial surface of the Earth, including 1) extent of the built environment, 2) population density, 3) electric infrastructure, 4) agricultural lands, 5) pasture lands, 6) roadways, 7) railways, and 8) navigable waterways. We rescale the original index, which ranges from 0 to 50, to a range from 0 to 1.

Landsat imagery
We use the Global Forest Change dataset, which is a global analysis of forest cover change based on Landsat imagery (20) . The data consist of processed Landsat imagery from the bands 3, 4, 5, and 7. The global processing identifies growing-season imagery and only includes cloud-free images. We specifically use bands 3, 5, and 7 in our analysis (corresponding to the "red" band, and two bands of "near infrared"). We chose these bands based on preliminary sensitivity tests, which revealed these were the most sensitive to interpreting the terrestrial changes we needed to identify. In all images shown here, the three GFCv1.7 channels are combined to create a false color image as near infrared bands are outside of the visible spectrum.
The latest version of the Global Forest Change product includes cloud-free processed Landsat

Architecture
We train a 2D convolutional neural network (CNN) with 5 convolutional layers with zero padding and stride of 2, and a final hidden dense layer of 32 units that feeds into the output layer of size 1 (a single predicted HFI value). The first 4 convolutional layers are each followed by max pooling layers with pooling size 2x2, stride of 2, and valid padding. A dropout layer with a 50% dropout rate is applied following the final convolutional layer, prior to the dense layer. The first convolutional layer has 64 filters, while all subsequent convolutional layers have 128. All filters have size 3x3. All activations use the rectified linear unit (ReLu) except for the dense layer, which utilizes a sigmoid activation function to constrain the output between 0 and 1.
The input into the CNN is an array with shape (120,120,3) representing 3 Landsat channels of 120 pixels x 120 pixels each. An HFI pixel is approximately 40x40 Landsat pixels, however, we found that the model performed best when the input included surrounding Landsat pixels, likely to give the CNN more "context" to predict the HFI in the center of the image. Thus, we input 3 120x120 pixel Landsat images and task the CNN to output a single value between 0 and 1. This value represents the HFI in the center (see white boxes in Supp. Fig. 2).

Data Input and Output
We train a separate CNN for each region shown in Supp. Fig. 1 between 70S and 70N. The CNN trained on Outer Asia was used to evaluate pixels within Oceania due to the small sample size of the region. The model used to evaluate Northern Africa was trained on all of Africa but only evaluated over Northern Africa. A separate CNN was trained over Southern Africa. Pixels used 9 for training are shown in Supp. Fig. 3 (purple shading), and were chosen to optimize computation time while maximizing the representation of unique geophysical features and human impacts.
Over all trained CNNs, we train on 3.8 million HFI pixels and predict 1.4e8 ml-HFI values for 2000 and 2019, separately. We only train on approximately 3% of the non-water data available due to computational and storage limitations, but expect many more samples could be included to further improve the ml-HFI accuracy. Because across the globe there are substantially more low HFI pixels than high pixels (see Fig. 1 and Supp. Fig. 4)), we restrict training to a maximum of 40,000 pixels per 10 deg. by 10 deg. box, and enforce that no HFI bin is represented by more than about 4,000 samples each. Since high HFI values are quite rare (see Supp. Fig. 4), we augment our training set by rotating the inputs by 90 degrees to create 4 unique training/output samples for a single HFI pixel.

Training and Loss Function
The CNN is trained to minimize the mean squared error of the predicted ml-HFI compared to the Williams-HFI. This is done for the year 2000 data (both the Williams-HFI and the Landsat imagery); the CNN is then used to predict the 2019 ml-HFI when no Williams-HFI exists. We have designed the CNN to output a single, continuous value. However, an HFI value of zero holds special meaning, as it implies a pristine landscape with no human influence. To force the CNN to predict the value of exactly zero (rather than just a small number), we re-label HFI values of zero as -0.2 during training when computing the loss function. Because the sigmoid activation function is applied prior to the output of the CNN, the CNN is unable to ever predict a value below 0; thus, this relabeling encourages the CNN to predict values of exactly zero.
As a preprocessing step, Landsat input channels (1,3,4) are standardized, first dividing by 255 (to convert to values between 0 and 1), then subtracting (0.08,0.24,0.12) and dividing by (0.46,0.58,0.47). In this way, we obtain values that are centered around zero and extend to approximately plus/minus 1.0. Throughout our analysis, we noticed that the general "brightness" of the Landsat imagery can vary in space, and also between year 2000 and 2019 within the GFCv1.7 data. We add a random integer between (-10,10) to each input channel prior to training.
In this way, we eliminate the tendency of the CNN to base its predictions too heavily on the magnitude of the input values. This step may be exceptionally important if one wishes to train and predict from various versions of the GFC Landsat imagery.
We train each CNN using a learning rate of 0.001 and a batch size of 128. After selecting a maximum of 40,000 samples for each 10 deg. by 10 deg. Landsat granule, we withhold 750 samples for validation during training. These samples are balanced to distribute similar counts in each of 10 HFI bins. Training is halted via early stopping when the validation mean squared error increases for 3 epochs in a row. At that point, we restore the weights from the best model from prior epochs to define the final CNN.

Attribution of the CNN's Decisions using Layer-wise Relevance Propagation
To gain insight and confidence in the CNN predictions, we apply a technique called layerwise relevance propagation (LRP; (21,22) ). This method highlights the regions of the input, or the pixels of the 3 Landsat images, that were most relevant to the CNNs output (i.e prediction of the ml-HFI). This technique acts to "open the black box" for neural network tasks and, importantly, provides intuitive confirmation or possible refutation of the CNN's decision-making process. In our case, we use LRP as a quality control step to check that the CNN is paying attention to the correct features that would lead to high or low HFI predicted values.
LRP, applied separately to each prediction of the CNN, produces a heatmap displaying the relevance of each pixel of each of the 3 input Landsat images (i.e. channels). We have computed the LRP relevance heatmaps for selected predictions of the 2019 ml-HFI where human pressure has substantially increased. The average relevance across the 3 input channels are displayed in To demonstrate the success of the ml-HFI in predicting human pressure from GFCv1.7 Landsat imagery, Supp. Figs. 9-12 provide example outputs of the ml-HFI for urban, agricultural, mining, and rural landscapes in the year-2000. In all of these cases, the ml-HFI compares well with the Williams-HFI, although it is clear that the Williams-HFI emphasizes road networks and is smoother than the ml-HFI which is predicted on a pixel-by-pixel basis.
We assess the skill in the ml-HFI values using the Williams-HFI values as "truth"; however, the Williams-HFI values themselves are not ground truth, but are estimations that can also be incorrect at times (although this is likely rare due to the extensive amount of work that goes into developing this index). With that said, we have found multiple locations where the ml-HFI predicts high human pressure but the Williams-HFI does not. Upon inspection of these cases believe the ml-HFI to be correct. Three such examples, provided in Supp. Fig. 13, highlight how the ml-HFI may be useful for identifying regions of high human pressure that may have been overlooked in the Williams-HFI.
There are also regions where we believe the ml-HFI struggles. The largest errors appear over dry, rocky terrain devoid of human activity, for example, the rocky parts of the Saharan Desert, the Australian desert, and the spine of the Andes (Supp. Fig. 7-8). In these areas, the ml-HFI predicts human activity where none exists. We provide three such examples in Supp. Fig. 14. However, ml-HFI errors over these problematic regions of low HFI are still relatively small (Supp. Fig. 6-8), with the 90th percentile error being approximately 0.12.

Calculation and Presentation of HFI Changes between 2000 and 2019
We compute the change in ml-HFI between 2000 and 2019 and plot regions with exceptionally large changes (changes greater than 0.25 are shown in red and changes less than -0.25 are shown in blue) on top of the year-2000 ml-HFI values in Fig 2, which are shaded in gray. Because the mI-HFI appears to struggle over rocky deserts and mountains where the Williams-FHI is zero, we mask those change calculations. This masking is carried out in regions where the Williams-HFI was less than 0.02 in 2000 (termed "Wilderness" by (10) ) and the mI-HFI error was larger than 0.001. In other words, we mask out changes where the 2000 mI-HFI initially struggled with a zero prediction (i.e. no human activity at all).

Green Outlines in Figure 2
The green outlines in Fig 2 summarize countries that are experiencing an increase in HFI between 2000 and 2019 of greater than 0.25 across 1% or more of their land area and are 'moderately improving' or 'on track to achieve SDG15' (Supp Table 1). All 43 countries in the table have been extensively quality-controlled and verified to have experienced change using the time-lapse feature on Google Earth Pro. Norway, Gabon, Libya and Sudan have been removed from the list as further inspection shows that the CNN struggles in those locations. In Norway the presence of high latitude rocky areas are mis-classified as experiencing change. In Gabon there is an apparent Landsat imagery issue which appears to be localized specifically to Gabon's interior. Both Libya and Sudan contain large areas of rocky, remote deserts that are misclassified 13 as experiencing substantial human impact.  Table 1. ml-HFI changes displayed here are degraded in resolution for easier visualization.