ReSurveyGermany: Vegetation-plot time-series over the past hundred years in Germany

Vegetation-plot resurvey data are a main source of information on terrestrial biodiversity change, with records reaching back more than one century. Although more and more data from re-sampled plots have been published, there is not yet a comprehensive open-access dataset available for analysis. Here, we compiled and harmonised vegetation-plot resurvey data from Germany covering almost 100 years. We show the distribution of the plot data in space, time and across habitat types of the European Nature Information System (EUNIS). In addition, we include metadata on geographic location, plot size and vegetation structure. The data allow temporal biodiversity change to be assessed at the community scale, reaching back further into the past than most comparable data yet available. They also enable tracking changes in the incidence and distribution of individual species across Germany. In summary, the data come at a level of detail that holds promise for broadening our understanding of the mechanisms and drivers behind plant diversity change over the last century.


Background & Summary
The current biodiversity crisis threatens an estimated one million species with extinction 1 . The nature and rate of observed changes depend on the spatial scale at which they are observed 2 . At the finest scale, i.e. the local scale of plant communities, vegetation-plot records have been found to become sometimes richer, sometimes poorer in species 3 , while a considerable temporal species turnover is apparent in the majority of cases 4 .
Vegetation-plot resurvey data have been extensively used to assess biodiversity changes by means of monitoring certain vegetation types in local studies, such as managed grasslands 26 and rivers 43 . More recently, time series have been collected across regions, exploring the contribution of local biodiversity change 3 to that observed at broader spatial scales 1,59,60 . While these analyses often failed to detect changes in species richness 3,61,62 , they were able to relate the observed trends to changes in land use and climate 63,64 . Although these studies have compiled databases on vegetation-plot time series, these databases are currently not openly available. This is also the case for the current initiative of ReSurveyEurope, which collates and mobilizes vegetation-plot data with repeated measurements over time (http://euroveg.org/eva-database-re-survey-europe). Our aim is to provide a comprehensive and taxonomically standardised database of vegetation-plot time series for Germany. We confined the geographical extent to Germany because of a long tradition of German vegetation scientists carrying out temporal observations on
# A full list of authors and their affiliations appears at the end of the paper.
Fig. 2 Map of all plots of all projects (n = 23,641). Note that green dots may represent one or several plots which were summarised under the same plot resurvey ID (n = 7,738). The more complete coverage of Bavaria resulted from including the grassland monitoring Bavaria, which started in 2002 26 .

Data Descriptor OPEN
co-occurrence at scales relevant for direct biotic interactions among individuals 70 . An additional advantage of vegetation-plot records is that they report the relative abundance of species, which in the case of vegetation records from Germany is typically assessed as cover values 67,71 . While species cover is very often estimated directly as the percentage of ground covered by each species, there is a long tradition in vegetation science of using cover scales with distinct classes to facilitate cover estimation. There is a variety of cover scales, with different classes preferred by researchers in different countries 71,72 . The scale still most frequently used in Germany was introduced by Braun-Blanquet 6 . This scale, however, is not based on cover alone, but uses the abundance of individual plants as an additional criterion for species with a cover of ≤1% (classes r and +), which raises difficulties in numerical analyses 71,73 . To facilitate the estimation of cover changes in time series, Londo introduced a pure cover scale 74 , which in its original or in a simplified form (e.g. 75 ) became very popular in permanent plot research. It is common practice for resurvey studies to use the same cover scale as the original survey. Nevertheless, for a numerical comparison of changes, cover classes have to be converted into per cent values 72 , for which the Turboveg software introduced standardized transformations using the midpoints of the cover classes 76 . The cover information in vegetation-plot records allows key theories of biogeography to be tested, such as the abundance-range size relationship 77 or the relationship between local abundance and niche breadth 78,79 . Most importantly, several vegetation-plot time series precede the onset of any other systematic plant species monitoring programme, for example the monitoring of Natura 2000 sites in Europe, which only started in 2001 80 .
This is particularly important because severe biodiversity loss may have already happened in the second half of the 20th century, mainly brought about by shifts in the type and intensity of land use as the consequence of technical progress and societal changes 81 . Finally, species-abundance data in plots can be linked to functional information on species 67 , which allows the interpretation of the underlying ecological drivers of the changes observed and the consequences for ecosystem functioning 82 .
Based on the data described here, we analysed for the first time the dynamics of losses and gains of plant species 83 . We showed that the difference in cover changes between decreasing and increasing species results in biodiversity change even if species richness at the plot scale remains unchanged. Two mechanisms are responsible for these changes. First, losses at the plot scale were more evenly distributed among losing species than gains among winning species. Second, gains and losses in cover were concentrated in different species, resulting in a higher number of losers than winners at the spatial scale of Germany. The temporal extent of the data allowed us to demonstrate that most species losses had already occurred by the 1960s, affecting mostly species from mires and spring fens, grasslands and arable land. Thus, these data have already helped to shed light on the most important mechanisms underlying biodiversity change in the second half of the 20th century.

Methods
ReSurveyGermany is the most comprehensive compilation of repeated long-term vegetation plot records from Germany to date, including published studies as well as surveys from grey literature and nature conservation assessments. A list of all 92 projects included in the database is provided in Supplementary Table S1. A project might comprise one or several studies and focus on one or several vegetation types, but the surveys within a project were typically carried out at the same times and with the same methodology. In total, the projects comprise 1,794 vascular plant species recorded in 7,738 vegetation plots. The plots were either marked with poles or magnets (permanent) or recovered from exact descriptions, sketches or marks in high-resolution topographic maps (semi-permanent). The uncertainty of the positions varied among studies, but also within a single study, as resurveyed plots might have been marked in the later surveys. If the uncertainty was provided by the author or could be estimated from topographic maps, this information was included in the PRECISION field of the header file of ReSurveyGermany (see below). In addition, there were also studies where plots were not matched in time but a set of plots at a site was compared with another set of plots at the same site in the resurvey (community comparison, Fig. 1). We only considered records with complete lists of vascular plants and information on their relative abundance, which was mostly expressed as percentage cover 84 . A further important criterion for including a study was the existence of vegetation data for at least two points in time, although the number of visits (i.e. vegetation records) per site ranged between two and 54. The time span covered by each project is shown in Fig. 1. All records were made between 1927 and 2020. In total, ReSurveyGermany comprises 23,641 vegetation-plot records and 458,311 species cover records.
Table 1. Representativeness of grid cells ("Messtischblattquadrant, MTBQ", a quadrant of the German ordnance maps, 0°5′ × 0°3′) with time series. The estimates were obtained from linear models comparing sampled with unsampled MTBQs with respect to population density, road density, urban cover, cropland cover and protected area.
www.nature.com/scientificdata
Plot locations are not evenly spread across Germany (Fig. 2). We assigned the individual plot locations to the grid cells of the quadrants of German ordnance maps ("MTBQ", 0°5′ × 0°3′, approximately 5.6 km × 5.9 km in the centre of Germany), and tested whether the grid cells with vegetation-plot time-series records differed from those without observations with respect to population density, road density, urban cover, cropland cover and protected areas. Using the land cover dataset from the European Space Agency Climate Change Initiative 85 , we calculated the proportion of urban cover for each MTBQ. Spatial information on protected areas was obtained from GIS shapefiles provided by the German Federal Agency for Nature Conservation (Bundesamt für Naturschutz, BfN). This analysis revealed that the sampled grid cells were not representative of the whole area of Germany. As expected from other studies 86 , the sampled grid cells showed significantly higher human population densities, road densities and urban cover, while the cover of cropland and the amount of protected area were lower (Table 1), which indicates that the majority of time series were recorded in regions with higher human pressures. The lack of spatial representativeness also becomes obvious when plot locations are mapped by the decade in which they were visited (Fig. 3).
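The assignment of plot coordinates to 5′ × 3′ grid cells can be illustrated with a minimal Python sketch. It is not the procedure used for ReSurveyGermany: the function names are hypothetical, the cell index is simply counted from 0° longitude and 0° latitude and does not reproduce the official MTBQ numbering, and the cell size assumes a spherical Earth.

```python
import math
from collections import Counter

# Cell dimensions stated in the text: 5 arc-minutes (longitude) x 3 arc-minutes (latitude).
CELL_LON_MIN = 5
CELL_LAT_MIN = 3


def grid_cell(lon, lat):
    """Return a (column, row) index of the 5' x 3' cell containing the point.

    Hypothetical indexing counted from 0 deg E / 0 deg N; NOT the official MTBQ numbers.
    """
    return (math.floor(lon * 60 / CELL_LON_MIN), math.floor(lat * 60 / CELL_LAT_MIN))


def cell_size_km(lat):
    """Approximate east-west and north-south extent of one cell at a given latitude."""
    km_per_degree = 111.2  # assumption: spherical Earth with a radius of about 6371 km
    width = CELL_LON_MIN / 60 * km_per_degree * math.cos(math.radians(lat))
    height = CELL_LAT_MIN / 60 * km_per_degree
    return width, height


def plots_per_cell(coordinates):
    """Count how many plots fall into each grid cell."""
    return Counter(grid_cell(lon, lat) for lon, lat in coordinates)
```

Cells with a count above zero would form the sampled group, all remaining cells the unsampled group, in a comparison such as the one summarised in Table 1. For a latitude of 51° N, `cell_size_km` returns extents of a similar magnitude to the approximate cell size quoted above.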
While we did not deliberately exclude certain habitat types, the data mainly consist of semi-natural to intensively managed grasslands and forests. Thus, the time series in ReSurveyGermany are biased with respect to habitat types. We assigned EUNIS habitat types to each plot record. The European Nature Information System (EUNIS) provides a comprehensive typology for the terrestrial and marine habitats of Europe 87 . Habitat types are arranged in a hierarchy, from the highest level 1 to the lowest level 4. Here, we show the assignment of plot records to level 3, which was accomplished by using the expert system EUNIS-ESy 88 and the corresponding R code 89 . Plot records covered a total of 92 EUNIS habitat types out of the 150 distinguished for Germany. About 63% of the 23,641 plot records came from grasslands (level 1 EUNIS habitat R, n = 14,849), followed by forests and other wooded lands (T, n = 5,440, 23%). In contrast, arable land (V, vegetated man-made habitats), which makes up more than 36% of the land cover in Germany, was only represented by 3% (816 plot records).

Data Records
The data of the ReSurveyGermany dataset as described above are available at https://doi.org/10.25829/idiv.3514-0qsq70 under the terms specified by CC BY 4.0 90 .
A separate database was created for each project that contributed data, using the data-management software Turboveg 2 76 . The database is composed of two main tables, following the structure of Turboveg and common practice in vegetation science. The plot-species-abundance table contains six fields as described in Table 2. It is linked to the plot metadata (header file) through PROJECT_ID_RELEVE_NR, which is a unique plot observation ID formed by combining PROJECT_ID (see Supplementary Table S1) and the plot observation ID (called RELEVE_NR). The table further lists the name of the observed taxon (TaxonName), the vertical layer (tree layer, shrub layer, herb layer, moss layer) in which the species was observed (LAYER) and the taxon's cover in the plot (Cover_Perc). The latter was obtained by transforming the original cover classes into per cent cover, using the midpoints of the cover classes as provided by the Turboveg software 76 . For example, the seven cover classes of the Braun-Blanquet scale 6 , r, +, 1, 2, 3, 4, 5 were transformed to 1%, 2%, 3%, 13%, 38%, 63%, 88%, respectively. The other table is the so-called header file, which holds all important plot-level information, such as plot sizes, geographic location and vegetation structure for each plot observation ID (Table 3).
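The class-to-percentage transformation can be written as a simple lookup. The following Python sketch uses only the Braun-Blanquet values quoted above; the dictionary and function names are hypothetical, and other cover scales would need their own lookup tables, which are not reproduced here.

```python
# Braun-Blanquet classes and their per cent values as quoted in the text above.
BRAUN_BLANQUET_TO_PERCENT = {
    "r": 1.0,
    "+": 2.0,
    "1": 3.0,
    "2": 13.0,
    "3": 38.0,
    "4": 63.0,
    "5": 88.0,
}


def cover_class_to_percent(cover_class, scale=BRAUN_BLANQUET_TO_PERCENT):
    """Translate one cover class into per cent cover (hypothetical helper)."""
    key = str(cover_class).strip()
    if key not in scale:
        # Fail loudly instead of guessing a value for an unknown class.
        raise ValueError(f"Unknown cover class: {cover_class!r}")
    return scale[key]


# Example: a record given in classes becomes a record in per cent cover.
record = {"Arrhenatherum elatius": "3", "Galium album": "+"}
record_percent = {taxon: cover_class_to_percent(c) for taxon, c in record.items()}
```

Raising an error for unknown classes is a design choice of this sketch; it prevents records from a different cover scale from being converted silently with the wrong table.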
The taxon names in the plot-species-abundance table were standardised using GermanSL 1.3 91 . The nomenclature for vascular plants followed Wisskirchen et al. 92 , with additional aggregations to higher taxonomic levels according to GermanSL 1.3. As some authors recorded subspecies and other infraspecific taxa, these were aggregated at the species level, using the R package vegdata 93 . Some closely related species that, from our experience, are often mistaken in the field were merged at the aggregate or genus level. Species aggregates were also used when different taxon names of the same aggregate occurred in different projects, to prevent the same taxon from appearing under different taxon names. We used our own R code to merge taxon names and the notation of the ESy expert system 88 to document all steps. The species harmonisation forms the first section of the ESy system and shows which taxon names were aggregated under the name of a broader taxonomic concept (Supplementary Table S2). In addition, within single projects, we used customised aggregations and segregations when the same taxa were reported at different taxonomic levels at different points in time in the same plot resurvey IDs (Supplementary Table S3). For example, Orchis militaris was reported in all years of a time series of a specific plot except one, in which Orchis spec. was recorded at the genus level. Unaccounted for, such a leap between taxonomic levels within a time series would result in incorrect species change observations. To avoid losing the predominating information at the species level by aggregating all records to Orchis, we assumed that the taxon was also Orchis militaris in the particular year when only the genus level was reported. If more than one taxon occurred in previous years, we distributed the cover values equally among those taxa. This happened, for example, when a record was taken late in spring, when the two species Anemone nemorosa and A. ranunculoides could no longer be distinguished.
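The within-plot handling of genus-level records can be sketched as follows. This is an illustration in Python under simplifying assumptions, not the R code used for ReSurveyGermany: the function name and data layout are hypothetical, candidate species are taken from all other years of the same plot, the genus is identified by the first word of the taxon name, and a share is simply added if the species is already present in that year.

```python
def resolve_genus_record(series, genus_taxon):
    """Replace a genus-level record by the species known from the same plot.

    series: {year: {taxon name: per cent cover}} for ONE plot (hypothetical layout).
    genus_taxon: the genus-level name to resolve, e.g. "Orchis spec.".
    """
    genus = genus_taxon.split()[0]
    # Species of this genus recorded at species level in other years of the plot.
    species = sorted({
        taxon
        for record in series.values()
        for taxon in record
        if taxon != genus_taxon and taxon.split()[0] == genus
    })
    resolved = {}
    for year, record in series.items():
        new_record = dict(record)
        if species and genus_taxon in new_record:
            # Distribute the genus-level cover equally among the candidate species.
            share = new_record.pop(genus_taxon) / len(species)
            for taxon in species:
                new_record[taxon] = new_record.get(taxon, 0.0) + share
        resolved[year] = new_record
    return resolved


# Invented example values for the two cases described in the text.
orchis = {1980: {"Orchis militaris": 3.0}, 1990: {"Orchis spec.": 2.0}}
anemone = {
    1990: {"Anemone nemorosa": 10.0, "Anemone ranunculoides": 4.0},
    2000: {"Anemone spec.": 8.0},
}
```

With one candidate species, the genus record is renamed; with two, its cover is split in half. If no species of the genus was ever recorded in the plot, the genus-level record is left unchanged.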
The percentage cover values of the same aggregated taxon name of the same plot were merged, assuming a random overlap of their cover values and making sure that the combined cover values cannot exceed 100% 76,94 . This often resulted in cover values with decimal points and might suggest an accuracy of cover estimation that is not warranted by the original estimates. As not all projects had recorded cryptogams, we removed bryophytes and lichens in all projects, using the vegdata package in R 93 . As a result, the original list of 3,280 taxon names that included bryophytes and lichens was reduced to 1,794 taxon names of vascular plants. However, if data on lichens and bryophytes are required, they are available on request from the respective dataset custodians (see Supplementary Table S1).
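One common way to formalise a random overlap of cover values is to multiply the uncovered fractions, so that the combined cover is 100 × (1 − ∏(1 − c_i/100)). The Python sketch below implements this formula as an assumption about what "random overlap" means here; it is not taken from the ReSurveyGermany code, and the function name is hypothetical.

```python
def combine_cover(covers_percent):
    """Combine per cent cover values assuming that the taxa overlap at random.

    The share of ground left uncovered by all taxa is the product of the
    individual uncovered shares, so the result can never exceed 100%.
    """
    uncovered = 1.0
    for cover in covers_percent:
        if not 0.0 <= cover <= 100.0:
            raise ValueError(f"Cover must lie between 0 and 100, got {cover}")
        uncovered *= 1.0 - cover / 100.0
    return 100.0 * (1.0 - uncovered)


# Two taxa with 38% and 13% cover, merged under one aggregate name.
merged = combine_cover([38.0, 13.0])
```

For two taxa with covers a and b, the formula reduces to a + b − a·b/100, which is slightly less than the plain sum and explains why merged values often carry decimal places.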
The data structure of the header file of ReSurveyGermany follows the Turboveg 2 standard 76 and in addition holds the fields of ReSurveyEurope (http://euroveg.org/eva-database-re-survey-europe) (Table 3). The fields relevant for the resurvey include RS_PROJECT, which refers to the resurvey project in Supplementary Table S1. The header field RS_SITE holds the location name of plots and allows resurvey plots to be aggregated at a local geographical scale within projects. LOCALITY provides more details on the locality in German.
Within each project, the field RS_PLOT holds a plot resurvey ID that connects plot observations from different times made on the same plot. There are also cases in which the previously provided location was not precise enough. In these cases, resurveys often used several plots to match one previous plot, resulting in a one-to-many relationship. If a set of plots at the same site was compared with plot records from another point in time, several plot records in the same year might have the same RS_PLOT code. The unique code of the one-time observation (RS_OBSERV) is a combination of RS_SITE, RS_PLOT and the YEAR when the plot was surveyed. We report the exact DATE when a record was made (if available). In addition, the field YEAR lists the year in which the plot was (re)surveyed. If available, we also report the year of the underlying publication (YEAR_PUBL).
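How these identifiers connect observations over time can be sketched in Python. The field names are those of the header file; the separator used to build the observation code, the function names and the example values are assumptions made for illustration only.

```python
from collections import defaultdict


def observation_code(row, sep="_"):
    """Build a one-time observation code from RS_SITE, RS_PLOT and YEAR.

    The separator is an assumption; the dataset may use a different notation.
    """
    return sep.join(str(row[field]) for field in ("RS_SITE", "RS_PLOT", "YEAR"))


def group_time_series(header_rows):
    """Group header rows into time series keyed by (RS_PROJECT, RS_PLOT)."""
    series = defaultdict(list)
    for row in header_rows:
        series[(row["RS_PROJECT"], row["RS_PLOT"])].append(row)
    # Sort each series chronologically.
    return {
        key: sorted(rows, key=lambda r: int(r["YEAR"]))
        for key, rows in series.items()
    }


# Invented header rows: plot P1 has two records in 2015 (one-to-many match).
rows = [
    {"RS_PROJECT": "A", "RS_SITE": "S1", "RS_PLOT": "P1", "YEAR": "2015"},
    {"RS_PROJECT": "A", "RS_SITE": "S1", "RS_PLOT": "P1", "YEAR": "1960"},
    {"RS_PROJECT": "A", "RS_SITE": "S1", "RS_PLOT": "P1", "YEAR": "2015"},
    {"RS_PROJECT": "A", "RS_SITE": "S1", "RS_PLOT": "P2", "YEAR": "1960"},
]
time_series = group_time_series(rows)
```

Because several records of the same year can share one RS_PLOT code, an analysis would typically summarise such records per year before comparing years within a series.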
Plot area (SURF_AREA) ranges from 0.5 to 2,500 m², with 25, 100 and 400 m² being the most frequently used plot sizes (Fig. 4). Plot sizes larger than 100 m² were typical of forest sites (with very few exceptions).
Geographic information is given by LONGITUDE, LATITUDE and ALTITUDE. Current monitoring programmes and the data protection of land owners do not allow us to provide location information at the highest available precision. In addition, some records contain occurrence data of rare and protected species. Thus, information on longitude and latitude was rounded to two decimal places. Compared to the coordinates at the highest available precision, rounding resulted in a mean uncertainty of 371 m (±138 m standard deviation), which is within the somewhat limited range of accuracy provided by many custodians in the first place (see field PRECISION). If more precise coordinates are required for certain analyses, we recommend contacting the respective data owners (as shown in Supplementary Table S1). Vegetation-plot time series differ with respect to the accuracy of the plot relocation during the resurvey. In the ideal case, plots are permanently marked, using poles, metal tent pegs or magnets and metal detectors to retrieve their position (shown as "01" in the LOC_METHOD field, Table 3). In other cases, plots only have exact coordinates (using GPS coordinates, "03" or "04") or other descriptions of the exact locality (such as from maps, "05"), but are not marked on the ground, which we refer to as semi-permanent plots. In addition, there is information on the cover scale used for the record and a reference to the data source (or, if published, the publication ID), including the table and column from which the data were taken.
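The positional shift introduced by rounding coordinates to two decimal places can be estimated for any single point. The Python sketch below uses the haversine distance on a spherical Earth; the function names and the example coordinates are hypothetical, and it is not the calculation behind the mean uncertainty reported above.

```python
import math

EARTH_RADIUS_M = 6371000.0  # assumption: mean radius of a spherical Earth


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def rounding_displacement_m(lon, lat, digits=2):
    """Distance between a coordinate and its version rounded to `digits` places."""
    return haversine_m(lon, lat, round(lon, digits), round(lat, digits))


# Invented point close to the largest possible offset (almost 0.005 deg on both axes).
shift = rounding_displacement_m(11.974999, 51.484999)
```

Rounding to two decimal places moves a point by at most 0.005° along each axis, so the displacement depends on latitude: the east-west component shrinks with the cosine of the latitude, while the north-south component stays the same.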
The orientation of the plot can be taken from SLOPE (inclination) and slope ASPECT (compass directions). Vegetation structure is described by the height and cover of the different layers, ranging from tree layer to moss layer and including information on cover of litter and bare soil (if available).
Some of our projects included experimental treatments with different management of habitats (e.g. abandonment or establishment of grazing, succession and disturbance). Plots with experimental manipulation contain "Y" in the MANIPULATE field. The type of manipulation can be taken from MANIPTYPE. When projects involved treatments that are not appropriate to assess biodiversity change, we included only the control plots 46 , plots that reflected the predominant land use at the site (e.g. mowing for a grassland to counteract natural succession) 22 , that were unfenced 95 or were subjected to continuous grazing 96 .
Supplementary Table S1  0
PROJECT_ID  I  Number of the resurvey project in ReSurveyGermany; see Supplementary Table S1  0
RS_PLOT  C  Unique (within the site) code of the resurveyed plot; it is used to pair observations from different times recorded in the same plot; gives a unique identifier for the resurveyed plot or set of plots in time if combined with RS_PROJECT. Several plots in the same year might have the same RS_PLOT code if they have to be summarised for temporal comparisons. In these cases, they might also contain the community name.

Technical Validation
As each dataset was transformed into a Turboveg 2 database 76 , a quality check was made when importing the data. This particularly applied to the taxonomical harmonization of the data, which at the stage of entering the data was adjusted to GermanSL 1.3 91 .

Usage Notes
Users are urged to cite the original sources when using ReSurveyGermany in addition to the present paper (see Supplementary Table S1). As some of the time series will be continued, it might be useful to contact the respective data owners. As described above, the dataset cannot be considered representative of Germany's vegetation, either spatially or temporally, which is typical of vegetation-plot time series 97 . As plots were established with different objectives in different habitats at different points in time, the analysis of vegetation-plot resurveys faces various methodological challenges 62 . Yet, we note that ReSurveyGermany covers about 60% of the 2,988 vascular plant species that occur in Germany (without subspecies and segregates 92 ) and includes rare habitats, which often harbour rare plant species. This means that even if our sites are not fully representative of the vegetation of Germany and its change over the last century, the data can nevertheless provide important insights into biodiversity change at the level of local communities and individual species.

Code Availability
The R code to read the plot-species-abundance file (ReSurveyGermany.csv) and combine it with the header data (Header_ReSurveyGermany.csv) is provided at https://github.com/idiv-biodiversity/Read_ReSurveyGermany.
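For readers who do not work in R, the same join can be sketched with the Python standard library. This is not a translation of the repository code: the function names are hypothetical, a comma delimiter is assumed, and the example rows are invented stand-ins for the two CSV files, using only field names mentioned in this paper.

```python
import csv
import io


def read_rows(handle, delimiter=","):
    """Read a CSV file object into a list of dictionaries (delimiter is an assumption)."""
    return list(csv.DictReader(handle, delimiter=delimiter))


def join_species_with_header(species_rows, header_rows, key="PROJECT_ID_RELEVE_NR"):
    """Attach the plot-level header fields to every species cover record."""
    header_by_id = {row[key]: row for row in header_rows}
    joined = []
    for row in species_rows:
        header = header_by_id.get(row[key])
        if header is None:
            continue  # skip species records without a matching header entry
        joined.append({**header, **row})
    return joined


# Invented miniature stand-ins for ReSurveyGermany.csv and Header_ReSurveyGermany.csv.
species_csv = io.StringIO(
    "PROJECT_ID_RELEVE_NR,TaxonName,LAYER,Cover_Perc\n"
    "1_1,Fagus sylvatica,1,88\n"
    "1_2,Fagus sylvatica,1,63\n"
)
header_csv = io.StringIO(
    "PROJECT_ID_RELEVE_NR,RS_PLOT,YEAR\n"
    "1_1,P1,1990\n"
    "1_2,P1,2015\n"
)
joined = join_species_with_header(read_rows(species_csv), read_rows(header_csv))
```

With the real files, the `io.StringIO` objects would be replaced by file handles opened with `open(..., newline="")`. All values are read as strings, so numeric fields such as Cover_Perc and YEAR need to be converted before analysis.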