Worldwide Soundscapes: a synthesis of passive acoustic monitoring across realms

The urgency for remote, reliable, and scalable biodiversity monitoring amidst mounting human pressures on climate and ecosystems has sparked worldwide interest in Passive Acoustic Monitoring (PAM), but there has been no comprehensive overview of its coverage across realms. We present metadata from 358 datasets recorded since 1991 in and above land and water constituting the first global synthesis of sampling coverage across spatial, temporal, and ecological scales. We compiled summary statistics (sampling locations, deployment schedules, focal taxa, and recording parameters) and used eleven case studies to assess trends in biological, anthropogenic, and geophysical sounds. Terrestrial sampling is spatially denser (42 sites/M·km2) than aquatic sampling (0.2 and 1.3 sites/M·km2 in oceans and freshwater) with only one subterranean dataset. Although diel and lunar cycles are well-covered in all realms, only marine datasets (65%) comprehensively sample all seasons. Across realms, biological sounds show contrasting diel activity, while declining with distance from the equator and anthropogenic activity. PAM can thus inform phenology, macroecology, and conservation studies, but representation can be improved by widening terrestrial taxonomic breadth, expanding coverage in the high seas, and increasing spatio-temporal replication in freshwater habitats. Overall, PAM shows considerable promise to support global biodiversity monitoring efforts.

Despite the wide-ranging and increasing soundscape sampling effort, the data distribution remains undescribed. Currently, soundscape-recording communities are only networked within realms, and their methodologies differ. Previous reviews focused on single realms and were either systematic [19][20][21][22] or qualitative 3,23. Marine scientist networks using PAM exist 24, but the freshwater community is nascent, and the terrestrial community grows faster than it can unite. Acoustic calibration and sound propagation modeling are advanced in aquatic studies 25 but seldom considered in terrestrial ones (except 26,27). Artificial intelligence can identify increasing numbers of species on land 28, whereas aquatic sounds still challenge identification 29,30. Overall, there is much to gain from sharing data, experience, and methods among PAM users.
Cross-realm PAM studies can yield new theoretical answers 31 and applied solutions: soundscapes track terrestrial and marine resilience to natural disasters 32; sound and silence durations in multiple realms follow universal distributions 33. Transnational sampling could form the basis for comprehensive soniferous biodiversity monitoring, just as community-initiated telemetry databases 34, collaborative camera trap surveys 35, individual animal observation networks 36, and invasive species control syntheses 37 advanced entire research fields. A global PAM network could establish historical biodiversity baselines, support systematic long-term and large-scale monitoring, and connect with the public through citizen science. Such information is critical to inform global biodiversity policies such as the Kunming-Montreal Global Biodiversity Framework.
We present the "Worldwide Soundscapes" project, the first global PAM meta-database and network.We use it to quantify the known state of PAM efforts, highlight apparent sampling gaps and biases, illustrate the potential of cross-realm PAM syntheses for research, and federate PAM users.The project currently holds 307 contributors who collated metadata about 358 passively-recorded, replicated soundscape datasets.Metadata describe the exact spatio-temporal coverage, sampled ecosystems (IUCN Global Ecosystem Typology: GET), transmission medium (air, water, or soil), focal taxa (IUCN Red list), recording settings, and data and publication availability.We inferred coverage within administrative (Global Administrative Database: GADM; International Hydrographic Organisation: IHO) and protected areas (World Database on Protected Areas: WDPA) from geographic locations.We selected recordings referenced in the meta-database to quantify soundscape components (biophony, anthropophony, geophony) across eleven ecosystems from all realms.We showcase their relevance to macroecology, conservation biology, and phenology research and identify opportunities to advance the global PAM network.The publicly-accessible meta-database 38,39 continues to grow to enhance accessibility of data and remains open for metadata contributions, facilitating future syntheses.

Summary dataset statistics
To date, 358 validated soundscape meta-datasets (hereafter "datasets") have been registered in our database from across the globe, dating back to 1991 (Fig. 1D).A dataset gathers a team's metadata on a study or project.Metadata were validated by checking that all fields were filled in and formatted correctly, and then cross-checked by their respective contributors using site maps and recording timelines 39 on ecoSound-web 40 .Based on the GET definition of four 'core' realms, our database includes 259, 80, 18, and 1 validated datasets from the terrestrial, marine, freshwater, and subterranean realms, respectively.The transmission medium was air for terrestrial and subterranean datasets and mostly water for aquatic datasets, as three datasets with aerial above-water recordings were included in the freshwater and marine realms.The majority of datasets (84%) include both spatial and temporal replicates (Fig. 1C).Few datasets have openly-accessible recordings (10-15%, Fig. S1).Presently, few terrestrial and freshwater datasets (27% and 22% respectively) are associated with DOI-referenced publications in contrast to marine datasets (44%) (Fig. S1).

Spatial sampling coverage and density
The database contains 11 093 sampling sites, including 147 polar, 8 455 temperate, and 2 491 tropical sites (Fig 1C).On land, 10 387 sites are located within 79 (out of 263) GADM level 0 areas (Fig. 2B), primarily in the Northern Hemisphere (Table S1).Most terrestrial sites occur in Canada (28%), followed by the United States (18%), but a significant proportion is widely distributed (23% do not belong to the top 10 GADM areas).Few terrestrial sites (8%) are located in WDPA category Ia, Ib, or II areas.Our database currently lacks data from vast areas in Russia, Greenland, the Antarctic, interior Australia, North Africa, and Central Asia.Site elevations range from sea level up to 3 420 m (Fig. 2A), but mountains above 4 000 m in the Northern Hemisphere, above 2 000 m in the Southern Hemisphere, as well as the Transantarctic Mountains are currently not represented in the data.At sea, 469 sites are located within 32 (out of 101) IHO sea areas.Most marine sites occur in the North Pacific Ocean (22%), followed by Southeast Alaskan and British Columbian coastal waters (17%), but a significant proportion is widespread (17% do not belong to the top 10 IHO areas).Many sites are situated in WDPA high-protection category Ia and II areas (14%).Our database currently lacks datasets from Arctic waters off Eurasia, the Southeast Pacific, and Southeast Asian coastal areas.Sampling sites span ocean depths from sea water surface to depths of 10 090 m, but tropical bathypelagic and Southern benthic areas are poorly represented (Table S2).Few GADM areas (11) are represented in the freshwater datasets.Spain holds most freshwater sites (56%).Few freshwater sites (2%) are in WDPA category II or Ib areas.Freshwater bodies are sampled at elevations from sea level up to 950 m.Mountain freshwater bodies and those in Africa, Asia, and Oceania are currently not represented in our database.The database contains seven subterranean sites situated in Brazil between 277 and 810 m.

Temporal sampling extent, coverage, and density
We compare sampling coverage (in sampled years summed over sites) across time windows of the diel cycle (dawn, dusk, day, night), lunar phases (bright and dark), and the seasons (spring, summer, autumn, winter -only for temperate sites) (Fig. 3A).Dawn and dusk diel windows are shorter than day and night diel windows for most locations and less intensively sampled.The lunar phase cycle is evenly covered across realms and comprehensively covered within datasets (Fig. S2).In the terrestrial realm, daytime coverage surpasses nighttime coverage (611 vs. 398 years), while 75% of datasets sampled all diel time windows.Terrestrial datasets mostly sampled spring (415 years, 32%) and summer (564 years, 44%) while 24% sampled all seasons.Temporal coverage per site is highest in Japan (272 days) and Taiwan (250 days).In the marine realm, temporal coverage among diel time windows is even and 88% of marine datasets sampled all diel time windows.Marine datasets have high and similar coverage for winter and spring (117 and 122 years, combined 59% of seasonal coverage) and 65% cover the full seasonal cycle.In the freshwater realm, temporal coverage among diel time windows is even and 76% of freshwater datasets sampled all diel time windows.Similarly, 24% sampled all seasons.The subterranean tropical sites primarily covered the nighttime, across all seasons.

Sampling in ecosystems
Our database includes 76 of the 110 GET functional groups and all biomes except anthropogenic shorelines and most subterranean biomes (Table S2).Sampling intensity differed across ecosystem levels -realms, biomes, and functional groups, with transitional realms representing the interface between core realms (Fig. 4A).The terrestrial realm has the third-largest extent and the highest spatial sampling density among realms (42 sites per Mkm 2 ), but temporal coverage is comparatively low (17% sampled out of mean extent per site: 155 days).The most commonly sampled biome is temperate-boreal forests and woodlands (59% of sites).The marine realm is the most extensive and spatial sampling density is the lowest (0.2 sites per Mkm 2 ), but temporal sampling extent and coverage are the highest among all realms (69% out of 497 days sampled).The most commonly sampled biomes are the marine shelf and pelagic ocean waters (43% and 42% of sites respectively).The freshwater realm has low spatial sampling densities (1.3 sites per Mkm 2 ) and high temporal sampling densities (30% out of 115 days sampled).Rivers and streams are most commonly sampled (47% of sites).The terrestrial-freshwater realm, representing 81% of the area of non-subterranean realms, has the third-highest spatial sampling density (7.9 sites per Mkm 2 ) and similar temporal sampling density to the terrestrial realm (16% days sampled out of 230).The subterranean realm, though second-largest, includes seven tropical sites in aerobic caves from one dataset.

Target taxa and frequency ranges
Most marine datasets do not target specific taxa (70%) and use wide frequency ranges from 0.01 to 30 kHz (mean bounds of frequency ranges across datasets, Fig. 4B).Marine datasets that focus on single taxa comprise cetaceans (8%, 0.006 -7 kHz) and fish (8%, 0.003 -29 kHz).Similarly, most freshwater datasets are taxonomically unspecific (66%) and cover frequencies from 1 to 22 kHz.Some datasets (20%) focus on ray-finned fish, covering frequencies from 0.001 to 23 kHz.In contrast, terrestrial datasets mostly target single taxa and have narrow frequency ranges.Bird-focused datasets are most common (44%), spanning frequencies from 0.06 to 21 kHz, while bat-focused datasets are next (12%) and range from 5 to 139 kHz.Taxonomically unspecific datasets account for 24% of datasets, covering a broad range from 0.1 to 24 kHz.Generally, datasets targeting multiple taxa use wider frequency ranges than those targeting single taxa.

Soundscape case studies
We analysed 150 recordings from eleven different ecosystems representing the diversity of soundscapes on Earth, spanning latitudes from 69 degrees South to 67 degrees North (Table S3, Fig. 5A).Biophony dominated with an average soundscape occupation of 30% across all ecosystems, notably: photic coral reefs (Okinawa, Japan) with snapping shrimps and grunting fish choruses (75% soundscape occupancy) 41 ; tropical lowland rainforests (Jambi, Indonesia) with buzzing insects and echoing bird and primate songs (61%).Only marine island slopes (off Sanriku, Japan) and polar outcrops (Antarctic) contained no or very little (2%) biophony, respectively.Geophony was absent in most soundscape samples, with the exception of high wind noise in polar outcrops (14%) and some wind in montane tropical forests (5%).Anthropophony occupied on average 9% of the soundscapes.Cities (Jambi, Indonesia; Montreal, Canada) exhibited the highest anthropophony (45%) with prevalent engine noise and human voices, while deep-sea mining and vessel communication caused high anthropophony in marine island slopes (32%) 42 .Silence occupancy was highest in polar outcrops (82%) and large lowland rivers (78%).
The selected soundscapes reveal greater biological activity closer to the equator, a negative relationship between biophony and anthropophony, and variable phenology of soniferous organisms over the diel cycle (Fig. 5).All Bayesian beta regression models for biophony occupancy converged.We detected a negative correlation of biophony occupancy with increasing distance from the equator (Pnegative=1) and with anthropophony occupancy (Pnegative=1).The phenology model predicted biophony occupancy values for each diel time window and realm (Fig. 5B), revealing similar phenology for the terrestrial and marine realm, and opposed freshwater and freshwater-marine realm phenology.

Discussion
The "Worldwide Soundscapes" project has -to our knowledge -assembled the first global meta-database of PAM datasets across realms.We analysed its current content to quantify sampling extent, coverage, and density across spatiotemporal and ecological scales.We annotated soundscapes from eleven ecosystems to investigate macroecological, conservation biology, and phenological trends.The database remains open for contributions 39 and can be openly accessed to source datasets and initiate collaborative studies 38 .Next, we discuss the state and potential of PAM globally (Table 1).
Our results likely represent global PAM trends, even though spatial gaps reflect the background of the project contributors. Our database still misses some national programs (e.g., Australian Acoustic Observatory 43), but otherwise our terrestrial coverage is similar to a recent systematic review 21. The gaps in North Africa and Northeastern Europe correspond with the paucity of bioacoustic datasets for these regions 44. Our database comprises 469 marine sampling locations while 991 were found in a recent systematic review 20, but most overlap (Fig. S2). Marine tropical waters that are under-represented in our database reflect gaps found in the International Quiet Ocean Experiment network coverage 45. Our marine and terrestrial database coverage is thus broadly comparable with published data. However, as the database originates from an active network of researchers, it represents the current availability of mostly as-yet unpublished data (Fig. S1). To our knowledge, no other spatially explicit review of freshwater sampling exists for comparison, and no other work quantified temporal coverage.
Geographic coverage strikingly differs between realms: marine coverage is sparse but widespread; terrestrial coverage is comparatively intensive in the Americas and Western Europe; freshwater coverage is scattered.This partly reflects funding priorities for where conservation and active management priorities lie, for instance in biodiversity hotspots and densely-populated areas.Technical limits in extreme environments also drive geographic patterns: high latitudes and elevations entail extremely cold temperatures that present challenges for operation and maintenance, some of which can be solved with robust power setups (solar panels, freeze-resistant batteries).Marine deployments are generally even more constrained due to costly and demanding underwater work, but some deployments reach the poles as water temperatures are buffered below the freezing point.Affordable underwater recorders 46 may help to intensify sampling of marine coastal areas and close gaps in freshwater coverage.By contrast, terrestrial monitoring is generally straightforward.Northern temperate areas outside of Northeastern Europe are comparatively better covered, and the tropics outside of Africa better represented.It appears that gaps in North Africa, Central Asia, and Northeastern Europe arise from differential research means and priorities between countries.Taken together, these gaps help to correct spatial biases 47 and to identify high-priority, unique research areas that should be included in global assessments.
Currently, only marine studies achieve relatively even coverage of temporal cycles. Indeed, offshore deployments, especially in the deep sea, are expensive, and limited-duration deployments are not cost-effective 48. Marine soundscapes fluctuate stochastically 49, but the ocean buffers water temperatures so that animals retain a basal activity level year-round. Although lunar phases affect marine life [50][51][52], we did not consider lunar tides, which affect some ecosystems. In the terrestrial and freshwater realms, most deployments cover the entire diel cycle, but monitoring on land focuses either on diurnal birds or on nocturnal bats, as found in the literature 21. In contrast, although seasons drive activity cycles on land too 53,54, spring- and summertime monitoring is disproportionately common and we lack a thorough understanding of seasonal dynamics. Terrestrial deployments in particular may be short for logistic reasons: in the cold, batteries struggle and access is harder; recorders are at risk of theft; and a limited number of recorders may be rotated between sites 55. Lunar phases, probably evenly sampled by chance, also influence land animals 56 and should be explicitly considered. Overall, we encourage longer-duration setups with regularly spread sampling inside temporal cycles to alleviate the higher expenses, energy consumption, storage, and traded-off spatial coverage 57. Global changes impact soundscapes in largely unpredictable ways through changing species distributions and phenology, necessitating higher and unbiased coverage across multiple time scales, including inter-annual ones, to successfully monitor ongoing changes 58.
Re-use of soundscape datasets is restricted by their taxonomic focus.Admittedly, taxonomically untargeted, long, and regular deployments in oceans, coupled to large detection ranges, concurrently sample many taxa 59 .However, many soundscape recordings sample particular frequencies, often in the human-audible range 60 , although biophony ranges from infrasound 61 to ultrasound 62 .For instance, studies of toothed whales or bats often use triggers and high-pass filters to record purely ultrasonic recordings only when signals are detected, resulting in spectrally-restricted and temporally-biased soundscape recordings.Less-studied taxa such as anurans and insects could effectively be co-sampled by adjusting ongoing deployments, so we encourage terrestrial researchers to maximise frequency ranges to enhance interdisciplinary collaboration.These collaborations can help to mutualise resources for mitigating potentially prohibitive power, storage, transportation, and postprocessing costs 63 .Emerging embedded-AI audio detectors may offer an alternative 64 , but soundscape recordings will remain essential for broader application.
In every realm, ecosystems await acoustic discovery.Except for one dataset from aerobic caves, we lack data from all subterranean realms (anthropogenic voids, ground streams, sea caves), while endolithic systems may be the only irrelevant ecosystem for biophony.Access is usually challenging or restricted for non-specialists, but subterranean biodiversity shows high spatial turnover 65 .Freshwater datasets were less rare, but several ones with unreplicated sampling could not be included.Temporary, dynamic water bodies (seasonal, episodic, and ephemeral ecosystems) are not yet well-studied 66 although they are accessible from land.Notably, sounds can pass the boundary between recording media and be captured in so-called holo-soundscapes in freshwater and shallow coastal areas 5,67,68 .Advances are imminent as the freshwater acoustic research community is growing rapidly 69,70 .In the oceans, sound propagates far and multiple biomes can be sampled at once (e.g., recorders on the seafloor sample pelagic waters) so that most ecosystems are covered.Still, coverage gaps may exist in rhodolith/Maërl beds, upwelling zones, and deepwater coastal inlets.On land, sampling is biased towards biodiverse forests, and our database coverage suggests that the most challenging terrestrial ecosystems, such as arid zones (sclerophyll hot deserts and semi-deserts, semi-desert steppes), rocky habitats (young rocky pavements, lava flows and screes) but also some vegetated temperate ecosystems (cool temperate heathlands, temperate pyric humid forests, temperate subhumid grasslands) are poorly sampled.Within the IUCN GET framework, soil soundscapes also belong to the terrestrial realm and to date no eligible datasets are in our database, despite recent studies 71,72 .
Our database highlights well-known global sampling biases 73 which could be resolved with collaboration to remove cultural and socioeconomic barriers 74 .Technological progress for more affordable equipment renders PAM more accessible in lower-income countries.However, high-and deep-sea work remains considerably more expensive, and tropical developing countries in particular often lack funding for marine programmes requiring large vessels, underwater vehicles, or cabled stations on the seafloor 48 .Our network currently consists of active, English-speaking members from 50 countries.Collaborative projects, shared sea missions, and equipment loans should promote the establishment of soundscape research communities 75 .Increased international collaboration with scientists and local stakeholders supporting citizen-science 76 in heavily underrepresented regions would improve not only data coverage, but also representation and dialogue within the field.
Collaborative soundscape research relies on interoperable data.We harmonised metadata with a bottom-up approach leading to our global inventory, but comprehensive standards for PAM do not currently exist, even though initiatives for the marine realm are ongoing 77,78 .Few affordable solutions exist for sharing large audio data volumes 40 , underlining the need for distributed soundscape recording repositories 79 .Marine oil and gas industry projects routinely upload data as part of their efforts to mitigate noise impacts on marine animals 80 , but these recordings often focus on frequencies relevant to seismic prospecting and access may be restricted 81,82 .Furthermore, recording equipment requires calibration, sound detection spaces need to be measured 26 , and data privacy must be ensured on land 83 .In parallel, species sound libraries 44,[84][85][86] grow and continue to provide invaluable acoustic and taxonomic references.International organisations such as GBIF will be key to roll-out standards (e.g., Darwin Core) in a top-down manner.For the moment, we encourage early planning of data archival.In the future, the Worldwide Soundscapes will interoperate with other databases to close remaining coverage gaps.
A unified approach of ecology with PAM is now possible.More comprehensive coverage is needed to decisively answer research, conservation, and management questions that PAM can address, so we encourage prospective contributors to archive their metadata and join our inclusive project.We show that a large portion of the PAM community is willing to collaborate across realms and form a global network.We advocate for a bolder PAM effort to inform the agenda of soundscape ecology 87 , reaching out to places where no sound has been recorded before as well as to urban settings 88 .The research community may open new avenues to study environmental effects on acoustic activity 58 , social species interactions 89 , human-wildlife relationships 41 , function and phylogeny 90 , soundscape effects on human health 91 , acoustic adaptation and niche hypotheses 92,93 , macroecological patterns across ecosystems 1 , and initiate an integrated approach to noise impacts on wildlife.
PAM is now an established method that can be applied over large spatial and temporal scales.Consistent, large-scale monitoring of the Earth's soundscapes is essential to establish baselines for historical trends 94 and quantify rapid changes in biodiversity and natural systems.International funding schemes should integrate PAM into biodiversity monitoring platforms such as GBIF 95 and GEO BON 96,97 .Soundscapes are just starting to be used in legislation as an ecosystem feature to be preserved 98 .Occupancy maps for soniferous wildlife obtained from PAM would underpin the evaluation of progress towards threat reduction and ecosystem service provision of the Kunming-Montreal Global Biodiversity Framework 99 .By building collaborations around the knowledge frontiers identified here, we can aim to comprehensively describe and understand the acoustic makeup of the planet.

Ethics & Inclusion statement
The present study has involved people who carried out PAM-based studies as primary contributors.They could be corresponding authors for published studies, referred contacts for unpublished studies, or principal investigators, and were asked to identify further primary and secondary contributors of their study.Primary contributors could become co-authors, and secondary contributors are acknowledged here.Some primary contributors were invited as co-leads to be responsible for particular realms or biomes and are listed in the first-tier authors list.Primary contributors who additionally provided soundscape recordings are also listed in the first-tier authors list.All primary contributors were asked to identify further contacts to reach a comprehensive coverage for the database.Given the nature of the study (using passive acoustic monitoring methods), the research did not suffer from restrictions or prohibitions in the setting of the researchers.Necessary permits from landowners or environmental protection agencies, precautions to limit biological contaminations (e.g., biofilm on marine recorders) or for animal welfare, and local ethics committee reviews were the responsibility of the respective primary contributors.None of the PAM activities resulted in personal risk to participants; neither did they involve health, safety, security or other risk to researchers.Local and regional research relevant to our study were taken into account in citations throughout the main text.

Database construction
The database construction started in August 2021 within the frame of the Worldwide Soundscapes project 38 using collaborative, peer-driven metadata collation.It represents the current state of knowledge about publicly accessible meta-datasets for PAM within our network.We started contacting contributors that we personally know, and co-lead authors helped to contact potential contributors in the respective ecosystem types of their expertise (i.e.terrestrial, urban, freshwater, and marine).We conducted focal publication searches to actively plug coverage gaps by inviting the respective corresponding authors.We posted the call for contributors on specialised ecoacoustics platforms and social media, and will keep the project open for any contributor owning suitable soundscape recordings beyond the present publication.We communicated mostly in English but also in Spanish, French, Portuguese, German, Russian, and Chinese to gather metadata.We ended up including metadata from larger groups such as the Silent Cities project, Ocean Networks Canada, the Australian Acoustic Observatory, and Parks Canada.
Primary contributors provided the metadata that formed the basis for the database. Their willingness to be informed through a mailing list, responsibility for their metadata, approval for sharing the metadata publicly, and willingness to participate in this study as co-authors were explicitly stated in an online form-based collaboration agreement. Primary contributors who became co-authors all fulfilled either data curation (e.g., as providers of structured metadata) or project administration (e.g., as principal investigators designing the corresponding study) roles and additionally a manuscript revision role. Primary contributors cross-validated their metadata using maps and graphical timelines visualising their input. As a result, apart from basic coherence checks conducted by the co-leads and research assistants, the responsibility for the database content is borne by the contributors. The database information page is integrated in our online collaborative ecoacoustics platform ecoSound-web 100 that also hosts the soundscape recordings of the case study (https://ecosound-web.de/ecosound_web/collection/show/49).
Our network consists of people reachable through email contact. Out of 588 contacted contributors potentially involved in PAM, 298 (49%) provided some metadata. More datasets exist, but 24% of those contacted have not yet responded. This may be due to limitations in sharing research data, to contacts being newcomers to PAM, or to outdated contact details. Older soundscape datasets may be stored on analog media that are inaccessible over the internet, but these are more often attended recordings 53 and thus do not fulfil our inclusion criteria (see below). The missing share of datasets could be estimated with a systematic literature search, but we posit that our dataset search was exhaustive, and we were able to fill gaps more effectively due to focused calls for missing datasets in our large group of contributors.
Potential soundscape recording datasets were required to meet four criteria: 1) stationary: mobile recorders would have complex, constantly changing spatial assignments, thus we excluded recordings from cars, transect walks, or towed deployments; 2) passive: obtained from automated, unattended recorders, without human presence near the acoustic sensor; 3) ambient: with no particular recording direction or temporal selection, as obtained from omnidirectional microphones and with non-triggered recordings; 4) replicated: from spatially or temporally replicated study designs (Fig. 1). Datasets with spatial and temporal replication are necessary to disentangle spatial and temporal effects from other soundscape determinants. We considered datasets to have spatial replicates when several sites were sampled simultaneously, and temporal replicates when one site was sampled over multiple days at the same time of day. We chose sampling sites and days as the most elemental units for defining replication in our unified analysis. However, we acknowledge that replicates are defined differently in each study: for instance, sampling sites may only be spatial replicates if they belong to the same category (i.e., habitat, management type). Temporal replicates could also be defined at the scale of other solar and lunar cycles (e.g., multiple full moons). Taken together, our requirements homogenise the dataset to enable statistical analyses across datasets and future collaborative syntheses.
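The two replication definitions above can be illustrated with a minimal Python sketch. It is not the project's validation code: the function names, the (site, start, end) tuple layout, and the choice to treat "same time of day" as "same clock hour" are assumptions made for illustration only.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def has_spatial_replicates(recordings):
    """True if at least two different sites were recorded at overlapping times.

    recordings: list of (site_id, start, end) tuples with datetime bounds
    (hypothetical layout, not the database schema).
    """
    ordered = sorted(recordings, key=lambda r: r[1])
    for i, (site_a, _start_a, end_a) in enumerate(ordered):
        for site_b, start_b, _end_b in ordered[i + 1:]:
            if start_b >= end_a:
                break  # sorted by start: no later recording can overlap this one
            if site_b != site_a:
                return True
    return False


def has_temporal_replicates(recordings):
    """True if one site was recorded in the same clock hour on two or more dates.

    Binning the time of day by clock hour is an assumption of this sketch.
    """
    dates_seen = defaultdict(set)  # (site, hour of day) -> calendar dates
    for site, start, end in recordings:
        t = start.replace(minute=0, second=0, microsecond=0)
        while t < end:
            dates = dates_seen[(site, t.hour)]
            dates.add(t.date())
            if len(dates) >= 2:
                return True
            t += timedelta(hours=1)
    return False
```

Under these assumptions, a dataset would meet criterion 4 if either function returns True for its list of recordings.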

Time and space
Soundscapes are fundamentally determined by their time and location.Wildlife and human activities are broadly determined by solar and lunar cycles 32,54,56,101 , and geographical positions on the planet relative to the poles or equator, or the land and water surface.For all datasets, we determined their spatial coverage in terms of number of sites, and sampling density as the number of sites within the realms' areal extent (as determined by IUCN GET maps).Although spatial extent could be defined as the area bounded by the sites of a dataset, challenges with calculating extents on the world sphere (especially for datasets with very large extents spanning the globe), and unknown sampling areas covered in each site led us to ignore spatial extents.Until sound detection spaces can be accurately measured, our measure of spatial sampling density using points per area is thus provisional.Indeed, acoustic sampling areas or volumes are extremely rarely measured in terrestrial sites and seldom measured but sometimes simulated in marine environments 102 .They vary with sound source intensity, frequency, directivity; recording medium temperature, currents, pressure, humidity (for air); habitat structure such as topography and vegetation structure; ambient sound level.Generally, detection ranges are greater in underwater environments due to the higher density of the recording medium.For all datasets, we also determined their temporal extent (i.e., the time range from the start of the first to the end of the last recording), coverage (i.e., time sampled per site), and density (i.e., proportion of recorded time inside the temporal extent).
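As a worked illustration of the three temporal measures defined above (extent, coverage, and density), the following Python sketch computes them for a single site. The function name and input layout are hypothetical, and merging overlapping recordings before summing is an assumption of this sketch rather than a documented step of the analysis.

```python
from datetime import datetime, timedelta


def temporal_metrics(recordings):
    """Return (extent, coverage, density) for one site.

    recordings: non-empty list of (start, end) datetime pairs.
    extent   = time from the start of the first to the end of the last recording
    coverage = total recorded time
    density  = coverage / extent (proportion of the extent that was recorded)
    """
    # Merge overlapping recordings so that shared time is counted only once.
    merged = []
    for start, end in sorted(recordings):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])

    extent = merged[-1][1] - merged[0][0]
    coverage = sum((end - start for start, end in merged), timedelta())
    density = coverage / extent if extent else 0.0
    return extent, coverage, density


# Two 12-hour recordings, two days apart: 24 h recorded within a 60 h extent.
example = [
    (datetime(2021, 5, 1, 0), datetime(2021, 5, 1, 12)),
    (datetime(2021, 5, 3, 0), datetime(2021, 5, 3, 12)),
]
extent, coverage, density = temporal_metrics(example)  # density is 0.4
```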
We quantified the latitudinal and topographical coverage across realms by collecting latitude, longitude, and topography (above and below sea level) data for each site. Contributors could provide self-measured topography values for sites on land, and gaps were filled using General Bathymetric Chart of the Oceans surface elevation data. For underwater sites, depth values (below sea level for marine sites, below the surface for freshwater sites) were provided by the contributors. We assigned sampling sites to administrative areas (GADM divisions for freshwater and land, IHO for sea areas) and extracted their WDPA category. Sites' climates were geographically classified into tropical (between -23.5° and 23.5° latitude), polar (below -66.5° or above 66.5° latitude), and temperate (between polar and tropical regions).
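The latitudinal climate classification can be written as a short Python function. This is a sketch only: the function name is hypothetical, and the treatment of sites lying exactly on a boundary latitude (tropical at 23.5°, temperate at 66.5°) is an assumption, as the text does not specify it.

```python
def climate_zone(latitude_deg):
    """Classify a site as tropical, temperate, or polar from its latitude.

    Assumption: boundary latitudes are inclusive for the tropics (|lat| = 23.5)
    and exclusive for the polar zone (|lat| = 66.5 counts as temperate).
    """
    if not -90.0 <= latitude_deg <= 90.0:
        raise ValueError("latitude must lie between -90 and 90 degrees")
    absolute_latitude = abs(latitude_deg)
    if absolute_latitude <= 23.5:
        return "tropical"
    if absolute_latitude > 66.5:
        return "polar"
    return "temperate"


print([climate_zone(lat) for lat in (0.0, 45.0, -70.0)])
```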
Contributors were asked to specify the timing of their deployments using start and end dates and times, and operation modes (continuous, scheduled, periodical). For scheduled operation, schedules (daily start and end times or durations) were required, and duty cycles were required for periodical operation. Duty cycles could additionally be indicated for scheduled deployments. Deployments were assigned to single sampling sites. Subsequently added sites and sampling repetitions were considered as additional deployments within datasets. A temporal framework was used to quantify sampling coverage in three solar and lunar cycles. Temporal information was inferred from the timing of sound recordings relative to time events that structure life and meteorological events on Earth (Fig. 3). Seasonal cycles were inferred only for temperate sites, by splitting the yearly cycle into four meteorological seasons (winter: December-February, spring: March-May, summer: June-August, fall: September-November). The daily cycle was split into four windows delimiting dawn (from astronomical dawn start at -18° solar altitude until 18° solar altitude), day, dusk (from 18° solar altitude to astronomical dusk end at -18° solar altitude), and night. The lunar illumination cycle was split into two time windows centred on the full and new moon phases. It follows that the extrema and ecotones in the temporal cycles define the relevant sampling time windows, noting that in temperate zones, the equinoxes roughly correspond to thermal ecotones. Seasonal cycles in tropical and polar regions arising from precipitation patterns were not considered in this analysis.
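The seasonal and diel time windows can be sketched in Python as follows. The function names are hypothetical. The season mapping follows the months exactly as listed above, without any hemisphere adjustment, since none is described. The diel function assumes that the solar altitude and the direction of the sun's movement (rising or setting) have already been obtained from an ephemeris tool, which is not shown here.

```python
def meteorological_season(month):
    """Map a calendar month (1-12) to a meteorological season as listed in the text."""
    if not 1 <= month <= 12:
        raise ValueError("month must be between 1 and 12")
    if month in (12, 1, 2):
        return "winter"
    if month in (3, 4, 5):
        return "spring"
    if month in (6, 7, 8):
        return "summer"
    return "fall"


def diel_window(solar_altitude_deg, sun_rising):
    """Assign a moment to night, dawn, day, or dusk.

    Thresholds of -18 and 18 degrees follow the text; sun_rising is an
    assumed, externally computed flag (True before solar noon).
    """
    if solar_altitude_deg < -18.0:
        return "night"
    if solar_altitude_deg > 18.0:
        return "day"
    return "dawn" if sun_rising else "dusk"


print(meteorological_season(7), diel_window(0.0, True))
```

Sites where the sun never rises above 18° or never sinks below -18° on a given date would lack a day or night window under this scheme; how such cases were handled is not stated in the text.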

Ecological characterisation
We assigned individual sampling locations to ecosystem types following the IUCN GET (https://global-ecosystems.org). Correspondingly, sites were assigned hierarchically to realms (core or transitional ones), biomes, and functional groups. We calculated the major and minor occurrence areas of all functional groups based on ecosystem maps 1,103,104. We used these data to quantify spatio-temporal extent, coverage, and sampling density within realms and biomes, and to identify sampling gaps. To identify which taxa acoustic deployments were designed for, deployments were linked to IUCN taxa (class, order, family, or genus) when applicable; some datasets were collected without taxonomic focus. We used these data to depict the acoustic frequency ranges depending on the target taxon.

Acoustic frequency ranges
Sound recordings store a representation of the original soundscape, so we needed to determine the spectral scope of the recording datasets. Microphones have variable frequency responses, and for digital recorders the spectral scope of the recording is limited by the sampling frequency. Contributors were asked to provide the audio parameters of their deployments (sampling rate, high-pass filter) as well as auxiliary metadata about the microphone and recorder models and brands. The frequency range is additionally determined by microphone roll-off frequencies at the low end and, in general, by their frequency response, but we could not gather these detailed, often unpublished data for the high number of equipment types used here.
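As a minimal illustration of how the two audio parameters bound the recorded band, the Python sketch below derives a frequency range from the sampling rate (upper limit at the Nyquist frequency, i.e., half the sampling rate) and the high-pass filter cutoff (lower limit). The function name is hypothetical, and microphone frequency response is deliberately ignored, as in the text.

```python
def recorded_frequency_range(sampling_rate_hz, high_pass_hz=0.0):
    """Return (lowest, highest) recordable frequency in Hz.

    The upper limit is the Nyquist frequency; the lower limit is the
    high-pass filter cutoff (0 Hz when no filter is declared).
    Microphone roll-off and frequency response are not modelled.
    """
    nyquist_hz = sampling_rate_hz / 2.0
    if high_pass_hz < 0.0 or high_pass_hz >= nyquist_hz:
        raise ValueError("high-pass cutoff must lie between 0 and the Nyquist frequency")
    return (float(high_pass_hz), nyquist_hz)


print(recorded_frequency_range(44100))
print(recorded_frequency_range(384000, high_pass_hz=8000))
```

A sampling rate of 44.1 kHz thus yields an upper limit of 22.05 kHz.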

Soundscape case studies
To illustrate how the database can be used for macroecology, conservation biology, and phenology analyses, we selected 121 recordings belonging to 10 functional groups across a variety of topographical, latitudinal, and anthropisation conditions, all fundamental gradients of assembly filters in both terrestrial and marine realms 1. We aimed to select datasets containing four spatial replicates within the same functional group, with 10-minute recordings spanning all four diel time windows from the same date during the biologically active season, but the available data did not always allow this (Table S2). We extracted recordings starting at sunrise, solar noon, sunset, and solar midnight. Recordings covered the audible frequency range with sampling frequencies of at least 44.1 kHz in single or dual channel configuration. We acknowledge that this targeted, non-systematic selection is not statistically representative of global patterns but rather illustrative of the database potential. Each recording's spectrogram (i.e., visualisation of sound intensity along time and frequency axes) was annotated with the three fundamental soundscape components: biophony, anthropophony, and geophony. Soundscape recordings were uploaded to ecoSound-web (https://ecosound-web.de/ecosound_web/collection/show/49) 40 for annotation: KD listened to them and visually inspected the spectrograms (Fast Fourier Transform window size of 1 024) at a visual density of 1 116 pixels per 10 minutes to create annotations. Annotations were rectangular tags on the spectrogram, encompassing only the annotated sound, with defined coordinates in the time and frequency dimensions. Annotations of different soundscape components could overlap if they were simultaneously visible or audible. Annotations of the same soundscape component were adjacent and non-overlapping to avoid double-counting. Soundscape components occurring above 22.05 kHz were excluded from the analysis as we focused on audible frequencies only, which were the frequencies targeted by most recordings. Sounds caused by microphone or recorder self-noise were excluded. All annotations were reviewed by the recording creators using the peer-review mode on ecoSound-web, and only accepted ones were used in the analysis. The annotations were exported and acoustic space occupancy for each soundscape component in each recording was calculated as the proportion of the spectro-temporal space 60 occupied (i.e., duration multiplied by frequency range, divided by total area of spectrogram; range: 0-1).
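The occupancy calculation can be sketched in Python as follows. This is an illustration under stated assumptions rather than the analysis code: the function name and the tuple layout of annotations are hypothetical, annotations of one component are taken to be non-overlapping (as described above) so that their areas can simply be summed, and rectangles are clipped to the recording duration and to an assumed analysis band of 0 to 22.05 kHz.

```python
def acoustic_space_occupancy(annotations, duration_s,
                             min_freq_hz=0.0, max_freq_hz=22050.0):
    """Proportion of the spectro-temporal space covered by one soundscape component.

    annotations: iterable of (t_start_s, t_end_s, f_low_hz, f_high_hz) rectangles,
    assumed non-overlapping within the component.
    """
    total_area = duration_s * (max_freq_hz - min_freq_hz)
    annotated_area = 0.0
    for t_start, t_end, f_low, f_high in annotations:
        # Clip each rectangle to the spectrogram bounds before measuring it.
        dt = max(0.0, min(t_end, duration_s) - max(t_start, 0.0))
        df = max(0.0, min(f_high, max_freq_hz) - max(f_low, min_freq_hz))
        annotated_area += dt * df
    return annotated_area / total_area


# Two biophony tags in a 10-minute (600 s) recording.
biophony = [(0, 300, 0, 11025), (300, 600, 0, 22050)]
print(acoustic_space_occupancy(biophony, duration_s=600))
```

In this toy example, one tag covers half the band for the first five minutes and the other covers the full band for the last five, giving an occupancy of 0.75.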
All Bayesian beta regression models for biophony occupancy (4 chains of 1000 sampling iterations with 1000 warmup iterations) converged, as determined by trace plots and R hat values equaling 1. The models using latitude and anthropophony as predictors included the functional group as a random intercept. The phenology model used diel time windows and realms, as well as their interactions, as predictors.

Table 1: An agenda towards comprehensive coverage and a global PAM network

Overview of Worldwide Soundscapes meta-database. A) Framework used to define spatial and temporal replicates. B) Number of datasets in each core realm for the different replication levels. C) Spatial extent and coverage, based on sampling sites, split by core realm. Due to their higher representation and to avoid overlapping site clusters, terrestrial site densities were plotted on a 3 degree resolution raster (Interactive map: https://ecosound-web.de/ecosound_web/collection/index/106). D) Temporal extent and coverage, based on recorded days, split by core realm. An enlarged version of panel D without terrestrial sites can be found in Fig. S3. For panels C and D, sites from transitional realms were assigned to their parent core realm.

Spatial distribution of sampling sites. A) Latitudinal and topographic distribution of sampling sites across core realms. Due to their higher representation and to avoid overlapping site clusters, terrestrial sites are shown with transparency. The minimum (deepest seafloor) and maximum (highest elevation of land or sea level) topographical limits (dark grey lines) are shown against latitude, based on General Bathymetric Chart of the Oceans data 105. Minimum topography above sea level and maximum topography under sea level were set to zero, as the sea level represents the minimum and maximum in these cases. B) Number of sampling sites within different administrative regions (GADM level 0 and IHO sea areas), split by core realm, across WDPA categories (Ia: strict nature reserve; Ib: wilderness area; II: national park). The areas that do not belong to the top 10 in terms of datasets have been aggregated under "others". One subterranean site in Brazil is not shown. Sites from transitional realms were assigned to their parent core realm.

Temporal sampling distribution. A) Temporal sampling coverage across solar and lunar cycles for the three core realms. Cycles consist of solar (daily and seasonal) and lunar time cycles (lunar phase), subdivided into time windows. Seasons were only analysed in temperate regions. Sampling coverage is represented with sampling days in number labels. B) Mean number of sampling days per site within different administrative regions (GADM level 0, IHO areas, WDPA categories), split by core realm. WDPA categories (Ia: strict nature reserve; Ib: wilderness area; II: national park) are shown separately. One subterranean site in Brazil is not shown. Numbers to the right of bars indicate the number of sites the means were calculated from. Sites from transitional realms were assigned to their parent core realm.

Sampling distribution across ecological scales. A) Spatial extent of realms, based on major areas according to IUCN GET (coloured disk area proportional to area); spatial sampling density (in sites per M·km2) and coverage (in number of sites); temporal extent (mean range between first and last recording day), coverage (days sampled per site), and density (proportion of days sampled per extent). B) Frequency ranges of datasets across realms (using Nyquist frequency, i.e., actual recorded frequencies) for the main studied taxa. The dots at the ends of coloured lines represent means of the lowest and highest recorded frequencies, and the ranges between the minimum and maximum of these values are indicated with black error bars. The limits of human hearing are indicated with dashed lines. The number of datasets is indicated above the lines; datasets can be counted several times if they contain deployments targeting different taxa.

Figure 5: Soundscape components analysis. A) Mean acoustic space occupancy of soundscape components (biophony, geophony, anthropophony) as calculated from annotations for the selected ecosystems, measured in proportion of spectro-temporal space used, across eleven ecosystems over 150 recordings covering the time windows of the diel cycle. Annotated recordings are accessible at https://ecosound-web.de/ecosound_web/collection/show/49. Sample spectrograms are shown in the background. B) Soundscape component occupancy data illustrate three research questions linked to macroecology (i.e., relationship of biophony with distance from the equator), conservation biology trade-offs (i.e., relationship of biophony with anthropophony), and phenological trends (i.e., mean biophony along diel time windows for each realm). Gray ribbons indicate 95% credible intervals and numbers indicate probabilities of positive or negative relationships.