RT Journal Article
SR Electronic
T1 Enhancing georeferenced biodiversity inventories: automated information extraction from literature records reveal the gaps
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 2020.01.16.908962
DO 10.1101/2020.01.16.908962
A1 Kopperud, Bjørn Tore
A1 Lidgard, Scott
A1 Liow, Lee Hsiang
YR 2020
UL http://biorxiv.org/content/early/2020/01/17/2020.01.16.908962.abstract
AB We use natural language processing (NLP) to retrieve location data for cheilostome bryozoan species (text-mined occurrences [TMO]) in an automated procedure. We compare these results with data from the Ocean Biogeographic Information System (OBIS). Using OBIS and TMO data separately and in combination, we present latitudinal species richness curves using standard estimators (Chao2 and the Jackknife) and range-through approaches. Our combined OBIS and TMO species richness curves quantitatively document a bimodal global latitudinal diversity gradient for cheilostomes for the first time, with peaks in the temperate zones. 79% of the georeferenced species we retrieved from TMO (N = 1780) and OBIS (N = 2453) are non-overlapping and underestimate known species richness, even in combination. Despite clear indications that global location data compiled for cheilostomes should be improved with concerted effort, our study supports the view that latitudinal species richness patterns deviate from the canonical LDG. Moreover, combining online biodiversity databases with automated information retrieval from the published literature is a promising avenue for expanding taxon-location datasets.