Abstract Detail

Biodiversity Informatics & Herbarium Digitization


Data quality initiatives at GBIF.

GBIF, the Global Biodiversity Information Facility, is an international network and research infrastructure funded by the world's governments and aimed at providing anyone, anywhere, open access to data about all types of life on Earth. GBIF mediates over 310 million plant and fungal occurrence records, including over 85 million preserved specimens from the world's herbaria. Over 1600 institutions have published more than 53,000 datasets with GBIF, and we have tracked over 4500 peer-reviewed papers using your data. Within this enormous amount of data, quality varies among datasets. Some of that variability is intrinsic to the data, such as the lack of latitude/longitude coordinates on historical collections, but other data quality issues are more easily improved once they have been identified. GBIF works with many collection management system partners to share data, and quality control mechanisms are available when data are registered. For each dataset, GBIF maintains a dataset page that serves as a resource for data improvement. During this talk I will demonstrate the improvements and plans for how data providers can access information to improve their data. The process of aggregating thousands of datasets can lead to unintended problems. Data quality can also be assessed and improved when data are downloaded for use. GBIF has recently added multiple data filters that allow users to easily filter out data errors such as implausible coordinates, centroids, and botanical garden records. We are in the process of adding more filters so that in the future the post-download data cleaning step is minimized or eliminated altogether. Attribution for your work is critical not only for your career but also for the future funding of your herbarium. GBIF tracks the use of downloaded data in peer-reviewed publications globally (#CiteTheDOI). We link use back to the individual datasets so that a herbarium curator can demonstrate data use.
We support the independent efforts of @bloodhoundtrack, which takes GBIF-mediated data on a curator's collections and identifications and links them to their ORCID identifier. This allows quantification of how many specimens an individual has collected and identified, and how many papers have used those data.

Related Links:
Miller Lab Website
The Land Institute

1 - Universitetsparken 15, Copenhagen, 2100, Denmark

portal
data aggregation.

Presentation Type: Oral Paper
Session: BIHD2, Biodiversity Informatics & Herbarium Digitization II
Location: Virtual/Virtual
Date: Thursday, July 30th, 2020
Time: 12:45 PM
Number: BIHD2002
Abstract ID:739
Candidate for Awards:None

Copyright © 2000-2020, Botanical Society of America. All rights reserved