Biodiversity Informatics & Herbarium Digitization

Webb, Campbell O. [1], Ickert-Bond, Stefanie [2].

Cyborg matching of taxonomic names, using nomenclatural logic.

Taxonomic names are the linkage keys that permit biological datasets to be reconciled. However, name reconciliation is difficult because the character string for a single name frequently varies among sources. Computers can assist in recognizing cognates between two sources, using some form of fuzzy matching, but subsequent human decision-making is usually needed to apply the complex rules of nomenclature and to weigh the likelihoods of character substitutions. In our work on a new Flora of Alaska, we are generating a checklist of names and synonyms from a variety of names resources (a 30-year database of names developed at the University of Alaska Fairbanks herbarium, a checklist developed by the Alaska Center for Conservation Science, the Panarctic Flora checklist, Flora of North America, etc.). Adding each names resource requires reconciling its names to our own growing checklist, by recognizing variation in the character strings of the same name across input lists. Several powerful fuzzy-matching tools have been created to facilitate the matching of a user’s list of names to the names in various online names resources (IPNI, Tropicos, NCBI Taxonomy, etc.). These tools include the Taxonomic Name Resolution Service and the Global Names Resolver, but none of them is suitable for our purpose. We therefore built a new taxonomic names matching application: ‘matchnames’ (https://github.com/camwebb/taxon-tools). The app i) parses the elements of a raw name string, ii) applies a set of (botanical) taxonomic author string decomposition rules (e.g., omitted basionym, omitted ‘ex’ or ‘in’ author), iii) seeks exact matches on acceptable name string variants, iv) performs a fuzzy match on acceptable name string variants, and v) offers fuzzy-match choices to a human operator, who with minimal keystrokes can accept or reject candidates for the same name. In this way the app and the user work rapidly, behaving as an optimized ‘cyborg’ system.
In our usage to date, between one in 20 and one in 100 names in an input list require human assistance. This proportion might be further reduced by analyzing the human decisions and adding them to the app as additional rules.
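
The five steps described in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the matchnames implementation: the function names (parse_name, author_variants, match_name, reconcile), the 0.9 similarity threshold, the use of Python's difflib for fuzzy matching, and the simplified treatment of basionym, 'ex' and 'in' authors are all hypothetical choices made for this example. The human operator is represented by a callback rather than keystrokes.

```python
import re
from difflib import SequenceMatcher


def parse_name(raw):
    """Step i: split 'Genus epithet Authorship' into three parts.
    Assumes a plain binomial; hybrids and infraspecific ranks are ignored."""
    parts = raw.strip().split(None, 2)
    parts += [""] * (3 - len(parts))
    return parts[0], parts[1], parts[2]


def author_variants(author):
    """Step ii: acceptable variants of an author string (illustrative rules).
    '(L.) Mill.' may appear as 'Mill.' (omitted basionym author);
    'A ex B' may appear as 'B'; 'A in B' may appear as 'A'."""
    m = re.match(r"^(\([^)]*\))?\s*(.*)$", author)
    basionym, combining = m.group(1) or "", m.group(2)
    combs = {combining}
    if " ex " in combining:
        combs.add(combining.split(" ex ", 1)[1])
    if " in " in combining:
        combs.add(combining.split(" in ", 1)[0])
    variants = set()
    for c in combs:
        variants.add(c)
        if basionym:
            variants.add(f"{basionym} {c}".strip())
    return variants


def name_variants(raw):
    """All acceptable full-string variants of one raw name."""
    genus, epithet, author = parse_name(raw)
    return {f"{genus} {epithet} {a}".strip() for a in author_variants(author)}


def match_name(query, checklist, threshold=0.9):
    """Steps iii and iv: return ('exact', [name]), ('fuzzy', [candidates])
    or ('none', []). The threshold is an arbitrary value for this sketch."""
    index = {}
    for name in checklist:
        for variant in name_variants(name):
            index.setdefault(variant, name)
    keys = sorted(name_variants(query))
    for key in keys:  # step iii: exact match on any acceptable variant
        if key in index:
            return "exact", [index[key]]
    scores = {}
    for key in keys:  # step iv: fuzzy match on the same variants
        for variant, name in index.items():
            ratio = SequenceMatcher(None, key, variant).ratio()
            if ratio >= threshold:
                scores[name] = max(ratio, scores.get(name, 0.0))
    candidates = sorted(scores, key=lambda n: (-scores[n], n))
    return ("fuzzy", candidates) if candidates else ("none", [])


def reconcile(queries, checklist, confirm):
    """Step v: exact matches are accepted automatically; fuzzy candidates
    are shown to the operator via confirm(query, candidate) -> bool."""
    result = {}
    for query in queries:
        kind, candidates = match_name(query, checklist)
        if kind == "exact":
            result[query] = candidates[0]
        else:
            result[query] = next(
                (c for c in candidates if confirm(query, c)), None
            )
    return result
```

In this sketch, a name whose author string differs only by an omitted basionym author (for example 'Picea glauca Voss' against 'Picea glauca (Moench) Voss') is resolved without human input, while a misspelled epithet falls through to the fuzzy step and is passed to the operator callback.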


Related Links:
New taxonomic names matching application: ‘matchnames’
A new Flora of Alaska
Cam Webb
Frontier Botany: Ickert-Bond Lab- systematics meets ecology, paleontology, and genomics


1 - University Of Alaska Fairbanks, Herbarium (ALA) And Dept. Of Biology And Wildlife, University Of Alaska Fairbanks, 1962 Yukon Dr., Fairbanks, AK, 99775, United States
2 - University Of Alaska Fairbanks, Herbarium (ALA) And Dept. Of Biology And Wildlife, University Of Alaska Fairbanks, 1962 Yukon Dr., Fairbanks, AK, 99775, United States

Keywords:
nomenclatural logic
Biodiversity Informatics
Flora of Alaska
nrITS.

Presentation Type: Poster
Session: P, Biodiversity Informatics & Herbarium Digitization Posters
Location: Virtual/Virtual
Date: Monday, July 27th, 2020
Time: 5:00 PM
Number: PBI001
Abstract ID: 253
Candidate for Awards: None


Copyright © 2000-2020, Botanical Society of America. All rights reserved