Abstract Detail

Paleobotany

Spagnuolo, Edward [1], Wilf, Peter [1].

Decoding leaf characters that drive family level identification through computer vision.

Computer vision holds massive potential for improving fossil leaf identification. The extensive machine-learning literature has focused on species-level identification of extant leaves, largely using black-box methods. However, paleobotany will require identification at the family level because nearly all angiosperm fossil leaves are extinct species from extant families, most of which contain significant morphological variation. Recent computer vision work successfully generalized diagnostic information at the family level using a large image library of cleared leaves (Wilf, Serre et al. 2016, PNAS), while providing extensive visual feedback in the form of heat maps, our focus here. These heat maps directly illustrate diagnostic regions on leaves that can inform the development of novel traditional characters. We present the first attempt to “translate” information from heat maps into a form useful to human botanists. We analyzed published heat maps from Wilf et al. (2016) that indicate hotspot regions of highest importance for correct family-level identification. Our analysis includes a total of 2193 leaves of Rosaceae, Sapindaceae, Salicaceae, Fagaceae, and Fabaceae, all containing abundant leaf-fossil records, and Rubiaceae and Ericaceae, which do not. We developed a scoring system for the locations of the heat-map hotspots based on standard leaf-architectural characters that define venation, tooth morphology, location on leaf, and other characters.
Our procedure identified several features that appear to be novel and potentially informative for family-level fossil leaf identification, involving, for example, the apical margin for Rubiaceae; the lower-order venation and midsection margin for Fagaceae; the apical mucros and high-order venation in Fabaceae; the secondary vein ramification patterns of Salicaceae; lower-order venation and lobe margins in Sapindaceae; the basal margins and tooth apices (for toothed species) in Ericaceae; along with secondary veins and tooth apices for Rosaceae. Multivariate principal coordinate ordinations (PCoA) and cluster analyses of family mean values show grouping of the largely untoothed families, Fabaceae and Rubiaceae, with no grouping signal for orders. Salicaceae tends to group with untoothed families, whereas Fagaceae and Sapindaceae cluster together, both families having similar, roughly equal percentages of toothed and untoothed leaves. Rosaceae is often an outlier, most likely driven by strong heat-map signals from tooth morphology, while Ericaceae plots as an intermediary between toothed and untoothed families. These observations demonstrate that output visualizations (heat maps) are an important auxiliary benefit of machine-learning experiments with significant potential to help paleobotanists develop novel, human-friendly characters for fossil leaf classification.
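
The ordination step described above can be sketched roughly as follows. This is a minimal illustration, not the authors' actual workflow: the abstract does not state the distance measure or the software used, so Euclidean distance, the five-column score matrix, and the names `pcoa`, `euclidean_distances`, and `family_means` are all assumptions, and the numbers are random placeholders rather than real hotspot scores.

```python
import numpy as np


def euclidean_distances(x):
    """Pairwise Euclidean distances between the rows of x."""
    diff = x[:, None, :] - x[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def pcoa(dist, n_axes=2):
    """Classical principal coordinates analysis of a square distance matrix."""
    d = np.asarray(dist, dtype=float)
    n = d.shape[0]
    # Gower double-centering of the squared distances.
    centering = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * centering @ (d ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(b)
    # eigh returns ascending eigenvalues; reorder to descending.
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Scale the leading eigenvectors; clip tiny negative eigenvalues to zero.
    scale = np.sqrt(np.clip(eigvals[:n_axes], 0.0, None))
    return eigvecs[:, :n_axes] * scale, eigvals


families = ["Rosaceae", "Sapindaceae", "Salicaceae", "Fagaceae",
            "Fabaceae", "Rubiaceae", "Ericaceae"]

# PLACEHOLDER DATA (assumption): one row per family, one column per
# hypothetical hotspot-location character; values are random, not real scores.
rng = np.random.default_rng(0)
family_means = rng.random((len(families), 5))

dist = euclidean_distances(family_means)
coords, eigvals = pcoa(dist, n_axes=2)

for name, (axis1, axis2) in zip(families, coords):
    print(f"{name:12s} {axis1: .3f} {axis2: .3f}")
```

With real family mean scores in place of the placeholder matrix, families that plot close together on the first two axes would be the candidates for the groupings discussed above; a different distance measure would change the ordination.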

1 - Pennsylvania State University, Dept. of Geosciences, University Park, Pennsylvania, 16802, United States

Keywords:
Leaf Architecture
Machine Vision
Fossil Leaves
Fagaceae
Rubiaceae
Fabaceae
Salicaceae
Sapindaceae
Ericaceae
Rosaceae
Paleobotany
Fossil Identification.

Presentation Type: Oral Paper
Session: PAL2, Cookson Award Session II
Location: Virtual/Virtual
Date: Monday, July 27th, 2020
Time: 1:00 PM
Number: PAL2002
Abstract ID: 310
Candidate for Awards: Isabel Cookson Award, Maynard F. Moseley Award