Citizen Science on GBIF - 2019

Citizen science datasets on GBIF, plotted alongside all other GBIF datasets (gray) with more than 100K occurrences. Many citizen science datasets hold millions of occurrences (eBird and the Swedish Artportalen, for example), and the top 3 datasets on GBIF are all citizen science datasets. In terms of number of unique species, however, only iNaturalist competes with large museum datasets like Smithsonian NMNH. Because of very large datasets like eBird and Artportalen, citizen science makes up a large percentage of the total occurrence records on GBIF.

Exploring es50 for GBIF

It has been suggested that GBIF could make es50 maps similar to what organizations like OBIS are already doing. I decided to make one for land animals (graph above). es50 (Hurlbert index) is the statistically expected number of unique species in a random sample of 50 occurrence records, and is an indicator of biodiversity richness. The score can be computed analytically, without any random sampling; the mean over infinitely many random samples converges to the same value.
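As a rough illustration of the analytic calculation, here is a minimal Python sketch of Hurlbert's formula: for each species, one minus the probability that it is absent from a random sample of n records, summed over all species. The function name and the input format (a list of record counts per species for one map cell) are assumptions made for this example and are not taken from the linked code. Returning None for cells with fewer than 50 records is also just one possible convention.

```python
from math import comb


def expected_species(counts, n=50):
    """Hurlbert's expected number of species in a random sample of n records.

    counts: number of occurrence records per species in one cell
            (hypothetical input format for this sketch).
    Returns None when the cell holds fewer than n records.
    """
    counts = [c for c in counts if c > 0]
    total = sum(counts)
    if total < n:
        return None
    all_samples = comb(total, n)
    # Each term is 1 - P(species is missing from the sample of n records).
    return sum(1 - comb(total - c, n) / all_samples for c in counts)
```

Two sanity checks follow directly from the definition: a cell with 100 species and one record each gives an es50 of 50, and a cell where every record belongs to the same species gives 1.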

Not a bird download

Recently we were asked on GitHub whether there was a way to get all animal occurrences that are not birds. This seems like an easy enough request, but unfortunately there is currently no way to exclude groups from a download search and get everything but a certain group. A user can get all birds, but they can't get no birds! I thought this was an interesting question and probably useful for other people wanting smaller downloads, since there are currently around half a billion occurrence records for birds.

Big National Checklists

Here I plot the total names in checklists published on GBIF linked to a single country, grouped as big (15-300K total names), medium (5-15K total names) and small (0-5K total names). A checklist dataset is a term for any dataset that contains primarily a list of taxonomic names. National species checklists are lists of species recorded from a country, usually through some organized effort. GBIF has published a guide on best practices for making national checklist datasets, which advises making national checklists as big as possible.

GBIF checklist datasets and data gaps

A checklist dataset is a catch-all term describing any dataset that contains primarily a list of taxonomic names. The lines between a checklist dataset and an occurrence dataset can be blurry. GBIF classifies at least 6 types of datasets as checklists.

1. National (or regional) lists of species
2. Taxonomic list of species
3. Species description
4. Checklists made up of other checklists (GBIF backbone taxonomy & Catalogue of Life)
5. Checklists with occurrences
6. Checklists made from occurrences

The top two are probably what most people imagine when they think of a checklist dataset.