How to choose a dataset class on GBIF?

If you are a (first-time) publisher on GBIF and you are trying to decide which type of dataset would best fit your data, this blog post is for you. All the records shared on GBIF are organized into datasets. Each dataset is associated with some metadata describing its content (the classic “what, where, when, why, how”). The dataset’s content depends strongly on the dataset’s class. GBIF currently supports four types of dataset:

Understanding basis of record - a living specimen becomes a preserved specimen

Recently a user noticed that there were Asian Red Pandas (Ailuridae) occurring in North America, and wondered if someone had made a mistake. When an occurrence record comes from a zoo or botanical garden, it is usually considered a living specimen, but when it comes from a museum it is usually called a preserved specimen. This label, the basis of record, helps users filter out records they might not want, such as those that come from zoos.
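As a minimal sketch of that kind of filtering, assuming records are plain dictionaries carrying a `basisOfRecord` field with values such as `LIVING_SPECIMEN` or `PRESERVED_SPECIMEN`: the sample records and the helper name `drop_living_specimens` are made up for illustration and are not real GBIF data.

```python
# Basis-of-record values that this sketch treats as unwanted
# (zoo and botanical garden records).
EXCLUDED_BASES = {"LIVING_SPECIMEN"}


def drop_living_specimens(records):
    """Return only the records whose basisOfRecord is not excluded."""
    return [r for r in records if r.get("basisOfRecord") not in EXCLUDED_BASES]


# Made-up sample records, for illustration only.
records = [
    {"species": "Ailurus fulgens", "country": "US", "basisOfRecord": "LIVING_SPECIMEN"},
    {"species": "Ailurus fulgens", "country": "NP", "basisOfRecord": "HUMAN_OBSERVATION"},
    {"species": "Ailurus fulgens", "country": "US", "basisOfRecord": "PRESERVED_SPECIMEN"},
]

kept = drop_living_specimens(records)
print([(r["country"], r["basisOfRecord"]) for r in kept])
```

Note that the preserved specimen held in a North American museum is kept: the filter only removes living specimens, so whether museum records also need special handling depends on the analysis.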

Search, download, analyze and cite (repeat if necessary)

Finding and accessing data. There is a lot of GBIF-mediated data available: more than 1.3 billion occurrence records covering hundreds of thousands of species in all parts of the world. All free, open and available at the touch of a button. Users can download data through the portal, via the GBIF API, or with one of the third-party tools available for programmatic access, e.g. rgbif. If there is one area in which GBIF has been immensely successful, it’s making the data available to users.
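To give a rough idea of what programmatic access can look like, here is a small sketch that only builds a query URL for the occurrence search endpoint of the GBIF API, without sending any request. The helper name `occurrence_search_url` and the chosen filter values are assumptions made for illustration.

```python
from urllib.parse import urlencode

# Occurrence search endpoint of the GBIF API (v1).
SEARCH_ENDPOINT = "https://api.gbif.org/v1/occurrence/search"


def occurrence_search_url(**params):
    """Build a search URL from keyword filters, e.g. country or basisOfRecord."""
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"


# Illustrative filters: a handful of human observations from Denmark.
url = occurrence_search_url(country="DK", basisOfRecord="HUMAN_OBSERVATION", limit=5)
print(url)
```

The resulting URL can be opened in a browser or fetched with any HTTP client. The search endpoint is meant for browsing small result sets; larger extractions go through the download service.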

Six questions answered about the GBIF Backbone Taxonomy

This past week our informatics team has been updating the Backbone taxonomy. This is a fairly disruptive process which sometimes involves massive taxonomic changes but DON’T PANIC. This update is a good thing. It means that some of the taxonomic issues reported have been addressed (see for example this issue concerning the Xylophagidae family) and that new species are now visible on GBIF. Plus, it gives me an excellent opportunity to talk about the GBIF backbone taxonomy and answer some of the questions you might have.

Downloading occurrences from a long list of species in R and Python

It is now possible to download occurrences for up to 100,000 names on GBIF! Until recently it was not possible to download occurrences for more than a few hundred species at the same time, but download requests can now include up to 100,000(!) taxon keys. That should cover most use cases :) For such large requests, however, you will need to POST your query to the Occurrence Download API service: https://t.
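As a hedged sketch of what such a POST body might look like, the snippet below builds a JSON payload using an `in` predicate on `TAXON_KEY`. The field names follow the download request format as I understand it and should be checked against the Occurrence Download API documentation. The helper name, the user name, the email address and the taxon keys are placeholders, and nothing is sent over the network.

```python
import json

MAX_TAXON_KEYS = 100_000  # limit mentioned in the post


def build_download_request(taxon_keys, creator, email, fmt="SIMPLE_CSV"):
    """Build the JSON body for an occurrence download filtered by taxon keys."""
    keys = [str(k) for k in taxon_keys]
    if len(keys) > MAX_TAXON_KEYS:
        raise ValueError(f"too many taxon keys: {len(keys)} > {MAX_TAXON_KEYS}")
    return {
        "creator": creator,                # GBIF user name (placeholder below)
        "notificationAddresses": [email],  # where the completion notice is sent
        "sendNotification": True,
        "format": fmt,
        "predicate": {
            "type": "in",        # match any of the listed values
            "key": "TAXON_KEY",
            "values": keys,
        },
    }


# Placeholder keys and credentials, for illustration only.
payload = build_download_request([1001, 1002, 1003], "my_gbif_user", "me@example.org")
print(json.dumps(payload, indent=2))
```

The payload would then be sent as the body of an authenticated POST request with a `Content-Type: application/json` header, for example with the `requests` library in Python or `httr` in R.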