This (long-delayed; sorry, it’s 2020!) final post in my series about Project BirdMap goes into detail about getting the data, processing it, and using it to build the map. Fair warning: from this point, it’s less birdy and more nerdy.

The code for Project BirdMap is available on GitHub.

Getting data from EOD

In my "Building Project BirdMap" post, I talked about EOD, the eBird Observation Dataset. The full dataset is over 300 GB, but fortunately I could filter it down to just what I needed. For my Woodpeckers map, I used the Scientific name filter and browsed down through the bird taxonomy until I found Piciformes (the woodpecker-like order) and finally Picidae (just woodpeckers).

GBIF EOD dataset for Picidae

This took the dataset down to a more manageable 13 GB, but to shrink it a little further I filtered on the observation year and pulled only sightings from 2010 to 2020. I figured that if I wanted to know where to travel in 2020 and beyond, older observations would be less valuable. This left me with a download of 11 GB, or about 23.6 million bird sightings.
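The filters described above were applied before downloading. For illustration only, the same two filters (family and year range) could also be applied to a local file. The Python sketch below assumes a tab-separated export with `family` and `year` columns; those column names, the sample rows, and the `filter_sightings` helper are assumptions made for this example, not code from the project.

```python
import csv
import io


def filter_sightings(lines, family="Picidae", first_year=2010, last_year=2020):
    """Yield rows (as dicts) whose family and year match the filters.

    `lines` is any iterable of text lines, such as an open file, so a
    multi-gigabyte download can be streamed instead of loaded into memory.
    The `family` and `year` column names are assumptions about the export.
    """
    reader = csv.DictReader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if row.get("family") != family:
            continue
        year = row.get("year") or ""
        if not year.isdigit():
            continue  # skip records with a missing or malformed year
        if first_year <= int(year) <= last_year:
            yield row


# Tiny made-up sample standing in for the real download.
sample = io.StringIO(
    "species\tfamily\tyear\n"
    "Dryobates pubescens\tPicidae\t2015\n"
    "Dryobates pubescens\tPicidae\t2005\n"
    "Cyanocitta cristata\tCorvidae\t2018\n"
    "Melanerpes carolinus\tPicidae\t\n"
    "Melanerpes carolinus\tPicidae\t2020\n"
)

kept = list(filter_sightings(sample))
for row in kept:
    print(row["species"], row["year"])
```

With a real file, the same function could be fed an open file handle in place of the `io.StringIO` sample, so only one row is held in memory at a time.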
