This package provides functionality to access, download, and manipulate data from the Coordinated Waterbird Counts project. It is possible to download these same data using the CWAC API, but being able to pull these data straight into R, in a standard format, should make them more accessible, easier to analyse, and eventually make our analyses more reliable and reproducible.
There is another package, ABAP, that provides similar functionality, but in that case for downloading count data from the African Bird Atlas Project. In addition, there is a companion package, ABDtools, which adds the functionality needed to annotate different data formats (points and polygons) with environmental information from the Google Earth Engine data catalog.
To install CWAC from GitHub using the remotes package, run:
install.packages("remotes") remotes::install_github("AfricaBirdData/CWAC")
There are two possible ways of downloading data from CWAC. We can select a site and download data for all species in that site or, vice versa, we can select a species and download all data for that species across all sites. In this section, we will explore the former and download all data associated with a CWAC site.
Let's use Barberspan, a CWAC site located in the North West province of South Africa, as an example. The first thing we need to do is find our site's ID code. We can use the function `listCwacSites()` to list all the sites in the North West province and find this code.
```r
library(CWAC)

# List all sites in the North West province
nw_sites <- listCwacSites(.region_type = "province", .region = "North West")

# Find the code for Barberspan
site_id <- nw_sites[nw_sites$LocationName == "Barberspan", "LocationCode", drop = TRUE]

# We can find more info about this site with getCwacSiteInfo()
getCwacSiteInfo(site_id)
```
Once we have the code for the CWAC site, we can download the count data collected there using the function `getCwacSiteCounts()`. This will download all the CWAC cards submitted for Barberspan.
```r
bp_counts <- getCwacSiteCounts(site_id)
```
We can do all of this using a dplyr approach.
```r
library(dplyr, warn.conflicts = FALSE)

bp_counts <- listCwacSites(.region_type = "province", .region = "North West") %>%
  filter(LocationName == "Barberspan") %>%
  pull("LocationCode") %>%
  getCwacSiteCounts()
```
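Whichever route we take, it is worth having a quick look at what came back before analysing it. A minimal sanity check, using only generic functions so that no assumptions are made about the exact columns returned:

```r
# A quick, generic look at the downloaded cards (no assumptions about column names)
nrow(bp_counts)            # number of records downloaded
dplyr::glimpse(bp_counts)  # column types and a preview of the values
```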
We might not be interested in any particular site, but in all records of certain species. We can follow a similar procedure: 1) find the species code, and 2) download the data.
To find the species code we can use the function `listCwacSpp()`, which lists all CWAC species, and from this list we can find the one we want. Say we are interested in the African Black Duck.
```r
# List all species
spp_all <- listCwacSpp()

# Find the code for the African Black Duck
sp_id <- spp_all[spp_all$Common_species == "African Black" &
                   spp_all$Common_group == "Duck",
                 "SppRef", drop = TRUE]
```
Once we have the code, we can download the data for this species across all CWAC sites with the function `getCwacSppCounts()`.
```r
bd_counts <- getCwacSppCounts(sp_id)
```
You may get some warnings from the parser because some data were not entered in the expected standard format. We chose to print these messages so that you can check whether your analysis will be affected.
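If you do see such warnings, one simple, package-agnostic way to gauge their impact is to look at how much missing data ended up in the table, for example:

```r
# Count missing values per column; parsing problems usually show up as NAs
colSums(is.na(bd_counts))
```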
Again, we can also do all this using dplyr.
```r
library(dplyr, warn.conflicts = FALSE)

bd_counts <- listCwacSpp() %>%
  filter(Common_species == "African Black",
         Common_group == "Duck") %>%
  pull("SppRef") %>%
  getCwacSppCounts()
```
With the function `getCwacSiteBoundary()` we can download the polygons enclosing CWAC sites. Please note that these polygons are not always up to date, or even present for all sites in the database, and therefore we should always check that they are accurate after downloading.
`getCwacSiteBoundary()` can retrieve boundaries for multiple sites at once, in a single API call, making the process much more efficient. It is therefore always a good idea to retrieve all the sites you need in a single call rather than one at a time.
```r
# Download the counts for the African Black Duck again, in case we no longer have them
counts <- listCwacSpp() %>%
  filter(Common_species == "African Black",
         Common_group == "Duck") %>%
  pull("SppRef") %>%
  getCwacSppCounts()

# Then let's extract the boundaries of the CWAC sites in our data.
# At the moment getCwacSiteBoundary() can only retrieve boundaries from sites of
# one province/country at a time.

# First identify the countries in our data
unique(counts$Country)

# It's only South Africa at the moment, so we can pull all the sites at once.
# Otherwise, we would just repeat the process for the different countries.
sites <- unique(counts$LocationCode)

boundaries <- getCwacSiteBoundary(loc_code = sites,
                                  region_type = "country",
                                  region = "South Africa")

# Add boundaries to the count data
counts_bd <- counts %>%
  dplyr::left_join(boundaries, by = "LocationCode")
```
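Because, as noted above, not every site has a boundary in the database, it is worth checking which of our sites came back without one before relying on `counts_bd`. A minimal check, using only the `LocationCode` column we already joined on:

```r
# Sites in our count data for which getCwacSiteBoundary() returned no boundary
setdiff(sites, boundaries$LocationCode)
```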
First clone the repository to your local machine:
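If you prefer to stay in R, one option is the usethis package (this is our own suggestion, not a project requirement; cloning with Git on the command line or through RStudio's version-control menu works just as well, and `fork = TRUE` needs a GitHub personal access token):

```r
# install.packages("usethis")  # if not installed yet
# Clone the AfricaBirdData/CWAC repository into a new local project.
# Set fork = TRUE if you do not have push access and want to work from your own fork.
usethis::create_from_github("AfricaBirdData/CWAC", fork = FALSE)
```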
For site owners:
There is a danger of multiple people working simultaneously on the project code. If you make changes locally and others push their changes before you push yours, there may be conflicts, because the HEAD of the main branch will have moved since you started working.
To deal with these lurking issues, I would suggest opening and working on a topic branch. This is just a regular branch with a short lifespan: you open it, work on it, merge it back into the main branch, and delete it once the task is done.
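As a rough illustration of those steps, here is a sketch using the gert package so the whole workflow stays in R (this is only an example on our part; the equivalent git commands on the command line work just the same, and the branch name is made up):

```r
library(gert)

# 1. Open a short-lived topic branch and switch to it
git_branch_create("fix-readme-typo")

# 2. Work on the code, then stage, commit and push the changes
git_add(".")
git_commit("Fix typo in README")
git_push()

# 3. When the task is done, merge the branch back into main...
git_branch_checkout("main")
git_pull()                       # pick up anything others pushed in the meantime
git_merge("fix-readme-typo")
git_push()

# 4. ...and delete the topic branch to keep things tidy
git_branch_delete("fix-readme-typo")
```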
Opening branches is quick and easy, so there is no harm in opening multiple branches a day. However, it is important to merge and delete them often to keep things tidy. Git provides functionality to deal with conflicting branches. More about branches here:
https://git-scm.com/book/en/v2/Git-Branching-Branches-in-a-Nutshell
Another idea is to use the 'Issues' tab that you find in the project header on GitHub. There we can report problems with the package, assign tasks, and let other contributors know what we will be working on.