In saramortara/rocc: Workflows for biodiversity data download and cleaning

knitr::opts_chunk$set(echo = TRUE)
library(Rocc)
library(knitr)
library(dplyr)

0. Installing and loading package

remotes::install_github("liibre/Rocc")
library(dplyr)
library(Rocc)

1. Download and bind data from different sources

Here, we have a short list of two fern species.

species_search <- c("Asplenium truncorum", "Lindsaea lancea")

We download occurrence records for both species from two sources: speciesLink and GBIF.

speciesLink

data_splink <- list()
for (sp in species_search) {
  data_splink[[sp]] <- rspeciesLink(species = sp,
                                    filename = paste0(gsub(" ", "_", sp), "_splink"))
}

df_splink <- bind_rows(data_splink, .id = "species_search") 
dim(df_splink)
unique(df_splink$species_search)
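The filename argument in the loop above is built from the species name with spaces replaced by underscores plus a source suffix. As a minimal sketch of that naming step only, the hypothetical helper make_stem() below (not part of Rocc) reproduces the paste0(gsub(...)) expression so the resulting stems can be inspected before any download is run.

```r
# Species list from the workflow above
species_search <- c("Asplenium truncorum", "Lindsaea lancea")

# make_stem() is a hypothetical helper for illustration, not a Rocc function.
make_stem <- function(species, source) {
  # fixed = TRUE treats " " as a literal string rather than a regular expression
  paste0(gsub(" ", "_", species, fixed = TRUE), "_", source)
}

# gsub() and paste0() are vectorised, so all stems are built in one call
stems <- make_stem(species_search, "splink")
print(stems)
```

Building the stems up front makes it easy to spot names that would produce awkward file names before they are passed to the download function.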

GBIF

data_gbif <- list()

for (sp in species_search) {
  data_gbif[[sp]] <- rgbif2(species = sp,
                            filename = paste0(gsub(" ", "_", sp), "_gbif"))
}

names(data_gbif) <- species_search
df_gbif <- bind_rows(data_gbif, .id = "species_search")
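Both download loops depend on remote services, so a single failed request would stop the whole loop. The sketch below shows one way to keep going and record which species failed, using base R's tryCatch(). The function fetch_records() is a hypothetical stand-in for a download function such as rspeciesLink() or rgbif2(), and it fails on purpose for one species so the example runs offline.

```r
# fetch_records() is a hypothetical stand-in for a real download function;
# it simulates a failure for one species so the example needs no network.
fetch_records <- function(species) {
  if (species == "Lindsaea lancea") stop("simulated network error")
  data.frame(scientificName = species, stringsAsFactors = FALSE)
}

species_search <- c("Asplenium truncorum", "Lindsaea lancea")
records <- list()
failed <- character(0)

for (sp in species_search) {
  # Return NULL instead of aborting the loop when a download fails
  result <- tryCatch(
    fetch_records(sp),
    error = function(e) {
      message("Download failed for ", sp, ": ", conditionMessage(e))
      NULL
    }
  )
  if (is.null(result)) {
    failed <- c(failed, sp)
  } else {
    records[[sp]] <- result
  }
}

print(names(records))  # species that were downloaded
print(failed)          # species to retry later
```

Only the successful entries remain in the list, so it can still be passed to bind_rows() with .id = "species_search" as in the loops above.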

2. Binding data from different sources

df <- bind_dwc(splink_data = df_splink, gbif_data = df_gbif)

3. Check string in species name

Because the data may come from sources that contain errors, we run a basic check on the species name strings. To avoid checking the same name repeatedly, we first select the unique entries in the species name column.

# Vector of unique entries in species names
species_name_raw <- unique(df$scientificName)

For the unique entries, we will perform a basic check on the string.

species_name_check  <- check_string(species_name_raw)
species_name_check
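To give a feel for what a string check of this kind can look like, here is a deliberately simplified sketch. It is not the implementation of check_string(), which applies its own rules. The function classify_name() and the status label "other" are hypothetical; only the labels possibly_ok and name_w_authors and the column names verbatimSpecies and speciesStatus come from the workflow in this vignette.

```r
# classify_name() is a hypothetical, simplified classifier for illustration;
# it does not reproduce the rules used by Rocc's check_string().
classify_name <- function(name) {
  words <- strsplit(trimws(name), "\\s+")[[1]]
  # Does the name start with a capitalised genus and a lower-case epithet?
  starts_as_binomial <- length(words) >= 2 &&
    grepl("^[A-Z][a-z]+$", words[1]) &&
    grepl("^[a-z-]+$", words[2])
  if (!starts_as_binomial) return("other")  # "other" is a made-up label
  # Exactly two words: a plain binomial
  if (length(words) == 2) return("possibly_ok")
  # Extra tokens after the binomial are assumed here to be an author string
  "name_w_authors"
}

toy_names <- c("Asplenium truncorum", "Lindsaea lancea (L.) Bedd.", "Asplenium")
toy_status <- vapply(toy_names, classify_name, character(1), USE.NAMES = FALSE)

toy_check <- data.frame(verbatimSpecies = toy_names,
                        speciesStatus = toy_status,
                        stringsAsFactors = FALSE)
print(toy_check)
```

A real check has to handle many more cases, which is why the workflow relies on check_string() rather than a hand-written rule.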

We keep only the names whose status is possibly_ok or name_w_authors, and filter the occurrence data to the records carrying those names.

verbatimSpecies_ok <- species_name_check$verbatimSpecies[
  species_name_check$speciesStatus %in% c("possibly_ok", "name_w_authors")
]
df_ok <- df[df$scientificName %in% verbatimSpecies_ok, ]

This cleaning step reduced the data from `r nrow(df)` occurrences to `r nrow(df_ok)` occurrences.
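Before discarding records it can help to see how many were dropped and which names caused it. The sketch below repeats the filtering step on made-up toy data, so it runs without any download. The objects df_toy and check_toy and the status label "other" are assumptions for illustration; the column names follow those used in the filtering code of this vignette.

```r
# Toy stand-ins for df and species_name_check; all values are made up.
df_toy <- data.frame(
  scientificName = c("Asplenium truncorum", "Asplenium truncorum",
                     "Lindsaea lancea (L.) Bedd.", "Asplenium"),
  stringsAsFactors = FALSE
)
check_toy <- data.frame(
  verbatimSpecies = c("Asplenium truncorum", "Lindsaea lancea (L.) Bedd.",
                      "Asplenium"),
  speciesStatus = c("possibly_ok", "name_w_authors", "other"),
  stringsAsFactors = FALSE
)

# Same filtering logic as in the workflow
keep_status <- c("possibly_ok", "name_w_authors")
names_ok <- check_toy$verbatimSpecies[check_toy$speciesStatus %in% keep_status]
df_toy_ok <- df_toy[df_toy$scientificName %in% names_ok, , drop = FALSE]

# Summarise what the filter removed
n_removed <- nrow(df_toy) - nrow(df_toy_ok)
removed_names <- setdiff(unique(df_toy$scientificName), names_ok)
print(n_removed)
print(removed_names)
```

Inspecting the removed names is a cheap safeguard: a name dropped by mistake is easier to spot here than after the cleaned file has been written.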

Finally, we can write the resulting occurrence data to disk.

write.csv(df_ok, "results/occurrence_data.csv", row.names = FALSE)
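Note that write.csv() does not create missing directories, so the call above fails if the results folder does not exist yet. A minimal sketch of a safer write is shown below. It uses a temporary directory and a small made-up data frame (df_example) so that it leaves no files behind; in the actual workflow the target directory would be "results" and the data frame would be df_ok.

```r
# A temporary directory is used so the example leaves nothing behind;
# in the workflow the target directory would be "results".
out_dir <- file.path(tempdir(), "results")
dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)

# df_example is made-up data standing in for df_ok
df_example <- data.frame(
  scientificName = c("Asplenium truncorum", "Lindsaea lancea"),
  stringsAsFactors = FALSE
)

out_file <- file.path(out_dir, "occurrence_data.csv")
write.csv(df_example, out_file, row.names = FALSE)

# Read the file back to confirm that the rows survived the round trip
df_back <- read.csv(out_file, stringsAsFactors = FALSE)
print(nrow(df_back))
```

With showWarnings = FALSE, dir.create() stays silent when the directory already exists, so the same lines can be rerun safely.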

saramortara/rocc documentation built on April 3, 2022, 3:41 p.m.
