knitr::opts_chunk$set( collapse = TRUE, eval = TRUE, echo = TRUE, comment = "#>" )
The R package forcis
is an interface to the FORCIS database on global foraminifera distribution (Chaabane et al. 2023). This database includes data on living planktonic foraminifera diversity and distribution in the global oceans from 1910 until 2018 collected using plankton tows, continuous plankton recorder, sediment traps and plankton pump from the global ocean.
This package has been developed for researchers interested in working with the FORCIS database, even without advanced R skills. It provides basic functions to facilitate the handling of this large database, including functions to download, select, filter, homogenize, and visualize the data. It also enables users to explore the spatial distribution and temporal evolution of planktonic foraminifera.
This vignette is an overview of the main features of the package.
To install the forcis
package, run:
## Install < remotes > package (if not already installed) ---- if (!requireNamespace("remotes", quietly = TRUE)) { install.packages("remotes") } ## Install dev version of < forcis > from GitHub ---- remotes::install_github("FRBCesab/forcis")
The
forcis
package depends on thesf
package which requires some spatial system libraries (GDAL and PROJ). Please read this page if you have any trouble to installforcis
.
Now let's attach the required packages.
library(forcis)
The FORCIS database consists of a collection of five csv
files hosted on Zenodo. These csv
are regularly updated and we recommend to use the latest version
Let's download the latest version of the FORCIS database with download_forcis_db()
:
# Create a data/ folder in the current directory ---- dir.create("data") # Download latest version of the database ---- download_forcis_db( path = "data", version = NULL )
By default (i.e. version = NULL
), this function downloads the latest version of the database. The database is saved in data/forcis-db/version-99/
, where 99
is the version number.
N.B. The package forcis
is designed to handle the versioning of the database on Zenodo. Read the Database versions for more information.
In this vignette, we will use the plankton nets data of the FORCIS database. Let's import the latest release of the data.
file_name <- system.file( file.path("extdata", "FORCIS_net_sample.csv"), package = "forcis" ) net_data <- read.csv(file_name) |> tibble::as_tibble()
# Import plankton nets data ---- net_data <- read_plankton_nets_data(path = "data")
# Print data ----
net_data
N.B. For this vignette, we use a subset of the plankton nets data, not the whole dataset.
The FORCIS database provides three different taxonomies:
OT
: original taxonomy, i.e. the initial list of species names and attributes (e.g., shell pigmentation, coiling direction) as reported in various datasets and studies.VT
: validated taxonomy, i.e. a refined version of the original taxonomy that resolves issues of synonymy (different names for the same taxon) and shifting taxonomic concepts.LT
: lumped taxonomy, i.e. a simplified version of the validated taxonomy. It merges taxa that are difficult to distinguish across datasets (morphospecies), ensuring consistency and comparability in broader analyses. See the associated data paper for further information.
After importing the data and before going any further, the next step involves choosing the taxonomic level for the analyses. This is mandatory to avoid duplicated records.
Let's use the function select_taxonomy()
to select the VT taxonomy (validated taxonomy):
# Select taxonomy ---- net_data_vt <- net_data |> select_taxonomy(taxonomy = "VT") net_data_vt
This function has removed species columns associated with other taxonomies.
At this stage user can choose what he/she wants to do with this cleaned dataset. In the next sections, we present some use cases.
In this first use case, we want to have an overview of our data.
# How many subsamples do we have? ---- nrow(net_data_vt) # How many species have been sampled? ---- net_data_vt |> get_species_names() |> length()
We can use the plot_record_by_year()
function to display the number of samples per year.
# What is the temporal extent? ---- plot_record_by_year(net_data_vt)
The plot_record_by_month()
and plot_record_by_season()
are also available to display samples at different temporal resolutions.
Let's use the ggmap_data()
function to get an idea of the spatial extent of these data.
# What is the spatial extent? ---- ggmap_data(net_data_vt)
The Data visualization vignette provides a complete description of all plotting functions available in forcis
.
In this second use case we want to answer the following question:
What is the distribution of the planktonic foraminifera species Neogloboquadrina pachyderma between 1970 and 2000 in the Mediterranean Sea?
We can divide the problem into different stages:
# Get all species names ---- species_list <- net_data_vt |> get_species_names() # Search for species containing the word 'pachyderma' ---- species_list[grep("pachyderma", species_list)] # Store the species name ---- sp_name <- "n_pachyderma_VT"
# Filter data by species ---- net_data_vt_pachyderma <- net_data_vt |> filter_by_species(species = sp_name) net_data_vt_pachyderma # Remove empty samples for N. pachyderma ---- net_data_vt_pachyderma <- net_data_vt_pachyderma |> dplyr::filter(n_pachyderma_VT > 0) net_data_vt_pachyderma
# Filter data by years ---- net_data_vt_pachyderma_7000 <- net_data_vt_pachyderma |> filter_by_year(years = 1970:2000) # Number of records ---- nrow(net_data_vt_pachyderma_7000)
# Get the list of ocean names ---- get_ocean_names() # Filter data by ocean ---- net_data_vt_pachyderma_7000_med <- net_data_vt_pachyderma_7000 |> filter_by_ocean(ocean = "Mediterranean Sea") # Number of records ---- nrow(net_data_vt_pachyderma_7000_med)
# Plot N. pachyderma records on a World map ---- ggmap_data(net_data_vt_pachyderma_7000_med)
Finally, we can combine all these steps into one single pipeline:
# Final use case 2 code ---- net_data_vt |> filter_by_species(species = "n_pachyderma_VT") |> dplyr::filter(n_pachyderma_VT > 0) |> filter_by_year(years = 1970:2000) |> filter_by_ocean(ocean = "Mediterranean Sea") |> ggmap_data()
The Select, reshape, and filter data vignette shows examples to handle FORCIS data.
Additional vignettes are available depending on user wishes:
forcis
to compute abundances, concentrations, and frequenciesforcis
Chaabane S, De Garidel-Thoron T, Giraud X, Schiebel R, Beaugrand G, Brummer G-J, Casajus N, Greco M, Grigoratou M, Howa H, Jonkers L, Kucera M, Kuroyanagi A, Meilland J, Monteiro F, Mortyn G, Almogi-Labin A, Asahi H, Avnaim-Katav S, Bassinot F, Davis CV, Field DB, Hernández-Almeida I, Herut B, Hosie G, Howard W, Jentzen A, Johns DG, Keigwin L, Kitchener J, Kohfeld KE, Lessa DVO, Manno C, Marchant M, Ofstad S, Ortiz JD, Post A, Rigual-Hernandez A, Rillo MC, Robinson K, Sagawa T, Sierro F, Takahashi KT, Torfstein A, Venancio I, Yamasaki M & Ziveri P (2023) The FORCIS database: A global census of planktonic Foraminifera from ocean waters. Scientific Data, 10, 354. DOI: 10.1038/s41597-023-02264-2.
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.