knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-"
)

# Truncate long chunk output to `out.lines` lines, keeping the head and tail.
hook_output <- knitr::knit_hooks$get("output")
knitr::knit_hooks$set(output = function(x, options) {
  if (!is.null(n <- options$out.lines)) {
    x <- xfun::split_lines(x)
    if (length(x) > n) {
      if (x[length(x)] == "") {
        x <- c(head(x, n / 2), "....", tail(x, n / 2 + 1))
      } else {
        x <- c(head(x, n / 2), "....", tail(x, n / 2))
      }
    }
  }
  hook_output(x, options)
})
paleobioDB is a package for downloading, visualizing and processing data from the Paleobiology Database.
Install the latest release from CRAN:
install.packages("paleobioDB")
Alternatively, you can install the development version of paleobioDB from GitHub with:
# install.packages("devtools")
devtools::install_github("ropensci/paleobioDB")
paleobioDB has 19 functions to wrap most endpoints of the PaleobioDB API, plus 8 functions to visualize and process the fossil data. The API documentation for the Paleobiology Database can be found here.
pbdb_occurrences
Here is an example of how to download all fossil occurrences that belong to the family Canidae in the Quaternary:
library(paleobioDB)

canidae <- pbdb_occurrences(
  base_name = "canidae",
  interval = "Quaternary",
  show = c("coords", "classext"),
  vocab = "pbdb",
  limit = "all"
)

dim(canidae)
head(canidae, 3)
Note that if the plotting and analysis functions of this package are going to be used (as demonstrated in the sections below), it is necessary to specify the parameter show = c("coords", "classext") in the pbdb_occurrences() function. This returns taxonomic and geographic information for the occurrences that is required by these functions.
Beware of synonyms and errors: they can distort your estimates of species richness, evolutionary and extinction rates, etc. paleobioDB users should be critical of the raw data downloaded from the database and filter the data before analyzing it.
For instance, when using base_name to download information with the function pbdb_occurrences(), check for synonyms and errors that could appear in accepted_name, genus, etc. If they are not corrected or removed, they will inflate the richness of genera.
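As a minimal sketch of this kind of check, the snippet below tabulates names and corrects one entry by hand. It runs on a small invented data frame rather than a real download. The column names accepted_name, accepted_rank and genus are assumed to match what pbdb_occurrences() returns with vocab = "pbdb", and the misspelling "Cannis" is made up for illustration.

```r
# Toy stand-in for a pbdb_occurrences() result; all values are invented.
occ <- data.frame(
  accepted_name = c("Canis lupus", "Canis lupus", "Canis",
                    "Vulpes vulpes", "Cannis latrans"),
  accepted_rank = c("species", "species", "genus", "species", "species"),
  genus = c("Canis", "Canis", "Canis", "Vulpes", "Cannis"),
  stringsAsFactors = FALSE
)

# Tabulate names to spot synonyms, misspellings and odd entries by eye.
name_counts <- sort(table(occ$accepted_name), decreasing = TRUE)
n_genera_raw <- length(unique(occ$genus))

# Keep only records resolved to species level.
species_occ <- occ[occ$accepted_rank == "species", ]

# Manual correction of the (invented) misspelling found in the table above.
species_occ$genus[species_occ$genus == "Cannis"] <- "Canis"

n_genera <- length(unique(species_occ$genus))
```

In this toy example the uncorrected data suggest three genera while the cleaned data contain two. Which entries count as synonyms or errors in real data is a judgment that has to be made against the taxonomic literature.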
pbdb_map
Plots a map showing fossil occurrences and invisibly returns a data frame with the number of occurrences per coordinate.
(pbdb_map(canidae))
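The parentheses around the call above matter because the data frame is returned invisibly. The following sketch shows the general R mechanism with a hypothetical helper, count_per_coordinate(), which is not part of paleobioDB and only mimics the "return invisibly" behaviour.

```r
# Hypothetical helper (not a paleobioDB function): counts records per
# coordinate pair and returns the result invisibly, as a plotting function
# with a useful return value typically does.
count_per_coordinate <- function(lng, lat) {
  records <- data.frame(lng = lng, lat = lat, n = 1)
  counts <- aggregate(n ~ lng + lat, data = records, FUN = sum)
  invisible(counts)
}

lng <- c(10, 10, 20)
lat <- c(45, 45, 50)

count_per_coordinate(lng, lat)    # nothing is printed
(count_per_coordinate(lng, lat))  # parentheses force printing
counts_df <- count_per_coordinate(lng, lat)  # or assign and reuse
```

Assigning the result is usually the more convenient option when the counts are needed in later steps.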
pbdb_map_occur
Returns a map and a raster object with the sampling effort (number of fossil records per cell). The user can change the resolution of the cells.
pbdb_map_occur(canidae, res = 5)
pbdb_map_richness
Returns a map and a raster object with the number of different species, genera, families, etc. per cell. As with pbdb_map_occur(), the user can change the resolution of the cells.
pbdb_map_richness(canidae, res = 5, rank = "species")
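To make "sampling effort per cell" and "richness per cell" concrete, here is a conceptual sketch in base R on invented data. It does not use terra and is not how the package builds its rasters. The column names lng, lat and accepted_name are assumed to match the output of pbdb_occurrences() with show = "coords" and vocab = "pbdb".

```r
# Invented occurrences; coordinates are in decimal degrees.
occ <- data.frame(
  lng = c(1.2, 3.9, 4.1, 7.5, 8.0),
  lat = c(41.0, 43.5, 44.9, 46.1, 47.3),
  accepted_name = c("Canis lupus", "Vulpes vulpes", "Canis lupus",
                    "Canis lupus", "Canis lupus"),
  stringsAsFactors = FALSE
)
res <- 5  # cell size in degrees

# Lower-left corner of the grid cell each record falls in.
occ$cell_lng <- floor(occ$lng / res) * res
occ$cell_lat <- floor(occ$lat / res) * res

# Sampling effort: number of records per cell.
effort <- aggregate(accepted_name ~ cell_lng + cell_lat, data = occ,
                    FUN = length)
names(effort)[3] <- "n_records"

# Richness: number of distinct names per cell.
richness <- aggregate(accepted_name ~ cell_lng + cell_lat, data = occ,
                      FUN = function(x) length(unique(x)))
names(richness)[3] <- "n_taxa"
```

A larger res merges neighbouring cells, which raises both counts per cell at the cost of spatial detail.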
If you do not need the plot and you are only interested in obtaining a richness raster for some other purposes, you could use the argument do_plot = FALSE. For instance, this returns the same raster object as above (a SpatRaster object from the terra package) without plotting it:
pbdb_map_richness(canidae, res = 5, rank = "species", do_plot = FALSE)
The do_plot argument is available in all the functions in the package that produce a plot and return an object. This means that the plot is optional in all the other plotting functions that are described here. Check their documentation for more details.
pbdb_temp_range
Returns a data frame and a plot with the time span of the species, genera, families, etc. in your query. Make sure that enough vertical space is provided in the graphics device used to do the plotting if there are many taxa of the specified rank in your data set.
pbdb_temp_range(canidae, rank = "species")
pbdb_richness
Returns a data frame and a plot with the number of species (or genera, families, etc.) across time. You should set the temporal extent and the temporal resolution for the steps.
pbdb_richness(canidae, rank = "species", temporal_extent = c(0, 5), res = 0.5)
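The sketch below illustrates how a temporal extent and a resolution translate into time bins and a richness count per bin. It uses invented ranges, and it assumes that a taxon is counted in every bin that its range overlaps. Whether pbdb_richness() treats bin boundaries in exactly this way is not confirmed here. The column names max_ma and min_ma are likewise an assumption.

```r
# Invented temporal ranges (in Ma) for three hypothetical taxa.
ranges <- data.frame(
  taxon = c("A", "B", "C"),
  max_ma = c(4.2, 1.8, 0.9),  # older bound of the range
  min_ma = c(1.1, 0.0, 0.3),  # younger bound of the range
  stringsAsFactors = FALSE
)

temporal_extent <- c(0, 5)
res <- 0.5

# Bin edges from the extent and the step size.
breaks <- seq(temporal_extent[1], temporal_extent[2], by = res)
bins <- data.frame(from = head(breaks, -1), to = tail(breaks, -1))

# Count the taxa whose range overlaps each bin.
bins$richness <- vapply(seq_len(nrow(bins)), function(i) {
  sum(ranges$max_ma > bins$from[i] & ranges$min_ma < bins$to[i])
}, integer(1))
```

With temporal_extent = c(0, 5) and res = 0.5 this produces ten bins. A smaller res gives a finer curve but leaves more bins with few or no taxa.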
pbdb_orig_ext
Returns a data frame and a plot with the number of appearances and disappearances of taxa between consecutive time intervals in the data you provide. These time intervals are defined by the temporal extent (temporal_extent) and resolution (res) arguments. This is another way of visualizing the same information that is shown in the pbdb_temp_range() plot. orig_ext = 1 plots new appearances:
pbdb_orig_ext(
  canidae,
  rank = "species",
  orig_ext = 1,
  temporal_extent = c(0, 5),
  res = 0.5
)
And orig_ext = 2 plots disappearances of taxa between time intervals in the provided data frame.
pbdb_orig_ext(
  canidae,
  rank = "species",
  orig_ext = 2,
  temporal_extent = c(0, 5),
  res = 0.5
)
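One way to picture what is being counted is to compare presence in consecutive time bins. The sketch below is a conceptual illustration on invented ranges, not the package's implementation. It assumes that a taxon is present in every bin its range overlaps and that time runs from the older bin to the younger one. The column names max_ma and min_ma are also an assumption.

```r
# Invented temporal ranges (in Ma) for three hypothetical taxa.
ranges <- data.frame(
  taxon = c("A", "B", "C"),
  max_ma = c(4.2, 1.8, 0.9),
  min_ma = c(1.1, 0.0, 0.3),
  stringsAsFactors = FALSE
)

breaks <- seq(0, 5, by = 0.5)
from <- head(breaks, -1)
to <- tail(breaks, -1)

# presence[i, j] is TRUE when taxon i overlaps bin j (bins run young to old).
presence <- outer(
  seq_len(nrow(ranges)), seq_along(from),
  function(i, j) ranges$max_ma[i] > from[j] & ranges$min_ma[i] < to[j]
)

# Pair each bin (younger) with the next, older one.
n_bins <- length(from)
younger <- presence[, -n_bins, drop = FALSE]
older <- presence[, -1, drop = FALSE]

# Appearance: absent in the older bin, present in the younger one.
appearances <- colSums(younger & !older)
# Disappearance: present in the older bin, absent in the younger one.
disappearances <- colSums(older & !younger)
```

Each element of the two vectors refers to one boundary between consecutive bins. Taxa whose range reaches 0 Ma, such as taxon "B" here, never register a disappearance.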
pbdb_subtaxa
Returns a plot and a data frame with the number of species, genera, families, etc. in your dataset.
pbdb_subtaxa(canidae)
pbdb_temporal_resolution
Returns a plot and a list with a summary of the temporal resolution of the fossil records.
pbdb_temporal_resolution(canidae)
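As a rough illustration of what a temporal resolution summary involves, the sketch below takes the resolution of a record to be the width of its age range, that is, the older bound minus the younger bound. Both this definition and the column names max_ma and min_ma are assumptions made for the example, and the ages are invented.

```r
# Invented age bounds (in Ma) for four hypothetical fossil records.
occ <- data.frame(
  max_ma = c(2.58, 0.781, 0.126, 5.333),
  min_ma = c(0.0117, 0.126, 0.0117, 2.58)
)

# Width of the dating interval of each record.
resolution <- occ$max_ma - occ$min_ma

# Summary statistics of the interval widths.
resolution_summary <- summary(resolution)
resolution_summary
```

Records with wide intervals are poorly constrained in time, so it can be worth filtering them out before analyses that use fine time bins.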
We include a Dockerfile to ease working on the package as it fulfills all its system dependencies.
How to load the package with Docker:
Install Docker. Reference here: https://docs.docker.com/get-started/
Build the docker image. From the root folder of this repository, type:
docker build -t rpbdb Docker
This command will create a docker image in your system based on some of the rocker/tidyverse images. You can see the new image with docker image ls.
Run a container from the image, replacing <password> with a password of your choice:
docker run -d --name="rpbdb_rstudio" --rm -p 8787:8787 \
  -e PASSWORD=<password> -v $PWD:/home/rstudio rpbdb
This will start a container with access to your current folder where all the code of the package is. Inside the container, the code will be located in /home/rstudio. It also exposes the port 8787 of the container so you may access the RStudio web application which is bundled in the Rocker base image.
Then, you can either:
Navigate to http://localhost:8787. Enter with username "rstudio" and the password you used in the command above.
Or you can enter the container via console with:
docker exec -ti -u rstudio -w /home/rstudio rpbdb_rstudio R
Either from RStudio or from within the container you can install the package from source with:
install.packages(".", repos = NULL, type = "source")
When you are done, stop the container with:
docker container stop rpbdb_rstudio
Please report any issues or bugs.
License: GPL-2
citation("paleobioDB")
This package is part of the rOpenSci project.