restax
is a R package with implements the methods of Cuffney et al. (2007) to
resolve ambiguous taxa.
Currently the following functions are available:
rpkc_*()
mcwp_*()
rpmc_*()
dpac_*()
All functions are available in two variants:
*_s
: Resolve separately for each sample.*_g
: Resolve for a group of samples.Moreover, some functions have additional options:
option = 'C'
: if the ambiguous parent has no child in a sample,
substitute the most frequently occurring child for the parent.option = 'L'
: if the ambiguous parent has no child in a sample,
substitute all of the children associated with the parent in the grouped data.level = 'Family'
: remove all ambiguous parents above the specified level
(here family) before resolving ambiguous taxa.restax
is currently only available on github. To install restax
use:
install.packages('devtools') require(devtools) install_github('restax', 'EDiLD') require(restax)
Let's say we have our data in the wide format (as commonly used in community ecology). Here we have data of 5 samples (S1-S4, A) with 12 taxa:
require(restax) data(samp) samp
restax
currently accepts only the transposed wide format. So we need to reformat it
df <- data.frame(t(samp), stringsAsFactors = FALSE) df[ , 'taxon'] <- rownames(df) df
The first step is to retrieve taxonomic information about our taxa:
This is done via the get_hier()
function,
which uses taxize to query ITIS:
df_w <- get_hier(df, taxa.var = 'taxon', db = 'itis') df_w
We need to specify the column containing the taxon names. This function returns the original data and accompanying taxonomic information.
To resolve ambiguous taxa using the RPKC-S
method (remove parents, keep childs - per sample) for sample A we use:
df_res <- rpkc_s(df_w, value.var = 'A') df_res$comm
We see that only the 5 Baetidae species and Argia sp. are kept, all others (as they are parents) are set to zero.
Similarly, all samples can be resolved with on call by omitting value.var
. However, this currently works only with group variants
*_g()
. If group
is ommited all samples are used as grouping variables.
Here we resolve all samples simultaneously using all samples as group using the RPMC-G
variant:
# remove sample A df <- data.frame(t(samp[1:4, ]), stringsAsFactors = FALSE) df[ , 'taxon'] <- rownames(df) # get hierachy df_w <- get_hier(df, taxa.var = 'taxon', db = 'itis') # resolve df_rpmc <- rpmc_g(df_w) # same as # rpmc_g(df_w, group = group = c('S1', 'S2', 'S3', 'S4')) df_rpmc$comm
This package is currently under development and the code has not been tested extensively! Moreover, there is some work needed to make the package more user friendly!
It currently can reproduce the appendix of Cuffney et al. (2007), but may break with other data!
Please use only ITIS as taxonomic back-end, as others have not been tested yet.
dpac_g()
and option='K'
are currently not available.
Please report any issues or bugs.
Cuffney, T. F., Bilger, M. D. & Haigler, A. M. Ambiguous taxa: effects on the characterization and interpretation of invertebrate assemblages. Journal of the North American Benthological Society 26, 286–307 (2007).
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.