restax

Build Status

restax is a R package with implements the methods of Cuffney et al. (2007) to resolve ambiguous taxa.

Methods

Currently the following functions are available:

Variants

All functions are available in two variants:

Options

Moreover, some functions have additional options:

rpkc_g() and dpac_g()
mcwp_*()

Installation

restax is currently only available on github. To install restax use:

install.packages('devtools')
require(devtools)
install_github('restax', 'EDiLD')
require(restax)

Example

Let's say we have our data in the wide format (as commonly used in community ecology). Here we have data of 5 samples (S1-S4, A) with 12 taxa:

require(restax)
data(samp)
samp

restax currently accepts only the transposed wide format. So we need to reformat it

df <- data.frame(t(samp), stringsAsFactors = FALSE)
df[ , 'taxon'] <- rownames(df)
df

The first step is to retrieve taxonomic information about our taxa:

This is done via the get_hier() function, which uses taxize to query ITIS:

df_w <- get_hier(df, taxa.var = 'taxon', db = 'itis')
df_w

We need to specify the column containing the taxon names. This function returns the original data and accompanying taxonomic information.

To resolve ambiguous taxa using the RPKC-S method (remove parents, keep childs - per sample) for sample A we use:

df_res <- rpkc_s(df_w, value.var = 'A')
df_res$comm

We see that only the 5 Baetidae species and Argia sp. are kept, all others (as they are parents) are set to zero.

Similarly, all samples can be resolved with on call by omitting value.var. However, this currently works only with group variants *_g() . If group is ommited all samples are used as grouping variables.

Here we resolve all samples simultaneously using all samples as group using the RPMC-G variant:

# remove sample A
df <- data.frame(t(samp[1:4, ]), stringsAsFactors = FALSE)
df[ , 'taxon'] <- rownames(df)
# get hierachy
df_w <- get_hier(df, taxa.var = 'taxon', db = 'itis')
# resolve
df_rpmc <- rpmc_g(df_w)
# same as
# rpmc_g(df_w, group = group = c('S1', 'S2', 'S3', 'S4'))
df_rpmc$comm

NOTES

This package is currently under development and the code has not been tested extensively! Moreover, there is some work needed to make the package more user friendly!

It currently can reproduce the appendix of Cuffney et al. (2007), but may break with other data!

Please use only ITIS as taxonomic back-end, as others have not been tested yet.

dpac_g() and option='K' are currently not available.

Please report any issues or bugs.

TODOs

References

Cuffney, T. F., Bilger, M. D. & Haigler, A. M. Ambiguous taxa: effects on the characterization and interpretation of invertebrate assemblages. Journal of the North American Benthological Society 26, 286–307 (2007).



EDiLD/restax documentation built on May 6, 2019, 3:04 p.m.