prep_data: Prepare data for evaluation

prep_data    R Documentation

Prepare data for evaluation

Description

prep_data() formats and arranges the initial data so that the other functions in the package can use it directly. It first extracts the species names and the number of samples for each species from the input data frame. It then permutes the sampling efforts and calculates the pseudo-F statistic and the mean squares for each permutation. Finally, it returns an object of class "ecocbo_data" whose $Results data frame holds the permutations, pseudo-F statistics, and mean squares.

Usage

prep_data(
  data,
  type = "counts",
  Sest.method = "average",
  cases = 5,
  N = 100,
  sites = 10,
  n,
  m,
  k = 50,
  transformation = "none",
  method = "bray",
  dummy = FALSE,
  useParallel = TRUE,
  model = "single.factor"
)

Arguments

data

Data frame with species as columns and samples as rows. The first column must indicate the site to which each sample belongs, even if only a single site has been sampled.

type

Nature of the data to be processed. It may be presence/absence ("P/A"), counts of individuals ("counts"), or coverage ("cover").

Sest.method

Method for estimating species richness. The function vegan::specpool() is used for this. Available methods are the incidence-based Chao ("chao"), first-order jackknife ("jack1"), second-order jackknife ("jack2") and bootstrap ("boot"). By default, the "average" of the four estimates is used.

cases

Number of data sets to be simulated.

N

Total number of samples to be simulated in each site.

sites

Total number of sites to be simulated in each data set.

n

Maximum number of samples to consider.

m

Maximum number of sites.

k

Number of resamples the process will take. Defaults to 50.

transformation

Mathematical transformation applied to reduce the weight of very dominant species. Options are 'square root', 'fourth root', 'Log (X+1)', 'P/A' and 'none'.

method

The distance/dissimilarity metric to use (e.g. Gower, Bray–Curtis, Jaccard). The function vegan::vegdist() is called to compute it.

dummy

Logical. Setting it to TRUE is recommended when some observations are empty. Defaults to FALSE.

useParallel

Logical. Perform the analysis in parallel? Defaults to TRUE.

model

Select the model to use. Options, so far, are 'single.factor' and 'nested.symmetric'.
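
The following base-R sketch illustrates how the data, transformation, method and dummy arguments relate to one another. It is not the package's implementation: transform_counts() and bray_curtis() are hypothetical helpers written for this illustration, the toy data are made up, and prep_data() itself delegates the dissimilarity calculation to vegan::vegdist(). The reading of dummy = TRUE as "append a constant dummy species of 1" is an assumption based on the zero-adjusted Bray–Curtis idea, as is the order in which the steps are applied.

```r
# Toy input in the layout described for `data`: site label first, then one
# column per species (all names and values are made up for illustration).
toy <- data.frame(
  site = c("A", "A", "B", "B"),
  sp1  = c(10, 0, 4, 0),
  sp2  = c( 0, 0, 6, 0),
  sp3  = c( 5, 0, 0, 0)
)

# Hypothetical helper mirroring the documented `transformation` options.
transform_counts <- function(x, transformation = "none") {
  switch(transformation,
    "square root" = sqrt(x),
    "fourth root" = x^0.25,
    "Log (X+1)"   = log(x + 1),
    "P/A"         = (x > 0) * 1,
    "none"        = x,
    stop("unknown transformation: ", transformation)
  )
}

# Bray-Curtis dissimilarity between two samples: sum|a - b| / sum(a + b).
bray_curtis <- function(a, b) sum(abs(a - b)) / sum(a + b)

x      <- as.matrix(toy[, -1])               # drop the site column
x_sqrt <- transform_counts(x, "square root")

# Samples 2 and 4 are both empty, so the ratio is 0/0 and the result is NaN.
bc_empty <- bray_curtis(x_sqrt[2, ], x_sqrt[4, ])

# Assumed meaning of dummy = TRUE: add a constant "dummy species" of 1 to
# every sample so that empty samples no longer produce 0/0.
x_dummy      <- cbind(x_sqrt, dummy = 1)
bc_empty_adj <- bray_curtis(x_dummy[2, ], x_dummy[4, ])  # two empty samples
bc_13_adj    <- bray_curtis(x_dummy[1, ], x_dummy[3, ])  # two non-empty samples
```

With the dummy column in place, the two empty samples are treated as identical (dissimilarity 0) instead of undefined, which is the situation the dummy argument is recommended for.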

Value

prep_data() returns an object of class "ecocbo_data".

An object of class "ecocbo_data" is a list containing $Results, a data frame that lists the pseudo-F estimates for simH0 and simHa, which can be used to compute the statistical power for different sampling efforts, as well as the mean squares needed to calculate the variance components.
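
As a rough illustration of how pseudo-F values simulated under the null and alternative hypotheses can be turned into a power estimate, the sketch below uses made-up numbers rather than real prep_data() output. The vectors f_h0 and f_ha are hypothetical stand-ins for the simH0 and simHa pseudo-F estimates, and the 5% significance level is an assumption. In practice, the functions listed under See Also are the intended way to work with the results.

```r
# Made-up pseudo-F values standing in for the simH0 and simHa estimates
# obtained at one sampling effort.
f_h0 <- c(0.5, 0.8, 1.0, 1.2, 1.5, 0.9, 1.1, 0.7, 1.3, 2.0)
f_ha <- c(1.9, 2.5, 1.2, 3.0, 1.8, 2.2, 0.9, 2.8, 1.6, 2.1)

alpha <- 0.05                                  # assumed significance level

# Critical value: the (1 - alpha) quantile of the null distribution.
f_crit <- unname(quantile(f_h0, probs = 1 - alpha))

# Power: share of alternative-hypothesis pseudo-F values at or above the
# critical value. Beta (Type II error rate) is its complement.
power <- mean(f_ha >= f_crit)
beta  <- 1 - power
```

Repeating this calculation for each combination of sampling effort shows how power changes as more samples or sites are added.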

Author(s)

Edlin Guerra-Castro (edlinguerra@gmail.com), Arturo Sanchez-Porras

References

Underwood, A. J. (1997). Experiments in ecology: their logical design and interpretation using analysis of variance. Cambridge University Press.

Underwood, A. J., & Chapman, M. G. (2003). Power, precaution, Type II error and sampling design in assessment of environmental impacts. Journal of Experimental Marine Biology and Ecology, 296(1), 49-70.

See Also

sim_beta(), plot_power(), sim_cbo(), scompvar()

Examples


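library(ecocbo)

# Assumes the epiDat data set is available once ecocbo is loaded.
# useParallel = FALSE keeps the example on a single core.
simResults <- prep_data(data = epiDat, type = "counts", Sest.method = "average",
                        cases = 5, N = 100, sites = 10,
                        n = 5, m = 5, k = 30,
                        transformation = "none", method = "bray",
                        dummy = FALSE, useParallel = FALSE,
                        model = "single.factor")

# Print the resulting "ecocbo_data" object.
simResults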


ecocbo documentation built on Sept. 11, 2024, 8:09 p.m.