prep_data: Prepare data for evaluation
In ecocbo: Calculating Optimum Sampling Effort in Community Ecology

Description

prep_data() formats and arranges the initial data so that the other functions in the package can use it directly. It first extracts the species names and the number of samples for each species from the input data frame. It then permutes the sampling efforts and calculates the pseudo-F statistic and the mean squares for each permutation. Finally, it returns an object whose results data frame holds the permutations, pseudo-F statistics, and mean squares.
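To make the pseudo-F and mean squares mentioned above concrete, here is a minimal base-R sketch of how a single-factor pseudo-F can be computed from a dissimilarity matrix. It is an illustration under stated assumptions, not the package's internal code: the helper names `bray_curtis()` and `pseudo_f()`, the output labels `MSA` and `MSR`, and the toy data are all hypothetical, and the resampling of sampling efforts that prep_data() performs is not reproduced.

```r
# Bray-Curtis dissimilarity between all pairs of rows (samples) of a matrix.
# Hypothetical helper; two all-zero rows would give NaN (0/0).
bray_curtis <- function(x) {
  x <- as.matrix(x)
  n <- nrow(x)
  d <- matrix(0, n, n)
  for (i in seq_len(n - 1)) {
    for (j in (i + 1):n) {
      d[i, j] <- d[j, i] <- sum(abs(x[i, ] - x[j, ])) / sum(x[i, ] + x[j, ])
    }
  }
  d
}

# Single-factor pseudo-F from a square dissimilarity matrix `d` and a
# grouping vector `groups` (one label per row of `d`).
pseudo_f <- function(d, groups) {
  groups <- factor(groups)
  N <- nrow(d)          # total number of samples
  a <- nlevels(groups)  # number of groups (e.g. sites)
  d2 <- d^2
  # Total sum of squares: squared dissimilarities over unique pairs, divided by N
  ss_total <- sum(d2[upper.tri(d2)]) / N
  # Within-group sum of squares: same quantity computed inside each group
  ss_within <- sum(vapply(levels(groups), function(g) {
    idx <- which(groups == g)
    sub <- d2[idx, idx, drop = FALSE]
    sum(sub[upper.tri(sub)]) / length(idx)
  }, numeric(1)))
  ss_among <- ss_total - ss_within
  ms_among <- ss_among / (a - 1)
  ms_within <- ss_within / (N - a)
  c(pseudoF = ms_among / ms_within, MSA = ms_among, MSR = ms_within)
}

# Toy community matrix: 8 samples x 6 species, 4 samples per site
set.seed(42)
toy <- matrix(rpois(8 * 6, lambda = 5), nrow = 8)
sites <- rep(c("A", "B"), each = 4)
res <- pseudo_f(bray_curtis(toy), sites)
res
```

With Euclidean distances on a single variable, this statistic reduces to the ordinary one-way ANOVA F ratio, which is a convenient way to check the arithmetic.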

Usage

prep_data(
  data,
  type = "counts",
  Sest.method = "average",
  cases = 5,
  N = 100,
  sites = 10,
  n,
  m,
  k = 50,
  transformation = "none",
  method = "bray",
  dummy = FALSE,
  useParallel = TRUE,
  model = "single.factor"
)

Arguments

`data`	Data frame with species as columns and samples as rows. The first column should indicate the site to which each sample belongs, even if only a single site was sampled.
`type`	Nature of the data to be processed. It may be presence/absence ("P/A"), counts of individuals ("counts"), or coverage ("cover").
`Sest.method`	Method for estimating species richness. The function `vegan::specpool()` is used for this. Available methods are the incidence-based Chao ("chao"), first-order jackknife ("jack1"), second-order jackknife ("jack2"), and bootstrap ("boot"). By default, the "average" of the four estimates is used.
`cases`	Number of data sets to be simulated.
`N`	Total number of samples to be simulated in each site.
`sites`	Total number of sites to be simulated in each data set.
`n`	Maximum number of samples to consider.
`m`	Maximum number of sites.
`k`	Number of resamples the process will take. Defaults to 50.
`transformation`	Mathematical function used to reduce the weight of very dominant species. One of 'square root', 'fourth root', 'Log (X+1)', 'P/A', or 'none'.
`method`	The distance/dissimilarity metric to use (e.g. Gower, Bray–Curtis, Jaccard). The function `vegan::vegdist()` is called for that purpose.
`dummy`	Logical. TRUE is recommended when the data contain empty observations.
`useParallel`	Logical. Perform the analysis in parallel? Defaults to TRUE.
`model`	Select the model to use. Options, so far, are 'single.factor' and 'nested.symmetric'.
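As an illustration of what the `transformation` options do to abundance data, the sketch below applies each one to a small vector of counts. It is an assumption about the usual meaning of these labels in community ecology, not the package's own code: `apply_transformation()` is a hypothetical helper, and the natural logarithm is assumed for 'Log (X+1)'.

```r
# Hypothetical helper mirroring the labels accepted by `transformation`.
apply_transformation <- function(x, transformation = "none") {
  switch(transformation,
    "square root" = sqrt(x),
    "fourth root" = x^0.25,
    "Log (X+1)"   = log(x + 1),   # natural log assumed here
    "P/A"         = (x > 0) * 1,  # 1 if present, 0 if absent
    "none"        = x,
    stop("unknown transformation: ", transformation)
  )
}

counts <- c(0, 1, 16, 81)
apply_transformation(counts, "square root")
apply_transformation(counts, "fourth root")
apply_transformation(counts, "P/A")
```

The stronger the transformation (from 'square root' to 'P/A'), the less a very abundant species dominates the resulting dissimilarities.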

Value

prep_data() returns an object of class "ecocbo_data".

An object of class "ecocbo_data" is a list containing $Results, a data frame that lists the pseudo-F estimates for simH0 and simHa, which can be used to compute the statistical power for different sampling efforts, as well as the mean squares needed to calculate the components of variation.
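The way pseudo-F values simulated under the null (H0) and alternative (Ha) hypotheses translate into statistical power can be sketched as follows. This is a generic empirical-power calculation under stated assumptions, not necessarily the exact procedure used elsewhere in the package: `estimate_power()` is a hypothetical helper, and the two vectors are synthetic stand-ins rather than columns taken from `$Results`.

```r
# Empirical power: the share of Ha pseudo-F values that reach the critical
# value, defined as the (1 - alpha) quantile of the H0 pseudo-F distribution.
estimate_power <- function(f_h0, f_ha, alpha = 0.05) {
  critical_f <- unname(quantile(f_h0, probs = 1 - alpha))
  mean(f_ha >= critical_f)
}

# Synthetic stand-ins for simulated pseudo-F values (illustrative only)
set.seed(1)
f_h0 <- rf(200, df1 = 4, df2 = 20)           # behaviour under H0
f_ha <- rf(200, df1 = 4, df2 = 20, ncp = 8)  # shifted upward under Ha

power <- estimate_power(f_h0, f_ha)
power
```

Repeating this calculation for each combination of sites and samples gives a power value per sampling effort, which is the kind of comparison the returned data frame is meant to support.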

Author(s)

Edlin Guerra-Castro (edlinguerra@gmail.com), Arturo Sanchez-Porras

References

Underwood, A. J. (1997). Experiments in ecology: their logical design and interpretation using analysis of variance. Cambridge University Press.

Underwood, A. J., & Chapman, M. G. (2003). Power, precaution, Type II error and sampling design in assessment of environmental impacts. Journal of Experimental Marine Biology and Ecology, 296(1), 49-70.

Examples


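library(ecocbo)

# Simulate data from the epiDat example data set and compute pseudo-F values
# and mean squares for up to 5 sites and 5 samples per site.
simResults <- prep_data(data = epiDat, type = "counts", Sest.method = "average",
                        cases = 5, N = 100, sites = 10,
                        n = 5, m = 5, k = 30,
                        transformation = "none", method = "bray",
                        dummy = FALSE, useParallel = FALSE,
                        model = "single.factor")

# Print the resulting "ecocbo_data" object
simResults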

ecocbo documentation built on Sept. 11, 2024, 8:09 p.m.