FidelitySummary: Data filtering and compliance assessment

View source: R/FidelitySummary.R

FidelitySummary    R Documentation

Data filtering and compliance assessment

Description

The FidelitySummary function evaluates whether the input data are adequate and properly formatted for fidelity analyses. When called by other PaleoFidelity functions, FidelitySummary allows the user to filter data by removing small samples and rare taxa.

Usage

FidelitySummary(
  live,
  dead,
  gp = NULL,
  report = FALSE,
  n.filters = 0,
  t.filters = 1,
  output = FALSE
)

Arguments

live

A matrix with counts of live-collected specimens (rows=sites, columns=taxa). Dimensions of 'live' and 'dead' matrices must match exactly.

dead

A matrix with counts of dead-collected specimens (rows=sites, columns=taxa). Dimensions of 'live' and 'dead' matrices must match exactly.

gp

An optional univariate factor defining groups of sites. The length of gp must equal the number of rows of the 'live' and 'dead' matrices.

report

Logical (default = FALSE). Set report=TRUE to print notes, warnings, and a data summary.

n.filters

Integer (default = 0). Removes small samples with n < n.filters occurrences.

t.filters

Integer (default = 1). Removes rare taxa with t < t.filters occurrences. Note that the default value of 1 keeps all taxa but removes empty columns. This default may not be appropriate for certain applications, such as simulations of null models or measuring live-dead rank correlation when assessing compositional fidelity.

output

Logical (default = FALSE). Determines whether an output with the filtered datasets is produced.

Details

This function is called by other PaleoFidelity functions. However, prior to any fidelity analysis, it is recommended to check for errors and warnings, generate a basic summary of the datasets (report=TRUE), and explore how filtering ("n.filters" and "t.filters") affects data dimensionality.

NOTE: The FidelitySummary function provides an initial compliance evaluation and allows for assessing whether removing small samples or rare taxa is advisable. Once the desired numerical values of "n.filters" and "t.filters" have been determined, they need to be specified in other PaleoFidelity functions.
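To make the effect of filtering on data dimensionality concrete, the following base R sketch mimics the checks and filters described above on a small made-up dataset. It is an illustration only, not the package's implementation. The helper sketch_filter and the toy matrices are hypothetical. The sketch assumes that a site is dropped when either its live or its dead total falls below n.filters, that a taxon is dropped when its combined live and dead count falls below t.filters, and that samples are filtered before taxa. The actual rules used by FidelitySummary may differ.

```r
# Hypothetical toy data: 4 sites (rows) x 5 taxa (columns); not taken from FidData.
live <- matrix(c(10, 0, 3, 0, 0,
                  1, 0, 0, 0, 0,
                 20, 5, 0, 0, 2,
                  8, 2, 1, 0, 0),
               nrow = 4, byrow = TRUE,
               dimnames = list(paste0("site", 1:4), paste0("taxon", 1:5)))
dead <- matrix(c(12, 1, 0, 0, 0,
                  2, 0, 0, 0, 0,
                 15, 9, 1, 0, 0,
                  6, 0, 4, 0, 1),
               nrow = 4, byrow = TRUE, dimnames = dimnames(live))
gp <- factor(c("A", "A", "B", "B"))

# Hypothetical helper, not part of PaleoFidelity.
sketch_filter <- function(live, dead, gp = NULL, n.filters = 0, t.filters = 1) {
  # Compliance checks described under Arguments: matching dimensions,
  # and one group label per site when gp is supplied.
  stopifnot(is.matrix(live), is.matrix(dead),
            identical(dim(live), dim(dead)),
            is.null(gp) || length(gp) == nrow(live))

  # Assumption: keep a site only if both live and dead totals reach n.filters.
  keep_sites <- rowSums(live) >= n.filters & rowSums(dead) >= n.filters
  live <- live[keep_sites, , drop = FALSE]
  dead <- dead[keep_sites, , drop = FALSE]
  if (!is.null(gp)) gp <- droplevels(gp[keep_sites])

  # Assumption: keep a taxon if its combined live + dead count reaches t.filters.
  keep_taxa <- colSums(live) + colSums(dead) >= t.filters
  list(live = live[, keep_taxa, drop = FALSE],
       dead = dead[, keep_taxa, drop = FALSE],
       gp = gp)
}

default_run <- sketch_filter(live, dead, gp)                # defaults: only empty taxa removed
strict_run  <- sketch_filter(live, dead, gp, n.filters = 5) # also removes the small sample site2

dim(default_run$live)  # 4 sites x 4 taxa (the empty taxon4 is dropped)
dim(strict_run$live)   # 3 sites x 4 taxa
```

Comparing the dimensions of the two results shows how a stricter n.filters value shrinks the dataset. The same comparison can be made on real data by calling FidelitySummary with report=TRUE under different filter settings.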

Value

A list (returned only if output = TRUE) including the following components:

live

The filtered live dataset where rows=sites and columns=taxa

dead

The filtered dead dataset where rows=sites and columns=taxa

gp

The grouping factor associated with sites (if provided)

tax

The grouping factor associated with taxa (if provided)

Examples

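data(FidData)
# Check compliance and print notes, warnings, and a data summary
FidelitySummary(live=FidData$live, dead=FidData$dead, report=TRUE)
# Same report with sites grouped by habitat and samples with n < 50 removed
FidelitySummary(live=FidData$live, dead=FidData$dead, gp=FidData$habitat, report=TRUE, n.filters=50)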


MJKowalewski/PaleoFidelity documentation built on Aug. 25, 2024, 8:27 p.m.