AUCG.test: The Area Under the Curve (AUC) for each taxon in relation to...


View source: R/AUCG.test_function.R

Description

The AUCG.test function determines the AUC of a taxon/group in relation to the distribution of the total assembly (all other samples). The AUC.test function is used for this.

Usage

AUCG.test(samp, tax, var, conf.level = 0.95, boot = FALSE, nboot = 500, pairwise = FALSE, mintax = 10)

Arguments

samp

A vector containing codes for each sample.

tax

A vector containing names or codes for the different taxa/groups.

var

A numeric vector containing the values of an environmental gradient.

conf.level

A numeric value that sets the confidence level of the confidence interval (see AUC.test on how confidence intervals are calculated). Default is 0.95.

boot

A logical value that states whether the AUC and confidence intervals are determined by bootstrapping (see AUC.test). Default is FALSE.

nboot

The number of bootstraps used to determine the AUC and confidence intervals. At least 100 bootstraps are required; the default is 500.

pairwise

Default is FALSE. When set to TRUE, the function also performs a pairwise comparison between all taxa. This increases processing time and generates a large data frame covering each pairwise combination. It also generates a matrix that displays the overlap of each pairwise combination.

mintax

An integer giving the minimum number of occurrences per taxon. Taxa with fewer occurrences than this value are removed. Default is 10.
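The effect of mintax can be sketched in base R. This is only an illustration with made-up data, not the package's own code; the variable names and the threshold of 3 are hypothetical, and treating a taxon with exactly mintax occurrences as retained is an assumption.

```r
# Hypothetical data: one taxon label per occurrence
tax <- c("A", "A", "A", "B", "B", "C", "A")
mintax <- 3  # illustrative threshold, not the package default

# Count occurrences per taxon
counts <- table(tax)

# Assumption: taxa with at least `mintax` occurrences are retained
keep_taxa <- names(counts)[counts >= mintax]

# Logical index of the rows that belong to a retained taxon
keep <- tax %in% keep_taxa

tax[keep]
```

In this toy dataset only taxon "A" (four occurrences) would remain.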

Details

The vectors samp, tax and var should be of the same length, and var should be numeric. AUCG.test compares each taxon with the residual sample assembly, which excludes the taxon's own samples (residual assembly = total sample assembly - taxon/group). This function uses the AUC.test function. Taxa/groups occurring fewer than 10 times (the default of mintax) are removed from the dataset. Missing values (NAs) are omitted from the dataset. Before applying this function it is advisable to explore the variation of the parameters within and between taxa/groups.

The overall results are summarized as TAD (Total Assembly Deviation), which is the mean probability that a random sample from a randomly chosen taxon/group deviates from the residual assembly. Because the maximum TAD that can be reached decreases with the number of taxa/groups (for example, the maximum TAD is 1 for 2 groups and 0.83 for 3 groups), TADr (relative Total Assembly Deviation) was created. TADr is a value between 0 and 1 that indicates how strongly samples deviate relative to the maximum theoretical separation that is possible. TADr takes this decrease in TAD into account, but it has no absolute meaning, since it is a relative value.

The returned items in the data frame are: AUC.stat.estimate, which is the probability that a random sample from the taxon/group x ranks higher than a random sample from the residual assembly y; Low.conf.AUC and High.conf.AUC, the lower and upper limits of the confidence interval of this estimate; and x.obs and y.obs, the number of observations for the taxon (x) and the residual assembly (y).
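The comparison of one taxon with its residual assembly can be sketched in base R. This is not the AUC.test implementation: the helper auc_prob, the toy data, the half weight given to ties, and the reading of "residual assembly" as all rows from samples in which the taxon does not occur are assumptions made for illustration.

```r
# Hypothetical helper: probability that a random value from x
# ranks higher than a random value from y (ties count as 0.5)
auc_prob <- function(x, y) {
  x <- x[!is.na(x)]          # omit missing values
  y <- y[!is.na(y)]
  d <- outer(x, y, "-")      # all pairwise differences x_i - y_j
  mean((d > 0) + 0.5 * (d == 0))
}

# Made-up data: sample codes, taxa and a gradient value per row
samp <- c("s1", "s1", "s2", "s2", "s3", "s4")
tax  <- c("A",  "B",  "A",  "C",  "B",  "C")
var  <- c(5,    5,    6,    6,    2,    1)

taxon <- "A"
own_samples <- unique(samp[tax == taxon])

x <- var[tax == taxon]               # observations of the taxon
y <- var[!(samp %in% own_samples)]   # assumed residual assembly

auc_A <- auc_prob(x, y)
auc_A
```

Here taxon "A" occurs only at the highest gradient values, so every value in x exceeds every value in y and the estimate equals 1. A value near 0.5 would indicate no separation from the residual assembly.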

Author(s)

Willem Kaijser

Examples

## Not run: 
#Using the provided dataset named "hco3"
results <- AUCG.test(hco3$Sample, hco3$Taxon, hco3$Variable)

#Display the results
View(results[[2]])

#Perform pairwise comparison
results2 <- AUCG.test(hco3$Sample, hco3$Taxon, hco3$Variable, pairwise = TRUE)

#Display pairwise comparison of the AUC
results2[[3]]

#Display the overlap matrix
results2[[4]]
## End(Not run)

snwikaij/GRASS documentation built on July 29, 2020, 1:54 p.m.