validate_ftree: Predictions of assembly performances using a species...

View source: R/validating.R

Description

Takes a hierarchical tree of species clustering, an occurrence matrix and the corresponding vector of performances, and returns the predictions, statistics and other information.

Usage

validate_ftree(tree.I, fobs, mOccur,
              xpr = stats::setNames(rep(1, length(fobs)),
                                    rep("a", length(fobs))),
              opt.method = "divisive", opt.mean = "amean",
              opt.model = "byelt",
              opt.jack   = FALSE,
              jack       = as.integer(c(3, 4)),
              opt.nbMax  = dim(mOccur)[2])
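
The default values of xpr and jack in the signature above can be reproduced in base R. The following is a minimal sketch; the fobs values are hypothetical and only serve to show the shape of the defaults.

```r
# Hypothetical performances of five assemblages (illustrative values only).
fobs <- c(1.2, 0.8, 1.5, 1.1, 0.9)

# Default xpr: a weight of 1 for every assemblage, all labelled as
# belonging to a single experiment named "a".
xpr <- stats::setNames(rep(1, length(fobs)), rep("a", length(fobs)))

# Default jack: subsets of 3 assemblages, 4 subsets.
jack <- as.integer(c(3, 4))

print(xpr)
print(jack)
```

With these defaults, all assemblages are treated as one experiment with equal weights.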

Arguments

tree.I

an integer square matrix. The matrix represents a hierarchical tree of species clustering.

fobs

a numeric vector. The vector fobs contains the quantitative performances of assemblages.

mOccur

a matrix of occurrence (occurrence of elements). Its first dimension equals length(fobs). Its second dimension equals the number of elements.
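
As an illustration of the expected dimensions, here is a small sketch of an occurrence matrix. The data, the row and column names, and the 0/1 coding are assumptions made for the example, not values taken from the package.

```r
# Hypothetical data: 4 assemblages built from 3 elements.
fobs <- c(2.0, 3.5, 1.8, 4.1)

# Assumed coding: 1 if the element occurs in the assemblage, 0 otherwise.
mOccur <- matrix(c(1, 0, 0,
                   1, 1, 0,
                   0, 1, 1,
                   1, 1, 1),
                 nrow = length(fobs), byrow = TRUE,
                 dimnames = list(paste0("assemblage", 1:4),
                                 c("spA", "spB", "spC")))

# The first dimension must match length(fobs).
stopifnot(dim(mOccur)[1] == length(fobs))

# The second dimension is the number of elements,
# which is also the default value of opt.nbMax.
nbElt <- dim(mOccur)[2]
print(nbElt)
```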

xpr

a numeric vector of length length(fobs). The vector xpr contains the weight of each experiment, and its names (names(xpr)) are the labels that identify the experiment to which each assemblage belongs. The weights are used to compute the Residual Sum of Squares in the function rss_clustering. If all experiments have the same weight, the formula used is rss. If experiments have different weights, the formula used is wrss (barycenter of the RSS of each experiment), in which the RSS of each experiment is weighted by the values of xpr. All assemblages that belong to a given experiment should therefore have the same weight. The vector xpr can be generated with the function stats::setNames.
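
The difference between the two formulas can be sketched in base R. This is only one reading of "barycenter of RSS for each experiment" (a weighted mean of the per-experiment RSS); the exact formula used by rss_clustering is not given here, and the data and the name fprd are hypothetical.

```r
# Hypothetical observed and predicted performances of four assemblages.
fobs <- c(1.0, 2.0, 3.0, 4.0)
fprd <- c(1.5, 1.5, 3.5, 3.0)

# Two experiments, "a" and "b", with weights 1 and 3.
# All assemblages of one experiment share the same weight.
xpr <- stats::setNames(c(1, 1, 3, 3), c("a", "a", "b", "b"))

# Equal weights: plain residual sum of squares over all assemblages.
rss <- sum((fobs - fprd)^2)

# Different weights (assumed form): RSS per experiment,
# then the weighted mean of those RSS values.
rss_by_xpr <- tapply((fobs - fprd)^2, names(xpr), sum)
w          <- tapply(xpr, names(xpr), function(v) v[[1]])
wrss       <- sum(w * rss_by_xpr) / sum(w)

print(rss)
print(wrss)
```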

opt.method

a string that specifies the method to use, one of "sort", "divisive", "agglomerative" or "cluster". The first three methods generate hierarchical trees. Each tree is complete, running from a unique trunk to as many leaves as components. The last method generates, at each level of the tree, a clustering of components into a given, predefined number of clusters. Because it is repeated from the trunk to the leaves with an increasing number of clusters, the method generates a non-hierarchical tree.

If opt.method = "sort", the components are sorted by their effect on assemblage performances.

If opt.method = "divisive", the components are clustered according to a hierarchical process by using a divisive method, from the trivial cluster where all components are together, towards the clustering where each component is a cluster.

If opt.method = "agglomerative", the components are clustered according to a hierarchical process by using an agglomerative method, from the trivial clustering where each component is a cluster, towards the cluster where all components are together. The method that gives the best result is opt.method = "divisive".

If opt.method = "cluster", the components are clustered according to a non-hierarchical process by using the method of McNaughton-Smith et al., 1964. In this case, one must specify the desired number of clusters.

Recall that, if affectElt is specified, the option opt.method does not need to be specified. affectElt determines a level of component clustering, and a tree is built: (i) by using opt.method = "divisive" from the defined level in the tree towards as many leaves as components; (ii) by using opt.method = "agglomerative" from the defined level in the tree towards the trunk of the tree.

opt.mean

a character string equal to "amean" or "gmean". The arithmetic formula is used if opt.mean = "amean". The geometric formula is used if opt.mean = "gmean".
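
For reference, the two kinds of mean can be compared in base R. This sketch assumes that "arithmetic formula" and "geometric formula" refer to the usual arithmetic and geometric means; the values are hypothetical and the package's internal implementation is not shown here.

```r
# Hypothetical, strictly positive performances.
x <- c(1, 4, 16)

# Arithmetic mean: sum of the values divided by their number.
amean <- mean(x)

# Geometric mean: exponential of the mean of the logarithms.
# It is only defined for strictly positive values.
gmean <- exp(mean(log(x)))

print(amean)
print(gmean)
```

The geometric mean is less sensitive to a few large values, which is why the two options can give different predictions on skewed data.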

opt.model

a character string equal to "bymot" or "byelt". The simple mean by assembly motif is used if opt.model = "bymot". The linear model with assembly motif is used if opt.model = "byelt".

opt.jack

a logical that selects the cross-validation method.

If opt.jack = FALSE, a leave-one-out method is used: predicted performances are computed, experiment by experiment, as the mean of the performances of the assemblages that share the same assembly motif, excluding the assemblage to predict.

If opt.jack = TRUE, a jackknife method is used: the set of assemblages belonging to the same assembly motif is divided into jack[2] subsets of jack[1] assemblages. Predicted performances are computed, experiment by experiment, by excluding jack[1] assemblages, including the assemblage to predict. If the total number of assemblages belonging to the assembly motif is lower than jack[1]*jack[2], predictions are computed by the leave-one-out method.
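
The leave-one-out rule and the fallback condition described above can be sketched in base R for a single assembly motif within a single experiment. The data are hypothetical, and the sketch uses the arithmetic mean (opt.mean = "amean"); it is not the package's own code.

```r
# Hypothetical performances of assemblages sharing one assembly motif.
fobs <- c(2, 4, 6, 8)
n <- length(fobs)

# Leave-one-out: each assemblage is predicted by the mean of the others.
loo <- (sum(fobs) - fobs) / (n - 1)
print(loo)

# With the default jack, the jackknife needs at least
# jack[1] * jack[2] = 12 assemblages in the motif.
jack <- as.integer(c(3, 4))
use_jackknife <- n >= jack[1] * jack[2]
print(use_jackknife)  # FALSE here, so leave-one-out would be used
```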

jack

an integer vector of length 2. The vector specifies the parameters of the jackknife method. The first integer, jack[1], specifies the size of each subset; the second integer, jack[2], specifies the number of subsets.
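
A jackknife prediction with these two parameters can be sketched as follows. How the package assigns assemblages to subsets is not stated here, so the consecutive assignment below is an assumption, as are the data and the names subset_id and pred.

```r
jack <- as.integer(c(3, 4))

# Hypothetical motif with exactly jack[1] * jack[2] = 12 assemblages.
fobs <- as.numeric(1:12)

# Assumed assignment: consecutive blocks of jack[1] assemblages,
# giving jack[2] subsets in total.
subset_id <- rep(seq_len(jack[2]), each = jack[1])

# Each assemblage is predicted by the mean of the assemblages
# outside its own subset, so jack[1] assemblages are excluded each time.
pred <- vapply(seq_along(fobs),
               function(i) mean(fobs[subset_id != subset_id[i]]),
               numeric(1))
print(pred)
```

Compared with leave-one-out, each prediction here relies on fewer assemblages, which makes the validation more demanding.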

opt.nbMax

an integer that indicates the maximum number of tree levels to cluster. By default, opt.nbMax = nbElt, so that components are clustered all along the tree, from the trunk to the leaves, which makes it possible to determine the optimum number of component functional groups. However, in ftest_* and fboot_* functions, there is no point in clustering the components beyond the optimum number of functional groups. In these functions, opt.nbMax equals the optimum number of functional groups by default.

Details

None.

Value

Return a list containing predictions of assembly performances and statistics computed by using a species clustering tree.

Recall of inputs:

Primary and secondary trees of element clustering:

Matrices of calibration and prediction using tree.I and associated statistics:

Matrices of calibration and prediction using tree.II and associated statistics:


functClust documentation built on Dec. 2, 2020, 5:06 p.m.