best_distri: A function to rank distributions (of same length) and...


View source: R/IterCrossV_functions.R

Description

A function to rank distributions (of same length) and statistically compare them to the best one

Usage

best_distri(
  x,
  w,
  test = c("wilcoxon", "vanderWaerden", "median", "KruskalWallis"),
  na.max = 0.5,
  p.min = 0.01,
  silent = TRUE,
  cl = NULL
)

Arguments

x

Typically a matrix in which each row is a distribution. All distributions have the same length and are compared with paired tests.

w

Vector of weights with the same length as ncol(x), to be supplied if outputs do not have the same weight. Used for weighted.mean and for the p-value calculation.

test

Test used to compare distributions, as used by svyranktest.

na.max

Maximum proportion of NA values allowed in one distribution. If the proportion of NA is above na.max, the model is ranked last and no p-value is calculated.

p.min

Minimum p-value below which the order of the following distributions no longer matters, because they will not be kept. If set, a distribution with a p-value lower than p.min is considered significantly "worse" than the best one. The remaining distributions are ordered according to their mean and their p-values are not calculated.

silent

Logical. Whether to show the % remaining or not.

cl

A cluster as made with makeCluster. If empty, nbclust in modelselect_opt will be used.
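
The roles of w and na.max can be illustrated with a minimal base R sketch. It is not the package's own code: the matrix values, fold_sizes and the intermediate names (na_prop, too_many_na, w_mean) are hypothetical and only show the kind of computation the argument descriptions suggest.

```r
# Hypothetical data: 3 distributions (rows) scored on 4 cross-validation folds (columns)
x <- rbind(
  m1 = c(0.80, 0.82, NA,   0.79),
  m2 = c(0.70, NA,   NA,   NA),
  m3 = c(0.85, 0.84, 0.86, 0.83)
)

# Hypothetical fold sizes, turned into one weight per column of x
fold_sizes <- c(50, 50, 40, 60)
w <- fold_sizes / sum(fold_sizes)

# Proportion of NA in each distribution, compared with na.max
na.max <- 0.5
na_prop <- rowMeans(is.na(x))
too_many_na <- na_prop > na.max  # such distributions would be ranked last, without p-value

# Weighted mean of each distribution; NA values and their weights are dropped
w_mean <- apply(x, 1, function(y) weighted.mean(y, w = w, na.rm = TRUE))

print(na_prop)
print(too_many_na)
print(w_mean)
```

Here only m2 exceeds na.max, since three of its four values are missing.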

Details

This function was developed to compare goodness-of-fit indices calculated after a cross-validation procedure. The best distribution is the one that is best on average over all cross-validation sub-samples. The average hides extreme values that may be due to particular, randomly chosen cross-validation samples. The other distributions are then compared statistically to the best one with a paired test. Because k-fold cross-validation may return folds of different lengths, the weight of each fold can be corrected with the w parameter.
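
The procedure described above can be sketched in base R as follows. This is an illustration under stated assumptions, not the implementation in best_distri: it assumes that a higher index means a better fit, uses equal weights, and swaps in the unweighted paired wilcox.test from the stats package in place of the weighted tests of svyranktest. The data and the names ord, best and p_values are hypothetical.

```r
# Hypothetical goodness-of-fit index: 4 distributions (rows) x 6 folds (columns)
x <- rbind(
  A = c(0.81, 0.79, 0.83, 0.80, 0.82, 0.78),
  B = c(0.80, 0.80, 0.81, 0.79, 0.83, 0.77),
  C = c(0.60, 0.55, 0.65, 0.58, 0.62, 0.57),
  D = c(0.75, 0.74, 0.77, 0.73, 0.76, 0.72)
)
w <- rep(1, ncol(x))  # equal fold weights in this sketch

# Rank distributions by weighted mean (assumption: higher index = better)
w_mean <- apply(x, 1, function(y) weighted.mean(y, w = w, na.rm = TRUE))
ord <- order(w_mean, decreasing = TRUE)
best <- ord[1]

# Paired comparison of each distribution with the best one.
# exact = FALSE requests the normal approximation, which tolerates tied differences.
p_values <- vapply(ord, function(i) {
  if (i == best) return(NA_real_)  # the best is not compared with itself
  wilcox.test(x[best, ], x[i, ], paired = TRUE, exact = FALSE)$p.value
}, numeric(1))

result <- data.frame(
  distribution = rownames(x)[ord],
  mean = w_mean[ord],
  p.value = p_values
)
print(result)
```

The test is paired because each column holds the scores of all distributions on the same cross-validation fold.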

Value

orderModels: column numbers of x, re-ordered from best to worst.

p.values: p-values of the difference between each distribution and the best one, ordered like orderModels.

p.min.test: Logical. FALSE if the distribution is ordered after the first distribution occurring with a p-value lower than p.min. Indeed, wide distributions with high outliers may not be significantly different from distribution 1.
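
As a hedged sketch of how these three elements might be combined, the example below builds a hypothetical result list by hand rather than calling best_distri. It assumes that the best distribution carries an NA p-value and that p.min.test holds one logical per ordered distribution; neither detail is confirmed by this page.

```r
# Hypothetical result, shaped as described above (not produced by the package here)
res <- list(
  orderModels = c(3, 1, 4, 2, 5),
  p.values    = c(NA, 0.40, 0.004, NA, NA),
  p.min.test  = c(TRUE, TRUE, TRUE, FALSE, FALSE)
)
p.min <- 0.01

# Keep the best distribution plus those that were tested and found
# not significantly worse than it (p-value at or above p.min)
keep <- res$p.min.test & (is.na(res$p.values) | res$p.values >= p.min)
kept_models <- res$orderModels[keep]
print(kept_models)
```

Filtering on p.min.test first avoids keeping a distribution whose p-value was never calculated because it came after the p.min cut-off.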


statnmap/SDMSelect documentation built on April 1, 2021, 2:01 p.m.