bHierarchicalTest: Bayesian hierarchical model for the analysis of two...

View source: R/bayesian.R

bHierarchicalTest    R Documentation

Bayesian hierarchical model for the analysis of two algorithms in multiple datasets

Description

Bayesian hierarchical model for the simultaneous analysis of two algorithms in multiple datasets, as presented in Benavoli et al. 2017

Usage

bHierarchicalTest(
  x.matrix,
  y.matrix = NULL,
  rho,
  std.upper = 1000,
  d0.lower = NULL,
  d0.upper = NULL,
  alpha.lower = 0.5,
  alpha.upper = 5,
  beta.lower = 0.05,
  beta.upper = 0.15,
  rope = c(-0.01, 0.01),
  nsim = 2000,
  nchains = 8,
  parallel = TRUE,
  stan.output.file = NULL,
  seed = as.numeric(Sys.time()),
  ...
)

Arguments

x.matrix

First sample, a matrix with the results obtained by the first algorithm (each dataset in a row)

y.matrix

Second sample, a matrix with the results obtained by the second algorithm (each dataset in a row) (if not provided, x is assumed to be the difference)

std.upper

Factor to set the upper bound for both sigma_i and sigma_0 (see Benavoli et al. 2017 for more details)

d0.lower

Lower bound for the prior for mu_0. If not provided, the smallest observed difference is used

d0.upper

Upper bound for the prior for mu_0. If not provided, the biggest observed difference is used

alpha.lower

Lower bound for the (uniform) prior for the alpha hyperparameter (see Benavoli et al. 2017 for more details). Default value set at 0.5, as in the original paper

alpha.upper

Upper bound for the (uniform) prior for the alpha hyperparameter (see Benavoli et al. 2017 for more details). Default value set at 5, as in the original paper

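beta.lower

Lower bound for the (uniform) prior for the beta hyperparameter (see Benavoli et al. 2017 for more details). Default value set at 0.05, as in the original paper

beta.upper

Upper bound for the (uniform) prior for the beta hyperparameter (see Benavoli et al. 2017 for more details). Default value set at 0.15, as in the original paper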

rope

Interval for the difference considered as "irrelevant"

nsim

Number of samples (per chain) used to estimate the posterior distribution. Note that, by default, half the simulations are used for the burn-in

parallel

Logical value. If TRUE, the Stan code is executed in parallel

stan.output.file

String containing the base name for the output files produced by Stan. If NULL, no files are stored.

seed

Optional parameter used to fix the random seed

...

Additional arguments for the rstan::stan function that runs the analysis

z0

Position of the pseudo-observation associated to the prior Dirichlet Process. The default value is set to 0 (inside the rope)

nchains

Number of MC chains to be simulated. As half the simulations are used for the warm-up, the total number of simulations will be nchains*nsim/2
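The sample-count arithmetic described for nsim and the number of chains can be checked with a few lines of base R. This is only a sketch of the stated rule (half of each chain discarded as warm-up), using the default values from the usage block; the variable names are illustrative and are not part of the package.

```r
# Retained posterior draws, assuming half of each chain is discarded as warm-up
nsim    <- 2000  # samples per chain (default shown in the usage block)
nchains <- 8     # number of chains (default shown in the usage block)

warmup.per.chain   <- nsim / 2                 # draws discarded per chain
retained.per.chain <- nsim - warmup.per.chain  # draws kept per chain
total.retained     <- nchains * retained.per.chain

total.retained  # 8000
```

With the defaults, 8 chains of 2000 draws each leave 8000 draws to estimate the posterior, so lowering nsim to speed up a run reduces the retained sample proportionally.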

Details

The results include the typical information relative to the three areas of the posterior density (left, right and rope probabilities), both global and per dataset (in the additional information). The simulation results are also included.
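To make the three areas concrete, the sketch below splits a set of hypothetical posterior draws of a difference into left, rope and right proportions. It illustrates what the three regions mean and is not the computation performed inside bHierarchicalTest, which may differ; the draws are simulated with rnorm purely for illustration, and the treatment of the rope endpoints as inside the rope is an assumption.

```r
# Illustration only: split hypothetical posterior draws into left / rope / right
set.seed(42)
rope  <- c(-0.01, 0.01)                       # default rope from the usage block
draws <- rnorm(8000, mean = 0.02, sd = 0.05)  # hypothetical draws of a difference

left    <- mean(draws < rope[1])                      # mass below the rope
in.rope <- mean(draws >= rope[1] & draws <= rope[2])  # mass inside the rope
right   <- mean(draws > rope[2])                      # mass above the rope

probs <- c(left = left, rope = in.rope, right = right)
probs
```

The three proportions always sum to one, so a wider rope can only move mass from the left and right areas into the rope.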

As for the prior parameters, they are set to the default values indicated in Benavoli et al. 2017, except for the bounds of the prior distribution of mu_0, which are set to the minimum and maximum values observed in the sample. You should not modify them unless you know what you are doing.
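As a rough illustration of the default bounds for mu_0, the sketch below takes the smallest and biggest values of a difference matrix. It assumes that the "observed differences" are the element-wise differences x.matrix - y.matrix; the documentation does not state the direction of the subtraction or whether any aggregation is applied first, so the function's internal computation may differ.

```r
# Sketch under assumptions: default d0 bounds taken from element-wise differences
set.seed(1)
x <- matrix(rnorm(5 * 25, mean = 1.0, sd = 1), nrow = 5)  # 5 datasets, 25 results each
y <- matrix(rnorm(5 * 25, mean = 1.2, sd = 1), nrow = 5)

diff.matrix <- x - y            # assumed direction of the difference
d0.lower <- min(diff.matrix)    # smallest observed difference
d0.upper <- max(diff.matrix)    # biggest observed difference

c(d0.lower = d0.lower, d0.upper = d0.upper)
```

Passing explicit d0.lower and d0.upper values overrides these data-driven defaults, which is one of the changes the text above advises against unless there is a clear reason.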

Value

A list with the following elements:

method

a string with the name of the method used

parameters

parameters used by the method

posterior.probabilities

a vector with the left, rope and right probabilities

approximated

a logical value, TRUE if the posterior distribution is approximated (sampled) and FALSE if it is exact

posterior

Sampled probabilities (see details)

additional

Additional information provided by the model. This includes: per.dataset, the results per dataset (left, rope and right probabilities together with the expected mean value); global.sin, sampled probabilities of mu_0 being positive or negative; and stan.results, the complete set of results produced by the Stan program

References

A. Benavoli, G. Corani, J. Demsar, M. Zaffalon (2017) Time for a Change: a Tutorial for Comparing Multiple Classifiers Through Bayesian Analysis. Journal of Machine Learning Research, 18, 1-36.

Examples

sample1 <- matrix(rnorm(25*5, 1, 1), nrow=5)
sample2 <- matrix(rnorm(25*5, 1.2, 1), nrow=5)
results <- bHierarchicalTest(x.matrix=sample1, y.matrix=sample2, rho=0, rope=c(-0.05, 0.05))
results$posterior.probabilities


b0rxa/scmamp documentation built on Jan. 17, 2024, 10:49 a.m.