mEliminator: The Eliminator for mpda
In larssnip/mpda: Classification in a multivariate setting

mEliminator

R Documentation

Description

Variable elimination in mpda.

Usage

mEliminator(
  y,
  X,
  reg1 = 0.5,
  reg2 = 1,
  prior = NULL,
  max.dim = NULL,
  frac = 0.25,
  vip.lim = 1,
  n.seg = 10,
  verbose = TRUE
)

Arguments

`y`	Vector of responses, a factor with two or more levels.
`X`	Matrix of predictor values.
`reg1`	The regularization parameter for `pdaDim`.
`reg2`	The regularization parameter for selection, see below.
`prior`	Vector of prior probabilities, one value for each factor level in `y`.
`max.dim`	Integer, the maximum number of dimensions to consider.
`frac`	Fraction of unimportant variables to eliminate in each iteration (default is 0.25).
`vip.lim`	The threshold for the VIP criterion (default is 1.0).
`n.seg`	Integer, the number of cross-validation segments (default is 10).
`verbose`	Logical, turns on/off output during computations.

Details

This is a wrapper for doing variable selection with the eliminator on an mpda object.

Use this function if you have a multi-level classification problem and want a standardized (and regularized) variable selection. The function uses mpda for the multi-level problem, which means all pairs of levels are modelled. A variable selection is performed for each level-pair, using the eliminator algorithm.
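To illustrate what "all pairs of levels" means, here is a minimal sketch in base R. The response `y` and its level names are made up for illustration, and the ordering of the pairs produced by `combn` is not necessarily the ordering mpda uses internally.

```r
# Hypothetical three-level response; the level names are invented for illustration
y <- factor(c("a", "b", "c", "a", "b", "c"))

# All unordered pairs of levels, one pair per column
level.pairs <- combn(levels(y), 2)

# For each pair, the indices of the samples belonging to either of the two levels
pair.rows <- lapply(seq_len(ncol(level.pairs)), function(i) {
  which(y %in% level.pairs[, i])
})
```

With three levels there are three level-pairs, so a per-pair variable selection yields three selections, each based only on the samples from its two levels.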

The argument reg2 is a regularization parameter along the same lines as reg1, which is used by pdaDim. It is the rejection level of the mcnemar.test. In the eliminator algorithm, this test is performed after each elimination step, to see if the resulting accuracy is significantly poorer than the maximum accuracy seen up to that step. As long as the corresponding p-value is at least as large as reg2, the elimination continues. Thus, setting reg2=1.0 (default) means there is no regularization, and the selection producing the maximum accuracy is the result. By lowering reg2 you get a more stable selection, at the potential cost of eliminating too many variables.
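The stopping rule can be sketched with `mcnemar.test` from the stats package. The classification outcomes below are constructed by hand, the variable names are hypothetical, and how the eliminator actually builds its table (or whether it uses the default continuity correction) is an assumption here, not something taken from the package source.

```r
# Hand-made outcomes for the same 100 samples (TRUE = correctly classified):
# 78 correct under both selections, 12 correct only under the best selection,
# 2 correct only under the reduced selection, 8 wrong under both
correct.best    <- rep(c(TRUE, TRUE, FALSE, FALSE), times = c(78, 12, 2, 8))
correct.reduced <- rep(c(TRUE, FALSE, TRUE, FALSE), times = c(78, 12, 2, 8))

# Paired 2x2 table; fixing the factor levels guarantees a 2x2 layout
tab <- table(
  best    = factor(correct.best,    levels = c(TRUE, FALSE)),
  reduced = factor(correct.reduced, levels = c(TRUE, FALSE))
)

# McNemar's test compares the two discordant counts (12 versus 2)
p.value <- mcnemar.test(tab)$p.value

# Rule described above: continue eliminating while the p-value is at least reg2
reg2 <- 0.05
keep.going <- p.value >= reg2
```

In this constructed case the reduced selection is significantly poorer at the 0.05 level, so elimination would stop. With a smaller reg2, such as 0.01, the same loss of accuracy would be tolerated and elimination would continue.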

Value

A matrix with one row for each level-pair and one column for each variable (column) in X.

Each row is a logical vector indicating which variables were selected (TRUE) for the corresponding level-pair. Thus, if we denote this matrix S, then X[,S[1,]] is the sub-matrix of X selected as optimal for level-pair 1, etc.
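A small sketch of how such a logical matrix can be used, with a hand-made `S` and a random `X` standing in for real results. The variable names `v1` to `v4` and the column names on `S` are assumptions for illustration; whether the matrix returned by mEliminator carries dimnames is not stated here.

```r
# Hypothetical data: 5 samples, 4 variables
X <- matrix(rnorm(20), nrow = 5, dimnames = list(NULL, paste0("v", 1:4)))

# Hypothetical selection matrix: one row per level-pair, one column per variable
S <- rbind(c(TRUE,  FALSE, TRUE,  FALSE),
           c(FALSE, TRUE,  TRUE,  TRUE),
           c(TRUE,  TRUE,  FALSE, FALSE))
colnames(S) <- colnames(X)

# Sub-matrix of X for level-pair 1; drop = FALSE keeps a matrix
# even if only one variable is selected
X1 <- X[, S[1, ], drop = FALSE]

# Number of variables selected for each level-pair
n.selected <- rowSums(S)

# Number of level-pairs in which each variable was selected
n.pairs.using <- colSums(S)
```

Row and column sums of `S` give a quick overview of how aggressive the elimination was for each level-pair and which variables matter across several pairs.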

Author(s)

Lars Snipen.

Examples

data(poems)
y <- poems[,1]
X <- as.matrix(poems[, -1])
# Variable selection
S <- mEliminator(y, X, max.dim = 10)

# Fitting model with selection information
mp.trn <- mpda(y, X, prior = c(1,1,1), selected = S, max.dim = 10)
# Predicting...
predict(mp.trn)

larssnip/mpda documentation built on March 28, 2022, 3:37 p.m.