shaving: Repeated shaving of variables

shaving R Documentation

Repeated shaving of variables

Description

One of five filter methods can be chosen for repeated shaving of a certain percentage of the worst performing variables. The performance of each reduced model is stored and can be inspected through the print and plot methods.

Usage

shaving(
  y,
  X,
  ncomp = 10,
  method = c("SR", "VIP", "sMC", "LW", "RC"),
  prop = 0.2,
  min.left = 2,
  comp.type = c("CV", "max"),
  validation = c("CV", 1),
  fixed = integer(0),
  newy = NULL,
  newX = NULL,
  segments = 10,
  plsType = "plsr",
  Y.add = NULL,
  ...
)

## S3 method for class 'shaved'
plot(x, y, what = c("error", "spectra"), index = "min", log = "x", ...)

## S3 method for class 'shaved'
print(x, ...)

Arguments

y

vector of response values (numeric or factor).

X

numeric predictor matrix.

ncomp

integer number of components (default = 10).

method

filter method, i.e. SR, VIP, sMC, LW or RC given as character.

prop

proportion of variables to be removed in each iteration (numeric).

min.left

minimum number of remaining variables.

comp.type

use number of components chosen by cross-validation, "CV", or fixed, "max".

validation

type of validation for plsr. The default is "CV". If more than one set of CV segments is wanted, use a vector of length two, e.g. c("CV",5).

fixed

vector of indices for compulsory/fixed variables that should always be included in the modelling.

newy

validation response for RMSEP/error computations.

newX

validation predictors for RMSEP/error computations.

segments

see mvr for documentation of segment choices.

plsType

Type of PLS model, "plsr" or "cppls".

Y.add

Additional response for CPPLS, see plsType.

...

additional arguments for plsr or cvsegments.

x

object of class shaved for plotting or printing.

what

plot type. Default = "error". Alternative = "spectra".

index

which iteration to plot. Default = "min"; corresponding to minimum RMSEP.

log

logarithmic x (default) or y scale.

Details

Variables are first sorted with respect to some importance measure, usually one of the filter measures listed under method. Secondly, a threshold is used to eliminate a subset of the least informative variables. Then a model is fitted again to the remaining variables and performance is measured. The procedure is repeated until maximum model performance is achieved.
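
The loop described above can be sketched in a few lines of R. This is an illustration only, not the internals of shaving: it assumes the pls package, ranks variables by the absolute value of their regression coefficients (in the spirit of the RC filter), and simply runs until no further shaving step fits above min.left, picking the best reduced model afterwards. The names max.comp, keep, steps, res and best.vars are made up for this sketch.

```r
library(pls)

data(mayonnaise, package = "pls")
X <- msc(mayonnaise$NIR)        # scatter-corrected spectra
y <- mayonnaise$design[, 1]

prop     <- 0.2   # share of the remaining variables dropped per iteration
min.left <- 2     # never go below this many variables
max.comp <- 5     # assumed upper limit on components, kept small for speed

keep  <- seq_len(ncol(X))   # column indices still in the model
steps <- list()             # one record per reduced model

set.seed(1)                 # CV segments are drawn at random
repeat {
  # Fit a cross-validated PLS model on the remaining variables
  Xk  <- X[, keep, drop = FALSE]
  fit <- plsr(y ~ Xk, ncomp = min(max.comp, length(keep)),
              validation = "CV", segments = 10)

  # CV error per number of components (first entry is the intercept-only model)
  rmsep <- RMSEP(fit, estimate = "CV")$val[1, 1, -1]
  best  <- unname(which.min(rmsep))
  steps[[length(steps) + 1]] <- list(variables = keep, ncomp = best,
                                     rmsep = unname(rmsep[best]))

  # Stop when another shaving step would leave fewer than min.left variables
  n.drop <- max(1, floor(prop * length(keep)))
  if (length(keep) - n.drop < min.left) break

  # Rank by |regression coefficient| and shave off the weakest variables
  score <- abs(drop(coef(fit, ncomp = best)))
  keep  <- sort(keep[order(score, decreasing = TRUE)[seq_len(length(keep) - n.drop)]])
}

# Summarise the reduced models and pick the one with the lowest CV error
res <- data.frame(
  n.vars = vapply(steps, function(s) length(s$variables), integer(1)),
  ncomp  = vapply(steps, function(s) s$ncomp, integer(1)),
  rmsep  = vapply(steps, function(s) s$rmsep, numeric(1))
)
best.vars <- steps[[which.min(res$rmsep)]]$variables
```

Here res holds one row per reduced model, and best.vars contains the column indices of the variable set with the lowest cross-validated RMSEP in this sketch.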

Value

Returns a list object of class shaved containing the method type, the error, number of components, and number of variables per reduced model. It also contains a list of all reduced variable sets plus the original data.

Author(s)

Kristian Hovde Liland

See Also

VIP (SR/sMC/LW/RC), filterPLSR, shaving, stpls, truncation, bve_pls, ga_pls, ipw_pls, mcuve_pls, rep_pls, spa_pls, lda_from_pls, lda_from_pls_cv, setDA.

Examples

data(mayonnaise, package = "pls")
sh <- shaving(mayonnaise$design[,1], pls::msc(mayonnaise$NIR), type = "interleaved")
pars <- par(mfrow = c(2,1), mar = c(4,4,1,1))
plot(sh)
plot(sh, what = "spectra")
par(pars)
print(sh)
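
A further example, given as a hedged sketch: the newy and newX arguments can presumably be used to measure the error on held-out samples instead of by cross-validation. It assumes that the train element of the mayonnaise data is a logical training-set flag, and the settings (VIP, ncomp = 5, prop = 0.3) are arbitrary choices for illustration. The name sh2 is made up here.

```r
library(plsVarSel)

data(mayonnaise, package = "pls")
X     <- pls::msc(mayonnaise$NIR)
y     <- mayonnaise$design[, 1]
train <- mayonnaise$train   # assumed: TRUE for training samples

# Shave on the training samples, compute the error on the held-out samples
sh2 <- shaving(y[train], X[train, ], ncomp = 5, method = "VIP", prop = 0.3,
               newy = y[!train], newX = X[!train, ])
print(sh2)
plot(sh2)
```

Note that the scatter correction is computed on all samples before splitting, which is acceptable for a quick look but would leak information in a strict evaluation.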


khliland/plsVarSel documentation built on Feb. 5, 2023, 3:15 a.m.