fs.pls: Feature Selection Using PLS

View source: R/mt_fs.R



Description

Feature selection using PLS regression coefficients and VIP (variable importance in projection) values.


Usage

  fs.pls(x, y, pls = "simpls", ncomp = 10, ...)
  fs.plsvip(x, y, ncomp = 10, ...)
  fs.plsvip.1(x, y, ncomp = 10, ...)
  fs.plsvip.2(x, y, ncomp = 10, ...)



Arguments

x: A data frame or matrix containing the data set.


y: A factor or vector of class labels.


pls: A method for calculating PLS scores and loadings. The following methods are supported:

  • simpls: SIMPLS algorithm.

  • kernelpls: kernel algorithm.

  • oscorespls: orthogonal scores algorithm.

For details, see simpls.fit, kernelpls.fit and oscorespls.fit in package pls.


ncomp: The number of components to be used.


...: Arguments passed to or from other methods.


Details

fs.pls ranks the features by their PLS regression coefficients. Because the class labels are encoded as a dummy multiple-response matrix for the classification problem, the coefficients form a matrix rather than a vector, so the Mahalanobis distance of the coefficients is used to rank the features. (Other summaries could be used instead, for example the sum of the absolute values of the coefficients, or the square root of the coefficients.)
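The ranking step for a coefficient matrix can be sketched in base R. This is a minimal illustration, not the code used by fs.pls: coef_mat is a simulated coefficient matrix (features in rows, dummy responses in columns), and the names score, score.abs, fs.order and fs.rank are chosen here only for illustration.

```r
## Hypothetical coefficient matrix: 6 features (rows) x 2 dummy responses (columns).
## In practice this would come from a fitted PLS model; here it is simulated.
set.seed(1)
coef_mat <- matrix(rnorm(12), nrow = 6,
                   dimnames = list(paste0("f", 1:6), c("resp1", "resp2")))

## Squared Mahalanobis distance of each feature's coefficient row from the
## column means, using the sample covariance of the coefficients.
score <- mahalanobis(coef_mat, center = colMeans(coef_mat), cov = cov(coef_mat))

## Larger distance = more unusual coefficients = higher-ranked feature.
fs.order <- order(score, decreasing = TRUE)   # feature indices, best to worst
fs.rank  <- rank(-score)                      # rank of each feature (1 = best)

## Alternative summary: sum of absolute coefficients per feature.
score.abs <- rowSums(abs(coef_mat))
```

The Mahalanobis distance takes the correlation between response columns into account, whereas the sum of absolute values treats each column independently; the two can therefore order the same features differently.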

fs.plsvip and fs.plsvip.1 carry out feature selection based on the Mahalanobis distance and the absolute values of the PLS VIP, respectively.

fs.plsvip.2 is similar to fs.plsvip and fs.plsvip.1, but the categorical response is not treated as a dummy multiple-response matrix.
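For orientation, the VIP (variable importance in projection) score for a single response is commonly defined from the PLS weights, scores and y-loadings. The sketch below follows that textbook definition in base R. It is an assumption that this matches what the fs.plsvip functions compute internally, and vip_scores, W, scores and q are hypothetical names filled with simulated values.

```r
## Textbook VIP for a single-response PLS model (illustrative sketch only).
## W:      p x A matrix of PLS weights (features x components)
## scores: n x A matrix of PLS scores
## q:      length-A vector of y-loadings
vip_scores <- function(W, scores, q) {
  p  <- nrow(W)
  ss <- q^2 * colSums(scores^2)            # response variation explained per component
  w2 <- sweep(W^2, 2, colSums(W^2), "/")   # squared weights, normalised per component
  sqrt(p * drop(w2 %*% ss) / sum(ss))
}

## Simulated inputs standing in for a fitted model:
## 5 features, 8 samples, 2 components.
set.seed(2)
W      <- matrix(rnorm(10), nrow = 5)
scores <- matrix(rnorm(16), nrow = 8)
q      <- c(0.8, 0.3)

vip <- vip_scores(W, scores, q)
vip.order <- order(abs(vip), decreasing = TRUE)   # rank features by absolute VIP
```

With this definition the squared VIP values sum to the number of features, so their average is one; values above 1 are often read as above-average importance.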


Value

A list with components:


fs.rank: A vector of feature ranking scores.


fs.order: A vector of feature order from best to worst.


stats: A vector of measurements.


Author(s)

Wanchang Lin




Examples

## load the package and the example data set
library(mt)
data(abr1)

## prepare data set
cls <- factor(abr1$fact$class)
dat <- abr1$pos
## dat <- abr1$pos[,110:1930]

## replace zeros with NAs
dat <- mv.zene(dat)

## missing values summary
mv <- mv.stats(dat, grp = cls)
mv    ## view the missing value pattern

## keep variables with a missing-value rate below 0.15
## dim(dat)
dat <- dat[, mv$mv.var < 0.15]
## dim(dat)

## fill NAs with mean
dat <- mv.fill(dat,method="mean")

## log transformation
dat <- preproc(dat, method="log10")

## select classes "1" and "2" for feature ranking
ind <- cls %in% c("1", "2")
mat <- as.matrix(dat[ind, , drop = FALSE])
grp <- cls[ind, drop = TRUE]

## apply PLS methods for feature selection
res.pls      <- fs.pls(mat, grp, ncomp = 4)
res.plsvip   <- fs.plsvip(mat, grp, ncomp = 4)
res.plsvip.1 <- fs.plsvip.1(mat, grp, ncomp = 4)
res.plsvip.2 <- fs.plsvip.2(mat, grp, ncomp = 4)

## check differences among these methods
fs.order <- data.frame(pls      = res.pls$fs.order,
                       plsvip   = res.plsvip$fs.order,
                       plsvip.1 = res.plsvip.1$fs.order,
                       plsvip.2 = res.plsvip.2$fs.order)
head(fs.order, 20)

mt documentation built on June 22, 2024, 12:24 p.m.
