fs.pca | R Documentation |
Feature selection using PCA loadings.
fs.pca(x, thres = 0.8, ...)
x | A data frame or matrix containing the data set. |
thres | The threshold on the cumulative percentage of variance explained by the PCs. |
... | Additional arguments to
Because the PCA loadings form a matrix with one column per PC, the Mahalanobis distance of the loadings is used to score and select the features. (Other summaries could be used instead, for example the sum of the absolute values of the loadings or the square root of the loadings.)
Note that this feature selection method is unsupervised: it does not use class labels.
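The idea described above can be sketched in base R. This is only an illustrative approximation, not the source of fs.pca: the function name pca_loading_rank is hypothetical, and it assumes that the PCA is computed with prcomp, that the PCs kept are the first ones whose cumulative explained variance reaches thres, and that a larger Mahalanobis distance means a more important feature.

```r
## Hypothetical sketch of PCA-loading feature ranking (not the package source).
pca_loading_rank <- function(x, thres = 0.8, ...) {
  x <- as.matrix(x)
  pca <- prcomp(x, ...)

  ## cumulative proportion of variance explained by the PCs
  cum_var <- cumsum(pca$sdev^2) / sum(pca$sdev^2)

  ## assumption: keep the first k PCs whose cumulative variance reaches thres
  k <- min(which(cum_var >= thres), length(cum_var))

  ## loadings: one row per feature, one column per retained PC
  ld <- pca$rotation[, seq_len(k), drop = FALSE]

  ## Mahalanobis distance of each feature's loading vector from the centroid
  score <- mahalanobis(ld, center = colMeans(ld), cov = cov(ld))

  ## assumption: larger distance = better feature
  ord <- order(score, decreasing = TRUE)
  list(fs.rank  = rank(-score, ties.method = "min"),
       fs.order = names(score)[ord],
       stats    = score)
}

## usage on a built-in data set; scale. is passed through to prcomp
res <- pca_loading_rank(mtcars, thres = 0.8, scale. = TRUE)
head(res$fs.order)
```

Because no class labels enter the computation, the same call works on any numeric data frame or matrix; the returned names mirror the components listed in this help page, but their exact contents in fs.pca may differ.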
A list with components:
fs.rank | A vector of feature ranking scores. |
fs.order | A vector of feature order from best to worst. |
stats | A vector of measurements. |
Wanchang Lin
feat.rank.re
## prepare data set
data(abr1)
cls <- factor(abr1$fact$class)
dat <- abr1$pos
## dat <- abr1$pos[,110:1930]
## fill zeros with NAs
dat <- mv.zene(dat)
## missing values summary
mv <- mv.stats(dat, grp=cls)
mv ## View the missing value pattern
## filter missing value variables
## dim(dat)
dat <- dat[,mv$mv.var < 0.15]
## dim(dat)
## fill NAs with mean
dat <- mv.fill(dat,method="mean")
## log transformation
dat <- preproc(dat, method="log10")
## select class "1" and "2" for feature ranking
ind <- grepl("1|2", cls)
mat <- dat[ind,,drop=FALSE]
mat <- as.matrix(mat)
grp <- cls[ind, drop=TRUE]
## feature selection by PCA
res <- fs.pca(dat)
names(res)