feat.mfs: Multiple Feature Selection


View source: R/mt_fs.R

Description

Multiple feature selection with or without resampling procedures.

Usage

feat.mfs(x,y,method,pars = valipars(),is.resam = TRUE, ...)
         
feat.mfs.stab(fs.res,rank.cutoff = 20,freq.cutoff = 0.5)

feat.mfs.stats(fs.stats,cumu.plot=FALSE, main="Stats Plot", 
               ylab="Values", xlab="Index of variable", ...)

Arguments

x

A matrix or data frame containing the explanatory variables.

y

A factor specifying the class for each observation.

method

Multiple feature selection/ranking method to be used.

pars

A list describing the resampling scheme. See valipars for details.

is.resam

A logical value indicating whether resampling should be applied.

fs.res

A list obtained by running feat.mfs.

rank.cutoff

Rank cutoff: only the top-ranked features up to this cutoff are used when calculating feature frequencies.

freq.cutoff

Cutoff of feature frequency.

fs.stats

A matrix of feature statistics or values returned by feat.mfs.

cumu.plot

A logical value indicating whether the cumulative scores should be plotted.

main,xlab,ylab

Plot title and axis labels.

...

Additional parameters.

Details

feat.mfs.stab summarises multiple feature selection only when a resampling strategy is employed (i.e. is.resam is TRUE when calling feat.mfs). It computes its results from the all component of the value returned by feat.mfs.
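As a rough illustration of the kind of frequency summary described above, the following base R sketch counts how often each feature falls within the top rank.cutoff positions across resampling iterations, then keeps the features whose frequency exceeds freq.cutoff. The matrix ranks is made-up toy data and the helper top_freq is hypothetical; this is not the code used inside feat.mfs.stab.

```r
## Toy rank matrix: rows are features, columns are resampling iterations.
## (Hypothetical data; rank 1 is the best feature.)
ranks <- cbind(iter1 = c(1, 2, 3, 4, 5),
               iter2 = c(2, 1, 4, 3, 5),
               iter3 = c(1, 3, 2, 5, 4),
               iter4 = c(1, 2, 5, 3, 4))
rownames(ranks) <- paste0("f", 1:5)

## Hypothetical helper: fraction of iterations in which each feature
## is ranked within the top `rank.cutoff`.
top_freq <- function(ranks, rank.cutoff) {
  rowMeans(ranks <= rank.cutoff)
}

freq <- top_freq(ranks, rank.cutoff = 2)

## Features whose frequency exceeds a freq.cutoff of 0.5.
selected <- names(freq)[freq > 0.5]
print(freq)
print(selected)
```

In this toy example only the two features that are consistently near the top survive the frequency filter, which is the intuition behind a stability summary.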

feat.mfs.stats handles the statistical values or scores. Its purpose is to provide guidance in selecting the number of features by spotting the elbow point. It should be used in conjunction with plots of p-values and their adjusted values, such as FDR and Bonferroni, from multiple hypothesis testing.
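Since the paragraph above refers to adjusted p-values, here is a minimal base R sketch of the FDR and Bonferroni adjustments using p.adjust from the stats package. The p-values are made up for illustration, and nothing here calls feat.mfs.stats.

```r
## Hypothetical raw p-values for six features (illustrative only).
p.raw <- c(f1 = 0.001, f2 = 0.008, f3 = 0.020,
           f4 = 0.040, f5 = 0.300, f6 = 0.900)

## Adjust for multiple testing with base R's p.adjust().
p.tab <- data.frame(raw        = p.raw,
                    fdr        = p.adjust(p.raw, method = "fdr"),
                    bonferroni = p.adjust(p.raw, method = "bonferroni"))
print(round(p.tab, 3))

## Number of features retained at the 0.05 level under each correction.
n.keep <- colSums(p.tab < 0.05)
print(n.keep)
```

Comparing the counts in n.keep shows how the stricter corrections shrink the candidate set, which can be read alongside the elbow in the statistics plot.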

Value

feat.mfs returns a list with components:

fs.order

A data frame of feature order from best to worst.

fs.rank

A matrix of feature ranking scores.

fs.stats

A matrix of feature statistics or values.

all

A list of the outputs of feat.rank.re, one for each feature selection method.

feat.mfs.stab returns a list with components:

fs.freq

Feature frequencies larger than freq.cutoff.

fs.subs

Features with frequencies larger than freq.cutoff.

fs.stab

Stability rate of feature ranking.

fs.cons

A matrix of feature consensus table based on feature frequency.

feat.mfs.stats returns a list with components:

stats.tab

Statistical values with their corresponding names.

stats.long

Long format of the statistical values, for plotting.

stats.p

An object of class "trellis".

Note

The feature order can be computed directly from the overall statistics fs.stats. It is, however, slightly different from fs.order obtained by rank aggregation when resampling is employed.

Both fs.cons and fs.freq are computed from fs.order.
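To see why an order derived from overall statistics can differ from one obtained by rank aggregation, consider the toy sketch below. It uses mean ranks as a simple stand-in for aggregation; that choice is an assumption made for illustration and may not match the aggregation used by the mt package. The matrix stats is made-up data.

```r
## Toy statistics: rows are features, columns are resampling iterations.
## (Hypothetical values; larger means more important.)
stats <- cbind(iter1 = c(10, 3, 2),
               iter2 = c(1, 3, 2),
               iter3 = c(2.5, 3, 2))
rownames(stats) <- c("f1", "f2", "f3")

## Order from the overall (mean) statistic.
ord.stats <- names(sort(rowMeans(stats), decreasing = TRUE))

## Order from aggregated ranks: rank within each iteration, then average.
## (Mean rank is an assumed, simple aggregation rule.)
ranks <- apply(stats, 2, function(s) rank(-s))
ord.rank <- names(sort(rowMeans(ranks)))

print(ord.stats)
print(ord.rank)
```

Here a single large value in one iteration pulls f1 to the top of the statistic-based order, while the rank-based order favours f2, which is ranked consistently well.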

Author(s)

Wanchang Lin

See Also

feat.rank.re, feat.freq

Examples

## Not run: 
library(lattice)
library(mt)   ## provides abr1, preproc, dat.sel, valipars and the feat.* functions
data(abr1)
dat <- preproc(abr1$pos[,200:400], method="log10")  
cls <- factor(abr1$fact$class)

tmp <- dat.sel(dat, cls, choices=c("1","2"))
x   <- tmp[[1]]$dat
y   <- tmp[[1]]$cls

fs.method <- c("fs.anova","fs.rf","fs.rfe")
fs.pars   <- valipars(sampling="cv",niter=10,nreps=5)
fs <- feat.mfs(x, y, fs.method, fs.pars)   ## with resampling
names(fs)

## frequency, consensus and stabilities of feature selection 
fs.stab <- feat.mfs.stab(fs)
print(fs.stab$fs.cons,digits=2,na.print="")

## plot feature selection frequency
freq <- fs.stab$fs.freq
dotplot(freq$fs.anova, type="o", main="Feature Selection Frequencies")
barchart(freq$fs.anova)

## rank aggregation 
fs.agg <- feat.agg(fs$fs.rank)

## stats table and plotting
fs.stats <- fs$fs.stats
tmp <- feat.mfs.stats(fs.stats, cumu.plot = TRUE)
tmp$stats.p
fs.tab <- tmp$stats.tab
## convert to matrix
fs.tab <- list2df(un.list(fs.tab))

## without resampling
fs.1 <- feat.mfs(x, y, method=fs.method, is.resam = FALSE)

## End(Not run)

mt documentation built on Feb. 2, 2022, 1:07 a.m.
