FeaLect-package: Scores Features for Feature Selection

Description

Suppose you have a feature matrix with 200 features and only 20 samples, and your goal is to build a classifier. You can run the FeaLect() function to compute a score for each feature. Only the features with relatively high scores (say, the top 20) are recommended for further analysis. Reducing the number of features this way helps prevent overfitting.
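
The selection step described above can be sketched in base R. This is only an illustration under stated assumptions: the matrix `X` and the `scores` vector are simulated placeholders (they are not produced by FeaLect() here), and the names `X`, `scores`, `top_k`, and `X_top` are hypothetical.

```r
# Hypothetical data: 20 samples (rows) by 200 features (columns).
set.seed(1)
X <- matrix(rnorm(20 * 200), nrow = 20,
            dimnames = list(paste0("s", 1:20), paste0("f", 1:200)))

# Placeholder scores, one per feature. In practice these would come from
# a scoring run; random values are used here purely for illustration.
scores <- setNames(runif(ncol(X)), colnames(X))

# Keep only the 20 highest-scoring features.
top_k <- 20
top_features <- names(sort(scores, decreasing = TRUE))[seq_len(top_k)]
X_top <- X[, top_features, drop = FALSE]

dim(X_top)
```

The reduced matrix `X_top` keeps all samples but only the selected columns, so a downstream classifier is fitted on 20 features rather than 200.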

Author(s)

Habil Zare

Maintainer: Habil Zare <zare@u.washington.edu>

References

Zare, Habil, et al. "Scoring relevancy of features based on combinatorial analysis of Lasso with application to lymphoma diagnosis." BMC Genomics, vol. 14, no. 1. BioMed Central, 2013.

See Also

FeaLect, train.doctor, doctor.validate, random.subset, compute.balanced, compute.logistic.score, ignore.redundant, input.check.FeaLect, lars-package, and SparseLearner-package

Examples

library(FeaLect)
data(mcl_sll)
F <- as.matrix(mcl_sll[, -1])  # The feature matrix
L <- as.numeric(mcl_sll[, 1])  # The labels
names(L) <- rownames(F)
message(nrow(F), " samples and ", ncol(F), " features.")

## For this data, total.num.of.models is suggested to be at least 100;
## 20 is used here, which keeps the example run short.
FeaLect.result.1 <- FeaLect(F = F, L = L, maximum.features.num = 10,
                            total.num.of.models = 20, talk = TRUE)

Example output

Loading required package: lars
Loaded lars 1.2

Loading required package: rms
Loading required package: Hmisc
Loading required package: lattice
Loading required package: survival
Loading required package: Formula
Loading required package: ggplot2

Attaching package: 'Hmisc'

The following objects are masked from 'package:base':

    format.pval, units

Loading required package: SparseM

Attaching package: 'SparseM'

The following object is masked from 'package:base':

    backsolve

22 samples and 236 features.
***********************************************
Scoring 236 features using 22 samples.
 - started at: 2019-05-14 10:19:40
 - sampling.index: 1
 - sampling.index: 2
 - sampling.index: 3
 - sampling.index: 4
 - sampling.index: 5
 - sampling.index: 6
 - sampling.index: 7
 - sampling.index: 8
 - sampling.index: 9
 - sampling.index: 10
 - sampling.index: 11
 - sampling.index: 12
 - sampling.index: 13
 - sampling.index: 14
 - sampling.index: 15
 - sampling.index: 16
 - sampling.index: 17
 - sampling.index: 18
 - sampling.index: 19
 - sampling.index: 20
****************************************************
validation ended at: 2019-05-14 10:19:43   taking:   2.26112723350525
****************************************************
There were 31 warnings (use warnings() to see them)

FeaLect documentation built on Feb. 26, 2020, 1:06 a.m.