support for feature selection in cross-validation
fs.absT(N)
fs.probT(p)
fs.topVariance(p)
N: number of features to retain; features are ordered by descending value of abs(two-sample t statistic), and the top N are used.
p: for fs.probT, cumulative probability (in (0,1)) in the distribution of absolute t statistics above which we retain features; for fs.topVariance, the variance percentile beneath which features are discarded.
Each of these functions returns a function (a closure) that is passed as a parameter to xvalSpec in applications of MLearn.
A function is returned that, when called with a formula and a data frame, itself returns a formula consisting of the selected features for application of MLearn.
The functions fs.absT and fs.probT are two examples of approaches to embedded feature selection that make sense for two-sample prediction problems. For selection based on linear models or other discrimination measures, you will need to create your own selection helper, following the code in these functions as examples.
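As a rough illustration of what such a helper might look like, the sketch below ranks features by a one-way ANOVA F statistic instead of a t statistic. The name fs.topF is hypothetical and not part of MLInterfaces; the code assumes all candidate features are numeric and relies only on base R. It mirrors the calling convention described above (a closure taking a formula and a data frame and returning a formula), but it is not the package's own implementation.

```r
# Hypothetical helper (NOT part of MLInterfaces): keep the N features with the
# largest one-way ANOVA F statistic across the classes of the response.
# Assumes every candidate feature is numeric.
fs.topF <- function(N) function(formula, data) {
  mf <- model.frame(formula, data)
  y <- factor(model.response(mf))        # class labels
  respname <- names(mf)[1]               # response is the first column of the model frame
  x <- model.matrix(formula, data)
  x <- x[, colnames(x) != "(Intercept)", drop = FALSE]
  # F statistic for each feature regressed on the class labels
  fstat <- apply(x, 2, function(col) anova(lm(col ~ y))[1, "F value"])
  keep <- names(sort(fstat, decreasing = TRUE))[seq_len(min(N, length(fstat)))]
  # backticks protect non-syntactic feature names
  as.formula(paste(respname, "~", paste(sprintf("`%s`", keep), collapse = " + ")))
}

# Small synthetic check: only x2 differs strongly between the two classes
set.seed(1)
d <- data.frame(cls = factor(rep(c("a", "b"), each = 20)),
                x1 = rnorm(40),
                x2 = rnorm(40) + rep(c(0, 5), each = 20),
                x3 = rnorm(40))
sel <- fs.topF(1)(cls ~ ., d)
sel
```

Because the helper returns an ordinary formula, it can be inspected on any subset of the data before it is used inside a cross-validation run.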
fs.topVariance performs non-specific feature selection based on the variance. Argument p is the variance percentile beneath which features are discarded.
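The percentile rule can be illustrated with a few lines of base R. This is only a sketch of the idea on synthetic data, not the MLInterfaces source: the column names are made up, and the strict comparison against the quantile is an assumption about how ties at the cutoff are handled.

```r
# Sketch of variance-percentile filtering on synthetic data (hypothetical columns)
set.seed(42)
x <- data.frame(low  = rnorm(30, sd = 0.1),
                mid  = rnorm(30, sd = 1),
                high = rnorm(30, sd = 10),
                top  = rnorm(30, sd = 50))
v <- sapply(x, var)              # per-feature variance
p <- 0.5                         # discard features below the 50th variance percentile
cutoff <- quantile(v, p)
kept <- names(v)[v > cutoff]     # assumption: strictly greater than the cutoff
kept
```

The selection ignores the class labels entirely, which is why it is described as non-specific.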
VJ Carey <stvjc@channing.harvard.edu>
library("MASS")
library("MLInterfaces")  # provides fs.absT
data(crabs)
# we will demonstrate this procedure with the crabs data.
# first, create the closure to pick 3 features
demFS = fs.absT(3)
# run it on the entire dataset with features excluding sex
demFS(sp~.-sex, crabs)
# emulate cross-validation by excluding last 50 records
demFS(sp~.-sex, crabs[1:150,])
# emulate cross-validation by excluding first 50 records -- different features retained
demFS(sp~.-sex, crabs[51:200,])