Selecting the optimal number of features

Description

tuneFeatureNb iterates through increasing feature numbers to calculate kappa values which represents the performance of the model computed with the given features. We recommand to save the object containing the optimal number of features for the following steps.

Usage

1
2
3
4
tuneFeatureNb(data, cl = 1, feature.ranking, step.nb = 10,
  valid.times = 10, cost = NULL, gamma = NULL, kernel = "linear",
  numcores = ifelse(.Platform$OS.type == "windows", 1, parallel::detectCores()
  - 1), file.prefix = NULL)

Arguments

data

data.frame containing the training set

cl

integer indicating the column number corresponding to the response vector that classify positive and negative regions (default = 1)

feature.ranking

List of ordered features.

step.nb

Number of features to add at each step (default = 10)

valid.times

Integer indicating how many times the training set will be split for the cross validation step (default = 10). This number must be smaller than positive and negative sets sizes.

cost

The SVM cost parameter for both linear and radial kernels. If NULL (default), the function mcTune is run.

gamma

The SVM gamma parameter for radial kernel. If radial kernel and NULL (default), the function mcTune is run.

kernel

SVM kernel, a character string: "linear" or "radial". (default = "radial")

numcores

Number of cores to use for parallel computing (default: the number of available cores in the machine - 1)

file.prefix

A character string that will be used as a prefix followed by "_kappa_measures.png" for the result plot file. If it is NULL (default), no plot is returned

Value

A list with two objects.

performance

2-columns data frame. first column correspond to the number of tested features, second column contains the corresponding kappa value

best.feature.nb

Integer corresponding to the number of features producing the model with the highest kappa value

Examples

1
2
3
4
5
6
7
8
data(crm.features)
data(feature.ranking)
cost <- 1
gamma <- 1
 #feature.nb.obj <- tuneFeatureNb(data.granges=crm.features,
 #    feature.ranking=feature.ranking, kernel='linear', cost=cost,gamma=gamma,
 #    file.prefix = "test")
 #names(feature.nb.obj)