ker.score.classifier.holdout: RANKS held-out procedure for a single class


Description

Function to perform a held-out procedure for a single class with a kernel-based score method.

Usage

ker.score.classifier.holdout(K, ind.pos, ind.test, m = 5, p = 10, 
alpha = seq(from = 0.05, to = 0.6, by = 0.05), init.seed = 0, 
opt.fun = compute.F, fun = KNN.score, ...)

Arguments

K

matrix. Kernel matrix or any valid symmetric matrix

ind.pos

indices of the positive examples of the training set, given as row indices of the input matrix K

ind.test

indices of the examples of the test set, given as row indices of the input matrix K

m

number of folds for the cross-validation on the training set

p

number of repeated cross-validations on the training set

alpha

vector of the quantiles to be tested

init.seed

initial seed for the random generator (def: 0)

opt.fun

Function implementing the metric to select the optimal threshold. The F-score (compute.F) is the default. Available functions:

- compute.F: F-score (default)

- compute.acc: accuracy

In principle, any function with two arguments representing the vectors of predicted and true labels can be used.

fun

function. It must be a kernel-based score method (default KNN.score)

...

optional arguments for the function fun
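As an illustration of the opt.fun argument, the sketch below defines a custom metric with the two-argument shape described above. The name compute.precision is hypothetical and not part of RANKS. The argument order (predicted labels first, true labels second) and the scalar "larger is better" return value are assumptions based on the description of opt.fun.

```r
# Hypothetical custom metric: precision of the positive class.
# pred and labels are 0/1 vectors of equal length (argument order assumed).
compute.precision <- function(pred, labels) {
  n.pred.pos <- sum(pred == 1)
  # No positive predictions: define precision as 0 to avoid dividing by zero.
  if (n.pred.pos == 0) return(0)
  sum(pred == 1 & labels == 1) / n.pred.pos
}

# Two predicted positives, one of which is a true positive.
prec <- compute.precision(c(1, 1, 0, 0), c(1, 0, 1, 0))
```

Under these assumptions, such a function could be passed as opt.fun = compute.precision.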

Details

Function to classify labels according to a hold-out procedure with a kernel-based score method. The optimal quantile and the corresponding threshold for a given class are selected by (possibly repeated) internal cross-validation on the training set, using the metric given by opt.fun (the F-score by default). The scores of the held-out nodes are then computed, and the selected threshold is used to classify the held-out nodes in the test set. Note that the test examples are given as indices of the rows of the input matrix K.
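The quantile-based threshold selection described above can be sketched in base R as follows. This is a simplified illustration, not the package's implementation: it skips the cross-validation folds and scores a single labelled set, it assumes each candidate threshold is the alpha-quantile of the scores, and the helpers f.score and select.threshold are hypothetical names.

```r
# Hypothetical F-score helper (a stand-in for the package's compute.F).
f.score <- function(pred, labels) {
  tp <- sum(pred == 1 & labels == 1)
  if (tp == 0) return(0)
  precision <- tp / sum(pred == 1)
  recall <- tp / sum(labels == 1)
  2 * precision * recall / (precision + recall)
}

# For each candidate quantile, derive a threshold from the scores,
# classify the examples, and keep the quantile with the best metric value.
select.threshold <- function(scores, labels, alpha, metric = f.score) {
  thresh <- quantile(scores, probs = alpha, names = FALSE)
  perf <- sapply(thresh, function(t) metric(as.integer(scores > t), labels))
  best <- which.max(perf)
  list(opt.alpha = alpha[best], opt.thresh = thresh[best])
}

# Toy data: four scored examples, the two highest-scoring ones are positive.
toy <- select.threshold(scores = c(0.1, 0.2, 0.8, 0.9),
                        labels = c(0, 0, 1, 1),
                        alpha = c(0.25, 0.5, 0.75))
```

In this toy case the median separates the two classes perfectly, so the 0.5 quantile is selected. In the procedure described above, the metric would instead be averaged over the m folds and p repetitions before the best quantile is chosen.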

Value

A list with 4 components:

labels

vector of the predicted labels for the test set (1 represents positive, 0 negative)

av.scores

a vector with the scores computed on the test set. The elements of av.scores correspond to the rows of K indexed by ind.test

opt.alpha

the optimal quantile alpha

opt.thresh

the optimal threshold

See Also

rw.kernel-methods, Kernel functions, ker.score.classifier.cv

Examples

# Node label classification of the DrugBank category Penicillins
# on the Tanimoto chemical structure similarity network (1253 drugs)
# with eav-score with 1-step random walk kernel
# using held-out with 5-fold CV repeated 10 times on the training set
# to set the "optimal" threshold for classification
library(RANKS)
library(bionetdata)
data(DD.chem.data)
data(DrugBank.Cat)
labels <- DrugBank.Cat[, "Penicillins"]
# The first 300 drugs form the test set, the remaining ones the training set
ind.test <- 1:300
ind.train <- 301:length(labels)
# Keep only the positive examples that fall in the training set
ind.pos <- which(labels == 1)
ind.pos <- ind.pos[ind.pos > 300]
K <- rw.kernel(DD.chem.data)
res <- ker.score.classifier.holdout(K, ind.pos, ind.test, m = 5, p = 10, fun = eav.score)

RANKS documentation built on May 1, 2019, 9:27 p.m.