View source: R/heuristic.methods.R
High-level function to compute the hierarchical heuristic methods MAX, AND, OR (Obozinski et al., Genome Biology, 2008) applying a classical holdout procedure.
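As a rough illustration of the three heuristics, the base R sketch below follows the way they are usually described for Obozinski et al.: MAX takes the largest flat score among a class and its descendants, AND multiplies the flat scores of a class and its ancestors, and OR returns one minus the product of (1 - score) over a class and its descendants. The toy hierarchy, the scores and the helper names (`ancestors`, `descendants`, `heuristic_sketch`) are all hypothetical, and the package's own implementation may differ in its details.

```r
# Toy DAG stored as a list: each class maps to its parent classes.
# Class names and scores are made up for illustration only.
parents <- list(root = character(0), A = "root", B = "root", C = c("A", "B"))
classes <- names(parents)

# Ancestors of a node (the node itself excluded), found recursively.
ancestors <- function(node) {
  p <- parents[[node]]
  unique(c(p, unlist(lapply(p, ancestors))))
}

# Descendants of a node: every class that has `node` among its ancestors.
descendants <- function(node) {
  classes[vapply(classes, function(x) node %in% ancestors(x), logical(1))]
}

# Flat scores of a single example, one score per class.
flat <- c(root = 1.0, A = 0.6, B = 0.3, C = 0.8)

# Hypothetical helper, not part of the package.
heuristic_sketch <- function(flat, method = c("MAX", "AND", "OR")) {
  method <- match.arg(method)
  vapply(names(flat), function(node) {
    switch(method,
      MAX = max(flat[c(node, descendants(node))]),
      AND = prod(flat[c(node, ancestors(node))]),
      OR  = 1 - prod(1 - flat[c(node, descendants(node))])
    )
  }, numeric(1))
}

heuristic_sketch(flat, "MAX")
heuristic_sketch(flat, "AND")
heuristic_sketch(flat, "OR")
```

In this toy case AND can only lower a score along a path from the root, while MAX and OR can only raise the score of an ancestor.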
Do.heuristic.methods.holdout(heuristic.fun = "AND", norm = TRUE,
  norm.type = NULL, folds = 5, seed = 23, n.round = 3,
  f.criterion = "F", recall.levels = seq(from = 0.1, to = 1, by = 0.1),
  compute.performance = FALSE, flat.file = flat.file,
  ann.file = ann.file, dag.file = dag.file,
  ind.test.set = ind.test.set, ind.dir = ind.dir,
  flat.dir = flat.dir, ann.dir = ann.dir, dag.dir = dag.dir,
  hierScore.dir = hierScore.dir, perf.dir = perf.dir)
heuristic.fun |
can be one of the following three values:
|
norm |
boolean value:
|
norm.type |
can be one of the following three values:
|
folds |
number of folds of the cross validation over which the performance metrics are computed and averaged ( |
seed |
initialization seed for the random generator to create folds ( |
n.round |
number of rounding digits to be applied to the hierarchical scores matrix ( |
f.criterion |
character. Type of F-measure to be used to select the best F-measure. Two possibilities:
If |
recall.levels |
a vector with the desired recall levels ( |
compute.performance |
boolean value: should the flat and hierarchical performance (
|
flat.file |
name of the file containing the flat scores matrix to be normalized or already normalized (without rda extension). |
ann.file |
name of the file containing the label matrix of the examples (without rda extension). |
dag.file |
name of the file containing the graph that represents the hierarchy of the classes (without rda extension). |
ind.test.set |
name of the file containing a vector of integer numbers corresponding to the indices of the elements (rows) of scores matrix to be used in the test set. |
ind.dir |
relative path to folder where |
flat.dir |
relative path where flat scores matrix is stored. |
ann.dir |
relative path where annotation matrix is stored. |
dag.dir |
relative path where graph is stored. |
hierScore.dir |
relative path where the hierarchical scores matrix must be stored. |
perf.dir |
relative path where the performance measures must be stored. If |
The function checks whether the number of classes in the flat scores matrix differs from the number of classes in the annotations matrix. If so, the annotations matrix is shrunk to the terms of the flat scores matrix and the corresponding subgraph is computed as well. N.B.: all the nodes of the subgraph are assumed to be accessible from the root.
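The shrinking step can be pictured with the base R sketch below. It is only an illustration under stated assumptions: the matrices, the class names and the adjacency-matrix representation of the hierarchy are hypothetical, and the function itself works on the graph object stored in dag.file rather than on an adjacency matrix.

```r
# Hypothetical toy data: the annotation matrix covers one class ("D")
# that the flat scores matrix does not contain.
S <- matrix(0.5, nrow = 2, ncol = 3,
            dimnames = list(c("ex1", "ex2"), c("root", "A", "B")))
ann <- matrix(0L, nrow = 2, ncol = 4,
              dimnames = list(c("ex1", "ex2"), c("root", "A", "B", "D")))

# Hierarchy as an adjacency matrix (rows = parents, columns = children).
adj <- matrix(0L, nrow = 4, ncol = 4,
              dimnames = list(colnames(ann), colnames(ann)))
adj["root", c("A", "B")] <- 1L
adj["A", "D"] <- 1L

# If the number of classes differs, keep only the classes of the flat
# scores matrix in both the annotations and the hierarchy.
if (ncol(S) != ncol(ann)) {
  keep <- colnames(S)
  ann <- ann[, keep, drop = FALSE]
  adj <- adj[keep, keep, drop = FALSE]
}

colnames(ann)
dim(adj)
```

Indexing by column name rather than by position keeps the classes of the two matrices aligned even when they are stored in a different order.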
The predictions of the root node are excluded when computing all the performance measures, since the root is a dummy node added to the ontology for practical reasons (e.g. some graph-based software may require a single root node to work). However, the root node scores are stored in the hierarchical scores matrix.
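A minimal sketch of this convention, assuming a root class named "root" (a hypothetical name) and toy matrices: the root column is dropped from the copies used for evaluation, while the full matrix, root included, is the one that would be saved.

```r
root <- "root"  # hypothetical name of the dummy root class

# Toy hierarchical scores and annotations (examples on rows, classes on columns).
S.hier <- matrix(c(1, 1, 0.8, 0.2, 0.3, 0.1), nrow = 2,
                 dimnames = list(c("ex1", "ex2"), c("root", "A", "B")))
ann <- matrix(c(1L, 1L, 1L, 0L, 0L, 0L), nrow = 2,
              dimnames = list(c("ex1", "ex2"), c("root", "A", "B")))

# Copies without the root column, to be used only for performance evaluation.
S.eval <- S.hier[, colnames(S.hier) != root, drop = FALSE]
ann.eval <- ann[, colnames(ann) != root, drop = FALSE]

# S.hier is left untouched, so the stored matrix still contains the root scores.
colnames(S.eval)
colnames(S.hier)
```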
Two rda files stored in the respective output directories:
Hierarchical Scores Results: a matrix with examples on rows and classes on columns representing the computed hierarchical scores for each example and for each considered class. It is stored in the hierScore.dir directory;
Performance Measures: flat and hierarchical performance results:
  AUPRC results computed through AUPRC.single.over.classes (AUPRC);
  AUROC results computed through AUROC.single.over.classes (AUROC);
  PXR results computed through precision.at.given.recall.levels.over.classes (PXR);
  FMM results computed through compute.Fmeasure.multilabel (FMM);
They are stored in the perf.dir directory.
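To make the shape of the scores matrix more concrete, the sketch below shows how a holdout selection and the n.round rounding could look in base R. It assumes, without confirmation from this page, that only the rows listed in ind.test.set end up in the evaluated matrix; the matrix, the index vector and the object names are hypothetical.

```r
set.seed(23)

# Toy scores matrix: 6 examples on rows, 4 classes on columns.
S <- matrix(runif(24), nrow = 6,
            dimnames = list(paste0("ex", 1:6), c("root", "A", "B", "C")))

# Hypothetical vector of row indices defining the test set.
test.index <- c(2L, 5L, 6L)
n.round <- 3

# Keep only the test rows, then round the scores to n.round digits.
S.test <- round(S[test.index, , drop = FALSE], digits = n.round)

dim(S.test)
rownames(S.test)
```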
# Load the example graph (g), flat scores (S), labels (L) and test indices
data(graph)
data(scores)
data(labels)
data(test.index)
# Save the objects as rda files in a temporary directory
tmpdir <- paste0(tempdir(), "/")
save(g, file = paste0(tmpdir, "graph.rda"))
save(L, file = paste0(tmpdir, "labels.rda"))
save(S, file = paste0(tmpdir, "scores.rda"))
save(test.index, file = paste0(tmpdir, "test.index.rda"))
# Input and output directories all point to the temporary directory
ind.dir <- dag.dir <- flat.dir <- ann.dir <- tmpdir
hierScore.dir <- perf.dir <- tmpdir
recall.levels <- seq(from = 0.25, to = 1, by = 0.25)
# File names are given without the rda extension
ind.test.set <- "test.index"
dag.file <- "graph"
flat.file <- "scores"
ann.file <- "labels"
Do.heuristic.methods.holdout(heuristic.fun = "MAX", norm = FALSE, norm.type = "MaxNorm",
  folds = NULL, seed = 23, n.round = 3, f.criterion = "F", recall.levels = recall.levels,
  compute.performance = TRUE, flat.file = flat.file, ann.file = ann.file, dag.file = dag.file,
  ind.test.set = ind.test.set, ind.dir = ind.dir, flat.dir = flat.dir, ann.dir = ann.dir,
  dag.dir = dag.dir, hierScore.dir = hierScore.dir, perf.dir = perf.dir)