Provides several evaluation tests of the output of ensemble_fs. There are
performance tests, namely the logreg test and the permutation test,
as well as tests of stability via the variance of feature importances
and the Jaccard-index (see Details).
data |
an object of class data.frame |
efs_table |
a table object of class matrix (retrieved
from |
file_name |
a character string; the name used for the two possible PDF files. |
classnumber |
a number indicating the index of the variable used for binary classification |
NA_threshold |
a number in range of [0,1]. Threshold for deletion
of features with a greater proportion of NAs than |
logreg |
a logical value indicating whether to conduct an evaluation via logistic regression or not |
rf |
a logical value indicating whether to conduct an evaluation via random forest or not |
permutation |
a logical value indicating whether to conduct a permutation of the class variable or not |
p_num |
number of permutations |
variances |
a logical value indicating whether to calculate the variances of importances retrieved from bootstrapping or not |
jaccard |
a logical value indicating whether to calculate the Jaccard-index or not |
bs_num |
the number of bootstrap permutations of the importances |
bs_percentage |
a number in range of [0,1]. Proportion of randomly selected samples for bootstrapping |
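The NA_threshold argument can be pictured with a small base R sketch. It is only an illustration of the idea of dropping features whose proportion of NAs is greater than a threshold; the helper drop_sparse_features and the toy data frame are hypothetical and not part of the package.

```r
# Hypothetical helper (not part of the package): keep only the columns whose
# proportion of NAs does not exceed the threshold.
drop_sparse_features <- function(df, na_threshold) {
  na_share <- colMeans(is.na(df))   # proportion of NAs per column
  df[, na_share <= na_threshold, drop = FALSE]
}

toy <- data.frame(
  a   = c(1, NA, 3, 4, 5),    # 20% NAs
  b   = c(NA, NA, NA, 4, 5),  # 60% NAs
  cls = c(0, 1, 0, 1, 0)      # no NAs
)

kept <- drop_sparse_features(toy, na_threshold = 0.25)
names(kept)
```

With a threshold of 0.25, column b is dropped because more than a quarter of its values are missing.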
A logistic regression model with leave-one-out cross-validation (LOOCV) is
fitted on the selected features and on all features when logreg = TRUE.
The AUC values of the two ROC curves are compared with roc.test.
The ROC curves are illustrated on the PDF file "file_name" + "LG-ROC.pdf".
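The LOOCV comparison can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's implementation: the data are simulated, the helpers auc and loocv_scores are hypothetical, the AUC is computed with the rank (Mann-Whitney) formula to avoid extra dependencies, and the final roc.test comparison is left out.

```r
set.seed(1)
n <- 60
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
toy$cls <- rbinom(n, 1, plogis(1.5 * toy$x1))  # only x1 is informative

# Rank-based AUC, equivalent to the normalised Mann-Whitney U statistic.
auc <- function(score, label) {
  r <- rank(score)
  n_pos <- sum(label == 1)
  n_neg <- sum(label == 0)
  (sum(r[label == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Leave-one-out predictions of a logistic regression on the given features.
loocv_scores <- function(df, features) {
  f <- reformulate(features, response = "cls")
  vapply(seq_len(nrow(df)), function(i) {
    fit <- glm(f, data = df[-i, ], family = binomial)
    unname(predict(fit, newdata = df[i, , drop = FALSE], type = "response"))
  }, numeric(1))
}

auc_all <- auc(loocv_scores(toy, c("x1", "x2", "x3")), toy$cls)  # all features
auc_sub <- auc(loocv_scores(toy, "x1"), toy$cls)                 # selected subset
c(all = auc_all, subset = auc_sub)
```

The two held-out score vectors are what a ROC comparison test would then be applied to.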
With rf = TRUE, a random forest model is constructed and evaluated.
Analogous to the logistic regression, the AUC values of the two ROC curves,
one for all features and one for the subset of the best-ranked features,
are compared with roc.test.
The ROC curves are illustrated on the PDF file "file_name" + "RF-ROC.pdf".
The permutation test (permutation = TRUE) compares the AUC of
a logistic regression with p_num AUCs obtained from random
permutations of the class variable, using a t.test.
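The permutation test described above might look roughly like this in base R. It is a sketch under assumptions: the data are simulated, the helpers auc and fit_auc are hypothetical, the AUC is computed in-sample for brevity, and a one-sample t.test against the observed AUC is assumed as the form of the comparison.

```r
set.seed(42)
n <- 80
x <- rnorm(n)
cls <- rbinom(n, 1, plogis(2 * x))

# Rank-based AUC, equivalent to the normalised Mann-Whitney U statistic.
auc <- function(score, label) {
  r <- rank(score)
  n_pos <- sum(label == 1)
  n_neg <- sum(label == 0)
  (sum(r[label == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# AUC of a logistic regression of y on x (in-sample, to keep the sketch short).
fit_auc <- function(x, y) {
  fit <- glm(y ~ x, family = binomial)
  auc(fitted(fit), y)
}

observed <- fit_auc(x, cls)

p_num <- 20  # number of permutations
perm_aucs <- replicate(p_num, fit_auc(x, sample(cls)))  # shuffled class labels

# Assumed form of the comparison: are the permuted AUCs centred on the observed AUC?
res <- t.test(perm_aucs, mu = observed)
res$p.value
```

A small p-value indicates that the observed AUC differs from what random class labels produce.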
Variances of the importances after a bootstrapping analysis are
calculated with variances = TRUE. The number of bootstrap runs and the
proportion of samples drawn in each run are set by bs_num and bs_percentage.
The function also provides a PDF file "file_name" + "_Variances.pdf".
Additionally, the Jaccard-index of these bootstrapped importances
can be calculated by setting jaccard = TRUE.
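The two stability measures can be illustrated with a base R sketch. Everything here is an assumption made for illustration: the data are simulated, the absolute Pearson correlation stands in for the ensemble importance, samples are drawn without replacement, and the Jaccard-index is taken over the top two features of each run. The helpers importance, jaccard and top_k are hypothetical.

```r
set.seed(7)
n <- 100
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n), x4 = rnorm(n))
cls <- rbinom(n, 1, plogis(2 * toy$x1 + toy$x2))

bs_num <- 50          # number of bootstrap runs
bs_percentage <- 0.9  # proportion of samples drawn in each run

# Stand-in importance: absolute Pearson correlation of each feature with the class.
importance <- function(df, y) {
  abs(vapply(df, function(col) cor(col, y), numeric(1)))
}

# One column of importances per run (rows are features).
boot_imp <- replicate(bs_num, {
  idx <- sample(n, size = floor(bs_percentage * n))
  importance(toy[idx, ], cls[idx])
})

# Variance of each feature's importance across the runs.
imp_var <- apply(boot_imp, 1, var)

# Jaccard-index of two feature sets: |intersection| / |union|.
jaccard <- function(a, b) length(intersect(a, b)) / length(union(a, b))

# Names of the k most important features in one run.
top_k <- function(imp, k = 2) names(sort(imp, decreasing = TRUE))[seq_len(k)]
top_sets <- lapply(seq_len(bs_num), function(j) top_k(boot_imp[, j]))

# Mean Jaccard-index over all pairs of runs.
run_pairs <- combn(bs_num, 2)
mean_jaccard <- mean(apply(run_pairs, 2, function(p) {
  jaccard(top_sets[[p[1]]], top_sets[[p[2]]])
}))

imp_var
mean_jaccard
```

Low variances and a mean Jaccard-index close to 1 both indicate that the ranking is stable under resampling.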
An object of class list, with the following components:
"AUC of LR with all parameters",
"AUC of LR with EFS parameter",
"P-value of LR-ROC test",
"AUC of RF with all parameters",
"AUC of RF with EFS parameter",
"P-value of RF-ROC test",
"P-value of permutation",
"Variances of feature importances",
"Jaccard-index".
Ursula Neumann
glm, roc, prediction, boxplot, tail, t.test
## Loading dataset in environment
data(efsdata)
## Generate a ranking based on importance (with default
## NA_threshold = 0.2, cor_threshold = 0.7)
efs <- ensemble_fs(efsdata, 5, runs = 2)
## Conduct AUC test and permutation test
eval_example <- efs_eval(data = efsdata, efs_table = efs, file_name = 'eval_test',
classnumber = 5, NA_threshold = 0.2,
logreg = TRUE,
rf = FALSE,
permutation = TRUE, p_num = 2,
variances = FALSE, jaccard = FALSE)
## Calculating variances and the Jaccard-index can take several minutes computation time
[1] "default value for NA_threshold = 0.2"
[1] "default value for cor_threshold = 0.7"
[1] "default value for selection is c(TRUE, TRUE, TRUE,TRUE, TRUE, TRUE, FALSE, FALSE)"
[1] "Start Median"
[1] "Start Pearson"
[1] "Start Spearman"
[1] "Start LogReg"
[1] "Start RF"
[1] 1
Time difference of 0.02458167 secs
[1] 2
Time difference of 0.2683663 secs
[1] "Build return matrix"
[1] "Done"
Time difference of 0.3356316 secs
Warning messages:
1: In wilcox.test.default(x, y) : cannot compute exact p-value with ties
2: In wilcox.test.default(x, y) : cannot compute exact p-value with ties
3: In wilcox.test.default(x, y) : cannot compute exact p-value with ties
4: In wilcox.test.default(x, y) : cannot compute exact p-value with ties
5: In wilcox.test.default(x, y) : cannot compute exact p-value with ties
[1] "default value for bs_num = 100"
[1] "default value for bs_percentage = 0.9"