benchmarkQCBA | R Documentation |
Learn multiple rule models using base rule induction algorithms from arulesCBA and apply QCBA to postprocess them.
benchmarkQCBA(
train,
test,
classAtt,
train_disc = NULL,
test_disc = NULL,
cutPoints = NULL,
algs = c("CBA", "CMAR", "CPAR", "PRM", "FOIL2"),
iterations = 2,
rounding_places = 3,
return_models = FALSE,
debug_prints = FALSE,
...
)
train |
data frame with training data |
test |
data frame with testing data before postprocessing |
classAtt |
the name of the class attribute |
train_disc |
prediscretized training data |
test_disc |
prediscretized test data |
cutPoints |
specification of cutpoints applied to the data (ignored if train_disc is NULL) |
algs |
vector with names of baseline rule learning algorithms. Names must correspond to function names from the arulesCBA library |
iterations |
number of times each base learner is executed; repeated runs are used to obtain a more precise estimate of build time |
rounding_places |
statistics in the resulting dataframe will be rounded to the specified number of decimal places |
return_models |
boolean indicating whether the learnt rule lists (baseline and postprocessed) should also be included in the output |
debug_prints |
print debug information such as rule lists |
... |
Parameters for base learners. The name of each argument is the base learner (one of the 'algs' values) and its value is a list of parameters to pass to that learner. To specify parameters for QCBA, use the argument name "QCBA". See also Example 3. |
Outputs a dataframe with evaluation metrics and, if 'return_models==TRUE', also the induced baseline and QCBA models (see also Example 3). Metrics included in the dataframe with statistics:
**accuracy**: percentage of correct predictions in the test set
**rulecount**: number of rules in the rule list. Note that for QCBA the count includes the default rule (rule with empty antecedent), while for base learners this rule may not be included (depending on the base learner)
**modelsize**: total number of conditions in the antecedents of all rules in the model
**buildtime**: time taken to learn the model. For QCBA, this excludes the time for the induction of the base learner
See also qcba(), which this function wraps.
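The 'iterations' argument repeats the base learner to steady the build time estimate. The idea can be sketched in base R as follows. This is only an illustration under assumptions: the helper time_build() is hypothetical, averaging is one plausible way to combine the runs, and the code is not taken from the implementation of benchmarkQCBA.

```r
# Hypothetical helper (not part of qcba): run a model-building function
# several times and return the mean elapsed wall-clock time in seconds.
time_build <- function(build_fn, iterations = 2) {
  elapsed <- numeric(iterations)
  for (i in seq_len(iterations)) {
    start <- proc.time()[["elapsed"]]
    build_fn()
    elapsed[i] <- proc.time()[["elapsed"]] - start
  }
  mean(elapsed)
}

# Stand-in workload; in practice this would be a call that learns a rule model.
avg_time <- time_build(function() sort(runif(1e5)), iterations = 3)
print(round(avg_time, 3))
```

A single run can be distorted by caching or background load, which is why more iterations tend to give a steadier figure at the cost of a longer benchmark.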
# EXAMPLE 1: pass train and test folds, induce multiple base rule learners,
# postprocess each with QCBA and return benchmarking results.
## Not run:
if (identical(Sys.getenv("NOT_CRAN"), "true")) {
# Define input dataset and target variable
df_all <-datasets::iris
classAtt <- "Species"
# Create train/test partition using built-in R functions
tot_rows<-nrow(df_all)
train_proportion<-2/3
df_all <- df_all[sample(tot_rows),]
# round the split point so that row indices are whole numbers
n_train <- round(train_proportion*tot_rows)
trainFold <- df_all[1:n_train,]
testFold <- df_all[(n_train+1):tot_rows,]
# learn with default metaparameter values
stats<-benchmarkQCBA(trainFold,testFold,classAtt)
print(stats)
# print relative change of QCBA results over baseline algorithms
print(stats[,6:10]/stats[,1:5]-1)
}
## End(Not run)
# EXAMPLE 2: As Example 1 but data are discretized externally
# Discretize numerical predictors using built-in discretization
# This performs supervised, entropy-based discretization (Fayyad and Irani, 1993)
# of all numerical predictor variables with 3 or more distinct numerical values
# This example could run for more than 5 seconds
## Not run:
if (identical(Sys.getenv("NOT_CRAN"), "true")) {
discrModel <- discrNumeric(trainFold, classAtt)
train_disc <- as.data.frame(lapply(discrModel$Disc.data, as.factor))
test_disc <- applyCuts(testFold, discrModel$cutp, infinite_bounds=TRUE, labels=TRUE)
stats<-benchmarkQCBA(trainFold,testFold,classAtt,train_disc,test_disc,discrModel$cutp)
print(stats)
}
## End(Not run)
# EXAMPLE 3: pass custom metaparameters to selected base rule learner,
# then postprocess with QCBA, evaluate, and return both models
# This example could run for more than 5 seconds
if (identical(Sys.getenv("NOT_CRAN"), "true")) {
# use only CPAR as the base learner (see 'algs' below), return rule lists.
## Not run:
output<-benchmarkQCBA(trainFold,testFold,classAtt,train_disc,test_disc,discrModel$cutp,
CBA=list("support"=0.05,"confidence"=0.5),algs = c("CPAR"),
return_models=TRUE)
message("Evaluation statistics")
print(output$stats)
message("CPAR model")
inspect(output$CPAR[[1]])
message("QCBA model")
print(output$CPAR_QCBA[[1]])
## End(Not run)
}
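The relative-change computation from Example 1 depends on the column order of the statistics dataframe. The sketch below reproduces it on a small mock dataframe without running any learner. The layout (base learner columns first, then matching "_QCBA" columns) and the column names are assumptions, and all numbers are hypothetical placeholders, not real benchmark output.

```r
# Mock statistics table: hypothetical placeholder values, NOT real results.
# Assumed layout: base learners first, then their QCBA-postprocessed versions.
stats <- data.frame(
  CBA       = c(0.90, 10, 20, 0.05),
  CPAR      = c(0.92, 12, 30, 0.04),
  CBA_QCBA  = c(0.93,  8, 12, 0.50),
  CPAR_QCBA = c(0.94,  9, 18, 0.60),
  row.names = c("accuracy", "rulecount", "modelsize", "buildtime")
)

# Derive the column positions instead of hard-coding 1:5 and 6:10,
# so the computation also works when 'algs' has a different length.
n_base    <- ncol(stats) / 2
base_cols <- seq_len(n_base)
qcba_cols <- n_base + base_cols

# Relative change of QCBA over the corresponding baseline, per metric.
rel_change <- stats[, qcba_cols] / stats[, base_cols] - 1
print(round(rel_change, 3))
```

Negative values for rulecount and modelsize indicate a smaller postprocessed model; positive values for accuracy indicate an improvement.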