DF_CV: Decision Forest algorithm: Model training with...

Description

Decision Forest algorithm: model training with cross-validation. The default is 5-fold cross-validation.

Usage

DF_CV(X, Y, stop_step = 10, CV_fold = 5, Max_tree = 20, min_split = 10,
  cp = 0.1, Filter = F, p_val = 0.05, Method = "bACC", Quiet = T,
  Grace_val = 0.05, imp_accu_val = 0.01, imp_accu_criteria = F)

Arguments

X

Training Dataset

Y

Training data endpoint

stop_step

Number of extra steps to process when performance has not improved; 1 means one extra step (default is 10)

CV_fold

Number of cross-validation folds (default is 5)

Max_tree

Maximum number of trees in the forest (default is 20)

min_split

Minimum leaves in tree nodes (default is 10)

cp

Parameter for pruning the decision tree (default is 0.1)

Filter

If TRUE, perform feature selection before training (default is FALSE)

p_val

P-value threshold, measured by t-test, used in feature selection (default is 0.05)

Method

Metric used for evaluating the training process. MIS: misclassification rate; ACC: accuracy. The default shown in Usage is "bACC".

Quiet

If TRUE (default), do not show any messages during the process

Grace_val

Grace value in evaluation: the next model's performance (Accuracy, bACC, MCC) should be no worse than the previous model's by more than this threshold (default is 0.05)

imp_accu_val

Improvement in evaluation: adding a new tree should improve the overall model performance (Accuracy, bACC, MCC) by at least this threshold (default is 0.01)

imp_accu_criteria

If TRUE, the model must show an improvement in accumulated accuracy (default is FALSE)
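
The evaluation metrics and thresholds named in the arguments above can be sketched in base R. This is a minimal illustration rather than the package's implementation: the helper names binary_metrics, within_grace and improves_enough are hypothetical, "bACC" is assumed to mean balanced accuracy, and the two threshold rules are only one plausible reading of Grace_val and imp_accu_val.

```r
# Illustrative sketch only; all helper names are hypothetical, not Dforest internals.

# Standard binary-classification metrics from true and predicted labels.
binary_metrics <- function(truth, pred, positive) {
  tp <- as.numeric(sum(truth == positive & pred == positive))
  tn <- as.numeric(sum(truth != positive & pred != positive))
  fp <- as.numeric(sum(truth != positive & pred == positive))
  fn <- as.numeric(sum(truth == positive & pred != positive))
  acc   <- (tp + tn) / (tp + tn + fp + fn)
  sens  <- tp / (tp + fn)   # true positive rate
  spec  <- tn / (tn + fp)   # true negative rate
  denom <- sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
  c(ACC  = acc,
    MIS  = 1 - acc,            # misclassification rate
    bACC = (sens + spec) / 2,  # balanced accuracy (assumed meaning of "bACC")
    MCC  = if (denom == 0) 0 else (tp * tn - fp * fn) / denom)
}

# Assumed reading of Grace_val: a new model is acceptable if its performance
# is no more than Grace_val below the previous model's.
within_grace <- function(prev_perf, new_perf, Grace_val = 0.05) {
  new_perf >= prev_perf - Grace_val
}

# Assumed reading of imp_accu_val: overall performance must rise by at least
# imp_accu_val when a tree is added.
improves_enough <- function(prev_overall, new_overall, imp_accu_val = 0.01) {
  (new_overall - prev_overall) >= imp_accu_val
}

# Made-up labels: 3 true positives, 1 false negative, 4 true negatives, 2 false positives
truth <- c("a", "a", "a", "a", "b", "b", "b", "b", "b", "b")
pred  <- c("a", "a", "a", "b", "b", "b", "b", "b", "a", "a")
m <- binary_metrics(truth, pred, positive = "a")
print(round(m, 3))

within_grace(prev_perf = 0.80, new_perf = 0.77)            # TRUE: a drop of 0.03 is within 0.05
improves_enough(prev_overall = 0.80, new_overall = 0.805)  # FALSE: a gain of 0.005 is below 0.01
```

Keeping the two rules as separate predicates mirrors the way the arguments are documented: one tolerance for a single model, one minimum gain for the accumulated forest.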

Value

.$performance: Overall training accuracy (Cross-validation)

.$pred: Detailed training prediction (Cross-validation)

.$detail: Detailed usage of Decision tree Features/Models and their performances in all CVs

.$Method: Evaluation method used in training (passed through)

.$cp: cp value used in training the decision trees (passed through)
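
To make the link between cross-validated predictions and an overall cross-validated accuracy concrete, here is a rough base-R sketch of 5-fold cross-validation. It does not call Dforest: nearest_centroid is a hypothetical stand-in classifier, fold_id, pred and overall_acc are illustrative names, and the way DF_CV partitions the data or aggregates its results may differ.

```r
# Illustrative sketch only: pool out-of-fold predictions, then score them once.
set.seed(42)
X <- iris[, 1:4]
Y <- iris[, 5]
k <- 5

# Randomly assign each sample to one of k folds of near-equal size
fold_id <- sample(rep_len(seq_len(k), nrow(X)))

# Stand-in classifier (NOT a decision forest): predict the class whose
# training centroid is closest in squared Euclidean distance.
nearest_centroid <- function(train_X, train_Y, test_X) {
  centroids <- sapply(split(train_X, train_Y), colMeans)  # features x classes
  d <- apply(as.matrix(test_X), 1,
             function(row) colSums((centroids - row)^2))  # classes x test rows
  colnames(centroids)[apply(d, 2, which.min)]
}

# Each sample is predicted exactly once, by a model that never saw it
pred <- factor(rep(NA, nrow(X)), levels = levels(Y))
for (i in seq_len(k)) {
  test <- fold_id == i
  pred[test] <- nearest_centroid(X[!test, ], Y[!test], X[test, ])
}

overall_acc <- mean(pred == Y)  # accuracy over the pooled predictions
print(table(fold_id))
print(overall_acc)
```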

Examples

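  library(Dforest)

  ## data(iris)
  X = iris[,1:4]               # four numeric features
  Y = iris[,5]                 # species labels (the endpoint)
  names(Y) = rownames(X)       # name each endpoint by its sample

  # Hold out about one third of the samples at random; train on the rest
  random_seq = sample(nrow(X))
  split_rate = 3
  split_sample = suppressWarnings(split(random_seq, 1:split_rate))
  Train_X = X[-random_seq[split_sample[[1]]],]
  Train_Y = Y[-random_seq[split_sample[[1]]]]

  # 5-fold cross-validation with the default settings
  CV_result = DF_CV(Train_X, Train_Y)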
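
The Filter and p_val arguments describe t-test based feature selection before training. As a rough idea of what such a filter could look like, the base-R sketch below keeps the features whose two-sample t-test p-value falls below a threshold. The helper ttest_filter is hypothetical and is not the package's code; because a t-test compares exactly two groups, the sketch uses a two-class subset of iris.

```r
# Illustrative sketch only: a simple t-test feature filter, not Dforest's own.
ttest_filter <- function(X, Y, p_val = 0.05) {
  Y <- factor(Y)
  stopifnot(nlevels(Y) == 2)  # a t-test compares exactly two groups
  # One Welch two-sample t-test per feature column
  p <- vapply(X, function(col) t.test(col ~ Y)$p.value, numeric(1))
  names(p)[p < p_val]
}

# Two-class subset of iris: versicolor vs. virginica
keep <- iris$Species != "setosa"
X2 <- iris[keep, 1:4]
Y2 <- droplevels(iris$Species[keep])

selected <- ttest_filter(X2, Y2, p_val = 0.05)
X2_filtered <- X2[, selected, drop = FALSE]  # keep only the selected features
print(selected)
```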

Dforest documentation built on May 2, 2019, 6:38 a.m.
