DFp1_CV: Decision Forest preferred-1 algorithm: Model training with...
In seldas/Dforest: Decision Forest


Decision Forest algorithm: model training with cross-validation. The default is 5-fold cross-validation.
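As a rough illustration of what k-fold cross-validation means here, the sketch below assigns samples to folds in base R. The helper `make_folds` is hypothetical and is not part of Dforest; the package's own fold assignment may differ.

```r
# Hypothetical helper, not part of Dforest: randomly assign n samples to k folds.
make_folds <- function(n, k = 5) {
  sample(rep(seq_len(k), length.out = n))
}

set.seed(42)                       # arbitrary seed, only for a repeatable illustration
folds <- make_folds(100, k = 5)
print(table(folds))                # number of samples per fold

# In fold i, samples with folds == i are held out; the rest are used for training.
for (i in seq_len(5)) {
  test_idx  <- which(folds == i)
  train_idx <- which(folds != i)
  stopifnot(length(test_idx) + length(train_idx) == 100)
}
```

Each sample is held out exactly once, so every sample receives one cross-validated prediction.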

DFp1_CV(X, Y, CV_fold = 5, stop_step = 4, param_T = 20, param_R = 5,
  param_L = 3, cp = 0.1, Filter = F, p_val = 0.05, Method = "bACC",
  Quiet = T, Grace_ACC = 0.05, imp_ACC_accu = 0.01, Grace_bACC = 0.05,
  imp_bACC_accu = 0.01, Grace_MCC = 0.05, imp_MCC_accu = 0.01,
  Grace_MIS = ceiling(0.05 * length(Y)), imp_MIS_accu = ceiling(0.01 *
  length(Y)))
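Unlike the other thresholds, the defaults for `Grace_MIS` and `imp_MIS_accu` are computed from the number of samples in `Y`. A small sketch of what those defaults evaluate to, assuming a hypothetical endpoint vector with 100 samples:

```r
# Hypothetical endpoint with 100 samples (two classes of 50 each).
Y <- factor(rep(c("a", "b"), each = 50))

# Same expressions as the defaults in the usage above.
Grace_MIS    <- ceiling(0.05 * length(Y))
imp_MIS_accu <- ceiling(0.01 * length(Y))

print(c(Grace_MIS = Grace_MIS, imp_MIS_accu = imp_MIS_accu))
```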

`X`	Training Dataset
`Y`	Training data endpoint
`CV_fold`	Fold of cross-validation (Default = 5)
`stop_step`	Number of extra steps to run when performance does not improve; 1 means one extra step
`param_T`	Parameter T in IDF: maximum number of trees in the forest
`param_R`	Parameter R in IDF: maximum occurrence of features
`param_L`	Parameter L in IDF: minimum leaves in tree nodes
`cp`	Parameter for pruning the decision tree (default = 0.1)
`Filter`	Whether to perform feature selection before training
`p_val`	P-value threshold (from a t-test) used in feature selection (default = 0.05)
`Method`	Metric used to evaluate the training process. MIS: misclassification rate; ACC: accuracy; bACC: balanced accuracy (default); MCC: Matthews correlation coefficient
`Quiet`	If TRUE (default), do not show any messages during the process
`Grace_ACC`	Grace value in evaluation: the next model's performance (accuracy) should be no worse than the previous model's by more than this threshold
`imp_ACC_accu`	Improvement in evaluation: adding a new tree should improve overall model performance (accuracy) by this threshold
`Grace_bACC`	Grace Value in evaluation: (Balanced Accuracy)
`imp_bACC_accu`	improvement in evaluation: (Balanced Accuracy)
`Grace_MCC`	Grace Value in evaluation: (MCC)
`imp_MCC_accu`	improvement in evaluation: (MCC)
`Grace_MIS`	Grace Value in evaluation: (MIS)
`imp_MIS_accu`	improvement in evaluation: (MIS)
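The evaluation metrics named in the arguments above can be sketched for a two-class problem from the confusion-matrix counts. The helper `binary_metrics` is hypothetical and is not part of Dforest. Reporting MIS as a count of misclassified samples is an assumption based on the count-based defaults of `Grace_MIS` and `imp_MIS_accu`.

```r
# Hypothetical helper, not part of Dforest: standard two-class metrics.
binary_metrics <- function(truth, pred, positive) {
  tp <- as.numeric(sum(truth == positive & pred == positive))
  tn <- as.numeric(sum(truth != positive & pred != positive))
  fp <- as.numeric(sum(truth != positive & pred == positive))
  fn <- as.numeric(sum(truth == positive & pred != positive))

  acc   <- (tp + tn) / (tp + tn + fp + fn)            # accuracy
  bacc  <- (tp / (tp + fn) + tn / (tn + fp)) / 2      # mean of sensitivity and specificity
  denom <- sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
  mcc   <- if (denom == 0) 0 else (tp * tn - fp * fn) / denom
  mis   <- fp + fn                                    # assumption: MIS as a count

  c(ACC = acc, bACC = bacc, MCC = mcc, MIS = mis)
}

truth <- c("pos", "pos", "pos", "pos", "neg", "neg", "neg", "neg", "neg", "neg")
pred  <- c("pos", "pos", "pos", "neg", "neg", "neg", "neg", "neg", "pos", "pos")
m <- binary_metrics(truth, pred, positive = "pos")
print(round(m, 3))
```

Balanced accuracy and MCC are less sensitive to class imbalance than plain accuracy, which is one plausible reason for `Method = "bACC"` being the default.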

.$accuracy: Overall training accuracy (Cross-validation)

.$pred: Detailed training prediction (Cross-validation)

.$detail: Detailed usage of Decision tree Features/Models and their performances in all CVs

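  library(Dforest)
  data(iris)
  X = iris[,1:4]
  Y = iris[,5]
  names(Y)=rownames(X)

  # Shuffle the row indices and split them into three groups;
  # hold out the first group and train on the remaining two
  random_seq=sample(nrow(X))
  split_rate=3
  split_sample = suppressWarnings(split(random_seq,1:split_rate))
  Train_X = X[-split_sample[[1]],]
  Train_Y = Y[-split_sample[[1]]]

  # Train with the default 5-fold cross-validation
  CV_result = DFp1_CV(Train_X, Train_Y)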
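The returned components listed under the value section can be inspected as sketched below. This assumes the Dforest package is installed (the block is skipped otherwise). The seed and the simple hold-out split are illustrative choices, and `str()` is used because the exact structure of `$pred` and `$detail` is not described here.

```r
# Runs only if Dforest is available; otherwise nothing happens.
if (requireNamespace("Dforest", quietly = TRUE)) {
  set.seed(1)                        # arbitrary seed for a repeatable split
  X <- iris[, 1:4]
  Y <- iris[, 5]
  names(Y) <- rownames(X)

  holdout <- sample(nrow(X), 50)     # hold out one third of the rows
  CV_result <- Dforest::DFp1_CV(X[-holdout, ], Y[-holdout])

  print(CV_result$accuracy)              # overall cross-validated training accuracy
  str(CV_result$pred)                    # detailed cross-validated predictions
  str(CV_result$detail, max.level = 1)   # per-fold feature/model usage and performance
}
```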