ImbTreeEntropy: Fit a Decision Tree

Usage Arguments Value See Also Examples

View source: R/ImbTreeEntropy.R

Usage

ImbTreeEntropy( Y_name, X_names, data, depth = 5, min_obs = 5, type = "Shannon", entropy_par = 1, 
                cp = 0, n_cores = 1, weights = NULL, cost = NULL, class_th = "equal", 
                overfit = "leafcut", cf = 0.25 )

Arguments

Y_name

Name of the target variable. Character vector of one element.

X_names

Names of the attributes used to model the target (Y_name). Character vector of one or more elements.

data

Data.frame in which to interpret the parameters Y_name and X_names.

depth

The maximum depth of any node in the final tree, with the root node counted as depth 0. Numeric vector of one element which is greater than or equal to 0.

min_obs

The minimum number of observations that must exist in any terminal node (leaf). Numeric vector of one element which is greater than or equal to 1.

type

Method used for learning. Character vector of one element with one of: "Shannon", "Renyi", "Tsallis", "Sharma-Mittal", "Sharma-Taneja", "Kapur".

entropy_par

Numeric vector specifying parameters for the following entropies: "Renyi", "Tsallis", "Sharma-Mittal", "Sharma-Taneja", "Kapur". For "Renyi" and "Tsallis" it is a one-element vector with the q-value. For "Sharma-Mittal", "Sharma-Taneja" and "Kapur" it is a two-element vector with either the q-value and r-value or the alpha-value and beta-value, respectively.
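The one- versus two-element shapes of entropy_par can be sketched as follows; the particular q/r values below are made-up illustrations, not recommended defaults:

```r
# One-element vector: q-value, used by "Renyi" and "Tsallis".
renyi_par <- 0.5

# Two-element vector: (q, r) for "Sharma-Mittal", or
# (alpha, beta) for "Sharma-Taneja" / "Kapur".
mittal_par <- c(0.8, 2)

# Hedged sketch of a call using a Renyi tree on iris; only run
# when the package is actually installed.
if (requireNamespace("ImbTreeEntropy", quietly = TRUE)) {
  data(iris)
  Tree <- ImbTreeEntropy::ImbTreeEntropy(
    Y_name = "Species",
    X_names = colnames(iris)[-ncol(iris)],
    data = iris,
    type = "Renyi",
    entropy_par = renyi_par
  )
}
```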

cp

Complexity parameter, i.e. any split that does not decrease the overall lack of fit by a factor of cp is not attempted. It refers to the misclassification error. If cost or weights are specified, this measure takes those parameters into account. Numeric vector of one element which is greater than or equal to 0.

n_cores

Number of cores used for parallel processing. Numeric vector of one element which is greater than or equal to 1.

weights

Numeric vector of case weights. It should have as many elements as the number of observations in the data.frame passed to the data parameter.
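One common (hypothetical) way to build such a vector for an imbalanced target is inverse class-frequency weighting, shown here on iris in base R:

```r
# One weight per row of the data passed to `data`; rows from rarer
# classes receive larger weights. This is an illustrative choice,
# not a recipe prescribed by the package.
data(iris)
class_freq <- table(iris$Species)
case_weights <- as.numeric(1 / class_freq[iris$Species])

# length(case_weights) must equal nrow(iris) before passing it
# to the weights parameter.
```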

cost

Matrix of costs associated with the possible errors. The matrix should have k rows and k columns, where k is the number of class levels. Rows contain true classes while columns contain predicted classes. Row and column names should take all possible categories (labels) of the target variable.
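A minimal sketch of such a k x k matrix for the three iris classes; the cost values themselves are invented for illustration:

```r
# Rows = true classes, columns = predicted classes, with dimnames
# covering every level of the target, as the cost parameter requires.
data(iris)
labs <- levels(iris$Species)
cost_mat <- matrix(1, nrow = length(labs), ncol = length(labs),
                   dimnames = list(labs, labs))
diag(cost_mat) <- 0                       # correct predictions cost nothing
cost_mat["virginica", "versicolor"] <- 5  # penalise this confusion more
```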

class_th

Method used for determining the thresholds from which the final class for each node is derived. If cost is specified it can take one of: "theoretical", "tuned"; otherwise it takes "equal". Character vector of one element.

overfit

Character vector of one element with one of: "none", "leafcut", "prune", "avoid", specifying which method of overcoming overfitting should be used. The "leafcut" method is applied after the full tree is built; it collapses a subtree when both siblings choose the same class label. The "avoid" method is incorporated during the recursive partitioning; it prohibits a split when both siblings would choose the same class. The "prune" method employs a pessimistic error pruning procedure and should be specified along with the cf parameter.

cf

Numeric vector of one element with a number in (0, 1) for the optional pessimistic-error-rate-based pruning step.

Value

A fitted model/object of class Node (R6). See data.tree.

See Also

ImbTreeEntropy, ImbTreeEntropyInter, PredictTree, PrintTree, PrintTreeInter, ExtractRules

Examples

library("ImbTreeEntropy")
data(iris)
Tree <- ImbTreeEntropy(Y_name = "Species", 
                       X_names = colnames(iris)[-ncol(iris)], 
                       data = iris)
PrintTree(Tree)

KrzyGajow/ImbTreeEntropy documentation built on Dec. 31, 2020, 2:13 p.m.