classification_tree: Build a classification tree

Description

classification_tree builds a classification tree, using Information Gain (IG) as the splitting criterion.
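As a rough illustration of the splitting criterion, the sketch below computes Information Gain in base R as the entropy of the target minus the size-weighted entropy of the subsets produced by an attribute. The helper names entropy and information_gain, the toy vectors, and the use of base-2 logarithms are assumptions made for this example; they are not taken from the package and may differ from its internals.

```r
# Shannon entropy (in bits) of a vector of class labels.
# Assumption: base-2 logarithm; the package may use another base.
entropy <- function(labels) {
  p <- table(labels) / length(labels)
  p <- p[p > 0]                 # drop empty classes so 0 * log2(0) never occurs
  -sum(p * log2(p))
}

# Information Gain of splitting `labels` by the values of `attribute`:
# parent entropy minus the size-weighted entropy of each child subset.
information_gain <- function(labels, attribute) {
  groups <- split(labels, attribute, drop = TRUE)
  weights <- lengths(groups) / length(labels)
  child_entropy <- vapply(groups, entropy, numeric(1))
  entropy(labels) - sum(weights * child_entropy)
}

# Toy data, made up for illustration.
play    <- c("yes", "yes", "no", "no")
outlook <- c("sunny", "sunny", "rainy", "rainy")  # separates the classes perfectly
windy   <- c(TRUE, FALSE, TRUE, FALSE)            # carries no information about play

information_gain(play, outlook)  # 1 bit
information_gain(play, windy)    # 0 bits
```

An attribute with IG = 0, such as windy here, does not improve class purity, which is why a node can have a splitting attribute assigned and still remain unsplit.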


Usage

classification_tree(df, target, nxt = "seq", stopc = 1000)



Arguments

df: the data frame containing the dataset

target: the target variable, specified either as a character string or as the column index in df

nxt: the method of growing the tree, which must be one of two values: "seq" (sequentially, level by level) or "ig" (by highest IG, node by node)

stopc: the value at which tree building should stop: the number of levels when nxt = "seq", or the number of nodes when nxt = "ig"
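The calls below sketch how the arguments might be combined. The weather data frame is made up for illustration, storing the attributes as factors is an assumption (the accepted column types are not stated here), and the character form of target is assumed to be the column name. The calls are guarded so that the snippet does nothing unless classification_tree is already available in the session.

```r
# Toy dataset, made up for illustration.
# Assumption: categorical attributes are stored as factors.
weather <- data.frame(
  outlook = c("sunny", "sunny", "rainy", "rainy", "overcast", "overcast"),
  windy   = c("yes", "no", "yes", "no", "yes", "no"),
  play    = c("no", "yes", "no", "no", "yes", "yes"),
  stringsAsFactors = TRUE
)

# Only run the calls if the function has been loaded from its package.
if (exists("classification_tree")) {
  # Grow level by level and stop after 3 levels; target given as a string.
  tree_seq <- classification_tree(weather, "play", nxt = "seq", stopc = 3)

  # Grow node by node by highest IG and stop at 5 nodes; target given by index.
  tree_ig <- classification_tree(weather, 3, nxt = "ig", stopc = 5)

  print(tree_seq)
  print(tree_ig)
}
```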


Value

A data frame containing a row for each tree node, with the following columns:

Parent: row index of parent node

Value: the value of the parent's splitting attribute that defines this node

Attrib: the splitting attribute for this node

Level: the level of the node (1 for the root node, 2 for the root node's child nodes, and so on)

IG: the Information Gain achieved by the splitting attribute at this node

Split: a logical value indicating whether the node has been split or not (some nodes may have a splitting attribute assigned but have IG=0, resulting in them not being split)

P_<cls>: the proportion of target class <cls> in the subset represented by the node, with one such column for each target class
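To show how the columns above fit together, the sketch below works on a hand-written mock of the returned data frame; it is not real output of classification_tree. The NA values used for the root's Parent and Value and for the leaves' Attrib and IG are assumptions, and path_to_node is a hypothetical helper that follows Parent links back to the root to recover the conditions defining a node.

```r
# Hand-written mock of the documented return structure (not real output).
# Assumption: NA marks "no parent" for the root and "no attribute" for leaves.
tree <- data.frame(
  Parent = c(NA, 1, 1, 1, 2, 2),
  Value  = c(NA, "sunny", "rainy", "overcast", "yes", "no"),
  Attrib = c("outlook", "windy", "windy", "windy", NA, NA),
  Level  = c(1, 2, 2, 2, 3, 3),
  IG     = c(0.667, 1, 0, 0, NA, NA),
  Split  = c(TRUE, TRUE, FALSE, FALSE, FALSE, FALSE),
  P_no   = c(0.5, 0.5, 1, 0, 1, 0),
  P_yes  = c(0.5, 0.5, 0, 1, 0, 1),
  stringsAsFactors = FALSE
)

# Walk from a node up to the root via Parent, collecting one
# "attribute = value" condition per step (hypothetical helper).
path_to_node <- function(tree, node) {
  conditions <- character(0)
  while (!is.na(tree$Parent[node])) {
    parent <- tree$Parent[node]
    conditions <- c(paste(tree$Attrib[parent], "=", tree$Value[node]), conditions)
    node <- parent
  }
  conditions
}

# Nodes that were not split act as leaves.
leaves <- which(!tree$Split)

# Most probable class per node, read from the P_<cls> columns
# (ties go to the first class column).
class_cols <- grep("^P_", names(tree), value = TRUE)
best <- max.col(as.matrix(tree[class_cols]), ties.method = "first")
majority <- sub("^P_", "", class_cols[best])

path_to_node(tree, 5)   # "outlook = sunny" "windy = yes"
majority[5]             # "no"
```

Nodes 3 and 4 in the mock have a splitting attribute assigned but IG = 0, so Split is FALSE and they are treated as leaves alongside nodes 5 and 6.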
