View source: R/criteria.after.split.calculator.R
Calculates the Entropy or Gini Index of a particular node after a particular split;
this function is called within the construct.treeRK function.
The argument split.record is a kidids_split object from the package partykit;
the method kidids_split splits the data according to the criteria specified by
a user ahead of time, and returns a vector storing the index of the split group
(group "1" or "2") that each observation from the original data in question
belongs to after the split has occurred.
For more information about the function, please see the partykit
documentation.
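As a rough illustration of the kind of vector described above, the following base-R sketch imitates a binary split record without calling partykit. The data values and the names x, cut.point and split.record.sketch are hypothetical, and the right-closed rule (values at or below the break go to group 1) is an assumption that corresponds to right = TRUE in partysplit.

```r
# Base-R imitation of a binary split record; this is NOT a call into partykit.
x <- c(4.9, 5.1, 6.3, 5.8, 4.7)   # one numeric covariate (made-up values)
cut.point <- 5.1                   # hypothetical break point

# Assumed rule: observations with x <= cut.point fall in group 1,
# all remaining observations fall in group 2.
split.record.sketch <- ifelse(x <= cut.point, 1L, 2L)

# One group index per observation, in the original row order
split.record.sketch
```

The result has the same length as the data, so it can be used directly to partition the class labels of the node.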
criteria.after.split.calculator(x.node = data.frame(), y.new.node = c(),
                                split.record = kidids_split(),
                                entropy = TRUE)
x.node | numericized data frame of covariates (obtained via
y.new.node | numericized class type of each observation from a particular node that is to be split;
split.record | output of the
entropy |
The value of Entropy or Gini Index of a particular node after a particular split.
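To make the returned quantity concrete, here is a minimal sketch of one common way to compute a post-split criterion: the impurity of each child group, weighted by the group's share of the observations. This is not taken from the forestRK source; the helper names node.impurity and impurity.after.split, the toy data, and the use of a base-2 logarithm for Entropy are all assumptions made for illustration.

```r
# Impurity of a single group of class labels (hypothetical helper).
# entropy = TRUE  -> Shannon entropy (base 2, an assumption)
# entropy = FALSE -> Gini index
node.impurity <- function(y, entropy = TRUE) {
  p <- as.numeric(table(y)) / length(y)  # class proportions
  p <- p[p > 0]                          # drop empty classes so log2 is defined
  if (entropy) -sum(p * log2(p)) else 1 - sum(p^2)
}

# Size-weighted average of the child impurities (hypothetical helper).
# 'groups' plays the role of a split record: one group index per observation.
impurity.after.split <- function(y, groups, entropy = TRUE) {
  children <- split(y, groups)
  weights <- lengths(children) / length(y)
  child.values <- vapply(children, node.impurity, numeric(1), entropy = entropy)
  sum(weights * child.values)
}

# Toy data: six observations, two classes, two child groups
y <- c(1, 1, 2, 2, 2, 1)
groups <- c(1, 1, 1, 2, 2, 2)

impurity.after.split(y, groups, entropy = TRUE)   # Entropy after the split
impurity.after.split(y, groups, entropy = FALSE)  # Gini index after the split
```

Under this definition a split that separates the classes perfectly yields 0 for both criteria, and lower values indicate purer child groups.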
Hyunjin Cho, h56cho@uwaterloo.ca Rebecca Su, y57su@uwaterloo.ca
## example: iris dataset
library(forestRK) # load the package forestRK
library(partykit)
# covariates of training data set
x.train <- x.organizer(iris[,1:4], encoding = "num")[c(1:25,51:75,101:125),]
# numericized class types of observations of training dataset
y.train <- y.organizer(iris[c(1:25,51:75,101:125),5])$y.new
## criteria.after.split.calculator() example in the implementation
## of the forestRK algorithm
ent.status <- TRUE
# number.of.columns.of.x.node
# = total number of covariates that we consider
number.of.columns.of.x.node <- dim(x.train)[2]
# m.try = the randomly chosen number of covariates that we consider
# at the time of split
m.try <- sample(1:(number.of.columns.of.x.node),1)
## sample m.try number of covariates from the list of all covariates
K <- sample(1:(number.of.columns.of.x.node), m.try)
# split the data
# (the choice of the type of split used here is only arbitrary)
# for more information about kidids_split,
# please refer to the documentation for the package 'partykit'
sp <- partysplit(varid = K[1], breaks = x.train[1, K[1]], index = NULL,
                 right = TRUE, prob = NULL, info = NULL)
split.record <- kidids_split(sp, data=x.train)
# apply criteria.after.split.calculator to the kidids_split object
criteria.after.split <- criteria.after.split.calculator(x.train,
                                y.train, split.record, ent.status)
criteria.after.split