Makes predictions on the observations in the test dataset based on the
rktree model constructed from the training dataset.
Please be aware that, at the end of the pred.treeRK function, the test data
points in prediction.df are re-ordered by increasing original index number
(the original rownames) of those test observations. So if you shuffled the
data before separating them into a training and a test set, the order in which
the data points appear in the data frame prediction.df may not be the same as
your shuffled order.
Users of this function may be interested in identifying the original name of
the numericized predicted class type shown in the last column of the data frame
prediction.df. This can easily be done by extracting the attribute
y.factor.levels from the y.organizer object. For example, if prediction.df
indicates that the predicted class type of the 1st test observation is "2",
then the actual name of the predicted class for that observation is the 2nd
element of y.organizer.object$y.factor.levels, which is obtained during the
data cleaning phase.
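The lookup and the row ordering described above can be sketched in base R. This is only an illustration under stated assumptions: no forestRK function is called, `y.factor.levels` is a stand-in for `y.organizer.object$y.factor.levels` (assumed here to be a character vector of class names), and `prediction.df`, `predicted.class`, and `shuffled.rows` are hypothetical objects built by hand to mimic the documented layout.

```r
# Stand-in for y.organizer.object$y.factor.levels (assumption: a character
# vector of the original class names).
y.factor.levels <- levels(iris$Species)

# Suppose the test set was shuffled so that its rows appeared in this order.
shuffled.rows <- c("120", "3", "75")

# Hypothetical stand-in for prediction.df: rows sorted by increasing original
# row name, with the numericized predicted class in the last column.
prediction.df <- iris[c(3, 75, 120), 1:4]
prediction.df$predicted.class <- c(1, 2, 3)

# Translate the numericized predictions back to the original class names.
predicted.index <- prediction.df[[ncol(prediction.df)]]
predicted.label <- y.factor.levels[predicted.index]

# Recover the shuffled order by indexing on the original row names.
prediction.shuffled <- prediction.df[shuffled.rows, ]
```

Indexing by row name rather than by position keeps each prediction attached to the observation it belongs to, whatever order the test set was passed in.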
The pred.treeRK function makes use of the list of hierarchical flags generated
by the construct.treeRK function; it uses this list as a guide to how it should
split the test set to make predictions. pred.treeRK also generates a list of
hierarchical flags of its own as it splits the test set, and at the end it
tries to match the list it generated with the list from the construct.treeRK
function. If the two lists match exactly, this is a good sign, since it implies
that the test set was split in a manner consistent with how the training set
was split when the rkTree in question was built. If there is any difference
between the two lists, however, it signals that the test set was split
differently from the training set; if such a mismatch occurs, the pred.treeRK
function will stop and throw an error. For more information about the
hierarchical flags of a rkTree, please see the construct.treeRK section of
this documentation.
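The consistency check described above might look roughly like the following base R sketch. It is not the package's implementation: `check.flags` is a hypothetical helper, and the flag values are made up for illustration (the real flag format is documented under construct.treeRK).

```r
# Hypothetical helper, not part of forestRK: stop with an error when the flags
# produced while splitting the test set differ from the tree's own flags.
check.flags <- function(flag.train, flag.test) {
  if (!identical(flag.train, flag.test)) {
    stop("hierarchical flags from the test-set split do not match the tree's flags")
  }
  invisible(TRUE)
}

# Made-up flag vectors, purely for illustration.
flag.train    <- c("r", "rx", "ry")
flag.test.ok  <- c("r", "rx", "ry")
flag.test.bad <- c("r", "rx", "rxx")

# Matching flags: the check passes silently.
check.flags(flag.train, flag.test.ok)

# Mismatching flags: the check stops; the error message is captured here.
result <- tryCatch(
  check.flags(flag.train, flag.test.bad),
  error = function(e) conditionMessage(e)
)
```

Using `identical()` makes the comparison strict, so a difference in length, order, or any single flag counts as a mismatch.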
a numericized data frame of covariates of the test observations
or the observations that we want to make predictions for (obtained via
A list containing the following items:
a data frame of test observations. If
the hierarchical flag of splits performed on the test set by applying the
## example: iris dataset
## load the forestRK package
library(forestRK)

## numericize the data
x.train <- x.organizer(iris[,1:4], encoding = "num")[c(1:25,51:75,101:125),]
x.test <- x.organizer(iris[,1:4], encoding = "num")[c(26:50,76:100,126:150),]
y.train <- y.organizer(iris[c(1:25,51:75,101:125),5])$y.new

## Construct a tree
# min.num.obs.end.node.tree is set to 5 by default;
# entropy is set to TRUE by default
tree.entropy <- construct.treeRK(x.train, y.train)
tree.gini <- construct.treeRK(x.train, y.train,
                              min.num.obs.end.node.tree = 6, entropy = FALSE)

## Make predictions on the test set based on the constructed rktree model
# last column of prediction.df stores predicted class on the test observations
# based on a given rktree
prediction.df <- pred.treeRK(X = x.test, tree.entropy)$prediction.df
flag.pred <- pred.treeRK(X = x.test, tree.entropy)$flag.pred