Numericizes a vector containing the categorical class type of each (training) data point.
NOTE: In order to use other functions within the forestRK package,
you must ensure that the original vector of class types
contains no missing records (NaN); that is,
you have to remove any record containing NaN
prior to applying the y.organizer function.
The data cleaning process is summarized as follows:
1. remove all NaN's from the dataset;
2. split the dataset into a data frame that contains the
covariates of ALL observations (BOTH training and test observations),
and a vector that contains the class types of the training observations;
3. apply the y.organizer function to the vector that contains the class type
of each training observation.
PROPER DATA CLEANING IS NECESSARY FOR THE forestRK FUNCTIONS TO WORK!
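
The three cleaning steps above can be sketched in base R as follows. This is only an illustrative outline on the built-in iris data, not the package's prescribed workflow: the object names (dat.clean, train.idx, x.all, y.train.raw) and the choice of training rows are assumptions made for the example, and the call to y.organizer is skipped when forestRK is not installed.

```r
## Illustrative sketch only; object names are hypothetical.
dat <- iris

## 1. remove every record that has a missing value
##    (complete.cases() treats both NA and NaN as missing)
dat.clean <- dat[complete.cases(dat), ]

## 2. covariates of ALL observations, and the class types of the
##    TRAINING observations only (training rows chosen for illustration)
train.idx <- c(1:25, 51:75, 101:125)
x.all <- dat.clean[, 1:4]
y.train.raw <- as.vector(dat.clean[train.idx, 5])

## 3. numericize the training class types, if forestRK is available
if (requireNamespace("forestRK", quietly = TRUE)) {
  y.train <- forestRK::y.organizer(y.train.raw)
}
```

Checking anyNA(y.train.raw) before step 3 is a cheap way to confirm that no missing class label slipped through.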
A vector containing the class type of each observation
from the dataset on which we want to build our
rktree models (the training dataset).
A list containing the following items:
y.new: a vector containing the numericized class type of each observation in the dataset from which our rktree models are generated (these are typically the observations from the training set).
y.factor.levels: a vector storing the original names of the numericized class types.
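
To make the idea of a numericized vector paired with its original class names concrete, here is a minimal base R sketch. It is not the implementation of y.organizer, and the integer codes that y.organizer assigns may differ; the names y.raw, codes and class.names are hypothetical.

```r
## Illustration of numericizing class labels with base R factors.
## This is NOT the forestRK implementation.
y.raw <- c("setosa", "virginica", "setosa", "versicolor")

f <- factor(y.raw)           # levels are sorted: setosa, versicolor, virginica
codes <- as.integer(f)       # integer code of each observation
class.names <- levels(f)     # original class names, indexed by code

codes                        # 1 3 1 2
class.names[codes]           # maps the codes back to the original labels
```

Keeping the class names next to the codes is what allows numeric predictions to be translated back into readable class labels later.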
## example: iris dataset
## load the package forestRK
library(forestRK)

## Basic Procedures:
## 1. Extract the portion of the data that stores class type of each
##    TRAINING observation, and make it as a vector
## 2. apply y.organizer function to the vector obtained from 1
y.train <- y.organizer(as.vector(iris[c(1:25,51:75,101:125),5]))

## retrieves the original names of each class type, if the class names
## were originally non-numeric
y.train$y.factor.levels

## retrieves the numericized vector that stores classification category
y.train$y.new