y.organizer: Numericize the vector containing categorical class type('y')...


View source: R/y.organizer.R

Description

Numericizes a vector containing the categorical class type of each (training) data point.

NOTE: In order to use other functions within the forestRK package, you must ensure that the original vector of class types y contains no missing records (NA, NaN); that is, you must remove any record containing NA or NaN before applying the y.organizer function.

The data cleaning process with y.organizer() is summarized as follows:

1. Remove all NA and NaN values from the dataset in hand.
2. Split the dataset into a data frame that contains the covariates of ALL observations (BOTH training and test observations) and a vector that contains the class types of the training observations.
3. Apply y.organizer to the vector that contains the class type of each training observation.
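The cleaning steps above can be sketched in base R as follows. This is only an illustration under stated assumptions: the injected NA, the choice of training rows, and the names dat.clean, train.idx, x.all and y.train.raw are hypothetical and do not come from the package. The final call to y.organizer runs only if forestRK is installed.

```r
## Hypothetical sketch of the cleaning steps, using base R and the iris data
dat <- iris
dat[3, 1] <- NA                        # inject a missing value for illustration

## 1. remove every record that contains NA or NaN
dat.clean <- dat[complete.cases(dat), ]

## 2. separate the covariates of all observations from the class types
##    of the training observations (training rows are an assumed choice)
train.idx <- c(1:25, 51:75, 101:125)
x.all <- dat.clean[, 1:4]
y.train.raw <- as.vector(dat.clean[train.idx, 5])

## 3. apply y.organizer to the vector of training class types
if (requireNamespace("forestRK", quietly = TRUE)) {
  y.train <- forestRK::y.organizer(y.train.raw)
}
```

Removing incomplete records before splitting keeps the covariate data frame and the class-type vector aligned on the same set of observations.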

PROPER DATA CLEANING IS NECESSARY FOR THE forestRK FUNCTIONS TO WORK!

Usage

 y.organizer(y = c())

Arguments

y

a vector containing the class type of each observation from the dataset on which we want to build our rktree models (the training dataset); y should contain no NA or NaN.

Value

A list containing the following items:

y.new

a vector containing the numericized class type of each observation from the dataset from which our rktree models are generated (typically the observations from the training set).

y.factor.levels

a vector storing the original names of the numericized class types.
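To give a feel for the shape of the returned list, here is a rough base-R analogue built with factor(). It is not the forestRK implementation: whether y.organizer assigns the same numeric codes or orders the class names the same way is an assumption, and the names y, f and sketch are hypothetical.

```r
## Hypothetical base-R analogue of the returned list; not forestRK's own code
y <- c("setosa", "virginica", "setosa", "versicolor")
f <- factor(y)

sketch <- list(
  y.new           = as.numeric(f),  # one numeric code per observation
  y.factor.levels = levels(f)       # original class names
)

sketch$y.new
sketch$y.factor.levels
```

Keeping the original names next to the numeric codes is what lets you translate numeric results back into readable class labels later.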

Author(s)

Hyunjin Cho, h56cho@uwaterloo.ca Rebecca Su, y57su@uwaterloo.ca

See Also

x.organizer

Examples

  ## example: iris dataset
  ## load the package forestRK
  library(forestRK)

  ## Basic Procedures:
  ## 1. Extract the portion of the data that stores the class type of each
  ##    TRAINING observation, and convert it to a vector
  ## 2. Apply the y.organizer function to the vector obtained in step 1

  y.train <- y.organizer(as.vector(iris[c(1:25,51:75,101:125),5]))
  ## retrieves the original names of each class type, if the class names
  ## were originally non-numeric
  y.train$y.factor.levels
  ## retrieves the numericized vector that stores classification category
  y.train$y.new

forestRK documentation built on July 19, 2019, 5:04 p.m.