library(tree.bins)
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "README-"
)
The 'tree.bins' package lets users recategorize categorical variables with respect to a response variable. It does so by iteratively fitting a decision tree to each categorical variable (class factor) and the selected response variable, using the rpart() function from the 'rpart' package. The rules at the leaves of each tree are extracted and used to recategorize (bin) the corresponding categorical variable (predictor). This step is performed for every categorical variable passed in the data argument of the function. Only variables with more than 2 factor levels are considered. The output is a data set containing the recategorized variables and/or a list containing a mapping table for each candidate variable. For more details, see Dr. Yan-yan Song's article (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4466856/) or T. Hastie et al. (2009, ISBN: 978-0-387-84857-0). For detailed examples and functionality, see the vignettes.
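The binning idea described above can be sketched for a single factor with rpart alone. This is a minimal illustration, not the package's internal implementation. It assumes the built-in chickwts data set as a stand-in, reads leaf membership from the fitted tree's `where` component instead of parsing leaf rules, and uses hypothetical `Group.` bin labels.

```r
library(rpart)  # recommended package that ships with R

# Toy data (assumption for illustration): chick weight by feed type,
# a factor with 6 levels, from the base R datasets package.
df <- datasets::chickwts

# Fit a regression tree using the single categorical predictor.
fit <- rpart(weight ~ feed, data = df, method = "anova")

# fit$where holds, for each row, the index in fit$frame of the leaf it
# falls into. With one categorical predictor, every level lands in
# exactly one leaf, so level -> leaf is a well-defined lookup table.
lkup <- unique(data.frame(feed = df$feed, leaf = fit$where))

# Label each leaf as a bin (hypothetical "Group." naming) and tidy up.
lkup$bin <- paste0("Group.", as.integer(factor(lkup$leaf)))
lkup <- lkup[order(lkup$bin, lkup$feed), c("feed", "bin")]

# Recategorize the original factor by matching against the lookup table.
df$feed.binned <- factor(lkup$bin[match(df$feed, lkup$feed)])

print(lkup, row.names = FALSE)
```

The lookup table here plays the same role as the mapping tables the package returns: once it exists, it can be matched against any data set that contains the same factor levels.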
You can install tree.bins with:
# Easiest way to install tree.bins is by:
install.packages("tree.bins")

# Alternatively, the development version from GitHub:
# install.packages("devtools")
devtools::install_github("pikos90/tree.bins")
Use tree.bins() to recategorize your data.
## basic example code
sample.df <- AmesImpFctrs[, c("Neighborhood", "MS.Zoning", "SalePrice")]
recategorized.df <- tree.bins(data = sample.df, y = SalePrice)
head(recategorized.df)
Use tree.bins() to create a list of mapping tables.
## basic example code
sample.df <- AmesImpFctrs[, c("Neighborhood", "MS.Zoning", "SalePrice")]
recategorized.list <- tree.bins(data = sample.df, y = SalePrice, return = "lkup.list")
head(recategorized.list[[1]])
Use that list to recategorize a different data set with bin.oth().
other.sample.df <- AmesImpFctrs[, c("Neighborhood", "MS.Zoning", "Sale.Condition", "SalePrice")]
other.df <- bin.oth(list = recategorized.list, data = other.sample.df)
head(other.df)