autoBin.binary | R Documentation
Automatically compute optimal cutting points (based on mutual information) to dichotomize quantitative variables. This function can be used as a pre-processing step before using the CASMI-based functions.
autoBin.binary(data, index)
data | A data frame with variables as columns and observations as rows. The outcome variable (Y) MUST be categorical or discrete, and it MUST be the last column.
index | A single index or a vector of indices identifying the quantitative features (also known as predictors, factors, or independent variables) to be automatically dichotomized.
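The two requirements above (outcome in the last column, `index` pointing at quantitative columns) can be checked before calling the function. The following is a minimal base-R sketch; `prepare_for_autobin()` is a hypothetical helper written for illustration and is not part of the package.

```r
# Hypothetical helper (not part of the package): move the outcome column
# to the end and collect the indices of the numeric predictor columns.
prepare_for_autobin <- function(df, outcome) {
  stopifnot(is.data.frame(df), outcome %in% names(df))
  predictors <- setdiff(names(df), outcome)
  # Reorder so that the outcome (Y) is the last column.
  df <- df[, c(predictors, outcome), drop = FALSE]
  # Predictors come first, so their positions match the column indices of df.
  numeric_idx <- which(vapply(df[predictors], is.numeric, logical(1)))
  list(data = df, index = unname(numeric_idx))
}

prep <- prepare_for_autobin(iris, "Species")
prep$index
# With the package that provides autoBin.binary() loaded, one could then run:
# newData <- autoBin.binary(prep$data, prep$index)
```

The helper only reorders columns and selects numeric ones; whether a numeric column is worth dichotomizing remains a judgment call for the analyst.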
'autoBin.binary()' returns the entire data frame after automatically dichotomizing the selected quantitative variable(s).
## Using the "iris" dataset embedded in R
data("iris")
head(iris) # The original data
# ---- Dichotomize One Single Feature ----
# Dichotomize the column with index 1.
newData1 <- autoBin.binary(iris, 1)
head(newData1)
# ---- Dichotomize Multiple Features at a Time ----
# Dichotomize the columns with indices 1, 2, 3, and 4.
newData2 <- autoBin.binary(iris, c(1,2,3,4))
head(newData2)
# ---- Dichotomize Features Using Column Names ----
# Dichotomize the columns with the names "Sepal.Length" and "Sepal.Width".
cols_of_interest <- c("Sepal.Length", "Sepal.Width")
col_indices <- which(names(iris) %in% cols_of_interest)
newData3 <- autoBin.binary(iris, col_indices)
head(newData3)
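To give an intuition for what a mutual-information-based cut point is, here is a minimal base-R sketch. It is not the implementation used by 'autoBin.binary()', whose estimator and search strategy may differ. It assumes a simple plug-in estimate of mutual information and an exhaustive search over midpoints between distinct values; `mutual_info()` and `best_binary_cut()` are hypothetical names introduced for illustration.

```r
# Plug-in estimate of mutual information (in nats) between two discrete
# vectors. Assumption: this simple estimator is for illustration only.
mutual_info <- function(x, y) {
  counts <- unclass(table(x, y))
  joint <- counts / sum(counts)
  expected <- outer(rowSums(joint), colSums(joint))
  nz <- joint > 0  # skip empty cells to avoid log(0)
  sum(joint[nz] * log(joint[nz] / expected[nz]))
}

# Try every midpoint between consecutive distinct values of x as a
# threshold and keep the one whose split shares the most information with y.
best_binary_cut <- function(x, y) {
  vals <- sort(unique(x))
  stopifnot(length(vals) >= 2)
  cuts <- (head(vals, -1) + tail(vals, -1)) / 2
  mi <- vapply(cuts, function(thr) mutual_info(x > thr, y), numeric(1))
  cuts[which.max(mi)]
}

threshold <- best_binary_cut(iris$Petal.Length, iris$Species)
table(above_cut = iris$Petal.Length > threshold, Species = iris$Species)
```

The exhaustive search is quadratic in the number of distinct values at worst, which is fine for small data sets such as "iris" but may be slow on large ones.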