autoBin.binary: Automatically Dichotomize Quantitative Variables
In CASMI: 'CASMI'-Based Functions

Description

Automatically compute optimal cutting points (based on mutual information) to dichotomize quantitative variables. This function can be used as a pre-processing step before using the CASMI-based functions.
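
To make the idea of a mutual-information-based cut point concrete, here is a minimal base-R sketch. It is an illustration only, not the package's actual implementation: the helpers `mutual_info()` and `best_cut()` are hypothetical names, the plug-in (empirical) estimator and the exhaustive midpoint search are assumptions, and CASMI may use a different estimator or search strategy.

```r
# Plug-in (empirical) mutual information between two discrete vectors, in nats.
# Hypothetical helper, not part of CASMI.
mutual_info <- function(x, y) {
  joint <- table(x, y) / length(x)   # joint relative frequencies
  px <- rowSums(joint)               # marginal distribution of x
  py <- colSums(joint)               # marginal distribution of y
  expected <- outer(px, py)          # product of the marginals
  nz <- joint > 0                    # skip empty cells (0 * log 0 is taken as 0)
  sum(joint[nz] * log(joint[nz] / expected[nz]))
}

# Hypothetical helper: try every midpoint between consecutive distinct values
# of x and keep the threshold whose two-way split shares the most information
# with the discrete outcome y.
best_cut <- function(x, y) {
  v <- sort(unique(x))
  candidates <- (head(v, -1) + tail(v, -1)) / 2
  scores <- vapply(
    candidates,
    function(threshold) mutual_info(x > threshold, y),
    numeric(1)
  )
  candidates[which.max(scores)]
}

data("iris")
cut_point <- best_cut(iris$Petal.Length, iris$Species)
binned <- ifelse(iris$Petal.Length > cut_point, "high", "low")
table(binned, iris$Species)   # cross-tabulate the two bins against the outcome
```

The exhaustive search is quadratic in the number of distinct values, which is acceptable for a sketch on small data but would need a smarter strategy for large columns.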

Usage

autoBin.binary(data, index)

Arguments

`data`	data frame with variables as columns and observations as rows. The outcome variable (Y) MUST be categorical or discrete. The outcome variable (Y) MUST be the last column.
`index`	index or a vector of indices of the quantitative features (a.k.a., predictors, factors, independent variables) that need to be automatically dichotomized.
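
When a data frame does not already meet these requirements, it can be rearranged first. The following base-R sketch assumes the built-in `ToothGrowth` dataset with `supp` treated as the outcome purely for illustration; the variable names `outcome`, `predictors`, `prepared`, and `index` are hypothetical.

```r
data("ToothGrowth")                  # columns: len, supp, dose

outcome <- "supp"                    # categorical outcome, chosen for illustration
predictors <- setdiff(names(ToothGrowth), outcome)

# Move the outcome to the last column and make sure it is categorical.
prepared <- ToothGrowth[, c(predictors, outcome)]
prepared[[outcome]] <- factor(prepared[[outcome]])

# Column indices of the quantitative predictors (the outcome is excluded).
index <- unname(which(vapply(prepared[predictors], is.numeric, logical(1))))

str(prepared)
index
```

With CASMI installed, `prepared` and `index` could then be passed on as `autoBin.binary(prepared, index)`.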

Value

'autoBin.binary()' returns the entire data frame after automatically dichotomizing the selected quantitative variable(s).

Examples

## Using the "iris" dataset embedded in R
library(CASMI) # Load the package that provides autoBin.binary()
data("iris")
head(iris) # The original data

# ---- Dichotomize One Single Feature ----
# Dichotomize the column with index 1 ("Sepal.Length").
newData1 <- autoBin.binary(iris, 1)
head(newData1)

# ---- Dichotomize Multiple Features at a Time ----
# Dichotomize the columns with indices 1, 2, 3, and 4.
newData2 <- autoBin.binary(iris, c(1,2,3,4))
head(newData2)

# ---- Dichotomize Features Using Column Names ----
# Dichotomize the columns with the names "Sepal.Length" and "Sepal.Width".
cols_of_interest <- c("Sepal.Length", "Sepal.Width")
col_indices <- which(names(iris) %in% cols_of_interest)
newData3 <- autoBin.binary(iris, col_indices)
head(newData3)

CASMI documentation built on April 3, 2025, 10:56 p.m.