club_cat_class: Clubbing class of a categorical variable with low population...


View source: R/functions.R

Description

The function groups classes of a categorical variable whose population percentage is below a threshold with another class that has a similar event rate. If no class with exactly the same event rate is available, the class is clubbed with the one whose event rate is higher than and closest to its own.
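The rule described above can be sketched in base R. This is an illustration only and not the package's source code: `club_plan` is a hypothetical helper, the toy data frame and its column names are made up, and the fallback used when no class has a higher event rate is an assumption, since the description does not cover that case. The sketch only reports which class each low-population class would be clubbed with; it does not modify the data or recompute rates after a merge.

```r
# Hypothetical helper, not part of scorecardModelUtils: for each class, report
# its population share, its event rate, and the class it would be clubbed with.
club_plan <- function(base, target, variable, threshold, event = 1) {
  cls   <- as.character(base[[variable]])
  is_ev <- base[[target]] == event
  share <- tapply(is_ev, cls, length) / nrow(base)  # population share per class
  rate  <- tapply(is_ev, cls, mean)                 # event rate per class
  plan  <- data.frame(class = names(share), share = as.numeric(share),
                      event_rate = as.numeric(rate), club_with = NA_character_,
                      stringsAsFactors = FALSE)
  for (i in which(plan$share < threshold)) {
    others <- plan[-i, ]
    if (nrow(others) == 0) next                     # nothing to club with
    same   <- others[others$event_rate == plan$event_rate[i], ]
    higher <- others[others$event_rate >  plan$event_rate[i], ]
    if (nrow(same) > 0) {
      # A class with exactly the same event rate exists
      plan$club_with[i] <- same$class[1]
    } else if (nrow(higher) > 0) {
      # Otherwise take the closest class with a higher event rate
      plan$club_with[i] <- higher$class[which.min(higher$event_rate)]
    } else {
      # Assumption: with no higher-rate class, fall back to the closest lower one
      plan$club_with[i] <- others$class[which.max(others$event_rate)]
    }
  }
  plan
}

# Made-up data: class C holds 2 of 20 rows (10%), below a 15% threshold
toy <- data.frame(
  grade = c(rep("A", 10), rep("B", 8), rep("C", 2)),
  bad   = c(rep(1, 2), rep(0, 8),   # A: event rate 0.20
            rep(1, 6), rep(0, 2),   # B: event rate 0.75
            1, 0)                   # C: event rate 0.50
)
plan <- club_plan(toy, target = "bad", variable = "grade", threshold = 0.15)
print(plan)
```

In this toy data no class shares the 0.50 event rate of C, so the sketch pairs C with B, the only class with a higher event rate, and leaves A and B untouched.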

Usage

club_cat_class(base, target, variable, threshold, event = 1)

Arguments

base

input dataframe

target

name of the target column, passed as a string (the column must be of 0/1 type)

variable

name of the categorical column on which the operation is performed, passed as a string

threshold

population percentage threshold below which a class is considered for clubbing with another class, given as a decimal fraction (for example, 0.2 for 20%)

event

(optional) the event class, passed as 0 or 1 (default is 1)

Value

The function returns a dataframe in which each low-percentage class has been clubbed with another class of the same event rate or, failing that, of the closest higher event rate.

Author(s)

Arya Poddar <aryapoddar290990@gmail.com>

Kanishk Dogar <kanishkd4@gmail.com>

Examples

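library(scorecardModelUtils)
# First 110 rows of iris: "virginica" is 10 of 110 rows (about 9%), below the 0.2 threshold
data <- iris[1:110, ]
data$Species <- as.character(data$Species)
# Random 0/1 target, for illustration only
data$Y <- sample(0:1, size = nrow(data), replace = TRUE)
data_clubclass <- club_cat_class(base = data, target = "Y", variable = "Species", threshold = 0.2)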
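Before calling the function, it can help to check which classes fall below the threshold. The following base R sketch does this for the same subset of iris; it does not call the package, and the variable names `shares` and `low` are illustrative.

```r
# Same subset as in the example above
data <- iris[1:110, ]
data$Species <- as.character(data$Species)

# Population share of each class, as a fraction of all rows
shares <- prop.table(table(data$Species))
print(round(shares, 3))

# Classes whose share is below the 0.2 threshold
low <- names(shares)[shares < 0.2]
print(low)
```

Here only "virginica" (10 of 110 rows) is below 0.2, so it is the only class that would be a candidate for clubbing.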

scorecardModelUtils documentation built on May 2, 2019, 9:59 a.m.