Binning on Factor Variables

Description

Generates the output table for the unique values of a given factor variable.

Usage

smbinning.factor(df, y, x, maxcat = 10)

Arguments

df

A data frame.

y

Binary response variable (0,1). Integer (int) is required. Name of y must not have a dot.

x

A factor variable with at least 2 different values. Value Inf is not allowed. Name of x must not have a dot.

maxcat

Specifies the maximum number of categories. Default value is 10.
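
The requirements listed above can be checked before calling smbinning.factor. The following is a minimal base R sketch, not part of the smbinning package: the helper name check_factor_inputs and the toy data frame are hypothetical, and the checks cover only what the argument descriptions state (data frame, integer 0/1 response, factor with at least 2 values and at most maxcat categories, no dots in the names of y and x).

```r
# Hypothetical helper (not part of smbinning): returns a character vector
# describing every documented requirement the inputs fail to meet.
check_factor_inputs <- function(df, y, x, maxcat = 10) {
  if (!is.data.frame(df)) {
    return("df is not a data frame")
  }
  problems <- character(0)
  if (grepl(".", y, fixed = TRUE)) {
    problems <- c(problems, "name of y contains a dot")
  }
  if (grepl(".", x, fixed = TRUE)) {
    problems <- c(problems, "name of x contains a dot")
  }
  if (!all(c(y, x) %in% names(df))) {
    return(c(problems, "y or x is not a column of df"))
  }
  yv <- df[[y]]
  xv <- df[[x]]
  if (!is.integer(yv)) {
    problems <- c(problems, "y is not stored as integer")
  }
  if (!all(yv[!is.na(yv)] %in% c(0, 1))) {
    problems <- c(problems, "y has values other than 0 and 1")
  }
  if (!is.factor(xv)) {
    problems <- c(problems, "x is not a factor")
  } else {
    n_values <- length(unique(xv[!is.na(xv)]))
    if (n_values < 2) {
      problems <- c(problems, "x has fewer than 2 different values")
    }
    if (n_values > maxcat) {
      problems <- c(problems, "x has more categories than maxcat")
    }
  }
  problems
}

# Toy data for illustration only
toy <- data.frame(
  FlagGB = c(1L, 0L, 1L, 1L, 0L, 1L),
  IncomeLevel = factor(c("1", "2", "1", "3", "2", "3"))
)
toy$Flag.GB <- as.numeric(toy$FlagGB)  # dotted name, stored as double

ok <- check_factor_inputs(toy, y = "FlagGB", x = "IncomeLevel")
bad <- check_factor_inputs(toy, y = "Flag.GB", x = "IncomeLevel")
too_many <- check_factor_inputs(toy, y = "FlagGB", x = "IncomeLevel", maxcat = 2)
print(ok)
print(bad)
print(too_many)
```

An empty result means none of the listed checks failed. It does not guarantee that smbinning.factor will accept the data.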

Value

The command smbinning.factor generates an object containing the information and utilities needed for binning. Save the result so that it can be passed to smbinning.plot, smbinning.sql, and smbinning.gen.factor.
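
To give a feel for a table with one row per value of a factor variable, here is a rough base R sketch. It is not the package's implementation: the helper name category_table, the column names, and the toy data are hypothetical. It assumes that y == 1 marks a good record and y == 0 a bad one, that there are no missing values, and that weight of evidence is the log of the ratio of the good share to the bad share. The table actually returned by smbinning.factor may use different columns and conventions.

```r
# Hypothetical helper (not part of smbinning): builds one row per category
# of x with counts, bad rate, weight of evidence (WoE) and the category's
# contribution to the information value (IV).
# Assumptions: y == 1 is "good", y == 0 is "bad", no missing values, and
# every category contains at least one good and one bad record.
category_table <- function(df, y, x) {
  xv <- droplevels(df[[x]])
  yv <- df[[y]]
  good <- as.vector(tapply(yv == 1, xv, sum))  # goods per category
  bad <- as.vector(tapply(yv == 0, xv, sum))   # bads per category
  pct_good <- good / sum(good)                 # share of all goods
  pct_bad <- bad / sum(bad)                    # share of all bads
  woe <- log(pct_good / pct_bad)
  data.frame(
    Category = levels(xv),
    CntRec = good + bad,
    CntGood = good,
    CntBad = bad,
    BadRate = bad / (good + bad),
    WoE = woe,
    IV = (pct_good - pct_bad) * woe
  )
}

# Toy data for illustration only: category A is mostly good, B mostly bad
toy2 <- data.frame(
  FlagGB = c(1L, 1L, 1L, 0L, 1L, 0L, 0L, 0L),
  IncomeLevel = factor(c("A", "A", "A", "A", "B", "B", "B", "B"))
)
tab <- category_table(toy2, y = "FlagGB", x = "IncomeLevel")
print(tab)
print(sum(tab$IV))  # total information value of the variable
```

A category with no goods or no bads would produce an infinite WoE in this sketch, so production code would need to handle that case explicitly.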

Examples

# Package loading and data exploration
library(smbinning) # Load package and its data
data(chileancredit) # Load smbinning sample dataset (Chilean Credit)
str(chileancredit) # Quick description of the data
table(chileancredit$FlagGB) # Tabulate target variable

# Training and testing samples (Just some basic formality for Modeling) 
chileancredit.train=subset(chileancredit,FlagSample==1)
chileancredit.test=subset(chileancredit,FlagSample==0)

# Package application and results
result.train=smbinning.factor(df=chileancredit.train,
                               y="FlagGB",x="IncomeLevel")
result.train$ivtable
result.test=smbinning.factor(df=chileancredit.test,
                               y="FlagGB",x="IncomeLevel")
result.test$ivtable

# Plots
par(mfrow=c(2,2))
smbinning.plot(result.train,option="dist",sub="Income Level (Training Sample)")
smbinning.plot(result.train,option="badrate",sub="Income Level (Training Sample)")
smbinning.plot(result.test,option="dist",sub="Income Level (Test Sample)")
smbinning.plot(result.test,option="badrate",sub="Income Level (Test Sample)")