View source: R/fn_exp_categorical.R
ExpCatStat | R Documentation |
This function combines results from weight of evidence, information value and summary statistics.
ExpCatStat(
data,
Target = NULL,
result = "Stat",
clim = 10,
nlim = 10,
bins = 10,
Pclass = NULL,
plot = FALSE,
top = 20,
Round = 2
)
data |
dataframe or matrix |
Target |
target variable |
result |
"Stat" - summary statistics, "IV" - information value |
clim |
maximum number of unique levels for a categorical variable. Factor/character variables with more unique levels than clim are dropped |
nlim |
maximum unique values for numeric variable. |
bins |
number of bins (default is 10) |
Pclass |
reference category of target variable |
plot |
Information value barplot (default FALSE) |
top |
for plotting top information values (default value is 20) |
Round |
number of decimal places used when rounding values |
The criteria used to classify the predictive power of a categorical variable are:
If information value is < 0.03
then predictive power = "Not Predictive"
If information value is 0.03 to 0.1
then predictive power = "Somewhat Predictive"
If information value is 0.1 to 0.3
then predictive power = "Medium Predictive"
If information value is >0.3
then predictive power = "Highly Predictive"
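The thresholds above can be sketched as a small lookup in base R. This is only an illustration: `iv_class` is a hypothetical helper, not a function exported by the package, and it assumes that each band includes its lower bound (so an information value of exactly 0.03 counts as "Somewhat Predictive"), which the criteria do not state explicitly.

```r
# Hypothetical helper (not part of the package): map information values
# to the predictive power classes listed above.
# Assumption: every band includes its lower bound, e.g. 0.03 falls in
# "Somewhat Predictive" and 0.3 falls in "Highly Predictive".
iv_class <- function(iv) {
  as.character(cut(
    iv,
    breaks = c(-Inf, 0.03, 0.1, 0.3, Inf),
    labels = c("Not Predictive", "Somewhat Predictive",
               "Medium Predictive", "Highly Predictive"),
    right = FALSE   # intervals are closed on the left: [lower, upper)
  ))
}

iv_class(c(0.01, 0.05, 0.2, 0.45))
```

Using `cut()` keeps the band edges in one place, so changing the boundary convention only means switching `right`.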
This function provides summary statistics for categorical variables.
Stat
- Summary statistics include Chi-square test scores, p-value, Information values, Cramér's V and Degree of association
IV
- Weight of evidence and Information values
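To make the "Stat" quantities concrete, the following sketch computes a Chi-square statistic, its p-value and Cramér's V by hand for one predictor, using only base R. It is not the package's internal code: the choice of `cyl` as predictor and `am` as target is an assumption for illustration, and the standard formula V = sqrt(chi-square / (n * (min(rows, cols) - 1))) is used for Cramér's V.

```r
# Sketch only: Chi-square test and Cramer's V for one predictor/target pair.
# Assumptions: predictor "cyl" and target "am" from the built-in mtcars data.
tab <- table(mtcars$cyl, mtcars$am)

# mtcars is small, so chisq.test() warns that the approximation may be
# inaccurate; the warning is suppressed here to keep the output readable.
chi <- suppressWarnings(chisq.test(tab, correct = FALSE))

n <- sum(tab)             # number of observations
k <- min(dim(tab)) - 1    # min(rows, cols) - 1
cramers_v <- sqrt(unname(chi$statistic) / (n * k))

chi_sq  <- unname(chi$statistic)
p_value <- chi$p.value
round(c(chi_sq = chi_sq, p_value = p_value, cramers_v = cramers_v), 2)
```

Cramér's V ranges from 0 (no association) to 1 (perfect association). How the function turns it into a "Degree of association" label is not restated here.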
Columns description:
Variable
- variable name
Target
- Target variable
class
- name of bin (variable value otherwise)
out0
- number of good observations
out1
- number of bad observations
Total
- Total values for each category
pct1
- good observations / total good observations
pct0
- bad observations / total bad observations
odds
- Odds ratio [(a/b)/(c/d)]
woe
- Weight of Evidence – calculated as ln(odds)
iv
- Information Value - ln(odds) * (pct0 – pct1)
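The column definitions above can be traced by hand for a single predictor. The sketch below is a minimal base R illustration under stated assumptions, not the package's implementation: it uses `cyl` as predictor and `am` as target, takes `out0`/`out1` as the counts for `am == 0` and `am == 1`, and takes odds as `pct0 / pct1` so that each `iv` term, computed as ln(odds) * (pct0 - pct1), is non-negative. `woe_tab` and `total_iv` are illustrative names.

```r
# Sketch only: rebuild the per-class columns by hand for one predictor.
# Assumptions: target "am", predictor "cyl", odds = pct0 / pct1.
tab <- table(mtcars$cyl, mtcars$am)

woe_tab <- data.frame(
  class = rownames(tab),
  out0  = as.vector(tab[, "0"]),   # observations with am == 0
  out1  = as.vector(tab[, "1"])    # observations with am == 1
)
woe_tab$Total <- woe_tab$out0 + woe_tab$out1
woe_tab$pct0  <- woe_tab$out0 / sum(woe_tab$out0)   # share of all am == 0 rows
woe_tab$pct1  <- woe_tab$out1 / sum(woe_tab$out1)   # share of all am == 1 rows
woe_tab$odds  <- woe_tab$pct0 / woe_tab$pct1
woe_tab$woe   <- log(woe_tab$odds)                  # natural log
woe_tab$iv    <- woe_tab$woe * (woe_tab$pct0 - woe_tab$pct1)

woe_tab
total_iv <- sum(woe_tab$iv)   # information value of cyl with respect to am
```

Swapping the roles of `pct0` and `pct1` flips the sign of `woe` but leaves each `iv` term, and therefore the total, unchanged. A class with a zero count in either column would give an infinite `woe`; this sketch does not handle that case.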
dubrangala
# Example 1
## Read mtcars data
# Target variable "am" - Transmission (0 = automatic, 1 = manual)
# Summary statistics
ExpCatStat(mtcars, Target = "am", result = "Stat", clim = 10, nlim = 10, bins = 10,
           Pclass = 1, plot = FALSE, top = 20, Round = 2)
# Information value plot
ExpCatStat(mtcars, Target = "am", result = "Stat", clim = 10, nlim = 10, bins = 10,
           Pclass = 1, plot = TRUE, top = 20, Round = 2)
# Information value for categorical Independent variables
ExpCatStat(mtcars, Target = "am", result = "IV", clim = 10, nlim = 10, bins = 10,
           Pclass = 1, plot = FALSE, top = 20, Round = 2)