performance_bin | R Documentation |
performance_bin() calculates metrics for evaluating the performance of a binned variable in a binomial classification model.
performance_bin(y, x, na.rm = FALSE)
y |
character, numeric, integer, or factor. A binary response variable (0, 1). The variable must contain only the integers 0 and 1. However, a factor or character vector with two levels is also accepted and is converted during the calculation. |
x |
integer, factor, or character. The binned variable. It must contain at least 2 distinct values, and Inf is not allowed. |
na.rm |
logical. Whether missing values should be removed. |
This function is useful when used with the mutate/transmute function of the dplyr package.
An object of class "performance_bin". The columns of the data.frame are as follows.
Bin : character. bins.
CntRec : integer. frequency by bins.
CntPos : integer. frequency of positive by bins.
CntNeg : integer. frequency of negative by bins.
CntCumPos : integer. cumulative frequency of positive by bins.
CntCumNeg : integer. cumulative frequency of negative by bins.
RatePos : numeric. relative frequency of positive by bins.
RateNeg : numeric. relative frequency of negative by bins.
RateCumPos : numeric. cumulative relative frequency of positive by bins.
RateCumNeg : numeric. cumulative relative frequency of negative by bins.
Odds : numeric. odds ratio.
LnOdds : numeric. log of the odds ratio.
WoE : numeric. weight of evidence.
IV : numeric. Jeffrey's Information Value.
JSD : numeric. Jensen-Shannon Divergence.
AUC : numeric. AUC. area under curve.
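The per-bin columns above can be illustrated with a minimal base R sketch. It uses the common textbook definitions WoE = log(RatePos / RateNeg) and IV = (RatePos - RateNeg) * WoE; these are assumptions, and the package's own conventions (for example the sign of WoE) may differ. The toy vectors `y` and `bin` are hypothetical.

```r
# Hypothetical toy data: a binary response and a binned predictor
y <- c(1, 0, 0, 1, 1, 0, 0, 0, 1, 0)
bin <- factor(
  c("low", "low", "low", "mid", "mid", "mid", "high", "high", "high", "high"),
  levels = c("low", "mid", "high")
)

# Frequencies of positives and negatives by bin
cnt_pos <- tapply(y == 1, bin, sum)
cnt_neg <- tapply(y == 0, bin, sum)

# Relative frequencies: each bin's share of all positives / all negatives
rate_pos <- cnt_pos / sum(cnt_pos)
rate_neg <- cnt_neg / sum(cnt_neg)

# Weight of evidence and the per-bin information value contribution
# (assumed textbook definitions, not confirmed against the package source)
woe <- log(rate_pos / rate_neg)
iv <- (rate_pos - rate_neg) * woe

binning_table <- data.frame(
  Bin = levels(bin),
  CntRec = as.integer(cnt_pos + cnt_neg),
  CntPos = as.integer(cnt_pos),
  CntNeg = as.integer(cnt_neg),
  RatePos = as.numeric(rate_pos),
  RateNeg = as.numeric(rate_neg),
  RateCumPos = as.numeric(cumsum(rate_pos)),
  RateCumNeg = as.numeric(cumsum(rate_neg)),
  WoE = as.numeric(woe),
  IV = as.numeric(iv)
)
binning_table

# Total information value of the binned variable
sum(binning_table$IV)
```

Note that a bin with no positives or no negatives makes the logarithm infinite, so real data may need such bins merged or otherwise handled before computing WoE.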
The attributes of the "performance_bin" class are as follows.
names : character. variable names of the data.frame holding the binning table.
class : character. name of class. "performance_bin" "data.frame".
row.names : character. row names of the data.frame holding the binning table.
IV : numeric. Jeffrey's Information Value.
JSD : numeric. Jensen-Shannon Divergence.
KS : numeric. Kolmogorov-Smirnov statistic.
gini : numeric. Gini index.
HHI : numeric. Herfindahl-Hirschman Index.
HHI_norm : numeric. normalized Herfindahl-Hirschman Index.
Cramer_V : numeric. Cramer's V statistic.
chisq_test : data.frame. table of significance tests. Its columns are as follows.
Bin A : character. first bin.
Bin B : character. second bin.
statistics : numeric. statistic of the Chi-square test.
p_value : numeric. p-value of the Chi-square test.
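As a rough illustration of what a table like chisq_test may contain, the sketch below runs a Chi-square test on the 2 x 2 table of positives and negatives for each pair of adjacent bins. The counts are hypothetical, and comparing adjacent bins is an assumption; the package may pair bins differently or use other test settings.

```r
# Hypothetical per-bin counts of positives and negatives
cnt <- data.frame(
  Bin = c("(0,1]", "(1,2]", "(2,10]"),
  CntPos = c(30, 45, 25),
  CntNeg = c(120, 60, 20)
)

# Test each pair of adjacent bins (assumed pairing) on its 2 x 2 table
pairs <- lapply(seq_len(nrow(cnt) - 1), function(i) {
  tab <- as.matrix(cnt[c(i, i + 1), c("CntPos", "CntNeg")])
  # chisq.test() applies Yates' continuity correction to 2 x 2 tables by default
  test <- chisq.test(tab)
  data.frame(
    bin_a = cnt$Bin[i],
    bin_b = cnt$Bin[i + 1],
    statistics = unname(test$statistic),
    p_value = test$p.value
  )
})

chisq_tab <- do.call(rbind, pairs)
chisq_tab
```

A small p-value for a pair suggests that the two bins differ in their positive rate, which is one argument for keeping them as separate bins.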
summary.performance_bin, plot.performance_bin, binning_by.
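# Load the package that provides performance_bin() and the heartfailure data
# (assumed here to be dlookr)
library(dlookr)
# Generate data for the example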
heartfailure2 <- heartfailure
set.seed(123)
heartfailure2[sample(seq(NROW(heartfailure2)), 5), "creatinine"] <- NA
# Change the target variable to 0(negative) and 1(positive).
heartfailure2$death_event_2 <- ifelse(heartfailure2$death_event %in% "Yes", 1, 0)
# Bin creatinine into creatinine_bin.
breaks <- c(0, 1, 2, 10)
heartfailure2$creatinine_bin <- cut(heartfailure2$creatinine, breaks)
# Diagnose the performance of the binned variable
perf <- performance_bin(heartfailure2$death_event_2, heartfailure2$creatinine_bin)
perf
summary(perf)
plot(perf)
# Diagnose the performance of the binned variable with NA removed
perf <- performance_bin(heartfailure2$death_event_2, heartfailure2$creatinine_bin, na.rm = TRUE)
perf
summary(perf)
plot(perf)