```
knitr::opts_chunk$set(collapse = TRUE, comment = "#>")
```

After installing scorecard by following the instructions in the README, load the package into your environment.

```
library(scorecard)
```

Let's use the *germancredit* dataset for the purposes of this demonstration.

```
data("germancredit")
str(germancredit)
```

The `var_filter` function drops variables whose missing rate is too high (> 95% by default), whose information value (IV) is too low (< 0.02 by default), or whose identical value rate is too high (> 95% by default).

```
dt_f <- var_filter(germancredit, y = "creditability")
```
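To make two of these filters concrete, the sketch below computes a missing rate and an identical value rate per column in base R. The data frame `toy` and both helper functions are made up for illustration and are not part of the package; `var_filter` may compute these quantities differently, for example in how it treats `NA` values.

```r
# Hypothetical toy data; not part of scorecard or germancredit
toy <- data.frame(
  a = c(1, NA, 3, NA, 5),          # two of five values missing
  b = c("x", "x", "x", "x", "y"),  # four of five values identical
  stringsAsFactors = FALSE
)

# Share of NA values in a column
missing_rate <- function(x) mean(is.na(x))

# Share of the most frequent value (NA counted as a value here, by assumption)
identical_rate <- function(x) max(table(x, useNA = "ifany")) / length(x)

miss  <- sapply(toy, missing_rate)
ident <- sapply(toy, identical_rate)
```

A column would be a candidate for removal when either rate exceeds the chosen threshold.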

When building scorecard models, as with most traditional modeling approaches, a subset of the observations should be held out from the data used to train the model and assigned to the *test* set. The `split_df` function performs this sampling and returns the *train* and *test* datasets.

```
dt_list <- split_df(dt_f, y = "creditability", ratio = c(0.6, 0.4), seed = 30)
label_list <- lapply(dt_list, function(x) x$creditability)
```
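For intuition, a 60/40 split that samples within each class of the target can be sketched in base R as follows. This assumes the split is stratified on `y`, which is what passing `y` suggests; the toy data and variable names are hypothetical, and `split_df` may be implemented differently.

```r
# Hypothetical toy data: 10 "bad" (1) and 30 "good" (0) observations
toy <- data.frame(id = 1:40, y = rep(c(1, 0), times = c(10, 30)))

set.seed(30)
# Sample 60% of the row indices within each class of y
train_idx <- unlist(lapply(split(seq_len(nrow(toy)), toy$y), function(idx) {
  idx[sample.int(length(idx), size = round(0.6 * length(idx)))]
}))

toy_list <- list(train = toy[train_idx, ], test = toy[-train_idx, ])

# Share of bads in each part; stratifying keeps the two equal
bad_share <- sapply(toy_list, function(d) mean(d$y))
```

Sampling within each class keeps the bad rate the same in both parts, which matters when the bad class is small.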

Weight-of-Evidence (WoE) binning groups the values of both continuous and categorical independent variables into bins that separate the classes of the dependent variable as clearly as possible. The `woebin` function applies this technique to all independent variables at once.

```
bins <- woebin(dt_f, y = "creditability")
# woebin_plot(bins)
```
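As a rough illustration of what each bin's WoE and a variable's IV mean, the sketch below applies the usual textbook definitions to made-up counts. The bin labels and counts are hypothetical, and the sign convention (positive WoE meaning relatively more bads) is an assumption; the IV itself does not depend on that convention.

```r
# Hypothetical counts per bin for one variable
bin_counts <- data.frame(
  bin  = c("low", "mid", "high"),
  good = c(100, 200, 100),  # y == 0
  bad  = c(60, 30, 10)      # y == 1
)

# Share of all goods / all bads falling in each bin
dist_good <- bin_counts$good / sum(bin_counts$good)
dist_bad  <- bin_counts$bad  / sum(bin_counts$bad)

# WoE per bin (sign convention is an assumption) and IV for the variable
woe <- log(dist_bad / dist_good)
iv  <- sum((dist_bad - dist_good) * woe)
```

Bins with a WoE far from zero separate goods from bads strongly, and the IV sums that separation over all bins.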

The user can also adjust bin breaks interactively with the `woebin_adj` function.

```
# breaks_adj <- woebin_adj(dt_f, y = "creditability", bins = bins)
```

Furthermore, the user can set the bin breaks manually via the `breaks_list` argument of the `woebin` function. Note the use of `%,%` as a separator to merge two classes of a categorical independent variable into a single bin.

```
breaks_adj <- list(
  age.in.years = c(26, 35, 40),
  other.debtors.or.guarantors = c("none", "co-applicant%,%guarantor")
)
bins_adj <- woebin(dt_f, y = "creditability", breaks_list = breaks_adj)
```
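To see how such a break list might translate into bins, the base R sketch below cuts some made-up ages at the `age.in.years` breaks above and splits the merged category string on `%,%`. It assumes left-closed, right-open intervals; that choice and the sample ages are illustrative, so check the output of `woebin` for the bins it actually produces.

```r
# Hypothetical ages; breaks taken from the age.in.years entry above
ages <- c(19, 26, 30, 35, 39, 40, 67)

# Assumed interval convention: [lower, upper)
age_bins <- cut(ages, breaks = c(-Inf, 26, 35, 40, Inf), right = FALSE)
bin_sizes <- table(age_bins)

# A merged categorical bin lists its member classes separated by "%,%"
members <- strsplit("co-applicant%,%guarantor", "%,%", fixed = TRUE)[[1]]
```

Three breaks produce four bins, and the merged string expands to the two classes that share one bin.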

Once your WoE bins are established for all desired independent variables, apply the binning logic to the training and test datasets.

```
dt_woe_list <- lapply(dt_list, function(x) woebin_ply(x, bins_adj))
```

Logistic regression can often be leveraged effectively to assist in building the scorecards.

```
m1 <- glm(creditability ~ ., family = binomial(), data = dt_woe_list$train)
# vif(m1, merge_coef = TRUE)
# summary(m1)

# Select a formula-based model by AIC (or by LASSO for large dataset)
m_step <- step(m1, direction = "both", trace = FALSE)
m2 <- eval(m_step$call)
# vif(m2, merge_coef = TRUE)
# summary(m2)
```

If oversampling is a concern, the following code chunk could be uncommented and run to help adjust for this issue.

```
# Read documentation on handling oversampling (support.sas.com/kb/22/601.html)
# library(data.table)
# p1 <- 0.03 # bad probability in population
# r1 <- 0.3 # bad probability in sample dataset
# dt_woe <- copy(dt_woe_list$train)[, weight := ifelse(creditability == 1, p1/r1, (1-p1)/(1-r1) )][]
# fmla <- as.formula(paste("creditability ~", paste(names(coef(m2))[-1], collapse = "+")))
# m3 <- glm(fmla, family = binomial(), data = dt_woe, weights = weight)
```
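The arithmetic behind those weights can be checked on a small example. The sketch below reuses the `p1` and `r1` values from the commented code and a hypothetical sample of 1000 rows; it only demonstrates that the weights rescale the sample's bad rate to the population's.

```r
p1 <- 0.03  # assumed bad rate in the population
r1 <- 0.3   # bad rate in the oversampled dataset

# Hypothetical sample of 1000 rows with a 30% bad rate
y <- rep(c(1, 0), times = c(300, 700))

# Down-weight bads and up-weight goods, as in the commented code
w <- ifelse(y == 1, p1 / r1, (1 - p1) / (1 - r1))

# The weighted bad rate recovers the population rate p1
weighted_bad_rate <- sum(w * y) / sum(w)
```

Because the weights sum to the original sample size, the total weight stays at 1000 while the weighted share of bads drops from `r1` to `p1`.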

The `perf_eva` function provides model accuracy statistics (such as mse, rmse, logloss, r2, ks, auc, gini) and plots (such as ks, lift, gain, roc, lz, pr, f1, density).

```
# First, get probabilistic predictions
pred_list <- lapply(dt_woe_list, function(x) predict(m2, x, type = 'response'))
# Then evaluate model accuracy
perf <- perf_eva(pred = pred_list, label = label_list)
```
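Three of these statistics are easy to reproduce by hand, which can help when interpreting the output. The sketch below computes AUC, Gini, and KS in base R from made-up labels and predictions using their standard definitions; it is not how `perf_eva` is implemented, and the toy vectors are hypothetical.

```r
# Hypothetical labels (1 = bad) and predicted probabilities
label <- c(0, 0, 1, 0, 1, 1, 0, 1)
pred  <- c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8)

n1 <- sum(label == 1)
n0 <- sum(label == 0)

# AUC via the rank-sum (Mann-Whitney) identity
auc  <- (sum(rank(pred)[label == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
gini <- 2 * auc - 1

# KS: largest gap between the two classes' cumulative score distributions
ks <- max(abs(ecdf(pred[label == 1])(pred) - ecdf(pred[label == 0])(pred)))
```

The Gini coefficient here is just a rescaling of the AUC, so the two always rank models identically.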

Once the model has been selected, scorecards can be created via the `scorecard` function. Note that by default the target points are 600, the target odds are 1/19, and the points to double the odds are 50. See `?scorecard` for more information on the function and its arguments.
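The three defaults fit together through the conventional log-odds scaling of a scorecard, sketched below. The formula and the names `a`, `b`, and `score_from_log_odds` are assumptions used for illustration, with odds read as bad-to-good; consult `?scorecard` for the exact definition the package uses.

```r
points0 <- 600     # target points
odds0   <- 1 / 19  # target odds
pdo     <- 50      # points to double the odds

# Assumed scaling: score = a - b * log(odds), with odds taken as bad:good
b <- pdo / log(2)
a <- points0 + b * log(odds0)
score_from_log_odds <- function(log_odds) a - b * log_odds

score_at_target <- score_from_log_odds(log(1 / 19))  # at the target odds
score_at_half   <- score_from_log_odds(log(1 / 38))  # bad odds halved
```

Under this scaling an applicant at the target odds scores `points0`, and halving the bad odds (doubling the good odds) adds `pdo` points.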

The scorecard can then be applied to the original data using the `scorecard_ply` function. Lastly, a chart of Population Stability Index (PSI) statistics can be rendered via the `perf_psi` function.

```
# Build the card
card <- scorecard(bins_adj, m2)
# Obtain Credit Scores
score_list <- lapply(dt_list, function(x) scorecard_ply(x, card))
# Analyze the PSI
perf_psi(score = score_list, label = label_list)
```
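For reference, the PSI compares how observations are distributed across score bands in two datasets. The sketch below applies the standard formula to made-up shares; the band shares are hypothetical, and `perf_psi` chooses its own score bands.

```r
# Hypothetical share of observations per score band
expected <- c(0.10, 0.20, 0.40, 0.20, 0.10)  # e.g. train
actual   <- c(0.12, 0.22, 0.36, 0.19, 0.11)  # e.g. test

# PSI: sum over bands of (actual - expected) * log(actual / expected)
psi <- sum((actual - expected) * log(actual / expected))

# Identical distributions give a PSI of exactly zero
psi_same <- sum((expected - expected) * log(expected / expected))
```

A commonly cited rule of thumb treats values below 0.1 as a stable population and values above 0.25 as a substantial shift, although these cut-offs are conventions rather than formal tests.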
