embedded.blocks: Embedded blocks regression
In PDtoolkit: Collection of Tools for PD Rating Model Development and Validation

Description

embedded.blocks performs blockwise regression in which the predictions of each block's model are used as a risk factor in the model of the following block.
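The idea can be illustrated with a minimal base R sketch. It is not the package's implementation: the data are simulated, plain `glm()` stands in for the stepwise selection methods, and passing the previous block's prediction on the log-odds scale (`type = "link"`) is an assumption made only for this illustration. The names `x1`, `x2`, `x3`, `y` and `block.<k>` are hypothetical.

```r
# Illustrative sketch of embedded blocks in base R (assumptions noted above).
set.seed(1)
n  <- 500
db <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
db$y <- rbinom(n, size = 1,
               prob = plogis(-1 + 0.8 * db$x1 - 0.5 * db$x2 + 0.3 * db$x3))

# Hypothetical block definition, same layout as the `blocks` argument
blocks <- data.frame(rf = c("x1", "x2", "x3"), block = c(1, 1, 2))

models    <- list()
prev.pred <- NULL  # column name of the previous block's prediction, if any
for (b in sort(unique(blocks$block))) {
  rf.b <- blocks$rf[blocks$block == b]
  # The previous block's prediction enters as an additional risk factor
  frm <- reformulate(c(prev.pred, rf.b), response = "y")
  models[[b]] <- glm(frm, family = binomial(), data = db)
  # Store this block's prediction (log-odds scale, an assumption) for the next block
  prev.pred <- paste0("block.", b)
  db[[prev.pred]] <- predict(models[[b]], type = "link")
}

# The second block's model contains the first block's prediction as a term
names(coef(models[[2]]))
```

In this sketch the second model is fitted on `block.1` and `x3` only, so the risk factors of the first block influence it solely through the first block's prediction.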

Usage

embedded.blocks(
  method,
  target,
  db,
  coding = "WoE",
  blocks,
  p.value = 0.05,
  miv.threshold = 0.02,
  m.ch.p.val = 0.05
)

Arguments

`method`	Regression method applied on each block. Available methods: `"stepMIV"`, `"stepFWD"`, `"stepRPC"`, `"stepFWDr"`, and `"stepRPCr"`.
`target`	Name of target variable within `db` argument.
`db`	Modeling data with risk factors and target variable.
`coding`	Type of risk factor coding within the model. Available options are `"WoE"` and `"dummy"`. If `"WoE"` is selected, the modalities of the risk factors are replaced by WoE values; with the `"dummy"` option, 0/1 dummies are created for `n-1` modalities, where `n` is the total number of modalities of the analyzed risk factor.
`blocks`	Data frame with defined risk factor groups. It has to contain the following columns: `rf` and `block`.
`p.value`	Significance level of p-value for the estimated coefficient. For `WoE` coding this value is directly compared to the p-value of the estimated coefficient, while for `dummy` coding a multiple Wald test is employed and its p-value is compared with the selected threshold (`p.value`). This argument is applicable only for `"stepFWD"` and `"stepRPC"` selected methods.
`miv.threshold`	MIV (marginal information value) entrance threshold, applicable only for the `"stepMIV"` method. Only the risk factors with MIV higher than the threshold are candidates for the new model. An additional criterion is that the MIV value should significantly separate good from bad cases, as measured by the marginal chi-square test.
`m.ch.p.val`	Significance level of p-value for the marginal chi-square test, applicable only for the `"stepMIV"` method. This test additionally supports the MIV value of a candidate risk factor for the final decision.
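The two `coding` options can be illustrated on a toy risk factor in base R. This is only a sketch: the WoE formula below uses one common convention (log of the share of good cases over the share of bad cases, with `1` marking a bad case), which this page does not confirm as the package's exact convention, and the names `rf`, `y`, `rf.woe` and `dummies` are hypothetical.

```r
# Toy data: one categorical risk factor and a binary target (1 = bad case, an assumption)
db <- data.frame(
  rf = c("a", "a", "a", "a", "b", "b", "b", "c", "c", "c"),
  y  = c(0, 0, 0, 1, 0, 1, 1, 0, 0, 1)
)

# WoE coding: each modality is replaced by log(share of goods / share of bads)
tab  <- table(db$rf, db$y)
good <- tab[, "0"] / sum(tab[, "0"])
bad  <- tab[, "1"] / sum(tab[, "1"])
woe  <- log(good / bad)
db$rf.woe <- unname(woe[as.character(db$rf)])

# Dummy coding: n - 1 indicator (0/1) columns for n modalities;
# the first modality ("a") acts as the reference level
dummies <- model.matrix(~ rf, data = db)[, -1, drop = FALSE]

woe
colnames(dummies)
```

With WoE coding the risk factor enters the model as a single numeric column, whereas dummy coding produces `n-1` columns, which is why a joint (multiple) Wald test is needed to assess the risk factor as a whole.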

Value

The command embedded.blocks returns a list of three objects.
The first object (models) is the list of the models fitted for each block (objects of class inheriting from "glm").
The second object (steps) is the data frame with the risk factors selected from each block.
The third object (dev.db) is the list of the model development databases of each block.

References

Anderson, R.A. (2021). Credit Intelligence & Modelling, Many Paths through the Forest of Credit Rating and Scoring, OUP Oxford

Examples

suppressMessages(library(PDtoolkit))
data(loans)
# create risk factor priority groups
rf.all <- names(loans)[-1]
set.seed(22)
blocks <- data.frame(rf = rf.all, block = sample(1:3, length(rf.all), replace = TRUE))
blocks <- blocks[order(blocks$block), ]
blocks
# method: stepFWDr
res <- embedded.blocks(method = "stepFWDr",
                       target = "Creditability",
                       db = loans,
                       blocks = blocks,
                       p.value = 0.05)
names(res)
# model of the last block
nb <- length(res[["models"]])
res$models[[nb]]

# AUC of the last block's model on its development database
auc.model(predictions = predict(res$models[[nb]], type = "response",
                                newdata = res$dev.db[[nb]]),
          observed = res$dev.db[[nb]]$Creditability)
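
As a rough cross-check of the AUC reported by `auc.model`, the statistic can also be computed from ranks in base R. The helper `auc.rank` below is hypothetical (it is not part of PDtoolkit) and assumes that the target is coded 0/1 with `1` as the event of interest; it is a sketch of the standard rank-based (Mann-Whitney) formula, not the package's implementation.

```r
# Hypothetical helper: rank-based AUC, i.e. the probability that a randomly
# chosen case with observed == 1 scores higher than one with observed == 0
auc.rank <- function(predictions, observed) {
  r  <- rank(predictions)       # average ranks are used for ties
  n1 <- sum(observed == 1)
  n0 <- sum(observed == 0)
  (sum(r[observed == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

# Small worked example
auc.toy <- auc.rank(predictions = c(0.1, 0.4, 0.35, 0.8),
                    observed    = c(0, 0, 1, 1))
auc.toy  # 0.75
```

Applied to the example above, the same arguments passed to `auc.model` (the predictions from `res$models[[nb]]` and the observed `Creditability`) could be passed to `auc.rank` for comparison.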

PDtoolkit documentation built on Sept. 20, 2023, 9:06 a.m.