stepFWD | R Documentation |
stepFWD
Customized stepwise regression with p-value and trend check. The trend check compares
the observed trend between the target and the analyzed risk factor with the trend of the estimated coefficients within the
logistic regression. Note that the procedure checks the column names of the supplied db
data frame, so some renaming (replacement of special characters) may occur.
For details, see the help example.
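The trend check described above can be illustrated with a minimal base R sketch. It is not the internal implementation of stepFWD; the simulated data and the names `rf`, `obs.trend`, `est.trend` and `trend.ok` are assumptions made for illustration only.

```r
# Minimal sketch of a trend check (illustrative only, not PDtoolkit internals)
set.seed(1)
n <- 2000
# hypothetical categorized risk factor with three ordered bins
rf <- sample(c("01 low", "02 mid", "03 high"), n, replace = TRUE)
# simulated binary target whose event rate increases across the bins
p <- c("01 low" = 0.05, "02 mid" = 0.15, "03 high" = 0.35)[rf]
y <- rbinom(n, size = 1, prob = p)
db <- data.frame(y = y, rf = rf, stringsAsFactors = FALSE)

# observed trend: sign of the change in event rate between consecutive bins
obs.rate <- tapply(db$y, db$rf, mean)
obs.trend <- sign(diff(obs.rate))

# estimated trend: dummy-coded logistic regression, reference bin fixed at 0
fit <- glm(y ~ rf, family = binomial(link = "logit"), data = db)
est.coef <- c(0, coef(fit)[-1])
est.trend <- sign(diff(est.coef))

# the check passes when both sequences of signs agree
trend.ok <- all(obs.trend == est.trend)
trend.ok
```

With a single risk factor the two trends agree by construction. A disagreement can only arise once other risk factors are present in the model, which is the situation such a check is meant to catch.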
stepFWD(
  start.model,
  p.value = 0.05,
  coding = "WoE",
  coding.start.model = TRUE,
  check.start.model = TRUE,
  db,
  offset.vals = NULL
)
start.model |
Object of class formula that represents the starting model. It can include some risk factors, but it can also be
defined with the intercept only ( |
p.value |
Significance level of p-value of the estimated coefficients. For |
coding |
Type of risk factor coding within the model. Available options are: |
coding.start.model |
Logical ( |
check.start.model |
Logical ( |
db |
Modeling data with risk factors and target variable. All risk factors (apart from the risk factors from the starting model) should be categorized and of character type. |
offset.vals |
This can be used to specify an a priori known component to be included in the linear predictor during fitting.
This should be |
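As an illustration of an a priori known component in the linear predictor, the following sketch uses the `offset` argument of base R `glm`. The assumption that `offset.vals` plays the same role inside stepFWD follows from the argument description and is not verified here; the simulated data and the name `known` are hypothetical.

```r
# Sketch of a logistic regression with an offset (base R only)
set.seed(2)
n <- 1000
x <- rnorm(n)
# hypothetical component of the linear predictor that is known in advance
known <- runif(n, min = -1, max = 1)
# simulate the target so that 'known' enters with a coefficient fixed at 1
y <- rbinom(n, size = 1, prob = plogis(-1 + 0.8 * x + known))

# the offset is added to the linear predictor; no coefficient is estimated for it
fit <- glm(y ~ x, family = binomial(link = "logit"), offset = known)
coef(fit)
```

Only the intercept and the coefficient of `x` are estimated; the offset contributes to the fit without a coefficient of its own.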
The command stepFWD returns a list of four objects.
The first object (model) is the final model, an object of class inheriting from "glm".
The second object (steps) is the data frame with the risk factors selected at each iteration.
The third object (warnings) is the data frame with warnings, if any are observed.
The warnings refer to the following checks: whether a risk factor has more than 10 modalities,
whether any of the bins (groups) has less than 5% of observations, and
whether there are problems with the WoE calculations.
The fourth and final object (dev.db) is the model development database.
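The three checks behind the warnings can be sketched in base R as follows. This is an illustration under stated assumptions, not the code used by stepFWD: the data are constructed by hand, the object names are hypothetical, and the WoE convention (log of the share of non-events over the share of events) is one common choice that may differ from the one used in the package.

```r
# Sketch of the checks behind the warnings (illustrative only)
# hypothetical categorized risk factor with four bins: A, B, C, D
rf <- rep(c("A", "B", "C", "D"), times = c(500, 300, 170, 30))
# hypothetical binary target; bin D contains no events
y <- c(rep(c(1, 0), times = c(50, 450)),
       rep(c(1, 0), times = c(45, 255)),
       rep(c(1, 0), times = c(40, 130)),
       rep(0, 30))

# check 1: more than 10 modalities
bin.share <- prop.table(table(rf))
too.many.modalities <- length(bin.share) > 10

# check 2: bins with less than 5% of observations
small.bins <- names(bin.share)[bin.share < 0.05]

# check 3: problems with the WoE calculation (non-finite values)
tab <- table(rf, y)
dist.good <- tab[, "0"] / sum(tab[, "0"])
dist.bad <- tab[, "1"] / sum(tab[, "1"])
woe <- log(dist.good / dist.bad)
woe.problem <- names(woe)[!is.finite(woe)]

too.many.modalities
small.bins
woe.problem
```

In this constructed example bin D is flagged twice: it holds 3% of the observations and has no events, so its WoE is not finite.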
suppressMessages(library(PDtoolkit))
data(loans)
# identify numeric risk factors (excluding the target Creditability)
num.rf <- sapply(loans, is.numeric)
num.rf <- names(num.rf)[!names(num.rf) %in% "Creditability" & num.rf]
# discretize numeric risk factors using ndr.bin from the monobin package
loans[, num.rf] <- sapply(num.rf, function(x)
  ndr.bin(x = loans[, x], y = loans[, "Creditability"])[[2]])
str(loans)
# run the stepwise selection starting from the intercept-only model
res <- stepFWD(start.model = Creditability ~ 1,
               p.value = 0.05,
               coding = "dummy",
               db = loans)
summary(res$model)$coefficients
# observed average of the target per bin of Instalment_per_cent
rf.check <- tapply(res$dev.db$Creditability,
                   res$dev.db$Instalment_per_cent,
                   mean)
rf.check
# change in the observed average between consecutive bins
diff(rf.check)
# risk factors selected at each iteration
res$steps
head(res$dev.db)