The dataset contains data of past credit applicants. The applicants are rated as good or bad. Models of this data can be used to determine if new applicants present a good or bad credit risk.
data("GermanCredit")
A data frame containing 1,000 observations on 21 variables.
factor variable indicating the status of the existing checking account, with levels ... < 0 DM, 0 <= ... < 200 DM, ... >= 200 DM/salary for at least 1 year and no checking account.
duration in months.
factor variable indicating credit history, with levels no credits taken/all credits paid back duly, all credits at this bank paid back duly, existing credits paid back duly till now, delay in paying off in the past and critical account/other credits existing.
factor variable indicating the credit's purpose, with levels car (new), car (used), furniture/equipment, radio/television, domestic appliances, repairs, education, retraining, business and others.
credit amount.
factor. savings account/bonds, with levels ... < 100 DM, 100 <= ... < 500 DM, 500 <= ... < 1000 DM, ... >= 1000 DM and unknown/no savings account.
ordered factor indicating the duration of the current employment, with levels unemployed, ... < 1 year, 1 <= ... < 4 years, 4 <= ... < 7 years and ... >= 7 years.
installment rate in percentage of disposable income.
factor variable indicating personal status and sex, with levels male:divorced/separated, female:divorced/separated/married, male:single, male:married/widowed and female:single.
factor. Other debtors, with levels none, co-applicant and guarantor.
present residence since?
factor variable indicating the client's highest valued property, with levels real estate, building society savings agreement/life insurance, car or other and unknown/no property.
client's age.
factor variable indicating other installment plans, with levels bank, stores and none.
factor variable indicating housing, with levels rent, own and for free.
number of existing credits at this bank.
factor indicating employment status, with levels unemployed/unskilled - non-resident, unskilled - resident, skilled employee/official and management/self-employed/highly qualified employee/officer.
Number of people being liable to provide maintenance.
binary variable indicating if the customer has a registered telephone number.
binary variable indicating if the customer is a foreign worker.
binary variable indicating credit risk, with levels good and bad.
The use of a cost matrix is suggested for this dataset. It is worse to classify a customer as good when they are bad (cost = 5) than to classify a customer as bad when they are good (cost = 1).
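The cost matrix described above can be written down and applied to a confusion table in base R. The following is a minimal sketch, not part of the original help page: the object names (cost, confusion, total_cost) and the confusion counts are made up for illustration, and the layout (rows = actual class, columns = predicted class) is an assumption chosen here.

```r
# Cost matrix as described in the text.
# Assumed layout: rows = actual class, columns = predicted class.
classes <- c("good", "bad")
cost <- matrix(c(0, 5,   # predicted good: actual good costs 0, actual bad costs 5
                 1, 0),  # predicted bad:  actual good costs 1, actual bad costs 0
               nrow = 2,
               dimnames = list(actual = classes, predicted = classes))

# Hypothetical confusion counts (illustrative numbers only, same layout).
confusion <- matrix(c(600, 120,
                      100, 180),
                    nrow = 2,
                    dimnames = list(actual = classes, predicted = classes))

# Total and average misclassification cost under the cost matrix.
total_cost <- sum(cost * confusion)
mean_cost <- total_cost / sum(confusion)
print(total_cost)
print(mean_cost)
```

Note that a table built as table(predicted, actual) has the opposite orientation and would need to be transposed with t() before being multiplied by this matrix.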
The original data was provided by:
Professor Dr. Hans Hofmann, Institut fuer Statistik und Oekonometrie, Universitaet Hamburg, FB Wirtschaftswissenschaften, Von-Melle-Park 5, 2000 Hamburg 13
The dataset has been taken from the UCI Repository Of Machine Learning Databases at
http://archive.ics.uci.edu/ml/.
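## load the package that provides evtree() and the dataset
library("evtree")
data("GermanCredit")
summary(GermanCredit)
## Not run:
## case weights: misclassifying a bad applicant costs 5 times as much
gcw <- array(1, nrow(GermanCredit))
gcw[GermanCredit$credit_risk == "bad"] <- 5
## fix the random number generator for reproducibility
suppressWarnings(RNGversion("3.5.0"))
set.seed(1090)
## fit the tree with case weights, then inspect it
gct <- evtree(credit_risk ~ . , data = GermanCredit, weights = gcw)
gct
## confusion table: rows = predicted, columns = observed
table(predict(gct), GermanCredit$credit_risk)
plot(gct)
## End(Not run)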
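The case weights in the example appear to mirror the cost matrix: each bad applicant is given weight 5 and each good applicant weight 1. The sketch below shows the same construction on a small made-up response vector in base R, so it runs without the package; the toy data and the names w and weight_by_class are illustrative assumptions.

```r
# Toy response vector (made-up data, not taken from GermanCredit).
credit_risk <- factor(c("good", "bad", "good", "good", "bad"),
                      levels = c("good", "bad"))

# Start with weight 1 for everyone, then raise the weight of "bad" cases to 5.
w <- rep(1, length(credit_risk))
w[credit_risk == "bad"] <- 5

# Total weight carried by each class.
weight_by_class <- tapply(w, credit_risk, sum)
print(w)
print(weight_by_class)
```

With these weights, an error on a bad applicant counts five times as much as an error on a good applicant when a model that accepts case weights is fitted.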