Statlog German Credit
The dataset contains data on past credit applicants, each rated as good or bad. Models fitted to these data can be used to assess whether new applicants present a good or bad credit risk.
A data frame containing 1,000 observations on 21 variables.
factor variable indicating the status of the existing checking account, with levels
... < 100 DM,
0 <= ... < 200 DM,
... >= 200 DM/salary for at least 1 year, and
no checking account.
duration in months.
factor variable indicating credit history, with levels
no credits taken/all credits paid back duly,
all credits at this bank paid back duly,
existing credits paid back duly till now,
delay in paying off in the past, and
critical account/other credits existing.
factor variable indicating the credit's purpose, with levels
factor variable indicating savings account/bonds, with levels
... < 100 DM,
100 <= ... < 500 DM,
500 <= ... < 1000 DM,
... >= 1000 DM, and
unknown/no savings account.
ordered factor indicating the duration of the current employment, with levels
... < 1 year,
1 <= ... < 4 years,
4 <= ... < 7 years, and
... >= 7 years.
installment rate in percentage of disposable income.
factor variable indicating personal status and sex, with levels
factor variable indicating other debtors, with levels
present residence since.
factor variable indicating the client's highest valued property, with levels
building society savings agreement/life insurance,
car or other, and
factor variable indicating other installment plans, with levels
factor variable indicating housing, with levels
number of existing credits at this bank.
factor indicating employment status, with levels
unemployed/unskilled - non-resident,
unskilled - resident,
management/self-employed/highly qualified employee/officer.
number of people liable to provide maintenance.
binary variable indicating if the customer has a registered telephone number.
binary variable indicating if the customer is a foreign worker.
binary variable indicating credit risk, with levels good and bad.
The use of a cost matrix is suggested for this dataset. It is worse to classify a customer as good when they are bad (cost = 5) than it is to classify a customer as bad when they are good (cost = 1).
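As a rough illustration of how the suggested cost matrix might be applied, the following Python sketch totals the misclassification cost of a set of predictions and picks the label with the lower expected cost for a given probability of being bad. The names `COST`, `total_cost` and `cost_sensitive_label`, and the toy labels, are hypothetical and not part of the dataset or any package; correct classifications are assumed to cost 0.

```python
# Costs from the text, keyed by (actual, predicted).
# Assumption: correct predictions cost 0.
COST = {
    ("bad", "good"): 5,   # bad customer classified as good
    ("good", "bad"): 1,   # good customer classified as bad
    ("good", "good"): 0,
    ("bad", "bad"): 0,
}


def total_cost(actual, predicted):
    """Sum the misclassification cost over paired actual/predicted labels."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum(COST[(a, p)] for a, p in zip(actual, predicted))


def cost_sensitive_label(p_bad):
    """Return the label with the lower expected cost, given P(bad) = p_bad."""
    expected_cost_if_good = p_bad * COST[("bad", "good")]
    expected_cost_if_bad = (1.0 - p_bad) * COST[("good", "bad")]
    return "bad" if expected_cost_if_bad < expected_cost_if_good else "good"


# Toy labels, made up for illustration only.
actual = ["good", "bad", "good", "bad"]
predicted = ["good", "good", "bad", "bad"]
example_cost = total_cost(actual, predicted)  # one cost-5 error plus one cost-1 error
```

Under these costs, predicting bad is cheaper in expectation whenever 5 * P(bad) > 1 - P(bad), that is, whenever the estimated probability of being bad exceeds 1/6.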
The original data was provided by:
Professor Dr. Hans Hofmann, Institut fuer Statistik und Oekonometrie, Universitaet Hamburg, FB Wirtschaftswissenschaften, Von-Melle-Park 5, 2000 Hamburg 13
The dataset has been taken from the UCI Repository Of Machine Learning Databases at