credit | R Documentation
Preprocessed version of the German Credit Risk dataset available on Kaggle, which is based on the Statlog (German Credit Data) dataset of Hofmann (1994) available from the UCI Machine Learning Repository.
A data frame with 522 rows and 6 variables:
age: age of the customer [19-75]
sex: sex of the customer (female, male)
saving.accounts: saving account balance of the customer (little, moderate, rich)
duration: payback duration of the credit in months [6-72]
credit.amount: credit amount [276-18424]
risk: whether the credit is of low/good or high/bad risk (bad, good)
The dataset was further adapted: rows with missing values were removed, low-cardinality classes were binned, classes of the job feature were renamed, the features on the savings and checking accounts were defined as ordinal variables, and all feature names were converted to lowercase. Only a subset of features was selected: "age", "sex", "saving.accounts", "duration", "credit.amount", "risk".
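The adaptation steps described above can be sketched in base R roughly as follows. This is an illustration only, not the code that produced the dataset: the toy data frame `raw`, its values, its capitalized column names, and the choice to merge a hypothetical "quite rich" class into "rich" are all assumptions made for the example.

```r
# Toy stand-in for the raw table; all values are invented for illustration.
raw <- data.frame(
  Age = c(67, 22, 49, 45, 53),
  Sex = c("male", "female", "male", "male", "male"),
  Saving.accounts = c(NA, "little", "little", "quite rich", "rich"),
  Duration = c(6, 48, 12, 42, 24),
  Credit.amount = c(1169, 5951, 2096, 7882, 4870),
  Purpose = c("car", "car", "education", "furniture", "car"),
  Risk = c("good", "bad", "good", "good", "bad"),
  stringsAsFactors = FALSE
)

# 1. Remove rows with missing values.
credit_toy <- na.omit(raw)

# 2. Convert all feature names to lowercase.
names(credit_toy) <- tolower(names(credit_toy))

# 3. Bin a low-cardinality class (assumption: "quite rich" is merged into "rich").
credit_toy$saving.accounts[credit_toy$saving.accounts == "quite rich"] <- "rich"

# 4. Define the savings feature as an ordinal variable.
credit_toy$saving.accounts <- factor(
  credit_toy$saving.accounts,
  levels = c("little", "moderate", "rich"),
  ordered = TRUE
)

# 5. Keep only the selected subset of features.
keep <- c("age", "sex", "saving.accounts", "duration", "credit.amount", "risk")
credit_toy <- credit_toy[, keep]

str(credit_toy)
```

Declaring the savings feature as an ordered factor keeps the ranking little < moderate < rich available to methods that distinguish ordinal from nominal features.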
Hofmann, Hans (1994). “Statlog (German Credit Data).” UCI Machine Learning Repository. https://archive.ics.uci.edu/dataset/144/statlog+german+credit+data.
Ferreira L (2018). “German credit risk.” Last accessed 10.04.2024, https://www.kaggle.com/datasets/kabure/german-credit-data-with-risk.