germancredit: Modified German credit dataset

Description

germancredit is a credit scoring dataset that can be used to study algorithmic (un)fairness. The data were originally used to predict defaults on consumer loans in the German market. A model predicting default has already been fit, and its predicted probabilities and predicted status (yes/no) for default have been appended to the original data.

Usage

germancredit
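As a minimal sketch of loading the data, assuming the fairness package is installed (the guard below only avoids an error when it is not), the dataset can be attached and checked against the Format section:

```r
# Assumption: the fairness package, which ships this dataset, is installed.
if (requireNamespace("fairness", quietly = TRUE)) {
  # Load the data frame into the current environment
  data("germancredit", package = "fairness")

  # The Format section reports 1000 rows and 23 variables
  print(dim(germancredit))

  # Inspect the outcome, the gender variable, and the two appended model columns
  str(germancredit[, c("BAD", "Female", "probability", "predicted")])
}
```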

Format

A data frame with 1000 rows and 23 variables:

Account_status

factor, status of existing checking account

Duration

numeric, loan duration in months

Credit_history

factor, previous credit history

Purpose

factor, loan purpose

Amount

numeric, credit amount

Savings

factor, savings account/bonds

Employment

factor, present employment since

Installment_rate

numeric, installment rate in percentage of disposable income

Guarantors

factor, other debtors / guarantors

Resident_since

factor, present residence since

Property

factor, property

Age

numeric, age in years

Other_plans

factor, other installment plans

Housing

factor, housing

Num_credits

numeric, number of existing credits at this bank

Job

factor, job

People_maintenance

numeric, number of people being liable to provide maintenance for

Phone

factor, telephone

Foreign

factor, foreign worker

BAD

factor, GOOD/BAD for whether a customer has defaulted on a loan. This is the outcome (target) variable in this dataset.

Female

factor, female/male for gender

probability

numeric, predicted probabilities for default, ranges from 0 to 1

predicted

numeric, predicted values for default, 0/1 for no/yes
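To illustrate how the BAD, Female and predicted columns might be combined in a first fairness check, the sketch below compares observed and predicted default rates per group. The helper default_rates_by_group() is hypothetical (it is not part of the fairness package), and the toy data frame contains made-up values that only mimic the column names and codings described above.

```r
# Hypothetical helper, not part of the fairness package: share of observed and
# predicted defaults within each level of a grouping column.
default_rates_by_group <- function(df, group = "Female") {
  grp       <- as.character(df[[group]])
  observed  <- as.character(df$BAD) == "BAD"  # TRUE when the customer defaulted
  predicted <- df$predicted == 1              # TRUE when the model predicts default
  data.frame(
    group          = sort(unique(grp)),
    observed_rate  = as.vector(tapply(observed, grp, mean)),
    predicted_rate = as.vector(tapply(predicted, grp, mean))
  )
}

# Toy data with made-up values, using the codings documented above
# (BAD: GOOD/BAD, Female: female/male, predicted: 0/1).
toy <- data.frame(
  BAD       = factor(c("BAD", "GOOD", "GOOD", "BAD", "GOOD", "GOOD")),
  Female    = factor(c("female", "female", "female", "male", "male", "male")),
  predicted = c(1, 0, 1, 0, 0, 0)
)

rates <- default_rates_by_group(toy)
print(rates)
```

With the real data loaded, the same call would be default_rates_by_group(germancredit). A large gap between groups in predicted_rate that is not mirrored in observed_rate is one informal signal worth following up with formal fairness metrics.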

Source

The dataset has undergone modifications (e.g. categorical variables were encoded, a prediction model was fit, and its predicted probabilities and predicted status were appended to the original dataset).


fairness documentation built on April 14, 2021, 5:09 p.m.