psi: Population Stability Index (PSI)


psi R Documentation

Population Stability Index (PSI)

Description

psi calculates the Population Stability Index (PSI) for given base and target vectors. The function can be used to test the stability of a final model score, as well as the stability of an individual risk factor (also known as the Characteristic Stability Index). It also provides critical values based on the z-score (under a normal distribution assumption) and on the chi-square distribution, which can be used as alternatives to the fixed "rule of thumb" thresholds (10% and 25%). For details, see the References.
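As an illustration of the quantity described above, the following is a minimal sketch of the commonly used PSI formula: the sum over categories of (target share - base share) * log(target share / base share). The helper name psi_sketch is hypothetical and not part of PDtoolkit, and the package's own implementation may differ, for example in how it treats empty categories.

```r
# Hypothetical helper, not part of PDtoolkit: PSI for two categorical vectors.
psi_sketch <- function(base, target) {
  # Use a common set of levels so that both distributions align
  lv <- sort(unique(c(as.character(base), as.character(target))))
  p.base <- as.numeric(prop.table(table(factor(base, levels = lv))))
  p.target <- as.numeric(prop.table(table(factor(target, levels = lv))))
  # Sum over categories; a category that is empty in either sample
  # makes the result non-finite, which this sketch does not handle
  sum((p.target - p.base) * log(p.target / p.base))
}

base <- c(rep("A", 50), rep("B", 30), rep("C", 20))
target <- c(rep("A", 40), rep("B", 30), rep("C", 30))
psi_sketch(base, target)
```

Identical base and target distributions give a PSI of 0, and the value grows as the two distributions diverge.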

Usage

psi(base, target, bin = 10, alpha = 0.05)

Arguments

base

Vector of values from the base sample. Usually this is the training (model development) sample.

target

Vector of values from the target sample. Usually this is the testing or portfolio application sample.

bin

Number of bins. Applied only when base and target are numeric, in which case it is used to discretize their values. Default is 10.
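The documentation does not state how the numeric values are discretized. One plausible approach, shown here purely as an assumption, is to derive cut points from quantiles of the base sample and apply the same cut points to the target sample. The helper name bin_sketch is hypothetical and is not part of PDtoolkit.

```r
# Hypothetical helper, not part of PDtoolkit: bin two numeric vectors on a common grid.
bin_sketch <- function(base, target, bin = 10) {
  # Cut points from quantiles of the base sample (an assumption, not the package's documented method)
  cp <- unique(quantile(base, probs = seq(0, 1, length.out = bin + 1), na.rm = TRUE))
  # Open the outer bins so that target values outside the base range are still assigned
  cp[1] <- -Inf
  cp[length(cp)] <- Inf
  list(base = cut(base, breaks = cp), target = cut(target, breaks = cp))
}

set.seed(1)
b <- bin_sketch(base = rnorm(1000), target = rnorm(500, mean = 0.2), bin = 10)
table(b$base)    # counts per bin in the base sample
table(b$target)  # counts per bin in the target sample
```

Using the base sample to define the bins keeps the base distribution roughly uniform across bins, so shifts in the target sample show up as uneven bin shares.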

alpha

Significance level used to calculate the statistical critical values (cv.zscore and cv.chisq). Default is 0.05, which corresponds to a confidence level of 0.95.
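To show how alpha could enter such critical values, here is a rough sketch based on one reading of the cited reference, in which PSI is approximately distributed as (1/N + 1/M) times a chi-square variable with B - 1 degrees of freedom (N and M are the base and target sample sizes, B is the number of bins). The helper psi_cv_sketch and its arguments are hypothetical. Only the names cv.zscore and cv.chisq come from this page, and the package may compute these values differently.

```r
# Hypothetical helper, not part of PDtoolkit: approximate PSI critical values.
psi_cv_sketch <- function(n.base, n.target, bins, alpha = 0.05) {
  k <- 1 / n.base + 1 / n.target  # scaling factor from the two sample sizes
  dof <- bins - 1                 # assumed degrees of freedom
  c(
    # Normal approximation: mean dof * k, standard deviation sqrt(2 * dof) * k
    cv.zscore = k * (dof + qnorm(1 - alpha) * sqrt(2 * dof)),
    # Scaled chi-square quantile with dof degrees of freedom
    cv.chisq = k * qchisq(1 - alpha, df = dof)
  )
}

psi_cv_sketch(n.base = 700, n.target = 300, bins = 10, alpha = 0.05)
```

Under this approximation, a smaller alpha or smaller samples lead to larger critical values, unlike the fixed 10% and 25% thresholds, which ignore sample size.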

Value

psi returns a list of two data frames. The first data frame contains the PSI value along with the statistical critical values for a confidence level of 1 - alpha. The second data frame contains the summary table used to calculate the overall PSI. For numeric base and target vectors, the summary table is presented at the bin (bucket) level; for categorical vectors, the modalities of base and target are tabulated.

References

Yurdakul, B. (2018). Statistical Properties of Population Stability Index. Dissertations. 3208.

Examples

suppressMessages(library(PDtoolkit))
data(loans)
# split the data into training and testing sets
set.seed(1122)
tt.indx <- sample(seq_len(nrow(loans)), 700, replace = FALSE)
training <- loans[tt.indx, ]
testing <- loans[-tt.indx, ]
# calculate PSI for a numeric risk factor
psi(base = training[, "Age (years)"], target = testing[, "Age (years)"],
    bin = 10, alpha = 0.05)
# calculate PSI for a categorical risk factor (bin applies only to numeric vectors)
psi(base = training[, "Account Balance"], target = testing[, "Account Balance"],
    bin = 10, alpha = 0.05)

PDtoolkit documentation built on Sept. 20, 2023, 9:06 a.m.