PSI.calc: Calculate the Population Stability Index for a Given Variable


Description

PSI.calc calculates the population stability index (PSI).

Usage

PSI.calc(main_data, second_data, variable, bin = 10)

Arguments

main_data

The main data set of a measurement. Should be a factor or a numeric vector.

variable

The variable to be specified. It does not apply when main_data and second_data are factors, and it cannot be NULL when main_data and second_data are numeric. The function uses this argument to bin main_data and second_data into left-closed, right-open intervals.

bin

The desired number of bins. The default is 10.

second_data

The second data set of a measurement. Should be a factor or a numeric vector.
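The left-closed, right-open binning mentioned for numeric data can be illustrated with base R's cut(). This is only a sketch and not the package's internal code: the helper name bin_left_closed and the break points are assumptions made for illustration.

```r
# Hypothetical helper (not part of the package): bin a numeric vector
# into left-closed, right-open intervals [a, b) using base R's cut().
bin_left_closed <- function(x, breaks) {
  cut(x, breaks = breaks, right = FALSE)
}

# Illustrative break points and data (assumed values)
breaks <- c(0, 2, 4, 6)
x <- c(0, 1.9, 2, 3.5, 4, 5.9)

binned <- bin_left_closed(x, breaks)

# A value equal to a break point falls into the interval that starts there,
# so 2 lands in [2,4) and 4 lands in [4,6)
table(binned)
```

With right = FALSE, a value equal to the largest break point would become NA unless include.lowest = TRUE is also set, so the outer break points should cover the full range of both data sets.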

Details

PSI measures the stability of a population. As a rule of thumb, the population can be considered unchanged if PSI is less than 0.1, and a significant shift is indicated if PSI is greater than 0.25. The result of this function is a numeric value, with details stored as attributes; use the summary function to see all of the detailed information. If some levels have no elements in either the original population or the current population, the PSI does not exist for those levels; such empty levels are left out of the calculation and a warning informs you of this. Again, summary shows the full details.
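For orientation, the commonly used PSI formula sums (q - p) * log(q / p) over the levels, where p and q are the proportions of each level in the original and current populations. The sketch below applies that formula to two factors and skips empty levels with a warning, as described above. It is an assumption about how such a calculation can be done, not the package's actual implementation; psi_sketch is a hypothetical name.

```r
# Hypothetical helper (not the package's code): PSI between two factors
# using the common formula sum((q - p) * log(q / p)).
psi_sketch <- function(original, current) {
  # Use a shared set of levels so the proportions line up
  lv <- union(levels(original), levels(current))
  p <- as.numeric(prop.table(table(factor(original, levels = lv))))
  q <- as.numeric(prop.table(table(factor(current, levels = lv))))

  # The term is undefined when a level is empty in either population
  keep <- p > 0 & q > 0
  if (!all(keep)) {
    warning("Some levels are empty in one population and were skipped.")
  }
  sum((q[keep] - p[keep]) * log(q[keep] / p[keep]))
}

# Illustrative data (assumed values)
a <- factor(c("x", "x", "y", "y"))
b <- factor(c("x", "x", "x", "y"))

psi_sketch(a, a)  # identical populations give 0
psi_sketch(a, b)  # a shift in proportions gives a positive value
```

Each term in the sum is non-negative, because q - p and log(q / p) always share the same sign, so the result grows as the two distributions move apart.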

Value

A psi object: a numeric value with details stored as attributes.

Examples

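# Split iris into a 70% training sample and a 30% test sample
data("iris")
train <- sample(nrow(iris), nrow(iris) * .7)
# Species labels (a factor) in each sample
train.species <- iris$Species[train]
test.species <- iris$Species[-train]
# PSI of Species between the two samples
p <- PSI.calc(train.species, test.species)
p
# Detailed information stored as attributes
summary(p)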

ayhandis/creditR documentation built on May 9, 2019, 8:41 a.m.