get_psi_iv_all: Calculate IV & PSI

View source: R/data_anaylsis.R

Description

get_psi_iv calculates the Information Value (IV) and Population Stability Index (PSI) of a single independent variable. get_psi_iv_all loops over all specified independent variables and calculates IV and PSI for each.
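
For orientation, the conventional per-bin definitions of the two statistics are sketched below. The symbols k, g_i, b_i, e_i and a_i are notation introduced here for illustration, not names from the package, and the package's handling of edge cases such as empty bins may differ.

```latex
\begin{aligned}
\mathrm{IV}  &= \sum_{i=1}^{k} (g_i - b_i)\,\ln\frac{g_i}{b_i}, \\
\mathrm{PSI} &= \sum_{i=1}^{k} (a_i - e_i)\,\ln\frac{a_i}{e_i}
\end{aligned}
```

Here k is the number of bins, g_i and b_i are the shares of all negative-class and positive-class observations that fall in bin i, and e_i and a_i are the shares of the expected (base) and actual (test) samples that fall in bin i.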

Usage

get_psi_iv_all(
  dat,
  dat_test = NULL,
  x_list = NULL,
  target,
  ex_cols = NULL,
  pos_flag = NULL,
  breaks_list = NULL,
  occur_time = NULL,
  oot_pct = 0.7,
  equal_bins = FALSE,
  cut_bin = "equal_depth",
  tree_control = NULL,
  bins_control = NULL,
  bins_total = FALSE,
  best = TRUE,
  g = 10,
  as_table = TRUE,
  note = FALSE,
  parallel = FALSE,
  bins_no = TRUE
)

get_psi_iv(
  dat,
  dat_test = NULL,
  x,
  target,
  pos_flag = NULL,
  breaks = NULL,
  breaks_list = NULL,
  occur_time = NULL,
  oot_pct = 0.7,
  equal_bins = FALSE,
  cut_bin = "equal_depth",
  tree_control = NULL,
  bins_control = NULL,
  bins_total = FALSE,
  best = TRUE,
  g = 10,
  as_table = TRUE,
  note = FALSE,
  bins_no = TRUE
)

Arguments

dat

A data.frame with independent variables and target variable.

dat_test

A data.frame of test data. Default is NULL.

x_list

Names of independent variables.

target

The name of target variable.

ex_cols

A list of excluded variables. Regular expressions can also be used to match variable names. Default is NULL.

pos_flag

The value of positive class of target variable, default: "1".

breaks_list

A table containing a list of splitting points for each independent variable. Default is NULL.

occur_time

The name of the variable that represents the time at which each observation takes place.

oot_pct

Percentage of observations retained for the out-of-time test (mainly used to calculate PSI). Default is 0.7.

equal_bins

Logical, generates initial breaks for equal frequency or width binning.

cut_bin

A string giving the binning method used when equal_bins is TRUE: 'equal_depth' or 'equal_width'. Default is 'equal_depth'.

tree_control

Parameters for using a decision tree to segment initial breaks. See details: get_tree_breaks

bins_control

Parameters used to control binning. See details: select_best_class, select_best_breaks

bins_total

Logical, whether to add a Total row summing the bins of each variable.

best

Logical, merge initial breaks to get optimal breaks for binning.

g

Number of initial breakpoints for equal frequency binning.

as_table

Logical, output results in a table. Default is TRUE.

note

Logical, outputs info. Default is FALSE.

parallel

Logical, parallel computing. Default is FALSE.

bins_no

Logical, add serial numbers to bins. Default is TRUE.

x

The name of an independent variable.

breaks

Splitting points for an independent variable. Default is NULL.

See Also

get_iv, get_iv_all, get_psi, get_psi_all

Examples

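library(creditmodel)
# IV and PSI for several variables, using equal-frequency initial bins
iv_list <- get_psi_iv_all(dat = UCICreditCard[1:1000, ],
                          x_list = names(UCICreditCard)[3:5], equal_bins = TRUE,
                          target = "default.payment.next.month",
                          ex_cols = "ID|apply_date")
# IV and PSI for a single variable, with a Total row appended
get_psi_iv(UCICreditCard, x = "PAY_3",
           target = "default.payment.next.month", bins_total = TRUE)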

Example output

Package 'creditmodel' version 1.2.7
  Feature         bins cuts #total #expected expected_0 expected_1 #actual
1   PAY_3        00.NA   -1   5938      4089       3449        640    1849
2   PAY_3 01.(-Inf,-1]    1   4085      2859       2324        535    1226
3   PAY_3    02.(-1,1]  Inf  15768     11077       9167       1910    4691
4   PAY_3  03.(1, Inf] <NA>   4209      2975       1410       1565    1234
5   Total           --   --  30000     21000      16350       4650    9000
  actual_0 actual_1 %total %expected %actual %total_1 %expected_1 %actual_1
1     1563      286    0.2      0.19    0.21     0.16        0.16      0.15
2     1004      222   0.14      0.14    0.14     0.19        0.19      0.18
3     3849      842   0.53      0.53    0.52     0.17        0.17      0.18
4      598      636   0.14      0.14    0.14     0.52        0.53      0.52
5     7014     1986      1         1       1     0.22        0.22      0.22
  odds_ratio odds_ratio_s  PSIi   IVi
1      1.537            0 0.001 0.032
2      1.249        0.002     0 0.006
3      1.343        0.004     0 0.042
4      0.259            0     0 0.332
5          1        0.006 0.001 0.412
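
As a rough cross-check of the table above, the following base R sketch recomputes odds_ratio, IVi and PSIi for the first bin (00.NA) from the printed counts. It assumes the conventional formulas, and it assumes that IV is computed on the combined expected and actual samples when bins_total = TRUE. This is inferred from the printed numbers, not taken from the package source, and all variable names are illustrative.

```r
# Counts for bin 00.NA and for the Total row, copied from the example output.
total_bin    <- 5938; total_all    <- 30000
expected_bin <- 4089; expected_all <- 21000
actual_bin   <- 1849; actual_all   <- 9000

# Positive-class counts: expected_1 + actual_1 (assumption: combined sample).
bad_bin  <- 640 + 286
bad_all  <- 4650 + 1986
good_bin <- total_bin - bad_bin
good_all <- total_all - bad_all

# Share of all negatives and of all positives that fall into this bin.
good_dist  <- good_bin / good_all
bad_dist   <- bad_bin / bad_all
odds_ratio <- good_dist / bad_dist
iv_i       <- (good_dist - bad_dist) * log(odds_ratio)

# Share of the expected and of the actual sample that fall into this bin.
expected_pct <- expected_bin / expected_all
actual_pct   <- actual_bin / actual_all
psi_i        <- (actual_pct - expected_pct) * log(actual_pct / expected_pct)

round(c(odds_ratio = odds_ratio, IVi = iv_i, PSIi = psi_i), 3)
```

Rounded to three decimals, the results agree with the first row of the printed table (1.537, 0.032 and 0.001). The Total row is then the sum of the per-bin IVi and PSIi values.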

creditmodel documentation built on Jan. 7, 2022, 5:06 p.m.