get_iv_all: Calculate Information Value (IV)


View source: R/data_anaylsis.R

Description

Calculate Information Value (IV). get_iv calculates the Information Value (IV) of a single independent variable. get_iv_all loops over all specified independent variables and calculates the IV of each.

Usage

get_iv_all(
  dat,
  x_list = NULL,
  ex_cols = NULL,
  breaks_list = NULL,
  target = NULL,
  pos_flag = NULL,
  best = TRUE,
  equal_bins = FALSE,
  tree_control = NULL,
  bins_control = NULL,
  g = 10,
  parallel = FALSE,
  note = FALSE
)

get_iv(
  dat,
  x,
  target = NULL,
  pos_flag = NULL,
  breaks = NULL,
  breaks_list = NULL,
  best = TRUE,
  equal_bins = FALSE,
  tree_control = NULL,
  bins_control = NULL,
  g = 10,
  note = FALSE
)

Arguments

dat

A data.frame with independent variables and target variable.

x_list

Names of independent variables.

ex_cols

A list of excluded variables. Regular expressions can also be used to match variable names. Default is NULL.

breaks_list

A table containing a list of splitting points for each independent variable. Default is NULL.

target

The name of target variable.

pos_flag

Value of the positive class. Default is "1".

best

Logical, merge initial breaks to get optimal breaks for binning.

equal_bins

Logical, generates initial breaks for equal frequency binning.

tree_control

Parameters for using a decision tree to generate the initial breaks. See details: get_tree_breaks

bins_control

Parameters used to control binning. See details: select_best_class, select_best_breaks

g

Number of initial breakpoints for equal frequency binning.

parallel

Logical, parallel computing. Default is FALSE.

note

Logical, outputs info. Default is FALSE.

x

The name of an independent variable.

breaks

Splitting points for an independent variable. Default is NULL.
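As a rough illustration of what equal frequency binning means for equal_bins and g, the base-R sketch below derives cut points from quantiles. The helper equal_freq_breaks is hypothetical and not part of creditmodel, and it assumes g is read as the number of equal-frequency groups; the package's own procedure may differ.

```r
# Hypothetical helper, not part of creditmodel: equal-frequency cut points.
# Assumption: g is treated as the number of groups, giving g - 1 interior breaks.
equal_freq_breaks <- function(x, g = 10) {
  # Interior quantile levels 1/g, 2/g, ..., (g - 1)/g
  probs <- seq(0, 1, length.out = g + 1)[-c(1, g + 1)]
  # Drop duplicate cut points that appear when x has many ties
  unique(unname(quantile(x, probs = probs, na.rm = TRUE)))
}

x <- 1:100
brks <- equal_freq_breaks(x, g = 4)

# Assign each value to a bin bounded by the cut points
bins <- cut(x, breaks = c(-Inf, brks, Inf))
table(bins)
```

With heavily tied data, several quantiles can coincide, so the number of distinct breaks may be smaller than g - 1.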

Details

Rules of thumb for evaluating the strength of a predictor by its IV:

Less than 0.02: unpredictive
0.02 to 0.1: weak
0.1 to 0.3: medium
0.3 and above: strong
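To make these thresholds concrete, the sketch below computes IV by hand for an already binned variable, using the standard definition: the sum over bins of (share of positives minus share of negatives) times the natural log of their ratio. It then maps the result to the rule-of-thumb labels. Both helpers, iv_by_hand and iv_strength, are hypothetical and not part of creditmodel. Treating each range as closed on the left is an assumption, and the labels printed by the package may differ.

```r
# Hypothetical helper, not part of creditmodel: IV of a binned variable.
# No smoothing is applied, so a bin with zero positives or zero negatives
# yields an infinite or undefined term.
iv_by_hand <- function(bin, y, pos_flag = "1") {
  is_pos <- as.character(y) == pos_flag
  pos <- tapply(is_pos, bin, sum)    # positives per bin
  neg <- tapply(!is_pos, bin, sum)   # negatives per bin
  p_pos <- pos / sum(pos)            # share of all positives in each bin
  p_neg <- neg / sum(neg)            # share of all negatives in each bin
  sum((p_pos - p_neg) * log(p_pos / p_neg))
}

# Hypothetical helper: map an IV to the rule-of-thumb labels.
# Assumption: intervals are closed on the left, e.g. [0.02, 0.1) is "weak".
iv_strength <- function(iv) {
  as.character(cut(iv,
                   breaks = c(-Inf, 0.02, 0.1, 0.3, Inf),
                   labels = c("unpredictive", "weak", "medium", "strong"),
                   right = FALSE))
}

# Toy data: two bins of 100 rows with 20% and 40% positives
bin <- rep(c("low", "high"), each = 100)
y <- c(rep(1, 20), rep(0, 80), rep(1, 40), rep(0, 60))

iv <- iv_by_hand(bin, y)   # about 0.234 for this toy data
iv_strength(iv)
```

Because the formula is symmetric in the two classes, swapping which class counts as positive leaves the IV unchanged.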

References

Information Value Statistic: Bruce Lund, Magnify Analytics Solutions, a Division of Marketing Associates, Detroit, MI (Paper AA-14-2013)

See Also

get_iv, get_iv_all, get_psi, get_psi_all

Examples

library(creditmodel)

# IV for columns 3 to 10, using equal frequency bins without merging
get_iv_all(dat = UCICreditCard,
           x_list = names(UCICreditCard)[3:10],
           equal_bins = TRUE, best = FALSE,
           target = "default.payment.next.month",
           ex_cols = "ID|apply_date")

# IV for a single variable
get_iv(UCICreditCard, x = "PAY_3",
       equal_bins = TRUE, best = FALSE,
       target = "default.payment.next.month")

Example output

Package 'creditmodel' version 1.2.7
    Feature    IV     strength
1 LIMIT_BAL 0.178       Strong
2       SEX 0.009 Unpredictive
3 EDUCATION 0.038         Weak
4  MARRIAGE 0.008 Unpredictive
5       AGE 0.021         Weak
6     PAY_0 0.874  Very Strong
7     PAY_2 0.545  Very Strong
8     PAY_3 0.413  Very Strong
  Feature    IV    strength
1   PAY_3 0.413 Very Strong

creditmodel documentation built on Jan. 7, 2022, 5:06 p.m.