Calculates information value for defined columns in a given data frame. Columns can be numeric or character (including factors).
df |
data frame with at least two columns |
y |
Column (integer or factor) with the binary outcome. It is suggested that y be a factor with two levels, "bad" and "good". If there are no good/bad levels, the following assumptions apply: if y is an integer, then 0 = good and 1 = bad; if y is a factor, then level 2 is assumed to mean bad and level 1 good. |
summary |
When summary is TRUE, only the total information value for each variable is returned. Output is sorted by information value, highest first. |
vars |
List of variables. If not specified, all character variables will be used |
verbose |
Prints additional details when TRUE. Useful mainly for debugging. |
rcontrol |
Additional parameters used for rpart tree
generation. Use |
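To avoid relying on the fallback rules for y described above, the outcome can be recoded explicitly before calling iv.mult. The following is a minimal base R sketch; the vector `y_int` and its values are made up for illustration and are not part of the package.

```r
# Hypothetical outcome vector coded as integers (0 = good, 1 = bad).
y_int <- c(0L, 1L, 0L, 0L, 1L)

# Recode to a factor with explicit "good"/"bad" labels so that no
# assumption about the integer coding or level order is needed.
y_fac <- factor(y_int, levels = c(0L, 1L), labels = c("good", "bad"))

levels(y_fac)  # "good" is level 1, "bad" is level 2
table(y_fac)   # counts per outcome
```

With this coding, level 2 is "bad" and level 1 is "good", which also matches the factor fallback rule.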
Information Value (IV) is a concept used in risk management to assess the predictive power of a variable. IV is computed from the Weight of Evidence (WoE) of each category of the variable.
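As an illustration of how IV relates to WoE, here is a hand computation in base R under a commonly used convention: WoE_i = ln(%good_i / %bad_i) and IV = sum over categories of (%good_i - %bad_i) * WoE_i. This convention is an assumption; the package may use the opposite sign for WoE, which leaves IV unchanged. The data frame `toy` is made up for the example.

```r
# Toy data (hypothetical): one categorical predictor and a good/bad outcome.
toy <- data.frame(
  housing = c("own", "own", "own", "own", "rent", "rent", "rent", "rent"),
  gb      = c("good", "good", "good", "bad", "good", "bad", "bad", "bad")
)

# Counts of good and bad outcomes per category.
counts <- table(toy$housing, toy$gb)

# Share of all goods / all bads that falls into each category.
pct_good <- counts[, "good"] / sum(counts[, "good"])
pct_bad  <- counts[, "bad"]  / sum(counts[, "bad"])

# Assumed convention: WoE = ln(%good / %bad) per category.
woe <- log(pct_good / pct_bad)

# IV is the sum of the per-category contributions.
iv <- sum((pct_good - pct_bad) * woe)

woe
iv  # equals log(3), about 1.0986, for this toy data
```

A category with zero goods or zero bads would give an infinite WoE under this formula, so real data usually needs some handling of empty cells.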
iv.mult(german_data,"gb")
iv.mult(german_data,"gb",TRUE)
iv.mult(german_data,"gb",TRUE,c("ca_status","housing","job","duration")) # str(german_data)
iv.mult(german_data,"gb",vars=c("ca_status","housing","job","duration"))
iv.mult(german_data,"gb",summary=TRUE, verbose=TRUE)
# Use varlist() function to get all numeric variables
iv.mult(german_data,y="gb",vars=varlist(german_data,"numeric"))