preBinningFun: Variable Pre-Binning, then Computing WOE and IV


Description

preBinningFun pre-bins variables and returns plots and detailed statistics that can be used for variable filtering. It integrates five binning methods (see Details). Important note: the WOE values calculated by this function are the opposite of the actual values.
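For orientation, the following base R sketch computes WOE and IV for a variable that has already been binned. It is not the package's implementation. The toy counts are made up, and the convention WOE = ln(share of goods / share of bads) is an assumption, since this page does not state which convention it treats as the actual value.

```r
# Toy counts per bin (made-up numbers, for illustration only).
bins <- data.frame(
  bin  = c("low", "mid", "high"),
  good = c(80, 60, 20),   # records with Y = 0
  bad  = c(10, 20, 30)    # records with Y = 1
)

# Share of all goods and of all bads that fall in each bin.
dist_good <- bins$good / sum(bins$good)
dist_bad  <- bins$bad  / sum(bins$bad)

# Assumed convention: WOE = ln(share of goods / share of bads).
bins$woe <- log(dist_good / dist_bad)

# The opposite (sign-reversed) WOE, i.e. ln(share of bads / share of goods).
bins$woe_reversed <- -bins$woe

# IV sums (difference in shares) * WOE over the bins.
iv          <- sum((dist_good - dist_bad) * bins$woe)
iv_reversed <- sum((dist_bad - dist_good) * bins$woe_reversed)

print(bins)
print(c(iv = iv, iv_reversed = iv_reversed))
```

Reversing the sign of WOE leaves IV unchanged, because the difference in shares changes sign together with the logarithm.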

Usage

preBinningFun(mydata, binMethod = 1, p = 0.05, aliquots = 5,
  mydict = NULL)

Arguments

mydata

A data frame containing only the X variables and the Y variable; the last column must be Y. All character X variables must be converted to factors in advance.
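A minimal base R sketch of preparing a data frame to meet these requirements is given here. The data frame raw and its column names are hypothetical, and the target is assumed to be a column named "y".

```r
# Hypothetical raw data: the target "y" is not last and "city" is character.
raw <- data.frame(
  y    = c(0, 1, 0, 1),
  age  = c(23, 45, 31, 52),
  city = c("A", "B", "A", "C"),
  stringsAsFactors = FALSE
)

# Convert every character column to a factor.
is_chr <- vapply(raw, is.character, logical(1))
raw[is_chr] <- lapply(raw[is_chr], factor)

# Reorder the columns so that the target comes last.
target <- "y"
mydata <- raw[, c(setdiff(names(raw), target), target)]

str(mydata)
```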

binMethod

An integer from 1 to 5 that selects one of five binning methods (see Details); default 1.

p

A numeric value giving the percentage of records per bin, expressed as a fraction from 0 to 0.5; default 0.05 (i.e. 5%).

aliquots

An integer specifying the number of bins for the equal-frequency or equal-interval binning method; default 5.

mydict

Optional, default NULL. Either a character string giving the file name of a variable dictionary in CSV format, or a data frame representing the variable dictionary. See fread_basedict for details of this parameter.

Details

binMethod takes an integer from 1 to 5 with the following meanings:

1: optimal binning, falling back to equal-frequency binning when optimal binning is not available.
2: optimal binning, falling back to equal-interval binning when optimal binning is not available.
3: equal-frequency binning.
4: equal-interval binning.
5: optimal binning only.
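The difference between equal-frequency and equal-interval binning can be sketched in base R as follows. This illustrates the two general techniques with quantile() and cut(); it is not taken from the package, and the toy variable x is an assumption. Optimal binning is not shown.

```r
set.seed(1)
x <- rexp(1000)   # skewed toy variable
aliquots <- 5     # number of bins, mirroring the default of the argument

# Equal-frequency: cut points at quantiles, so each bin holds
# about the same number of records.
freq_breaks <- quantile(x, probs = seq(0, 1, length.out = aliquots + 1))
freq_bins <- cut(x, breaks = freq_breaks, include.lowest = TRUE)

# Equal-interval: given a single number, cut() splits the range of x
# into that many intervals of equal width.
int_bins <- cut(x, breaks = aliquots)

print(table(freq_bins))
print(table(int_bins))
```

On skewed data the equal-interval bins tend to be very unevenly populated, whereas the equal-frequency bins hold similar counts but have unequal widths.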

This function generates four files in the current directory: 'binGraph.pdf', 'varSummary.csv', 'summaryIV.csv' and, if applicable, 'insignificantVars.csv'. It also writes a large number of CSV files to the '~/binDetails/' subdirectory, which is created automatically if it does not exist.

Value

A list

See Also

Other dataset binning and woe-encoding functions: convertCutPoints, dfBinningFun, executeBinFun_df, genConfigList, smbinning2, woeEncodeFun_df


xxzcool/scoremodel documentation built on May 4, 2019, 10:56 a.m.