lgbm.fi: LightGBM Feature Importance


Description

This function computes the feature importance of a LightGBM model. The model file must be located in "workingdir", where "workingdir" is the folder and input_model is the model file name.

Usage

lgbm.fi(model, workingdir = ifelse(is.list(model), model[["Path"]], getwd()),
  input_model = ifelse(is.list(model), model[["Name"]], "lgbm_model.txt"),
  feature_names = NA, ntreelimit = 0, data.table = TRUE)

Arguments

model

Type: list. The model file. If a character vector is provided, it is considered to be the model which is going to be saved as input_model. If a list is provided, it is used to set up and fetch the correct variables, which you can override by setting the arguments manually. If a single value is provided (like NA), it is ignored and the other arguments are used to fetch the model locally.

workingdir

Type: character. The working directory of the model file. Defaults to ifelse(is.list(model), model[["Path"]], getwd()), which means "take the model working directory if a model list is provided, else take the current working directory".

input_model

Type: character. The file name of the model. Defaults to ifelse(is.list(model), model[["Name"]], "lgbm_model.txt"), which means "take the input model name if a model list is provided, else take 'lgbm_model.txt'".
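
The two defaults above can be traced with a minimal base-R sketch. It only mirrors the default expressions shown in Usage; resolve_defaults is a hypothetical helper that is not part of the package, and the list contents ("/tmp/models", "my_model.txt") are made-up placeholders.

```r
# Hypothetical helper mirroring the documented default expressions.
resolve_defaults <- function(model) {
  list(
    workingdir  = ifelse(is.list(model), model[["Path"]], getwd()),
    input_model = ifelse(is.list(model), model[["Name"]], "lgbm_model.txt")
  )
}

# Placeholder model list; only the "Path" and "Name" entries matter here.
from_list <- resolve_defaults(list(Path = "/tmp/models", Name = "my_model.txt"))

# A single non-list value such as NA falls back to getwd() and "lgbm_model.txt".
from_na <- resolve_defaults(NA)

print(from_list)
print(from_na)
```

Passing workingdir or input_model explicitly bypasses these expressions, which is how the list values can be overridden.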

feature_names

Type: vector of characters. The names of the features, in the order they were fed to LightGBM. Returns column numbers if left as NA. Defaults to NA.
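
One way to build feature_names in training order is sketched below. The data frame and its "label" column are hypothetical, and the sketch assumes the label column was dropped before training so that only the remaining columns were fed to LightGBM.

```r
# Hypothetical training data: three feature columns plus a "label" column.
data <- data.frame(age = c(21, 35), income = c(1200, 3400),
                   score = c(0.3, 0.9), label = c(0, 1))

# Keep the feature columns in their original order, dropping the label.
feature_names <- setdiff(colnames(data), "label")
print(feature_names)
```

If the order differs from the one used at training time, the importances would be attributed to the wrong names, so it is worth deriving the vector from the same object that was used for training.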

ntreelimit

Type: integer. The number of trees to select, starting from the first tree. Defaults to 0, which applies no tree limit.

data.table

Type: boolean. Whether to return a data.table (TRUE) or a data.frame (FALSE). Defaults to TRUE.

Value

A data.table (or data.frame) with 10 columns: c("Feature", "Gain", "Gain_Rel_Ratio", "Gain_Abs_Ratio", "Gain_Std", "Gain_Std_Rel_Ratio", "Gain_Std_Abs_Ratio", "Freq", "Freq_Rel_Ratio", "Freq_Abs_Ratio")
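
A quick structural check on a returned table could look like the sketch below. has_fi_columns is a hypothetical helper, not a package function, and mock_fi is a zero-row stand-in used only to exercise the check; it does not represent real output.

```r
# Column names as listed in the Value section.
expected_cols <- c("Feature", "Gain", "Gain_Rel_Ratio", "Gain_Abs_Ratio",
                   "Gain_Std", "Gain_Std_Rel_Ratio", "Gain_Std_Abs_Ratio",
                   "Freq", "Freq_Rel_Ratio", "Freq_Abs_Ratio")

# Hypothetical helper: TRUE when `fi` has exactly the documented columns, in order.
has_fi_columns <- function(fi) identical(colnames(fi), expected_cols)

# Zero-row stand-in for a real result, used only to exercise the check.
mock_fi <- as.data.frame(matrix(nrow = 0, ncol = length(expected_cols),
                                dimnames = list(NULL, expected_cols)))

print(length(expected_cols))
print(has_fi_columns(mock_fi))
```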

Examples

## Not run: 
# Feature importance on a single model without any tree limit.
lgbm.fi(model = trained, feature_names = colnames(data), ntreelimit = 0)

# Feature importance on the first model from a cross-validation without any tree limit.
lgbm.fi(model = trained.cv[["Models"]][[1]], feature_names = colnames(data))

## End(Not run)

Laurae2/Laurae documentation built on May 8, 2019, 7:59 p.m.