View source: R/feature_importance_permutation.R

Description
Returns which columns are the most important in the fitted model. This is done by permuting the inputs and measuring the deterioration of the metric. The permutation importance is defined as the difference between the baseline metric and the metric obtained after permuting the feature column. Note that this implementation is not suitable for one-hot encoded categorical variables.
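To make the definition concrete, here is a minimal sketch of a single permutation round. This is an illustrative assumption, not the package's internal code; the function name and loop structure are invented for this example.

# Illustrative sketch of permutation importance, not the package's code.
# Assumes predict(model, data) works and metric(actual, predicted, weight)
# returns an error-type score (lower is better).
permutation_importance_sketch <- function(data, model, actual,
                                          weight = rep(1, nrow(data)),
                                          metric) {
  baseline <- metric(actual, predict(model, data), weight)
  sapply(names(data), function(feature) {
    permuted <- data
    permuted[[feature]] <- sample(permuted[[feature]])  # shuffle one column
    # Deterioration of the metric after permuting this feature
    # (sign convention may differ from the package)
    metric(actual, predict(model, permuted), weight) - baseline
  })
}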
Usage

feature_importance_permutation(
  data,
  model,
  actual,
  weight = rep(1, nrow(data)),
  metric = metric_rmse,
  nrounds = 10,
  seed = 666,
  ...
)
Arguments

data
    dataframe - data from which the model can give predictions. For xgboost it must contain only the features used in the model.

model
    model object - tested examples are lm, glm and xgboost.

actual
    vector[Numeric] - target to be predicted. Must be normalised by exposure.

weight
    vector[Numeric] - exposure for the predictions.

metric
    function - one of the admr::metric_* functions; must have arguments actual, predicted and weight (a custom-metric sketch follows the Examples).

nrounds
    integer - number of times to permute each feature.

seed
    integer - random seed for the permutations.

...
    OPTIONAL - arguments included but not defined above will be carried through to the metric.
Value

dataframe with columns:

col_index - position of the feature in data
feature - name of the feature in data
importance_mean - mean importance of the feature across the nrounds permutations
importance_sd - standard deviation of the importance; NA if nrounds = 1

This dataframe can be used to find the most important features in the model.
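For instance, the output can be ranked by mean importance to read off the top features (result is a hypothetical name for the returned dataframe):

library(dplyr)
# Most important features first, using the documented column names
result %>% arrange(desc(importance_mean))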
Examples

library(dplyr)
library(xgboost)

input_data <- data.frame(x1 = runif(100, 0, 25),
                         x2 = runif(100, 0, 25),
                         x3 = runif(100, 0, 25)) %>%
  mutate(target = x1^2 * 0.01 + x2 + rnorm(n(), sd = 5))

# LM
model_lm <- lm(target ~ poly(x1, 2) + x2, data = input_data)
feature_importance_permutation(data = input_data %>% select(-target),
                               model = model_lm,
                               actual = input_data[["target"]])

# GLM
model_glm <- glm(target ~ poly(x1, 2) + x2 + x3, data = input_data)
feature_importance_permutation(data = input_data %>% select(-target),
                               model = model_glm,
                               actual = input_data[["target"]])

# GBM
model_gbm <- xgboost(data = as.matrix(input_data %>% select(-target)),
                     label = input_data[["target"]],
                     nrounds = 20, verbose = 0)
feature_importance_permutation(model = model_gbm,
                               data = input_data %>% select(-target),
                               actual = input_data[["target"]])
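The metric argument accepts any function with the documented actual/predicted/weight signature, so a custom metric can be substituted for the built-in admr::metric_* functions. The weighted MAE below is a hypothetical example, not a function exported by the package:

# Hypothetical custom metric: weighted mean absolute error.
# Extra arguments passed via ... to feature_importance_permutation()
# would arrive here as well.
metric_wmae <- function(actual, predicted, weight, ...) {
  sum(weight * abs(actual - predicted)) / sum(weight)
}

feature_importance_permutation(data = input_data %>% select(-target),
                               model = model_glm,
                               actual = input_data[["target"]],
                               metric = metric_wmae)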