generateFeatureImportanceData: Generate feature importance.


View source: R/generateFeatureImportance.R

Description

Estimate how important individual features or groups of features are by contrasting prediction performances. For method “permutation.importance”, compute the change in performance caused by permuting the values of a feature (or a group of features), relative to the performance of predictions made on the unpermuted data.
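The idea can be sketched in base R, independently of this package. The helper `permutation_importance`, the linear model, and the use of mean squared error on `mtcars` are all assumptions made for illustration; this is not the function's actual implementation.

```r
# Illustrative sketch only: base R, no functions from this package.
set.seed(1)

# Hypothetical helper: importance of one feature for a fitted lm, measured
# as the average increase in mean squared error after shuffling that feature.
permutation_importance = function(fit, data, target, feature, nmc = 50L) {
  mse = function(d) mean((d[[target]] - predict(fit, newdata = d))^2)
  baseline = mse(data)  # performance on the unpermuted data
  diffs = vapply(seq_len(nmc), function(i) {
    permuted = data
    permuted[[feature]] = sample(permuted[[feature]])  # shuffle one column
    mse(permuted) - baseline  # contrast: permuted minus unpermuted
  }, numeric(1))
  mean(diffs)  # aggregate over the Monte-Carlo iterations
}

fit = lm(mpg ~ wt + qsec, data = mtcars)
imp_wt = permutation_importance(fit, mtcars, "mpg", "wt")
imp_qsec = permutation_importance(fit, mtcars, "mpg", "qsec")
```

A larger value means that shuffling the feature hurts the predictions more, so the model relies on that feature more heavily.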

Usage

generateFeatureImportanceData(task, method = "permutation.importance",
  learner, features = getTaskFeatureNames(task), interaction = FALSE,
  measure, contrast = function(x, y) x - y, aggregation = mean, nmc = 50L,
  replace = TRUE, local = FALSE)

Arguments

task

[Task]
The task.

method

[character(1)]
The method used to compute the feature importance. The only method available is “permutation.importance”. Default is “permutation.importance”.

learner

[Learner | character(1)]
The learner. If you pass a string the learner will be created via makeLearner.

features

[character]
The features to compute the importance of. The default is all of the features contained in the Task.

interaction

[logical(1)]
Whether to compute the importance of the features argument jointly. For method = "permutation.importance" this entails permuting the values of all the given features together and then contrasting the resulting performance with the performance obtained without permuting the features. The default is FALSE.

measure

[Measure]
Performance measure. Default is the first measure used in the benchmark experiment.

contrast

[function]
A difference function that takes two numeric vectors and returns a numeric vector of the same length. The default is the element-wise difference between the vectors.

aggregation

[function]
A function which aggregates the differences. This function must take a numeric vector and return a numeric vector of length 1. The default is mean.

nmc

[integer(1)]
The number of Monte-Carlo iterations to use in computing the feature importance. If nmc == -1 and method = "permutation.importance" then all permutations of the features are used. The default is 50.

replace

[logical(1)]
Whether to sample the feature values with replacement. The default is TRUE.

local

[logical(1)]
Whether to compute the per-observation importance. The default is FALSE.
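How contrast, aggregation, replace and interaction fit together can be illustrated with a few lines of base R. The numbers below are toy values, the order in which the two performances are passed to contrast is an assumption, and the joint shuffle is only one plausible reading of interaction = TRUE rather than the package's confirmed implementation.

```r
# Toy numbers chosen for illustration; they are not output of the function.
perf_unpermuted = 0.06                     # baseline error on the original data
perf_permuted = c(0.30, 0.34, 0.28, 0.32)  # one error per Monte-Carlo iteration

contrast = function(x, y) x - y            # the documented default
aggregation = mean                         # the documented default
# Assumption: the permuted performance is passed as the first argument.
importance = aggregation(contrast(perf_permuted, perf_unpermuted))

# Sampling without replacement is a true permutation: the same values in a
# new order. With replacement, values can repeat or drop out.
set.seed(42)
x = c(1.4, 1.3, 4.7, 4.5, 6.0)
x_perm = sample(x, replace = FALSE)
x_boot = sample(x, replace = TRUE)

# Joint permutation of two features: shuffle the selected columns with a
# single index vector, so the features stay aligned with each other but
# not with the target.
features = c("Petal.Length", "Petal.Width")
idx = sample(nrow(iris))
shuffled = iris
shuffled[, features] = iris[idx, features]
```

Swapping in, say, `function(x, y) x / y` for contrast or `median` for aggregation changes only how the per-iteration performances are compared and summarised, not how the features are shuffled.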

Value

[FeatureImportance]. A named list which contains the computed feature importance and the input arguments.

Object members:

res

[data.frame]
Has one column for each feature or combination of features (colon-separated) for which the importance is computed. A row corresponds to the importance of the feature specified in the column for the target.

interaction

[logical(1)]
Whether or not the importance of the features was computed jointly rather than individually.

measure

[Measure]
The measure used to compute performance.

contrast

[function]
The function used to compare the performance of predictions.

aggregation

[function]
The function which is used to aggregate the contrast between the performance of predictions across Monte-Carlo iterations.

replace

[logical(1)]
Whether or not, when method = "permutation.importance", the feature values are sampled with replacement.

nmc

[integer(1)]
The number of Monte-Carlo iterations used to compute the feature importance. When nmc == -1 and method = "permutation.importance" all permutations are used.

local

[logical(1)]
Whether observation-specific importance is computed for the features.

References

Jerome Friedman; Greedy Function Approximation: A Gradient Boosting Machine, Annals of Statistics, Vol. 29, No. 5 (Oct., 2001), pp. 1189-1232.

See Also

Other generate_plot_data: generateCalibrationData, generateCritDifferencesData, generateFilterValuesData, generateFunctionalANOVAData, generateLearningCurveData, generatePartialDependenceData, generateThreshVsPerfData, getFilterValues, plotFilterValues

Examples

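# Create a decision-tree learner that predicts class probabilities.
lrn = makeLearner("classif.rpart", predict.type = "prob")
# Note: the fitted model is not passed to the call below, which takes the learner itself.
fit = train(lrn, iris.task)
# Per-observation permutation importance of Petal.Width, using 10 Monte-Carlo iterations.
imp = generateFeatureImportanceData(iris.task, "permutation.importance",
  lrn, "Petal.Width", nmc = 10L, local = TRUE)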

guillermozbta/s2 documentation built on May 17, 2019, 4:01 p.m.