generateFeatureImportanceData: Generate feature importance.
In guillermozbta/s2: Machine Learning in R


View source: R/generateFeatureImportance.R

Estimate how important individual features or groups of features are by contrasting prediction performances. For method “permutation.importance”, compute the change in performance caused by permuting the values of a feature (or a group of features) and compare it with the performance of the predictions made on the unpermuted data.
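The idea can be sketched in base R, independently of this package. The sketch below is illustrative only and is not the package's implementation: the helper `perm_importance`, the `mse` measure, the `lm` model and the `mtcars` data are assumptions chosen for the example, and passing the permuted performance as the first argument of `contrast` is likewise an assumption.

```r
# Illustrative sketch of permutation importance in base R.
# Not the package's internals; all names below are hypothetical.
set.seed(1)
fit <- lm(mpg ~ wt + hp + qsec, data = mtcars)

# Performance measure: mean squared error (lower is better).
mse <- function(model, data) {
  mean((data$mpg - predict(model, newdata = data))^2)
}

# Permute one feature nmc times, contrast each permuted performance
# with the unpermuted baseline, then aggregate the contrasts.
perm_importance <- function(model, data, feature, nmc = 50L,
                            contrast = function(x, y) x - y,
                            aggregation = mean) {
  baseline <- mse(model, data)
  permuted <- vapply(seq_len(nmc), function(i) {
    shuffled <- data
    # Shuffling breaks the link between the feature and the target.
    shuffled[[feature]] <- sample(shuffled[[feature]])
    mse(model, shuffled)
  }, numeric(1))
  aggregation(contrast(permuted, baseline))
}

imp_wt <- perm_importance(fit, mtcars, "wt")
imp_qsec <- perm_importance(fit, mtcars, "qsec")
```

With an error measure such as MSE, a larger positive value means the model relies more heavily on that feature; for a measure where higher is better, the sign flips.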

generateFeatureImportanceData(task, method = "permutation.importance",
  learner, features = getTaskFeatureNames(task), interaction = FALSE,
  measure, contrast = function(x, y) x - y, aggregation = mean, nmc = 50L,
  replace = TRUE, local = FALSE)

`task`	[`Task`] The task.
`method`	[`character(1)`] The method used to compute the feature importance. The only method available is “permutation.importance”. Default is “permutation.importance”.
`learner`	[`Learner` \| `character(1)`] The learner. If you pass a string the learner will be created via `makeLearner`.
`features`	[`character`] The features to compute the importance of. The default is all of the features contained in the `Task`.
`interaction`	[`logical(1)`] Whether to compute the importance of the `features` argument jointly. For `method = "permutation.importance"` this entails permuting the values of all `features` together and then contrasting the resulting performance with the performance obtained without permuting the features. The default is `FALSE`.
`measure`	[`Measure`] Performance measure. Default is the first measure used in the benchmark experiment.
`contrast`	[`function`] A difference function that takes two numeric vectors and returns a numeric vector of the same length. The default is the element-wise difference between the vectors.
`aggregation`	[`function`] A function which aggregates the differences. This function must take a numeric vector and return a numeric vector of length 1. The default is `mean`.
`nmc`	[`integer(1)`] The number of Monte-Carlo iterations to use in computing the feature importance. If `nmc == -1` and `method = "permutation.importance"` then all permutations of the `features` are used. The default is 50.
`replace`	[`logical(1)`] Whether to sample the feature values with replacement. The default is `TRUE`.
`local`	[`logical(1)`] Whether to compute the per-observation importance. The default is `FALSE`.
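To make the `replace` and `interaction` options concrete, here is a small base-R sketch of the sampling they describe. It assumes that joint permutation means applying one row shuffle to all listed features; the page does not spell this out, and the variable names are hypothetical.

```r
# Illustrative only: how the sampling options could look in base R.
set.seed(42)
cols <- c("Petal.Length", "Petal.Width")
df <- head(iris[, cols], 6)

# replace = FALSE: a true permutation, every row index appears exactly once.
idx_perm <- sample(nrow(df), replace = FALSE)
# replace = TRUE: indices are drawn independently, so some may repeat.
idx_boot <- sample(nrow(df), replace = TRUE)

# Joint shuffling (assumed behaviour for interaction = TRUE): a single index
# vector is used for both columns, so each (Petal.Length, Petal.Width) pair
# stays intact but moves to a different row.
joint <- df
joint[, cols] <- df[idx_perm, cols]

# Separate shuffling: each column gets its own permutation, which also
# breaks the pairing between the two features.
separate <- df
separate$Petal.Length <- sample(df$Petal.Length)
separate$Petal.Width <- sample(df$Petal.Width)
```

Sampling with replacement changes the empirical distribution of the feature slightly on each draw, whereas a true permutation preserves it exactly.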

[FeatureImportance]. A named list which contains the computed feature importance and the input arguments.

Object members:

`res`	[`data.frame`] Has columns for each feature or combination of features (colon separated) for which the importance is computed. A row corresponds to the importance of the feature specified in the column for the target.
`interaction`	[`logical(1)`] Whether or not the importance of the `features` was computed jointly rather than individually.
`measure`	[`Measure`] The measure used to compute performance.

`contrast`	[`function`] The function used to compare the performance of predictions.
`aggregation`	[`function`] The function which is used to aggregate the contrast between the performance of predictions across Monte-Carlo iterations.
`replace`	[`logical(1)`] Whether or not, when `method = "permutation.importance"`, the feature values are sampled with replacement.
`nmc`	[`integer(1)`] The number of Monte-Carlo iterations used to compute the feature importance. When `nmc == -1` and `method = "permutation.importance"` all permutations are used.
`local`	[`logical(1)`] Whether observation-specific importance is computed for the `features`.

Jerome Friedman; Greedy Function Approximation: A Gradient Boosting Machine, Annals of Statistics, Vol. 29, No. 5 (Oct., 2001), pp. 1189-1232.

Other generate_plot_data: generateCalibrationData, generateCritDifferencesData, generateFilterValuesData, generateFunctionalANOVAData, generateLearningCurveData, generatePartialDependenceData, generateThreshVsPerfData, getFilterValues, plotFilterValues

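# Decision-tree classifier that predicts class probabilities
lrn = makeLearner("classif.rpart", predict.type = "prob")
fit = train(lrn, iris.task)
# Per-observation permutation importance of Petal.Width, using 10 Monte-Carlo iterations
imp = generateFeatureImportanceData(iris.task, "permutation.importance",
  lrn, "Petal.Width", nmc = 10L, local = TRUE)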