var_importance: Variable importance for uplift trees and uplift random...

View source: R/utree.R

var_importance.uforest    R Documentation

Variable importance for uplift trees and uplift random forest.

Description

This is the extractor function for variable importance measures as produced by utree and uforest.

Usage

## S3 method for class 'uforest'
var_importance(x, type = "I", valid.data = NULL,
  error.fun = "sel")

## S3 method for class 'utree'
var_importance(x, type = "I", valid.data = NULL,
  error.fun = "sel")

Arguments

x

An object of class "utree" or "uforest".

type

Either "I" or "II", specifying the type of importance measure. See details.

valid.data

A validation data frame. Required when type = "II", where importance is measured on this data; not used when type = "I".

error.fun

The prediction error used to compute variable importance when type = "II". Possible values are "sel" for squared-error loss (default), or "abs" for absolute loss. See details.

Details

For type I, the importance of a predictor is the sum of the split-criterion values over all internal nodes at which it was chosen as the splitting variable. For uplift random forests, this relative influence measure extends naturally by averaging the importance of each variable over the collection of trees.

For type II, variable importance is measured on an independent validation sample, with the aim of quantifying the predictive strength of each variable. Prediction accuracy is first measured on the validation sample. The values of the jth variable are then randomly permuted and the accuracy is computed again. The decrease in accuracy resulting from this permutation is the importance attributed to the jth variable. Accuracy is measured by the squared error or absolute error between the predicted and true uplift on each terminal node of the tree.
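The type II procedure can be illustrated with a minimal base R sketch. It is not the package's implementation: a linear model stands in for the fitted uplift model, the error is computed per observation against a simulated outcome rather than per terminal node against the true uplift, and the names perm_importance, loss, train and valid are hypothetical. Importance is reported as the increase in validation error after permuting one column, which corresponds to the decrease in accuracy described above.

```r
# Illustrative only: lm() is a stand-in for a fitted uplift model.
set.seed(42)
n <- 500
train <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
train$y <- 3 * train$x1 + 1 * train$x2 + rnorm(n, sd = 0.5)  # x3 is pure noise
valid <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
valid$y <- 3 * valid$x1 + 1 * valid$x2 + rnorm(n, sd = 0.5)

model <- lm(y ~ x1 + x2 + x3, data = train)

# Loss functions mirroring the two documented error.fun choices.
loss <- list(
  sel = function(pred, obs) mean((pred - obs)^2),  # squared-error loss
  abs = function(pred, obs) mean(abs(pred - obs))  # absolute loss
)

# Hypothetical helper: permutation importance on a validation data frame.
perm_importance <- function(model, valid, response, error.fun = "sel") {
  err <- loss[[error.fun]]
  predictors <- setdiff(names(valid), response)
  # Step 1: baseline error on the untouched validation sample.
  baseline <- err(predict(model, newdata = valid), valid[[response]])
  # Steps 2 and 3: permute one predictor at a time and recompute the error.
  imp <- vapply(predictors, function(v) {
    shuffled <- valid
    shuffled[[v]] <- sample(shuffled[[v]])
    err(predict(model, newdata = shuffled), valid[[response]]) - baseline
  }, numeric(1))
  out <- data.frame(variable = predictors, importance = unname(imp))
  out[order(-out$importance), ]
}

imp_sel <- perm_importance(model, valid, response = "y", error.fun = "sel")
imp_abs <- perm_importance(model, valid, response = "y", error.fun = "abs")
print(imp_sel)
```

In this simulation x1 drives most of the outcome, so permuting it costs the most accuracy, while the noise variable x3 receives an importance near zero. A value near zero or slightly negative indicates that the model does not rely on that variable.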

Value

A data frame with the variable importance.

Author(s)

Leo Guelman <leo.guelman@gmail.com>

Examples

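set.seed(1)
# Simulate an uplift data set: 1000 observations, 50 predictors, binary response
df <- sim_uplift(n = 1000, p = 50, response = "binary")
# Use every column except the first three as predictors
form <- create_uplift_formula(x = names(df)[-c(1:3)], y = "y", trt = "T")
# Fit an uplift tree with maxdepth = 3
fit <- utree(form, data = df, maxdepth = 3)
# Type I importance (the default)
var_importance(fit)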

leoguelman/uplift2 documentation built on April 15, 2022, 4:34 a.m.