View source: R/find.interaction.rfsrc.R
Find pairwise interactions between variables.
## S3 method for class 'rfsrc'
find.interaction(object, xvar.names, cause, m.target,
importance = c("permute", "random", "anti",
"permute.ensemble", "random.ensemble", "anti.ensemble"),
method = c("maxsubtree", "vimp"), sorted = TRUE, nvar, nrep = 1, subset,
na.action = c("na.omit", "na.impute"),
seed = NULL, do.trace = FALSE, verbose = TRUE, ...)

object 
An object of class 
xvar.names 
Character vector of names of target x-variables. Default is to use all variables. 
cause 
For competing risk families, integer value between 1
and 
m.target 
Character value for multivariate families specifying the target outcome to be used. If left unspecified, the algorithm will choose a default target. 
importance 
Type of variable importance (VIMP). See

method 
Method of analysis: maximal subtree or VIMP. See details below. 
sorted 
Should variables be sorted by VIMP? Does not apply for competing risks. 
nvar 
Number of variables to be used. 
nrep 
Number of Monte Carlo replicates when method="vimp". 
subset 
Vector indicating which rows of the x-variable matrix
from the 
na.action 
Action to be taken if the data contains 
seed 
Seed for random number generator. Must be a negative integer. 
do.trace 
Number of seconds between updates to the user on approximate time to completion. 
verbose 
Set to 
... 
Further arguments passed to or from other methods. 
Using a previously grown forest, identify pairwise interactions for all pairs of variables from a specified list. There are two distinct approaches specified by the option method.
method="maxsubtree"
This invokes a maximal subtree analysis. In this case, a matrix is returned where entries [i][i] are the normalized minimal depth of variable [i] relative to the root node (normalized with respect to the size of the tree) and entries [i][j] indicate the normalized minimal depth of variable [j] with respect to the maximal subtree for variable [i] (normalized with respect to the size of [i]'s maximal subtree). Smaller [i][i] entries indicate predictive variables. Small [i][j] entries in rows with small [i][i] entries are a sign of an interaction between variables i and j (note: the user should scan rows, not columns, for small entries). See Ishwaran et al. (2010, 2011) for more details.
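As a rough illustration of how such a matrix can be read, the sketch below scans the rows of a small hand-made matrix. The values, the variable names x1 to x3, and the cutoff of 0.25 are all assumptions invented for this example; they are not output or defaults of the package.

```r
# Hypothetical 3 x 3 maximal subtree matrix (values invented for illustration).
# Diagonal [i, i]: normalized minimal depth of variable i relative to the root.
# Off-diagonal [i, j]: normalized minimal depth of j within i's maximal subtree.
ms <- matrix(c(0.10, 0.15, 0.80,
               0.20, 0.12, 0.85,
               0.90, 0.95, 0.70),
             nrow = 3, byrow = TRUE,
             dimnames = list(c("x1", "x2", "x3"), c("x1", "x2", "x3")))

# Hypothetical cutoff for calling an entry "small"; not a package default.
cutoff <- 0.25

# Variables with a small diagonal entry are candidates for being predictive.
predictive <- names(which(diag(ms) < cutoff))

# Scan rows (not columns): for each predictive variable i, list the
# variables j whose off-diagonal entry [i, j] is also small.
candidates <- lapply(predictive, function(i) {
  depths <- ms[i, colnames(ms) != i]
  names(depths)[depths < cutoff]
})
names(candidates) <- predictive
print(candidates)
```

Here the rows for x1 and x2 each point at the other variable, which is the pattern the text describes as a sign of an interaction.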
method="vimp"
This invokes a joint-VIMP approach. Two variables are paired and their paired VIMP calculated (referred to as 'Paired' importance). The VIMP for each separate variable is also calculated. The sum of these two values is referred to as 'Additive' importance. A large positive or negative difference between 'Paired' and 'Additive' indicates an association worth pursuing if the univariate VIMP for each of the paired variables is reasonably large. See Ishwaran (2007) for more details.
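The arithmetic behind the 'Paired' versus 'Additive' comparison can be sketched in base R. All VIMP values below are made up for illustration, and the table layout is an assumption of this sketch rather than the exact object returned by find.interaction.

```r
# Hypothetical VIMP values, invented for illustration only.
vimp_single <- c(x1 = 0.040, x2 = 0.030, x3 = 0.002)    # univariate VIMP
vimp_paired <- c("x1:x2" = 0.110, "x1:x3" = 0.043, "x2:x3" = 0.031)

# All unordered pairs of variables; each column of `pairs` is one pair.
pairs <- combn(names(vimp_single), 2)
pair_names <- paste(pairs[1, ], pairs[2, ], sep = ":")

# 'Additive' importance: sum of the two univariate VIMP values.
additive <- unname(vimp_single[pairs[1, ]] + vimp_single[pairs[2, ]])

# 'Paired' importance, aligned to the same pair order.
paired <- unname(vimp_paired[pair_names])

tab <- data.frame(Paired = paired,
                  Additive = additive,
                  Difference = paired - additive,
                  row.names = pair_names)
print(tab)

# The pair with the largest absolute difference is the first one to inspect,
# provided both of its univariate VIMP values are reasonably large.
top_pair <- rownames(tab)[which.max(abs(tab$Difference))]
print(top_pair)
```

In this made-up table only x1:x2 has a difference that stands out, and both x1 and x2 have non-negligible univariate VIMP, so it is the one pair worth a closer look.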
Computations might be slow depending upon the size of the data and the forest. In such cases, consider setting nvar to a smaller number. If method="maxsubtree", consider using a smaller number of trees in the original grow call.
If nrep is greater than 1, the analysis is repeated nrep times and results averaged over the replications (applies only when method="vimp").
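A minimal sketch of these options in use, assuming the randomForestSRC package is installed: the forest is grown with fewer trees, the search is limited with nvar, and the joint-VIMP analysis is averaged over several replicates. The specific values (100 trees, 3 variables, 3 replicates, seed -1) are arbitrary choices for illustration, not recommendations.

```r
if (requireNamespace("randomForestSRC", quietly = TRUE)) {
  library(randomForestSRC)

  # Fewer trees in the grow call keeps a later maxsubtree analysis cheaper.
  airq.small <- rfsrc(Ozone ~ ., data = airquality, ntree = 100,
                      importance = TRUE)

  # Restrict the analysis to 3 variables and average the joint-VIMP
  # results over 3 Monte Carlo replicates. The seed must be negative.
  int.vimp <- find.interaction(airq.small, method = "vimp",
                               nvar = 3, nrep = 3, seed = -1,
                               verbose = FALSE)

  # The result is returned invisibly, so assign it in order to inspect it.
  print(int.vimp)
}
```

Setting verbose = FALSE suppresses the printed table during the call, so the explicit print at the end is the only output.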
Invisibly, the interaction table (a list for competing risk data) or the maximal subtree matrix.
Hemant Ishwaran and Udaya B. Kogalur
Ishwaran H. (2007). Variable importance in binary regression trees and forests, Electronic J. Statist., 1:519-537.
Ishwaran H., Kogalur U.B., Gorodeski E.Z, Minn A.J. and Lauer M.S. (2010). High-dimensional variable selection for survival data. J. Amer. Statist. Assoc., 105:205-217.
Ishwaran H., Kogalur U.B., Chen X. and Minn A.J. (2011). Random survival forests for high-dimensional data. Statist. Anal. Data Mining, 4:115-132.
## 
## find interactions, survival setting
## 
data(pbc, package = "randomForestSRC")
pbc.obj <- rfsrc(Surv(days,status) ~ ., pbc, importance = TRUE)
find.interaction(pbc.obj, method = "vimp", nvar = 8)
## 
## find interactions, competing risks
## 
data(wihs, package = "randomForestSRC")
wihs.obj <- rfsrc(Surv(time, status) ~ ., wihs, nsplit = 3, ntree = 100,
importance = TRUE)
find.interaction(wihs.obj)
find.interaction(wihs.obj, method = "vimp")
## 
## find interactions, regression setting
## 
airq.obj <- rfsrc(Ozone ~ ., data = airquality, importance = TRUE)
find.interaction(airq.obj, method = "vimp", nrep = 3)
find.interaction(airq.obj)
## 
## find interactions, classification setting
## 
iris.obj <- rfsrc(Species ~., data = iris, importance = TRUE)
find.interaction(iris.obj, method = "vimp", nrep = 3)
find.interaction(iris.obj)
## 
## interactions for multivariate mixed forests
## 
mtcars2 <- mtcars
mtcars2$cyl <- factor(mtcars2$cyl)
mtcars2$carb <- factor(mtcars2$carb, ordered = TRUE)
mv.obj <- rfsrc(cbind(carb, mpg, cyl) ~., data = mtcars2, importance = TRUE)
find.interaction(mv.obj, method = "vimp", m.target = "carb")
find.interaction(mv.obj, method = "vimp", m.target = "mpg")
find.interaction(mv.obj, method = "vimp", m.target = "cyl")
