plot.variable.rfsrc | R Documentation |
Plot the marginal effect of an x-variable on the class probability (classification), response (regression), mortality (survival), or the expected years lost (competing risks). Users can choose between marginal plots (unadjusted, but fast) and partial plots (adjusted, but slower).
## S3 method for class 'rfsrc'
plot.variable(x, xvar.names, target,
  m.target = NULL, time,
  surv.type = c("mort", "rel.freq", "surv", "years.lost", "cif", "chf"),
  class.type = c("prob", "bayes"),
  partial = FALSE, oob = TRUE,
  show.plots = TRUE, plots.per.page = 4, granule = 5, sorted = TRUE,
  nvar, npts = 25, smooth.lines = FALSE, subset, ...)
x |
An object of class |
xvar.names |
Character vector of x-variable names to include. If not specified, all variables are used. |
target |
For classification, an integer or character specifying the class of interest (default is the first class). For competing risks, an integer between 1 and |
m.target |
Character value for multivariate families specifying the target outcome. If unspecified, a default is automatically chosen. |
time |
(Survival only) Time point at which the predicted survival value is evaluated, depending on |
surv.type |
(Survival only) Type of predicted survival value to compute. See |
class.type |
(Classification only) Type of predicted classification value to use. See |
partial |
Logical. If |
oob |
Logical. If |
show.plots |
Logical. If |
plots.per.page |
Integer controlling the number of plots displayed per page. |
granule |
Integer controlling the coercion of continuous variables to factors (used to generate boxplots). Larger values increase coercion. |
sorted |
Logical. If |
nvar |
Number of variables to plot. Defaults to all available variables. |
npts |
Maximum number of points used when generating partial plots for continuous variables. |
smooth.lines |
Logical. If |
subset |
Vector indicating which rows of |
... |
Additional arguments passed to or from other methods. |
The vertical axis displays the ensemble-predicted value, while x-variables are plotted along the horizontal axis.
For regression, the predicted response is plotted.
For classification, the plotted value is the predicted class probability for the class specified by target, or the most probable class (Bayes rule), depending on whether class.type is set to "prob" or "bayes".
For multivariate families, the prediction corresponds to the outcome specified by m.target. If this is a classification outcome, target may also be used to indicate the class of interest.
For survival, the vertical axis shows the predicted value determined by surv.type, with the following options:
mort: Mortality (Ishwaran et al., 2008), interpreted as the expected number of events for an individual with the same covariates.
rel.freq: Relative frequency of mortality.
surv: Predicted survival probability at a specified time point (default is the median follow-up time), controlled via time.
For competing risks, the vertical axis shows one of the following quantities, depending on surv.type:
years.lost: Expected number of life-years lost.
cif: Cumulative incidence function for the specified event.
chf: Cause-specific cumulative hazard function.
In all competing risks settings, the event of interest is specified using target, and cif and chf are evaluated at the time point given by time.
To generate partial dependence plots, set partial = TRUE. These differ from marginal plots in that they isolate the effect of a single variable X on the predicted value by averaging over all other covariates:
\tilde{f}(x) = \frac{1}{n} \sum_{i=1}^n \hat{f}(x, x_{i,o}),
where x_{i,o} denotes the observed values of all covariates other than X for individual i, and \hat{f} is the prediction function. Generating partial plots can be computationally expensive; use a smaller value for npts to reduce the number of grid points evaluated for x.
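The averaging in the partial dependence formula can be illustrated with a minimal base R sketch. It uses a linear model as a stand-in prediction function purely for illustration (an assumption; plot.variable averages forest ensemble predictions instead), and partial_dependence is a hypothetical helper that is not part of randomForestSRC.

```r
## Stand-in prediction function f-hat: a linear model, NOT a random forest.
fit <- lm(mpg ~ wt + hp, data = mtcars)

## Hypothetical helper: partial dependence of the prediction on `xvar`,
## evaluated at each value in `grid`.
partial_dependence <- function(fit, data, xvar, grid) {
  sapply(grid, function(x) {
    d <- data
    d[[xvar]] <- x                    # fix X at x for every individual i
    mean(predict(fit, newdata = d))   # average over the observed x_{i,o}
  })
}

## A small grid of x values (playing the role that npts controls).
wt.grid <- seq(min(mtcars$wt), max(mtcars$wt), length.out = 10)
pd <- partial_dependence(fit, mtcars, "wt", wt.grid)

plot(wt.grid, pd, type = "b", xlab = "wt", ylab = "partial effect")
```

Because the stand-in model is linear and additive, pd is a straight line in wt whose slope equals the fitted wt coefficient. With a forest as the prediction function the curve is generally nonlinear, and each grid point requires predicting on all n rows, which is why a smaller grid is cheaper.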
Plot display conventions:
For continuous variables: red points indicate partial values; dashed red lines represent an error band of two standard errors. Black dashed lines show the raw partial values. Use smooth.lines = TRUE to overlay a lowess smoothed line.
For discrete (factor) variables: boxplots are used, with whiskers extending approximately two standard errors from the mean.
Standard errors are provided only as rough indicators and should be interpreted cautiously.
Partial plots can be slow to compute. Setting npts to a small value can improve performance.
For additional flexibility and speed, consider using partial.rfsrc, which directly computes partial plot data and allows for greater customization.
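As a rough sketch of that route, the following computes partial values for one variable of a regression forest and plots them by hand. It assumes the partial.xvar and partial.values arguments of partial.rfsrc and the companion extractor get.partial.plot.data behave as described in the partial.rfsrc help page; the grid wind.grid is an arbitrary choice made here for illustration.

```r
library(randomForestSRC)

## Regression forest on the airquality data (rows with NA are omitted by default).
airq.obj <- rfsrc(Ozone ~ ., data = airquality)

## Illustrative grid of Wind values at which to evaluate the partial effect.
wind.grid <- seq(min(airq.obj$xvar$Wind), max(airq.obj$xvar$Wind), length.out = 10)

## Compute partial predictions for Wind only.
partial.obj <- partial.rfsrc(airq.obj,
                             partial.xvar = "Wind",
                             partial.values = wind.grid)

## Extract the averaged partial values in a plot-ready form.
pdta <- get.partial.plot.data(partial.obj)

## Custom plot of the partial effect.
plot(pdta$x, pdta$yhat, type = "b", pch = 16,
     xlab = "Wind", ylab = "partial effect of Wind")
```

Computing a single variable on a user-chosen grid avoids the cost of evaluating every x-variable, and the returned values can be passed to any plotting code.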
Hemant Ishwaran and Udaya B. Kogalur
Friedman J.H. (2001). Greedy function approximation: a gradient boosting machine, Ann. of Statist., 29(5):1189-1232.
Ishwaran H., Kogalur U.B. (2007). Random survival forests for R, Rnews, 7(2):25-31.
Ishwaran H., Kogalur U.B., Blackstone E.H. and Lauer M.S. (2008). Random survival forests, Ann. Appl. Statist., 2:841-860.
Ishwaran H., Gerds T.A., Kogalur U.B., Moore R.D., Gange S.J. and Lau B.M. (2014). Random survival forests for competing risks, Biostatistics, 15(4):757-773.
rfsrc, synthetic.rfsrc, partial.rfsrc, predict.rfsrc
## ------------------------------------------------------------
## survival/competing risk
## ------------------------------------------------------------
## survival
library(randomForestSRC)
data(veteran, package = "randomForestSRC")
v.obj <- rfsrc(Surv(time, status) ~ ., veteran, ntree = 100)
plot.variable(v.obj, plots.per.page = 3)
plot.variable(v.obj, plots.per.page = 2, xvar.names = c("trt", "karno", "age"))
plot.variable(v.obj, surv.type = "surv", nvar = 1, time = 200)
plot.variable(v.obj, surv.type = "surv", partial = TRUE, smooth.lines = TRUE)
plot.variable(v.obj, surv.type = "rel.freq", partial = TRUE, nvar = 2)
## example of plot.variable calling a pre-processed plot.variable object
p.v <- plot.variable(v.obj, surv.type = "surv", partial = TRUE, smooth.lines = TRUE)
plot.variable(p.v)
p.v$plots.per.page <- 1
p.v$smooth.lines <- FALSE
plot.variable(p.v)
## example using a pre-processed plot.variable to define custom plots
p.v <- plot.variable(v.obj, surv.type = "surv", partial = TRUE, show.plots = FALSE)
plotthis <- p.v$plotthis
plot(plotthis[["age"]], xlab = "age", ylab = "partial effect", type = "b")
boxplot(yhat ~ x, plotthis[["trt"]], xlab = "treatment", ylab = "partial effect")
## competing risks
data(follic, package = "randomForestSRC")
follic.obj <- rfsrc(Surv(time, status) ~ ., follic, nsplit = 3, ntree = 100)
plot.variable(follic.obj, target = 2)
## ------------------------------------------------------------
## regression
## ------------------------------------------------------------
## airquality
airq.obj <- rfsrc(Ozone ~ ., data = airquality)
plot.variable(airq.obj, partial = TRUE, smooth.lines = TRUE)
plot.variable(airq.obj, partial = TRUE, subset = airq.obj$xvar$Solar.R < 200)
## motor trend cars
mtcars.obj <- rfsrc(mpg ~ ., data = mtcars)
plot.variable(mtcars.obj, partial = TRUE, smooth.lines = TRUE)
## ------------------------------------------------------------
## classification
## ------------------------------------------------------------
## iris
iris.obj <- rfsrc(Species ~., data = iris)
plot.variable(iris.obj, partial = TRUE)
## motor trend cars: predict number of carburetors
mtcars2 <- mtcars
mtcars2$carb <- factor(mtcars2$carb,
                       labels = paste("carb", sort(unique(mtcars$carb))))
mtcars2.obj <- rfsrc(carb ~ ., data = mtcars2)
plot.variable(mtcars2.obj, partial = TRUE)
## ------------------------------------------------------------
## multivariate regression
## ------------------------------------------------------------
mtcars.mreg <- rfsrc(Multivar(mpg, cyl) ~., data = mtcars)
plot.variable(mtcars.mreg, m.target = "mpg", partial = TRUE, nvar = 1)
plot.variable(mtcars.mreg, m.target = "cyl", partial = TRUE, nvar = 1)
## ------------------------------------------------------------
## multivariate mixed outcomes
## ------------------------------------------------------------
mtcars2 <- mtcars
mtcars2$carb <- factor(mtcars2$carb)
mtcars2$cyl <- factor(mtcars2$cyl)
mtcars.mix <- rfsrc(Multivar(carb, mpg, cyl) ~ ., data = mtcars2)
plot.variable(mtcars.mix, m.target = "cyl", target = "4", partial = TRUE, nvar = 1)
plot.variable(mtcars.mix, m.target = "cyl", target = 2, partial = TRUE, nvar = 1)