plot.variable: Plot Marginal Effect of Variables
In ehrlinger/randomForestSRC: Random Forests for Survival, Regression and Classification (RF-SRC)


Plot the marginal effect of an x-variable on the class probability (classification), response (regression), mortality (survival), or the expected years lost (competing risk) from an RF-SRC analysis. Users can select between marginal plots (unadjusted, but fast) and partial plots (adjusted, but slow).

## S3 method for class 'rfsrc'
plot.variable(x, xvar.names, which.class,
  outcome.target = NULL, time, surv.type = c("mort", "rel.freq",
  "surv", "years.lost", "cif", "chf"), class.type =
  c("prob", "bayes"), partial = FALSE, oob = TRUE, show.plots = TRUE,
  plots.per.page = 4, granule = 5, sorted = TRUE, nvar, npts = 25,
  smooth.lines = FALSE, subset, ...)

`x`	An object of class `(rfsrc, grow)`, `(rfsrc, synthetic)`, `(rfsrc, predict)`, or `(rfsrc, plot.variable)`. See the examples below for illustration of the latter.
`xvar.names`	Names of the x-variables to be used.
`which.class`	For classification families, an integer or character value specifying the class to focus on (defaults to the first class). For competing risk families, an integer value between 1 and `J` indicating the event of interest, where `J` is the number of event types. The default is to use the first event type.
`outcome.target`	Character value for multivariate families specifying the target outcome to be used. The default is to use the first coordinate.
`time`	For survival families, the time at which the predicted survival value is evaluated (depends on `surv.type`).
`surv.type`	For survival families, specifies the predicted value. See details below.
`class.type`	For classification families, specifies the predicted value. See details below.
`partial`	Should partial plots be used?
`oob`	OOB (TRUE) or in-bag (FALSE) predicted values.
`show.plots`	Should plots be displayed?
`plots.per.page`	Integer value controlling page layout.
`granule`	Integer value controlling whether a variable is treated as a factor and therefore displayed as a boxplot. Larger values coerce boxplots.
`sorted`	Should variables be sorted by importance values?
`nvar`	Number of variables to be plotted. Default is all.
`npts`	Maximum number of points used when generating partial plots for continuous variables.
`smooth.lines`	Use lowess to smooth partial plots.
`subset`	Vector indicating which rows of the x-variable matrix `x$xvar` to use. All rows are used if not specified.
`...`	Further arguments passed to or from other methods.
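
The `granule` rule can be pictured with a small base-R helper. This is only a sketch under an assumption, not the package's code: it assumes a variable is drawn as a boxplot when it is a factor or has no more than `granule` distinct values. The helper name `is_boxplot_var` is hypothetical.

```r
# Hypothetical helper (not part of randomForestSRC): decides whether a
# variable would be shown as a boxplot under the ASSUMED rule
# "is a factor, or has at most `granule` distinct non-missing values".
is_boxplot_var <- function(x, granule = 5) {
  is.factor(x) || length(unique(x[!is.na(x)])) <= granule
}

# mtcars ships with base R: cyl and gear take only a few distinct values,
# while mpg and hp take many.
boxplot_flags <- vapply(mtcars[c("cyl", "gear", "mpg", "hp")],
                        is_boxplot_var, logical(1), granule = 5)
print(boxplot_flags)
```

Under this reading, raising `granule` lets more numeric variables fall below the threshold, which matches the statement that larger values coerce boxplots.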

The vertical axis displays the ensemble predicted value, while x-variables are plotted on the horizontal axis.

For regression, the predicted response is used.
For classification, it is the predicted class probability specified by which.class, or the class of maximum probability depending on class.type.
For multivariate families, it is the predicted value of the outcome specified by outcome.target and if that is a classification outcome, by which.class.
For survival, the choices are:
- Mortality (mort).
- Relative frequency of mortality (rel.freq).
- Predicted survival (surv), where the predicted survival is for the time point specified using time (the default is the median follow-up time).
For competing risks, the choices are:
- The expected number of life years lost (years.lost).
- The cumulative incidence function (cif).
- The cumulative hazard function (chf).
In all three cases, the predicted value is for the event type specified by which.class. For cif and chf the quantity is evaluated at the time point specified by time.

For partial plots, use partial=TRUE. Their interpretation differs from that of marginal plots. The y-value for a variable X, evaluated at X=x, is

\tilde{f}(x) = \frac{1}{n} \sum_{i=1}^n \hat{f}(x, x_{i,o}),

where x_{i,o} represents the values of all variables other than X for individual i and \hat{f} is the predicted value. Generating partial plots can be very slow. Choosing a small value for npts can reduce computation time because it restricts the number of distinct x values used in computing \tilde{f}.
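
The averaging in the formula above can be sketched in base R. This is an illustration under stated assumptions rather than the package's implementation: `lm()` stands in for the forest predictor \hat{f}, the helper `partial_values` is hypothetical, and the way the grid is thinned to at most `npts` values is one plausible choice.

```r
# ASSUMPTION: lm() stands in for the forest predictor f-hat; any model with
# a predict() method could be substituted.
fit <- lm(mpg ~ wt + hp + disp, data = mtcars)

# Hypothetical helper: partial value of `xvar` on a grid of at most `npts`
# distinct observed values.
partial_values <- function(fit, data, xvar, npts = 25) {
  grid <- sort(unique(data[[xvar]]))
  if (length(grid) > npts) {
    # Thin the grid to at most npts roughly evenly spaced order statistics.
    grid <- grid[unique(round(seq(1, length(grid), length.out = npts)))]
  }
  yhat <- vapply(grid, function(x) {
    newdata <- data
    newdata[[xvar]] <- x          # set X = x for every individual i
    mean(predict(fit, newdata))   # average f-hat(x, x_{i,o}) over i = 1..n
  }, numeric(1))
  data.frame(x = grid, yhat = yhat)
}

pd <- partial_values(fit, mtcars, "wt", npts = 10)
print(pd)
```

Each grid point costs one full pass of predictions over all n rows, which is why a smaller `npts` shortens the run.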

For continuous variables, red points indicate partial values and dashed red lines indicate a smoothed error bar of +/- two standard errors. Black dashed lines are the partial values. Set smooth.lines=TRUE for lowess-smoothed lines. For discrete variables, partial values are shown as boxplots with whiskers extending approximately two standard errors from the mean. Standard errors are meant only as a guide and should be interpreted with caution.
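
A rough picture of the smoothed +/- two standard error band, again in base R. The assumptions here are loud ones: `lm()` stands in for the forest, and the standard error at each grid point is taken to be the standard deviation of the n individual predictions divided by sqrt(n). The package's exact computation may differ.

```r
# ASSUMPTIONS: lm() replaces the forest, and se = sd(predictions) / sqrt(n).
dat <- na.omit(airquality)
fit <- lm(Ozone ~ Wind + Temp + Solar.R, data = dat)
grid <- sort(unique(dat$Temp))

# For each grid value, fix Temp = x for all rows and summarise predictions.
stats <- vapply(grid, function(x) {
  newdata <- dat
  newdata$Temp <- x
  p <- predict(fit, newdata)
  c(yhat = mean(p), se = sd(p) / sqrt(length(p)))
}, numeric(2))
band <- data.frame(x = grid, yhat = stats["yhat", ], se = stats["se", ])

# Lowess-smooth the partial values and the +/- 2 SE limits.
center <- lowess(band$x, band$yhat)
lower  <- lowess(band$x, band$yhat - 2 * band$se)
upper  <- lowess(band$x, band$yhat + 2 * band$se)

# Draw only in an interactive session so the sketch has no side effects.
if (interactive()) {
  plot(band$x, band$yhat, col = "red", pch = 16, xlab = "Temp", ylab = "Ozone")
  lines(center, lty = 2)
  lines(lower, col = "red", lty = 2)
  lines(upper, col = "red", lty = 2)
}
```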

Partial plots can be slow. Setting npts to a smaller number can help.

Hemant Ishwaran and Udaya B. Kogalur

Friedman J.H. (2001). Greedy function approximation: a gradient boosting machine, Ann. of Statist., 29(5):1189-1232.

Ishwaran H., Kogalur U.B. (2007). Random survival forests for R, Rnews, 7(2):25-31.

Ishwaran H., Kogalur U.B., Blackstone E.H. and Lauer M.S. (2008). Random survival forests, Ann. App. Statist., 2:841-860.

Ishwaran H., Gerds T.A., Kogalur U.B., Moore R.D., Gange S.J. and Lau B.M. (2014). Random survival forests for competing risks, Biostatistics, 15(4):757-773.

rfsrc, rfsrcSyn, predict.rfsrc

## Not run: 
## ------------------------------------------------------------
## survival/competing risk
## ------------------------------------------------------------

## survival
library(randomForestSRC)
data(veteran, package = "randomForestSRC")
v.obj <- rfsrc(Surv(time, status) ~ ., veteran, nsplit = 10, ntree = 100)
plot.variable(v.obj, plots.per.page = 3)
plot.variable(v.obj, plots.per.page = 2, xvar.names = c("trt", "karno", "age"))
plot.variable(v.obj, surv.type = "surv", nvar = 1, time = 200)
plot.variable(v.obj, surv.type = "surv", partial = TRUE, smooth.lines = TRUE)
plot.variable(v.obj, surv.type = "rel.freq", partial = TRUE, nvar = 2)

## example of plot.variable calling a pre-processed plot.variable object
p.v <- plot.variable(v.obj, surv.type = "surv", partial = TRUE, smooth.lines = TRUE)
plot.variable(p.v)
p.v$plots.per.page <- 1
p.v$smooth.lines <- FALSE
plot.variable(p.v)

## competing risks
data(follic, package = "randomForestSRC")
follic.obj <- rfsrc(Surv(time, status) ~ ., follic, nsplit = 3, ntree = 100)
plot.variable(follic.obj, which.class = 2)

## ------------------------------------------------------------
## regression
## ------------------------------------------------------------

## airquality
airq.obj <- rfsrc(Ozone ~ ., data = airquality)
plot.variable(airq.obj, partial = TRUE, smooth.lines = TRUE)

## motor trend cars
mtcars.obj <- rfsrc(mpg ~ ., data = mtcars)
plot.variable(mtcars.obj, partial = TRUE, smooth.lines = TRUE)

## ------------------------------------------------------------
## classification
## ------------------------------------------------------------

## iris
iris.obj <- rfsrc(Species ~., data = iris)
plot.variable(iris.obj, partial = TRUE)

## motor trend cars: predict number of carburetors
mtcars2 <- mtcars
mtcars2$carb <- factor(mtcars2$carb,
   labels = paste("carb", sort(unique(mtcars$carb))))
mtcars2.obj <- rfsrc(carb ~ ., data = mtcars2)
plot.variable(mtcars2.obj, partial = TRUE)

## ------------------------------------------------------------
## multivariate regression
## ------------------------------------------------------------
mtcars.mreg <- rfsrc(Multivar(mpg, cyl) ~., data = mtcars)
plot.variable(mtcars.mreg, outcome.target = "mpg", partial = TRUE, nvar = 1)
plot.variable(mtcars.mreg, outcome.target = "cyl", partial = TRUE, nvar = 1)


## ------------------------------------------------------------
## multivariate mixed outcomes
## ------------------------------------------------------------
mtcars2 <- mtcars
mtcars2$carb <- factor(mtcars2$carb)
mtcars2$cyl <- factor(mtcars2$cyl)
mtcars.mix <- rfsrc(Multivar(carb, mpg, cyl) ~ ., data = mtcars2)
plot.variable(mtcars.mix, outcome.target = "cyl", which.class = "4", partial = TRUE, nvar = 1)
plot.variable(mtcars.mix, outcome.target = "cyl", which.class = 2, partial = TRUE, nvar = 1)



## End(Not run)

ehrlinger/randomForestSRC documentation built on May 16, 2019, 1:20 a.m.