partial_data: Cached 'plot.variable' objects for examples, diagnostics and...

Description

Cached plot.variable objects for examples, diagnostics and vignettes.

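Data sets storing plot.variable objects corresponding to training data, named by the data set they were built from: partial_iris (classification forest for the iris data), partial_Boston (regression forest for the MASS::Boston data) and partial_pbc (survival forest for the randomForestSRC::pbc data).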

Format

plot.variable

Details

Constructing partial plot data with the randomForestSRC::plot.variable function is computationally expensive. We cache plot.variable objects to shorten the run times of the ggRandomForests examples, diagnostics and vignettes (see cache_rfsrc_datasets to rebuild the complete set of these data sets).
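The caching idea can be sketched in base R. This is a minimal illustration, not the package's mechanism: ggRandomForests ships its cached objects as package data loaded with data(), whereas the sketch below uses saveRDS/readRDS on a temporary file. The helper cache_or_compute and the stand-in slow_step are hypothetical names introduced only for this example.

```r
# Hypothetical helper: compute a value once, then reuse the copy saved on disk.
cache_or_compute <- function(path, compute) {
  if (file.exists(path)) {
    return(readRDS(path))   # cache hit: skip the expensive step
  }
  value <- compute()        # cache miss: do the expensive work once
  saveRDS(value, path)
  value
}

# Stand-in for an expensive call such as plot.variable(..., partial = TRUE).
n_calls <- 0
slow_step <- function() {
  n_calls <<- n_calls + 1   # count how often the expensive step really runs
  Sys.sleep(0.1)
  list(note = "placeholder for partial plot data")
}

cache_file <- tempfile(fileext = ".rds")
first  <- cache_or_compute(cache_file, slow_step)  # computes and saves
second <- cache_or_compute(cache_file, slow_step)  # reads the saved copy
```

The second call returns the stored object without running slow_step again, which is the saving the cached partial_* data sets provide to the examples and vignettes.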

For each data set listed, we build an rfsrc forest (see rfsrc_data), then calculate the partial plot data with the plot.variable function, setting partial=TRUE. Each data set is built by cache_rfsrc_datasets using the randomForestSRC version listed in the ggRandomForests DESCRIPTION file.
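The two build steps described above might look roughly like the following for the iris data. This is a sketch under assumptions: the wrapper build_partial_iris is a hypothetical name, ntree = 100 is an arbitrary choice, and the settings actually used by cache_rfsrc_datasets are not shown in this page. The code skips the work when randomForestSRC is not installed.

```r
# Sketch only: build_partial_iris and ntree = 100 are illustrative assumptions,
# not the settings used by cache_rfsrc_datasets.
build_partial_iris <- function(ntree = 100) {
  if (!requireNamespace("randomForestSRC", quietly = TRUE)) {
    message("randomForestSRC is not installed; skipping.")
    return(invisible(NULL))
  }
  # Step 1: grow the forest (classification, because Species is a factor).
  forest <- randomForestSRC::rfsrc(Species ~ ., data = iris, ntree = ntree)
  # Step 2: compute the partial plot data without drawing base graphics.
  randomForestSRC::plot.variable(forest, partial = TRUE, show.plots = FALSE)
}

partial_iris_sketch <- build_partial_iris()
```

If the package is available, the result can be passed to gg_partial in the same way as the cached partial_iris object.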

References

#——————— randomForestSRC ———————

Ishwaran H. and Kogalur U.B. (2014). Random Forests for Survival, Regression and Classification (RF-SRC), R package version 1.5.5.

Ishwaran H. and Kogalur U.B. (2007). Random survival forests for R. R News 7(2), 25-31.

Ishwaran H., Kogalur U.B., Blackstone E.H. and Lauer M.S. (2008). Random survival forests. Ann. Appl. Statist. 2(3), 841-860.

#——————— Boston data set ———————

Belsley, D.A., E. Kuh, and R.E. Welsch. 1980. Regression Diagnostics. Identifying Influential Data and Sources of Collinearity. New York: Wiley.

Harrison, D., and D.L. Rubinfeld. 1978. "Hedonic Prices and the Demand for Clean Air." J. Environ. Economics and Management 5: 81-102.

#——————— Iris data set ———————

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole. (has iris3 as iris.)

Fisher, R. A. (1936) The use of multiple measurements in taxonomic problems. Annals of Eugenics, 7, Part II, 179-188.

Anderson, Edgar (1935). The irises of the Gaspe Peninsula, Bulletin of the American Iris Society, 59, 2-5.

#——————— pbc data set ———————

Fleming T.R. and Harrington D.P. (1991) Counting Processes and Survival Analysis. New York: Wiley.

T Therneau and P Grambsch (2000), Modeling Survival Data: Extending the Cox Model, Springer-Verlag, New York. ISBN: 0-387-98784-3.

See Also

iris MASS::Boston pbc plot.variable rfsrc_data cache_rfsrc_datasets gg_partial plot.gg_partial

Examples

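## Not run: 
library(randomForestSRC)  # provides plot.variable
library(ggRandomForests)  # provides gg_partial and the cached rfsrc objects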
#---------------------------------------------------------------------
# iris data - classification random forest
#---------------------------------------------------------------------
# load the rfsrc object from the cached data
data(rfsrc_iris, package="ggRandomForests")

# The plot.variable call
partial_iris <- plot.variable(rfsrc_iris,
                              partial = TRUE, show.plots = FALSE)

# plot the forest partial plots
gg_dta <- gg_partial(partial_iris)
plot(gg_dta, panel=TRUE)

#---------------------------------------------------------------------
# MASS::Boston data - regression random forest 
#---------------------------------------------------------------------
# load the rfsrc object from the cached data
data(rfsrc_Boston, package="ggRandomForests")

# The plot.variable call
partial_Boston <- plot.variable(rfsrc_Boston,
                                partial = TRUE, show.plots = FALSE)

# plot the forest partial plots
gg_dta <- gg_partial(partial_Boston)
plot(gg_dta, panel=TRUE)

#---------------------------------------------------------------------
# randomForestSRC::pbc data - survival random forest
#---------------------------------------------------------------------
# load the rfsrc object from the cached data
data(rfsrc_pbc, package="ggRandomForests")

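# The plot.variable call -
# survival requires a time point specification.
# for the pbc data, we want 1, 3 and 5 year survival.
# xvar: the predictors to plot. Assumption: use every predictor in the
# forest; substitute a subset of interest to reduce run time.
xvar <- rfsrc_pbc$xvar.names
partial_pbc <- lapply(c(1, 3, 5), function(tm) {
  plot.variable(rfsrc_pbc, surv.type = "surv",
                time = tm,
                xvar.names = xvar,
                partial = TRUE,
                show.plots = FALSE)
})
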
# plot the forest partial plots
gg_dta <- gg_partial(partial_pbc)
plot(gg_dta)

## End(Not run)

ehrlinger/ggRFVignette documentation built on May 16, 2019, 12:16 a.m.