varsel_data: Cached 'var.select' objects for examples, diagnostics and...


Description

Cached var.select objects for examples, diagnostics and vignettes.

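Data sets storing var.select objects, one per training data set, named varsel_<data set>: varsel_iris (classification forest on iris), varsel_Boston (regression forest on MASS::Boston) and varsel_pbc (survival forest on randomForestSRC::pbc).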

Format

var.select object

Details

Computing minimal depth variable selection with the randomForestSRC::var.select function is computationally expensive. We cache var.select objects to reduce the run times of the ggRandomForests examples, diagnostics and vignettes (see cache_rfsrc_datasets to rebuild the complete set of these data sets).
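The caching idea can be sketched in base R as follows. This is only an illustration under stated assumptions: cache_or_build, slow_build and the temporary .rds file are hypothetical names, and saveRDS/readRDS stand in for the package's own mechanism, which ships the cached objects as data sets loaded with data().

```r
# Hypothetical helper: return the cached object if the file exists,
# otherwise run the expensive builder and store its result.
cache_or_build <- function(path, build) {
  if (file.exists(path)) {
    return(readRDS(path))
  }
  obj <- build()
  saveRDS(obj, path)
  obj
}

# Stand-in for an expensive call such as var.select(); it counts how
# often it runs and returns a placeholder list, not real output.
n_calls <- 0
slow_build <- function() {
  n_calls <<- n_calls + 1
  list(placeholder = c("a", "b"))
}

cache_file <- tempfile(fileext = ".rds")
first  <- cache_or_build(cache_file, slow_build)  # builds and writes the file
second <- cache_or_build(cache_file, slow_build)  # reads the file, no rebuild
```

The second call returns the stored object without invoking the builder again, which is the saving the cached data sets provide.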

For each data set listed, we build an rfsrc object (see rfsrc_data), then calculate the minimal depth variable selection with the var.select function, setting method="md". Each data set is built by cache_rfsrc_datasets using the randomForestSRC version listed in the ggRandomForests DESCRIPTION file.
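The two steps described above might look like the following for the iris data. This is a minimal sketch, not the code used by cache_rfsrc_datasets: the seed and ntree value are arbitrary assumptions chosen to keep the run short, and the forest is grown from scratch rather than loaded from the cache.

```r
library(randomForestSRC)

# Step 1: grow a classification forest on iris.
# The seed and ntree = 100 are arbitrary choices for this sketch.
set.seed(42)
rfsrc_iris <- rfsrc(Species ~ ., data = iris, ntree = 100)

# Step 2: minimal depth variable selection, with method = "md" set explicitly.
varsel_iris <- var.select(object = rfsrc_iris, method = "md")

# Names of the variables retained by the minimal depth threshold.
varsel_iris$topvars
```

The resulting object can be passed to gg_minimal_depth in the same way as the cached versions.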

References

#——————— randomForestSRC ———————

Ishwaran H. and Kogalur U.B. (2014). Random Forests for Survival, Regression and Classification (RF-SRC), R package version 1.5.5.

Ishwaran H. and Kogalur U.B. (2007). Random survival forests for R. R News 7(2), 25-31.

Ishwaran H., Kogalur U.B., Blackstone E.H. and Lauer M.S. (2008). Random survival forests. Ann. Appl. Statist. 2(3), 841-860.

#——————— Boston data set ———————

Belsley, D.A., E. Kuh, and R.E. Welsch. 1980. Regression Diagnostics. Identifying Influential Data and Sources of Collinearity. New York: Wiley.

Harrison, D., and D.L. Rubinfeld. 1978. "Hedonic Prices and the Demand for Clean Air." J. Environ. Economics and Management 5: 81-102.

#——————— Iris data set ———————

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole. (has iris3 as iris.)

Fisher, R. A. (1936) The use of multiple measurements in taxonomic problems. Annals of Eugenics, 7, Part II, 179-188.

Anderson, Edgar (1935). The irises of the Gaspe Peninsula, Bulletin of the American Iris Society, 59, 2-5.

#——————— pbc data set ———————

Fleming T.R. and Harrington D.P. (1991). Counting Processes and Survival Analysis. New York: Wiley.

T Therneau and P Grambsch (2000), Modeling Survival Data: Extending the Cox Model, Springer-Verlag, New York. ISBN: 0-387-98784-3.

See Also

iris, Boston, pbc, var.select, rfsrc_data, cache_rfsrc_datasets, gg_minimal_depth, plot.gg_minimal_depth, gg_minimal_vimp, plot.gg_minimal_vimp

Examples

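## Not run:
library(randomForestSRC)  # provides var.select
library(ggRandomForests)  # provides gg_minimal_depth and the cached rfsrc objects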
#---------------------------------------------------------------------
# iris data - classification random forest
#---------------------------------------------------------------------
# load the rfsrc object from the cached data
data(rfsrc_iris, package="ggRandomForests")

# The var.select call
varsel_iris <- var.select(rfsrc_iris)

# plot the forest minimal depth ranking
gg_dta <- gg_minimal_depth(varsel_iris)
plot(gg_dta)


#---------------------------------------------------------------------
# MASS::Boston data - regression random forest 
#---------------------------------------------------------------------
# load the rfsrc object from the cached data
data(rfsrc_Boston, package="ggRandomForests")

# The var.select call
varsel_Boston <- var.select(rfsrc_Boston)

# plot the forest minimal depth ranking
gg_dta <- gg_minimal_depth(varsel_Boston)
plot(gg_dta)

#---------------------------------------------------------------------
# randomForestSRC::pbc data - survival random forest
#---------------------------------------------------------------------
# load the rfsrc object from the cached data
data(rfsrc_pbc, package="ggRandomForests")

# The var.select call
varsel_pbc <- var.select(rfsrc_pbc)

# plot the forest minimal depth ranking
gg_dta <- gg_minimal_depth(varsel_pbc)
plot(gg_dta)
                   

## End(Not run)

ehrlinger/ggRFVignette documentation built on May 16, 2019, 12:16 a.m.