View source: R/mobforest.analysis.R
Main function that takes all the necessary arguments to start model-based random forest analysis.
mobforest.analysis(formula, partition_vars, data,
  mobforest_controls = mobforest.control(),
  new_test_data = as.data.frame(matrix(0, 0, 0)), processors = 1,
  model = linearModel, family = NULL, prob_cutoff = 0.5,
  seed = sample(1:1e+07, 1))
formula
An object of class formula specifying the model. This should be of the form y ~ x_1 + ... + x_k, where x_1, x_2, ..., x_k are predictor variables and y is the outcome variable. This model is referred to as the node model.
partition_vars
A character vector specifying the partition variables.
data
An input dataset used for constructing the trees in the random forest.
mobforest_controls
An object of class
new_test_data
A data frame of test data for validating the random forest model. These data are not used in the tree-building process.
processors
The number of processors/cores on your computer to use for parallel computation.
model
A model of class
family
A description of the error distribution and link function to be used in the model. This parameter must be specified when a generalized linear model is used as the node model: binomial() for logistic regression and poisson() for Poisson regression. These are the only values allowed.
prob_cutoff
When logistic regression is the node model, the predicted probabilities for OOB cases are converted into classes (yes/no, high/low, etc., as specified) based on this probability cutoff. If logistic regression is not the node model, prob_cutoff = NULL. The default is 0.5 when the parameter is not specified (and logistic regression is the node model).
seed
Since this function uses parallel processes, to replicate results, set the cluster
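To illustrate the prob_cutoff argument, the following base R sketch shows how predicted probabilities might be converted into classes. The vector oob_prob and the labels "no"/"yes" are made up for the example, and mapping probabilities strictly above the cutoff to the second class is an assumption; the package's own treatment of ties may differ.

```r
# Hypothetical OOB predicted probabilities from a logistic node model
oob_prob <- c(0.12, 0.50, 0.73, 0.49, 0.91)
prob_cutoff <- 0.5

# Assumption: probabilities strictly above the cutoff map to the second level
oob_class <- factor(ifelse(oob_prob > prob_cutoff, "yes", "no"),
                    levels = c("no", "yes"))

# Count how many cases fall in each class
table(oob_class)
```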
mobforest.analysis is the main function that takes all the input parameters - model, partition variables, and forest control parameters - and starts the model-based random forest analysis. mobforest.analysis calls the bootstrap function, which constructs decision trees and computes out-of-bag (OOB) predictions, OOB predictive accuracy, and the perturbation in OOB predictive accuracy through permutation. bootstrap constructs trees on multiple cores/processors simultaneously through parallel computation. Later, the get.mf.object function wraps the analysis output into a mobforest.output object.
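The idea of out-of-bag cases can be sketched in a few lines of base R. This is a generic illustration of bootstrap sampling, not the package's bootstrap function; n, ntree, in_bag and oob are hypothetical names chosen for the example.

```r
set.seed(1111)
n <- 10       # hypothetical number of observations
ntree <- 3    # hypothetical number of trees

# For each tree, draw a bootstrap sample of row indices (with replacement)
in_bag <- lapply(seq_len(ntree), function(b) sample.int(n, n, replace = TRUE))

# Rows never drawn for a given tree are that tree's out-of-bag (OOB) cases
oob <- lapply(in_bag, function(idx) setdiff(seq_len(n), idx))

# Each tree is fit on its in-bag rows and evaluated on its OOB rows
str(oob)
```

Because each tree leaves some rows out, predictions for those rows give an accuracy estimate that does not reuse the data the tree was built on.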
Predictive accuracy estimates are computed using a pseudo-R2 metric, defined as the proportion of total variation in the outcome variable explained by a tree model on out-of-bag cases. R2 ranges from 0 to 1. An R2 of 0 indicates the worst tree model (in terms of predicting the outcome) and an R2 of 1 indicates a perfect tree model.
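One common way to express such a metric is 1 - SSE/SST. The sketch below assumes that form and assumes negative values are truncated at 0 so that the result stays within the stated range; pseudo_r2 is a hypothetical helper, not a function exported by the package, and the package's exact formula may differ.

```r
# Hypothetical helper: proportion of variation in y explained by y_hat
pseudo_r2 <- function(y, y_hat) {
  sse <- sum((y - y_hat)^2)       # unexplained variation
  sst <- sum((y - mean(y))^2)     # total variation around the mean
  max(0, 1 - sse / sst)           # assumption: truncate at 0
}

y <- c(3, 5, 7, 9)
pseudo_r2(y, y)                   # perfect predictions
pseudo_r2(y, rep(mean(y), 4))     # predicting the mean explains nothing
```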
An object of class mobforest.output.
Zeileis, A., Hothorn, T. and Hornik, K. (2008) Model-based recursive partitioning, Journal of Computational and Graphical Statistics, 17(2), 492-514.
Hothorn, T., Hornik, K. and Zeileis, A. (2006) Unbiased recursive partitioning: A conditional inference framework, Journal of Computational and Graphical Statistics, 15, 651-674.
Strobl, C., Malley, J. and Tutz, G. (2009) An introduction to recursive partitioning: rationale, application, and characteristics of classification and regression trees, bagging, and random forests, Psychological Methods, 14, 323-348.
mobforest.control(),
mobforest.output-class
library(mlbench)
library(mobForest)
set.seed(1111)
# Random forest analysis of model-based recursive partitioning
# Load data
data("BostonHousing", package = "mlbench")
BostonHousing <- BostonHousing[1:90, c("rad", "tax", "crim", "medv", "lstat")]
# Recursive partitioning based on linear regression model medv ~ lstat with 3
# trees. 1 core/processor used.
rfout <- mobforest.analysis(as.formula(medv ~ lstat), c("rad", "tax", "crim"),
  mobforest_controls = mobforest.control(ntree = 3, mtry = 2, replace = TRUE,
    alpha = 0.05, bonferroni = TRUE, minsplit = 25), data = BostonHousing,
  processors = 1, model = linearModel, seed = 1111)
## Not run:
rfout
## End(Not run)