circforest: Distributional Regression Forests for a Circular Response


View source: R/circforest.R

Description

Distributional forests based on maximum-likelihood estimation of parameters for a circular response employing the von Mises distribution.
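As a rough illustration of what maximum-likelihood estimation for the von Mises distribution involves, the following base-R sketch estimates the location mu and the concentration kappa from a sample of angles in radians. The helper vm_mle is hypothetical and not part of circtree; the package's internal fitting routine may differ.

```r
# Hypothetical helper (not part of circtree): ML estimates for a
# von Mises sample y given in radians.
vm_mle <- function(y) {
  sbar <- mean(sin(y))
  cbar <- mean(cos(y))
  # Location: direction of the mean resultant vector
  mu <- atan2(sbar, cbar)
  # Mean resultant length
  rbar <- sqrt(sbar^2 + cbar^2)
  # Concentration: solve I1(kappa) / I0(kappa) = rbar numerically;
  # exponentially scaled Bessel functions avoid overflow for large kappa
  score <- function(k) {
    besselI(k, 1, expon.scaled = TRUE) / besselI(k, 0, expon.scaled = TRUE) - rbar
  }
  kappa <- uniroot(score, interval = c(1e-8, 1e4))$root
  c(mu = mu, kappa = kappa)
}

# Toy data: wrapped normal angles centred at 1 radian (an assumption made
# only to have something to fit; base R has no von Mises sampler)
set.seed(1)
y <- ((rnorm(200, mean = 1, sd = 0.3) + pi) %% (2 * pi)) - pi
est <- vm_mle(y)
print(est)
```

In a distributional forest, estimates of this kind are computed from weighted subsets of the learning data rather than from the full sample.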

Usage

circforest(formula, data, response_range = NULL, subset, 
           na.action = na.pass, weights, offset, cluster, strata, 
           control = disttree_control(teststat = "quad", testtype = "Univ", 
           mincriterion = 0, saveinfo = FALSE, minsplit = 20, minbucket = 7, 
           splittry = 2, ...), ntree = 500L, fit.par = FALSE, 
           perturb = list(replace = FALSE, fraction = 0.632),
           mtry = ceiling(sqrt(nvar)), applyfun = NULL, cores = NULL, trace = FALSE, ...)
## S3 method for class 'circforest'
predict(object, newdata = NULL,
        type = c("parameter", "response", "weights", "node"),
        OOB = TRUE, scale = TRUE, ...)

Arguments

formula

a symbolic description of the model to be fit. This should be of the form y ~ x1 + x2, where y is the response variable and x1 and x2 are used as partitioning variables.

data

an optional data frame containing the variables in the model.

response_range

an optional vector specifying a range of the circular response.
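To make the role of a response range concrete, the sketch below maps a response recorded in degrees on [0, 360) to angles on (-pi, pi] and back. The helpers to_angle and from_angle are hypothetical and only illustrate the idea; they are not the package's internal transformation, and the two-element form of the range is an assumption.

```r
# Hypothetical helpers (not part of circtree): map between a response
# recorded on [range[1], range[2]) and angles on (-pi, pi].
to_angle <- function(y, range) {
  ang <- ((y - range[1]) / diff(range) * 2 * pi) %% (2 * pi)
  # Shift the upper half of the circle to negative angles
  ifelse(ang > pi, ang - 2 * pi, ang)
}

from_angle <- function(ang, range) {
  (ang %% (2 * pi)) / (2 * pi) * diff(range) + range[1]
}

deg  <- c(0, 90, 180, 270)
ang  <- to_angle(deg, range = c(0, 360))
back <- from_angle(ang, range = c(0, 360))
print(cbind(deg, ang, back))
```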

subset

an optional vector specifying a subset of observations to be used in the fitting process.

na.action

a function which indicates what should happen when the data contain missing values.

weights

optional numeric vector of case weights.

offset

an optional vector of offset values.

cluster

an optional factor indicating independent clusters. Highly experimental, use at your own risk.

strata

an optional factor for stratified sampling.

control

a list with control parameters passed to extree_fit via disttree_control. Default values that are not set within the call of circforest correspond to the defaults used by disttree from the disttree package. saveinfo = FALSE leads to less memory-hungry representations of trees. Note that the arguments mtry, cores and applyfun in disttree_control are ignored for circforest, because they are already set.

ntree

number of trees to grow for the forest.

fit.par

logical. If TRUE, fitted and predicted values and predicted parameters are calculated for the learning data (together with the log-likelihood).

perturb

a list with arguments replace and fraction determining the type of resampling: replace = TRUE refers to the n-out-of-n bootstrap and replace = FALSE to sample splitting. fraction is the proportion of observations to draw without replacement.
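The two resampling schemes can be sketched in base R as follows. The function draw_sample is hypothetical, and rounding the subsample size down with floor() is an assumption made for the illustration, not necessarily what the package does.

```r
# Hypothetical helper: draw the learning sample indices for one tree.
draw_sample <- function(n, perturb) {
  if (perturb$replace) {
    # n-out-of-n bootstrap
    sample.int(n, size = n, replace = TRUE)
  } else {
    # sample splitting: draw a fraction of the observations without replacement
    sample.int(n, size = floor(perturb$fraction * n), replace = FALSE)
  }
}

set.seed(42)
n <- 100
idx_sub  <- draw_sample(n, list(replace = FALSE, fraction = 0.632))
idx_boot <- draw_sample(n, list(replace = TRUE,  fraction = 0.632))

# Observations not drawn for a tree are "out of bag" for that tree
oob <- setdiff(seq_len(n), idx_sub)
print(c(subsample = length(idx_sub), bootstrap = length(idx_boot), oob = length(oob)))
```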

mtry

number of input variables randomly sampled as candidates at each node for random forest like algorithms. Bagging, as special case of a random forest without random input variable sampling, can be performed by setting mtry either equal to Inf or manually equal to the number of input variables.
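A minimal sketch of how mtry limits the candidate variables at a node, assuming ten input variables. It illustrates the default from the usage above and the bagging case; it is not the package's internal selection code.

```r
nvar <- 10

# Default from the usage section: ceiling(sqrt(nvar))
mtry_default <- ceiling(sqrt(nvar))

set.seed(7)
# Random forest: a random subset of the variables is considered at a node
candidates <- sample.int(nvar, size = min(mtry_default, nvar))

# Bagging: with mtry = Inf every variable is a candidate
candidates_bag <- sample.int(nvar, size = min(Inf, nvar))

print(mtry_default)
print(sort(candidates))
```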

applyfun

an optional lapply-style function with arguments function(X, FUN, ...). It is used for computing the variable selection criterion. The default is to use the basic lapply function unless the cores argument is specified (see below).

cores

numeric. If set to an integer, applyfun is set to mclapply with the desired number of cores.
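An lapply-style function of the required form can be sketched as below. The name my_applyfun is hypothetical, and mc.cores = 1L is used only so that the sketch also runs on platforms without forking; on such platforms larger values are not supported by mclapply.

```r
library(parallel)

# Hypothetical lapply-style wrapper with the signature function(X, FUN, ...)
my_applyfun <- function(X, FUN, ...) {
  mclapply(X, FUN, ..., mc.cores = 1L)
}

# Behaves like lapply on a small input
res <- my_applyfun(1:3, function(i) i^2)
print(unlist(res))
```

Such a function could then be supplied as applyfun = my_applyfun, while cores should be left at NULL in that case.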

trace

a logical indicating if a progress bar shall be printed while the forest grows.

object

an object as returned by circforest.

newdata

an optional data frame containing test data.

type

a character string denoting the type of predicted value returned. For "parameter", the predicted distributional parameters are returned on the range (-pi, pi], and for "response", the expectation is returned on the range of the response (response_range). "weights" returns an integer vector of prediction weights. For type = "node", a list of terminal node ids for each of the trees in the forest is returned.

OOB

a logical defining out-of-bag predictions (only if newdata = NULL).

scale

a logical indicating scaling of the nearest neighbor weights by the sum of weights in the corresponding terminal node of each tree. In the simple regression forest, predicting the conditional mean by nearest neighbor weights will be equivalent to (but slower than!) the aggregation of means.
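The equivalence stated for the simple regression forest can be checked on a toy example. The sketch assumes two hand-made "trees" in which every learning observation is used (resampling is ignored), and all object names are hypothetical.

```r
# Toy learning responses
y <- c(1, 2, 3, 4, 10, 12)

# Terminal node ids of the learning observations in two toy trees
nodes <- list(c(1, 1, 1, 2, 2, 2),
              c(1, 1, 2, 2, 2, 2))

# Terminal node a new observation falls into, per tree
new_node <- c(2, 1)

# Nearest neighbor weights, scaled by the node size in each tree
w <- numeric(length(y))
for (b in seq_along(nodes)) {
  in_node <- nodes[[b]] == new_node[b]
  w <- w + in_node / sum(in_node)
}

# Prediction via weights vs. aggregation of per-tree node means
pred_weights <- sum(w * y) / sum(w)
pred_means <- mean(sapply(seq_along(nodes), function(b) {
  mean(y[nodes[[b]] == new_node[b]])
}))

print(c(weights = pred_weights, means = pred_means))
```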

...

arguments to be used to form the default control argument if it is not supplied directly.

Details

Distributional regression forests for a circular response are an application of model-based recursive partitioning and unbiased recursive partitioning based on the implementation in distforest using the infrastructure of extree_fit.

Value

An object of S3 class circforest inheriting from class distforest.

See Also

distforest, disttree, distfit, extree_fit

Examples

#sdat <- circtree_simulate()
#cf <- circforest(y ~ x1 + x2, data = sdat, ntree = 50)

circtree documentation built on Aug. 14, 2019, 3 p.m.