ftree: Fits a regression tree to functional and multivariate output...


View source: R/main.R

Usage

ftree(.X = NULL, .Y = NULL, .D = NULL, .SIGMA_inv = NULL,
  cost.type = "sse", tree.type = "single", nP = if (tree.type ==
  "randomforest") round((ncol(.X)/3)) else ncol(.X), nBoot = 1000,
  .minSplit = 20, .minBucket = round(.minSplit/3), .cp = 0.005,
  ArgStep = 1, verbose = TRUE, parallel = TRUE, .predictorType = rep(0,
  ncol(.X)))

Arguments

.X

- An nxp matrix of covariates

.Y

- A matrix of functions stacked in columns. Assumes that all functions were evaluated on the same temporal grid. You can also pass several such matrices as a list, but only with the distance-based cost functions (wss, rdist).

.D

- Optional distance matrix or distance type for the wss/rdist cost functions (see the details)

cost.type

- Cost function type. It can be any of the following: "sse", "mahalanobis", "wss", "l2norm", "rdist", "l2square" (see the details).

tree.type

- Type of tree-based predictor to fit. Currently supported: single tree, random forest, bagging

nP

- The number of predictors to consider on each attempted split. Active only for tree.type = "randomforest"

nBoot

- The number of trees to consider in bootstrapping

.minSplit

- Minimum number of elements a node must contain for a split to be attempted.

.minBucket

- Minimum number of elements in leaf nodes. Defaults to round(.minSplit/3).

.cp

- Complexity parameter. A split is accepted if it provides an improvement of at least .cp*rootGoodness

verbose

- Print progress (default = TRUE)

.predictorType

- A boolean vector of length ncol(.X) specifying the type of each predictor (0 - Continuous, 1 - Categorical). Defaults to all 0 (continuous).
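
As a rough illustration of how the defaults in the Usage signature and the .cp rule resolve, here is a minimal base R sketch. The helpers resolve_defaults and accept_split are hypothetical names introduced only for this example and are not part of the package. The internal splitting code may differ, and the use of >= below is an assumption based on the wording "at least".

```r
# Hypothetical helper mirroring the default expressions in the Usage signature.
# p stands for ncol(.X).
resolve_defaults <- function(p, tree.type = "single", .minSplit = 20) {
  list(
    nP = if (tree.type == "randomforest") round(p / 3) else p,
    .minBucket = round(.minSplit / 3)
  )
}

# Hypothetical illustration of the .cp rule: a split is kept only if its
# improvement is at least .cp times the goodness measured at the root.
accept_split <- function(improvement, rootGoodness, .cp = 0.005) {
  improvement >= .cp * rootGoodness
}

single <- resolve_defaults(p = 10)
forest <- resolve_defaults(p = 10, tree.type = "randomforest")

single$nP          # all 10 predictors are tried at each split
forest$nP          # round(10 / 3) predictors are tried at each split
forest$.minBucket  # round(20 / 3)

accept_split(improvement = 4, rootGoodness = 1000)  # below the 0.005 * 1000 threshold
accept_split(improvement = 6, rootGoodness = 1000)  # above the threshold
```

With the default .cp = 0.005, a split must recover at least 0.5% of the root goodness to be accepted. Raising .cp should therefore give smaller trees.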

Details

This code implements various functional and multivariate tree splitting routines. See the vignette for a detailed description of each cost function and for a tutorial on how to use the code. Note that this is research code, so it includes more cost functions than would normally ship in a release version. The 'sse' and 'wss' cost functions are completely experimental; use them at your own risk.

When using 'rdist' there are two modeling paths. The first is to provide a distance type through the argument .D (e.g. .D = 'euclidean'). The distances are then computed internally with the generic dist function, so the provided distance type must match one of the options available for the method argument of dist() (see help(dist)). The second, for a distance that cannot be computed with the dist function, is to provide a pre-computed distance matrix by assigning it to .D (e.g. .D = <my_dist_matrix>).
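
The two 'rdist' paths can be sketched with simulated data as follows. This is a minimal sketch, not a tested recipe. The data, the object names X, Y and D, and the choice of the "manhattan" distance are all made up for illustration. It also assumes that a pre-computed .D should be a full n x n matrix of distances between the functions (the columns of .Y), which this page does not state explicitly. The ftree calls are left commented out because they require the package to be installed.

```r
set.seed(1)

n   <- 30                       # number of observations (functions)
n_t <- 50                       # number of points on the shared temporal grid
t_grid <- seq(0, 1, length.out = n_t)

# Covariates: an n x p matrix (here p = 2).
X <- matrix(runif(n * 2), nrow = n, ncol = 2)

# Functional output: one function per column, all on the same grid,
# giving an n_t x n matrix.
Y <- sapply(seq_len(n), function(i) sin(2 * pi * X[i, 1] * t_grid) + X[i, 2])

# dist() measures distances between rows, so transpose Y to compare functions.
D <- as.matrix(dist(t(Y), method = "manhattan"))
dim(D)

# Path 1 (not run): let the distances be computed internally from a dist() method name.
# fit1 <- ftree(.X = X, .Y = Y, .D = "manhattan", cost.type = "rdist")

# Path 2 (not run): pass the pre-computed matrix (assumed to be n x n, see above).
# fit2 <- ftree(.X = X, .Y = Y, .D = D, cost.type = "rdist")
```

Path 2 matters mainly for distances that dist() cannot compute. In that case D would come from your own routine rather than from the dist() call shown here.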

Author(s)

Ognjen Grujic (ognjengr@gmail.com)


ogru/fTree documentation built on May 29, 2019, 7:19 a.m.