longitrees: Construction of Multiple Decision Trees for Longitudinal Data

Description

Generates multiple trees from bootstrap samples and evaluates all three-tree combinations based on two criteria: cross-validated prediction error and tree diversification measured by the adjusted Rand index (ARI). Bootstrap sampling is performed at the subject level to preserve longitudinal structure.
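The adjusted Rand index compares two partitions of the same subjects, for example the terminal-node memberships produced by two trees. The base R sketch below uses the standard ARI formula; the function name adjusted_rand_index and the node labels are illustrative assumptions, not part of the package, and how longitrees turns ARI values into a diversification score is not shown here.

```r
# Adjusted Rand index between two partitions of the same subjects (base R only).
# Hypothetical helper for illustration; not a function exported by the package.
adjusted_rand_index <- function(a, b) {
  stopifnot(length(a) == length(b), length(a) >= 2)
  n_pairs <- function(n) n * (n - 1) / 2       # number of unordered pairs
  tab <- table(a, b)                           # contingency table of the labelings
  sum_cells <- sum(n_pairs(tab))
  sum_rows  <- sum(n_pairs(rowSums(tab)))
  sum_cols  <- sum(n_pairs(colSums(tab)))
  expected  <- sum_rows * sum_cols / n_pairs(length(a))
  maximum   <- (sum_rows + sum_cols) / 2
  if (maximum == expected) return(1)           # degenerate case: trivial partitions
  (sum_cells - expected) / (maximum - expected)
}

# Terminal-node memberships of six subjects under two hypothetical trees
nodes_tree1 <- c(1, 1, 1, 2, 2, 2)
nodes_tree2 <- c(1, 1, 2, 2, 3, 3)

ari_same <- adjusted_rand_index(nodes_tree1, nodes_tree1)  # identical partitions
ari_diff <- adjusted_rand_index(nodes_tree1, nodes_tree2)  # partially agreeing partitions
```

An ARI of 1 means the two trees group the subjects identically; values near 0 indicate agreement no better than chance, which suggests the trees split the subjects in different ways.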

Usage

longitrees(
  formula,
  time,
  random,
  weight = "w",
  data,
  alpha = "no",
  gamma = "no",
  cv = "no",
  maxdepth = 5,
  minbucket = 5,
  minsplit = 20,
  xval = 10,
  bootsize,
  trees = 100,
  mins = 40
)

Arguments

formula

A formula specifying the model. The response variable should be on the left side and covariates on the right side. Use response ~ . to include all covariates except the time variable and the random effect, or select specific covariates such as response ~ x1 + x2. Time-invariant (baseline) covariates are assumed.

time

Character string giving the column name of the time variable. All individuals are assumed to be observed at the same time points.

random

Character string giving the column name of the random effect (subject identifier).

weight

Weight balancing the main effect of a covariate against its interaction with time. Either a value in {0.0, 0.1, ..., 1.0} or "w". With 1.0, a split is evaluated only on the mean difference in the response variable between the two groups; with 0.0, only on the difference in the change over time of the response variable between the two groups. Set weight = "w" (the default) to select the optimal weight from the same grid at each node.

data

A data frame containing the variables in formula together with the time and random-effect variables.
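As a rough illustration of the expected layout, the sketch below simulates a long-format data frame with one row per subject and time point, then checks the two assumptions stated for formula and time: baseline covariates are constant within a subject, and every subject is observed at the same time points. All column names (id, time, x1, x2, y) and the simulated values are assumptions made for this example.

```r
set.seed(1)
n_subjects <- 30
times <- c(0, 1, 2, 3)

# One row per subject with time-invariant (baseline) covariates
baseline <- data.frame(
  id = seq_len(n_subjects),
  x1 = rnorm(n_subjects),
  x2 = factor(sample(c("a", "b"), n_subjects, replace = TRUE))
)

# Cross join with the time grid: every subject gets the same time points
long <- merge(baseline, data.frame(time = times), by = NULL)
long <- long[order(long$id, long$time), ]
long$y <- 1 + 0.5 * long$x1 + 0.3 * long$time + rnorm(nrow(long), sd = 0.5)

# Check: identical time points for every subject
same_times <- all(tapply(long$time, long$id,
                         function(t) isTRUE(all.equal(sort(t), times))))

# Check: the baseline covariate does not vary within a subject
x1_constant <- all(tapply(long$x1, long$id,
                          function(v) length(unique(v)) == 1))
```

A frame shaped like this would be passed as data, with formula = y ~ x1 + x2, time = "time", and random = "id".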

alpha

Significance level used as the stopping rule for tree growth. A smaller value produces a more conservative (smaller) tree. Specify a numeric value, or "no" (the default) to leave this rule unused. Corresponds to ST2.

gamma

Complexity parameter for pruning. A larger value prunes more aggressively, yielding a smaller and simpler tree; a smaller value retains more branches. Specify a numeric value, or "no" (the default) to leave pruning unused. Corresponds to ST3.

cv

Set to "yes" to construct the decision tree using cross-validation, or "no" (the default) otherwise. Corresponds to ST1.

maxdepth

Maximum depth of the tree (default 5).

minbucket

Minimum number of subjects in a terminal node (default 5).

minsplit

Minimum number of subjects required to attempt a split (default 20).

xval

Number of cross-validation folds (default 10). Used to compute the cross-validated coefficient of determination (R^2_CV); when cv = "yes", it is also used for final tree selection.
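One common way to compute a cross-validated coefficient of determination is 1 minus the ratio of the out-of-fold squared prediction error to the total sum of squares. The sketch below follows that definition with folds assigned per subject; both the fold assignment and the formula are assumptions here, since the exact definition used by the package is not given on this page. A linear model stands in for a fitted tree purely to keep the example self-contained.

```r
set.seed(7)
n_subjects <- 40
long <- data.frame(
  id   = rep(seq_len(n_subjects), each = 3),
  time = rep(0:2, times = n_subjects),
  x1   = rep(rnorm(n_subjects), each = 3)     # baseline covariate
)
long$y <- 1 + 2 * long$x1 + 0.5 * long$time + rnorm(nrow(long), sd = 0.5)

xval <- 10
# One fold label per subject, so all rows of a subject share a fold
subject_fold <- sample(rep_len(seq_len(xval), n_subjects))
row_fold <- subject_fold[long$id]

pred <- numeric(nrow(long))
for (k in seq_len(xval)) {
  held_out <- row_fold == k
  # Stand-in model; the package would fit a tree on the training folds instead
  fit <- lm(y ~ x1 + time, data = long[!held_out, ])
  pred[held_out] <- predict(fit, newdata = long[held_out, ])
}

# Cross-validated R^2 from out-of-fold predictions
r2_cv <- 1 - sum((long$y - pred)^2) / sum((long$y - mean(long$y))^2)
```

Keeping each subject inside a single fold avoids evaluating a model on measurements from subjects it has already seen during fitting.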

bootsize

Number of subjects in each bootstrap sample.
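Subject-level bootstrap sampling draws subject identifiers with replacement and keeps every measurement of each drawn subject together. A minimal base R sketch follows; the helper bootstrap_subjects, the boot_id column, and the toy data are hypothetical, and whether the package relabels repeated subjects in this way is an assumption.

```r
set.seed(42)
long <- data.frame(
  id   = rep(1:6, each = 3),
  time = rep(0:2, times = 6),
  y    = rnorm(18)
)

# Hypothetical helper: resample whole subjects, never individual rows
bootstrap_subjects <- function(data, id_col, bootsize) {
  ids <- unique(data[[id_col]])
  drawn <- ids[sample.int(length(ids), size = bootsize, replace = TRUE)]
  pieces <- lapply(seq_along(drawn), function(i) {
    rows <- data[data[[id_col]] == drawn[i], , drop = FALSE]
    rows$boot_id <- i   # fresh identifier so a subject drawn twice stays distinct
    rows
  })
  do.call(rbind, pieces)
}

boot <- bootstrap_subjects(long, "id", bootsize = 6)
```

Each drawn subject contributes its full trajectory, so the within-subject correlation over time is preserved in every bootstrap sample.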

trees

Number of bootstrap trees to grow (default 100).

mins

Number of top-ranking candidate three-tree subsets to retain (default 40).
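To give a sense of scale for trees and mins, the sketch below counts the three-tree subsets of 100 trees and enumerates them for a small case. The score used for ranking is a random placeholder, not the package's criterion; how prediction error and ARI are combined to rank subsets is not specified on this page.

```r
trees <- 100
n_triples <- choose(trees, 3)     # number of candidate three-tree subsets

# Small illustration with 5 trees: one row per three-tree subset
triples <- t(combn(5, 3))

# Placeholder score per subset (smaller is better); purely hypothetical
set.seed(3)
score <- runif(nrow(triples))

# Retain the top-ranking subsets, as mins does for the real criteria
mins <- 4
top <- triples[order(score)[seq_len(mins)], , drop = FALSE]
```

With the default trees = 100 there are 161700 candidate subsets, so retaining only the mins best ones keeps the later selection step manageable.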

Details

See longitree for a description of the three single-tree construction procedures (ST1, ST2, ST3) corresponding to cv, alpha, and gamma.

Value

An object of class "longitrees". Pass it to selectionplot to select the optimal three-tree combination.

References

Obata, R. and Sugimoto, T. (2026). A decision tree analysis for longitudinal measurement data and its applications. Advances in Data Analysis and Classification. doi:10.1007/s11634-025-00665-2

See Also

longitree, selectionplot, threetrees, treeplot


longitree documentation built on May 16, 2026, 5:06 p.m.