LongCART: Longitudinal CART with continuous response via binary...

LongCART R Documentation

Longitudinal CART with continuous response via binary partitioning

Description

Recursive partitioning for linear mixed effects models with continuous univariate response variables, following the LongCART algorithm and using baseline partitioning variables (Kundu and Harezlak, 2019).

Usage

LongCART(data, patid, fixed, gvars, tgvars, minsplit=40,
         minbucket=20, alpha=0.05, coef.digits=2, print.lme=FALSE)

Arguments

data

name of the dataset. It must contain the variable specified for patid (the subject id), all variables specified in the formula, and the baseline partitioning variables.

patid

name of the subject id variable.

fixed

a two-sided linear formula object describing the fixed-effects part of the model, with the response on the left of a ~ operator and the terms, separated by + operators, on the right. Adding -1 to the end of the right-hand side specifies a model with no intercept. For a model with no fixed effects beyond the intercept, specify only 1 to the right of the ~ operator.
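The three kinds of formula described for fixed can be written as ordinary R formula objects. The sketch below is only an illustration in base R: the variable names cd4 and time are borrowed from the Examples section, and has_intercept() is a hypothetical helper, not part of LongCART.

```r
# Fixed-effects formulas of the kinds described for the `fixed` argument.
# Variable names (cd4, time) are borrowed from the Examples section.
f_slope   <- cd4 ~ time        # intercept plus a slope in time
f_noint   <- cd4 ~ time - 1    # "-1" drops the intercept
f_intonly <- cd4 ~ 1           # no fixed effect beyond the intercept

# Hypothetical helper: terms() records whether a formula has an intercept
has_intercept <- function(f) attr(terms(f), "intercept") == 1

has_intercept(f_slope)
has_intercept(f_noint)
has_intercept(f_intonly)

# Right-hand-side term labels (empty for the intercept-only model)
attr(terms(f_slope), "term.labels")
attr(terms(f_intonly), "term.labels")
```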

gvars

list of partitioning variables of interest. The values of these variables should not change over time. Categorical variables must be numerically coded. For nominal categorical variables or factors, first create the corresponding dummy variable(s) and then pass them through gvars.

tgvars

types (categorical or continuous) of the partitioning variables specified in gvars. Specify 1 for each continuous partitioning variable and 0 for each categorical partitioning variable. The length of tgvars should match the length of gvars.
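One possible way to prepare gvars and tgvars is sketched below in base R. The toy data frame, its column names, and the choice of model.matrix() for dummy coding are all assumptions made for illustration; the two checks simply mirror the requirements stated above (matching lengths, values constant over time within subject).

```r
# Toy long-format data: 3 subjects, 2 visits each (all values are made up)
toy <- data.frame(
  id     = rep(1:3, each = 2),
  time   = rep(c(0, 1), times = 3),
  age    = rep(c(34, 51, 47), each = 2),                        # continuous
  region = factor(rep(c("north", "south", "west"), each = 2))   # nominal
)

# Dummy-code the nominal factor; drop the intercept column of model.matrix()
dummies <- model.matrix(~ region, data = toy)[, -1, drop = FALSE]
toy <- cbind(toy, dummies)

gvars  <- c("age", colnames(dummies))
# 1 = continuous, 0 = categorical, in the same order as gvars
tgvars <- c(1, rep(0, ncol(dummies)))

# Check 1: one type code per partitioning variable
stopifnot(length(gvars) == length(tgvars))

# Check 2: every partitioning variable is constant within subject
is_baseline <- sapply(gvars, function(v) {
  all(tapply(toy[[v]], toy$id, function(x) length(unique(x)) == 1))
})
is_baseline
```

With this coding the first factor level ("north") acts as the reference category, so a factor with k levels contributes k - 1 dummy variables to gvars.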

minsplit

the minimum number of observations that must exist in a node in order for a split to be attempted.

minbucket

the minimum number of observations in any terminal node.

alpha

alpha (i.e., nominal type I error) level for the parameter instability test.

coef.digits

decimal points for displaying coefficients in the tree structure.

print.lme

if TRUE, a summary of the fitted model from lme() is printed for each node.

Details

Constructs a regression tree based on heterogeneity in linear mixed effects models of the following type: Y_i(t) = W_i(t) theta + b_i + epsilon_{it}, where W_i(t) is the design matrix, theta is the parameter vector associated with W_i(t), and b_i is the random intercept. Also, epsilon_{it} ~ N(0, sigma^2) and b_i ~ N(0, sigma_u^2).
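To make the model concrete, the sketch below simulates data from a random-intercept model of this form with an intercept and a time slope. All numeric values (number of subjects, visit times, theta, sigma_u, sigma) are hypothetical. The optional lme() call assumes the nlme package is installed and is shown only as one way to fit such a model; it is not a statement of how LongCART fits it internally.

```r
set.seed(1)

n_subj  <- 200           # hypothetical number of subjects
times   <- 0:4           # hypothetical visit times
theta   <- c(350, -10)   # hypothetical intercept and slope
sigma_u <- 30            # SD of the random intercept b_i
sigma   <- 15            # SD of the residual error epsilon_{it}

# One row per subject-visit; time varies fastest within subject
sim <- expand.grid(time = times, id = seq_len(n_subj))

W   <- cbind(1, sim$time)                       # design matrix W_i(t)
b   <- rnorm(n_subj, mean = 0, sd = sigma_u)    # one b_i per subject
eps <- rnorm(nrow(sim), mean = 0, sd = sigma)   # one error per observation

# Y_i(t) = W_i(t) theta + b_i + epsilon_{it}
sim$y <- as.vector(W %*% theta) + b[sim$id] + eps

# Optional: fit the same random-intercept model if nlme is available
if (requireNamespace("nlme", quietly = TRUE)) {
  fit <- nlme::lme(y ~ time, random = ~ 1 | id, data = sim)
  print(nlme::fixef(fit))
}
```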

Value

Treeout

contains summary information of the tree fit for each terminal and non-terminal node. Columns of Treeout include: ID, the (unique) node numbers that follow a binary ordering indexed by node depth; n, the number of observations reaching the node; yval, the fitted model of the response at the node; var, a factor giving the names of the variables used in the split at each node; index, the cut-off value of the splitting variable for binary partitioning; p (Instability), the p-value of the parameter instability test for the splitting variable; loglik, the log-likelihood of the node; improve, the improvement in deviance given by this split; and Terminal, an indicator (True or False) of a terminal node.
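The sketch below shows how a table shaped like Treeout might be explored. The data frame is a made-up stand-in with only a few of the columns, not output from LongCART. It also assumes the rpart-style numbering in which the children of node k are 2k and 2k + 1; that convention is an assumption here, inferred from the "binary ordering indexed by node depth" wording.

```r
# Hypothetical stand-in for the Treeout component (all values are made up)
Treeout <- data.frame(
  ID       = c(1, 2, 3, 6, 7),
  n        = c(2000, 900, 1100, 500, 600),
  var      = c("wtkg", NA, "karnof", NA, NA),
  index    = c(68.5, NA, 90, NA, NA),
  Terminal = c(FALSE, TRUE, FALSE, TRUE, TRUE)
)

# Under the assumed numbering, depth and parent follow from the node ID
Treeout$depth  <- floor(log2(Treeout$ID))
Treeout$parent <- ifelse(Treeout$ID == 1, NA, Treeout$ID %/% 2)

# Terminal nodes are the subgroups identified by the tree
subgroups <- Treeout[Treeout$Terminal, ]
subgroups

# The terminal nodes partition the observations reaching the root
stopifnot(sum(subgroups$n) == Treeout$n[Treeout$ID == 1])
```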

p

number of fixed parameters

AIC.tree

AIC of the tree-structured model

AIC.root

AIC at the root node (i.e., without tree structure)

improve.AIC

improvement in AIC due to tree structure (AIC.tree - AIC.root)

logLik.tree

log-likelihood of the tree-structured model

logLik.root

log-likelihood at the root node (i.e., without tree structure)

Deviance

2*(logLik.tree-logLik.root)

LRT.df

degrees of freedom for likelihood ratio test comparing tree-structured model with the model at root node.

LRT.p

p-value for likelihood ratio test comparing tree-structured model with the model at root node.
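The relationship between Deviance, LRT.df and LRT.p can be sketched with a standard chi-square likelihood ratio test. The log-likelihoods and degrees of freedom below are hypothetical inputs rather than values from a fitted tree, and the chi-square reference distribution is the usual textbook choice; the package may compute its degrees of freedom and p-value differently.

```r
# Hypothetical inputs (not taken from any fitted model)
logLik.root <- -5230.4   # log-likelihood without tree structure
logLik.tree <- -5195.1   # log-likelihood of the tree-structured model
LRT.df      <- 4         # degrees of freedom, taken as given here

# Deviance as defined above: 2 * (logLik.tree - logLik.root)
Deviance <- 2 * (logLik.tree - logLik.root)

# Upper-tail chi-square probability as the likelihood ratio test p-value
LRT.p <- pchisq(Deviance, df = LRT.df, lower.tail = FALSE)

Deviance
LRT.p
```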

nodelab

List of subgroups or terminal nodes with their description

varnam

List of splitting variables

data

the dataset originally supplied

patid

the patid variable originally supplied

fixed

the fixed part of the model originally supplied

frame

rpart compatible object

splits

rpart compatible object

cptable

rpart compatible object

functions

rpart compatible object

Author(s)

Madan Gopal Kundu madan_g.kundu@yahoo.com

References

Kundu, M. G., and Harezlak, J. (2019). Regression trees for longitudinal data with baseline covariates. Biostatistics & Epidemiology, 3(1):1-22.

See Also

plot, text, ProfilePlot, StabCat, StabCont, predict

Examples


#--- Get the data
data(ACTG175)

#-----------------------------------------------#
#   model: cd4~ time + subject(random)          #
#-----------------------------------------------#

#--- Run LongCART()  
gvars=c("gender", "wtkg", "hemo", "homo", "drugs",
        "karnof", "oprior", "z30", "zprior", "race",
        "str2", "symptom", "treat", "offtrt")
tgvars=c(0, 1, 0, 0, 0,
         1, 0, 0, 0, 0,
         0, 0, 0, 0)


out1<- LongCART(data=ACTG175, patid="pidnum", fixed=cd4~time,
                gvars=gvars, tgvars=tgvars, alpha=0.05,
                minsplit=100, minbucket=50, coef.digits=2)

#--- Plot tree
par(mfrow=c(1,1))
par(xpd = TRUE)
plot(out1, compress = TRUE)
text(out1, use.n = TRUE)

#--- Plot longitudinal profiles of subgroups
ProfilePlot(x=out1, timevar="time")

#-----------------------------------------------#
#   model: cd4~ time+ time^2 + subject(random)  #
#-----------------------------------------------#

ACTG175$time2<- ACTG175$time^2

out2<- LongCART(data=ACTG175, patid="pidnum", fixed=cd4~time + time2,
                gvars=gvars, tgvars=tgvars, alpha=0.05,
                minsplit=100, minbucket=50, coef.digits=2)


par(mfrow=c(1,1))
par(xpd = TRUE)
plot(out2, compress = TRUE)
text(out2, use.n = TRUE)

ProfilePlot(x=out2, timevar="time", timevar.power=c(1,2))


#--------------------------------------------------------#
#   model: cd4~ time+ time^2 + subject(random) + karnof  #
#--------------------------------------------------------#

out3<- LongCART(data=ACTG175, patid="pidnum", fixed=cd4~time + time2 + karnof,
                gvars=gvars, tgvars=tgvars, alpha=0.05,
                minsplit=100, minbucket=50, coef.digits=2)


par(mfrow=c(1,1))
par(xpd = TRUE)
plot(out3, compress = TRUE)
text(out3, use.n = TRUE)

#the value of the covariate karnof is set at median by default
ProfilePlot(x=out3, timevar="time", timevar.power=c(1,2, NA)) 

#the value of the covariate karnof is set at 120
ProfilePlot(x=out3, timevar="time", timevar.power=c(1,2, NA), 
                     covariate.val=c(NA, NA, 120)) 



LongCART documentation built on May 18, 2022, 1:06 a.m.
