submod_train: Subgroup Identification: Train Model


View source: R/submod_train.R

Description

Wrapper function to train a subgroup model (submod). Outputs subgroup assignments and fitted model.

Usage

submod_train(
  Y,
  A,
  X,
  Xtest = NULL,
  mu_train = NULL,
  family = "gaussian",
  submod,
  hyper = NULL,
  pool = "no",
  delta = ">0",
  ...
)

Arguments

Y

The outcome variable. Must be numeric or survival (e.g., Surv(time, cens)).

A

Treatment variable. (Default supports binary treatment, either numeric or factor). "ple_train" accommodates >2 treatments as well as binary treatments.

X

Covariate space.

Xtest

Test set. Default is NULL which uses X (training set). Variable types should match X.

mu_train

Patient-level estimates (PLEs) in the training set (see ple_train). Default is NULL.

family

Outcome type. Options include "gaussian" (default), "binomial", and "survival".

submod

Subgroup identification model function. Maps the observed data and/or PLEs to subgroups. Default for family="gaussian" is "lmtree" (MOB with OLS loss). For "binomial" the default is "glmtree" (MOB with binomial loss). Default for "survival" is "mob_weib" (MOB with Weibull loss). "None" uses no submod. Currently only available for binary treatments or A=NULL.

hyper

Hyper-parameters for submod (must be list). Default is NULL.

pool

Whether to pool discovered subgroups. Default is "no" (no pooling). Other options include "otr:logistic", an optimal treatment regime approach in which a weighted logistic regression is fit with I(mu_1-mu_0>delta) as the outcome, the candidate subgroups as covariates, and weights=abs(PLE). The Youden index is then used to assign optimal treatments across the discovered subgroups.

delta

Threshold for defining benefitting vs. non-benefitting patients. Only applicable for submod="otr" and for pool="otr:logistic" or "otr:rf". Default is ">0".

...

Any additional parameters, not currently passed through.
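The pooling step described under pool can be illustrated with a minimal base-R sketch. This is not the package's internal code: the simulated subgroups and PLEs, the variable names (subgrp, ple, best_cut, pooled), the "benefit"/"no_benefit" labels, the use of quasibinomial() to allow non-integer weights, and the unweighted Youden search are all assumptions made for illustration.

```r
set.seed(123)
n <- 200

# Hypothetical inputs: candidate subgroups (e.g., tree nodes) and PLEs (mu_1 - mu_0)
subgrp <- factor(sample(c("3", "4", "5"), n, replace = TRUE))
grp_means <- c("3" = -0.5, "4" = 0.2, "5" = 1.0)
ple <- rnorm(n, mean = grp_means[as.character(subgrp)], sd = 1)
delta <- 0  # numeric counterpart of delta = ">0"

dat <- data.frame(
  benefit = as.integer(ple > delta),  # I(mu_1 - mu_0 > delta)
  subgrp  = subgrp,                   # candidate subgroups as covariates
  w       = abs(ple)                  # weights = abs(PLE)
)

# Weighted logistic regression (quasibinomial tolerates non-integer weights)
fit <- glm(benefit ~ subgrp, family = quasibinomial(), weights = w, data = dat)
p_hat <- predict(fit, type = "response")

# Youden index J = sensitivity + specificity - 1 at each candidate cut-off
cuts <- sort(unique(p_hat))
youden <- sapply(cuts, function(cc) {
  pred <- p_hat >= cc
  sens <- mean(pred[dat$benefit == 1])
  spec <- mean(!pred[dat$benefit == 0])
  sens + spec - 1
})
best_cut <- cuts[which.max(youden)]

# Pool the candidate subgroups into two treatment-assignment groups
pooled <- ifelse(p_hat >= best_cut, "benefit", "no_benefit")
table(dat$subgrp, pooled)
```

Because the only covariate is the subgroup factor, every patient in a candidate subgroup receives the same fitted probability, so each subgroup falls entirely into one pooled group.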

Details

submod_train currently fits a number of tree-based subgroup models, most of which aim to find subgroups with varying treatment effects (i.e. predictive variables). Current options include:

1. lmtree: Wrapper function for the function "lmtree" from the partykit package. Here, model-based partitioning (MOB) with an OLS loss function, Y~MOB_LM(A,X), is used to identify prognostic and/or predictive variables.

Default hyper-parameters are: hyper = list(alpha=0.05, maxdepth=4, parm=NULL, minsize=floor(dim(X)[1]*0.10)).

2. glmtree: Wrapper function for the function "glmtree" from the partykit package. Here, model-based partitioning (MOB) with GLM binomial + identity link loss function, (Y~MOB_GLM(A,X)), is used to identify prognostic and/or predictive variables.

Default hyper-parameters are: hyper = list(link="identity", alpha=0.05, maxdepth=4, parm=NULL, minsize=floor(dim(X)[1]*0.10)).

3. ctree: Wrapper function for the function "ctree" from the partykit package. Here, conditional inference trees are used to identify either prognostic, Y~CTREE(X), or predictive variables, PLE~CTREE(X) (outcome_PLE=TRUE; requires mu_train data).

Default hyper-parameters are: hyper=list(alpha=0.10, minbucket = floor(dim(X)[1]*0.10), maxdepth = 4, outcome_PLE=FALSE).

4. otr: Optimal treatment regime approach using "ctree". Based on patient-level treatment effect estimates, fit PLE~CTREE(X) with weights=abs(PLE).

Default hyper-parameters are: hyper=list(alpha=0.10, minbucket = floor(dim(X)[1]*0.10), maxdepth = 4, thres=">0").

5. mob_weib: Wrapper function for the function "mob" with Weibull loss function using the partykit package. Here, model-based partitioning (MOB) with Weibull loss (survival), (Y~MOB_WEIB(A,X)), is used to identify prognostic and/or predictive variables.

Default hyper-parameters are: hyper = list(alpha=0.10, maxdepth=4, parm=NULL, minsize=floor(dim(X)[1]*0.10)).

6. rpart: Recursive partitioning through the "rpart" R package. Here, recursive partitioning and regression trees are used to identify either prognostic, Y~rpart(X), or predictive variables, PLE~rpart(X) (outcome_PLE=TRUE; requires mu_train data).
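The default hyper-parameter lists above depend on the number of rows in X through minsize (or minbucket). As a rough sketch of how such a list could be built and then overridden, the base-R snippet below reproduces the documented lmtree defaults for a hypothetical 250-row X and changes maxdepth with modifyList(). Whether submod_train itself merges a partial hyper list with its defaults is not stated here, so passing the complete list is the cautious choice; the names hyper_default and hyper_custom are illustrative only.

```r
set.seed(1)

# Hypothetical covariate data with 250 rows
X <- data.frame(X1 = rnorm(250), X2 = rnorm(250))

# Documented lmtree defaults: minimum node size is 10% of the training rows
hyper_default <- list(
  alpha    = 0.05,
  maxdepth = 4,
  parm     = NULL,
  minsize  = floor(dim(X)[1] * 0.10)
)

# Override a single setting while keeping the remaining defaults
hyper_custom <- modifyList(hyper_default, list(maxdepth = 2))

hyper_custom$minsize   # floor(250 * 0.10)
hyper_custom$maxdepth  # overridden value
```

The resulting list could then be supplied as hyper = hyper_custom in a call such as the one shown in the Examples section.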

Value

Trained subgroup model and subgroup predictions/estimates for train/test sets.

References

See Also

PRISM

Examples

library(StratifiedMedicine)
## Continuous ##
dat_ctns = generate_subgrp_data(family="gaussian")
Y = dat_ctns$Y
X = dat_ctns$X
A = dat_ctns$A

# Fit through submod_train wrapper #
mod1 = submod_train(Y=Y, A=A, X=X, Xtest=X, submod="submod_lmtree")
table(mod1$Subgrps.train)
plot(mod1$fit$mod)

StratifiedMedicine documentation built on Jan. 7, 2021, 9:07 a.m.