Description Usage Arguments Value References See Also Examples
Fits subgroup identification model class of Chen, et al (2017)
1 2 3 4 5 6 7 8 9 10 11 12 13  fit.subgroup(x, y, trt, propensity.func = NULL,
loss = c("sq_loss_lasso", "logistic_loss_lasso", "poisson_loss_lasso",
"cox_loss_lasso", "owl_logistic_loss_lasso",
"owl_logistic_flip_loss_lasso", "owl_hinge_loss", "owl_hinge_flip_loss",
"sq_loss_lasso_gam", "poisson_loss_lasso_gam", "logistic_loss_lasso_gam",
"sq_loss_gam", "poisson_loss_gam", "logistic_loss_gam",
"owl_logistic_loss_gam", "owl_logistic_flip_loss_gam",
"owl_logistic_loss_lasso_gam", "owl_logistic_flip_loss_lasso_gam",
"sq_loss_gbm", "poisson_loss_gbm", "logistic_loss_gbm", "cox_loss_gbm",
"custom"), method = c("weighting", "a_learning"), match.id = NULL,
augment.func = NULL, fit.custom.loss = NULL, cutpoint = 0,
larger.outcome.better = TRUE, reference.trt = NULL, retcall = TRUE,
...)

x 
The design matrix (not including intercept term) 
y 
The response vector 
trt 
treatment vector with each element equal to a 0 or a 1, with 1 indicating treatment status is active. 
propensity.func 
function that inputs the design matrix x and the treatment vector trt and outputs
the propensity score, ie Pr(trt = 1  X = x). Function should take two arguments 1) x and 2) trt. See example below.
For a randomized controlled trial this can simply be a function that returns a constant equal to the proportion
of patients assigned to the treatment group, i.e.:

loss 
choice of both the M function from Chen, et al (2017) and potentially the penalty used for variable selection.
All

method 
subgroup ID model type. Either the weighting or Alearning method of Chen et al, (2017) 
match.id 
a (character, factor, or integer) vector with length equal to the number of observations in 
augment.func 
function which inputs the response Example 1: Example 2: augment.func < function(x, y, trt) { data < data.frame(x, y, trt) lmod < lm(y ~ x * trt) ## get predictions when trt = 1 data$trt < 1 preds_1 < predict(lmod, data) ## get predictions when trt = 1 data$trt < 1 preds_n1 < predict(lmod, data) ## return predictions averaged over trt return(0.5 * (preds_1 + preds_n1)) } For binary and timetoevent outcomes, make sure that predictions are returned on the scale of the predictors Example 3: augment.func < function(x, y) { bmod < glm(y ~ x, family = binomial()) return(predict(bmod, type = "link")) } 
fit.custom.loss 
A function which minimizes a userspecified
custom loss function M(y,v) to be used in model fitting.
If provided, The loss function M(y, v) to be minimized MUST meet the following two criteria:
An example of a valid loss function is M(y, v) = (y  v)^2. In this case D_M(y, v) = 2(y  v). See Chen et al. (2017) for more details on the restrictions on the loss function M(y, v). The provided function MUST return a list with the following elements:
The provided function MUST be a function with the following arguments:
The provided function can also optionally take the following arguments:
Example 1: Here we minimize M(y, v) = (y  v)^2 fit.custom.loss < function(x, y, weights, ...) { df < data.frame(y = y, x) # minimize squared error loss with NO lasso penalty lmf < lm(y ~ x  1, weights = weights, data = df, ...) # save coefficients cfs = coef(lmf) # create prediction function. Notice # how a column of 1's is appended # to ensure treatment main effects are included # in predictions prd = function(x, type = "response") { dfte < cbind(1, x) colnames(dfte) < names(cfs) predict(lmf, data.frame(dfte)) } # return lost of required components list(predict = prd, model = lmf, coefficients = cfs) } Example 2: M(y, v) = y\exp(v) fit.expo.loss < function(x, y, weights, ...) { ## define loss function to be minimized expo.loss < function(beta, x, y, weights) { sum(weights * y * exp(drop(tcrossprod(x, t(beta) ))) } # use optim() to minimize loss function opt < optim(rep(0, NCOL(x)), fn = expo.loss, x = x, y = y, weights = weights) coefs < opt$par pred < function(x, type = "response") { tcrossprod(cbind(1, x), t(coefs)) } # return list of required components list(predict = pred, model = opt, coefficients = coefs) } 
cutpoint 
numeric value for patients with benefit scores above which
(or below which if 
larger.outcome.better 
boolean value of whether a larger outcome is better/preferable. Set to 
reference.trt 
which treatment should be treated as the reference treatment. Defaults to the first level of 
retcall 
boolean value. if 
... 
options to be passed to underlying fitting function. For all For all 
An object of class "subgroup_fitted"
.
predict 
A function that returns predictions of the covariateconditional treatment effects 
model 
An object returned by the underlying fitting function used. For example, if the lasso use used to fit
the underlying subgroup identification model, this will be an object returned by 
coefficients 
If the underlying subgroup identification model is parametric, 
call 
The call that produced the returned object. If 
family 
The family corresponding to the outcome provided 
loss 
The loss function used 
method 
The method used (either weighting or Alearning) 
propensity.func 
The propensity score function used 
larger.outcome.better 
If larger outcomes are preferred for this model 
cutpoint 
Benefit score cutoff value used for determining subgroups 
var.names 
The names of all variables used 
n.trts 
The number of treatment levels 
comparison.trts 
All treatment levels other than the reference level 
reference.trt 
The reference level for the treatment. This should usually be the control group/level 
trts 
All treatment levels 
trt.received 
The vector of treatment assignments 
pi.x 
A vector of propensity scores 
y 
A vector of outcomes 
benefit.scores 
A vector of conditional treatment effects, i.e. benefit scores 
recommended.trts 
A vector of treatment recommendations (i.e. for each patient, which treatment results in the best expected potential outcomes) 
subgroup.trt.effects 
(Biased) estimates of the conditional treatment effects and conditional outcomes. These are essentially just empirical averages within different combinations of treatment assignments and treatment recommendations 
individual.trt.effects 
estimates of the individual treatment effects as returned by

Chen, S., Tian, L., Cai, T. and Yu, M. (2017), A general statistical framework for subgroup identification and comparative treatment scoring. Biometrics. doi:10.1111/biom.12676 http://onlinelibrary.wiley.com/doi/10.1111/biom.12676/abstract
Xu, Y., Yu, M., Zhao, Y. Q., Li, Q., Wang, S., & Shao, J. (2015), Regularized outcome weighted subgroup identification for differential treatment effects. Biometrics, 71(3), 645653. doi: 10.1111/biom.12322 http://onlinelibrary.wiley.com/doi/10.1111/biom.12322/full
Zhao, Y., Zeng, D., Rush, A. J., & Kosorok, M. R. (2012), Estimating individualized treatment rules using outcome weighted learning. Journal of the American Statistical Association, 107(499), 11061118. doi: 10.1080/01621459.2012.695674 http://dx.doi.org/10.1080/01621459.2012.695674
validate.subgroup
for function which creates validation results for subgroup
identification models, predict.subgroup_fitted
for a prediction function for fitted models
from fit.subgroup
, plot.subgroup_fitted
for a function which plots
results from fitted models, and print.subgroup_fitted
for arguments for printing options for fit.subgroup()
.
from fit.subgroup
.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220  library(personalized)
set.seed(123)
n.obs < 500
n.vars < 15
x < matrix(rnorm(n.obs * n.vars, sd = 3), n.obs, n.vars)
# simulate nonrandomized treatment
xbetat < 0.5 + 0.5 * x[,7]  0.5 * x[,9]
trt.prob < exp(xbetat) / (1 + exp(xbetat))
trt01 < rbinom(n.obs, 1, prob = trt.prob)
trt < 2 * trt01  1
# simulate response
# delta below drives treatment effect heterogeneity
delta < 2 * (0.5 + x[,2]  x[,3]  x[,11] + x[,1] * x[,12] )
xbeta < x[,1] + x[,11]  2 * x[,12]^2 + x[,13] + 0.5 * x[,15] ^ 2
xbeta < xbeta + delta * trt
# continuous outcomes
y < drop(xbeta) + rnorm(n.obs, sd = 2)
# binary outcomes
y.binary < 1 * (xbeta + rnorm(n.obs, sd = 2) > 0 )
# count outcomes
y.count < round(abs(xbeta + rnorm(n.obs, sd = 2)))
# timetoevent outcomes
surv.time < exp(20  xbeta + rnorm(n.obs, sd = 1))
cens.time < exp(rnorm(n.obs, sd = 3))
y.time.to.event < pmin(surv.time, cens.time)
status < 1 * (surv.time <= cens.time)
# create function for fitting propensity score model
prop.func < function(x, trt)
{
# fit propensity score model
propens.model < cv.glmnet(y = trt,
x = x, family = "binomial")
pi.x < predict(propens.model, s = "lambda.min",
newx = x, type = "response")[,1]
pi.x
}
#################### Continuous outcomes ################################
subgrp.model < fit.subgroup(x = x, y = y,
trt = trt01,
propensity.func = prop.func,
loss = "sq_loss_lasso",
nfolds = 10) # option for cv.glmnet
summary(subgrp.model)
# estimates of the individualspecific
# treatment effect estimates:
subgrp.model$individual.trt.effects
# fit lasso + gam model with REML option for gam
subgrp.modelg < fit.subgroup(x = x, y = y,
trt = trt01,
propensity.func = prop.func,
loss = "sq_loss_lasso_gam",
method.gam = "REML", # option for gam
nfolds = 5) # option for cv.glmnet
subgrp.modelg
#################### Using an augmentation function #####################
## augmentation funcions involve modeling the conditional mean E[YT, X]
## and returning predictions that are averaged over the treatment values
## return < 1/2 * (hat{E}[YT=1, X] + hat{E}[YT=1, X])
##########################################################################
augment.func < function(x, y, trt) {
data < data.frame(x, y, trt)
xm < model.matrix(y~trt*x1, data = data)
lmod < cv.glmnet(y = y, x = xm)
## get predictions when trt = 1
data$trt < 1
xm < model.matrix(y~trt*x1, data = data)
preds_1 < predict(lmod, xm, s = "lambda.min")
## get predictions when trt = 1
data$trt < 1
xm < model.matrix(y~trt*x1, data = data)
preds_n1 < predict(lmod, xm, s = "lambda.min")
## return predictions averaged over trt
return(0.5 * (preds_1 + preds_n1))
}
subgrp.model.aug < fit.subgroup(x = x, y = y,
trt = trt01,
propensity.func = prop.func,
augment.func = augment.func,
loss = "sq_loss_lasso",
nfolds = 10) # option for cv.glmnet
summary(subgrp.model.aug)
#################### Binary outcomes ####################################
# use logistic loss for binary outcomes
subgrp.model.bin < fit.subgroup(x = x, y = y.binary,
trt = trt01,
propensity.func = prop.func,
loss = "logistic_loss_lasso",
type.measure = "auc", # option for cv.glmnet
nfolds = 5) # option for cv.glmnet
subgrp.model.bin
#################### Count outcomes #####################################
# use poisson loss for count/poisson outcomes
subgrp.model.poisson < fit.subgroup(x = x, y = y.count,
trt = trt01,
propensity.func = prop.func,
loss = "poisson_loss_lasso",
type.measure = "mse", # option for cv.glmnet
nfolds = 5) # option for cv.glmnet
subgrp.model.poisson
#################### Timetoevent outcomes #############################
library(survival)
subgrp.model.cox < fit.subgroup(x = x, y = Surv(y.time.to.event, status),
trt = trt01,
propensity.func = prop.func,
loss = "cox_loss_lasso",
nfolds = 5) # option for cv.glmnet
subgrp.model.cox
#################### Using custom loss functions ########################
## Use custom loss function for binary outcomes
fit.custom.loss.bin < function(x, y, weights, offset, ...) {
df < data.frame(y = y, x)
# minimize logistic loss with NO lasso penalty
# with allowance for efficiency augmentation
glmf < glm(y ~ x  1, weights = weights,
offset = offset, # offset term allows for efficiency augmentation
family = binomial(), ...)
# save coefficients
cfs = coef(glmf)
# create prediction function.
prd = function(x, type = "response") {
dfte < cbind(1, x)
colnames(dfte) < names(cfs)
## predictions must be returned on the scale
## of the linear predictor
predict(glmf, data.frame(dfte), type = "link")
}
# return lost of required components
list(predict = prd, model = glmf, coefficients = cfs)
}
subgrp.model.bin.cust < fit.subgroup(x = x, y = y.binary,
trt = trt01,
propensity.func = prop.func,
fit.custom.loss = fit.custom.loss.bin)
subgrp.model.bin.cust
## try exponential loss for
## positive outcomes
fit.expo.loss < function(x, y, weights, ...)
{
expo.loss < function(beta, x, y, weights) {
sum(weights * y * exp(drop(x %*% beta)))
}
# use optim() to minimize loss function
opt < optim(rep(0, NCOL(x)), fn = expo.loss, x = x, y = y, weights = weights)
coefs < opt$par
pred < function(x, type = "response") {
tcrossprod(cbind(1, x), t(coefs))
}
# return list of required components
list(predict = pred, model = opt, coefficients = coefs)
}
# use exponential loss for positive outcomes
subgrp.model.expo < fit.subgroup(x = x, y = y.count,
trt = trt01,
propensity.func = prop.func,
fit.custom.loss = fit.expo.loss)
subgrp.model.expo

Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.