owl: Integrated Outcome-weighted Learning for Estimating Optimal...
In DTRlearn2: Statistical Learning Methods for Optimizing Dynamic Treatment Regimes



This function implements a variety of outcome-weighted learning methods for estimating general K-stage dynamic treatment regimes (DTRs). One of several loss functions (SVM hinge loss, SVM ramp loss, binomial deviance loss, or L2 loss) can be adopted to solve the weighted classification problem at each stage. Outcomes can optionally be augmented to improve efficiency, especially when there are multiple stages and the sample size is small. Cross validation is used to choose the best tuning parameters when the chosen loss has any.
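
For orientation, the weighted classification problem can be illustrated with the single-stage outcome-weighted learning objective of Zhao et al. (2012), listed in the references. This is a sketch of the general form under the hinge loss with treatments coded as -1 and 1, and not necessarily the exact parametrization used inside `owl`. The symbols f, φ, and λ_n are notation introduced here for illustration; λ_n acts roughly as the inverse of the regularization parameter C.

```latex
\hat{f} = \arg\min_{f} \; \frac{1}{n} \sum_{i=1}^{n} \frac{R_i}{\pi_i}\,
          \phi\bigl(A_i f(H_i)\bigr) + \lambda_n \lVert f \rVert^2,
\qquad \phi(u) = \max(0,\, 1 - u),
\qquad \hat{d}(H) = \operatorname{sign}\bigl(\hat{f}(H)\bigr)
```

Other losses replace φ, for example the binomial deviance φ(u) = log(1 + exp(-u)) or the squared loss φ(u) = (1 - u)^2.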

owl(H, AA, RR, n, K, pi='estimated', res.lasso=TRUE, loss='hinge', kernel='linear', augment=FALSE, c=2^(-2:2), sigma=c(0.03,0.05,0.07), s=2.^(-2:2), m=4)

`H`	subject history information before treatment for the `K` stages. It can be a vector or a matrix when only baseline information is used in estimating the DTR; otherwise, it would be a list of length `K`. Please standardize all the variables in `H` to have mean 0 and standard deviation 1 before using `H` as the input. See details for how to construct H.
`AA`	observed treatment assignments for all subjects at the `K` stages. It is a vector if `K=1`, or a list of `K` vectors corresponding to the `K` stages.
`RR`	observed reward outcomes for all subjects at the `K` stages. It is a vector if `K=1`, or a list of `K` vectors corresponding to the `K` stages.
`n`	sample size, number of subjects in the dataset
`K`	number of stages
`pi`	treatment assignment probabilities of the observed treatments for all subjects at the `K` stages. It is a vector if `K=1`, or a list of `K` vectors corresponding to the `K` stages. It can be supplied by the user when the treatment assignment probabilities are known. The default is `pi="estimated"`, in which case the probabilities are estimated by lasso-penalized logistic regressions with H_k as the predictors at each stage k.
`res.lasso`	whether or not to use lasso penalty in the regression to take residuals for constructing the weights. The default is `res.lasso=TRUE`.
`loss`	loss function for solving the weighted classification problem at each stage. The options are `"hinge", "ramp", "logit", "logit.lasso", "l2", "l2.lasso"`. `"hinge"` and `"ramp"` are for the SVM hinge loss and SVM ramp loss. `"logit"` and `"logit.lasso"` are for the binomial deviance loss used in logistic regression, where a lasso penalty is applied under `"logit.lasso"`. `"l2"` and `"l2.lasso"` are for the L2 or square loss, where a lasso penalty is applied under `"l2.lasso"`. The default is `loss="hinge"`.
`kernel`	kernel function to use under SVM hinge loss or SVM ramp loss. `"linear"` and `"rbf"` kernel are implemented under SVM hinge loss; `"linear"` kernel is implemented under SVM ramp loss. The default is `kernel="linear"`.
`augment`	whether or not to use augmented outcomes at each stage. Augmentation is recommended when there are multiple stages and the sample size is small. The default is `augment=FALSE`.
`c`	a vector specifying the values of the regularization parameter C for tuning under SVM hinge loss or SVM ramp loss. The default is `c=2^(-2:2)`. In practice, a wider range of `c` can be specified based on the data.
`sigma`	a vector specifying the values of the positive parameter sigma in the RBF kernel for tuning under SVM hinge loss, i.e., when `loss="hinge"` and `kernel="rbf"`. The default is `sigma=c(0.03,0.05,0.07)`. In practice, a wider range of `sigma` can be specified based on the data.
`s`	a vector specifying the values of the slope parameter in the SVM ramp loss for tuning, i.e., when `loss="ramp"` and `kernel="linear"`. The default is `s=2^(-2:2)`. In practice, a wider range of `s` can be specified based on the data.
`m`	number of folds in the m-fold cross validation for choosing the tuning parameters `c`, `sigma` or `s`. It is also used for choosing the tuning parameter of the lasso penalty when `res.lasso=TRUE`, `loss="logit.lasso"`, or `loss="l2.lasso"` is specified. The default is `m=4`.
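
When the assignment probabilities are known, `pi` holds, for each subject, the probability of the treatment that subject actually received. A minimal base-R sketch for a hypothetical two-stage trial follows; the treatment vectors and the randomization probabilities `p1` and `p2` are made up for illustration, and treatments are assumed to be coded as -1 and 1.

```r
# Hypothetical observed treatments for n = 8 subjects at two stages (coded -1/1)
n  <- 8
A1 <- rep(c(1, -1), length.out = n)
A2 <- rep(c(1, 1, -1, -1), length.out = n)

# Assumed known randomization probabilities P(A_k = 1) at each stage
p1 <- 0.5
p2 <- 0.7

# Probability of the treatment each subject actually received
pi1 <- ifelse(A1 == 1, p1, 1 - p1)
pi2 <- ifelse(A2 == 1, p2, 1 - p2)

# List of K = 2 vectors, in the form described for the `pi` argument
pi_known <- list(pi1, pi2)
```

With equal randomization at both stages this reduces to `list(rep(0.5, n), rep(0.5, n))`.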

A patient's history information prior to the treatment at stage k can be constructed recursively as H_k = (H_{k-1}, A_{k-1}, R_{k-1}, X_k) with H_1 = X_1, where X_k denotes the subject-specific variables collected at stage k just prior to the treatment, A_k is the treatment at stage k, and R_k is the outcome observed after the treatment at stage k. Higher-order or interaction terms can also be incorporated in H_k, e.g., H_k = (H_{k-1}, A_{k-1}, R_{k-1}, X_k, H_{k-1}A_{k-1}, R_{k-1}A_{k-1}, X_kA_{k-1}).
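
The recursive construction can be sketched in base R as follows. All data here are hypothetical toy values, and the variable names (`X1`, `A1`, `R1`, `X2`) are chosen only to mirror the notation above.

```r
# Hypothetical toy data: n subjects, two stages
set.seed(1)
n  <- 6
X1 <- matrix(rnorm(n * 2), n, 2, dimnames = list(NULL, c("x1", "x2")))
A1 <- rep(c(-1, 1), length.out = n)   # stage-1 treatment, coded -1/1
R1 <- rnorm(n)                        # stage-1 outcome
X2 <- matrix(rnorm(n), n, 1, dimnames = list(NULL, "x3"))

# H_1 = X_1, standardized to mean 0 and standard deviation 1
H1 <- scale(X1)

# H_2 = (H_1, A_1, R_1, X_2, H_1 * A_1, R_1 * A_1, X_2 * A_1), then standardized.
# Multiplying a matrix by a length-n vector scales row i by A1[i].
H2_raw <- cbind(H1, A1 = A1, R1 = R1, X2,
                H1 * A1, R1A1 = R1 * A1, X2 * A1)
H2 <- scale(H2_raw)

# List of length K = 2, in the form described for the `H` argument
H <- list(H1, H2)
```

Here `H2` has nine columns: two from H_1, one each for A_1, R_1 and X_2, and four interaction terms.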

A list of results is returned as an object. It contains the following attributes:

`stage1`	a list of stage 1 results, ...
`stageK`	a list of stage K results
`valuefun`	overall empirical value function under the estimated DTR
`benefit`	overall empirical benefit function under the estimated DTR
`pi`	treatment assignment probabilities of the observed treatments for each subject at the K stages. It is a list of K vectors. If `pi='estimated'` is specified as input, the estimated treatment assignment probabilities from lasso-penalized logistic regressions will be returned.
`type`	object type corresponding to the specified `loss` and `kernel`

In each stage's result, a list is returned which consists of

`beta0`	estimated coefficient of the intercept in the decision function
`beta`	estimated coefficients of H_k in the decision function. It's not returned with RBF kernel under SVM hinge loss.
`fit`	fitted decision function for each subject
`probability`	estimated probability that treatment 1 (vs. -1) is the optimal treatment for each subject in the sample. It's calculated by exp(fit)/(1 + exp(fit)).
`treatment`	the estimated optimal treatment for each subject
`c`	the best regularization parameter C in SVM hinge loss or SVM ramp loss, chosen from the values specified in `c` via cross validation
`sigma`	the best parameter σ in the RBF kernel, chosen from the values specified in `sigma` via cross validation
`s`	the best slope parameter s in the ramp loss, chosen from the values specified in `s` via cross validation.
`iter`	number of iterations under SVM ramp loss
`alpha1`	the solution to the Lagrangian dual problem under SVM hinge loss or SVM ramp loss. It is used for constructing the decision function on the new sample.
`H`	the input H, returned only under SVM hinge loss with RBF kernel. It is used for constructing the RBF kernel on the new sample.
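
The relation between `fit` and `probability` stated above is the logistic transform, which base R also provides as `plogis()`. A small sketch with hypothetical decision-function values follows; the rule that assigns treatment 1 when `fit` is positive is an assumption made for illustration, not something confirmed from the package source.

```r
# Hypothetical decision-function values for three subjects
fit <- c(-2, 0, 1.5)

# Formula stated for `probability`
prob <- exp(fit) / (1 + exp(fit))

# plogis() computes the same logistic transform
same <- isTRUE(all.equal(prob, plogis(fit)))

# Assumption: a positive decision value recommends treatment 1, otherwise -1
treatment <- ifelse(fit > 0, 1, -1)
```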

Yuan Chen, Ying Liu, Donglin Zeng, Yuanjia Wang

Maintainer: Yuan Chen <yc3281@columbia.edu>, <irene.yuan.chen@gmail.com>

Liu, Y., Wang, Y., Kosorok, M., Zhao, Y., & Zeng, D. (2014). Robust hybrid learning for estimating personalized dynamic treatment regimens. arXiv preprint. arXiv, 1611.

Liu, Y., Wang, Y., Kosorok, M., Zhao, Y., & Zeng, D. (2018). Augmented Outcome-weighted Learning for Estimating Optimal Dynamic Treatment Regimens. Statistics in Medicine. In press.

Zhao, Y., Zeng, D., Rush, A. J., & Kosorok, M. R. (2012). Estimating individualized treatment rules using outcome weighted learning. Journal of the American Statistical Association, 107(499), 1106-1118.

Zhao, Y. Q., Zeng, D., Laber, E. B., & Kosorok, M. R. (2015). New statistical learning methods for estimating optimal dynamic treatment regimes. Journal of the American Statistical Association, 110(510), 583-598.

predict.owl, sim_Kstage, ql

library(DTRlearn2)

# simulate 2-stage training and test sets
n_train = 100
n_test = 500
n_cluster = 10
pinfo = 10
pnoise = 20

train = sim_Kstage(n_train, n_cluster, pinfo, pnoise, K=2)
H1_train = scale(train$X)
H2_train = scale(cbind(H1_train, train$A[[1]], H1_train * train$A[[1]]))
pi_train = list(rep(0.5, n_train), rep(0.5, n_train))

test = sim_Kstage(n_test, n_cluster, pinfo, pnoise, train$centroids, K=2)
H1_test = scale(test$X)
H2_test = scale(cbind(H1_test, test$A[[1]], H1_test * test$A[[1]]))
pi_test = list(rep(0.5, n_test), rep(0.5, n_test))

# estimate DTR with owl on the training sample
owl_train = owl(H=list(H1_train, H2_train), AA=train$A, RR=train$R, n=n_train, K=2, pi=pi_train,
    loss='hinge', augment=TRUE, m=3)
owl_train$stage1$beta
owl_train$stage1$treatment
owl_train$valuefun

# apply the estimated DTR to the test sample
owl_test = predict(owl_train, H=list(H1_test, H2_test), AA=test$A, RR=test$R, K=2, pi=pi_test)
owl_test$treatment
owl_test$valuefun