sim_Kstage: Simulate a K-stage Sequential Multiple Assignment Randomized...



Description

This function simulates K-stage SMART data with (pinfo + pnoise) baseline variables drawn from a multivariate Gaussian distribution. The pinfo variables have variance 1 and pairwise correlation 0.2; the pnoise variables have mean 0 and are uncorrelated with each other and with the pinfo variables.
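The covariance structure described above can be sketched in base R as follows. This is a minimal illustration and not the package's source: the names `sim_baseline`, `mu`, and `rho` are hypothetical, the noise variables are given unit variance as an assumption (the description states only their mean and lack of correlation), and a single mean vector is used for all rows rather than one centroid per latent group.

```r
# Hypothetical helper, not part of DTRlearn2.
sim_baseline <- function(n, pinfo, pnoise, mu = rep(0, pinfo), rho = 0.2) {
  # Covariance of the informative block: variance 1, pairwise correlation rho.
  Sigma <- matrix(rho, nrow = pinfo, ncol = pinfo)
  diag(Sigma) <- 1

  # Z %*% chol(Sigma) has covariance Sigma when Z has iid N(0, 1) entries.
  Z <- matrix(rnorm(n * pinfo), nrow = n, ncol = pinfo)
  X_info <- sweep(Z %*% chol(Sigma), 2, mu, "+")

  # Noise block: mean 0, independent of everything else (unit variance assumed).
  X_noise <- matrix(rnorm(n * pnoise), nrow = n, ncol = pnoise)

  cbind(X_info, X_noise)
}

set.seed(1)
X <- sim_baseline(n = 5000, pinfo = 3, pnoise = 2)
dim(X)
round(cor(X), 1)
```

With a large n, the empirical correlations among the first pinfo columns should sit near 0.2, and those involving the noise columns near 0.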

Subjects come from n_cluster latent groups of equal size, and the groups are distinguished by their different means in the pinfo feature variables. Each latent group has its own optimal treatment sequence: the optimal treatment for subjects in group g at stage k is generated as A^* = 2( [ g/(2k -1) ] mod 2) - 1. The assigned treatment (1 or -1) for each subject at each stage is generated at random with equal probability. The primary outcome is observed only at the end of the trial and is generated as R = ∑_{k=1}^{K} A_k A_k^* + N(0,1).
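The treatment and outcome rules above can be sketched in a few lines of base R. This is an illustration under stated assumptions, not the package's implementation: the bracket [ . ] in the formula is read as the floor function, group labels are taken to run from 1 to n_cluster, and the names `opt_treatment`, `g`, `agreement`, and `R_final` are hypothetical.

```r
# Illustrative sketch only; reads [x] in the formula as floor(x) (an assumption).
opt_treatment <- function(g, k) {
  2 * (floor(g / (2 * k - 1)) %% 2) - 1
}

set.seed(1)
n <- 12
n_cluster <- 4
K <- 2

# Equal-sized latent groups, labelled 1..n_cluster (labelling is an assumption).
g <- rep(seq_len(n_cluster), each = n / n_cluster)

# Optimal and randomly assigned treatments (each +1 or -1) at every stage.
optA <- lapply(seq_len(K), function(k) opt_treatment(g, k))
A <- lapply(seq_len(K), function(k) sample(c(-1, 1), n, replace = TRUE))

# Final outcome: agreements minus disagreements across stages, plus N(0, 1) noise.
agreement <- Reduce(`+`, Map(`*`, A, optA))
R_final <- agreement + rnorm(n)
```

Each product A_k A_k^* is +1 when the assigned treatment matches the optimal one and -1 otherwise, so the noiseless part of the outcome ranges from -K to K.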

Usage

sim_Kstage(n, n_cluster, pinfo, pnoise, centroids = NULL, K)

Arguments

n

sample size; it should be a multiple of n_cluster.

n_cluster

number of latent groups

pinfo

number of informative baseline variables

pnoise

number of non-informative baseline variables

centroids

centroids of the pinfo variables for the n_cluster groups. It is a matrix of dimension n_cluster by pinfo, and its rows are used as the means of the multivariate Gaussians that generate the pinfo variables for the n_cluster groups. For a training set, do not assign centroids; the function generates them randomly from N(0,5). For a test set, assign the same set of centroids as the training set.

K

number of stages.

Value

X

baseline variables. It is a matrix of dimension n by (pinfo + pnoise).

A

treatment assignments for the K stages. It is a list of K vectors.

R

outcomes of the K stages. It is a list of K vectors. In this simulation setting no intermediate outcomes are observed, so the first K-1 vectors are vectors of 0.

optA

optimal treatments for the K stages. It is a list of K vectors.

centroids

centroids of the pinfo variables for the n_cluster groups. It is a matrix of dimension n_cluster by pinfo.
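To make the shape of the returned list concrete, the sketch below builds a small hand-written stand-in with the documented components and computes a few summaries from it. The object `sim` is a hypothetical placeholder with made-up toy values, not output from `sim_Kstage`; the summary names `match_rate` and `followed_all` are likewise illustrative.

```r
# Hypothetical stand-in with the documented shape
# (n = 4, K = 2, n_cluster = 2, pinfo = 2, pnoise = 1); values are toy numbers.
sim <- list(
  X = matrix(0, nrow = 4, ncol = 3),
  A = list(c(1, -1, 1, -1), c(1, 1, -1, -1)),
  R = list(rep(0, 4), c(2.1, -0.3, 0.4, -1.8)),
  optA = list(c(1, -1, -1, 1), c(1, -1, -1, 1)),
  centroids = matrix(0, nrow = 2, ncol = 2)
)

K <- length(sim$A)

# Only the last stage carries an outcome; earlier stages are all zero.
intermediate_zero <- all(unlist(sim$R[-K]) == 0)

# Per-stage share of subjects whose randomized treatment matched the optimal one.
match_rate <- mapply(function(a, opt) mean(a == opt), sim$A, sim$optA)

# Subjects who happened to receive the optimal treatment at every stage.
followed_all <- Reduce(`&`, Map(`==`, sim$A, sim$optA))
mean(sim$R[[K]][followed_all])
```

The same indexing should apply to a real result, for example `train$A[[k]]` for the stage-k assignments and `train$R[[K]]` for the final outcome.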

Author(s)

Yuan Chen, Ying Liu, Donglin Zeng, Yuanjia Wang

Maintainer: Yuan Chen <yc3281@columbia.edu>, <irene.yuan.chen@gmail.com>

See Also

owl, ql

Examples

library(DTRlearn2)

n_train = 100
n_test = 500
n_cluster = 10
pinfo = 10
pnoise = 20

# simulate a 2-stage training set
train = sim_Kstage(n_train, n_cluster, pinfo, pnoise, K=2)

# simulate an independent 2-stage test set with the same centroids as the training set
test = sim_Kstage(n_test, n_cluster, pinfo, pnoise, train$centroids, K=2)

DTRlearn2 documentation built on April 22, 2020, 5:07 p.m.